
Yes! SSH certificates are awesome, both for host- and client-verification.

Avoiding Trust on First Use is potentially a big benefit, but the workflow improvements for developers, and especially for non-technical people, are a huge win too.

At work, we switched to Step CA [1] about 2 years ago. The workflow for our developers looks like:

  1. `ssh client-hosts-01`

  2. Browser window opens prompting for AzureAD login

  3. SSH connection is accepted

It really is that simple, and is extremely secure. During those 3 steps, we've verified the host key (and not just TOFU'd it!), verified the user identity, and verified that the user should have access to this server.

In the background, we're using `@cert-authority` for host cert verification. A list of "allowed principals" is embedded in each user's cert, which is checked against the host's authorized_principals [2] file, so we have total control over who can access which hosts (we're doing this through Azure security groups, so it's all managed at our Azure portal). The generated user cert lasts for 24 hours, so we have some protection against stolen laptops. And finally, the keys are stored in `ssh-agent`, so they work seamlessly with any app that supports `ssh-agent` (either the new Windows named pipe style, or "pageant" style via winssh-pageant [3]) - for us, that means VSCode, DBeaver, and GitLab all work nicely.
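For anyone unfamiliar with the mechanism, the moving parts look roughly like this (hostnames, paths, and the key blob are all placeholders, not our actual setup):

```shell
# Client side (~/.ssh/known_hosts): trust any host certificate signed by the
# host CA for these domains, instead of TOFU-ing individual host keys.
@cert-authority *.example.com ssh-ed25519 AAAA...hostCAPublicKey...

# Server side (/etc/ssh/sshd_config): point sshd at a per-user principals file.
AuthorizedPrincipalsFile /etc/ssh/auth_principals/%u

# /etc/ssh/auth_principals/alice then lists the principals (e.g. group names)
# allowed to log in as "alice"; sshd matches them against the principal list
# embedded in the presented user certificate.
```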

My personal wishlist addition for GitHub: Support for `@cert-authority` as an alternative to SSH/GPG keys. That would effectively allow us to delegate access control to our own CA, independent of GitHub.

[1] https://smallstep.com/docs/step-ca

[2] https://man.openbsd.org/sshd_config#AuthorizedPrincipalsFile

[3] https://github.com/ndbeals/winssh-pageant



GitHub does have support for SSH CAs, but it's an Enterprise feature: https://docs.github.com/en/enterprise-cloud@latest/organizat...


That's very interesting, thank you for linking!


> At work, we switched to Step CA [1] about 2 years ago. The workflow for our developers looks like:
>
>   1. `ssh client-hosts-01`
>
>   2. Browser window opens prompting for AzureAD login
>
>   3. SSH connection is accepted

How is that simple, compared to `ssh -i .ssh/my-cert.rsa someone@destination` -> connection is accepted, here's your prompt?


The former is discoverable: it doesn't require developers to have ANY knowledge of command switches (no matter how basic), nor to follow a set of out-of-band instructions; the "how to" is included within the workflow.


ssh-add (once per session) gives users back that incredible convenience. If you wanted to rotate certs, you’d have to add each new one, of course.
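For anyone who hasn't used it, the once-per-session step is just (paths illustrative):

```shell
# Load the private key into the running agent; ssh-add automatically picks up
# a matching certificate (id_ed25519-cert.pub) sitting next to the key.
ssh-add ~/.ssh/id_ed25519

# List what the agent now holds; certificates show up alongside plain keys.
ssh-add -l
```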


The server could display that info when a user tries to log in via interactive authentication.


It’s the exact same command as a regular SSH login, and it generates and uses the cert. That seems very simple.

Your command is disingenuous in that it only works if the certificate has already been issued to you. If you were to include issuance, your command would no longer be simple.


If I'm reading it right, there's a not-insignificant amount of setup necessary for the proposed approach anyway; generating and sharing a public key is much easier, even for the customer/client.


This is simple because it doesn’t require you to take any specific actions to make new/different hosts accessible.

If you deactivate someone in AD, poof, all their access is magically gone, instead of having to remove their public key from every server.


What if you're ssh-ing from a headless client, like a raspberry pi or a VPS?


Then it doesn’t work, but their developers are ssh-ing from their work laptops, so it doesn’t matter. Something doesn’t have to be a solution for all use cases to be a good solution.


That is also the flow for Tailscale SSH

https://tailscale.com/kb/1193/tailscale-ssh/


If you are in the terminal and don't have access to a browser?


Not the OP but if anyone doesn’t have access to a browser in my org then I can safely say they’re not accessing from a company laptop and thus should be denied access.


You really never ssh from one remote server to another?


Not GP, but:

I do, however when I do this I make sure the certificate is signed with permit-agent-forwarding and demand people just forward their ssh agent on their laptops.

This also discourages people from leaving their SSH private key on a server just for ssh-ing into other servers in CRON instead of using a proper machine-key.
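A sketch of how a CA might sign a cert with exactly those permissions, assuming a plain OpenSSH CA (all file names illustrative; certs allow everything by default, so you clear first and re-enable selectively):

```shell
# Sign alice's public key so the cert permits a shell and agent forwarding,
# but nothing else (no X11 or port forwarding).
ssh-keygen -s user_ca_key \
  -I alice@example.com -n alice -V +24h \
  -O clear -O permit-pty -O permit-agent-forwarding \
  user_key.pub

# Verify which extensions actually ended up in the certificate:
ssh-keygen -L -f user_key-cert.pub
```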


Agent forwarding has its own security issues, you're exposing all your credentials to the remote.

It's better to configure jump hosts in your local ssh config.
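For example (hostnames are placeholders) — authentication for the target happens locally and is tunneled through the jump host, so no keys or agent ever touch the intermediate machine:

```shell
# ~/.ssh/config
Host prod-db
    HostName db01.internal.example.com
    ProxyJump jump.example.com

# Equivalent one-off form:
#   ssh -J jump.example.com db01.internal.example.com
```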


There's SSH agent restriction now.

[1] https://www.openssh.com/agent-restrict.html


In general for systems like this, you can open the browser link from a different host.

For example, if I've SSHed from my laptop to Host A to Host B to Host C then need to authenticate a CLI program I'm running on Host C, the program can show a link in the terminal which I can open on my laptop.


Having to interact with the browser every time I need to ssh to a machine would be extremely painful.

If key forwarding works, that might be workable.

I'm extremely wary of non-standard ssh login processes as they tend to break basic scripting and tooling.


These tools usually cache your identity, so you might only need to go through a browser once a day.


I suppose this could be solved by using the first server as an SSH jump host -- see SSH(1) for the -J flag. Useful e.g. when the target server requires public key authentication and you don't want to copy the key to the jump host. Not sure it would work in this scenario though.


SSHing from one remote server to another won’t be possible in a lot of environments due to network segmentation. For example, it shouldn’t be possible to hop from one host to another via SSH in a prod network supporting a SaaS service. Network access controls in that type of environment should limit network access to only what’s needed for the services to run.


I've seen the exact opposite configuration, where it's not possible to avoid SSHing from one remote server to another due to network segmentation: at the network level it's impossible to access any production system directly via SSH, only through a jumphost, which obviously does not have a browser installed.


You don't need the jumphost to do the auth for the target host. You use -J and the auth happens locally and is proxied through.


I can count on 1 hand the number of reasons I might need to do that and on each occasion there’s usually a better approach.

To be clear, I’m not suggesting the GPs approach is “optimal”. But if you’ve gone to the trouble of setting that up then you should have already solved the problems of data sharing (mitigating the need for rsync), network segregation and secure access (negating the need for jump boxes), etc.

SSH is a fantastic tool but mature enterprise systems should have more robust solutions in place (and with more detailed audit logs than an rsync connection would produce) by the time you’re looking at using AD as your server auth.


The CA CLI tool we use supports a few auth methods, including a passphrase-like one. It likely could be set up with TOTP or a hardware token also. We only use OAuth because it's convenient and secure-enough for our use case.


Never. I’ve been at this company for 8 years and owned literally thousands of hosts and we have a policy of no agent forwarding. I’ve always wondered when I would be limited by it but it simply hasn’t come up. It’s a huge security problem, so I’m quite happy with this.


Not sure why you'd get downvoted for this comment. This is likely very applicable for many orgs that have operator workstation standards -- they're some kind of Windows/macOS/Linux box with defined and enforced endpoint protection measures, and they all have a browser. Any device I can imagine ssh'ing from that doesn't have a browser is definitely out of policy.


Because both of you have narrowed the scenario to what you do daily. It is a common use case to ssh from a jump server and to use ssh-based CLI tools for debugging. The issue stems from Windows users who are coupled to GUIs; this behavior pattern increases IT and DevOps costs unnecessarily.

An alternative example: our org solves the issue with TOTP, required every 8 hours for any operation, from ssh/git CLI-based actions (prompted at the terminal) to SSO integrations, decoupling security from unrelated programs. Simple and elegant.


The -J parameter to ssh will transparently use a jump server and doesn't require the ssh key to be on the third-party server. I can't speak for the tooling around step-ca, but my employer's in-house tooling works similarly: it loads the short-lived signed cert into your ssh-agent, so once you do the initial auth you can do whatever SSH things you like.


There are better ways to access remote servers than using a jump box. If you’ve gone to the lengths of tying SSH auth into web-based SSO, then you should have at least set up your other infra to manage network access already (since that’s a far bigger security concern).

Plus, as others have said, you can jump through SSH sessions with the client-side ssh command (i.e. without having to manually invoke ssh on the jump box).


As pointed out, whether or not you go through a jump host isn’t relevant. We all go through jump hosts as well.

Besides, neither me nor GP is saying this needs to be a universal pattern. We are saying that it’s a viable pattern for a lot of orgs.



With e.g Azure's CLI az you can specify a flag something like "--use-device-code" which shows a copy-pastable URL that you then can just visit in the browser (on a different device even).


This is a bit off topic, but does anyone know how the mechanism that triggers the web page prompt from an ssh connection actually works? Is it some kind of alternate ssh authentication method (like password/publickey) or something entirely out-of-band coming directly from the VPN app intercepting the connection?

Ever since I saw it in action with Tailscale I've always wondered how it actually works, and I guess if anyone would know they'd be on HN


OOB: ".. during the SSH protocol’s authentication phase, the Tailscale SSH server already knows who the remote party is and takes over, not requiring the SSH client to provide further proof (using the SSH authentication type none)." https://tailscale.com/kb/1193/tailscale-ssh/#how-does-it-wor...


Smallstep uses ProxyCommand [0]. Not sure how Tailscale does it.

0: https://smallstep.com/docs/ssh/how-it-works
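The generated ssh config looks roughly like the following (a sketch based on the linked docs; the exact flags and match conditions vary by setup):

```shell
# ~/.ssh/config fragment written by `step ssh config`: for hosts known to
# the CA, route the connection through step's proxy, which triggers the
# browser login when the short-lived cert is missing or expired.
Match exec "step ssh check-host %h"
    ProxyCommand step ssh proxycommand %r %h %p
```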


> we've verified the host key (and not just TOFU'd it!),

How.

Specifically, what I cannot determine from their docs is how the VM obtains a host key/cert signed by the CA. How does the CA know the VM is who the VM says it is? (I.e., the bootstrap problem.)

(I assume that you also need your clients to trust the CA … and that has its own issues, but those are mostly human-space ones, to me. In theory, you can hand a dev a laptop pre-initialized with it.)


StepCA supports quite a few authentication methods, including an "admin provisioner" (basically a passphrase that can be pasted into the CLI tool's stdin).

Because each of our servers is bespoke, we can use the admin provisioner when the server is first being set up (and actually, Ansible handles this part).

I don't have experience with it, but StepCA also has Kubernetes support, and I imagine the control plane could authenticate the pod when a cert needs to be issued or renewed.


I can't say in the general sense, but with GCP you can retrieve the EKpub of a VM's TPM via the control plane, and then use https://github.com/google/go-attestation to verify that an application key is associated with that TPM and hence that VM


I like this solution, thanks for sharing. Just need to swap it with my own OIDC compliant federated authentication server.


One thing I've never understood about SSH certificates for client identification: it looks like it requires that, at some point, the user's SSH private key and the certificate-signing private key both be in the same place? And if this is the case, doesn't that imply that you need to have a service where users upload their private key?

Which would mean you have one single point of attack/DOS/failure that needs to be kept utterly secure at all costs?


You give your public key (typically into ~/.ssh/authorized_keys) and then prove you have access to the matching private key as the essential part of the challenge. You always keep the private key.


I thought the way it worked was that the certificate, signed with the CA's private key, only contains the user's public key, and the ssh server, after checking that the certificate is valid, verifies that the client holds the private key corresponding to the public key in the certificate.
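That's exactly it, and it's easy to see with plain OpenSSH tooling (file names illustrative): the cert is just the user's public key plus metadata, signed by the CA. Neither private key ever needs to travel.

```shell
ssh-keygen -t ed25519 -f ca -N '' -q       # the CA keypair
ssh-keygen -t ed25519 -f alice -N '' -q    # the user's keypair
# The CA signs only the user's *public* key:
ssh-keygen -s ca -I alice@example.com -n alice -V +24h alice.pub
# Inspect the cert: key ID, principals, validity window, and signing CA.
ssh-keygen -L -f alice-cert.pub
```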


Also - agent forwarding. The private key stays on your local machine, but you can forward the agent over ssh so you can hop around from your next destination.


Vault also supports both client and server ssh certificates [1]. I use terraform and vault to sign server certificates at creation time.

[1] https://developer.hashicorp.com/vault/docs/secrets/ssh/signe...


Requiring the use of a browser, though, limits the usefulness a bit.


Now try to automate that.


How do you get the browser to open? Does it work on all operating systems and ssh clients, such as Android's JuiceSSH?



