> I'd rather be on call for a serverless system than otherwise. Getting paged at 2am because some log file filled up a disk, or a million other details that your "couple of hours" solution didn't take into account? No thanks.
How about getting paged at 2am because a Lambda invoked by an AWS Step Functions workflow keeps failing on a timeout while uploading a 20MB file to an S3 bucket? Because that is an actual case that happened in the real world.
Well, with Step Functions you can have automatic retries with exponential backoff in case of failure. But you had a Lambda that couldn't upload a 20MB file to S3 in 15 minutes? Whatever the issue was, you would more than likely have had the same issue with a VM. For all intents and purposes, a Lambda runtime environment is nothing special, just a Linux VM with well-known constraints: 512MB of /tmp storage by default and a runtime of up to 15 minutes.
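To make the retry point concrete, here is a rough sketch of what such a retry policy could look like on a Task state, written as a Python dict that serializes to Amazon States Language. The `Retry` fields (`ErrorEquals`, `IntervalSeconds`, `MaxAttempts`, `BackoffRate`) are the standard ones; the function ARN, the chosen values, and the `retry_delays` helper are made up for illustration and are not taken from the workflow discussed above.

```python
import json

# Hypothetical Task state: the ARN below is a placeholder, not a real function.
upload_state = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:upload-to-s3",
    "Retry": [
        {
            "ErrorEquals": ["States.ALL"],  # retry on any retriable error
            "IntervalSeconds": 2,           # wait before the first retry
            "MaxAttempts": 4,               # number of retries, not total tries
            "BackoffRate": 2.0,             # multiplier applied after each retry
        }
    ],
    "End": True,
}


def retry_delays(retrier):
    """Illustrative helper: seconds waited before each retry, ignoring jitter."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


if __name__ == "__main__":
    print(json.dumps(upload_state, indent=2))
    print(retry_delays(upload_state["Retry"][0]))
```

With these assumed values the waits double each time, so a transient S3 hiccup gets several chances before anyone is paged. It would not help if every attempt runs into the same 15 minute timeout.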