Can you share a little about what your stack looks like? And how are you doing infrastructure automation to build and reproduce it? I'm really interested in where you found ways to cut costs, and what kind of effort was involved.
If you had a perfectly clean install of your distro of choice, could you write a shell script that could build your server from scratch?
If you can answer Yes to that (and you should be able to), then you can build an if/then-heavy shell script that works with each hosting provider's API to create a perfectly clean install of your favorite distro.
One script to create the clean slate machine.
One script to build what you need.
We have 9 "pods" around the globe, each with API servers (Java/Tomcat/Apache), static servers (Varnish), a MySQL slave, and an HAProxy Maître D'. With close to 60 servers, our monthly bills are less than $1,000, and we haven't had downtime in years. Spinning up a new server is just: sh build.sh atl api 1.
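To make the `sh build.sh atl api 1` idea concrete, here is a rough sketch of how such a script might map its three arguments (pod, role, index) to a build plan. This is not the actual build.sh; the function names, the role names other than `api`, the hostname format, and the package names are all placeholders assumed for illustration. It only prints what it would do.

```shell
#!/bin/sh
# Hypothetical sketch of a build.sh-style dispatcher (dry run only).
# Role names, hostname format, and package names are assumptions.

# Print a hostname for a pod/role/index triple.
server_name() {
    printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# Print the (placeholder) packages a role would need.
# Returns non-zero for a role this sketch does not know about.
packages_for_role() {
    case "$1" in
        api)    echo "apache tomcat java" ;;
        static) echo "varnish" ;;
        db)     echo "mysql" ;;
        lb)     echo "haproxy" ;;
        *)      echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print the build plan instead of executing it.
build_plan() {
    pod=$1
    role=$2
    index=$3
    pkgs=$(packages_for_role "$role") || return 1
    echo "hostname $(server_name "$pod" "$role" "$index")"
    echo "install $pkgs"
}

# Example: build_plan atl api 1
```

A real version would replace the `echo` lines with the package manager and config steps for your distro, which is where the if/then-heavy part comes in.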
Feel free to ping me if you want any more details: mark@areyouwatchingthis.com.
I use AWS Route 53 DNS with Health Checks, and put one "A" record in for each "pod". If an entire pod somehow disappears, Route 53 will take it out of the rotation.
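For anyone wanting to try the one-record-per-pod setup, a sketch of generating the Route 53 change batch for a single pod might look like the following. It assumes multivalue answer routing, which is only one of the routing policies that lets several health-checked "A" records share a name; the comment above doesn't say which policy is actually used. The function name, domain, IP, TTL, and IDs are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print a Route 53 change batch for one pod.
# Usage: pod_record_json NAME POD_ID IP HEALTH_CHECK_ID
# Assumes multivalue answer routing; the TTL of 60 is an arbitrary choice.
pod_record_json() {
    cat <<EOF
{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "$1",
      "Type": "A",
      "SetIdentifier": "$2",
      "MultiValueAnswer": true,
      "TTL": 60,
      "ResourceRecords": [{ "Value": "$3" }],
      "HealthCheckId": "$4"
    }
  }]
}
EOF
}

# Example (placeholder values):
#   pod_record_json api.example.com atl 203.0.113.10 <health-check-id> > atl.json
#   aws route53 change-resource-record-sets \
#       --hosted-zone-id <zone-id> --change-batch file://atl.json
```

With a health check attached to each record, Route 53 stops returning a pod's address once its check fails, which matches the "taken out of the rotation" behavior described above.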
Terraform is not AWS-specific: there are 30-odd providers covering most major cloud services, and many SaaS systems as well. It can be valuable for multi-cloud orchestration!
trust me, no matter how you slice it, it's a shitload of effort. that is exactly how AWS makes a ton of money. they make hard things easy (and expensive).
do you want the pain, or do you want the money, that's the basic proposition here. when you start looking at 25, 50, 150k/month of savings by doing stuff yourself, the choice becomes much clearer. in many cases i've seen, you could theoretically hire an entire team to take care of the stuff that AWS does for you, and still come out ahead.