I also think about layers when I set up IaC, but I'm more focused on how things connect and relate rather than sticking strictly to the OSI stack model. In my mind, it's all about grouping things that might influence each other. This approach usually leads me to think in three layers: foundation, shared services, and applications.
Starting at the bottom, the foundation layer holds the basics like networking, storage, accounts, and permissions. The shared services layer is where I place tools like certificate managers and secret storage. I keep services that interact closely together, while separating those that work more independently. At the top, I lay out the applications. This is where I slot in services like auto-scaling groups, individual server instances, load balancers (depending on whether they're communal or specific), and pods in platforms like Kubernetes. Depending on the complexity of the environment, there may be one or multiple instances of each layer.
By structuring IaC this way, I find it’s clearer and more intuitive.
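A minimal sketch of what this layering can look like in Terraform (the bucket, key, and resource names here are made up for illustration): each layer lives in its own root module, and a higher layer reads the outputs of a lower one via remote state.

```hcl
# applications/web/main.tf -- hypothetical application-layer stack.
# Reads networking outputs from the foundation layer via remote state.
data "terraform_remote_state" "foundation" {
  backend = "s3"
  config = {
    bucket = "example-tf-state" # assumed state bucket name
    key    = "foundation/terraform.tfstate"
    region = "us-east-1"
  }
}

# The application layer consumes foundation outputs instead of
# redefining subnets itself.
resource "aws_lb" "app" {
  name    = "web-app"
  subnets = data.terraform_remote_state.foundation.outputs.public_subnet_ids
}
```

The same idea works with Pulumi stack references; the point is that the application layer consumes the foundation layer's outputs rather than owning them.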
I'm using something similar at the moment, and it mostly works. There are issues now and again, though: if the developers of a service own the "top" layers and start adding new services, that often requires changes to the core.
For example, if a team were to suddenly start using DynamoDB, then the IAM roles for their application, in their account, suddenly need the permission to add/get records.
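As a hedged illustration of the kind of change that crosses the layer boundary (the role, table, and policy names here are hypothetical), the application's IAM role might need a new inline policy like:

```hcl
# Hypothetical Terraform addition to the application's IAM role,
# assuming aws_iam_role.app and aws_dynamodb_table.orders already
# exist elsewhere in the configuration.
resource "aws_iam_role_policy" "orders_dynamodb" {
  name = "orders-dynamodb-access"
  role = aws_iam_role.app.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["dynamodb:PutItem", "dynamodb:GetItem", "dynamodb:Query"]
      Resource = aws_dynamodb_table.orders.arn
    }]
  })
}
```

Whether this lives in the app stack or the foundation stack is exactly the coordination problem described above.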
Most of the time the layers are distinct and the "ops" team can handle the core, leaving the application/service-specific stuff to the developers, but things do come up that have to be scheduled and coordinated across the layers/owners. It's a pain, but so far a tolerable one.
I was looking for the explanation about how this grouping is like the OSI model, but found none...
Also, I think where OP uses "principal" they mean "principle".
The whole article reads as an advertorial for Pulumi. :|
OP also never bothers to ask themselves questions like "what if I'm wrong?" or "what to do with this obvious claim that doesn't add up?".
For example: why is the "Data" layer below "Compute"? That's the kind of question that's never addressed by OP. Most people in the industry wouldn't think of these as layers, and definitely not as stacked one on top of the other. To convince someone you need to give a very solid argument here... but there's nothing there.
Layers 4 through 6 make some conceptual sense if you consider the lower layers being support/infrastructure for the higher layers. In the old days, before cloud, we used to draw plenty of diagrams that were essentially DB Server -> App Server -> Web Server -> Load Balancer... it's the same kind of thing.
I say some sense because layer 3, "permissions", sticks out to me like a sore thumb. Whenever I work with Terraform I spend 50% of the time on permissions. I'd hesitate to call it a "layer" given the pervasive nature of IAM roles/permissions across all resources.
Config files work fine. But so does expressive code with the right abstractions. And I think code scales up better and down almost as well. Just my opinion.
Recent years often remind me of Alan Kay talking about objects made up of objects talking to objects. I wonder if IaC, among other trends, isn't an incarnation of that on a wide scale.
Cool, I skipped that link not knowing what it was. There really seems to be a strange conceptual redundancy: you used to wire bits of assembly into classes and plug objects together into cohesive graphs, and now it's the same thing at a higher layer, with containers as the unit of logic and network bridging as the information transport... and you suffer from the same issues: ad hoc state in your system config files, inability to initialize a sub-component without the rest, lack of logical interfaces.
This has to be the worst take on IaC organization I have ever seen. I would never have thought someone would try to apply the OSI model to infra code management.
How long does it take to deploy a new service with this approach? A week?
Normally I would absolutely agree, but this article is the equivalent of one called "Flying a Boeing 737" in which the author advises you to invert the plane, fly at sea level and turn off all instruments. That is how little the proposed approach in this article makes sense.
Given that, how is one supposed to reply critically to such a post? I'm genuinely curious and open to suggestions, as it's something I'm clearly not good at.
>Given that, how is one supposed to reply critically to such a post? I'm genuinely curious and open to suggestions, as it's something I'm clearly not good at.
Link to or describe a better approach and explain specifically why it's better.
Your only specific point was that it's slow to deploy a new service, which for most organizations is somewhere at the bottom of their priority list. In fact, many organizations probably prefer slow deployments, as that implicitly discourages unnecessary services and infrastructure bloat. That in turn lowers the long-term maintenance burden and technical debt. Five hundred services that are in reality owned by no team, or that no one knows about, are not what you want in an organization.
Do you know what tone policing is? HN is one of the most conservative echo chambers I visit (yes, I should stop), because everyone is constantly being tut-tut-tutted.
I disagree; in a lot of cases a new service would require an update to only 1 or 2 stacks, and only those stacks need deploying.
In some cases some version of this is required, because the vendor API doesn't give proper feedback about when an operation is complete, so it needs time to settle.
Or building Docker images that get rebuilt on every run and unnecessarily consume time and resources (I believe Pulumi has a fix for some version of this).
If your stacks are deployed via CI/CD, it's not really a big deal to deploy 10 stacks in sequence, or just the ones that changed.
This may be overkill for a lot of projects but it’s valuable insight from a respected organization / individual.
I'd say don't do any layering and stick with the standard naming convention of "stacks". Start with a common stack for all your shared stuff, and application stacks for stuff that's specific to some application: say, all the resources for a microservice, or everything for a BI system, etc.
Avoid splitting this up further, as it introduces too much complexity. The IaC code should be simple enough that any dev fresh off the tutorials can pick it up.
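One possible layout under this convention (the directory names are just examples, not a prescription):

```
infra/
├── common/        # VPC, DNS zones, shared certificates: one stack
├── billing-api/   # everything for the billing microservice
└── bi/            # everything for the BI system
```

Each application stack references the common stack's outputs; nothing else is shared between them.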
The company I'm at has 3 layers and dozens of stacks, and it's made the whole thing impossible to reason about. No one wants to touch it anymore, which means we now have a Platform team that screws around with it for months on end.
Note: Lee Briggs works for Pulumi as a Principal Platform Engineer, so it's in their interest to make this complicated.
Global namespace shared resources, regional namespace shared resources, then each app provisions its own bit, consuming/linking the two aforementioned layers.
Everyone gets here eventually, and then you can just fight over stuff like "is an ALB shared-regional or app-specific".
In what mature organization does it take more than a week? My current org is quite mature: you probably use it. It's also not a cloud provider: our AWS bill is in the high 8 figures a month. And yet launching a new service not directly pingable from the internet, deployed in, say, 5 regions, is a matter of 3 PRs, adding the service to CI included. I've gone from having no repo at all to deployment in 4 days, because we were in a big hurry. All the infra-defining PRs get eyes from an SRE or three, but the team writing the service writes the PRs.
I bet we have far more instances under our name than the people that write this article, and yet we have nowhere near that level of complexity in our IaC definitions. And yet, somehow we manage. I guess we are immature?
There’s a huge difference between “identical clone of an existing service” and “a new service”.
My challenge at $dayjob is that it takes months to spin up a new cloud service because they’re new.
Either a new app that wasn’t on the cloud before — in which case the templates need extensive customisation.
Or, new app in the sense that the devs just cracked open Visual Studio and have no idea yet what they actually need from the cloud.
I get maybe 10-20 copies of a template (dev/tst/prd + ha/dr), and then I have to start from the beginning.
Guidance on how to maximise reusability would actually be very useful.
Unfortunately, in the real world, this seems difficult. Many small variations in requirements tends to make abstractions leaky.
For example, one vendor requires active-passive load balancing for licensing reasons. Millions of dollars worth of licensing reasons. Neither AWS nor Azure support anything but active-active in any of their load balancers. (They do in DNS, but for various reasons that won’t work for us.)
Another “new” app (industrial air quality monitoring) is actually from the stone ages and doesn’t support PaaS databases or even 3 of the 4 clustering modes available in IaaS. So a custom load balancer solution is required… just for it.
This is the issue. Everyone that loves the cloud and says it’s simple has easy mode turned on: cookie cutter clones they can stamp out for many identical customers or whatever.
Some people play the game with “big government” difficulty.
I could have elaborated. When I say "generic service template", I mean that a service with any cloud requirements (that we've had at any point before) can be assembled from building blocks (TF modules) in ~20 min. This of course doesn't work if every new service has new, unique requirements.
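For a rough idea of what "assembled from building blocks" can mean in Terraform (the module paths, names, and inputs below are invented for illustration, not anyone's actual modules):

```hcl
# Hypothetical "generic service" composed from reusable local modules.
module "network" {
  source      = "./modules/service-network" # assumed module path
  environment = var.environment
}

module "service" {
  source     = "./modules/fargate-service" # assumed module path
  name       = "reporting-api"
  image      = var.image
  subnet_ids = module.network.private_subnet_ids
}
```

A new service is then mostly a matter of instantiating the same modules with different inputs, rather than writing resources from scratch.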
Happy to talk through the setup we're using and some other setups I've seen work if you're interested, but it's quite extensive.
Our current setup is heavily inspired by the work Gruntwork (not affiliated) is doing. I highly recommend taking a look at how they do it, and even subscribing to their service if the need is there. They provide pre-built modules for basically any use case.