Let me provide some background information for people who aren't familiar with the agencies involved in this (Centrelink, Medicare, and the Australian Taxation Office).
These places have some of the worst-run IT departments on the planet. I can say this with more than a little evidence. As a consultant, I've worked on over a hundred customer sites, all the way from tiny private companies up to federal government, including all three of those agencies. I've seen how IT is done at just about every state government office in my state, and two dozen in other states.
There just is no comparison. Centrelink especially is so fucked up that people think that I made up my stories about my experience there. It's crazy beyond belief.
The sheer scale of it is amazing. They have over 1K IT staff in one building, and spent $2B on a single software upgrade project! They have huge teams for obscure tasks that other large enterprises might have just one or two people doing. There are Big Name consultants everywhere. Direct vendor support, often flown in from the US, which is otherwise rare around here.
Despite all these people, money, and support, nothing works. Nothing. It's all broken. Everything. Every part. It's a sight to behold.
I wrote a report for them about a key security system where I pointed out that out of something like 50 settings, 47 were incorrectly configured. The only reason it "worked" is because the errors cancelled out. That is, it was incorrectly rejecting valid access, but another error meant that the rejection was being ignored. And so on.
Similarly, their core authentication system was supposed to be distributed and highly available, but the main architect put all of the servers into one rack, one on top of another. He said with a straight face that a product well known in the industry for its efficient wide-scale replication is "bad at replication" and only works if the "network cables are really short". He meant 30cm, not 3000km. A power outage took out all three "redundant" controllers, and something like 80K staff spent several days staring at login prompts.
I could go on, and on, and on. I have a whole collection of stories like that.
The most amazing part is that I was only there for a couple of months, yet this short time period yielded 8 of my top 10 horror stories from the field.
It's also the only workplace setting where I have ever seen a man cry. For work-related reasons. Several men, on several occasions.
Imagine trying to turn around an organization like that. Must be an interesting challenge.
The sheer amount of technical debt, legacy systems, and dysfunctional team processes and culture. Not to mention the sheer motive force needed to change anything in that environment. Moving in any direction means a thousand other things breaking or popping up to steal momentum. A Gordian knot impossible to untangle.
I was briefly a hero when I programmed the MFC copier at a govt agency to let staff scan directly to their desktops. This was one of those SUPER fancy ones (think a five-year overpriced contract with a total cost of $120K): it had every document-management feature under the sun, but could ONLY be used for copying (no printing, no scanning, no anything).
Some update reverted the system, and IT was unhappy when staff asked for the feature back. The team asked me to help again, but I said that if IT had said no, I dared not.
So back they went to their old solution, which was to send someone twice a day to a local copy shop to fax, at $3/page, the documents they needed electronically, into a digital fax service they had set up.
I kid you not, this is an only-in-government kind of thing. They ban scan-to-USB, scan-to-network, and the like, but then demand that documents be uploaded electronically to some new system. What are people supposed to do? Ninety percent of it is left-hand-right-hand stuff: IT security folks don't talk to anyone and lock systems down to the nth degree (no scanning, no USB), and then someone else NEEDS paper available electronically for some reason (to upload to a new system).
The more money spent, the worse it gets, because you can't actually talk to anyone. Once it's a $100M+ project, the staff simply aren't in the room; there are too many layers.
> Imagine trying to turn around an organization like that. Must be an interesting challenge.
Most good DevOps books tell you how to do that. You scan for people who have the right skills and who actually care, as opposed to the people at the other end of the spectrum who think "if it ain't broke, don't fix it" and "why change it, we'll have to support this stuff later".
Then you go commando and quietly pick projects with low cost and high return that would not normally get the go-ahead. People copying Excel sheets full time? Automate their job away. A full-time sysadmin setting up one server a day? What a shame you happen to have a Docker container ready to use when he has an emergency and no time. Bonus points if you do things that also help your fellow devs.
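To make the "automate the Excel-copying job away" idea concrete, here's a minimal sketch of the kind of low-cost, high-return script being described. Everything in it is hypothetical: the `invoice_id` key, the column names, and the shape of the daily exports are invented for illustration.

```python
import csv
import io

def consolidate(reports):
    """Merge several CSV exports (given as text) into one deduplicated list.

    This replaces the manual job of pasting each day's rows into a master
    sheet by hand. ``invoice_id`` is an assumed unique business key.
    """
    seen, merged = set(), []
    for text in reports:
        for row in csv.DictReader(io.StringIO(text)):
            key = row["invoice_id"]
            if key not in seen:          # skip rows already copied across
                seen.add(key)
                merged.append(row)
    return merged

# Two hypothetical daily exports with one overlapping row.
monday = "invoice_id,amount\n1001,50\n1002,75\n"
tuesday = "invoice_id,amount\n1002,75\n1003,20\n"
rows = consolidate([monday, tuesday])
print(len(rows))  # 3 unique invoices
```

A one-page script like this, run on a schedule, is exactly the sort of project that costs an afternoon and returns a full-time salary.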
In a government setting, and in any large organization, you will need upper-leadership support; otherwise this will always fail, and all of your efforts will be undermined and suppressed. Be sure to leave an employee review on your way out and name names to HR.
The thing about those gov departments is that 'many' people are there for a secure job with low effort and little actual work. They actively work against change.
So yes, upper-leadership support is required, but sometimes you also need to pull things out root and branch, at least enough to scare those people into changing their lives, given a psychology of never needing to change again once they've built their public-service fiefdom.
>Centrelink especially is so fucked up that people think that I made up my stories about my experience there. It's crazy beyond belief.
I know this is just an anecdote, but a guy I met who works at the DHS told me that the online forms people fill in are "printed" to PDF and then manually entered into a database system from the 1980s.
The reason they don't update to a newer database with a proper API is because that would require taking the system offline for maintenance.
I worked for the APS as a very junior programmer 10 years ago.
To interact with our database we had a custom JDBC driver which used a VT100 terminal emulator to connect to what had at one point been a user-facing mainframe application. When a query was executed, the driver would:
- Emulate a user entering a series of key-presses in the terminal to navigate to the correct screen in the application.
- Tab to the query input field, enter the query, then send a return key-press to run the query.
- Read 20 rows of output, then send a key-press to show the next page of results, rinse and repeat.
- Parse the array of rows-represented-as-strings into properly typed objects.
- Repeatedly "press" escape to get back to the main screen so that the application state would be ready for the next query.
One of my first tasks was to make this driver work with a column type that stored binary data.
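The driver's flow, as described above, can be sketched with a toy stand-in for the terminal. This is purely illustrative: the real system spoke VT100 to a mainframe via a JDBC driver, and the menu keys, page size, and row format here are all invented.

```python
class MockTerminal:
    """Toy stand-in for the mainframe UI: navigate -> query -> paged results."""
    PAGE_SIZE = 20

    def __init__(self, rows):
        self._rows = rows          # all result rows, as raw "id|name" strings
        self._page = 0
        self.screen = "MAIN_MENU"

    def send_keys(self, keys):
        if keys == "3\n":                  # hypothetical menu option for the query screen
            self.screen = "QUERY"
        elif keys == "\x1b":               # escape back to the main menu
            self.screen, self._page = "MAIN_MENU", 0
        elif keys.startswith("\t"):        # tab to the field, typed query, return
            self._page = 0
        elif keys == "NEXT":               # page forward through the results
            self._page += 1

    def read_screen(self):
        start = self._page * self.PAGE_SIZE
        return self._rows[start:start + self.PAGE_SIZE]


def run_query(term, query):
    """Replay the driver's steps: navigate, query, paginate, parse, reset."""
    term.send_keys("3\n")                  # 1. key-presses to reach the query screen
    term.send_keys("\t" + query + "\n")    # 2. tab to the input field, type, return
    rows = []
    while True:
        page = term.read_screen()          # 3. read up to 20 rows per "screen"
        if not page:
            break
        rows.extend(page)
        term.send_keys("NEXT")
    term.send_keys("\x1b")                 # 5. escape back so the next query starts clean
    # 4. parse rows-as-strings into typed tuples (id, name)
    return [(int(r.split("|")[0]), r.split("|")[1]) for r in rows]
```

Screen-scraping like this is fragile in exactly the ways you'd expect: any change to the screen layout, page size, or key bindings silently breaks the "driver".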
> Let me provide some background information for people who aren't familiar with the agencies involved in this (Centrelink, Medicare, and the Australian Taxation Office). These places have some of the worst-run IT departments on the planet.
You can add Australia Post to that list as well. Even though it is now technically a corporation, it still carries the stench of its public service roots.
To add extra spice to their already broken organisation, Australia Post acquired StarTrack for parcel delivery.
I witnessed an operator in the distribution centre wait a solid thirty minutes for a key lookup in their database. I timed it with my phone. I had time to get lunch and come back.
A key lookup. Literally the consignment number.
I grilled him a bit on the details, and it turns out that all single-row lookups take that much time. Name, phone number, or any other details all take about half an hour to produce a result.
There are parcel delivery services that can deliver door-to-door faster than their IT systems can look up a record.
It's a flabbergasting level of incompetence, but I'm told it's been like that for years, and that they were told not to fix it because during the merger they were to "put tools down" and not spend time or money on anything that Australia Post would fix anyway.
It's clearly just a missing index (or indexes) somewhere.
The guy explained that it has been "slowly getting worse" over many years, which is what you'd expect if there are table scans going on over a steadily growing volume of data.
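The diagnosis above is easy to reproduce in miniature with SQLite: without an index, a single-row lookup is a full table scan whose cost grows with the table; adding one turns it into an index search. The table and column names here are invented to match the anecdote.

```python
import sqlite3

# Build a table of fake consignments with no index on the lookup column.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE consignments (consignment TEXT, details TEXT)")
con.executemany(
    "INSERT INTO consignments VALUES (?, ?)",
    ((f"CN{i:07d}", f"details {i}") for i in range(50_000)),
)

lookup = "SELECT details FROM consignments WHERE consignment = ?"

def plan(sql):
    """Return SQLite's query plan for a single consignment lookup."""
    rows = con.execute("EXPLAIN QUERY PLAN " + sql, ("CN0042000",))
    return " ".join(r[-1] for r in rows)

before = plan(lookup)   # a full table scan: every row is examined
con.execute("CREATE INDEX idx_consignment ON consignments (consignment)")
after = plan(lookup)    # an index search: effectively constant time

print(before)
print(after)
```

The same one-line `CREATE INDEX` fix applies, at vastly larger scale, to the half-hour lookups described above.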
IMHO, this is such a prevalent problem that it's the pandemic disease of SQL databases. They really should have indexing on by default for most columns, with the option of disabling unused indexes (perhaps even automatically).
Instead what happens is that 90-95% of all databases are 2-5 orders of magnitude slower than they ought to be, because developers are just unable to grasp these fundamental concepts. Developers haven't gotten any better at this over the last three decades. The tools have to get better to compensate instead.
PS: Almost all "no-SQL" databases are automatically or implicitly indexed. When people say that they're "much faster than SQL", I just assume that this is one of the biggest reasons. They're not inherently faster. Instead they're faster by default.
> They really should have indexing on by default for most columns, with the option of disabling unused indexes (perhaps even automatically).
Oracle does that. I think it is not enabled by default, but once you turn it on it's pretty much automated. It creates, rebuilds, and drops indexes based on the application workload.
There are third party tools for other databases (such as Dexter for PostgreSQL).
As a customer and taxpayer, I give Aust Post top marks for actually contacting me after I submitted a message about not being able to log in, then passing it on to an IT person who rang me a couple of days later to say it was fixed.
The vast majority of companies who promise to get back to me never do, yet Aust Post called me twice to investigate and resolve.
Having worked in both Aus Post and banking environments, I can assure you that Aus Post is worse. If you had seen inside the debacle that was their "digital mailbox" project, you would know what I'm talking about.
Haha, I worked for a wholly owned Auspost subsidiary during the inception of the digital mailbox project; we did some work for it on the periphery.
The execs would talk in breathless terms about how innovative it was, how agile the delivery was, and how we should try to emulate their success.
AP is a big place, and so are the banks, I guess my views are informed by my personal experience, but at least personally, I've found the banks to be more dysfunctional internally.
I guess everyone's experience will be different though.
I have a small business account registered to my home address (as a MyPost Business sender), so whenever I buy a parcel from anyone, it emails the business that it's coming. They just do an address match on their side and email the business, even if I gave the store my own email address. Totally annoying, and a privacy issue; they don't care and couldn't tell me how to fix it.
If I understand the issue correctly, you buy something online, giving your personal email address and home address. But because the home address is the same as your business address, AP 'intelligently' sends the notifications to your business email address?
What happens if you set up an AP MyPost account with your personal email address and home address?
They also associate by mobile phone number and name matching based on one reply I got to my ticket.
They suggested changing the name on the business account, but that would just shift the issue to someone else unless we used a fake name. And I think it did not let me do it when I tried; it's been a year or two, so I forget some details.
> What happens if you set up an AP MyPost account with your personal email address and home address?
I did, I can't remember the details but it didn't work. I think it prevented me from adding one type of data because that was already in the other account as a unique identifier (possibly phone number). I just tried today though and it let me add my number, so maybe that fixed it finally.
Worked at Services Australia up until the end of last year.
Horrible place to work; things moved at a snail's pace.
Constantly fighting fires and dealing with other organisations being merged into us.
Much happier now that I've moved :)
That's at least two catastrophic errors: Incorrect SAN configuration and missing backups.
PS: I worked at another department where they similarly misconfigured a SAN and made it highly vulnerable to multi-week outages due to even a single failed drive. I insisted they fix it, and my reward for this was seething hatred.
"You're just making us do extra work!"
"It's not a problem right now!"
"We have other priorities!"
Etc...
They literally refused to touch anything that's not on fire. Merely smouldering is "fine".
In my opinion this is the real root cause of gov-org dysfunction: so many workers go there who actively resist change and improvement. They want to be mediocre/lazy at work and don't want anyone showing them up, and they don't want any change, because they just want minimum effort for a stable income and a pension.
Not everyone; maybe it's only 10-20% or whatever, but it's a higher percentage than in the private sector, and it's high enough that it ruins the workplace and actively drives good workers away.
None of what I witnessed going on there was technically illegal. It's not actual graft, it's just incompetence combined with a highly diluted sense of responsibility.
There's also a sort of "rocket equation" to bureaucracy, where additional staff beget more staff, and overheads beget more overheads to deal with the overheads. And just as the key thing with rockets is a propellant with good specific impulse, scaling an org depends very heavily (nonlinearly!) on the efficiency of each person. Conversely, if you have inefficient, incompetent, and unmotivated staff but try to scale up, the inevitable consequence is an exponential cycle of compounding inefficiency without limit.
At this place I could not get a single VM deployed to PRD despite three months of focused effort. It just could not be done!
Hence the comments about the hilarious 90 day sprints. Well... yes. That's the fastest pace at which they could possibly move! Some manager probably patted himself on the back for a job well done! That's an "agile" project relative to the multi-year monstrosities they normally give birth to in that place...
I don't think it's corruption so much as the public sector getting harvested for parts via privatisation and outsourcing to contractors.
The usual cycle goes like this:
- "We need to decrease costs in public organisation A because $reason", so the budget gets cut and staff leave.
- "Hey look, the public org has growing wait times and growing infrastructure issues. We should reduce their budget further, because they're not doing their job!"
Rinse & repeat until you're left with Centrelink's current state. They don't have enough money to make the changes needed to clean up legacy systems AND process the workloads they have now AND maintain the current systems, so a choice gets made by people on a sinking ship. Around 2014 the amount spent on "admin" was gutted by half with the election of the Liberal party (the small-government party in AU), with funding only recovering to previous levels during 2017.
edit: formatting (bullet point lists and newlines are hard)
I don't believe this was the problem in this case. As mentioned, they were blowing $billions on individual IT projects, and hiring vendor specialist consultants at $4-$5K per day in many cases. Similarly, their kit was over-specced to a ludicrous degree.
I asked their DBA team to deploy a ~100 MB "system configuration" database and they gave me four dedicated(!) physical quad-socket servers in a 2+2 HA configuration. The active server showed 1% load, the three replica servers rounded the load down to 0% in Task Manager.
All that for that one tiny database!
Their excuse was that this was their "standard pattern", and that everyone gets the same spec, irrespective of need.
In any private org, you would be walked out the door if you spent nearly half a million dollars on kit+licensing for something like that because you were too lazy to have more than one option for database hosting.
PS: There was a huge database team, so you can't tell me it was a staff-capacity issue either. This particular product had its own dedicated sub-team.
I'm wondering if the consulting company I used to work for is behind this. Hardware sales were behind many decisions, because that's where the sales team made commissions.
> Around 2014 the amount spent on "admin" was gutted by half with the election of the Liberal party (small govt party in AU)
Inaccurate, if not misleading. The Liberal party are firm believers that private companies do everything better than government. Pretty much the UK Conservative party in function and form.
Yeah, the pattern has been a massive increase in spend in consultancies (especially the big 4) for things that the public service used to do itself. I believe it's over a billion dollars per year to the big 4 now, from tens of millions p/a back then.