One thought: GitHub is basically a bunch of very complicated, interconnected databases, no? I mean, to some extent all large web companies are, but GitHub much more so than most. Any failure is going to be, in some sense, a database failure.
Another thought: this could speak to a critical weakness in GitHub’s database team. Maybe they hired a lot of bad developers onto the database team at the same time, or maybe a bad culture took root in the database wing’s management.
Databases are legitimately the worst part of every company.
All the new, much better tech is so fucking expensive.
CockroachDB has the worst salespeople, who can’t even demonstrate their own benchmarks.
Spanner is just unaffordable for most.
Postgres/MySQL are great, but making them scale for write-heavy workloads is a pain. I also hate managing primary/failover systems. We have consensus algorithms now. Let’s just use them.
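For what it’s worth, here’s a minimal sketch of what “just use consensus” could look like, assuming an etcd cluster reachable at localhost:2379 and the official Go clientv3 concurrency helpers (the endpoint, the /db-primary prefix, and the node name are my own placeholders, not anyone’s production setup). Each candidate campaigns for the write-primary role via etcd’s Raft-backed leases instead of someone hand-managing failover:

```go
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	// Connect to the (assumed) etcd cluster that backs leader election.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// A session ties the election to a lease: if this node dies,
	// the lease expires and another candidate wins automatically.
	sess, err := concurrency.NewSession(cli, concurrency.WithTTL(10))
	if err != nil {
		log.Fatal(err)
	}
	defer sess.Close()

	// All candidates campaign under the same key prefix; etcd's Raft
	// log orders them, so there is exactly one primary at a time.
	election := concurrency.NewElection(sess, "/db-primary")
	if err := election.Campaign(context.Background(), "node-1"); err != nil {
		log.Fatal(err)
	}

	log.Println("this node is now the write primary")
	// ... accept writes until the session lapses, or step down explicitly:
	// election.Resign(context.Background())
}
```

When the current leaseholder disappears, the next candidate’s Campaign call returns and it takes over, which is basically the failover dance that Postgres/MySQL tooling keeps reimplementing on its own.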
No, well, kinda. I think it's related to how GitHub chunks data on the backend, and that chunking methodology does depend on the database. They mention it worked elsewhere in the org, which makes me wonder if it was a fix to support their managed hosts.
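To make “chunking that depends on the database” concrete, here’s a purely illustrative sketch (not GitHub's actual scheme; the shard count and keys are made up): route each record by a hash of its partition key, which means the routing layer is coupled to the shard topology of the database underneath.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardCount is a made-up topology; a real system derives this from the
// database cluster's configuration, which is exactly why the chunking
// scheme is tied to the database layer.
const shardCount = 16

// shardFor maps a partition key (e.g. a repository name) to a shard index.
func shardFor(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % shardCount
}

func main() {
	for _, repo := range []string{"octocat/hello-world", "torvalds/linux"} {
		fmt.Printf("%s -> shard %d\n", repo, shardFor(repo))
	}
}
```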
While the ad-hoc blog post addressing last week's outage is appreciated, I'm curious why GitHub continues to roll these up into a monthly report rather than publishing each one when its investigation concludes. Many companies that operate live services for a global user base already do that.