PostgreSQL uses heap files for the primary table storage, not B-trees. In Postgr...

Sesse__ · 2025-05-07T19:28:54 1746646134

Indexes that point directly to the on-disk row location are also significantly faster to access; it is a persistent pain point for OLAP on InnoDB that all secondary indexes are indirect. You can work around it by adding extra columns to the index to make your lookups covering, but it's kludgy and imprecise and tends to bloat the index even further. (The flip side is that if you have tons of indexes and update some unrelated column, InnoDB doesn't need to update those indexes to point to the location of the new row. But that bothers me far less often in comparison.)
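For readers who haven't used the covering-index workaround, here is a rough sketch in MySQL syntax. The `orders` table, its columns, and the index name are made up for illustration and are not from any real schema.

```sql
-- Hypothetical InnoDB table; rows are stored in the clustered primary key index.
CREATE TABLE orders (
    id          BIGINT PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    total       DECIMAL(10,2) NOT NULL
) ENGINE=InnoDB;

-- An index on (customer_id) alone would only yield primary key values,
-- so reading `total` would need a second lookup in the clustered index.
-- Appending `total` to the index lets the query below be answered from
-- the index itself, at the cost of a wider index.
CREATE INDEX idx_customer_total ON orders (customer_id, total);

-- When the index covers the query, EXPLAIN typically reports
-- "Using index" in the Extra column.
EXPLAIN SELECT total FROM orders WHERE customer_id = 42;
```

The catch is the one described above: every extra column you want to read has to be added to the index, which makes it bigger.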

sroussey · 2025-05-08T14:56:29 1746716189

On commercial databases from Oracle and Microsoft, you can cluster the DB on any index. Really would love for PostgreSQL to do the same.

cwbriscoe · 2025-05-08T18:12:42 1746727962

You can, but new rows will not be clustered until you re-cluster.

CLUSTER table_name USING index_name;
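A slightly fuller sketch of that workflow in PostgreSQL, assuming a hypothetical `events` table and index name chosen only for illustration:

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE events (
    id         bigint PRIMARY KEY,
    created_at timestamptz NOT NULL,
    payload    text
);
CREATE INDEX events_created_at_idx ON events (created_at);

-- One-time physical reorder of the heap by the index.
-- This takes an ACCESS EXCLUSIVE lock on the table while it runs.
CLUSTER events USING events_created_at_idx;

-- Rows inserted or updated afterwards are not kept in index order.
-- Re-running CLUSTER without USING reuses the index chosen previously.
CLUSTER events;

-- Refresh planner statistics after the rewrite.
ANALYZE events;
```

So this is a periodic maintenance operation rather than a clustered index that the database keeps in order on every write.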

sroussey · 2025-05-10T20:05:12 1746907512

I remember that for MySQL MyISAM (syntax was different) in the olden days. I thought Oracle let you do it though. My bad.

MySQL InnoDB and others have real clustered primary key indexes, with all the pluses and minuses that entails.

farazbabar · 2025-05-07T17:57:17 1746640637

Don't forget high-speed committed writes to append-only tables (the opposite of scans); Postgres's approach is better here as well.

saltcured · 2025-05-07T18:11:58 1746641518

It's also deeply entwined with the MVCC concurrency control and the ability to do DDL in transactions, right?

jiggawatts · 2025-05-07T22:16:59 1746656219

SQL Server supports every combination of heap storage, clustered storage, MVCC, and DDL in transactions.

foobahhhhh · 2025-05-08T11:07:38 1746702458

Takes me back 20 years to using SQL Server! It let you choose whether or not to have a clustered index, IIRC.