> > QuestDB organizes data sorted by time, so relying on insertion order may help to avoid redundant sorting if there is an ORDER BY clause with the timestamp column.
> If data is already sorted and you have an 'order by' then just use the data directly – bingo, instant merge join, no hash table needed.
I reckon keeping data on heap in insertion order isn't that useful for joins because hash table is used for lookups while iterating the other table (so the main table determines output order).
Where it could help is e.g. storing results of GROUP BY.
For query such as:
SELECT timestamp, key, sum(value)
from data
GROUP BY timestamp, key
order by timestamp
if data table stores data ordered by timestamp and hash table maintains insertion order then sorting is not required after aggregating all rows because iterating heap produces the right order.
> If data is already sorted and you have an 'order by' then just use the data directly – bingo, instant merge join, no hash table needed.
I reckon keeping data on heap in insertion order isn't that useful for joins because hash table is used for lookups while iterating the other table (so the main table determines output order). Where it could help is e.g. storing results of GROUP BY. For query such as:
SELECT timestamp, key, sum(value) from data GROUP BY timestamp, key order by timestamp
if data table stores data ordered by timestamp and hash table maintains insertion order then sorting is not required after aggregating all rows because iterating heap produces the right order.