Really? I know that SQL is old but I would have expected `null` to refer to poin...

adrian_b · on Jan 10, 2025

The term "null" comes from C. A. R. Hoare, who has proposed in November 1965 various additions to the programming language ALGOL 60, including pointers (called references by Hoare, "pointer" comes from IBM PL/I, in July 1966), and including a special value "null" for references a.k.a. pointers, for denoting unknown or not applicable values.

C. A. R. Hoare, 1965-11:

"In order to enable references to represent partial functional relationships, i.e. ones which do not necessarily yield a value, a special reference value null is introduced. This value fails to refer to a record, and any attempt to use it to refer to a record leads to an undefined result."

In his proposal, Hoare has followed the COBOL 60 terminology for some of the additions to ALGOL, i.e. "record" instead of the word "structure", introduced by IBM PL/I in 1964-12, and "record class" instead of "structure type", hence the "class" of SIMULA 67, from where the word "class" has spread into all OOP languages.

In Hoare's proposal, references a.k.a. pointers could point only to values belonging to user-defined types, i.e. records a.k.a. structures, not to primitive types.

wvenable · on Jan 10, 2025

I wouldn't necessarily define `null` as "unknown" -- it's just "no value" -- which is really the same thing and also somewhat equivalent to "unset". But null pointers aren't unset as pointers aren't initialized to null in C and you can explicitly set a pointer to null.

E.F. Codd added nulls to relational model in 1970 so that does pre-date C. The concept is even older than that I imagine.

recursive · on Jan 10, 2025

In nth normal form, you can't have 'no value'. That would mean your model is wrong. In academic relational data books, null does mean "unknown". There is a value, we just don't know what it is (yet).

If there might actually not be such a value, you're supposed to change your schema to reflect that.

wvenable · on Jan 10, 2025

> There is a value, we just don't know what it is (yet).

In all my years, I've never used null for that. If I don't have a value yet then generally I'm not writing any part of the record. I only ever use nulls for the absence of a value. Creating a new table for every potential optional column is the "academic relational way" and also insane. :)

chongli · on Jan 10, 2025

What happens if your data is produced by some automated process such as a sensor reading and occasionally the sensor fails to return a value? NULL seems exactly the appropriate value to use.

recursive · on Jan 10, 2025

Then you're supposed to use another table with a foreign key to canonical measurement record. This is the concept of fully normalized schemas.

What you're describing is closer to how people do it in practice.

chongli · on Jan 10, 2025

I'm still a bit confused. Suppose you have another table, call it temperatures with columns id and temperature, where every row contains only a valid temperature (no NULL records), and you have a main logging table with date and temperature_id so that you can join on temperature_id = temperatures.id. This seems to be what you mean, with a canonical measurement record table related via the temperature_id foreign key.

But then if your sensor fails to record a measurement don't you end up with NULL for that row's temperature_id?

recursive · on Jan 11, 2025

Foreign key would probably go the other way:

LogEntry(LogEntryId, Date)

Temperature(TemperatureId, LogEntryId, DegreesF)

If there is no temperature measured, then you don't create a record in Temperature.

wvenable · on Jan 10, 2025

Ah but then how do you record that the measurement actually happened but did not produce a value? I want a record of that failure.

I mean sure, you could do yet another table. But honestly that level of normalization is much more work than it's worth. Just because it's some academic definition doesn't make it right.

recursive · on Jan 11, 2025

You put a record in the Measurement table, and none in the Value table.

> But honestly that level of normalization is much more work than it's worth

Yes. I question whether it's worth anything to begin with.

rawgabbit · on Jan 11, 2025

     Select M.*, V.* from Measurement as M left outer join Value_table as V…

You end up with nulls again.

recursive · on Jan 11, 2025

Yes. That's part of the semantics of outer joins.

But there is no stored null representing 'no value'.

floating-io · on Jan 10, 2025

You'd also have to ask when NULL came into common use in C (to which I do not know the answer). AFAIK NULL was not considered to be standard until C89. As far as I'm aware, all C compilers implement it as a #define in a standard header somewhere; it's not actually part of the core language itself.

I wonder who first added that macro? Was it there from the beginning?

Just random thoughts...