The term "null" comes from C. A. R. Hoare, who has proposed in November 1965 various additions to the programming language ALGOL 60, including pointers (called references by Hoare, "pointer" comes from IBM PL/I, in July 1966), and including a special value "null" for references a.k.a. pointers, for denoting unknown or not applicable values.
C. A. R. Hoare, 1965-11:
"In order to enable references to represent partial functional relationships, i.e. ones which do not necessarily yield a value, a special reference value null is introduced.
This value fails to refer to a record, and any attempt to use it to refer to a record leads to an undefined result."
In his proposal, Hoare followed COBOL 60 terminology for some of the additions to ALGOL: "record" instead of "structure" (a word introduced by IBM PL/I in December 1964), and "record class" instead of "structure type". That "class" was picked up by SIMULA 67, from which the word has spread into all OOP languages.
In Hoare's proposal, references a.k.a. pointers could point only to values belonging to user-defined types, i.e. records a.k.a. structures, not to primitive types.
I wouldn't necessarily define `null` as "unknown" -- it's just "no value", which amounts to much the same thing and is also somewhat equivalent to "unset". But a null pointer isn't really "unset": in C, pointers aren't automatically initialized to null, and you can explicitly set a pointer to null.
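To make that distinction concrete, here's a minimal C sketch (my own illustration, not from anyone's comment above) showing that an automatic-storage pointer starts out indeterminate rather than null, while making it null is an explicit, deliberate act; statics are the exception, since they're zero-initialized:

```c
#include <stddef.h>
#include <stdio.h>

static int *file_scope_ptr;   /* static storage: implicitly initialized to NULL */

int main(void)
{
    int *p;                   /* automatic storage: indeterminate, NOT null */
    int *q = NULL;            /* explicitly set to null: "no value", on purpose */

    /* Reading p before assigning it is undefined behavior, so we don't. */
    if (q == NULL)
        printf("q deliberately points to nothing\n");
    if (file_scope_ptr == NULL)
        printf("static pointers do default to null\n");

    return 0;
}
```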
E. F. Codd added nulls to the relational model in 1970, so that does pre-date C. The concept is even older than that, I imagine.
In nth normal form, you can't have 'no value'. That would mean your model is wrong. In academic relational data books, null does mean "unknown". There is a value, we just don't know what it is (yet).
If there might actually not be such a value, you're supposed to change your schema to reflect that.
> There is a value, we just don't know what it is (yet).
In all my years, I've never used null for that. If I don't have a value yet then generally I'm not writing any part of the record. I only ever use nulls for the absence of a value. Creating a new table for every potential optional column is the "academic relational way" and also insane. :)
What happens if your data is produced by some automated process such as a sensor reading and occasionally the sensor fails to return a value? NULL seems exactly the appropriate value to use.
I'm still a bit confused. Suppose you have another table, call it temperatures with columns id and temperature, where every row contains only a valid temperature (no NULL records), and you have a main logging table with date and temperature_id so that you can join on temperature_id = temperatures.id. This seems to be what you mean, with a canonical measurement record table related via the temperature_id foreign key.
But then if your sensor fails to record a measurement don't you end up with NULL for that row's temperature_id?
Ah but then how do you record that the measurement actually happened but did not produce a value? I want a record of that failure.
I mean sure, you could do yet another table. But honestly that level of normalization is much more work than it's worth. Just because it's some academic definition doesn't make it right.
You'd also have to ask when NULL came into common use in C (to which I do not know the answer). AFAIK NULL was not considered to be standard until C89. As far as I'm aware, all C compilers implement it as a #define in a standard header somewhere; it's not actually part of the core language itself.
I wonder who first added that macro? Was it there from the beginning?
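For what it's worth, here's roughly what that macro looks like in practice. The exact definition varies by implementation (the forms in the comment below are typical, not literal excerpts from any particular header), but C89 and later require it to expand to a null pointer constant, so it behaves the same as a plain 0 in pointer context:

```c
#include <stddef.h>   /* one of several standard headers that define NULL */

/* Typical implementation-provided definitions:
 *   #define NULL ((void *)0)
 *   #define NULL 0
 */

int main(void)
{
    int *p = NULL;            /* the macro, whatever it expands to... */
    int *q = 0;               /* ...compares equal to a plain 0 used as a pointer */
    return (p == q) ? 0 : 1;  /* both are null pointers */
}
```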
Going by Wikipedia, I see that SQL is from 1974 and C from 1972. Were there earlier uses/drafts where `null` is "unknown" instead of "unset"?