Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Really? I know that SQL is old but I would have expected `null` to refer to pointers at first.

Going by Wikipedia, I see that SQL is from 1974 and C from 1972. Were there earlier uses/drafts where `null` is "unknown" instead of "unset"?



The term "null" comes from C. A. R. Hoare, who has proposed in November 1965 various additions to the programming language ALGOL 60, including pointers (called references by Hoare, "pointer" comes from IBM PL/I, in July 1966), and including a special value "null" for references a.k.a. pointers, for denoting unknown or not applicable values.

C. A. R. Hoare, 1965-11:

"In order to enable references to represent partial functional relationships, i.e. ones which do not necessarily yield a value, a special reference value null is introduced. This value fails to refer to a record, and any attempt to use it to refer to a record leads to an undefined result."

In his proposal, Hoare has followed the COBOL 60 terminology for some of the additions to ALGOL, i.e. "record" instead of the word "structure", introduced by IBM PL/I in 1964-12, and "record class" instead of "structure type", hence the "class" of SIMULA 67, from where the word "class" has spread into all OOP languages.

In Hoare's proposal, references a.k.a. pointers could point only to values belonging to user-defined types, i.e. records a.k.a. structures, not to primitive types.


I wouldn't necessarily define `null` as "unknown" -- it's just "no value" -- which is really the same thing and also somewhat equivalent to "unset". But null pointers aren't unset as pointers aren't initialized to null in C and you can explicitly set a pointer to null.

E.F. Codd added nulls to relational model in 1970 so that does pre-date C. The concept is even older than that I imagine.


In nth normal form, you can't have 'no value'. That would mean your model is wrong. In academic relational data books, null does mean "unknown". There is a value, we just don't know what it is (yet).

If there might actually not be such a value, you're supposed to change your schema to reflect that.


> There is a value, we just don't know what it is (yet).

In all my years, I've never used null for that. If I don't have a value yet then generally I'm not writing any part of the record. I only ever use nulls for the absence of a value. Creating a new table for every potential optional column is the "academic relational way" and also insane. :)


What happens if your data is produced by some automated process such as a sensor reading and occasionally the sensor fails to return a value? NULL seems exactly the appropriate value to use.


Then you're supposed to use another table with a foreign key to canonical measurement record. This is the concept of fully normalized schemas.

What you're describing is closer to how people do it in practice.


I'm still a bit confused. Suppose you have another table, call it temperatures with columns id and temperature, where every row contains only a valid temperature (no NULL records), and you have a main logging table with date and temperature_id so that you can join on temperature_id = temperatures.id. This seems to be what you mean, with a canonical measurement record table related via the temperature_id foreign key.

But then if your sensor fails to record a measurement don't you end up with NULL for that row's temperature_id?


Foreign key would probably go the other way:

LogEntry(LogEntryId, Date)

Temperature(TemperatureId, LogEntryId, DegreesF)

If there is no temperature measured, then you don't create a record in Temperature.


Ah but then how do you record that the measurement actually happened but did not produce a value? I want a record of that failure.

I mean sure, you could do yet another table. But honestly that level of normalization is much more work than it's worth. Just because it's some academic definition doesn't make it right.


You put a record in the Measurement table, and none in the Value table.

> But honestly that level of normalization is much more work than it's worth

Yes. I question whether it's worth anything to begin with.


     Select M.*, V.* from Measurement as M left outer join Value_table as V…
You end up with nulls again.


Yes. That's part of the semantics of outer joins.

But there is no stored null representing 'no value'.


You'd also have to ask when NULL came into common use in C (to which I do not know the answer). AFAIK NULL was not considered to be standard until C89. As far as I'm aware, all C compilers implement it as a #define in a standard header somewhere; it's not actually part of the core language itself.

I wonder who first added that macro? Was it there from the beginning?

Just random thoughts...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: