Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
The Background for CAR and CDR (howardism.org)
46 points by tosh on Dec 27, 2021 | hide | past | favorite | 22 comments


> Whew. This is why a twenty-first century language, like Racket, uses archaic, un-helpful fifty year old terms. While Matthew Flatt and team could have changed these terms, they were keeping with a long tradition.

To be fair, Racket and LISP do have the procedures `first` and `last` for accessing lists, but also keeps `car` and `cdr` for accessing either pairs or lists. Clojure did away with `car` and `cdr`, but it also did away with the ability to make plain pairs from `cons`.

Racket:

  > (first (cons 1 '())) # w list
  1 
  > (car (cons 1 '())) # w list 
  1
  > (first (cons 1 2) # w pair
  ERROR
  > (car (cons 1 2)) # w pair
  1
Clojure:

  > (first (cons 1 '())) # w list
  1
  > (cons 1 2) # pairs not allowed
  ERROR
https://docs.racket-lang.org/reference/pairs.html


Common Lisp has FIRST, SECOND, ..., TENTH and REST.

http://www.lispworks.com/documentation/HyperSpec/Body/f_firs...


Sure does. In my own code I prefer `NTH` for anything beyond the cadr. Or just use arrays for random access.


Note that "second" will not work as intended if you just have a pair; it access the second element of a list. I rarely find any use for anything beyond first, second, and rest; if I need to access nth elements I will likely switch to a different data structure.


Thanks, I didn't know that but edited my comment accordingly.


Racket doesn't keep any "tradition"; the traditional, mutable conses are found under different naming here:

https://docs.racket-lang.org/reference/mpairs.html

Moreover, "a mutable pair is not a pair; they are completely separate datatypes".

So, I think, you can't make a piece of syntax out of these and pass them to the evaluator or compiler.

So, this doesn't even give a tip of the hat to the tradition of Barbara Liskov's substitution principle where you'd expec that a "mutable pair" is-a kind-of "pair", and so on.


> Clojure did away with `car` and `cdr`, but it also did away with the ability to make plain pairs from `cons`.

And it also means that more things than lists can be accessed as sequences using iterators.

The whole "Have to recurse the entire chain to figure out if something is terminated with nil/() and therefore a list" is part of the reason why everybody bangs on about tail recursion instead of just writing an imperative loop.


One of the things I find fascinating about the use of the names 'car' and 'cdr' is the 'c' part of it. As a developer who rights a small to moderate amount of assembly, I would tend to refer to the value in, say, 'r8' as 'r8' almost always -- I recognize the difference between "the register named r8" and "the value stored in the register named r8", but it's rare that I need to draw that distinction.

On the other hand, in writing (and more importantly in teaching!) C and C-style languages, I find that one of the strongest tools for ensuring understanding is to distinguish between "the variable named x", "the value of the variable named x", and "the value at the memory location named by the value of the variable named x." In C we use * for part of this distinction; it seems like the 'car'/'cdr'-era equivalent would be to distinguish between 'r8' and 'cr8'/'*r8'.


My understanding is that the original vacuum tube based IBM computers on which Lisp was implemented were accumulator based architectures (ie. had one register usually named "accumulator") with some instructions using parts of the accumulator in particular harwired ways.

Also, for CISC architectures that can do indirect loads on the ISA level (nowadays this is the only remaining practical difference between RISC and CISC), the difference between value of register (for i386: eax or %eax) and value referenced by that register ([eax] or $(%eax)) leads to different syntax of the assembly with same mnemonic.

i386 with intel assembly syntax is particularly good example of this as 'mov' refers to at least a two dozens of different opcodes depending on the syntax of its two arguments and current mode of the assembler.


What's mind boggling is that in this historic thing called FLPL (Fortran List Programming Language), these names were complicated further with X...F wrapping. So CAR was spelled XCARF, CDR spelled XCDRF and so on.


> Although the exact origins of the names for car and cdr are lost in the mists of time

This is absolutely wrong. If you read AI memo 1 you will find a bunch of functions: cwr, cpr, csr, cir, cdr, ctr, car, cbr, csgr. These return various parts of an IBM 704 word (also called a register) or will set it when assigned to. As we know only car and cdr survived, but it's explained very clearly how they wanted to build list structures.


I was introduced to Scheme as part of a course in the tradition of SICP.

Hmm. I'm not sure how much using "pair, first, second" instead of "cons, car, cdr" would clarify things. The nature of "a list is either nil, or a cons of an item and a list" is entirely different from e.g. how you interact with Python lists. I'm not sure "a list is either nil, or a pair of an item and a list" is that much clearer.

But also, with that algebraic definition of a list, it often didn't require using car/cdr directly. During the SICP course, you find that you can solve the given exercises by either writing functions which use car/cdr, or by using the higher-order map/filter/flatten/fold. Or e.g. in Haskell, a list can be pattern-matched to x:xs.


This isn't a very good article. First of all, this has to be the most rehashed curiosa topic about Lisp, ever. Secondly, "keeping in touch with tradition" isn't a shallow thing you do for fun---it is both about identity and about the ability to read and reuse code, even though it is decades old. And thirdly, the reason you keep car and cdr isn't only about tradition, but also about the ability to quickly reach into nested lists:

(cdaddr '(1 (2 (3 4 (5) (6 7) 8) 9) (10 1 3) 3)) => (1 3)

Without these oldies, you would have to write several combinations of first, second etc. and rest, or some combination of first/rest and nth n list.


cdaddr is an unreadable monstrosity. Keeping it around for the sake of reusing good code from decades ago is one thing, but let's not praise it for giving programmers a compact way to retrieve a nested list. It is too easy to mistake "cdaddr" for "cddadr" in a large codebase and the bugs it would introduce will probably be very hard to deal with.

If your list structure is so complex that you are using cdaddr or anything like it, you did something wrong. You should use defstruct with human-readable slot names to create more readable and easier to work with code.


Actually, there is no need to make guesses about whether cdaddr is bad, unreadable, introduces hard to handle bugs or what the limits of list structure complexity should be. It isn't some new idea or brief historic oddity that you can pass whatever judgement on.

The term is at least 40 years old, standardized, still in use, has its own established pronunciation and isn't known to cause a lot of bugs. It is also at the periphery of the car/cdr body of terms -- i.e. programming tradition shows that the level of complexity it covers is needed (or the term would have disappeared), but any further is too rare to merit a shortcut. In addition, it is easy to remember what it does: each d is a cdr, each a is a car. So it isn't any harder to read than a chain of firsts, seconds, rests and nth n's.


A chain of first/second/rest/nth is also unreadable, as is the equivalent using plain car/cdr. Lists should not be used as a substitute for defining a structure with properly named slots.

I am a big fan of Lisp but it is an old language family that comes with some baggage. The "list processing" view of the language is part of that baggage. One of the greatest shortcomings of Common Lisp is the lack of standardized data structures -- no binary search trees, no priority queues, etc. The language encourages people to use lists as a one-size-fits-all structure, to the detriment of readability, maintainability, and often efficiency.

Sure, constructs like cdaddr have not created the mountain of bugs that we see from the "traditions" of C, but I suspect that has more to do with what you said: cdaddr is rarely used. Most professional Lisp programmers would find a better way to write their code, almost certainly using defstruct to define complex data structures and only using lists when they need list semantics.


First of all, there are plenty of different data structures in Common Lisp. I don't know any Lisp resource that will tell you or encourage you to use lists for everything. For example, I think all the major Common Lisp books encourage you, explicitly, to use the advantages of different data structures.

Secondly, there is no contradiction between the usefulness of structs and the usefulness of nested lists and the ability to reach directly into them. But structs are no more a universal solution than lists are, so I don't know why you would be so hung up on them---in particular since we aren't discussing a particular use case.

I think the "unreadable monstrosity" thing in its various forms boils down to non-lispers not being used to car/cdr. That may be a fact, but it isn't an argument (and it takes about 5 mins to remedy).


Where are the search trees, heaps/priority queues, deques, and so forth in the Common Lisp standard? There are arrays, lists, associative lists, and hash tables, and a few scattered functions for manipulating trees of cons cells. Beyond that you are on your own, stuck either implementing textbook data structures and algorithms yourself or relying on third party libraries. Considering how well-developed the loop syntax and CLOS are, and even how many string manipulation functions are in the standard, I consider the lack of standardized data structures a pretty significant oversight.

I never said structs are a universal solution. It would make no sense to define lists using structs because lists already exist and are already useful as lists (and as stacks, which lists immediately give you). Sometimes a tree of cons cells works well and requires no further clarification. On the other hand, there are many cases where relying just on cons, car, cdr, and related functions results in hard-to-read, hard-to-change code. You are speaking of "usefulness" but I never doubted that you could get a working solution based on car/cdr, I said it is unreadable and I stand by that. You say that we are not talking about a specific use-case, but a function that accesses structured data in a very specific way without regard to any specific context or use-case screams "readability hazard."

Why should you dismiss someone who sees cdaddr and calls it an "unreadable monstrosity" as simply not knowing the language well enough? There is no need to get religious about a programming language. Common Lisp is a great language that I have been using for many years, both professionally and as my preferred language for personal projects. That does not mean that I should think everything in the language or everything that people commonly use are good ideas.


A simple deque in the spirit of unencapsulated lists can be made in Common Lisp using two back-to back lists. So that is to say, each end of the deque is a Lisp list, where we chase cdr going toward the middle of the deque.

Only the pop operation has to know about the arrangement. We make it a macro which cleverly transfers nodes from one list to the other to keep the amortized cost low.

https://www.kylheku.com/cgit/lisp-snippets/tree/deque.lisp [oops, where are the test cases?]

> That does not mean that I should think everything in the language or everything that people commonly use are good ideas.

I, however, provide the "cadavers" down to one level deeper in TXR Lisp than Common Lisp, from caar to cdddddr. Therefore, I'm not in a position to turn around and disown them as a bad idea.

There are people genuinely invested in various aspects of this stuff.


you don't need them in the language standard. Common Lisp is an extensible language and you would use one of the libraries available.

> On the other hand, there are many cases where relying just on cons, car, cdr, and related functions results in hard-to-read, hard-to-change code.

That's why Common Lisp has structures and CLOS. One builds on top of those.


cdaddr is very rarely used, because the ANSI CL versions of these functions only go four "letters" deep: car, cdr, caar, ..., cddddr.

For understandability, what you might want to reach for is:

   ;; d <- (cdaddr obj)

   (destructuring-bind (a b (c . d) . rest) obj
      d)
or else some pattern matching macro that doesn't require dummy variables up to d. For a single, one-off retrieval of a buried value, it doesn't compete with cdaddr.

But that particular ANSI CL construct isn't as forgiving; if the list doesn't have that (c . d) node, you get a destructuring error. Whereas cdaddr will only give you grief if any car or cdr step encounters a non-nil atom. You can't blindly substitute one for the other.

If you have to get several neighboring objects from the tree at that depth, using the cdaddr type function would be inefficient, because you're traversing the full path from the root multiple times. For that reason, efficient destructuring and pattern matching won't use them.

Still, in TXR Lisp, I provide these things to five levels deep.

Not only that, but I have provided numeric access to Lisp structure, which I borrowed from none other than ... Urbit!

Unlike Urbit, I have both big-bit-endian and little-bit-endian access; I call the functions cxr and cyr.

https://www.nongnu.org/txr/txr-manpage.html#N-01DA4F04

I seem to have under-documented this (probably also inspired by Urbit), but the test suite has a bit of coverage:

  (mtest
    (cxr 1 42) 42
    (cxr #b11 '(a . b)) a
    (cxr #b10 '(a . b)) b
    (cxr #b11000 '(1 2 3 4 5)) 4
    (cyr #b100001 '(1 2 3 4 5)) 5
    (cyr #b1111 '(((a)))) a
    (cyr #b111 '(a)) :error)

  (let ((r (range* 0 100)))
    (vtest (mapcar (op cyr (succ (expt 2 (succ @1))) r) 0..100) r)
    (vtest (mapcar (op cxr (* 3 (expt 2 @1)) r) 0..100) r))


One of the reasons they're still around in Common Lisp is to serve as a baseline for the set of cadr functions it provides.

Never use them myself, but I suspect once they're part of muscle memory they make dealing with lists very convenient.

clhs.lisp.se/Body/f_car_c.htm




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: