Doug McIlroy on Unix Taste (2014) (tuhs.org)
47 points by _acme on June 10, 2016 | hide | past | favorite | 64 comments


Doug's message brings to mind Chuck Moore, the creator of Forth, who is probably as fierce a "negative" coder as anyone has ever seen, and he extended his approach similarly to his stack-based hardware. Sadly, Forth (in any guise/implementation) has been largely written off by the world, and is probably better considered a divergence in evolution from today's computing trends. But I think the lessons it teaches remain relevant today--its minimalism, similarly "born of a taste for achieving big power by small means rather than by unbounded accumulation of facilities", is breathtaking, even if Forth is no longer quite practical to use in many of today's software/hardware ecosystems.


Here, Chuck complains about bloated C programs: https://web.archive.org/web/20121028010948/http://www.colorf...

Of course, even C is a simple, tiny language compared to C++.


> An interesting question is how the corrective of taste manages ever to recenter the exuberance of evolution.

can someone please parse this portion of the reply for me? i think i can get a _glimpse_ of the meaning, but unfortunately can't quite grok its complete elegance...


It's essentially design vs. evolution. Evolution has no taste -- it only cares what works. If a system is full of contradictions and inconsistencies and redundancies, that's OK. What survives is right, by definition.

Design implies a coherent intention achieved by a single person or small group of people. Unix was designed by a relatively small group at Bell Labs, with a distinct culture.

Linux is probably the best example of evolution, and Linus has explicitly pledged his allegiance to this force. The irony is that Linux would have never succeeded if it were a greenfield project; it needed the "bones" of Unix in order not to collapse on itself. It needed the 20+ years of tribal Unix knowledge spread throughout the world. (See "In the Beginning was the Command Line" by Stephenson)

I do think McIlroy is right. You need taste to "correct" the ugliness of evolution, at least in software. Evolution is important, but so is design. Like all things, neither extreme is right.

Linux exploded on top of the design of Unix. It's time for more design.

Here is a good example of design errors Linux has been making when trying to innovate without the lessons of the Unix model: https://lwn.net/Articles/574317/

"To begin with, he said, much of what is being done with control groups is new; all of it is new to Linux in particular, and some is new in general. So the community did not have any sort of model to follow when designing this new feature."


A lot of the "design" of Unix was just happenstance, though. It didn't start from a tabula rasa, but mostly by reimplementing existing systems within a very constrained environment. And by the time you arrive at what's recognizably "Unix" for most people, quite some evolution had already taken place (at least V3, probably more like V5).

Just look at all the ad-hoc languages and semi-languages you've got there. Or the source code for the Bourne shell. roff formatting. Then add BSD stuff on top of it (e.g. vi, sockets), the X window system, some more outside stuff (emacs, TeX) and you've got a bewildering plethora of incongruent systems way before Linux even splattered onto the scene.

And I would say that McIlroy's biggest invention (pipes) is the prime reason why this "mess" actually worked out, by providing a common way to tie these baroque and/or ad-hoc systems together.

Unix never was a bright and shiny diamond of singular design beauty (there we'd have to look at language-based systems, IMHO, like Smalltalk, Lisp Machines, Mesa or Oberon).

As for editors, well, with any application you end up having to trade simplicity for speed or power. For straightforward text manipulation, "shelling out" is often a good alternative -- those tools might be arcane and baroque too (it's not like regular expressions are particularly intuitive), but at least it's something you can use in other contexts, too. But when it comes to speedy and expedient movement through text and files, it's hard to beat a keyboard. And yes, I've used both acme and sam.

Why would a shortcut be the epitome of mystery anyway? Sure, it's not evident on my screen/keyboard etc., but neither is having to use "==" for equality, '.*' for "match some characters" or, heck, these weird English words I'm now typing.


I disagree. The main design idea behind Unix is composition. Uniformity enables composition, which means the overall system is smaller and doesn't collapse under its own weight. That means it survives, which we have observed empirically.

Composition shows up in the following ways:

1) Everything is a file -- open() / close() / read() / write() are polymorphic, with ioctl() and others for the necessary hacks that don't fit the model. Devices are files; sockets are files; pipes are files; /proc is a file system and not a bunch of C APIs. Compare with Windows. This is not to say there's no cost to this uniformity. There is -- serialization and escaping bugs -- but in practice the benefit seems to have outweighed the cost. You can't overstate the importance of composition, because without it your code collapses under its own weight.

2) In addition to files composing, file systems compose. mount() is a way of composing a uniform namespace from disparate entities. (Not true under Windows)

3) Processes compose. You can do stuff like "sudo chroot ls ...". fork/exec allows one process to set state for another, without code changes to the child. (Also not true under Windows)

4) Language heterogeneity. Via the simple mechanism of the shebang line, shell can invoke awk/sed/Python/Perl/Ruby/JS in the same way. And the shell itself. (You could call this an instance of the Composite Design Pattern).
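Points 1 and 4 can be seen in a one-liner (a minimal sketch; the data and pipeline are invented for illustration): programs written in different languages compose freely because every interface is the same byte stream.

```shell
# Each stage is a separate process; none of them knows what is upstream
# or downstream -- they only read and write bytes.
printf 'pear\napple\npear\n' |
  sort |                   # a compiled C utility
  uniq -c |                # another C utility
  awk '{ print $2, $1 }'   # an interpreted awk program, invoked identically
# Output:
#   apple 1
#   pear 2
```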

Unix is certainly messy NOW, but that's because it is so widely used that everyone needed to add their own stuff to it (threads and shared libraries being very popular and arguably anti-Unix). You can try to make something "cleaner", but it will likely collapse under its own weight. There is no uniform model that encompasses all of computing. It's all little truths in specific situations. That's why there are so many little languages.

I've had this argument before on HN: Lisp is not a better Unix. Lisp is solving a simpler problem; decades of experience have shown that. Unix is solving a harder problem, doing it in a somewhat simplistic and strained fashion, but it's the best we know how to do right now. Neither Lisp nor Smalltalk solves the same problem by any stretch of the imagination.


> The main design idea behind Unix is composition

The design was "strip away as much of Multics as needed to get it running on this crappy computer". That after a few iterations this might generate a philosophy of composition is another matter entirely. And I'm not really arguing against that, at least for a very small subsection of Unix's timeline (around v5-ish?). But the assumption that this was all a masterstroke of, erm, intelligent design seems a bit outlandish.

It seems to have been an element of Plan 9's design, though.

And let's not get into "decades of experience" and "empirically". There's way too few data points and way too much simple economics muddying those waters. By the same argument I could say that Unix really failed, as the biggest applications (nay, the systems themselves) on today's variants are quite monolithic and/or isolated subsystems themselves (browsers, languages, most servers).


Plan 9 was basically more Unix than Unix. More things were files, and more things were file systems (Networking was done with file systems.)

The design ideas originated with Unix. They were quite explicit, read this book: http://www.catb.org/esr/writings/taoup/html/index.html


"The main design idea behind Unix is composition. Uniformity enables composition, which means the overall system is smaller and doesn't collapse under its own weight. That means it survives, which we have observed empirically."

The main design idea behind UNIX was trimming MULTICS to fit into a PDP-11 with a new architecture. More effective composition had already been demonstrated by Dijkstra's THE, the Burroughs B5000, and even MULTICS. Their method, an unsafe language + heavyweight composition, led to a system that was both enabled for composition and quite unreliable/insecure. Later, Wirth drew on pre-UNIX work to show how much better it could be done with Modula-2 etc.: the nastiness of the system hidden away, type/interface checks enforced, and otherwise just modules calling functions in modules. That's composition done right.

It's no surprise that the best technologies for improving reliability, maintenance, throughput, etc. have always involved working around, rather than with, the UNIX model.

" That means it survives, which we have observed empirically."

Empirically, we've observed COBOL on System/360 survived with other mainframe systems. As did Windows monolithic kernels and apps. DOS was all over the place. What survived tells you more about economics and sociology than whether an approach is technically superior.

"Lisp is not a better Unix."

http://www.symbolics-dks.com/Genera-why-1.htm

Yet UNIX still can't do all of the above, despite matching or exceeding many items on that list through sheer labor. Those systems rarely crashed, had OS + apps in the same language, took you right to the OS code on an OS-related error, and let you patch live systems to keep them running. On top of LISP advantages like incremental, per-function compilation. I'm a Linux user who doesn't code LISP. Even I can see that the UNIX + C architecture can't touch LISP architectures in the combination of flexibility and safety. And no, that's not a simpler problem at all, given how many PhDs have tried, failed, or only approximated such results in UNIX + C. It's just a simpler, consistent, designed-in approach to solving hard problems, instead of scaling up a platform that chopped down ALGOL and MULTICS to fit on a PDP-11, plus some arbitrary changes.


UNIX started getting messy the moment they added sockets and ioctl to it.

I really don't know where this ideal model of UNIX lives.

All your "not true under Windows" can be extended to all other OSes that aren't UNIX.

Windows isn't the only non UNIX OS, although it appears to be like that to most HN eyes.


The way to have a system with an ideal model is to not use it... just publish a paper about it, and then lament that nobody gets it. Even though it hasn't been tested in many different situations and settings, you can still be smug about it.

Whenever something hits the real world, people will add hacks to solve their immediate problems. That's just a property of software that has nothing to do with Unix.


So UNIX being a pile of hacks is ok, but other OSes having their quirks is bad design?


No, my claim is that Unix has good "bones", and Linux is all the hacks on top that make it work in the real world (more diverse hardware than Unix ever saw, diverse users, alternate models like Android and JVM, etc.) And that Linux needs some design leadership right now.

When you have good bones, the hacks have something to rest on. The system still has some coherence, and doesn't degrade into a pile of mud. Linux is wading dangerously into that territory.

When something is widely used, there is no option to not have hacks... you either have bad architecture and hacks, or good architecture and hacks.

There was a thread here a few days ago discussing how NT is the same... it has good bones from Cutler, but then Microsoft had to layer all the compatibility hacks on top, for Windows 95 and earlier. That's probably true to some extent.

I guess I'm mainly saying that Windows is inferior in terms of system administration and scripting: shell scripts, fork/exec, shebang lines, the file system abstraction, etc. Threads and async I/O were not in the original Unix model, so it's perhaps not surprising if Windows does that better.


The only good bones UNIX has are that it was free.

If AT&T had been allowed to sell it, I doubt very much the market would have seen such good bones.

So, as with any free gift regardless of quality, people build gadgets on top of gadgets and think that is the best there is.

The UNIX-Haters Handbook still holds quite some truths when checked against an actual version of HP-UX or other commercial UNIXes.

A fact of the UNIX free culture is that people only started contributing to gcc after Sun decided to stop offering the developer tools.

Some of us prefer doing administration of machines via tools like Active Directory or have an OS that has security access lists applied to every kernel object.

Also, UNIX has lots to learn from mainframe OSes like IBM i, z/OS or Unisys MCP.

But their owners sell them; they weren't forced to give the source code to universities for free.


Unix interfaces have indeed stood the test of time, but how many applications nowadays are using them directly, without layers upon layers of abstractions? Most of the programmers I meet develop software that ends up running on some kind of Unix, yet are clueless about the system calls and OS-level abstractions. If Unix interfaces were truly successful, we would be writing most of our applications with nothing but the C standard library and the few hundred syscalls.


I don't really agree with that -- layers are necessary to build bigger systems and control complexity. If the programmer ends up ignorant of the syscalls, that's analogous to being ignorant of how many bytes of space a Python dict or list takes. (You can find out if you want, but you don't have to.)

As long as it's possible to build these systems efficiently on top of the kernel, that means the kernel succeeded. ioctl() is the hack where the kernel APIs failed.

However, there was an excellent paper on HN a few weeks ago: https://news.ycombinator.com/item?id=11652609

Their observations about IPC are great. Android, OS X, and Ubuntu/Linux have their own IPC abstractions with structured data. So arguably for this class of applications, Unix APIs failed. (But note that Android is Linux but not Unix, because it's basically a single-language OS, which uses single-language IPC.)

"We observe across all three OSes that no POSIX calls belonging to traditional POSIX IPC are among the frequently invoked operations. As discussed earlier, Android departs from the traditional IPC in favor of higher-level IPC abstractions, namely Binder, the core Android message passing system. Similarly, OS X supports higher-level IPC primitives built atop Mach IPC, which diverged from POSIX since its inception. Finally, Ubuntu also provides applications with message passing capabilities based on D-Bus."


> The main design idea behind Unix is composition

I'm not sure how well-supported this is in the historical record.

You give /proc as an example, but its first appearance was in the 80s, and it took until Solaris in the early/mid 90s to really get going. ( https://blogs.oracle.com/eschrock/entry/the_power_of_proc )


It is definitely well-supported -- I recommend reading http://www.catb.org/esr/writings/taoup/html/index.html (available in print too)


I don't consider ESR to be an expert in anything except promoting himself.


> Uniformity enables composition

Type compatibility enables composition. That is, given functions `foo :: A -> B` and `bar :: B -> C`, you can form the composite `bar . foo :: A -> C`. Unix provides a very simplistic solution that guarantees type compatibility (all I/O is byte stream I/O) at the price of creating other problems (what should be the result of `lseek()`ing a socket?).


Unix is the case where A == B == C. e.g. Pipes are an algebraic monoid (not monad).

I noted in my message that Unix polymorphism isn't perfect, and comes with the costs of serialization and escaping. However, I challenge anyone to design something better with types. It's not like people haven't tried.

From Alan Perlis: "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures."

That is the essence of Unix (and Lisp for that matter). You don't want A, B, and C; you want just A, and then 100 functions on it that can be composed.
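The monoid claim above can be checked directly at the shell (a toy sketch with invented data): `cat` acts as the identity filter and `|` is the associative composition.

```shell
# Identity: inserting `cat` into a pipeline changes nothing.
printf 'b\na\n' | cat | sort              # same output as plain `... | sort`

# Associativity: grouping the stages either way yields the same stream.
printf 'b\na\n' | { sort | tr a-z A-Z; }
{ printf 'b\na\n' | sort; } | tr a-z A-Z
# Both print:
#   A
#   B
```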


> Unix is the case where A == B == C. e.g. Pipes are an algebraic monoid (not monad).

Who said anything about monads? I only established the conditions for composition to be possible. Composition is the binary operation of a category, and, while every monoid is a category in a very trivial way, there are categories with more interesting structure. But, categorical blablabla aside, what matters is that making everything a byte stream isn't the only way to make things composable.

> From Alan Perlis: "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures."

Well, Perlis is wrong on this one. It is better to have types and operations that reduce the number of ways in which your program can possibly fail.


But the point is that types don't belong at the OS level. You can have your type composition in user space, at the language level -- what's the problem with that? Haskell and OCaml run on Unix just fine.

It would be a mistake to bake types into the OS -- whatever that even means -- because nobody has found the perfect type system. For every type system, there are some things that don't fit in the model. (This reminds me of some theorems in the foundations of mathematics...)

There are even libraries for writing shell scripts in Haskell and OCaml. If there were some inherent advantage to doing so, they could easily replace shell scripts. But I predict they will not.


> But the point is that types don't belong at the OS level. You can have your type composition in user space, at the language level -- what's the problem with that?

I don't do kernel programming, so I don't care what kernels do internally. As an application programmer, I want usable APIs. I want an API that makes resource leaks inexpressible, that doesn't let me seek intrinsically sequential streams, etc. Of course, such an API can't possibly be a C API.

> Haskell and OCaml run on Unix just fine.

Haskell has an I/O model that's barely safer than C's. The only improvements are that the `Handle` type is abstract and the `bracket` function is a `try...finally` of sorts. OCaml fares even worse: Can I open a single file for both input and output with just the standard library? In either language, is there a comprehensive solution for manipulating memory-mapped files?

> It would be a mistake to bake types into the OS -- whatever that even means -- because nobody has found the perfect type system.

Making a usable interface to OS services doesn't require a “perfect” type system, just a type system that prevents the most common mistakes that happen when using OS services. OS services are to a large extent about manipulating resources, so substructural types can help a lot.


a nice explanation. thank you! i do have a follow-up question though:

> I do think McIlroy is right. You need taste to "correct" the ugliness of evolution, at least in software. Evolution is important, but so is design. Like all things, neither extreme is right.

do you think 'correction' might actually prune some branches of evolution? but, honestly, i think that is like predicting the future...


Yes, design is pruning the "bad" branches of evolution. You are ruling out what you think are bad features / designs.

You could be wrong at predicting the future, in which case your system will fail. And that happens all the time. But someone else will succeed (I guess that is another level of evolution, among competing systems rather than evolving features within a system.)


The evolution is opportunistic, it doesn't favour elegant design. Just look at the human body with all its staggering complexity.

Software, if written and adjusted incrementally, tends to evolve in a similar way.

Now, what Mr. McIlroy asks is how can personal taste (i.e. ability of a single person to create small and simple designs) correct that big evolutionary trend.


Software, if written and adjusted incrementally without refactoring, tends to evolve in a similar way.



Negative coding. Eliminate cruft. Drop "features". Go small. Stop modifying the same program to do more. Finish.

Open to interpretation but that's my take. You use the word "elegance" and I think that's the idea. Alas, software usually does not "evolve" toward elegance.

tr -h | wc


    $ tr -h
    tr: illegal option -- h
    usage: tr [-Ccsu] string1 string2
           tr [-Ccu] -d string1
           tr [-Ccu] -s string1
           tr [-Ccu] -ds string1 string2


That's GNU, I'm guessing. BSD is a little different.


TR(1) BSD General Commands Manual TR(1)


FreeBSD. Got it. But not OpenBSD, NetBSD, or Solaris. Interesting how even with a relatively simple utility like tr, changes were made.


I've been thinking a bit about this lately, with my OS dev hobby. If you look at a project like xv6, and its implementations of the essence of coreutils... compare:

https://github.com/mit-pdos/xv6-public/blob/master/cat.c to http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f...

https://github.com/mit-pdos/xv6-public/blob/master/echo.c to http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f...

Or even something like https://github.com/mit-pdos/xv6-public/blob/master/grep.c

That's not exactly to say these things are _better_, mind you: the tools have grown complexity for a reason. The simple versions are very much ASCII-focused, for example...

But sometimes, I wonder. I barely use any of the more advanced features of most of my tools anyway. Should I be using some distro or OS which mostly keeps it simple?


Oh, it gets much better. Here's printf being traced in detail:

http://blog.hostilefork.com/where-printf-rubber-meets-road/

Seeing this, one realizes there is something fundamentally wrong with how their language or libraries work. ;) Personally, I think these things are perfect examples of where a language providing macros, DSLs, and/or generic interfaces can be much more readable. You could just describe the grammar or whatever with ties to specific, low-level functions for the results. Compiler does the rest. Racket people do that kind of stuff all the time. So do tools from sklogic, REBOL, RED, Julia, Nim... Inherent limitations of C and their coding approaches combining to create trouble. Plus legacy effects.


grep and regular expressions in general are a great example of something that Unix seems to excel at but that leaves me cold: shitloads of power, but all dependent on learning arbitrary syntax by the bucketload, thus requiring a time investment I just don't have to make.

I marvel at and am impressed by the things that clever shell command chains or carefully constructed regexes can do, but the gap between being able to do those things and when you're awkwardly hunting through man pages and 40 year old docs just to do basic things is huge, and I so rarely need to actually do those things that putting that kind of time in is rarely worth it. Is 3 hours to save 5 minutes really worth my time?

99% of my use of grep is just a simple string search. I don't need an entire esoteric programming language to do that.


"arbitrary syntax"… by that implication, isn't all syntax arbitrary?

Your 99% use case is handled by "grep needle". You don't need an entire esoteric programming language to do a simple string search.
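For that 99% case there is also `grep -F` (historically `fgrep`), which disables regex interpretation entirely and searches for the literal string -- a quick sketch with made-up input:

```shell
# With -F, '.' is just a dot, not "any character".
printf 'a.c\nabc\n' | grep -F 'a.c'
# Prints only:
#   a.c
```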


That's exactly the point.


Just to complement: he is a notable user of the "sam" editor.

http://sam.cat-v.org/


This post is about more than emacs/vi. HN title should be renamed "Unix taste" because that is the subject line.

I wish more programmers thought like McIlroy. Good taste is rare in programming. He has it.


Ok, we changed the title from "Doug McIlroy on Emacs/Vi".


I think a similar example of evolution of tech going wild is the web.


Perhaps classic McIlroy: calling Emacs and vi "baroque efflorescences totally out of harmony with the spirit of Unix"


They are pretty words, for sure, but I'm not sure what "intricately designed blooming flowers" (baroque efflorescences) is supposed to even mean in this context.

It feels like a sesquipedalian circumlocution by an ebullient iconoclast :P


One of the meanings of “baroque” is “Extravagant, complex, or bizarre, especially in ornamentation”, or so DuckDuckGo tells me: https://duckduckgo.com/?q=define+baroque. Also see: https://www.cs.utexas.edu/~EWD/transcriptions/EWD03xx/EWD340....

And it's pretty clear that McIlroy used “efflorescence” as a metaphor for “proliferation (of features)”. This usage of “bloom” and its synonyms isn't uncommon.


Perhaps he's saying vi users should be using ed, and emacsers TECO. Or maybe just magnetised needles.


Isn't this what the "tick-tock" release model was meant to resolve? The first phase, the tick, is all about new, bombastic features; the tock is about removing unused features, tidying up the codebase in general, and focusing on performance.

I always thought it was quite a neat compromise - you get the best of both.


I don't understand his comment about less & wc. It appears to work for me:

    $ less --help | wc
        237    1298   11912


It has a ridiculously big help section. It is a pager!


That's not even the entirety of its help; the man page is gargantuan:

    $ man less | wc
        1530   10418   77708


So his complaint is that it is well documented and featureful?


Look in the man page. There's an entire section about how to edit the command line that is used when entering, for example, the command to view a different named file. That smells really bad - surely that mechanism shouldn't be so specific to a pager that it needs a huge section in the help file.

All in all, less does feel big relative to other programs adhering to the Unix philosophy. I tried to do a feature comparison between more and less just now but, unfortunately, on OS X:

    ~/src/c$ more --version
    less 418


So now you're complaining that "there's too much documentation"? (or that "once you're in less you shouldn't be able to view another file without quitting"?)


He's complaining that there's too much crap in a single program that shouldn't have to do anything other than page files.

It has been possible to view another file without quitting less since pretty much the dawn of Unix time by suspending less and launching another instance (or, more recently, as in for the last thirty years or so, by using a terminal multiplexer).


One could say that if a pager needs 1k+ lines of help, something is wrong. Complexity is usually not good. And never free.


Is the code particularly buggy or inefficient? If not, having a feature that you don't use doesn't really affect you. There's an argument that lets you squeeze together multiple blank lines into one. That's arguably superfluous, as you could pipe your text through a sed/awk/ed/perl/snobol script beforehand.

I've never used that argument. But I never had less crash on me because of it, and I don't think it caused a big security impact. If having that command means that every second procedure in the code has to handle special cases because of it, then yes, it's probably not worth it. If it's just a few lines in one place, plus another command line argument parsing section, plus a line in the docs, then why not?
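For reference, the blank-line-squeezing option being discussed is less's -s flag, and the point holds: a one-line filter gives the same effect in a pipeline (a sketch; the awk idiom is one of several that work):

```shell
# Collapse each run of blank lines to a single blank line, like `less -s`:
# print a line if it is non-blank (NF) or the previous line was (p).
printf 'a\n\n\n\nb\n' | awk 'NF || p { print } { p = NF }'
# Prints: a, one blank line, b
```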

Extraneous features can cause harm, but one shouldn't be too extremist about this and consider each case. When it comes to handling text, there are a lot of special cases that require little code and thus don't really hurt. That's one of the main reasons why GNU utilities might be considered bloated, but most of the time it's no big deal beyond philosophy/aesthetics. (I'm sure my Python/Perl standard lib has a lot of modules I'll never use, too)

On the other hand, I'm not quite sure whether I'll ever need the Tektronix emulation of xterm...


This is going against exactly what Doug is arguing, though: Unix taste includes having many small, perfectly-designed programs to achieve great power. Adding more and more features to something simple adds complexity, and therefore, in Doug's opinion, makes it less elegant.

Obviously `less` is a pretty bad example for this, because it's so simple. But that's his point-- it should just be a pager, that's it. But it doesn't just do that-- it does a great many other things, has all sorts of arguments that you can pass to it unnecessarily.

Regarding your xterm comment, you might want to consider the `st` terminal emulator, a suckless tool. It's really great, consists of one C file, and is extremely minimal.


Is it affecting you adversely in some way? Almost all the time you can write:

    something or other | less
and it does what you expect. The additional features and customization are there if you need them, and hide out of the way when you don't.

If he really wants a pager with fewer features, then plenty of others exist, like plain "more":

    $ more |& wc
         20      94     676
But people don't do that. They use less because it's better.


When you do `something | less`, the other bazillion options are loaded into less, using resources. If you want to have a quick look at the help, you can get overwhelmed by all the information.

Having so many options in less can be seen as over-engineering, bloating the system. There are duplicate features, features that probably don't belong in less, etc. It goes against the minimalism of Unix.

It probably isn't a big deal in this case, but as a general rule being simple and lean is a feature IMO.


> the other bazilion options are loaded in less, using resources

What kind of tiny little system are you using that you have to worry about less taking up too much of it?


I think SVR2 only included vi because it was forced to, right? For fun, look up Mark Horton and terminfo vs termcap.


If "recentering the exuberance of evolution" means BCPL -> C, then I'm okay with exuberance myself.


great to have my own gut feeling put so eloquently



