Martin Pool's blog

jmason agrees

TAOUP: Open source and code reuse

I think the best and most insightful section of TAOUP is that on why software reuse drives programmers to free software:

Why do programmers reinvent wheels? There are many reasons, reaching all the way from the narrowly technical to the psychology of programmers and the economics of the software production system. The damage from the endemic waste of programming time reaches all these levels as well.[...]

Beneath the surface gloss of their demo applications, the components he is re-using seem to have edge cases in which they behave unpredictably or destructively — edge cases his code tickles daily. He often finds himself wondering what the library programmers were thinking. He can't tell, because the components are inadequately documented — often by technical writers who aren't programmers and don't think like programmers. And he can't read the source code to learn what it is actually doing, because the libraries are opaque blocks of object code under proprietary licenses.

Newbie has to code increasingly elaborate workarounds for component problems, to the point where the net gain from using the libraries starts to look marginal. The workarounds make his code progressively grubbier. He probably hits a few places where a library simply cannot be made to do something crucially important that is theoretically within its specifications. Sometimes he is sure there is some way to actually make the black box perform, but he can't figure out what it is.

This should be familiar to anyone who's tried to make nontrivial use of a complex closed library.

I wrote a while ago about why this meant the Objective C / "Superdistribution" idea of reusable closed components was doomed from the start.

Of course RMS has been saying this all along...

I think this is one reason why many of the best programmers want to work on open software, and will do it in their spare time even if they can't get a satisfactory day job. If you're going to bother to write the best code you can, you don't want it to be lost in a couple of years when the original project is killed for business reasons.

A.M. Kuchling on TAOUP

Python hacker amk reviews The Art of Unix Programming:

In spots there's too much history. [....] I don't really think anyone should really care about the fact that Uniforum, Ultrix, or NeWS once existed, whatever their influence has been.

There are occasional splashes of self-congratulation or silly assertion. My favorite silly assertion is a mention that "Commercial Unix distributions that have [removed] the BUGS section or euphemizing it ... have invariably fallen into decline."

I agree: it is an excellent book, though flawed by overindulgence of the author's pet ideas and projects.

TAoUP: The significance of patch

But something else happened in the year of the AT&T divestiture that would have more long-term importance for Unix. A programmer/linguist named Larry Wall quietly invented the patch(1) utility. The patch program, a simple tool that applies changebars generated by diff(1) to a base file, meant that Unix developers could cooperate by passing around patch sets — incremental changes to code— rather than entire code files. This was important not only because patches are less bulky than full files, but because patches would often apply cleanly even if much of the base file had changed since the patch-sender fetched his copy. With this tool, streams of development on a common source-code base could diverge, run in parallel, and re-converge. The patch program did more than any other single tool to enable collaborative development over the Internet — a method that would revitalize Unix after 1990.
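A rough illustration of the "less bulky than full files" point, using Python's standard `difflib` rather than diff(1) and patch(1) themselves. The file name `hello.c` and the 200-line text are made up for the example, and the sketch only generates a patch; it does not show one applying to a base file that has since changed.

```python
import difflib


def make_patch(old_text: str, new_text: str, name: str = "hello.c") -> str:
    """Return a unified diff that turns old_text into new_text."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="a/" + name, tofile="b/" + name
        )
    )


# A hypothetical 200-line base file with a one-line change in the middle.
BASE = "".join(f"line {i}\n" for i in range(1, 201))
EDITED = BASE.replace("line 100\n", "line 100, fixed\n")

# The patch carries only the changed line plus a few lines of context.
PATCH = make_patch(BASE, EDITED)
```

The context lines around the change are what let a tool like patch find the right spot even when line numbers elsewhere in the file have shifted.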

Joel on TAOUP

Joel Spolsky takes a look at esr's The Art of Unix Programming.

Let's look at a small example. The Unix programming culture holds in high esteem programs which can be called from the command line, which take arguments that control every aspect of their behavior, and the output of which can be captured as regularly-formatted, machine readable plain text. Such programs are valued because they can easily be incorporated into other programs or larger software systems by programmers. To take one miniscule example, there is a core value in the Unix culture, which Raymond calls "Silence is Golden," that a program that has done exactly what you told it to do successfully should provide no output whatsoever. It doesn't matter if you've just typed a 300 character command line to create a file system, or built and installed a complicated piece of software, or sent a manned rocket to the moon. If it succeeds, the accepted thing to do is simply output nothing. The user will infer from the next command prompt that everything must be OK.[...]

So you get these religious arguments. Unix is better because you can debug into libraries. Windows is better because Aunt Madge gets some confirmation that her email was actually sent. Actually, one is not better than another, they simply have different values: in Unix making things better for other programmers is a core value and in Windows making things better for Aunt Madge is a core value.[...]

Raymond does attempt to compare and contrast Unix to other operating systems, and this is really the weakest part of an otherwise excellent book, because he really doesn't know what he's talking about. Whenever he opens his mouth about Windows he tends to show that his knowledge of Windows programming comes mostly from reading newspapers, not from actual Windows programming. That's OK; he's not a Windows programmer; we'll forgive that. As is typical from someone with a deep knowledge of one culture, he knows what his culture values but doesn't quite notice the distinction between parts of his culture which are universal (killing old ladies, programs which crash: always bad) and parts of the culture which only apply when you're programming for programmers (eating raw fish, command line arguments: depends on audience).

There are too many monocultural programmers who, like the typical American kid who never left St. Paul, Minnesota, can't quite tell the difference between a cultural value and a core human value. [...]
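For readers who haven't met the "Silence is Golden" convention, here is a minimal sketch of it in Python. The function name `copy_lines` and its arguments are invented for illustration: on success it writes nothing beyond the data it was asked to copy and returns status 0; only a failure produces a message, and that goes to the error stream.

```python
import io
import sys


def copy_lines(lines, dst, err=None):
    """Copy lines to dst. Silent on success; one message to err on failure."""
    err = sys.stderr if err is None else err
    try:
        for line in lines:
            dst.write(line)
    except OSError as exc:
        print(f"copy_lines: {exc}", file=err)
        return 1
    return 0


# Usage: capture both streams to see that success produces no chatter.
out, err = io.StringIO(), io.StringIO()
status = copy_lines(["alpha\n", "beta\n"], out, err)
```

A caller, whether a shell script or another program, checks the return status rather than parsing a "done!" message.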

Joel often seems to have basically interesting ideas and then to mix in a lot of chaff that's either obvious or oversimplified. Having made a basically good point about the cultural differences between Windows and Unix, he seems to assume that the cultures are entirely fixed and unchanging, and Unix will never get a good GUI. This is more or less like self-righteous American car companies in the 60s assuming that Japanese cars would always be cheap, unreliable and nasty.

Honda/Acura NSX 2005 model preview (image: streetracersonline.com)

You need to distinguish the core values (solidity, reuse) from their accidental expressions (preference for command-line tools, terseness). Sadly neither Joel nor ESR does this very well at the moment. It is still an open question how to make graphical components that can be recombined as fluidly as Unix pipes, but perhaps we will know in ten years.

As an example, a number of free software programs are developing excellent log/debug/trace frameworks. distcc is one of them: many releases have turned up bugs in distcc, gcc, or the kernel that could not be reproduced on my machine. The trace mechanism turned on by DISTCC_VERBOSE=1 is good enough that in almost every case I've been able to work out what was wrong and prepare a fix by looking at the log file. (OK, in some difficult cases looking at it for a long time.)

This is a small departure from traditional Unix terseness but I think a valuable one. I don't claim to be the first person to do things this way but I do think it's an evolution based on the core values of wanting to help a technical user and being good to developers. Contrast it to the classic Windows system log message: "The operation failed because: Success."
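The general pattern of an environment variable switching on a detailed trace can be sketched in a few lines of Python. This is not how distcc implements DISTCC_VERBOSE; the variable name `MYTOOL_VERBOSE`, the logger name, and the function `configure_trace` are all hypothetical.

```python
import logging
import os


def configure_trace(env=None, name="mytool"):
    """Return a logger that is quiet unless MYTOOL_VERBOSE=1 is set."""
    env = os.environ if env is None else env
    verbose = env.get("MYTOOL_VERBOSE", "") == "1"

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG if verbose else logging.WARNING)

    if not logger.handlers:
        # StreamHandler writes to stderr by default, so normal output stays clean.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(name)s[%(process)d] %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

By default the program stays terse; a user who hits a problem reruns with the variable set and sends in the log, which is the progressive-disclosure idea in miniature.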

Does it help Joel's auntie? No, but then she probably doesn't need a distributed compiler anyhow. Does it help, say, Joel's niece who's getting into Linux for the first time by installing Gentoo? Yes, it just might. If you called it "progressive disclosure" you'd start to sound like a UI designer, or enough to fool a bearded unix-head anyhow.

I agree that it is a real weakness of esr's book that he doesn't really understand any of the other systems he talks about, or why they might be good. The Practice of Programming does much better in this regard.

Tim has some thoughts on this too.

By the way...

More thoughts on TAOUP are in /weblog/books/taoup.

The Art of Unix Programming released

TAOUP has gone to press. I think it will be worth buying; for more review notes see /weblog/books/taoup/.

!CrackMonkey!

What happens when you feed monkeys crack, then give them a copy of The Art of Unix Programming?

(That makes me think running dissociated-press across TAOUP might be entertaining.)

dmullen: What's wrong with TAOUP?

dmullen asks of Aaron Swartz:

What, precisely, is so awful about TAOUP? Personally, I consider it to be one of the best books ever written on programming in general, let alone Unix programming. I found useful information and advice on every page.

I think TAOUP is a pretty nice book, and I will probably buy a copy when it comes out, and recommend it to my friends.

However, I don't think it is perfect, or above criticism. It might be in my top 20 books on programming but I don't think it would make my "top few".

What's wrong? Here are some complaints I prepared earlier.

I guess overall it seems to focus too heavily on "things the author thinks are interesting/funny". (I think that's a failure of later versions of the Hacker's Dictionary too.) Don't get me wrong: I'm all for authorial voice, and I know it's impossible to suppress completely. But in TAOUP's current draft, I think it is too obtrusive.

For example, a fair number of the examples are taken from esr's own projects, such as fetchmail. But I don't know anybody, aside from esr, who would say fetchmail is a really prime example of Unix design! Why not pick an example that really is broadly recognized as brilliant?

Has esr been lazy in just using examples from his own home directory? Or is he suggesting that his own designs are sublime and perfect examples of Unix? The kernel hackers who make special mention that esr's CML2 system was *not* merged might disagree. Either way it's a bit irritating.

One thing that sprang to mind is that rusty's "iprint" (apt-get install iprint) is probably more Unixy than esr's "ascii". (iprint's smaller :-)

Now certainly looking at the design of your own programs is easier, but if you're aiming to be the definitive work and explicitly comparing yourself to Knuth then I think you are obliged to go further afield. (Does The Art of Computer Programming deal mostly with code from TeX? No.)

Similarly, his discussion of the history of Unix, or of Unix compared to other operating systems, seems really skewed to his personal views. You can see his position on the open source/free software kerfuffle intruding. Of course history is shaped by the historian, but I think he does it more than is really needed or helpful.

Most of the problems seem like they could be fixed by more editing and constructive criticism before it's released. I wouldn't be surprised if a second edition, if there is one, is better. (Perhaps esr's trying for early-and-often in books?) Now it is perhaps a bit unfair to criticize it for this when it's not in print yet, but the web page gives the impression that it is nearly a final draft.

The tone and level is a bit uneven too. Sometimes it is very jokey and sometimes moderately formal. Some examples are a bit labored and some parts that I think ought to have examples are missing.

TAOUP does deal with a topic that has not been fully explored before, which makes a nice change from the two shelves of "C++.NET in 24 Hours for Complete Morons" at my local bookstore. It's not quite the only one there: The Practice of Programming, Patterns of Software, Code Complete, and Unix Network Programming approach different parts of the topic. If TAOUP irritates Swartz enough that he's motivated to try to do better then I think that's a good thing.

--enable-hijacking

Ken Arnold is quoted in The Art of Unix Programming:

I insisted SIGUSR1 and SIGUSR2 be invented for BSD. People were grabbing system signals to mean what they needed them to mean for IPC, so that (for example) some programs that segfaulted would not coredump because SIGSEGV had been hijacked.

This is a general principle — people will want to hijack any tools you build, so you have to design them to either be un-hijackable or to be hijacked cleanly. Those are your only choices. Except, of course, for being ignored — a highly reliable way to remain unsullied, but less satisfying than might at first appear.
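To make the "hijacked cleanly" idea concrete, here is a minimal sketch of a program using SIGUSR1 for its own purposes while leaving system signals such as SIGSEGV alone. It assumes a Unix-like system and Python 3.8 or later (for `signal.raise_signal`); the handler name `on_usr1` and the `STATE` counter are invented for the example.

```python
import signal

# Hypothetical application state that the signal is allowed to poke.
STATE = {"reloads": 0}


def on_usr1(signum, frame):
    """Application-defined meaning for SIGUSR1: count a reload request."""
    STATE["reloads"] += 1


def install_handlers():
    # Only the user-defined signal is claimed; SIGSEGV and friends keep
    # their default behaviour, so a crash still produces a core dump.
    # signal.signal() must be called from the main thread.
    signal.signal(signal.SIGUSR1, on_usr1)
```

Another process could then run `kill -USR1 <pid>` to request a reload without interfering with any signal the system itself relies on.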

I wondered if esr's "talmudic" concept for TAOUP was a bit wanky, but it turns out to work quite well.

Pre-review thoughts on TAOUP

Eric Raymond is writing a book called The Art of Unix Programming. It is coming out in August 2003, but a late draft (?) manuscript, version 0.86, is on his web site. In everything below, bear in mind that the book is not finished yet.

Overall, it is a pretty good book. It is interesting enough and a worthwhile contribution. There are some irritating bits where esr is too insistent on his particular view of the world, but they are outweighed by parts that can be broadly appreciated.

The momentary initial impression is of hubris. However good you think you are, should you really explicitly compare yourself to Knuth? Should something "dashed off" in a couple of years compare to somebody's life work? Too late now, I suppose. Perhaps I'm too harsh: it is after all a tip of the hat, and in scope and focus on the aesthetics it is not so far away from The Art of Computer Programming.

The Practice of Programming is quite similar in content, though more oriented towards practical tips for programmers on any platform than exaltation of Unix.

The Basics of the Unix Philosophy are nicely summarized, perhaps better than has ever been done before in one place:

Another thing that bugs me is the slightly preachy tone at times. I think if a thing is good, it is enough to merely describe its good features and allow audiences to draw their own conclusions. Pounding on about it will irritate your friends and alienate neutral readers. But this is largely redeemed by this epigram:

If you have any trouble sounding condescending, find a Unix user to show you how it's done.
-- Scott Adams, Dilbert newsletter 3.0, 1994

For example, in discussing Windows NT, Raymond doesn't discuss NT's Domain Security model, which I think is one of its most interesting design features and one to which Linux still has no complete answer. Is he unaware of it, or is it not important to him, or is he suppressing it to make Unix look better?

There are, however, genuinely interesting ideas behind his analysis of operating systems. There is a feedback effect on ease of "casual programming", which generates a larger developer base. (Cf. the Angry Monkey Dance.)

It can be hard to argue for a particular design school. Leaving aside religious/emotional attachments, aesthetics by definition make sense mostly on their own terms. Showing that a particular design approach has produced a good outcome in a particular example makes the reader wonder if the example was accidental or specially chosen. TAOUP is at its best when it explains the Unix philosophy with examples.

TAOUP does recognize and discuss in a useful way the tension between the desirability of small programs and the existence of desirable large programs.

Discussion of the problems with OO is also good, and as clear an explanation of the problem as I can remember.

For Stephane

Old photo of PDP-7 computer
