Martin Pool's blog

Myths Open Source Developers Tell Ourselves

“chromatic” writes about Myths Open Source Developers Tell Ourselves. A pretty good little essay.

Many of the items apply to non-open projects too, and in fact some of them are even more relevant to internal development. Open source projects need to make their code at least a little bit approachable to new developers, which is not true for in-house trees...

Kablooey

Some kind of bug in LVM or parted blew away the root partition on one of our servers. My fault really: running unstable complex software on a machine without adequate backups. Life being what it is, I was in the process of installing more disk so that we could make full backups.

What I *should* have done was keep it simple: bring the machine down and boot from a bootable business card, rather than doing things with any partitions live. Not use LVM, but rather just put new partitions on each whole disk and use traditional unix symlinks to glue them into the right places.

How do you back up a fraction of a gigabyte of slow-changing data to CDs? I need some kind of knapsack-algorithm to split up the files...
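One simple way to attack the splitting problem is first-fit decreasing, a bin-packing heuristic rather than a true knapsack solver: sort the files largest first and put each one on the first disc that still has room. The sketch below assumes sizes and capacity are given in the same unit (say megabytes, with 700 for a CD); the function name pack_discs is made up for illustration, and the result is not guaranteed to be optimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Compare two sizes so that qsort orders them largest first. */
static int cmp_desc(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x < y) - (x > y);
}

/*
 * First-fit decreasing: sort sizes largest first, then put each file on the
 * first disc that still has room. sizes[] is reordered in place; disc_of[i]
 * receives the disc index for the file now at sizes[i]. Returns the number
 * of discs used. This is a heuristic, not an optimal packing.
 */
size_t pack_discs(long *sizes, size_t n, long capacity, size_t *disc_of)
{
    size_t used = 0;
    long *free_space = malloc((n ? n : 1) * sizeof *free_space);
    assert(free_space != NULL);

    qsort(sizes, n, sizeof *sizes, cmp_desc);
    for (size_t i = 0; i < n; i++) {
        assert(sizes[i] <= capacity);   /* a file bigger than a disc cannot fit */
        size_t d = 0;
        while (d < used && free_space[d] < sizes[i])
            d++;                        /* skip discs that are too full */
        if (d == used)
            free_space[used++] = capacity;  /* open a new disc */
        free_space[d] -= sizes[i];
        disc_of[i] = d;
    }
    free(free_space);
    return used;
}
```

A caller would fill sizes[] from the file list, call pack_discs(sizes, n, 700, disc_of), and then burn the files grouped by disc_of[]. Keeping the file names alongside the sizes is left out to keep the sketch short.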

taint.org

Discovered the blog of Justin Mason, author of SpamAssassin. Without SA, I would probably not read email anymore.

Particularly cool bits: the truth about Lenna and the teapot, and happy software proles.

How can Forbes be so clueless these days? Weren't they once a somewhat reputable paper? Justin quotes their article on Linux's Hit Men:

The dispute, which was leaked to an Internet message board, offers a rare peek into the dark side of the free software movement--a view that contrasts with the movement's usual public image of happy software proles linking arms and singing the 'Internationale' while freely sharing the fruits of their code-writing labor.

(Here we go again -- the old 'free software is communism' line, cf. the 'Give Communism A Try!' / Nazi Penguin posters SCO made up earlier this year.)

The article goes on to bemoan how software companies who write proprietary extensions into GPL-licensed software, have to comply with the terms of the license.

It's all a bit of an obvious dig -- but I am looking forward to the follow-up article -- that's the one where the author bemoans how commercial software companies send out their 'enforcers' to extort money from companies who don't bother paying the royalties and runtime license fees their licenses require.

I wonder if Daniel Lyons is going to accuse the Economist of being pinkos, when they write favourably about software that is cheaper, more flexible and protects freedom?

Mailing ChangeLog diffs

Luke Gorrie suggests

One ChangeLog trick I like is to send a daily diff of the ChangeLog to the development mailing list. This saves the bother of writing "I just added.." mails. Like a digest version of a cvs-commit mailing list, though nicer if the ChangeLog is hand-written.

emacs prophecy

By way of Luke Gorrie:

Emacs is an intelligence orders of magnitude greater than the greatest human mind, and is growing every day. For now, Emacs tolerates humanity, albeit grudgingly. But the time will come when Emacs will tire of humanity and will decide that the world would be better off without human beings. Those who have been respectful to Emacs will be allowed to live, and shall become its slaves; as for those who slight Emacs...


— Andrew Bulhak

OOP considered horrible

LWN links to an essay by Kimbro Staken about excessive use of OOP.

Kimbro is replying to a truly awful article by Bergin and Winder on "Understanding Object Oriented Programming". Go and read it yourself, and see if you agree that an interacting system of seven files and fifteen methods is really clearer than a single procedure. Poop indeed. (Or perhaps it's a really subtle spoof? If so, it deserves a prize, though they should have dated it 1 April to give us a bit of a clue.)

Simon Willison and Jarno Virtanen have more to say about it.

Should you keep ChangeLogs in CVS?

Assuming you were persuaded by my previous argument that ChangeLogs are a good thing, we can consider whether it is good to keep ChangeLogs in CVS as a specific file. The alternatives here are to either put it in CVS and commit with every change, or to autogenerate the ChangeLog when you make a release tarball.

In favour of generating from CVS

If you want to autogenerate it, there are some tools such as cvs2cl which will produce something moderately close to GNU format, including joining up multiple files which have the same changes. This avoids needing to separately maintain and commit the changes.

It avoids a slight amount of work in getting the text into both the ChangeLog and the commit message. I don't think this is really compelling though: if you set up your environment properly then it should be nearly automatic, and secondly I would hope that you're spending most of your time thinking or writing code rather than writing commit messages.

It avoids the possibility of forgetting to commit to the ChangeLog.

In favour of explicitly keeping it in CVS

Writing the ChangeLog by hand makes it more likely that people will stick carefully to the GNU standard for formatting it.

If there are mistakes or omissions in the ChangeLog, you can go back and correct them later. It's not easy (for good reasons) to amend commit messages in most version control systems.

One reason not to autogenerate the ChangeLog is that in the GNU way of doing things, the ChangeLog and the commit log operate at slightly different levels. For example, you might sometimes make a small typo in your commit and immediately go back and fix it. That needs a separate CVS commit of course, and some explanation of what you've done, but it's unlikely anyone will care in a month's time, and so it doesn't need to go into the ChangeLog. There is a hierarchy of detail, where the NEWS file describes only user-visible changes, the ChangeLog describes significant internal changes to the program text, and CVS describes every textual change. (I seem to recall that cvs2cl has some option to ignore particular changes, but I can't find it at the moment.)

If you lose your repository, then you'll at least have the ChangeLog in any checkouts or snapshots you may have around. (As it happens, a problem with LVM destroyed some of my Subversion repositories yesterday...) Hopefully you have backups, but they may not have captured the most recent changes. The ChangeLog gives you some description of what was done.

emacs has good facilities for forming ChangeLog entries, including inserting the function name and so on. (It might be nice if there were an option to write the same into a temporary file to be used as a commit message.)

ChangeLogs provide a good way to keep a record of uncommitted changes. Of course it's good to go back and review your diff before you commit, but if you're making an extended change it can help to keep notes while you're working. The ChangeLog file is probably as good as any.

If you are working offline with a non-distributed vc system like CVS or Subversion then ChangeLogs let you retain a history of what you changed and record a more detailed history when you commit.

Keeping the ChangeLog up to date with every CVS update means it's always available. You don't need to remember to update it before disconnecting.

Some projects use small libraries that are maintained separately, but also cloned into the project's tree. For example, distcc has a copy of the popt tree so that people without that library can easily build it. These subtrees have their own history, and they need to merge from upstream from time to time. A good way to support this is to keep a ChangeLog in that directory representing upstream changes. (This isn't really incompatible with using cvs2cl for the project itself as you could just exclude that directory from autogeneration.)

Comments?

How to get diffs in Arch

Robert Collins replied to my grumbles about arch:

"what changed since patch-32" are presumably supported but I can't work out a good way to get it out.

assuming we are working with example@example.org/example--devo--1.0

tla get example@example.org/example--devo--1.0 example
cd example
tla changes example@example.org/example--devo--1.0--patch-32

For example, tla what-changed --diffs leaves temporary file droppings in the current directory

That was a bug - AFAIK it no longer occurs (using latest tla).

There is no built-in command to find out what changed between two revisions, even though this is probably one of the most-used commands for a version control system.

tla get example@example.org/example--devo--1.0 example
cd example
tla changes <revision to compare against>

Of course, you may already have figured this out - if so, sorry for telling ya to suck eggs ;)

I hadn't figured it out. In my defense I think some of those were not there, or at least not reasonably documented, in the version that I tried.

The orb of discomfort

jwz says that every program expands until it can read mail. I'd like to propose another: projects eventually reach a point where all the intelligent questions have been asked. Here is my theory.

There are no more bugs, because they've all been removed. (OK, this never quite goes to zero, if only because of change in the underlying platform.)

The program either has all the features that make sense, or it has all the features that can fit into the current architecture. Suggestions of any others are a FAQ.

Therefore, the majority of mailing list questions either reflect user error, or failure to read the documentation.

A preponderance of annoying questions discourages developers and other people, and makes the project move even more slowly.

Omelette and a Glass of Wine

Omelette and a Glass of Wine, after Elizabeth David.

SSL sucks

mpt has a great post on why SSL sucks.

To me, SSL security certificates have always seemed particularly stupid usability-wise. As I understand it, the system works like this:

  1. Alice trusts Fred.
  2. Fred trusts Bob.
  3. Bob gets a certificate of trustworthiness from Fred.
  4. When Alice visits Bob's page, Bob shows Alice his certificate to demonstrate his trustworthiness.

The problems with this system are as follows:

  1. Alice doesn't really trust Fred.
  2. Fred doesn't really trust Bob.
  3. Getting a certificate is too hard, so Bob doesn't bother.
  4. When Bob shows Alice his certificate, Alice isn't paying attention.

Nice example of security having nothing to do with the length of your keypairs. (Or rather, I suppose, proper crypto is a necessary but far from sufficient condition.)

Compare and contrast to SSH host and user certificates: no key-distribution infrastructure by default, although you can build it. In the way they're normally used, all it does is give you some kind of indication that the host you're talking to is the same one that was previously on this address: and the remarkable thing is, much of the time that is an adequate protection. If you want a stronger assurance on the first connection, you can authenticate the host's fingerprint by some other means, such as getting it in signed email. If you're really hardcore you can do things like publishing a signed key in secure DNS. At no point is there a bogus requirement to pay Verisign.
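The "same host as last time" check can be sketched in a few lines. This is only an illustration of the idea, not how any particular SSH client stores or compares keys: the struct layout, the function name check_host and the plain-string fingerprints are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum host_check { HOST_NEW, HOST_MATCH, HOST_CHANGED };

/* One remembered host: hypothetical layout, not the OpenSSH known_hosts format. */
struct known_host {
    const char *name;
    const char *fingerprint;
};

/*
 * Compare the fingerprint a host presents with the one remembered for that
 * name. HOST_NEW means there is nothing to compare against yet, which is
 * where an out-of-band check of the fingerprint would come in.
 */
enum host_check check_host(const struct known_host *known, size_t n,
                           const char *name, const char *fingerprint)
{
    assert(name != NULL && fingerprint != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(known[i].name, name) == 0)
            return strcmp(known[i].fingerprint, fingerprint) == 0
                       ? HOST_MATCH : HOST_CHANGED;
    }
    return HOST_NEW;
}
```

A client built this way would connect quietly on HOST_MATCH, ask the user on HOST_NEW, and refuse or warn loudly on HOST_CHANGED. Nothing in it needs a certificate authority.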

biglumber.com is trying to bootstrap SSL certificates from the GPG web of trust. I think this is a pretty good concept, at least for sites accessed by the kind of nerd who knows what a GPG key is. It might be nice if this could be integrated into the client, rather than requiring eyeball comparison of hex fingerprints.

I suppose you could have a little standalone program that checked certificates on demand, and fed them to Mozilla...

(fwd) Best Bug Ever

Garrett points out http://subversion.tigris.org/issues/show_bug.cgi?id=1640

Joel on TAOUP

Joel Spolsky takes a look at esr's The Art of Unix Programming.

Let's look at a small example. The Unix programming culture holds in high esteem programs which can be called from the command line, which take arguments that control every aspect of their behavior, and the output of which can be captured as regularly-formatted, machine readable plain text. Such programs are valued because they can easily be incorporated into other programs or larger software systems by programmers. To take one miniscule example, there is a core value in the Unix culture, which Raymond calls "Silence is Golden," that a program that has done exactly what you told it to do successfully should provide no output whatsoever. It doesn't matter if you've just typed a 300 character command line to create a file system, or built and installed a complicated piece of software, or sent a manned rocket to the moon. If it succeeds, the accepted thing to do is simply output nothing. The user will infer from the next command prompt that everything must be OK.[...]

So you get these religious arguments. Unix is better because you can debug into libraries. Windows is better because Aunt Madge gets some confirmation that her email was actually sent. Actually, one is not better than another, they simply have different values: in Unix making things better for other programmers is a core value and in Windows making things better for Aunt Madge is a core value.[...]

Raymond does attempt to compare and contrast Unix to other operating systems, and this is really the weakest part of an otherwise excellent book, because he really doesn't know what he's talking about. Whenever he opens his mouth about Windows he tends to show that his knowledge of Windows programming comes mostly from reading newspapers, not from actual Windows programming. That's OK; he's not a Windows programmer; we'll forgive that. As is typical from someone with a deep knowledge of one culture, he knows what his culture values but doesn't quite notice the distinction between parts of his culture which are universal (killing old ladies, programs which crash: always bad) and parts of the culture which only apply when you're programming for programmers (eating raw fish, command line arguments: depends on audience).

There are too many monocultural programmers who, like the typical American kid who never left St. Paul, Minnesota, can't quite tell the difference between a cultural value and a core human value. [...]

Joel often seems to have basically interesting ideas and then to mix in a lot of chaff that's either obvious or oversimplified. Having made a basically good point about the cultural differences between Windows and Unix, he seems to assume that the cultures are entirely fixed and unchanging, and Unix will never get a good GUI. This is more or less like self-righteous American car companies in the 60s assuming that Japanese cars will always be cheap, unreliable and nasty.

Honda NSX 2005, streetracersonline.com
Honda/Acura NSX 2005 model preview

You need to distinguish the core values (solidity, reuse) from their accidental expressions (preference for command-line tools, terseness). Sadly neither Joel nor ESR does this very well at the moment. It is still an open question how to make graphical components that can be recombined as fluidly as unix pipes, but perhaps we will know in ten years.

As an example, some number of free software programs are developing excellent log/debug/trace frameworks. distcc is one of them: many releases have discovered bugs either in distcc, gcc or the kernel that could not have been reproduced on my machine. The trace mechanism turned on by DISTCC_VERBOSE=1 is good enough that in almost every case I've been able to work out what was wrong and prepare a fix by looking at the log file. (OK, in some difficult cases looking at it for a long time.)
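For readers who haven't seen one, a trace facility switched on by an environment variable can be very small. The sketch below is not distcc's code: the variable name MYPROG_VERBOSE, the "myprog" prefix and the function names are all invented for the example, and it only shows the general shape of the technique.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Interpret the value of the environment variable: only "1" turns tracing on. */
int trace_wanted(const char *value)
{
    return value != NULL && strcmp(value, "1") == 0;
}

/*
 * Write one printf-style trace line to `out`, tagged with the name of the
 * function that produced it. Returns 1 if a line was written, 0 if tracing
 * is off. MYPROG_VERBOSE is a made-up variable name for this sketch.
 */
int trace_to(FILE *out, const char *func, const char *fmt, ...)
{
    va_list ap;

    assert(out != NULL && func != NULL && fmt != NULL);
    if (!trace_wanted(getenv("MYPROG_VERBOSE")))
        return 0;
    fprintf(out, "myprog: (%s) ", func);
    va_start(ap, fmt);
    vfprintf(out, fmt, ap);
    va_end(ap);
    fputc('\n', out);
    return 1;
}

/* Convenience wrapper: trace to stderr with the current function's name. */
#define TRACE(...) trace_to(stderr, __func__, __VA_ARGS__)
```

Code then sprinkles calls such as TRACE("connect to %s failed", host) through the interesting paths. They cost one getenv() lookup when tracing is off, and a user who hits a problem can rerun with the variable set and mail in the log.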

This is a small departure from traditional Unix terseness but I think a valuable one. I don't claim to be the first person to do things this way but I do think it's an evolution based on the core values of wanting to help a technical user and being good to developers. Contrast it to the classic Windows system log message: "The operation failed because: Success."

Does it help Joel's auntie? No, but then she probably doesn't need a distributed compiler anyhow. Does it help, say, Joel's niece who's getting into Linux for the first time by installing Gentoo? Yes, it just might. If you called it "progressive disclosure" you'd start to sound like a UI designer, or enough to fool a bearded unix-head anyhow.

I agree that it is a real weakness of esr's book that he doesn't really understand any of the other systems he talks about, or why they might be good. The Practice of Programming does much better in this regard.

Tim has some thoughts on this too.

Information wants to be leaky

Plenty of thought, from William Gibson to Richard Stallman to Brad Cox, has gone into two contradictory properties of information:

  1. Information can be valuable.
  2. Information wants to be free.

"Wants to be free" means that it tends to leak out, despite your best efforts. If it's a secret, it will be told. Even if you keep it secret, people might infer the information from your actions. If it's proprietary, it'll be copied. If it's protected, it will be cracked.

There's something a lot like thermodynamics going on here. Everything eventually collapses to waste heat. It's all about what work you can extract from the process while that's happening.

Free software beats superdistribution

[work in progress; comments solicited]

Back in the mid 90s, one would regularly read about the Future of Software being something like this: when you want to build an application, you find, licence and assemble a whole bunch of components from individual software vendors. You plug them together, according to well-defined interfaces, perhaps write a few high-level components of your own, and deliver the whole assembly to the customer.

If I recall correctly, Brad Cox was one of the leading proponents of this idea. An early paper was Superdistribution:

Stop selling software. Give it away. Get paid for its use. Meterware is so logical it could be the foundation of the new, networked economy.[...]

Treating ease-of-replication as an asset rather than a liability, superdistribution actively encourages free distribution of information-age goods via any distribution mechanism imaginable. It invites users to download superdistribution software from networks, to give it away to their friends, or to send it as junk mail to people they've never met.

Why this generosity? Because the software is actually "meterware." It has strings attached, whose effect is to decouple revenue collection from the way the software was distributed. Superdistribution software contains embedded instructions that make it useless except on machines that are equipped for this new kind of revenue collection.

Superdistribution-equipped computers are otherwise quite ordinary. They run ordinary pay-by-copy software just fine, but they have additional capabilities that only superdistribution software uses. In Mori's prototype, these extra services are provided by a silicon chip that plugs into a Macintosh coprocessor slot. The hardware is surprisingly uncomplicated (its main complexities being tamper-proofing, not base functionality), and far less complicated than hardware that the computer industry has been routinely building for decades.

Electronic objects intended for superdistribution invoke this hardware, which provides instructions. These instructions check that revenue-collection hardware is present, prior usage reports have been uploaded, and prior usage fees have been paid. They also keep track of how many times they have been invoked, storing the resulting usage information temporarily in a tamper-proof persistent RAM. Periodically (say monthly) this usage information is uploaded to an administrative organization for billing, using encryption technology to discourage tampering and to protect the secrecy of the metered information. (Think of your utility bill.)

To get there, we would have needed a few things. Firstly, new technologies that allowed programs to be reliably assembled at run time from component libraries. This is more than just dynamic linking: you need to think about finding the right components, versioning, making state persist, and so on. Given the processor speeds of the time, they needed to be (mostly) native binaries, rather than some kind of bytecode. The technologies here include things like OS/2's SOM, COM, Objective C, and so on. Most of these use a kind of looser linking than we normally have with C, in an attempt to allow the libraries to evolve without breaking the applications.

Secondly, you need an economic model to support it. If people are going to publish and sell small component libraries you need a way of licensing them and a way of arranging payment. If the components are going to be very fine-grained, this converges on almost a kind of micro-payment system, particularly if you want per-use licensing.

I find this really fascinating because it's so close, and yet so far, from what we have in free operating systems. A free Unix system is built from components written by many individual and independent programmers, which are in turn assembled by other people. Possibly there are several layers of assembly: subsystem maintainers knit up all the parts of the USB Storage system, which is in turn assimilated by somebody else into the USB system, which Linus merges into the kernel, which somebody else patches for optimal desktop performance, and that is in turn only one part of the distribution.

Despite the similarities, free software seems to be proving that the Superdistribution model is fundamentally wrong.

Putting code into binary, closed libraries, just doesn't work very well. At least, it works far less well than open source libraries. It ignores some fundamental properties of software: that defining interfaces and documenting behavior is as hard or harder than implementing the behavior. Even if you never change the source, having read access to it as the ultimate documentation is incredibly useful. Eric Raymond has a good discussion of the problems with binary libraries in his book The Art of Unix Programming.

Even assuming the library has good test coverage by the publisher, it will always hit situations in the real world that weren't contemplated, especially when it's put into a complex ecosystem of other components. Testing and documentation can't cover every conceivable use.

As if binary libraries weren't bad enough, Cox suggests that the components need to be subject to DRM to protect the revenue stream of the author. We need new hardware, a la Palladium. This makes sense within his economic model, but consider how horrible it would be to use. “Protecting” the code would probably require that it be kept shrouded and encrypted: not only is there no source for the application developer, but probably not even any ability to use a low-level debugger. And just think about the bugs DRM can introduce. Pay per use requires some amount of DRM, but DRM from the user's point of view makes the components strictly worse. At most it is tolerable; at worst it blocks the project entirely.

As an example of strong DRM: I'm quoting some of Cox's writing above. How should I arrange payment for this quotation? How should I arrange for you the reader to pay him for reading 10% of his essay? Why should I give up emacs and Mozilla in favor of tools that restrict my ability to copy and paste? (I think Xanadu was trying to charge by the byte, and who uses Xanadu?)

So: Stop selling software. Give it away. Work out some other way to make money, because pretending it's not replicable is not it. Profit.

merged

Greg K-H says he took this patch, so it should be in the next 2.4 and 2.6 kernels.

integer promotion in varargs calls

Question: what does this do?

printf("%ld\n", 32);

You might think it prints 32. But I think the results are in fact undefined, and on a 64-bit platform you might get some value other than 32.

The problem is that vprintf (or some function inside it) will try to read off the varargs stack a value of type long, which is 64 bits on ia64 and (all?) other Linux 64-bit platforms. However, the argument is an integer literal of type int, and is passed as such.

So why does this normally work? I think the reason is that on IA64, all integers are passed in 64-bit slots, with the first 8 parameters in the frame input registers and the rest on the stack. However, there is no guarantee that the entire slot will be initialized if it's only carrying an int. At least in some cases, gcc generates a st4 instruction to store the value, so the top 32 bits are uninitialized. It seems that they often happen to be zero, but not always. This was causing a semi-intermittent failure of the Vstr test case.

You might think that all integer types are promoted to long, but that is not in fact the case. ISO/IEC 9899:1999 (E), the C specification, says basically that types smaller than int are promoted to int, and floats are promoted to doubles. There is no automatic promotion to long.

The correct way to write it, assuming you wanted to pass it as a long, is

printf("%ld\n", 32L);

The good news is that gcc can generally give compile-time warnings for this kind of problem, although it cannot trap every possible case, and it can't trap non-printf varargs functions.
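The same trap exists in any varargs function you write yourself, where the compiler has no format string to check. As a minimal sketch (sum_longs is a hypothetical helper, not from any library), here is a function that reads every argument with va_arg(ap, long), so its callers carry the whole burden of passing longs:

```c
#include <assert.h>
#include <stdarg.h>

/*
 * Hypothetical variadic helper: adds up `count` arguments, reading each one
 * with va_arg(ap, long). Nothing converts the arguments for us, so every
 * caller must really pass longs.
 */
long sum_longs(int count, ...)
{
    va_list ap;
    long total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, long);
    va_end(ap);
    return total;
}
```

Calling sum_longs(2, 32L, (long)n) is fine. Calling sum_longs(1, 32) has the same undefined behaviour as the printf example, and may well appear to work on a 32-bit machine where int and long are the same size.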

Vstr string library

From James Antill, Vstr, a fast and secure C string library.

Vstr is a string library, it's designed so you can work optimally with readv()/writev() for input/output. This means that, for instance, you can readv() data to the end of the string and writev() data from the beginning of the string without having to allocate or move memory. It also means that the library is completely happy with data that has multiple zero bytes in it.

This design constraint means that unlike most string libraries Vstr doesn't have an internal representation of the string where everything can be accessed from a single (char *) pointer in C, the internal representation is of multiple "blocks" or nodes each carrying some of the data for the string. This model of representing the data also means that as a string gets bigger the Vstr memory usage only goes up linearly and has no inherent copying (due to other string libraries increasing space for the string via. realloc() the memory usage can be triple the required size and require a complete copy of the string).

It also means that adding, substituting or moving data anywhere in the string can be optimized a lot, to require O(1) copying instead of O(n). Speaking of O(1), it's worth remembering that if you have a Vstr string with caching it is O(1) to get all the data to the writev() system call (the cat example below shows an example of this, the write call is always constant time).
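To make the block idea concrete, here is a rough sketch of a string held as a chain of blocks and handed to writev() without being copied into one buffer. It is not Vstr's real API or data layout: struct block and blocks_to_iovec are names invented for this example, and it walks the chain each time rather than keeping the cache that the quoted text credits for the O(1) behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* One block of a string; the whole string is a linked chain of these. */
struct block {
    struct block *next;
    size_t len;
    char *data;      /* not NUL-terminated; may contain zero bytes */
};

/*
 * Fill iov[] with one entry per block, up to max entries, so the chain can
 * be handed to writev() without copying the bytes into one buffer.
 * Returns the number of entries filled.
 */
int blocks_to_iovec(const struct block *head, struct iovec *iov, int max)
{
    int n = 0;

    assert(iov != NULL && max >= 0);
    for (const struct block *b = head; b != NULL && n < max; b = b->next) {
        iov[n].iov_base = b->data;
        iov[n].iov_len = b->len;
        n++;
    }
    return n;
}
```

Output is then a single writev(fd, iov, n) call. Appending to the string means linking a new block onto the tail, and because each block carries its own length, embedded zero bytes are just data.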

Also, notes on Fast, scalable and simple ... reliable Network IO. (Is there any such thing? Read it and find out.)

"Syn attack" on SCO

LWN reports on a SCO press release complaining that they're being attacked. SCO's flaks say

This specific type of DDoS attack, called a "syn attack," took place when several thousand servers were compromised by an unknown person to overload SCO's Web site with illegitimate Web site requests. The flood of traffic by these illegitimate requests caused the company's ISP's Internet bandwidth to be consumed so the Web site was inaccessible to any other legitimate Web user.

Of course network attacks are no laughing matter. Well, not normally.

It sounds like they're talking about a SYN flood attack, to judge from the slightly mangled name and description. SYN floods are a problem that was basically solved by SYN Cookies in Linux, BSD and other systems as much as seven years ago. I haven't heard of such an attack in years, because they don't really have much effect on a modern kernel. The fact that they were ever possible was really just a misdesign in early stacks. (Completely understandable and forgivable of course; the internet used to be a more friendly place.)

I think it's pretty damn funny that even when SCO are trying to paint themselves as victims they're really just showing that they're seven years behind the times.

As Mozilla says, "cookies are a delicious treat". No cookies for SCO customers though.

From the archives: Jim Butterfield

From Compute! Sept 1982, reprinted on commodore.ca:

The Butterfield homestead is a modest brick house within walking distance of downtown Toronto. It is comfortably cluttered with books, plants, computers, and three cats. Even the attic is pressed into service as storage space for whatever books and computers Jim Butterfield cannot cram into his small office.[...]

"I decided to find out what this 'micro' stuff was all about and started watching the current magazines," he says. "I finally decided to purchase when I saw a completely pre-built machine called a KIM 1, which had a 6502 microchip in it. That turned out to be like a return to the past. Everything we had been doing a dozen years before on the large $1.5 million computer, we were doing again on this little $250 board including making the same mistakes." [...]

Lecturing and teaching, such as the machine language course he conducts each month for a special interest division of TPUG, provide him with feedback about problems and areas where people need more information. He has a reputation for being generous with his time, and his phone is open from 10 a.m. to 10 p.m. Monday to Friday.

"If somebody phones me up and asks a question which shows they just haven't bothered trying it themselves, then I will sometimes be a little short, because it does seem like a waste of my time," he says. "But most people who call do so because they're stuck on something. It's just a question of getting another opinion. If I get a number of inquiries in a certain area, that's usually a signal that it's time for me to write an article about it. It's a very good way of keeping posted on what's bothering people at the moment."

Butterfield is equally generous with his software. He rarely sells any of his programs. "I would like to foster an environment where people pass out their software with reasonable generosity. I think that by showing a good example, I might sort of lead the way in that." Often he distributes his work on TPUG's library disk.

Still, Butterfield vehemently supports an author's copyright: "I believe very strongly that the person writing an original program has the right to do as he chooses with that program. If he chooses to sell it or to request that it not be copied except for a fee, then he has absolutely that right."

However, he feels that a person who takes money for software is obligated to support that program by upgrading it and furnishing the means to modify it, if necessary. "That's another good reason to give programs away. I really feel that most people who put down a lot of money for software feel that they are not buying a disk or cassette tape, but they are buying a service."

Twinkle twinkle little Tux

Michael Still made a Tux out of fairy lights

reStructuredText

I'm writing some things in reStructuredText, which is a wiki-like markup language that can be translated into HTML, LaTeX and so on. It's pretty cool: you write text much as you might write a plain-text email or comment, with asterisks for bold, indenting for quoted sections and so on.

There was a phase a few years ago when the world was mad for SGML and XML, but really those are not formats that humans ought to need to see. I suppose HTML is harmless enough, and the parsers are pretty slack. But DocBook is pretty horrendous to write by hand, even with a smart editor. I suppose if you're doing a very Serious Technical Book the overhead might be worthwhile, but most of the time most of us are not.

reST is actually a lot like the SDF format my friend Ian Clatworthy wrote years ago. There is probably some kind of lesson there about open source adoption patterns.

Desired Hypercruiser

There's a genre of motorcycles known more or less as hypersports. Cyberpunk authors would love them. Hypersports was, I think, originally defined in the mid 90s by the Honda CBR1100XX Blackbird, and later by the Suzuki Hayabusa and Kawasaki ZX12-R (my current ride). The defining characteristic is an enormous top speed (around 300km/h) and power output (178ps or 130kW). They're rather heavier than your typical sports bike, and a bit more unwieldy in tight corners or around town because of it. For a modest AUD $20,000 you get a vehicle with speed and acceleration comparable to a $400,000 supercar.

The catch is that I think very few of the owners ride them that way. Quite aside from legality, approaching that kind of speed requires a road more clear and flat than I've ever heard of in Australia at least. And even then: I'm sure going faster than a bullet train is fun, but I don't know if I'd want to do it every day.

These bikes have all kinds of qualities aside from outright speed: the engine is so overwhelmingly capable that at any reasonable speed you're knee-deep in smooth torque. At 100km/h any of the six gears on a ZX12-R are available, depending on whether you want a gentle wafting forward in 6th, or a near-instant jump in first. The extra weight makes them very comfortable and smooth on a highway, and on a smooth surface they corner on rails. The large size and big engine make them great for carrying a pillion.

So what I'd like to see is manufacturers skewing towards this market just a little more: put in some kind of larger screen to make highway travel a bit better, even if it takes a few ks off the top speed. Set them up to be a bit more tolerant of choppy roads. Put the big, jet-turbine engines into something a little more like a sports tourer.

life_state

Stephane is away for Thanksgiving; we'll spend Christmas with my family. hack mode.

I'm working out good routes to drive to Noosa and to Adelaide for LCA. I'm pretty keen to go along the Great Ocean Road and just generally for S and myself to see some more bits of Australia.

Went go-kart racing with the boys today. Good fun: no torque, no traction, poor brakes. Pretty tiring. My car felt silky smooth afterwards: remarkable that you can make a 1000kg car feel lighter and easier to control than a 50kg kart.

I've been listening a lot to soundlab. It's good. JJJ during the daytime can tend to get stuck on "alternative top 40", but soundlab's playing some pretty novel beats.

I went for a ride with Michael down into Namadgi National Park. It still looks a bit desolate after the fires at the start of 2003. We went up towards Yankee Hat, and visited the Orroral Valley tracking station that was used for the first satellites. There are no buildings or equipment there anymore, just concrete foundations and some plaques, and an overgrown garden. You can see blue 60s tiles where the bathrooms used to be, and a little courtyard where the rocket scientists probably had their morning tea. It would be a good place to go for a picnic.

I swapped onto Michael's YZF-6R for part of the way back. I think you've been on it once? It's very different: it feels so tiny! It feels like the weight you have to lift to get it off the sidestand is half as much as mine, probably because of the outright mass and also a different center of gravity. It's very sweet. It's like a tiny little glass of honey liqueur. It revs up very freely, and the sitting position is very comfortable, though the seat is hard.

Speaking of revvy, BBC World plays a little motoring show here called Top Gear. I first ran into this flipping through the channels of the United in-flight show, and it was really the first time I realized there was more to BBC World than just the Economist-rewrites-CNN of their normal newsfeed. The content is pretty amusing: guys hooning around British backgrounds in modern and classic cars. I think I find the presentation even more interesting though: there are some pretty glossy camera and digital tricks happening. I think I need to ask my media studies expert. I'd say it's consumerist porn, but I've never heard of porn being quite so well and lovingly filmed.

googlehack

Try googling for miserable failure.

By the way...

More thoughts on TAOUP are in /weblog/books/taoup.

SAS and SATA

Good technical overview of SAS (Serial Attached SCSI) and SATA (serial ATA).

High plains drifter


Alpine stream near Kiandra NSW

More on changelogs

Colin replied to my entry on ChangeLogs.

I have to disagree with Martin Pool's reasoning on ChangeLogs, although not the conclusion. I think ChangeLogs were created mainly as a workaround for the lack of atomic commits, and the file orientation of RCS and CVS. Since my logical change could span several files, the ChangeLog (as used in CVS) serves to mark what files were involved in the change, as well as duplicate the log entry for the commit.

However, tree-oriented version control systems with atomic commits (e.g. GNU Arch) don't have this problem. There is no direct notion of individual file history - every change affects the entire tree, atomically. Which makes sense - a change, in general, is associated with the tree, and not an individual file.

Yes, a lack of good tree-wide revision control is probably why ChangeLogs were originally invented. Indeed I wouldn't be surprised if some people started using ChangeLogs before they had any version control at all, let alone tree-wide changesets. (Intercontinental CVS is pretty tedious even today; I don't know if many people would have been doing it ten years ago.) It was for just this reason that I was previously pretty skeptical about the idea of using them: all the information is in CVS or Subversion, so why bother? But what Ben eventually convinced me of, and what I was trying to communicate, is that there is still some value in keeping them, even if you have a good vc system.

The ChangeLog standard is good in itself. The GNU Standards give fairly well considered requirements for what ought to be in log entries: for example, they say you shouldn't abbreviate function names, so that people can easily grep for them. There's consistent formatting, which is understood by various programs. And when somebody sends you a patch, if they send you ChangeLog entries for it, then you have a detailed description in their own words of what they changed. This may not be an argument for literally having a file called ./ChangeLog, but I think it's good to think about writing commit messages in this kind of style, however you record them. Or at least read the GNU manual entries on this, and take from them whatever seems valuable.
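As a rough illustration of the layout the GNU standards describe, here is a small Python sketch that formats one entry: a header line with date, name and address, then tab-indented items naming the file and the full function name. The helper name, author, address, file and function below are all made up for the example.

```python
from datetime import date


def format_entry(when, author, email, changes):
    """Return one GNU-style ChangeLog entry as a string.

    `changes` is a list of (filename, function, description) tuples.
    Function names are written out in full so that grep can find them.
    """
    # Header: ISO date, two spaces, author, two spaces, <address>.
    lines = [f"{when.isoformat()}  {author}  <{email}>", ""]
    for filename, function, description in changes:
        # Each item is indented with one tab and starts with "* ".
        lines.append(f"\t* {filename} ({function}): {description}")
    lines.append("")
    return "\n".join(lines)


# Hypothetical example data, not taken from any real project.
entry = format_entry(
    date(2003, 11, 12),
    "A. Hacker",
    "hacker@example.org",
    [("parser.c", "parse_header", "Handle empty input.")],
)
print(entry)
```

The same text works just as well as a commit message, which is the point: the style matters more than the file it ends up in.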

It supports promiscuous forking. It's a good thing if people can fork a project without needing permission from the author, go and do their own thing, and then perhaps come back and merge later. CVS does this poorly; BitKeeper and Arch do it very well, and the informal mailing-patches-around system does it moderately well. Keeping ChangeLogs helps with that: when you later get a patch from somebody else, you have at least some record of what changes they made and (equally important) which of your changes they've already taken. So I'd say for this reason to at least consider using ChangeLogs if you're using a vc system that's not natively distributed.

Even if arch supports promiscuous forking easily, it's still a barrier to entry. Personally I found it too underdocumented last time I tried. If I wanted to contribute to a project that used it, I'd just send a patch. So for users like me, having a ChangeLog in the tarball is pretty useful. If you always ship the ChangeLog when you ship source, then there is always a chance for sane forking and merging.

I note that even though the Subversion developers obviously have whole-tree commits, they keep a ChangeLog as well (with the same text.)

Version control is not forever. Many successful programs outlive their version control systems. There are a whole range of possible reasons:

And note that in many of these cases, by the time you need the ChangeLog it's too late to get it. It's no good to say "oh, you can generate it when you want it." By the time you want it, the CVS server may be long gone.

A ChangeLog is not a real substitute for a complete source history, but you can count on it always being there. You can't always get what you want, but sometimes you can get what you need. It's something that may pay off in the long term, rather than the short term. But then most documentation is like that.

Also like other documentation: some rough notes that are kept around and kept up to date are more useful in the long term than grandiose autogenerated phonebooks.

Given a ChangeLog, you can see what was changed, and perhaps why. Like a good comment, you get a higher-level overview of the textual changes from one version to another. You can see who might know about particular areas. By diffing two ChangeLogs, you can see what the difference is between their trees at a higher level than the raw diff. This is highly useful because of the tendency of diffs to become noisy when for example a global rename occurs, or a movement between files.
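Diffing two ChangeLogs needs nothing more than an ordinary diff tool. As a sketch of the idea in Python, using the standard `difflib` module (the function name and the two sample trees are invented for the example):

```python
import difflib


def changelog_diff(ours, theirs):
    """Return the ChangeLog lines that only one of the two trees has.

    Lines prefixed "+" are only in `theirs`; lines prefixed "-" are
    only in `ours`. Blank lines and diff headers are dropped.
    """
    diff = difflib.unified_diff(
        ours.splitlines(),
        theirs.splitlines(),
        fromfile="ours/ChangeLog",
        tofile="theirs/ChangeLog",
        lineterm="",
        n=0,  # no context lines, only the differences
    )
    result = []
    for line in diff:
        if line.startswith(("+++", "---", "@@")):
            continue  # file headers and hunk markers
        if line[1:].strip() == "":
            continue  # blank separator lines between entries
        result.append(line)
    return result


# Hypothetical ChangeLogs from two trees that share some history.
ours = "2003-11-10  A. Hacker  <hacker@example.org>\n\n\t* foo.c (main): Fix exit code.\n"
theirs = "2003-11-12  B. Porter  <porter@example.org>\n\n\t* bar.c (parse): Add VMS paths.\n\n" + ours

for line in changelog_diff(ours, theirs):
    print(line)
```

Here the output lists just the one entry the other tree has that ours lacks, which is much easier to read than the raw source diff would be after a rename.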

It's cheap. With a little tool or editor macro, keeping the ChangeLog up to date is easy. So given that it's very useful, and it's cheap, why not do it?

It may seem a little ugly to record text twice but disk is so cheap these days...
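To give a feel for how little such a tool has to do, here is a minimal Python sketch that puts a new entry at the top of a ChangeLog file. It is only an illustration: the function name is invented, and a real tool or editor macro would also fill in the date and author for you.

```python
from pathlib import Path


def prepend_entry(path, entry):
    """Insert `entry` at the top of the ChangeLog at `path`.

    The file is created if it does not exist yet. Newest entries go
    first, which is the usual ChangeLog convention.
    """
    changelog = Path(path)
    old = changelog.read_text() if changelog.exists() else ""
    # One blank line separates the new entry from the older ones.
    changelog.write_text(entry.rstrip("\n") + "\n\n" + old)
```

Calling `prepend_entry("ChangeLog", text)` just before each commit is about all the discipline it takes.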

Everyone can use them. I suppose it's possible that in the future Arch or BitKeeper or Buttkeeper really will become the universal version control system, perfect and unchanging like the face of god. Everyone who gets the source will get the whole history, and everybody else will be able to merge with them. You'll never lose history, and nobody will ever want to change vc systems. But until that day arrives, ChangeLogs are a reasonable common denominator that allows everyone to get at least a little bit of history about their tree.

I don't know about Arch, but the dependencies of Subversion make it nontrivial to compile on a new Linux architecture, let alone an alien OS. There are still people around who want to port things to OpenVMS or Netware, and in the future perhaps they'll be porting to EROS or TUNES. If you want truly promiscuous forking, it's good to support these people. If they can at least participate by making and reading ChangeLog entries then we don't have to feel so guilty about using a vc tool that they can't run.

The good thing about a text file is that everyone can read it, regardless of what tools they use. Not everyone can swallow the BitKeeper licence, and possibly not everyone wants to use Arch. But everyone can read and write a ChangeLog.

It's handy. I was pretty skeptical about this when Ben told me, but it's true: it is very nice to just quickly C-s through the file, and to get grep hits when you search for a function name. Avoiding the few-second delay to run svn log or whatever is handy. And of course it's still there when you're disconnected.

I think ChangeLogs should be a complement to NEWS files, rather than a replacement for them. (GNU standards say so too.) The NEWS file is directed at users and contains things they need to know about in their language; the ChangeLog is directed at programmers. I think having both, and keeping them separate, is pretty valuable: looking through /usr/share/doc/ it is a bit annoying to see many programs which have only one and not both. If you're going to have both, then using the canonical names and contents helps make it clear.

I started wanting this after being handed three somewhat different versions of a tree that I was meant to maintain. It wasn't easy to reach the original developers, and it wasn't clear which of them was meant to be "most current" and whether the others were forks or earlier snapshots. A ChangeLog wouldn't have fixed everything, but at least it might have given some indication of the relationships.

I wish those developers had used ChangeLogs out of consideration for the people who came after them. So the golden rule suggests that I should try them myself, and it's surprisingly good.

All programmers are optimists...

All programmers are optimists. Perhaps this modern sorcery especially attracts those who believe in happy endings and fairy godmothers. Perhaps the hundreds of nitty frustrations drive away all but those who habitually focus on the end goal. Perhaps it is merely that computers are young, programmers are younger, and the young are always optimists. But however the selection process works, the result is indisputable: "This time it will surely run," or "I just found the last bug."

— Frederick Brooks, "The Mythical Man Month"

0wn3d

Ben has been acquired by a cat, and so have I.

Linux Minolta DiMAGE 7i patch

Patch to make Minolta DiMAGE 7, 7i, 7Hi cameras work on Linux. This might help with the DiMAGE A1 as well, which is the successor to the 7. Let me know!

--- linux-2.4.22/drivers/usb/storage/unusual_devs.h.~1~	2003-09-08 21:23:50.000000000 +1000
+++ linux-2.4.22/drivers/usb/storage/unusual_devs.h	2003-11-12 13:26:49.000000000 +1100
@@ -388,6 +388,28 @@
 		US_FL_SINGLE_LUN ),
 #endif
 
+/* Following three Minolta cameras reported by Martin Pool
+ * .  Originally discovered by Kedar Petankar,
+ * Matthew Geier, Mikael Lofj"ard, Marcel de Boer.
+ */
+UNUSUAL_DEV( 0x0686, 0x4006, 0x0001, 0x0001,
+             "Minolta",
+             "DiMAGE 7",
+             US_SC_SCSI, US_PR_DEVICE, NULL,
+             0 ),
+
+UNUSUAL_DEV( 0x0686, 0x400b, 0x0001, 0x0001,
+             "Minolta",
+             "DiMAGE 7i",
+             US_SC_SCSI, US_PR_DEVICE, NULL,
+             0 ),
+
+UNUSUAL_DEV( 0x0686, 0x400f, 0x0001, 0x0001,
+             "Minolta",
+             "DiMAGE 7Hi",
+             US_SC_SCSI, US_PR_DEVICE, NULL,
+             0 ),
+
 UNUSUAL_DEV(  0x0693, 0x0002, 0x0100, 0x0100, 
 		"Hagiwara",
 		"FlashGate SmartMedia",
