Martin Pool's blog

Passing the baton for distcc

The time's come for me to pass over the baton of distcc to a new maintainer, Fergus Henderson.

Fergus, along with his Google colleagues Nils Klarlund and Craig Silverstein, has been working for some time on a large new extension, distcc-pump. Up to the last release I made, distcc always did preprocessing on the client machine, then sent the preprocessed source to a server for compilation. This keeps things pretty simple and works well across a wide domain, but it does eventually reach a scalability limit. With distcc-pump, more of the work can be distributed. It looks like support for gcc precompiled headers will also be coming soon, in collaboration with Sascha Demetrio. A release candidate for 3.0 is now up on the new distcc site, and they're churning through the several outstanding patches.

I'm very touched by his words about the project. Thanks, Fergus.

Made my day...

It's nice when you come in and see this mail:

Wanted to thank you for taking the time to write distcc. It is a model of elegant simplicity from a user's perspective.

I've been making infrastructure changes in our software which have roughly the same effect as "make clean" so my build times were consistently outrageous. I was able to reduce the build time by 76% without spending a penny on hardware. 'course management loves that kind of thing ...

The funny thing is nobody else seemed to mind how slow builds were until they saw how bad they were in comparison (I'm a recent hire). Now, when developers have to compile older versions of our code base that don't contain my parallel-safe makefile changes, they moan and groan about how long it takes. These are the same ones that didn't see a problem before I started. You gotta love human nature ...

distcc 2.15 released

I released distcc 2.15, which fixes a nasty bug in LZO compression which had been hanging around for a while. It turned out to be just a one line fix, but it was hard to find the problem because it only occurred intermittently and the failure happened a long time after the error. Basically I accidentally unmapped a too-long region of memory, clobbering whatever happened to be there.

I knew last time I changed it that this function was too long to be safe. I should have pulled things out to make it more obviously correct, which is what I have done now. It wasn't even a particularly complex function — certainly not unmaintainable — but large enough that it should have raised a warning.

distcc for Java

Anthony Green is trying to hack (as with an axe) distcc into something that can do distributed compiles for Java. Good luck!

Tom Tromey has some comments in response.

I think the compilation processes for Java and C are so different that they may not fit into the same program, but perhaps it can work.

I had wondered in the past whether I should have made the distcc server and protocol completely independent of the work of compilation, so that they just distribute arbitrary work. The client needs some special intelligence about how to interpret gcc command lines and run the preprocessor. But I think this can be done purely on the client, in a gcc-specific skin.
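To make the "gcc-specific skin" idea concrete, here is a minimal sketch of the kind of command-line interpretation the client has to do: turning one compile command into a local preprocessing step and a compile step that could run elsewhere. The function name `split_compile` is hypothetical and this is not distcc's actual parser, which has to cope with many more options; it assumes a single `.c` file and an explicit `-c`.

```python
import os


def split_compile(argv):
    """Split a simple 'gcc ... -c foo.c -o foo.o' command into two steps.

    Returns (preprocess_cmd, compile_cmd). Hypothetical helper: it only
    handles one .c source file with an explicit -c, and raises ValueError
    for anything else.
    """
    sources = [a for a in argv[1:] if a.endswith(".c")]
    if "-c" not in argv or len(sources) != 1:
        raise ValueError("not a simple single-file compile")
    source = sources[0]
    preprocessed = os.path.splitext(source)[0] + ".i"

    # Local step: keep the options, but stop after preprocessing (-E)
    # and write the result to a .i file instead of the original output.
    preprocess_cmd = []
    skip_next = False
    for arg in argv:
        if skip_next:
            skip_next = False
            continue
        if arg == "-o":
            skip_next = True  # drop the original output file name
            continue
        preprocess_cmd.append("-E" if arg == "-c" else arg)
    preprocess_cmd += ["-o", preprocessed]

    # Remote step: the original command with the source swapped for the
    # already-preprocessed .i file.
    compile_cmd = [preprocessed if arg == source else arg for arg in argv]
    return preprocess_cmd, compile_cmd
```

With a split like this, everything gcc-specific stays on the client; a server that only receives a file and a command to run on it would not need to know it is compiling at all.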

There really are a remarkable number of bright people at Red Hat these days.

(fwd) kernel BUG at page_alloc.c:98 -- compiling with distcc

I am so good at tweaking kernel bugs. I think this is the third or fourth one distcc has trapped.

distcc 2.11 out

I released distcc 2.11, with the new GNOME monitor. Implementing this has been an interesting trip through the GNOME API.

There is semi-infinite scope for adding additional eye-candy, but I think it gives a pretty good overview of what is happening on the system as it is.

Progress on distcc monitor

Getting something that looks good, conveys the right information, and does not burn an unreasonable number of cycles is a bit tricky.

GTK+ is also very nice to program. C is not ideal for this kind of stuff though.

distcc and AOSS Awards

If you're here looking for information about distcc's nomination for the AUUG Australian Open Source Awards, please note that the correct home page is distcc.samba.org.

Compression working in distcc

I got LZO compression working for distcc in CVS last night. The compressed:plain ratio for preprocessed source is typically 25%, and about 70% on non-debug object files. In other words the amount of network traffic is cut by a factor of nearly four, since source files are typically much larger than the output.
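The "nearly four" figure comes from weighting the two ratios by how much source and object data a build moves. A minimal sketch of that arithmetic follows; the function name is hypothetical and the 1000:50 size split is a made-up illustration, not a measurement.

```python
def traffic_reduction(source_bytes, object_bytes,
                      source_ratio=0.25, object_ratio=0.70):
    """Return the factor by which total network traffic shrinks.

    source_ratio and object_ratio are compressed:plain ratios, defaulting
    to the typical figures quoted above.
    """
    plain = source_bytes + object_bytes
    compressed = source_bytes * source_ratio + object_bytes * object_ratio
    return plain / compressed


# Assumed example: preprocessed source 20 times larger than the object file.
print(round(traffic_reduction(1000, 50), 2))  # prints 3.68
```

The larger the preprocessed source is relative to the object file, the closer the factor gets to the 4x limit set by the 25% source ratio.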

This should help compile times when the network is a limiting factor. That is, when there are many remote machines, or the network is quite slow because it's wireless or already loaded. It might even help in other cases because the compiler can start relatively earlier, and we need to do fewer I/Os and therefore less work in the kernel. The drawback is a little more work in userspace to do compression, and we can no longer use sendfile() for transmission.

Early tests indicate that for three machines on a 100Mbps network it is roughly the same speed as uncompressed traffic. It's within experimental variation, which is about 10s for a 5 minute build of Samba or the Linux kernel. The network is not saturated, so I think the extra CPU overhead of doing compression is cancelled out by the reduced network transit time.
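The trade-off between compression CPU time and network transit time can be put into a rough back-of-envelope model. This is only a sketch: the function name and the throughput numbers are assumptions for illustration, and it ignores decompression on the server and any overlap between compressing and sending.

```python
def compression_pays_off(size_bytes, ratio, net_bytes_per_s,
                         compress_bytes_per_s):
    """Return True if compressing before sending is faster than sending plain.

    ratio is the compressed:plain size ratio (e.g. 0.25 for preprocessed
    source). Both throughput arguments are in bytes per second.
    """
    plain_time = size_bytes / net_bytes_per_s
    compressed_time = (size_bytes / compress_bytes_per_s
                       + size_bytes * ratio / net_bytes_per_s)
    return compressed_time < plain_time


# Assumed numbers: 1 MB of source, compressor running at 50 MB/s.
slow_link = compression_pays_off(1_000_000, 0.25, 1_000_000, 50_000_000)
fast_link = compression_pays_off(1_000_000, 0.25, 1_000_000_000, 50_000_000)
print(slow_link, fast_link)  # prints True False
```

In this model compression wins whenever the link is slower than the compressor by a wide enough margin, which matches the intuition that it matters most on wireless or heavily loaded networks.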

Compression completely dominates the CPU usage of the distcc client, although it is quite small compared to preprocessing and compiling.

Note that the version in CVS is currently not compatible with earlier releases, although it will be by the time 2.9 is out.

Given all this I think it will not be on by default, because it's not compatible with older servers. It should be good to turn it on primarily if you are on a slow or high-latency network, and in particular for wireless networks.

distcc talk at CLUG

I gave a talk about distcc last night at Canberra LUG. I was feeling a bit tired after getting up quite early but it went pretty well. Without any previous preparation we set up distcc servers on a few laptops in the audience, connected over wireless. It did work, although because wireless is a bit slow and the laptop I was using was quite fast the results were not compelling.
