Martin Pool's blog

gcc makes my day

Ben pointed out the -ftrapv feature in gcc, something I have wanted for a while.

-ftrapv
This option generates traps for signed overflow on addition, subtraction, multiplication operations.

Example:

#include <stdio.h>

int main(void)
{
     int x = 1;
     while (1) {
          x *= 2;
          printf("%d\n", x);
     }
     return 0;
}

gcc -Wall -ftrapv overflow.c -o overflow -O3 && ./overflow >/dev/null

aborted

The current implementation seems to turn these operations into function calls, which may be a bit slow. It seems like it could potentially be done inline. But it's still something that might be very useful in test-builds or in security-sensitive software.

Threads ignore the last 20 years of OS and hardware development

Thomas Smits:

The design of the Java Virtual Machine ignores the painful lessons operating system vendors have learned in the past 40 years. The concepts of processes, virtual memory management, and different protection modes for kernel and user code can be found in all modern operating systems. They focus on the question of isolation and therefore robustness: an application with errors cannot affect the other applications running in the system.

In contrast, Java follows the all-in-one-VM paradigm: everything is processed inside one virtual machine running in one operating system process. Inside the VM, parallelism is implemented using threads with no separation regarding memory or other resources. In this respect Java has not changed since its invention in the early nineties. The fact that Java was originally invented as a programming language for embedded devices may explain this approach.

Andrew Tridgell (via Tim Potter):

What is it about the word "thread" that people find so damn sexy? Maybe it needs a name change — "slow-as-hell-no-memory-protection-locks-dont-work" API might be suitable, but I suspect the standards committees wouldn't like that one.

The MMU was added to CPUs for a very good reason. Why is it so hard to understand that trying to avoid it is a bad idea?

Have you thought about the orders of magnitude here? With process switching on a modern CPU you basically have to swap one more register. That's one extra instruction. Modern CPUs have nanosecond cycle times.

Now, some CPUs also need to do an extra tlb flush or equivalent, but even that is cheap on all but the worst CPUs.

Compare this to the work that a file server has to do in responding to a packet. Say it's a SMBread of 4k. That is a 4k copy of memory. Memory is slow. Typically that SMBread will take tens of thousands of times longer than the context switch time.

But by saving that nanosecond you will make the read() system call slower! Why? Because in the kernel the file descriptor number needs to be mapped from an integer to a pointer to a structure. That means looking up a table. That table needs to be locked. If you have 100 threads doing this then they all lock the same structure, so you get contention, and suddenly your 16 cpu system is not scaling any more. With processes they lock different structures, so no contention, so better scaling.

This lock contention can be fixed with some really smart programming (like using RCU), and recently that has been done in Linux. That's one reason why Linux sucks less than other systems for threads.

Try thinking about this. How do threads do their IPC? They use the same system calls and mechanisms that are available to processes. The difference is that in a threads library these mechanisms are wrapped into a nice API that makes it convenient to do IPC really easily. You can do exactly the same types of IPC with processes, you just need to write a bit more code.

For many things, perhaps even for some file server applications, that extra convenience is worthwhile. Convenience means simpler code which means fewer bugs. So I'm not saying to never use threads, I'm just trying to kill this persistent meme that says threads are somehow faster. That's like believing in the tooth fairy.
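As a rough illustration of the "bit more code" that process-based IPC takes, here is a minimal sketch for a POSIX system: a child process computes a value and sends it back to the parent over a pipe. The name square_in_child is hypothetical and chosen only for this example; it is not taken from Samba or any other project mentioned here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: compute n * n in a child process and pass the
 * result back through a pipe. Returns 0 on success, -1 on failure. */
int square_in_child(int n, int *result)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: has its own address space, so the only way to hand
         * the answer back is to send it explicitly. */
        int value = n * n;
        close(fds[0]);
        ssize_t written = write(fds[1], &value, sizeof value);
        _exit(written == (ssize_t) sizeof value ? 0 : 1);
    }

    /* Parent: read the answer, then reap the child. */
    close(fds[1]);
    ssize_t got = read(fds[0], result, sizeof *result);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (got != (ssize_t) sizeof *result)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A thread library would hide the pipe and the explicit copy behind an API; here they are spelled out, but a bug in the child cannot scribble on the parent's memory.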

Random C idea: limited integers

I wonder if this could be added to gcc

int a __attribute__((limited));

So if a ever overflows or underflows, the machine will catch it, perhaps by raising SIGTRAP.

How much would it cost in performance? At the worst, one conditional branch operation after each manipulation of such a variable, to check the overflow status bit. But probably only one or two additional instructions, if we only care about limiting to the size of the variable and not an arbitrary range. Maybe something more efficient is possible.

The goal is to prevent the significant class of bugs caused by an integer overflow without necessarily having a following array overflow.

This probably needs to be optional per-variable so as not to break existing code.

Maybe this has already been done or it isn't feasible or it's just silly?

More on fragmentation of lisp

I complained previously about the relative fragmentation of Lisp into incompatible dialects. Each time I've tried to do something nontrivial, I've found that the book or library I want to use doesn't correspond to the implementation I have available.

We're developing a networking library at work in Python. It opens sockets; it parses XML; it has a web and a graphical interface (using PyGTK). It's not enormous (about 5kloc), but useful. With relatively little effort Tim got not just the library but also the GUI to run on both Linux and Windows. If he wanted, he could probably get it to run on his WinCE handheld as well. This is pretty cool, but not really remarkable: of course it runs everywhere.

Some of the data structure manipulations would probably be simpler in Lisp than in Python. Maybe it could be only 3kloc or less. That would be great. But I don't think we could find a single open Scheme or Lisp implementation which had the right libraries and ran on every platform.

The most-commonly-suggested Common Lisp implementation, SBCL, doesn't run on Windows, or even 64-bit Linux platforms. (I just discovered there is a GTK+ binding in prerelease, which is a start.)

Ten or more years ago people expected languages and environments to be a bit different between machines. You picked a system, then wrote in whatever dialect existed there. Adding a dependency on a library was a substantial risk. Open source and scripting languages have expanded expectations: everything ought to run everywhere by default, and there ought to be libraries for almost everything by default. (Seven years ago it was a big deal that Java supposedly ran everywhere. Now Python just does it.)

Python may be converging on Lisp, but it would be really nice if Lisp could converge on Python too.

The Lisp Paradox

Paul Graham wrote a while ago on the idea that good programming languages are converging on Lisp.

In it, he presents a programming language elegance microbenchmark: write a function that returns a function that accumulates values. To some extent this is skewed towards Lisp: it's impossible to write at all in pure functional languages, even though some people find those languages very valuable. And it's slightly artificial in Python: if you wanted to sum the values in some sequence, you'd just do it without using a function.

But still, it does show Python as a language is perhaps not quite so elegant as Lisp. I'm inclined to agree. And yet I find Python far more useful.

Though Paul didn't say so, I think his results demonstrate the big problem with lisp:

Lisp: Arc (def foo (n) [++ n _])

Lisp: Common Lisp (defun foo (n) (lambda (i) (incf n i)))

Lisp: Goo (df foo (n) (op incf n _))

Lisp: Scheme (define (foo n) (lambda (i) (set! n (+ n i)) n))

Four rather different definitions, all for a nearly-trivial function. (Four different spellings of a single keyword!) And this is just for a pure algorithm, with no consideration of how to make the program run, or how to print the results.

We may leave aside the experimental Arc and Goo dialects, or concentrate on only Scheme or CL, but allow for the different ways of launching various lisp interpreters and compilers on Linux and you're probably up to a dozen variations. The most basic thing for a Unix language is to have a "shebang" line that can go at the start of a program to make it executable, but Lisp/Scheme doesn't even standardize this.

And this is to say nothing of the libraries you might need to do something useful, like split a string or display a dialog or parse web input. Not only do the libraries differ, but even the syntax for loading a library differs.

Of course you could just pick one dialect, but which one? If you say "I'm using Foo Lisp" in particular then you're using a very unusual language, with corresponding worries about it being dropped or inadequately supported in the future. Sometimes it's worthwhile.

The Python solution to the benchmark is longer and slightly klunky compared to Lisp, but at least it will run everywhere Python runs, including under the CLR and the JVM. There is a single standardized method for calling this function from another program.

Peter Norvig (who knows infinitely more about Lisp than me) says

Python can be seen as either a practical (better libraries) version of Scheme, or as a cleaned-up (no $@&%) version of Perl.

Pathological variation is an endemic disease for Lisp: it's fun and easy to write an interpreter, so everyone does, producing a maze of incompatible versions, eternally irrelevant. I say that sadly, not triumphantly, because I would really like a good Lisp that was widely useful, but I don't expect to ever get it.

Three flavours of programming

graydon has a particularly interesting post:

I think the programming world we are living in has about 3 major styles of activity:

1. small, precision-engineered objects which operate as fast as we can possibly make them go, over very specialized mathematical structures, in extremely delicate circumstances. objects like this are RDBMSs, LAPACK/ATLAS/GSL, DSP codecs, renderers, embedded realtime controllers[...]

2. distributed systems made of sloppy, forgiving, easily modified, adaptive jumbles which connect together several objects of type #1.[...]

3. frustratingly large and inflexible "business logic" programs. this sort of program is what most cobol, java and C# code is for.[...]

I think it's a good map.

Furthermore, you can distinguish some design patterns to do with moving between these spaces.

One approach is to use different languages for different parts: call C libraries from PHP; write a Python client for a C++ server. OK. Works well, but there is a transaction cost in crossing the boundary. Things like Swig or CLR might make it cheaper.

Alternatively, don't pay the transaction cost, but stay in a single language and write in different styles. For example in C you can go to a very data-driven form, in which constant initializers form almost a simple scripting language, of which printf is only the beginning. (See the discussion of mini-languages in The Practice of Programming and The Art of Unix Programming.)
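To make the data-driven style concrete, here is a small sketch in which a constant table of initializers acts as a tiny command language and a single generic loop interprets it. All of the names (struct command, commands, run_command and the op_ functions) are invented for this illustration and do not come from either book.

```c
#include <stddef.h>
#include <string.h>

/* One row of the "script": a command name and the function that runs it. */
struct command {
    const char *name;
    int (*run)(int);
};

static int op_double(int x) { return x * 2; }
static int op_negate(int x) { return -x; }
static int op_incr(int x)   { return x + 1; }

/* The table is the program; adding a command means adding a row. */
static const struct command commands[] = {
    { "double", op_double },
    { "negate", op_negate },
    { "incr",   op_incr   },
};

/* Look up name in the table and apply it to arg.
 * Returns 0 and stores the result in *out, or -1 if name is unknown. */
int run_command(const char *name, int arg, int *out)
{
    size_t count = sizeof commands / sizeof commands[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(commands[i].name, name) == 0) {
            *out = commands[i].run(arg);
            return 0;
        }
    }
    return -1;
}
```

The interpreting loop never changes; behaviour is extended by editing data, which is the property that makes this feel like scripting without leaving C.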

The saying that large C programs tend towards containing a half-assed implementation of lisp is true but not necessarily bad. It's entirely possible that you need just a little bit of lisp, and the pain of writing it yourself is less than the pain of linking in a real interpreter.

(This touches on the classic and insoluble debate of whether you should add a real lisp because you might need it later, or suffer later because you wrote your own...)

For some problem spaces, adding a half-assed Perl or Forth or Fortran may be a very good approach. GNU Arch contains a reimplementation of unix pipeline primitives like awk and uniq in a way that can be easily called from C... This aided translation from a prototype written in shell, and provides a good language for the problem anyhow.

The Year of Python

Troutgirl:

From the IM logs:

Joyce: Boy, this is definitely the Year of Python

Joyce: Last year, people said "Python might be OK for prototyping Java"

Joyce: This year, it's more like "Java is the new COBOL"

Murph: python has been, if you'll excuse the pun, creeping up on us for a while.

Murph: it's funny, because python has had some killer apps for a while, but they've been pretty invisible

Joyce: Like what?

Murph: things like Mailman

Joyce: Oh, that's Python?

Murph: see?

warn_unused_result

gcc has a neat attribute that helps you check that function return codes are used.

For example, I have a function dcc_timeout_arm in distcc which uses longjmp and so returns a second time with an error if the timeout fails. So it's absolutely critical that the error return code be checked, or the code can't possibly work correctly.

In an earlier draft of the code I forgot to check it in some places. Unfortunately for me the non-error path works fine, but the error path would always fail. Of course error paths are notoriously poorly tested.

Therefore:

#ifdef __GNUC__
#  define WARN_UNUSED  __attribute__((warn_unused_result))
#else
#  define WARN_UNUSED
#endif
...
int dcc_timeout_arm(int timeout, int) WARN_UNUSED;

So this gives a warning:

dcc_timeout_arm(5, DCC_PHASE_CONNECT);

and this is correct:

if ((ret = dcc_timeout_arm(5, DCC_PHASE_CONNECT)))
    goto out;

Great stuff. In terms of Rusty's module interface continuum, we just went from "6: follow common convention (check return codes) and you'll get it right" to "2: compiler will warn if you get it wrong." Build with -Werror and you get up to "1: the compiler won't let you get it wrong."

I hope gcc gains more attributes like this in the future. It would be good to get annotations like those in Splint, so the compiler could check more invariants.

I had thought it would be nice if there were a global option to warn about ignored return values. The gcc team say it's not good, and on reflection I agree:

10.11 Certain Changes We Don't Want to Make

[....]

Warning when a non-void function value is ignored.

C contains many standard functions that return a value that most programs choose to ignore. One obvious example is printf. Warning about this practice only leads the defensive programmer to clutter programs with dozens of casts to void. Such casts are required so frequently that they become visual noise. Writing those casts becomes so automatic that they no longer convey useful information about the intentions of the programmer. For functions where the return value should never be ignored, use the warn_unused_result function attribute.

UNUSED in gcc

Of course you build programs with warnings turned on? If not, write it out one hundred times...

-Wunused causes gcc to complain about unused parameters. This is often a good thing, if it makes you check twice whether you mistyped a variable name or forgot to implement something. But sometimes you really need to have a parameter which is not used, perhaps because the function needs to match a particular prototype.

gcc has an __attribute__((unused)) that you can apply to parameters to quieten the warning. This actually means possibly unused: if you do use the parameter, gcc doesn't complain. So with unused alone, you can make errors in the opposite direction.

I use this little macro:

#ifdef UNUSED
#elif defined(__GNUC__)
# define UNUSED(x) UNUSED_ ## x __attribute__((unused))
#elif defined(__LCLINT__)
# define UNUSED(x) /*@unused@*/ x
#else
# define UNUSED(x) x
#endif

void dcc_mon_siginfo_handler(int UNUSED(whatsig))

This applies the unused attribute and also mangles the variable name so that you really can't use it. It also lets you just write UNUSED(x) regardless of compiler. Maybe there should be more branches for other compilers.

(Sometimes this won't do: you might have a parameter or variable which is only used when particular preprocessor variables are set. But in that case you might want to simplify the ifdefs anyhow.)

Tim points out that this helps shift up Rusty's interface simplicity continuum: the compiler won't let you get it wrong. The parameter is either used, or unused.

Of course you can also say

void a(int b) {
    (void) b;
    ...

I have a feeling there was some reason this was inferior to the unused attribute but I don't remember what it was.

Python, Haskell, Lisp

Python vs Haskell,

Why people aren't using Haskell.

Is Python Lisp? No.

Python is growing, but not towards Lisp. As Python becomes more popular, I expect advocates of other languages will try to claim it as a descendant of theirs (call it "Alexander Graham Bell is Canadian" syndrome). Python is really a little Lisp, says Graham. A Haskell programmer could claim that Python is really a little Haskell, thanks to its support for List Comprehensions and some lazy features. An Icon programmer could claim that Python is getting to be more like Icon with the addition of lazy generators. A Smalltalk programmer would recognize metaclasses, the unit testing features and probably the new method resolution order. The warning, logging and exception handling infrastructures are probably closest to Java. In ten years, Python will probably have stolen more ideas from these languages (including Common Lisp) and they may even have stolen some back. But if you expect Python to grow towards any particular one of these, you'll be waiting a long, long time.

Programmers do not like deeply nested expressions. They like a language that encourages a style where expression results are assigned names. A statement/expression distinction encourages (and in some cases requires) that. A symbol type is not a bad idea but the marginal gain over interned strings is minimal. And the Lisp S-expression notation has been loudly and explicitly rejected over the last half century.

I like Python. I lovehate lisp. Should I learn Haskell?

What's new in Python 2.4

What's New in Python 2.4:

(From LWN)

Java: the next COBOL

Tim Bray writes:

I had lunch with Mike and Christian of Make Technologies here in Vancouver, and in my new capacity at Sun got my ear bent about the Java value proposition. Their key point was: probably more than half of the data being crunched out in the business world is being crunched by COBOL programs on mainframes. When these systems really finally can't be lived with any longer, the CIOs who have to replace them notice that they're decades old. They're smart guys who try to learn from what they observe, and they deduce that the next big piece of infrastructure is apt to be with them for a long time. "So," they wonder, "this JES stuff (or .NET, or whatever) they're trying to sell me, will it still be a viable platform in 25 years?" Put that way, it sounds to me like a damn good question. I think the Java answer is about as good as anyone's at the moment, but I suspect it's something that none of us on either the vendor or customer side have been putting enough thought into.

That might just be a decent explanation of both the good and the bad points of Java, and a guide to when to use it. Not so exciting, not so agile, but a safe choice, if you're really planning to keep the code for 25 years.

It's a bit like describing people as "born old".

groggy on C++, etc

Groggy has some thoughts on C vs C++ and on revisiting old code.

(Isn't it strange that job ads ask for experience in "C/C++"? It's a bit like asking for candidates who speak French/Italian.)

Java, the Chicken of Tomorrow

The rant I was referring to the other day was from Java, the Chicken of Tomorrow by Miles Nordin:

Sure, there is some association between the imaginary JavaCPU and the Java language. But there is also some association between the real MIPS CPU and the C++ language: modern CPUs are overtly designed with the foreknowledge that they will be judged based on how fast they execute algorithms written in C++. Is the nonexistent Java CPU really uniquely adept at running Java programs, like a Symbolics "Ivory" CPU is uniquely adept at running Lisp programs? No.

This observation makes me wonder if we wouldn't be better off throwing out Java and writing programs in C, then cross-compiling them for the VAX CPU. Instead of a JRE, we could simply write a VAX emulator for all interesting architectures, and put virtual-VAX sandboxes inside web browsers. VAX insns would become the Language Of The Web. We could standardize a crippled miniature virtual VMS called WebVMS for building into web browsers, to give all these VAXlets access to the network, the local filesystem, a GUI toolkit. We would have VAX-compatible smartcards. There is no reason a VAX emulator can't translate VAX instructions into native-CPU instructions just like a JRE's JIT does. This can be done quite well---like I said, the VAX emulator for the Alpha is faster than any physical VAX CPU. I suspect VAX machine-code would also be similarly compact to Java bytecode, since machine-code-compactness was the biggest priority when the VAX was designed. There is no missing piece to invent or design: we simply agree that, henceforth, all applications will be VAX applications so as to be equally inconvenient for everyone. Voila! Portability!

And when he hits his stride

The most important feature of the Java CPU is that it doesn't exist yet, but it could. Part of Sun's plan to make money from giving away Java was to build Java CPUs. This was a stated plan of theirs, not my speculation. The following is much more speculative.

Sun reasoned that if they could construct computers that ran Java bytecode natively, everyone else would suddenly realize that their JREs were CPU emulators for the computers Sun was selling. I suspect the fastest JavaCPUs would always be emulators, just as the fastest VAXen are emulators running on Alphas, but perhaps Sun could make some slow computers out of physical JavaCPUs that cost less to make than competing emulators of similar speed. The JavaCPU is thus extremely simple, so as to be implementable with minimal silicon and minimal research. Compared to a modern architecture like the Alpha, the JavaCPU looks like something an undergrad dreamed up in the men's room based on the mathematical elegance of two urinal cakes, one stacked upon the other. The JavaCPU idea gives Sun a brief window for a hit-and-run industrial subversion. If Sun can open hundreds of these tiny windows-of-opportunistic-slack, maybe a few of them will pay off. Something like the Alpha matures slowly, with many years of compiler research and successive core revisions abstracted by PALcode: the Alpha is not hit-and-run, and indeed it looks like Digital didn't survive long enough to finish cashing in on the Alpha. A low-end infantile design like the Java bytecode makes business sense in an industry where everyone with a big plan or a big research investment eventually gets screwed over by speculative neck-tie damage and goes bankrupt.

It's an interesting essay/rant. (Incidentally, Stephane has a copy of the original The Chicken of Tomorrow which is also very amusing.)

Colin likes D; the real universal virtual machine

Colin Walters writes that he likes D, the latest installment in what tridge calls the Tool of the Month Club. (Send no money now!) I have to say I like the elevator pitch, but I haven't tried it and I am a bit skeptical whether there is enough of a niche there for it to survive. We shall see.

One interesting thing is that D is not based on a virtual machine, which is a core feature of Java. (I guess you can compile Java to native code using something like gcj, but this is not a very common scenario at the moment.)

[These are kind of sketchy; it's a blog; don't shoot.]

There is already a standard virtual bytecode format; we don't need Java to introduce another. It's called x86 machine code. In fact people have already developed very efficient hardware implementations...

I'm wondering if most of the arguments for using a platform-neutral bytecode could still be achieved if that bytecode is x86: security is mostly about whether code can have access to system resources (files, other processes, the network). So block what system calls it can make. Running untrusted native code under SELinux is safe because it enforces security on the boundary. I'm sure more work can be done here in allowing something like a native applet.

Crusoe is something like a JIT for x86, and there are others for ia64, Alpha and other platforms. Perhaps a higher-level intermediate language makes it easier...

Perhaps garbage collection is something that should be done at a lower level than the compiler output.

You can mix languages in a single program to some extent with JVM or common intermediate language. On the other hand you can do that too with C and Python, Perl, Scheme, etc, and use this to good effect to implement alternate hard and soft layers. More complex systems might be better off running as separate communicating services than being jammed into a single program.

There is a good essay that discusses this idea, but I can't find it anymore amongst all the other good Java rants out there.

python - 70MB/s

I did a simple test the other night of how fast Python can read/write data. A simple 'cat' in Python can pump data through at about 70MB/s on a Pentium M laptop. This is just with simple read() and write() into strings, without any special tricks. By contrast bzip2 on that machine is only about 300kB/s.

That's probably pretty slow compared to C, but faster than most network connections could sustain. It's an interesting data point for writing moderate-performance software in Python.

(Doing a lot of manipulation might well slow it down.)

The (lack of) future of Java

I was just reading cbrumme's very interesting Microsoft blog.

I think one thing you can see here is that Microsoft are absolutely clearly trying their standard pattern #1 on Java: let Sun invent it, wait for it to be adopted, design something a little better and much less open, and ram it through the ISV/IT channel. Embrace, extend, extinguish.

(This sounds a bit harsh on CLR, which from my limited reading does seem to be the product of rather more intelligent thought than your average paperclip. But I don't think this is going to be decided primarily on technical merit.)

So the big question is, does Sun have the brains and/or balls to play the one gambit that gives Java a chance of survival: open source Java?

The only commercially interesting operating systems these days are either open source or Microsoft. Sun's Java runtime can't be included in free operating systems like Debian, and Microsoft doesn't (?) ship the current JDK. So Sun have just wilfully excluded themselves from being preinstalled on the two most important platforms. Way to go.

(OK, so failing to be preinstalled is not the end of the world. But it's not helping. Having a standard JRE on each release might make Sun less sloppy about cross-version compatibility than they have been to date.)

Beyond an Open Source Java

Ganesh Prasad has a brilliant article on Sun open-sourcing Java. This is what ESR's open letter should have been in the first place:

It is definitely in Sun's own interests to open up Java. This goes far beyond Open Source-ing the Java libraries, as we will see. Making the Java language and platform more affordable and friendly will make them more popular than their rivals (not Perl or Python, but C# and .NET).

But Sun has an indisputable right to exploit its creation for its own commercial gain. In fact, it would be illegal for Sun's management to sell out its shareholders by giving away the company's crown jewels.

Raymond recognises the problem, but poses it as an either-or choice. That implies a zero-sum way of looking at things. But it's not about ubiquity versus control at all. It's about ubiquity with commercial advantage.

As Open Source advocates, we know ways by which Sun can achieve both objectives.

We are not going to ask Sun to do something for Open Source (at least, not directly). We will show Sun how they can help themselves by making Java even more widespread than it is, and by harvesting it commercially. We will also show that if they continue to do nothing, they will pay a heavy price.

If, in the process of helping themselves, Sun proves to be a friend of Open Source, that's a bonus.

I think this is probably Java's only real hope of surviving .NET. I think I'd give Sun about a 40% chance of being smart enough to realize it.

integer promotion in varargs calls

Question: what does this do?

printf("%ld\n", 32);

You might think it prints 32. But I think the results are in fact undefined, and on a 64-bit platform you might get some value other than 32.

The problem is that vprintf (or some function inside it) will try to read off the varargs stack a value of type long, which is 64 bits on ia64 and (all?) other Linux 64-bit platforms. However, the value is a literal integer, and passed as such.

So why does this normally work? I think the reason is that on IA64, all integers are passed in 64-bit slots, with the first 8 parameters in the frame input registers and the rest on the stack. However, there is no guarantee that the entire slot will be initialized if it's only carrying an int. At least in some cases, gcc generates a st4 instruction to store the value, so the top 32 bits are uninitialized. It seems that they often happen to be zero, but not always. This was causing a semi-intermittent failure of the Vstr test case.

You might think that all integer types are promoted to long, but that is not in fact the case. ISO/IEC 9899:1999 (E), the C specification, says basically that types smaller than int are promoted to int, and floats are promoted to double. There is no automatic promotion to long.

The correct way to write it, assuming you wanted to pass it as a long, is

printf("%ld\n", 32L);

The good news is that gcc can generally give compile-time warnings for this kind of problem, although it cannot trap every possible case, and it can't trap non-printf varargs functions.
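For a varargs function of your own there is no format string for the compiler to inspect, so the caller and callee have to agree on types by convention. A minimal sketch, using a hypothetical sum_longs function that is not from any library:

```c
#include <stdarg.h>

/* Hypothetical helper: sum `count` trailing arguments.
 * Every trailing argument MUST be a long; passing a plain int such as 32
 * instead of 32L is undefined behaviour, exactly as with "%ld". */
long sum_longs(int count, ...)
{
    va_list ap;
    long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, long);  /* reads a long-sized argument */
    va_end(ap);

    return total;
}
```

A correct call looks like sum_longs(3, 1L, 2L, 32L). Writing sum_longs(3, 1, 2, 32) compiles without complaint and may even appear to work on a 32-bit platform, which is what makes this class of bug easy to miss.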

pop quiz

What value is assigned to the macro by this line?

#define PEGASUS_ATOMIC_INT_NATIVE = 1

Spark Ada

Spark Ada, mentioned on RISKS, looks interesting: an annotated subset of Ada with unique and precise semantics allowing static proof of, amongst other things, that no run-time exceptions will occur.

From the Preface to the book,

SPARK has just those features required for writing reliable software: not so austere as to be a pain, but not so rich as to make program analysis out of the question. But it is sensible to share compiler technology with some other standard language and it so happens that Ada provides a better framework than many other languages. In fact, Ada seems to be the only language that has good lexical support for the concept of programming by contract by separating the ability to describe a software interface (the contract) from its implementation (the code) and enabling these to be analysed and compiled separately. The Eiffel language has created a strong interest in the concept of programming by contract which SPARK has embodied since its inception in the late 1980s.[...]

I have always been interested in techniques for writing reliable software, if only (presumably like most programmers) because I would like my programs to work without spending ages debugging the wretched things.

Perhaps my first realization that the tools used really mattered came with my experience of using Algol 60 when I was a programmer in the chemical industry. It was a delight to use a compiler that stopped me violating the bounds of arrays; it seemed such an advance over Fortran and other even more primitive languages which allowed programs to violate themselves in an arbitrary manner.

On the other hand I have always been slightly doubtful of the practicality of the formal theorists who like to define everything in some turgid specification language before contemplating the process known as programming. It has always seemed to me that formal specifications were pretty obscure to all but a few and might perhaps even make a program less reliable in a global sense by increasing the problem of communication between client and programmer.

Bitshifts in C

What do you think this does?

uint32_t a, b;
[....]

b = 32; a = a >> b;

You might think this will reduce a down to zero. But in fact, the C99 specification says that shifting either left or right by an amount greater than or equal to the width of the type causes undefined behaviour. On gcc on i386, it in fact shifts by b % 32, so in this case a is unchanged.
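If you do want the "shift everything out and get zero" behaviour, you have to ask for it explicitly. A small sketch, with a hypothetical helper name:

```c
#include <stdint.h>

/* Hypothetical helper: right shift that is well defined for any count.
 * Counts of 32 or more yield 0 instead of undefined behaviour. */
uint32_t shift_right_u32(uint32_t value, unsigned count)
{
    return count >= 32 ? 0 : value >> count;
}
```

The extra comparison is cheap, and it makes the result independent of whatever the underlying shift instruction happens to do with an oversized count.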

Beginners' mistakes in Python

Hans Nowak wrote a good short article on beginners' mistakes when moving to Python from some other language.

To preserve cosmic balance, Richard presents anti-pitfalls: mistakes that you can't make in Python that are possible in other languages.

First look at Objective C

I just had a quick look at Objective C today.

My initial impression is that it's very nice indeed: a far more competent addition of object-oriented stuff to C than C++. It adds just enough features to do dynamic dispatch, reference counting, and so on, without going into the enormous tarpit of complexity. It seems to give just enough features: OK, you can't do static dispatch, but you rarely really need that.

I should say though that I have not even written Hello World in ObjC, and probably there are drawbacks. In particular the syntax feels a little alien to regular C, but perhaps that's a feature.

I think the only real platform where it's been used is NeXT, and now OS X's Cocoa interface, which is an evolution of the same thing. It seems like a shame that it's not more widely used -- things like GTK+ that do OO in C idioms might more easily have been done in ObjC. It says something good about NeXT that basically the same programming interface still made sense for Apple to adopt ten years later for their new system.

nice Python Generators example

Todd links to a good example of generators in Python. You can do things like this in many languages with varying degrees of ease. In C you probably need to explicitly store all your state; in Scheme you probably need to explicitly get your head around continuations. Python is arguably the first mainstream language to add them as a first-class feature. It will be interesting to see how much people pick them up and use them. They're probably not something you want to use very often, but when you want them they're very handy.

The "level of elegance" in providing explicit generators is very typical of Python: you certainly could factor them into smaller components and write the rest as a library, but it's perhaps easier to use them in this packaging.

def genResults(db, sql):
   cursor=db.cursor()
   cursor.execute(sql)
   while 1:
      row=cursor.fetchone()
      if row is None: break
      yield row
