Sick of XML? Try YAML!
From slashdot:
YAML(tm) (rhymes with "camel") is a straightforward machine parsable data serialization format designed for human readability and interaction with scripting languages such as Perl and Python. YAML is optimized for data serialization, configuration settings, log files, Internet messaging and filtering. YAML(tm) is a balance of the following design goals:
- YAML documents are very readable by humans.
- YAML interacts well with scripting languages.
- YAML uses host languages' native data structures.
- YAML has a consistent information model.
- YAML enables stream-based processing.
- YAML is expressive and extensible.
- YAML is easy to implement.
posted Wed 26 Nov 2003 in /software/xml | link
Working more productively with bash 2.x
Ian Macdonald has a good page on Working more productively with bash 2.x, covering both his superb bash-completion package and some other tips. Plain Unix sh always seems so drab after bash.
posted Tue 25 Nov 2003 in /software | link
SCO owns C++!
Courtesy of We Love the SCO Information Minister: McBride says "And C++ programming languages, we own those, have licensed them out multiple times, obviously. We have a lot of royalties coming to us from C++."
Wow.
posted Thu 20 Nov 2003 in /issues/sco-vs-linux | link
Subversion tip -- What's new?
What's new in the repository since I last updated?
$ svn log -r BASE:HEAD | less
$ svn diff -r BASE:HEAD | less
posted Thu 20 Nov 2003 in /software/vc/subversion | link
Rapid Testing
Jason pointed me to James Bach's writing on rapid testing. It looks interesting:
How is Rapid Testing different from normal software testing?
Testing practice differs from industry to industry, company to company, and tester to tester. But there are some elements that most test projects have in common. Let's call those common elements "normal testing". In our experience, normal testing involves writing test cases against some kind of specification. These test cases are fragmentary plans or procedures that loosely specify what a tester will do to test the product. The tester is then expected to perform these test cases on the product, repeatedly, throughout the course of the project.
Rapid testing differs from traditional testing in several major ways:
- Mission. In Rapid Testing we don't start with a task ("write test cases"), we start with a mission. Our mission may be "find important problems fast". If so, then writing test cases may not be the best approach to the test process. If, on the other hand, our mission is "please the FDA auditors", then we not only will have to write test cases, we'll have to write certain kinds of test cases and present them in a specifically approved format. Proceeding from an understanding of our mission, we take stock of our situation and look for the most efficient and effective actions we can take right now to move towards fulfilling that mission.
- Skills. To do any testing well requires skill. Normal testing downplays the importance of skill by focusing on the format of test documentation rather than the robustness of tests. Rapid Testing, as we describe it, highlights skill. It isn't a mechanical technique like making microwave popcorn, or filling out forms at the DMV. Robust tests are very important, so we practice critical thinking and experimental design skills. A novice tester will not do RT very well unless supervised and coached by a senior tester who is trained (or self-trained) in the art. We hope the articles and presentations on this site will help you work on those skills.
- Risk. Normal testing is focused on functional and structural product coverage. In other words, if the product can do X, then try X. Rapid Testing focuses on important problems. We gain an understanding of the product we're testing to the point where we can imagine what kinds of problems are more likely to happen and what problems would have more impact if they happened. Then we put most of our effort into testing for those problems. Rapid Testing is concerned with uncovering the most important information as soon as possible. [....]
posted Wed 19 Nov 2003 in /software/testing | link
pop quiz
What value is assigned to the macro by this line?
#define PEGASUS_ATOMIC_INT_NATIVE = 1
posted Mon 17 Nov 2003 in /software/languages/C | link
Why Changelogs?
In many GNU packages like gcc or emacs you'll see a ChangeLog file containing a description of all of the changes to the package. I'd always thought of them as a vestige of a previous era before CVS, but bje recently made a pretty good argument for continuing to use them.
Here's a sample from emacs's ChangeLog, for any readers who aren't familiar with the form:
2003-09-23 Dave Love <fx@gnu.org>
* configure.in: Check members of struct ifreq.
2003-09-14 Kim F. Storm <storm@cua.dk>
* configure.in: Add checks for sys/ioctl.h and net/if.h.
2003-09-12 Luc Teirlinck <teirllm@mail.auburn.edu>
* Makefile.in (install-arch-indep, uninstall): Add SES manual.
2003-08-18 Lute Kamstra <Lute.Kamstra@cwi.nl>
* configure.in: Revert the change of 2003-07-29 as GTK+ 2.2 is not required anymore.
2003-08-07 Andrew Choi <akochoi@shaw.ca>
* configure.in [powerpc-apple-darwin*]: Use the -no-cpp-precomp option instead of -traditional-cpp for CPP.
There's a text format for listing changes and then also a description in the GNU coding standards of what information ought to be in the entries. This list of what was changed, when, and by whom is pretty similar to what you might see in a CVS history.
So if you're storing your project in CVS or some other revision control system, then keeping a ChangeLog as well is redundant. Indeed, if you keep the ChangeLog in CVS then every comment is literally being stored twice...
The main reason for using a ChangeLog is that it travels with the source it describes. Years in the future if somebody obtains the source they can still see its history, even if they no longer have access to the version control system. It's fairly common for projects to move between different vc systems over their lifetime. It's not unheard of for a vc system to be lost entirely, leaving only a set of tarballs as a record.
If some other party wants to fork the project, or develop it offline for a while before merging back, then the ChangeLog gives some hope that their history will be recorded.
Another advantage is that it's easy to scan through the log for mentions of a particular function. Easier than with any vc system I know of, at least if the log is well written.
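As a rough illustration of that kind of scan, here is a minimal Python sketch. It assumes the usual GNU layout of a "date, name, <email>" header line followed by indented "* file (function): description" lines; the names `parse_changelog` and `mentions` are made up for this example and are not part of any existing tool, and each non-blank line is treated as its own item rather than joining continuation lines.

```python
import re

# Header lines look like: 2003-09-12  Luc Teirlinck  <teirllm@mail.auburn.edu>
HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.+?)\s+<([^>]+)>\s*$")


def parse_changelog(text):
    """Split ChangeLog text into entries: one dict per header line."""
    entries = []
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            entries.append({"date": m.group(1), "author": m.group(2),
                            "email": m.group(3), "changes": []})
        elif line.strip() and entries:
            # Any other non-blank line belongs to the most recent header.
            entries[-1]["changes"].append(line.strip())
    return entries


def mentions(entries, name):
    """Return (date, author, change line) for each line mentioning name."""
    return [(e["date"], e["author"], c)
            for e in entries for c in e["changes"] if name in c]


# Two entries taken from the emacs sample quoted above.
SAMPLE = """\
2003-09-14  Kim F. Storm  <storm@cua.dk>

\t* configure.in: Add checks for sys/ioctl.h and net/if.h.

2003-09-12  Luc Teirlinck  <teirllm@mail.auburn.edu>

\t* Makefile.in (install-arch-indep, uninstall): Add SES manual.
"""

entries = parse_changelog(SAMPLE)
hits = mentions(entries, "uninstall")
for date, author, change in hits:
    print(date, author, change)
```

A plain substring match is crude, but for a well-written log it is often enough to find when a given function or file was last touched and by whom.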
I've recently got hold of a few source tarballs written by parties unknown, and existing in different versions with no clear indication of when things were changed or why. If they'd come with ChangeLogs, things might be a bit easier. Of course the kind of person who omits even a README might not write a ChangeLog, but if somebody else had started one perhaps they might have continued it.
If we all used a version control system like arch or bitkeeper that carried history with the source then perhaps this might be less necessary. But even then there's no guarantee that every person who gets the source in the future will want to keep using that system...
If things are being kept in both CVS and a ChangeLog, it ought to be easy to use a little script or macro to keep them in sync.
There are scripts such as cvs2cl that produce a ChangeLog for CVS sources.
GNU emacs also has commands to integrate version control and ChangeLogs.
Some projects require that ChangeLog entries be submitted with patches. This means an explanation of the change in the originator's own words always gets into the project history. If the ChangeLog standards are enforced then the entry will have a level of detail and preciseness that might not be present in an informal description of the change.
As with any history logs, there is a small challenge in writing descriptions that will be comprehensible and useful to people reading them months or years hence.
I'm going to try this for a while...
posted Mon 17 Nov 2003 in /software/vc/changelogs | link
Alli Russell
Alli takes a step towards nerdliness.
posted Sat 15 Nov 2003 in /blogs | link
Ethics of replication
I rediscovered a post from Mark Wooding that I particularly like:
From: mdw.at.nsict.org (Mark Wooding)
Newsgroups: comp.text.pdf,sci.crypt,gnu.misc.discuss
Subject: Re: FBI - Adobe's lapdogs & government war on citizens
Date: 13 Aug 2001 20:45:40 GMT
Organization: National Society for the Inversion of Cuddly Tigers
Message-ID: <slrn9ngf3k.u7f.mdw@tux.nsict.org>
Robert J. Kolker wrote:
Without getting in pejorative terminology, do you think it is kosher to deny or deprive the owner of intellectual property an opportunity to sell it?
Yes. Not every situation is an appropriate sales opportunity.
I'm not qualified to decide on what's kosher, or halal for that matter.
Perhaps if you meant to ask a different question, you should have done.
For example, say you borrow a book from a library (no problem). You make two hard copies, one for you and one for your friend. Neither he nor you are likely to buy the book since you already have readable copies at hand.
Result. The publisher of the book has probably lost two sales.
I don't follow. First of all, you state that neither I nor my friend are likely to buy the book, since we have a copy at hand (in the library, presumably), and then complain that the publisher has lost sales. But libraries are OK.
And then there's the issue of a `lost' sale. How can it be lost? He never had it in the first place!
And this is the result of "sharing".
If the alternative is Stallman's `Right to Read' world, and we seem to be getting closer to that, uh, `ideal' every day, then count me down for sharing. Or whatever you want to call it.
Here's a thought experiment. Imagine you have a replicator, like in Star Trek. It costs about as much as a 100W light bulb to run, and is easy to maintain. It makes copies -- perfect working copies -- of inanimate objects[1]. All it takes is some space, to put the new copy in, and time to scan the original and make the new one. Suppose further that you bought yours from some guy in a corner shop for some small amount of money -- it's no hassle for him: he just replicates 'em, after all, and he's not in it for the money.
Which of these things do you think are morally `wrong', or should be `forbidden'? Justify your answers.
- selling replicators;
- giving replicators away;
- replicating up a small quantity of food to feed a family;
- replicating a friend's lunch (with his permission);
- replicating a friend's watch (with his permission);
- replicating a friend's computer (with his permission);
- replicating a friend's car (with his permission);
- replicating an artwork: (a) some old master, from before we had copyright; (b) some recent piece;
- replicating someone's watch (/without/ permission);
- replicating someone's car (/without/ permission);
- replicating stuff in a supermarket (/without/ permission);
- preventing replication of your private stuff;
- preventing (large scale) replication of food products (e.g., in third world countries).
I think that last is the only one which is actually `wrong' in any obvious way. I can argue for and against the others, but tend to fall in favour of allowing them.
I'm interested in answers from both sides of the debate.
[1] I don't want to get into the ethics of replicating live people, or even animals. We'll allow replication of dead stuff, so food is fair game.
-- [mdw]
posted Sat 15 Nov 2003 in /issues/copyright | link
The Geneva Convention On The Treatment of Object Aliasing
The Geneva Convention On The Treatment of Object Aliasing, Hogg, Lea, Wills, deChampeaux and Holt.
Aliasing has been a problem in both formal verification and practical programming for a number of years. To the formalist, it can be annoyingly difficult to prove the simple Hoare formula {x = true} y := false {x = true}. If x and y refer to the same boolean variable, i.e., x and y are aliased, then the formula will not be valid, and proving that aliasing cannot occur is not always straightforward. To the practicing programmer, aliases can result in mysterious bugs as variables change their values seemingly on their own. A classic example is the matrix multiply routine mult(left, right, result) which puts the product of its first two parameters into the third. This works perfectly well until the day some unsuspecting programmer writes the very reasonable statement mult(a, b, a). If the implementor of the routine did not consider the possibility that an argument may be aliased with the result, disaster is inevitable.
Over the years, solutions or workarounds have been found for aliasing problems in traditional languages, and the matter is seemingly under control. Unfortunately, as described below these solutions tend to be too conservative to be useful in object-oriented programs.
The object paradigm has been sold partly on the basis of the strong encapsulation that it provides. This is a misleading claim. A single object may be encapsulated, but single objects are not interesting. An object must be part of a system to be useful, and a system of objects is not necessarily encapsulated.
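The mult(a, b, a) hazard from the excerpt is easy to reproduce. The following is a minimal Python sketch, not code from the paper: `mult` is a deliberately naive routine that writes straight into `result`, and `mult_safe` is a hypothetical variant that builds the product in a temporary before copying it out. Matrices are assumed to be square lists of lists.

```python
def mult(left, right, result):
    """Write the product left x right into result, in place.

    Naive on purpose: assumes result is a different object from both inputs.
    """
    n = len(left)
    for i in range(n):
        for j in range(n):
            result[i][j] = sum(left[i][k] * right[k][j] for k in range(n))


def mult_safe(left, right, result):
    """Same contract, but tolerates result being aliased with an input."""
    n = len(left)
    product = [[sum(left[i][k] * right[k][j] for k in range(n))
                for j in range(n)] for i in range(n)]
    for i in range(n):
        result[i][:] = product[i]  # copy out only after all reads are done


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
c = [[0, 0], [0, 0]]

mult(a, b, c)   # no aliasing: c holds the true product
mult(a, b, a)   # result aliased with left: a is overwritten while still being read

a2 = [[1, 2], [3, 4]]
mult_safe(a2, b, a2)  # aliased call, but the temporary keeps the answer correct

print("expected:", c)
print("aliased: ", a)
print("safe:    ", a2)
```

The aliased call goes wrong because each element written into `result` is later read back as part of `left`. Copying through a temporary trades extra memory for safety, which is the kind of conservative workaround the excerpt says does not scale well to systems of objects.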
posted Sat 15 Nov 2003 in /software/theory | link
Photos

posted Sat 15 Nov 2003 in /travel | link
Tell us how you really feel...
posted Fri 14 Nov 2003 in /issues/sco-vs-linux | link
SCO releases a list of files
At long last, SCO have released a list of the Linux source files they claim infringe on their copyrights and/or proprietary information.
Initial analysis seems to show they just grepped for anything with the word "SMP". In particular, they think include/asm-m68k/spinlock.h infringes. The entire file is:
#ifndef __M68K_SPINLOCK_H
#define __M68K_SPINLOCK_H
#error "m68k doesn't do SMP yet"
#endif
Jon Corbet writes:
The other amusing thing is that they listed the files in a different form:
include.asm-m68k.spinlock.h
People finally figured it out - they needed to flatten the entire kernel directory hierarchy in order to be able to grep through it. It seems that SCO's products, those luxury cars of operating systems, lack a recursive grep...
I can just imagine some law intern somewhere renaming all those files, one by one.
posted Fri 14 Nov 2003 in /issues/sco-vs-linux | link
Spark Ada
Spark Ada, mentioned on RISKS, looks interesting: an annotated subset of Ada with unique and precise semantics, allowing static proof that, amongst other things, no run-time exceptions will occur.
From the Preface to the book,
SPARK has just those features required for writing reliable software: not so austere as to be a pain, but not so rich as to make program analysis out of the question. But it is sensible to share compiler technology with some other standard language and it so happens that Ada provides a better framework than many other languages. In fact, Ada seems to be the only language that has good lexical support for the concept of programming by contract by separating the ability to describe a software interface (the contract) from its implementation (the code) and enabling these to be analysed and compiled separately. The Eiffel language has created a strong interest in the concept of programming by contract which SPARK has embodied since its inception in the late 1980s.[...]
I have always been interested in techniques for writing reliable software, if only (presumably like most programmers) because I would like my programs to work without spending ages debugging the wretched things.
Perhaps my first realization that the tools used really mattered came with my experience of using Algol 60 when I was a programmer in the chemical industry. It was a delight to use a compiler that stopped me violating the bounds of arrays; it seemed such an advance over Fortran and other even more primitive languages which allowed programs to violate themselves in an arbitrary manner.
On the other hand I have always been slightly doubtful of the practicality of the formal theorists who like to define everything in some turgid specification language before contemplating the process known as programming. It has always seemed to me that formal specifications were pretty obscure to all but a few and might perhaps even make a program less reliable in a global sense by increasing the problem of communication between client and programmer.
posted Thu 13 Nov 2003 in /software/languages | link
No, SCO don't indemnify Samba customers
Turns out that SCO won't indemnify their customers:
In addition, the company continues to ship the GPL-covered Samba software, which lets Unix or Linux systems share files on Windows networks, as part of its UnixWare and OpenServer products.
SCO spokesman Blake Stowell said SCO doesn't offer indemnification, or legal protection, for use of Samba. As a hypothetical example, if Microsoft were to decide Samba violated its file system intellectual property and start suing companies that use the software, SCO would stop including Samba but wouldn't offer customers using the software legal protection, Stowell said.
"I'd be confident if we had any reservations that misappropriated code had gone into Samba, we ourselves would stop shipping it, and we would recommend to our users they stop using it," Stowell said. But of assuming responsibility for a Samba lawsuit, he said, "I don't think we could."
So, just to be clear: SCO's demanding that IBM and other vendors indemnify their customers against any problems arising from open source software, but SCO won't do that for their own customers. SCO would, if Stowell is believed, not only leave their customers open to lawsuits, but also stop providing updates. OK. Good to know.
And of course, if the GPL is invalid, SCO's presumably violating copyrights by distributing Samba without a licence...
SCO's Web site states unambiguously that it's not possible to offer indemnification on GPL software: "Some customers have asked their Linux distributors to indemnify them against intellectual property infringement claims in Linux. The Linux distributors are unable to do so because of the terms and conditions in the General Public License," a page describing SCO's Unix license said.
SCO has been suggesting that IBM should indemnify its Linux customers. "If IBM is so confident that Linux is free and clear, why don't they indemnify their users against any lawsuit SCO could bring against them?" Stowell said.
Not possible, eh? Unable? Not confident? HP just did it.
posted Fri 7 Nov 2003 in /issues/sco-vs-linux | link
Quilt
Andrew Morton kicked off a little version control tool which is now called Quilt. It doesn't seem to have many web resources at the moment, so I have mirrored the README here. It is in Debian.
Quilt (an assembly of patches, right?) wins in simplicity. Essentially it helps you organize the open-source process of generating patches against somebody else's tree, which you will later presumably mail to them or something similar. Quilt helps you manage the common case of needing to, say, apply several patches on top of a Linus tree, and then write your own work on top of that.
Leaving the means of archiving, distributing, and reviewing patches out of the scope of the tool is pretty smart.
I *think* Arch is doing something like this on the inside, but it's too hard to understand in the time available.
I don't know whether this is the direct cause, but it does seem like Larry McVoy has succeeded in taunting free software developers into writing something better than CVS, if not yet clearly better than BitKeeper. There has been a real flowering of interesting new version control systems. Not just different implementations, but genuinely new ways of thinking about the problem, or even of defining what the problem ought to be.
One thing Quilt suggests is that it really may be appropriate to use different tools at different times. The kind of operations you want for sending a single small patch to somebody else's package are quite different to when you're maintaining your own team's tree over many years. It may well be better to use different simple tools for each case rather than designing one complex one to do everything.
Reading about some of akpm's other work inspires a mix of nostalgia and awe.
posted Fri 7 Nov 2003 in /software/vc/quilt | link
Fort Collins
I'm in Fort Collins, Colorado at an HP internal get-together. Contrary to my expectations, the Denver-Loveland-Ft Collins strip of Colorado is completely flat and treeless. There are some very pretty and impressive mountains just next door though. I went for a walk in Rocky Mountain National Park and could feel the thin air when walking up hills. It's been snowing just enough to entertain. Temperatures in the 30s Fahrenheit don't feel uncomfortable compared to Canberra.
Pictures to follow.
posted Fri 7 Nov 2003 in /travel | link
A Personal Record
Reading Joseph Conrad's A Personal Record. Just great.
posted Fri 7 Nov 2003 in /books | link
SHFS
SHFS: a Linux remote filesystem implemented by running shell commands over SSH, kind of like emacs Tramp mode.
posted Fri 7 Nov 2003 in /software/linux-kernel | link
N410c
/evo-N410c-acpi-static--2003-11-06.diff: kludgy patch to make ACPI work on the HP Compaq Evo N410c laptop, and a kernel configuration. A far better description of how to make this work is here.
I haven't tested it very much yet but this does at least allow X to run and prevents the machine going into thermal shutdown.
posted Fri 7 Nov 2003 in /computers/hardware/N410c | link
Copyright (C) 1999-2007 Martin Pool.
