Arch reference card
tla reference card in various formats, plus code to produce it.
posted Mon 31 May 2004 in /software/vc/arch | link
Stupid amateurs
177 O + Sun 30 May 04 6.1K Unleashes S. Morphem Stupid Amateur Girl online now
posted Sun 30 May 2004 in /issues/spam/wierdness | link
Bad analogies
esr says:
I haven't seen a book quite so egregiously shoddy and dishonest since Michael Bellesiles's Arming America.
I agree with him that it is an atrocious book.
I think this comparison is very unfortunate, though predictable. Whatever you may think of Arming America (I have not read it), gun control is an issue on which reasonable people can and do differ. Perhaps there is no single answer: I can admit concealed carry suits the cultural and historical situation of Texas, but still prefer gun control in Canberra. It is one of those soft issues which are as much about values as evidence. It is certainly an issue that shows no signs of being settled soon.
Brown's output is garbage of a different order. It is simply, provably, unambiguously false. No informed person, whatever their politics, gives it any credibility. An intelligent reader with no knowledge of Linux whatsoever could see he makes wild assertions without proof. (Disagree? Mail me.)
Perhaps Arming America is provably wrong as well. I don't know or care. I do think it's a bad idea to even compare the provenance of Linux, which is an open-and-shut case, to contentious issues such as gun control. A more apt comparison is Brown insisting that powered flight is impossible.
posted Sat 29 May 2004 in /issues/adti | link
The Vesta Configuration Management System
Vesta is a CM system released under the LGPL by Compaq (now HP). It is designed to allow tight control of very large projects. The Vesta project describes itself:
For substantial software systems (say, 500k source lines or larger), effective software configuration management (SCM) is a serious problem. It is not generally known how to make a configuration management system that is both easy to use and general enough to handle multi-million line software projects. As a result, the world is full of easy-to-use, small-scale configuration management tools and large-scale, hard-to-use ones.
Vesta is a portable SCM system targeted at supporting development of software systems of almost any size, from fairly small (under 10,000 source lines) to very large (10,000,000 source lines).
Vesta is a mature system. It is the result of over 10 years of research and development at the Compaq/Digital Systems Research Center, and it was in production use by Compaq's Alpha microprocessor group for over two and a half years. The Alpha group had over 150 active developers at two sites thousands of miles apart, on the east and west coasts of the United States. The group used Vesta to manage builds with as much as 130 MB of source data, each producing 1.5 GB of derived data. The builds done at the eastern site in an average day produced about 10-15 GB of derived data, all managed by Vesta. Although Vesta was designed with software development in mind, the Alpha group demonstrated the system's flexibility by using it for hardware development, checking their hardware description language files into Vesta's source code control facility and building simulators and other derived objects with Vesta's builder. The members of the former Alpha group, now a part of Intel, are continuing to use Vesta today in a new microprocessor project.
I have not tried it myself, but I recently had an interesting conversation with Ian Wienand. His group decided to try it out, and I asked him how it turned out:
So how has it worked out? Did you enjoy it? What has annoyed you in Vesta?
Well, the situation was a hard deadline (OSDI), a pre-existing source tree and a whole bunch of people looking to get rid of CVS.
It never worked out for us, even with two in house champions. Classic software engineering 101 stuff; combine a new and complicated product (that may or may not be better) with people who are pretty set in their ways with their old product that basically works anyway...
If we didn't have the hard deadline, maybe people would have given it more of a chance, but after about a week of trying to use it a meeting was held and it was like "who is spending more time struggling with Vesta than actually hacking?"; hands were raised and we were back to the status quo. For such a huge paradigm shift in the view of source control, we never even gave it a chance.
But everyone always has deadlines; that's life. You have to really *want* to use Vesta to use Vesta; it's not easy to setup, run or maintain. Maybe those things could be worked out with more developers and documentation but it's a long path.
I think Vesta can be described as a little too academic; a great idea that everyone should be using, but for some reason reality is different (cf. Tanenbaum's recent microkernel jabs).
Vesta integrates both CM and a build system. Changing your build system and VC system at the same time is likely to be more than twice as hard as changing just one.
Yes, totally. It's monolithic in that sense; everything is bundled together.
We have been having quite good success at HP with SCons and Subversion. SCons in combination with ccache gives you very fast and reliable rebuilds, so there is no excuse for broken commits. Subversion makes sure the commits are atomic.
I think migrating to an architecture like that has a much higher probability of working because it describes the time tested UNIX philosophy of plugging together small components that do one thing well.
I wish we had tried something like this -- thinking out loud
- ccache/distcc drops around gcc, no changing how you work
- We have ended up with our own internal build environment (in Python too) which people had to learn their way around anyway. Equivalent of changing to Scons (but Scons has the advantage of a community).
- CVS -> Subversion/Arch change is manageable.
The beauty is there if any one piece fails to catch on, you've still got the advantages of the other bits. With Vesta, it's take it or leave it.
I have a feeling Vesta started out as a strong in-house thing at Digital, where people probably got really used to it and set things up so it would work well. If you weren't born into it it might be hard to adapt...
I think also maybe it works well for, say, a closed source operating system group where you 100% need the same tool chains because your code is so dependent, everyone who wants the code is a developer, you have very well defined targets and your ultimate goal is to ship a binary.
I think your traditional open source model fails on every single one of those points :)
posted Fri 28 May 2004 in /software/vc/vesta | link
Raymond and Matzen on Samizdat: "unpublishably bad"
Eric Raymond gives his opinion of Samizdat. Summary: it “is unpublishably bad. Fix it or bury it.”
Jem Matzen has more comments:
The only shocking aspect of Ken Brown's book is that it contains not one shred or iota of evidence to back any of his implications. While he doesn't directly accuse, he also doesn't present any good reasons to believe that we should listen to him. The bibliography, for instance, has 81 items of reference, less than five of which are traditionally recognized reference sources. The greater part of Brown's sources are personal Web pages of people who are not considered experts in the field of Unix, Linux, GNU, or other related subjects, home pages of people who are considered experts but were speaking generally about the subject of the history of Unix, and quotes taken grossly out of context from interviews that Brown did not conduct or take part in.
You don't have to be an author or professional writer to know that when presenting an argument professionally, the strength of your sources is the strength of your position. With no reliable sources, a position paper, thesis, or essay carries no more weight than the Anonymous Coward comments on weblogs and message forums -- in other words, it's bunk. For entertainment purposes only. Read at your own risk. Worse than bunk, it's FUD because it pushes an agenda without presenting any proof. [...]
It is the worst journalism, the worst research, the worst case of abuse of the literary and technical world that I have ever had the profound displeasure of reading.[...]
I could find no evidence of SCO funding. For all the times I've tried in the past, they never return calls or emails and I doubt very much that they'd tell me anyway. My feeling is that SCO doesn't have the money to play these kinds of silly games with; history dictates that Darl McBride and his cohorts are perfectly willing to generate their own untruths for the press and would probably view the Alexis de Tocqueville Institution as unnecessary.
posted Fri 28 May 2004 in /issues/adti | link
Salus: "Be ashamed, Mr. Brown!"
Peter H Salus is a prominent historian of Unix, and wrote one of the definitive works, A Quarter Century of UNIX. (I read it a while ago and can recommend it.)
What does he think of the "book" by Ken Brown from AdTI in which it is "revealed" that Linus Torvalds did not "invent" Linux? Salus says on UnixReview.com:
Alexis de Tocqueville observed that it is easier for the world to accept a simple lie than a complex truth.
So there's a painful irony when we're forced to recognize the validity of de Tocqueville's remark in a May press release from the head of the Alexis de Tocqueville Institution, Ken Brown. [...]
As the rest of the world minus Brown has always known: science always builds on previous work, and computer languages and operating systems especially so. CTSS led to Multics led to Unix led to Minix led to Linux. All without any copyright infringement; just learning lessons and incremental improvement.
posted Thu 27 May 2004 in /issues/adti | link
Canberra, Autumn
posted Thu 27 May 2004 in /photo/nature/au/act | link
Blogs holding updated documents
Nick Moffitt made the wise observation that "blog" is usually just a word for a website that is regularly updated, rather than left to stagnate with an animated GIF of a council worker bludging.
I'd also add that the other feature is that they allow some kind of chronological view of the information, so that people who revisit the site can see what's new since last time. This is a substantial improvement on those corny little new.gif icons we used to use: things are not "new" or "not-new", they occur over a stream of time. What is new to one person might not be new to another. You can approximate "new to you" using cookies, but I just don't think they work as well.
And then you can extend this by exporting it as an RSS or similar feed, so that people can discover what if anything is new without needing to actually visit the site.
So why would any site not want to have all these features? Almost all sites, even if they're not explicitly news or journal sites, are going to be updated over time. People will revisit them and want to know what's new, or they might like to have an RSS overview of changes. Every site ought to have some attributes of what we currently think of as a blog.
At the same time current blogs that I am aware of are a bit unsatisfying in several ways. They seem too tightly coupled to chronology. There's also a limited representation of updates — if I have second thoughts on a topic I want to be able to update it, have people be able to see that it's updated, but not make it pop to the top in the way it does in Blosxom.
Wikis do a little better here: you can view recent changes or history of a page, but you can also navigate without worrying about chronology.
Maybe some other software handles these better than Blosxom does?
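To make the wish concrete, here is a minimal sketch of an entry model that keeps "when it was posted" separate from "when it last changed". It is not how Blosxom or any other existing tool works; the `Entry` class and the `front_page` and `changed_since` functions are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Entry:
    title: str
    posted: datetime                    # when the entry first appeared
    updated: Optional[datetime] = None  # set only when the entry is revised

    @property
    def last_changed(self) -> datetime:
        """Most recent moment at which the entry was posted or revised."""
        return self.updated or self.posted


def front_page(entries: List[Entry]) -> List[Entry]:
    """Newest first by original posting date, so a revision does not
    make an old entry pop back to the top."""
    return sorted(entries, key=lambda e: e.posted, reverse=True)


def changed_since(entries: List[Entry], last_visit: datetime) -> List[Entry]:
    """Everything posted or revised after last_visit, most recent change
    first. This is the view a feed or a 'recent changes' page could use."""
    hits = [e for e in entries if e.last_changed > last_visit]
    return sorted(hits, key=lambda e: e.last_changed, reverse=True)
```

With this split, a revised entry stays where it was on the front page but still shows up for a returning reader who asks what has changed since their last visit.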
posted Thu 27 May 2004 in /meta | link
Dissecting Ken Brown's "Samizdat"
Ken Brown and Justin Orndorff from the "Alexis de Tocqueville Institution" have written a paper entitled Samizdat. (I have more on the origin and meaning of the name here.) In it, they state that Linus did not write Linux, and they suggest that he must have cheated by copying from Minix or Unix. They make various other allegations against open source developers that are similar to those seen in recent Microsoft and SCO press releases.
AdTI and Microsoft have confirmed that Microsoft provides funding to AdTI.
My opinion is:
- The paper is poorly written, full of contradictions and grammatical errors. If their essay were a program, it would not even compile, let alone work.
- Nearly every page makes an unsubstantiated assertion. Brown seems to feel that just inserting "it is clear that", "ironically", "clearly", or "it is widely known" is an adequate substitute for cited evidence. Ironically, it clearly is not.
- Brown clearly does not understand the terms he uses, such as "copyright", "public domain" or "open source". He does not seem to understand that copyright protects representations, not ideas. In several places he seems to think that open source is in the public domain.
- Quotes such as "sometimes theft is necessary" are attributed to the open source community without any evidence they were ever uttered by anyone.
- Experts are asked misleading or hypothetical questions to elicit quotes that are used out of context. I think AdTI is not honest enough to ask straight questions because the answers would not suit them.
- Brown says he can't believe that Linus wrote Linux, because... well, he just can't believe it. Nothing more. He does not cite even a single line of Linux source that was copied from any other system, even though all the data needed to check this is available to him. If he found even one line, his paper might be credible. But he does not.
- When sources are cited, Brown grossly misinterprets the data: diagrams that do not show code descent are interpreted as showing code descent.
- If Microsoft paid AdTI to write this, they didn't get much for their money. (They should have paid me instead. Open source is not perfect, and you can criticize it without needing to make your evidence up, if you are prepared to do a little work.)
- AdTI would like universities to release their work under something like the MIT licence, rather than the GPL or proprietary licences. At least this is not obviously silly, though as usual they just state it without making a meaningful case.
- Perhaps worst of all, the authors did not even speak to Linus before publishing these fabulous allegations against him.
I'm working with a friend on a review copy of Brown's Samizdat paper, obtained directly from AdTI. This is a work in progress; if you have comments please mail me.
I see Brown forgot to put a copyright statement on his paper, which is slightly amusing for someone getting so hot under the collar about copyright without actually understanding it. Nevertheless, I suppose it has an implicit copyright by the Alexis de Tocqueville Institution.
A few paragraphs are reproduced here under fair use rights for the purpose of criticism. The copy I got is labelled "final", so I think it's fair to assume this is what they will go to press with: typos, gaping errors and all. According to AdTI, this is part of a “soon-to-be-published book on operating systems and open source.” I hope that AdTI will feel the rest of the book can stand up to rigorous scrutiny and review. On the basis of Samizdat, I really cannot suggest that anyone spend any money to acquire a copy.
There is a fine discussion by Roaring Penguin of AdTI's previous paper. There too, AdTI seems to ask a lot of rhetorical questions without any evidence or logical argument.
On to the paper:
pg6:
"Samizdat: The Source of Open Source Code", discusses the controversial production factory of "free" computer source code. While the literal meaning of Samizdat refers to a period of freedom-fighting publishers in early Russia, the term has been borrowed by programmers that engage in the practice of surreptitiously circulating and/or using software source code that belongs to other individuals or companies. Whether it is reverse engineering, employee theft, or Rembrandt-like copying, plagiarism in software programming has become the proud flag of many in the `open source movement'.
I know a pretty good selection of prominent open source programmers, and I don't know of anyone who approves of plagiarism, let alone bears it as a “proud flag”. On the other hand, I find piracy commonly accepted by users of proprietary operating systems, and I would estimate the majority of Windows machines hold unlicenced software. (How many of you are still "evaluating" WinZip?)
Brown does not quote even a single person stating that position, let alone evidence that it is representative of the movement as a whole. He is entitled to make the claim if he can substantiate it, but this is mere assertion.
pg9:
Software is a business. But ironically, free software is a business too. The free software model provides users with accompanying source code for modification or development of the original software. The business logic for providing free source code is to enable clients to modify/customize aspects of the accompanying software.
I count 11 instances of "irony", or roughly one every 6 pages. Most are misused, and Brown really means to say "unexpectedly", "improbably", etc. In this case, he seems to be trying to suggest that it is inconsistent to sell free software. Of course there is an explanation for the apparent contradiction ("free as in speech, not beer") but Brown doesn't share it with his audience. One suspects that Brown wants to give the impression that free software companies are hypocritical, but he can't actually say it.
The only ironic thing here is that Brown doesn't seem to know much about the business of free software either. Enabling clients to modify or customize the software is one advantage, but not the most important for most users. Other important ones are that customers are not locked to a particular vendor, often can obtain free software at lower cost than proprietary software, can fix their own bugs, have the source as the ultimate reference documentation, will never be orphaned, can learn from examining the source, can check there are no security backdoors, will never be forced to a new licencing plan, and so on.
Linux and many other products are referred to as open source. But in fact they would more properly be referred to as hybrid source, products that attempt to offer the benefit of true open source, but operate in a commercial world like traditional proprietary products. For example Apache is a true open source product. In contrast, the Red Hat Linux operating system is a hybrid product. It is very important to differentiate between the two.
In fact Linux is not just "referred to as open source", it is open source. The term has a formal definition, which Brown has apparently not even read. If the formal definition is not enough, consider that Linux is overwhelmingly the most commonly cited example of open source.
Nevertheless, throughout the paper Brown continues to insist that "Linux is not true open source". It's not only disingenuous, but also makes the remainder of the paper harder to read, since we must always mentally search-and-replace to get the true meaning.
Brown is also deeply confused about the market: Apache operates in a commercial world; is sold, serviced and supported; and competes with proprietary offerings. Just like Linux, and just like BSD, for that matter.
True open source is software and source code that can be used for any reason, for any use. If you get it with a license, it only requires attribution, or a copyright notice. You can modify it in any way and sell it as your own, without any additional requirements.
The second type, hybrid source, gets the lion's share of attention. It is software that is also no cost or free, but any modification to it becomes the `equal property' of the original author and any user that is interested in it.
Brown seems to be trying to distinguish BSD/MIT-style licences which allow for code to be "taken proprietary", from reciprocal licences such as the GPL which preserve openness in later distribution. His characterization of both licences is inaccurate in the details.
It's fine that Brown, as a Microsoft apologist, would rather my free software could be reused in closed software without me being paid. I can understand him wanting that. But I don't have to allow it. Brown calling my friends a pack of thieves is hardly persuasive.
pg10:
Although introduced at a much later date, ironically, hybrid source has become the largest pool of free open source software.
Again with the irony, Mr Brown?
I can't even see a way in which the emergence of GPL'd software would be unexpected, let alone ironic. Probably part of it is just chance, but I think it's reasonable to believe that the GPL helps prevent fragmentation and therefore gives a better chance of long-term success than do BSD licences.
The empirical success of open source is integral to the promotion of all science and technology.
How kind of him to say so!
However, it is unquestionable that the hybrid source code model is having a deleterious effect on both true open source, research and development, and the commercial intellectual property economy.
Unquestionable, eh? It is a shame that he doesn't provide some kind of substantiation for the benefit of those so cheeky as to question his statement.
pg11:
Linux and other hybrid source code products 3, commonly referred to as `open source' software, have steadily migrated into the IT departments of both private and public institutions.
Yes, thank you. Roughly $8,000,000,000 last year, by some estimates.
As usage of the non-proprietary model of selling software and software services grows, like any other new technology, it is important to continually analyze its accompanying opportunities and consequences to best implement and shape relevant public policy.
What a remarkable assertion from a self-described liberal/libertarian thinktank: any new technology needs a public policy, and government intervention. I suppose the idea that free people might decide to release and procure software under the terms they think best is a bit too subversive for Brown. Why, next they will be deciding for themselves what books to read!
Software is source code - and the topic of the `source' of the code is as big as the billion dollar industry itself.
I think there's a lot of truth in "software is source code", and I hear "if it isn't source, it isn't software" is a rule of thumb in NASA procurement. That seems to imply that Microsoft don't release any software, aside from a little build tool. But they do make good ergonomic keyboards.
An issue that flies beneath the radar is the question: where does the successful Linux product come from?
Brown seems to be setting up for the idea that Linux source code was really stolen from SCO or someone else. Good luck: IBM, SCO and the court system have already spent quite a lot of time and effort establishing that isn't the case. But I'd like to see him try.
The origin of true open source code doesn't really matter, because a) it does not have many significant legal consequences of misuse b) it has almost no use restriction. It is definitely free--commercial products such as Linux are entirely different.
By "true open source", he means BSD-licenced source. I suppose Brown didn't do his homework enough to realize there was a big court case a few years ago about BSD, despite the licence.
We know where traditional commercial proprietary source code comes from. We also know who its original owners are.
Where? Again, mere unfounded assertion contrary to the evidence. Proprietary software customers have no idea where the source came from: what country it was written in; who wrote it; who it was licenced from; what trade secrets or patents it may embody; what security backdoors it may contain.
However, we don't really know what the origin of the bulk of hybrid source code is. We don't know much about this pool of software, other than what we are told. The assumption is, there is no cause to ask---For example, we know that Linux is a free public domain product, given to us by its inventor Linus Torvalds. But not many people ask where did it come from?
This is really getting silly. Earlier he said the GPL is almost a proprietary licence. Now he says Linux is in the public domain. Which is it? It can't be both.
The difference between "public domain" and "free software" is one of the most basic points in understanding software licencing. Clearly Brown does not understand it.
Is it a dumb question to ask, "what is the origin, the `source' of this pool of source code?"
No, it's not a dumb question. Ask nicely, as Boston Consulting Group did a couple of years ago, and you'll get a detailed and quite fascinating answer. But assume your conclusions, and you make yourself a laughingstock.
Some critics are even unlucky enough to receive widespread excoriation in public forums.
Well, I'm sure such a sterling paper would never deserve that.
Ch2:
Here follows a mediocre summary of the history of Unix. It's more or less correct, though you could get a more accurate and interesting description from Raymond or Salus. Even better, take Nick Moffitt to the Tied House, and hear all about it over beer.
David Bloch an attorney with McDermott, Will & Emery discussing the question hypothetically comments, [27 David Bloch interview with AdTI, April 9, 2004. Bloch was NOT asked about the Lions incident specifically, only legal questions about scenario.]
AdTI has a consistent pattern of asking people for comments on hypothetical scenarios and applying those comments out of context to Linux. This allows Brown to give the impression that Bloch, or Tanenbaum, or Ritchie is saying "Linux is X", when they said no such thing.
pg24:
"Sometimes a little theft is necessary".
"There is theft everywhere and the open source community should not be singled out."
"The samizdat exchange was outright theft but it was necessary."
Quotes supposed to be from open source programmers, but not attributed. Did they just make them up?
Perhaps we should attribute thoughts to "Factions within the AdTI" on whether wife-beating is "sometimes OK", "happens all the time", or "is absolutely necessary"?
A lot of context about the Lions book seems to be missing.
pg27:
Brown looks at the Unix history diagram by Eric Levenez. Despite a categorical statement by Levenez that the diagram is not a representation of copyrights or patents, Brown proceeds to assume that almost all Unix-like systems “originate from licensed Unix code, a Unix licensee, or a previous Unix licensee”.
Brown could have read in any number of books that Unix has been independently rewritten several times. Clearly he has not done his research or is wilfully ignoring the facts.
pg37:
What follows is what is called "an argument from personal incredulity." Brown says, in nearly so many words: "I can't believe anyone can just sit down and write an operating system kernel. So it must not have happened." The same argument works equally well against heavier-than-air flight.
Writing 7000 lines of rough first-cut code for Linux 0.01 in a few months is entirely plausible. Brown doesn't seem to have consulted any programmers before deciding it's impossible.
Brown seems to have the idea that all operating systems are terribly large things. If it was expensive to write Windows 2000, it must have been equally expensive to write Linux 0.01! Therefore, Linus could not possibly have done it himself.
Linus could never have written the whole kernel that we have today by himself. What he could do in those first few months was to get enough of it going to act as a seed crystal for all the other people who wanted a high-quality free unix. Perhaps the first version wasn't very good, but it was a start. Plenty has been written about how good it is for open projects to release early and often, and to do other things to encourage contributions. Linux just did those things well — perhaps better than most people had before.
Contrary to Brown: it is possible to eat an elephant, if you do it in small pieces, and have a lot of hungry friends.
The whole question of whether Linux contained Unix source code is easy to answer: check whether there is any code in common. Brown should be replaced by a small Perl script. SCO tried and failed, but AdTI is welcome to try.
The source for ancient Unix, Minix and Linux is available, so it's easy to check. One might hope that if Brown could produce any evidence he would do so and not rant on about incredulity and irony.
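A check of this kind does not need to be clever. As a rough sketch (in Python rather than Perl, with made-up function names and an arbitrary 20-character threshold), one could index every non-trivial line in each source tree and report the lines that appear in both. This is not a method anyone named here actually used, and a match is only a candidate for a human to look at, since short idioms such as loop headers will turn up in any two C programs.

```python
import os
import re
from collections import defaultdict

# Assumption: only C and assembler sources are worth comparing.
SOURCE_SUFFIXES = (".c", ".h", ".s", ".S")


def normalized_lines(root, min_length=20):
    """Map each non-trivial, whitespace-normalized line under root
    to the set of files (relative to root) that contain it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(SOURCE_SUFFIXES):
                continue
            path = os.path.join(dirpath, name)
            # latin-1 never fails to decode, which suits old source trees.
            with open(path, encoding="latin-1") as handle:
                for raw in handle:
                    line = re.sub(r"\s+", " ", raw).strip()
                    if len(line) >= min_length:
                        index[line].add(os.path.relpath(path, root))
    return index


def shared_lines(tree_a, tree_b, min_length=20):
    """Return {line: (files_in_a, files_in_b)} for lines found in both trees."""
    a = normalized_lines(tree_a, min_length)
    b = normalized_lines(tree_b, min_length)
    return {
        line: (sorted(a[line]), sorted(b[line]))
        for line in a.keys() & b.keys()
    }
```

Calling `shared_lines("minix", "linux-0.01")` on two unpacked trees (the directory names are placeholders) would give a list short enough to read by hand. Anyone alleging copying could then point at specific files and lines.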
pg43:
A tedious recitation of how slow it is to develop large monolithic software systems such as "Windows NT 5.0", now known as Windows 2000. There's no explanation of how this is meant to be at all relevant to Linux 0.01, which was about 7000 lines. Aside from an enormous difference in scale, W2k was slowed down by maintaining backward binary compatibility, compatibility with a massive range of hardware, testing to a level appropriate for a mature rather than first-cut product, and the overheads of communication and organization between hundreds of developers. Linux 0.01 had none of these costs, and so was developed proportionally faster.
Brown cites sources such as The Mythical Man Month, but if he had truly read and understood them, he would have seen why it is entirely possible for one programmer to write 7,000 lines in six months. Linus was working under perfect conditions: a good programmer, a green-fields project, flexible requirements, a low quality bar, and no management overhead.
pg49:
It is also important to note that if motivated parties with the power of subpoenas, witnesses, interviews, and evidence delved deeper into the development of Unix to Minix to Linux (UML) there are a number of reasons why there could potentially be problems.
Brown seems to have been asleep for the last couple of years, and to have not noticed that motivated parties (Microsoft and SCO) have in fact been trying to find copyright problems in Linux, with negligible success.
But to this day, Linux, a product known virtually around the world, still does not properly credit Minix for its source code, its derivative use or its influence. Arguably, this has cost Prentice Hall considerable book sales from the years 1987 to present. In addition, it also obviously cost Prentice Hall sales between 1987 and 2000. One reason is due to the loss of customers that would have bought the Prentice Hall publication for the Minix code.
I'm not sure what degree of credit is necessary for a program you used 12 years ago that made you feel like writing one of your own. I would guess that it is honest to mention it in books or interviews about the history, which is what Linus does.
As Tanenbaum points out, the purpose of the Minix books and software is to be a teaching tool. The purpose of Linux is to be a practical operating system, and it is perhaps now getting too large to be comfortably used for teaching. They are complementary, not substitutable.
pg49:
Instead of buying the Tanenbaum book for the Minix code, they could get a free copy of Linux.
Hypothetically, at $100 per book, at a loss of just 500 book sales a year, to date, Prentice Hall and Tanenbaum have lost almost $1,000,000 in revenues. This is of course only represents compensatory damages, not punitive. Arguably, Prentice Hall has lost out on tens of millions of dollars.
Let's just check this: $100 × 500 × 12 years is $600,000. That's a big rounding error! Should we really trust people who can't do grade-school arithmetic to give advice on public policy or corporate strategy?
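For anyone who wants to repeat the check, the sum is spelled out below. The price and the annual lost sales are Brown's own hypothetical figures; the 12-year span is the one used above, and the variable names are just for illustration.

```python
PRICE_PER_BOOK = 100       # dollars, Brown's hypothetical price
LOST_SALES_PER_YEAR = 500  # Brown's hypothetical lost sales
YEARS = 12                 # the span used in the check above


def lost_revenue(price, sales_per_year, years):
    """Total hypothetical revenue lost over the whole period, in dollars."""
    return price * sales_per_year * years


total = lost_revenue(PRICE_PER_BOOK, LOST_SALES_PER_YEAR, YEARS)
shortfall = 1_000_000 - total  # distance from Brown's "almost $1,000,000"

print(f"${total:,}")      # $600,000
print(f"${shortfall:,}")  # $400,000
```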
The whole argument rests on the assumption that Linux incorporates Minix code, for which Brown presents not a shred of primary evidence and which has been resoundingly rebutted by the author of Minix.
I note that Prentice Hall currently publishes a number of books about Linux, including Understanding the Linux Virtual Memory Manager, Linux Programming by Example, and Samba 3 by Example. Since their offerings cover a broader range than just the design of microkernel operating systems, I would venture that Prentice Hall probably sell more Linux books than Brown's estimate of 500 lost Minix sales.
"Arguably" means "it may be argued that". But Brown doesn't actually argue it. Why didn't Brown ask Tanenbaum or Prentice Hall if they felt Linux infringed their copyrights, or if they felt bad about the idea of lost sales? Apparently either he didn't ask, or he didn't like the answer.
pg59:
Another interesting perspective on the credits files is the limited credit to members from developing countries in the Tuomi chart. 85 This can be explained away by simply suggesting that non-English speaking countries would have been slow to show interest in Linux development. However, by 2000, although it is widely known that China and India are heavy Linux developers, they both receive an insignificant amount of credit in the Linux credits files 86. In fact, India, an English-speaking country, is non-existent, while countries such as Mexico, Brazil, and Argentina are recorded with minimal presence. Amusingly, while the Tuomi chart studies Linux credits from 1991 to 2000 from over thirty countries, according to Tuomi's study of the credits files, Finland per million inhabitants, remains the number one source of original Linux code for the ten year project.
It is almost certain that Tuomi, a scientist and rigorous researcher, does not introduce this data to argue that there may be country bias in the credits files. However, Tuomi's point "History has a very selective memory..." could be relevant in this instance as well. After all, the Matthew Effect historically has been very effective in purging the origin of invention from developing countries for many years. We don't have any evidence that it occurred with Linux. However, it is conspicuous that an open source model, touted to uplift developing countries, does not seem to have contributions from the very countries Linux advocates are arguing they are interested in promoting.
Tuomi, a "rigorous researcher", doesn't see any bias here. But the less rigorous Mr Brown does, or at least tries to imply it.
Mind you, he doesn't actually name even one Indian developer who has been omitted, or say more than "it is widely known that India and China are heavy contributors." I think I'd question that point: I've personally only seen a lot of people from India coming into open source over the last two or three years, which is after this survey concluded. Even then, many of the contributions seem to be in other places than the kernel. (I use the circumlocution because of course there have been ethnic Indian and Chinese people living in other countries and working on free software for a rather longer time.)
Credits in the Linux kernel are maintained by the developers themselves, using patches like this. If anyone feels they have not been credited sufficiently or accurately, they can easily correct the record just by sending a patch or a request for their name to be added. This policy is stated in the file in question, linux-2.6/MAINTAINERS:
PLEASE try to include any credit lines you want added with the patch. It avoids people being missed off by mistake and makes it easier to know who wants adding and who doesn't.
I am not aware of any complaints from developers that Linus is ignoring those requests. On the contrary, there is plenty of evidence on the kernel list archives that the credits file is updated when people request it. For example, here is an excerpt of the changes in linux 2.6.5, showing updates to author information:
@@ -1875,6 +1864,13 @@
 S: D53424 Remagen
 S: Germany
 
+N: Colin Leroy
+E: colin@colino.net
+W: http://www.geekounet.org/
+D: PowerMac adt7467 fan driver
+S: Toulouse
+S: France
+
 N: Achim Leubner
 E: achim_leubner@adaptec.com
 D: GDT Disk Array Controller/Storage RAID controller driver
It seems to me the onus is on Brown to prove that there is a conspiracy to not give credit to Indian developers. He does not provide any evidence.
pg62:
Brown says that "reverse engineering" (of what?) is a possible explanation for how Linux developed so quickly.
Anyone who has ever seen how slow and painful reverse engineering can be is welcome to laugh at this point. Given the choice between reverse engineering a program and writing one from scratch, I'd always go from scratch.
Brown also proposes that the reason why we can't find Minix code in Linux is that it was obfuscated. Never mind that the designs are completely different. Brown doesn't actually point to any code in Linux which shows signs of obfuscation.
pg72:
Handwave at outsourcing. Of course.
pg74:
Paradoxically, every dollar of advertising and promotion corporations such as IBM and Oracle contribute to increasing customer interest in 'Free Platforms' respectively will cost these companies lucrative accounts.
IBM is fortunate to have Microsoft/AdTI be so concerned for IBM's welfare. No doubt IBM and Oracle would have been better off to forgo their billions of dollars in Linux-related revenue.
pg75:
To defend themselves, all open source organizations are slowly becoming more bureaucratic and more closed--more like proprietary software companies.
I'm sure Microsoft/AdTI wishes that were the case, but I don't see it happening. On the contrary, open source processes seem to be getting more efficient all the time: GNOME and Fedora for example now ship on a regular schedule, whereas Microsoft's Longhorn has slipped another couple of years.
If Brown has evidence of open source projects becoming bureaucratic I would be interested to read it, but it seems he does not.
pg77:
A section titled "Achieving Balance". Given that the entire essay to date has been Brown's fervid imaginings without a shred of evidence, it is hard to see why any adjustment is required.
Corporate interests cannot fund truly free software because their interests are tied to the promotion of their business.
Is he saying that corporations are not able to fund free software, or that they must not be allowed? If the first, who is Brown to tell people how to run their business? If the second, on what grounds does he propose to outlaw cooperation between free parties? I see no answer here.
It is in the best interest for the federal government to take the lead on funding a bigger open source project at universities. The commercial open source model is 1) depreciating the value of U.S. proprietary software 2) depreciating the value of U.S. investment in the IT industry 3) diminishing the returns of the IT industry which is in turn send U.S. jobs overseas to make up for losses. 4) funding the devolution of the U.S. intellectual property rights economy.
On the other hand, free software is making some American businesses more efficient, and offering better products to some American consumers. Perhaps we could try to estimate the costs and benefits, or perhaps we could just say that the free market will work it out. But Brown just rather tediously makes the assertion and moves on.
(At this point in the essay Brown seems to switch from "ironic" to "inane" as a favourite word to wave around when he can't think of anything better to say.)
1) The government should support R&D at universities with open source projects that produce research that all parties can use. This includes developers and commercial interests. However, taxpayer dollars cannot support open source projects that are tied to commercial open source models that compete with the private sector.
2) Universities and colleges that receive government grants should not be able use taxpayer dollars to generate source code that is restrictive. Both individuals and business should be able subsequently to develop free software and protect it as its own intellectual property
Here, at last, two consecutive paragraphs(!) that are concrete and not contradictory. He seems to argue that all government projects should produce code that is MIT-licenced. I can see some sense in that, though there are some difficulties.
If consistently applied, it would also prevent universities from working on proprietary code, as they might currently do in joint ventures. I'm not sure if Brown sees that as a positive good or if he just didn't think of it. It might be a reasonable tradeoff. One might equally argue that all work should be GPL'd, so as to guarantee ongoing public access. Or one might argue that, as at present, it should be decided by the university case-by-case.
If AdTI wants to persuade universities to release their work under the MIT licence, they need to make a stronger argument than they have to date.
To be clear, the hybrid open source model encourages conspicuous development and proprietary software models.
I'm not sure exactly what that is meant to mean, but I like the sound of it. "Conspicuous contribution", rather than "conspicuous consumption." Nice.
Finally, U.S. corporations, especially in today's economy, would only benefit by more research and development assistance.
Oh, so AdTI is not so liberal after all. Here we get close to the truth: a plea from Microsoft to the government to make the nasty free-speech subversives go away, and to give Microsoft more public money. How sad. Adapt or die.
pg87:
Here is included a list of many papers by Andrew Tanenbaum. An impressive record, but most of them have little to do with Linux. Who knows what it's doing here? If it's an attempt to rub some of Professor Tanenbaum's credibility off onto AdTI, it seems they failed miserably.
In conclusion, I can do no better than to repeat David Skoll's summary of the previous AdTI fluff:
The entire AdTI study is a commercial funded by Microsoft, whose sole aim is to counter the growing adoption of GPL'd software. The report contains nothing constructive or useful. It is a sham.
posted Wed 26 May 2004 in /issues/adti | link
Yesterday, my program worked. Today, it does not. Why?
Andreas Zeller has an interesting paper. Abstract:
Imagine some program and a number of changes. If none of these changes is applied ("yesterday"), the program works. If all changes are applied ("today"), the program does not work. Which change is responsible for the failure? We present an efficient algorithm that determines the minimal set of failure-inducing changes. Our delta debugging prototype tracked down a single failure-inducing change from 178,000 changed GDB lines within a few hours.
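To make the idea concrete, here is a minimal Python sketch of a delta-debugging style search. It is a simplified illustration in the spirit of the abstract, not the paper's exact algorithm: the `fails` callback is a hypothetical stand-in for "apply these changes, build, and run the test", and the sketch assumes every subset of changes can be applied and gives a clear pass or fail.

```python
def minimize_changes(changes, fails):
    """Shrink `changes` to a subset that still fails and from which no
    single change can be removed without the failure going away.

    `fails(subset)` is a hypothetical oracle: it returns True if the
    program breaks when exactly that subset of changes is applied.
    """
    assert fails(changes), "the full change set must reproduce the failure"
    n = 2  # granularity: how many chunks to split the changes into
    while len(changes) >= 2:
        chunk = len(changes) // n
        subsets = [changes[i:i + chunk] for i in range(0, len(changes), chunk)]
        reduced = False

        # First, see whether one chunk on its own already fails.
        for subset in subsets:
            if fails(subset):
                changes, n, reduced = subset, 2, True
                break

        # Otherwise, see whether dropping one chunk still fails.
        if not reduced:
            for i in range(len(subsets)):
                complement = [c for j, s in enumerate(subsets) if j != i for c in s]
                if fails(complement):
                    changes, n, reduced = complement, max(n - 1, 2), True
                    break

        # Nothing helped: split more finely, or stop at single changes.
        if not reduced:
            if n >= len(changes):
                break
            n = min(len(changes), n * 2)
    return changes


if __name__ == "__main__":
    # Toy example: the failure needs changes 3 and 6 applied together.
    print(minimize_changes(list(range(8)), lambda s: 3 in s and 6 in s))
```

The point of splitting into chunks is that most tests throw away many changes at once, so the number of builds grows far more slowly than the number of changes when only a few of them matter.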
posted Tue 25 May 2004 in /software/debugging | link
The Tyranny of Taste
Indulging in tyranny of taste: Bay Area is fertile soil for cultivating snobs, according to Vicki Haddock:
We like to tell ourselves that we live in the most egalitarian country on Earth, and that the Bay Area is the epicenter of unpretentiousness. It is rather like our halo. No caste system here, barely a blip on any snob-o-meter. Mutual respect, our mantra. Diversity, our virtue. Tolerance, our sacrament.
Mais, au contraire! We actually may live in one of the snob centers of the universe.
posted Mon 24 May 2004 in /random | link
Java: the next COBOL
I had lunch with Mike and Christian of Make Technologies here in Vancouver, and in my new capacity at Sun got my ear bent about the Java value proposition. Their key point was: probably more half of the data being crunched out in the business world is being crunched by COBOL programs on mainframes. When these systems really finally can't be lived with any longer, the CIOs who have to replace them notice that they're decades old. They're smart guys who try to learn from what they observe, and they deduce that the next big piece of infrastructure is apt to be with them for a long time. "So," they wonder, "this JES stuff (or .NET, or whatever) they're trying to sell me, will it still be a viable platform in 25 years?" Put that way, it sounds to me like a damn good question. I think the Java answer is about as good as anyone's at the moment, but I suspect it's something that none of us on either the vendor or customer side have been putting enough thought into.
That might just be a decent explanation of both the good and the bad points of Java, and a guide to when to use it. Not so exciting, not so agile, but a safe choice, if you're really planning to keep the code for 25 years.
It's a bit like describing people as "born old".
posted Mon 24 May 2004 in /software/languages/java | link
(fwd) Corporate layoff advice
From the ineffable Don Marti:
Don't fire the one guy who has root on the web server. [mirror]
posted Mon 24 May 2004 in /issues/sco-vs-linux | link
Kim Weatherall
Kim Weatherall has a blog about Australian IP law, the USFTA and related topics.
posted Mon 24 May 2004 in /issues/copyright | link
Referers: An easy question
Someone from telkom.net.id asked Google about L.O.T.R. Arwen Sex. That's easy. She was female.
posted Mon 24 May 2004 in /meta/referer | link
New slogan for AdTI
Someone on Groklaw calls the Alexis de Tocqueville Institution the think tank that didn't. I love it.
posted Sat 22 May 2004 in /issues/adti | link
AdTI's "Samizdat"
Ken Brown's “book” Samizdat says Linux and Open Source are like samizdat, the self-published books of Russia. From a foreword by Cynthia Martin (Associate Professor of Russian, Maryland), who surely bears no blame for the travesty of the book itself:
To understand the appropriateness of the word samizdat in the title of this paper, a brief discussion about the word's meaning and its significance in Soviet history is in order.
Russian culture has always recognized the power of the word, spoken and especially written. In contrast to a democratic tradition predicated upon the notion that protecting free speech is necessary to foster the open exchange of ideas, a monolithic world-view, be it tsarist, monarchy, or Communist totalitarianism, cannot tolerate the potential for alternative positions or systems of government gaining broad support. The written word, as the bearer of such alternative ideas, is viewed as quite powerful, and hence, it is not surprising that official control over all forms of publication has been exercised throughout Russian history, especially during the Soviet period.
State-sponsored censorship developed during the pre-1917 tsarist period, and subsequently found its full elaboration in the Soviet Union. Samizdat was a response to the attempt by the Russian government to control access to all publications and publication outlets. Samizdat referred to the practice of "self-publishing" by dissident thinkers in a variety of areas, including political thinkers, academics and scholars, scientists, and literary and artistic figures in the Soviet Union. [...]
The punishment for producing samizdat or even possessing such self-published literature could be harsh, resulting in prison sentences or worse. To prevent unauthorized publishing, state control of the printing apparatus was so meticulous, that over long holiday weekends, for example, publishing offices containing typewriters and other forms of copying technologies were literally locked and their doors were sealed. The particular keystrokes of all typewriters were registered with the authorities so that illegally typed works might be traced to those responsible.
One of the most famous cases of a dissident writer whose works, political and literary, were published via samizdat is the case of Alexander Solzhenitsyn. His personal fate is evidence of how much Soviet Russia feared the bearer of alternative ideas, and how total the attempt was to control the dissemination of texts that offered alternative views. Solzhenitsyn came to be seen as more of a threat inside Russia, where he could still spread his anti-Soviet views, than outside, and therefore he was stripped of his Soviet citizenship and expelled from Russia in February 1974.
What a noble enterprise! The goals of samizdat publishers are those which I think many open source contributors would admire: speaking truth, sharing information and ideas, and doing it using their own tools even when it inconveniences or annoys the regime.
Brown slanders the open source and Unix community in the body of the book. But at least in the choice of his title, he is far too kind. The comparison to samizdat publishers is inspiring and flattering, but ultimately an exaggeration: few of us run the risk of the gulag to publish our code, and though the cause of free software is worthy it is not so grand as the liberation of a country from totalitarianism. I do my bit, but I am no Solzhenitsyn.
The rest of the book is truly awful. Were it a university paper, every page would have red ink... at least until halfway through, when I think any marker would give up and just write FAIL.
If there is one worthwhile thing that AdTI ever said, it is this: even our enemies see Linux as being like dissidents under communism.
posted Sat 22 May 2004 in /issues/adti | link
Register on AdTI
John Lettice at the Reg has a good report on AdTI, and some links.
"Now he's making a big joke, saying it was Santa and the Tooth Fairy," said Brown, "but I want all of your readers to ask themselves, in the history of computing, has anyone else ever written an operating system who never was a licensee, didn't have operating system experience, and didn't have the source code? How did he develop so much code in just six months? Everyone else has taken years to develop operating systems.... Linus perpetuated the lie [that he is the inventor of the Linux kernel], and I have a problem with this smarmy attitude."
You could point out that what Linus developed in the first six months was actually.... but really, why spoil it?
It would appear therefore that Brown went into the interview with Tanenbaum with the view that Torvalds must have stolen Linux and couldn't possibly have written it, that Tanenbaum went to great pains to disabuse him of this entirely unfounded notion, and that Brown emerged sufficiently unsullied by knowledge to just carry on and write his "path-breaking study".
posted Fri 21 May 2004 in /issues/adti | link
Justin Orndorff and AdTI
In the context of the recent AdTI shambles, it is interesting to note recent posts by someone writing as "Justin Orndorff" <raison__d_etre@hotmail.com>, who says he works for the Alex de Tocqueville Institution:
Greetings,
I'm currently doing research into corporate contributions towards open source projects, such as Linux. One of the recent Credits Files lists Mr. Anton Blanchard as a contributor. Is Mr. Blanchard still an employee with the company?
Also, does the company have any policies regarding open source contributions by employees? If so, are there any differences between on and off the clock contributions?
Minix list, 28 April
I'm a noob. I was wondering if anyone could point me in the direction towards downloadable earlier version of Minix. Anyone?
(Why would a "noob" want to download anything but the most recent/usable version?)
linux-kernel mailing list, 12 April
Hello, my name is Justin and I'm doing some research into the specifics of Linux. Out of curiosity, who owns the Linux kernel?
Is it owned by Linus Torvalds, its contributors or the public?
Hello board,
I'm currently conducting some research into the history and background of operating systems. Any comments on the following questions would be welcome.
1. Describe the components of an operating system, besides the central component, the kernel.
2. What do programmers usually develop first, the compiler or the kernel?
3. Does this sequence impact the OS at all?
4. What's more complicated, the kernel or the compiler?
5. Why does operating system development take as long as it does? What are the three key things in all operating system development that take the longest to perfect?
6. Do you need operating systems familiarity to write a kernel? Yes / no? Elaborate please.
7. In your opinion, why aren't there more operating systems on the market?
It does sound a bit like he wants someone else to do his undergraduate homework, but I think that is not quite the case.
Same questions on "WebMasterWorld" and OSDev.org. (Would you think webmasterworld would be a good forum to try to find the history of Linux?)
Further information on Usenet.
Anyone considering talking to Justin or AdTI might want to read about Prof Tanenbaum's experience first.
posted Fri 21 May 2004 in /issues/adti | link
Tanenbaum on AdTI
Andrew Tanenbaum responds to silly accusations by Ken Brown, President of the Alexis de Tocqueville Institution, that Linus didn't write Linux:
Brown flew over to Amsterdam to interview me on 23 March 2004. Apparently I was the only reason for his coming to Europe. The interview got off to a shaky start, roughly paraphrased as follows:
AST: "What's the Alexis de Tocqueville Institution?"
KB: We do public policy work
AST: A think tank, like the Rand Corporation?
KB: Sort of
AST: What does it do?
KB: Issue reports and books
AST: Who funds it?
KB: We have multiple funding sources
AST: Is SCO one of them? Is this about the SCO lawsuit?
KB: We have multiple funding sources
AST: Is Microsoft one of them?
KB: We have multiple funding sources
He was extremely evasive about why he was there and who was funding him. He just kept saying he was just writing a book about the history of UNIX. I asked him what he thought of Peter Salus' book, A Quarter Century of UNIX. He'd never heard of it! I mean, if you are writing a book on the history of UNIX and flying 3000 miles to interview some guy about the subject, wouldn't it make sense to at least go to amazon.com and type "history unix" in the search box, in which case Salus' book is the first hit? For $28 (and free shipping if you play your cards right) you could learn an awful lot about the material and not get any jet lag. As I soon learned, Brown is not the sharpest knife in the drawer, but I was already suspicious. As a long-time author, I know it makes sense to at least be aware of what the competition is. He didn't bother.
Strongly recommended if you want to get an idea of the calibre of the “researchers” at AdTI.
(link from LWN)
posted Fri 21 May 2004 in /issues/adti | link
USFTA Senate Committee submission on copyright
Rusty testified to the Senate Select Committee on the US-Aus Free Trade Agreement on Monday, and I think made a very good presentation of the case. I was happy to be able to help a little in preparing for the presentation.
There are two major problems. Briefly put: the treaty requires draconian laws against devices that can be used to circumvent copyright control measures, and secondly it requires Australia to accept software and business-method patents. There is an additional meta-problem that treaties are harder to change than laws, so if we discover that the provisions are too harsh it is hard to correct the mistake. Roger Clarke has an excellent more detailed description of the problems.
Our slides are now available, which may make more sense read in conjunction with the draft transcript of the session. When the draft is confirmed, it may be linked from the committee web page.
Some personal observations on the committee:
In transcripts of previous hearings some of the witnesses were subject to very robust questioning, focussing more on the person of the witnesses than on the substance of their submissions. We went in fearing questions along the lines of "are you a communist?", but did not encounter any.
I think the diagrams helped in explaining the fairly complex questions about how technical protection measures infringe on the rights of third parties. However, I don't think we presented them very effectively. There is, as far as I know, no way to project a slideshow, and it might not be appropriate. We had A3 printouts of the diagrams, but since the committee room was a bit large it may have been hard for some senators to read them. We probably should have also made A4 sets for easier reference.
posted Thu 20 May 2004 in /issues/copyright | link
CVS/Subversion vulnerability
As if on cue, a vulnerability in CVS and Subversion was announced, which allows remote compromise of machines that publish read-only repositories.
I don't mention it to be mean to the Subversion or CVS developers. Security problems happen in complex code.
I do think it's an argument against version control systems that need custom protocols and special servers to publish code. Most of the distributed systems can do without this, and can make read-only repositories available through a static HTTP server. Arch and Darcs can do this. Monotone and Codeville seem to require their own servers.
I think this reduces security exposure because most projects are likely to have at least a static web site already. Adding read-only files containing the repository doesn't increase the surface of code in the web server that can be reached by an attacker.
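As a rough illustration of how little server-side code this needs, here is a Python sketch that publishes a directory read-only over HTTP using only the standard library. It is not how Arch or Darcs are actually deployed (a real site would use its existing web server); the function name `serve_read_only` and the idea of pointing it at a repository directory are assumptions for the example.

```python
import functools
import http.server
import threading


def serve_read_only(directory, port=0):
    """Serve the files under `directory` over plain HTTP.

    SimpleHTTPRequestHandler only implements GET and HEAD, so clients can
    fetch files but there is no code path for writing to the repository.
    Port 0 asks the operating system to pick any free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Run the server in a background thread so the caller keeps control.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    srv = serve_read_only(".")  # hypothetical: the directory holding the repository
    print("serving on port", srv.server_address[1])
    input("press Enter to stop\n")
    srv.shutdown()
```

The server knows nothing about version control: it just hands out files, and all the repository logic runs in the client.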
There has been discussion of adding an archd daemon that can speak a more efficient protocol than HTTP. Having the option is fine, but I think it would be unfortunate if it were no longer possible to publish repositories through a web server.
To some extent you can simulate this under CVS or Subversion by copying code to a separate machine which provides a read-only public repository, but it's still a little more risky.
More on this on slashdot, and David Wheeler has an essay on SCM security.
[typos fixed 2004-05-27]
posted Thu 20 May 2004 in /software/vc | link
machine-room archaeology
Last week I installed a new rack in our little machine room, to hold some Itanium development machines. Part of that was trying to hook up a "PDU" (power distribution unit), which is basically a fancy/expensive power board. This one needs an AS-3112 15A to IEC C20 cable. Those are hard to find.
Under the raised floor, while trying to find a power point, I found thick ethernet cable, with Digital repeaters(?) still attached.
posted Wed 19 May 2004 in /random | link
Rusty before the senate on the USFTA
Rusty is appearing on Monday afternoon before the Senate committee enquiring into the US Trade Agreement. He will testify on the damaging consequences of the DMCA-style provisions for free software, fair use, etc. If you're in Canberra please come to the public hearing at 5pm on Monday and show support.
posted Sun 16 May 2004 in /issues/copyright | link
hacker, interrupted
I'm enjoying a little Sunday-morning hack on librsync while S sleeps. Our cable provider seems to have messed up a firmware upgrade, so I'm disconnected from the internet. How pleasant to have distributed vc in darcs and arch so that I can still get things done.
I'm writing this in a mirror of my blosxom installation which I'll later sync up using Unison, which is also a pretty amazing tool.
posted Sun 16 May 2004 in /software/vc | link
Please just jail spammers
aj continues our discussion about email postage as a solution to spam/viruses, charging that “from someone whose Orkut profile lists him as a ‘libertarian’, [it] seems odd” to want criminal remedies for spam.
(I am not accurately "libertarian". I'd rather be small-l-liberal but translated into American "libertarian" was the closest match.)
I think it's entirely consistent for moderate libertarians to want the government to enforce laws against fraud, trespass, theft, etc.
People can get so wrapped up in the technical fight against spam, or so used to thinking of it as just a nuisance that they forget almost every message represents evidence of a felony.
Spam is fraud and theft of service on an industrial scale, activities which are already illegal. No new spam laws are required. At the very most, perhaps the illegality needs to be made more clear in the law, but I don't think that is needed.
A majority of spam messages are sent through consumer machines that have been compromised through Windows worms or similar means. Unauthorized access to a computer system is illegal in Australia and punishable by up to ten years in prison. I would like to see that law enforced.
(I'd like to see criminal negligence charges against people who knowingly allow their systems to be used in commission of fraud.)
Spam usually involves unauthorized access to a computer system on the sending end. It also involves unauthorized access on the receiving end: if unsolicited advertisements are specifically disallowed by the terms of service of a computer system, then posting them is also unauthorized insertion of data, and possibly a breach of the law.
A large volume of spam advertises products that are likely to be either illegal or fraudulent. In a random sample: bogus pharmaceuticals, child pornography, bogus home loans, unlicensed software, fraudulent invoices, 419 scams... Even if spam was sent legally, the majority of people sending it are involved in some other criminal enterprise.
There are reports that great volumes of spam are sent by organized crime gangs also involved in credit card fraud, illegal pornography, drug trafficking, theft, and so on. This single point I can't verify for myself, but it does seem like it ought to motivate police to investigate more energetically.
You have to go pretty far out on the scale of anarcho-libertarianism to say that there should not be laws against theft and fraud, or that those laws should not be enforced by publicly-funded police. Interesting late night thought-experiment though that may be, it is practically irrelevant.
If I only received spam that did not fake its sender, was not sent through compromised machines, was not advertising criminal or fraudulent products, did not contravene terms of use and did not breach any other laws then I would see far less need for government involvement. Oh happy thought!
I don't see the point in introducing a new email postage system. Existing laws are being flouted on an industrial scale by hundreds of perpetrators. Any new system will be abused too. If you cannot have at least a shade of a threat of sanctions against fraud and theft, it is hard for a free market to work.
I can imagine there are practical problems in enforcing these laws globally. But let us, at least, punish every spammer and con-artist in Australia and the USA. If that works, but we are still being attacked by criminals in China or Nigeria or Russia then let us handle that through some international process.
Stealing a car for use in a robbery is, and should be, illegal. Stealing control of a computer for use in fraud and theft is illegal, but seems to be rarely prosecuted. Send a few spammers to prison for five years, a punishment they richly deserve, and the spam problem might start to go away. Failing that I would like to see more civil suits.
posted Fri 14 May 2004 in /issues/spam | link
XML is like
From slashdot via rusty
XML is like violence: If it doesn't solve your problem, you aren't using enough of it.
posted Fri 14 May 2004 in /software/xml | link
referers
Google: windows software for unixy access is denied
You'd better believe it.
posted Thu 13 May 2004 in /meta/referer | link
(fwd) librsync and the new rsync
On 12 May 2004, Farkas Levente wrote:
hi, is there any progress in librsync, rzync or superlifter or these projects are dead?
They are cooking slowly, so that they will be delicious and tender when they're done.
Are you interested in helping?
posted Thu 13 May 2004 in /projects/librsync | link
Do you work for a psychopath?
The Economist covers [sub reqd] The Corporation, "an award-winning documentary film coming soon to a cinema near you":
To the anti-globalisers, the corporation is a devilish instrument of environmental destruction, class oppression and imperial conquest. But is it also pathologically insane?[...]
The answer, elicited over two-and-a-half hours of interviews with left-wing intellectuals, right-wing captains of industry, economists, psychologists and philosophers, is that the corporation is a psychopath. Like all psychopaths, the firm is singularly self-interested: its purpose is to create wealth for its shareholders. And, like all psychopaths, the firm is irresponsible, because it puts others at risk to satisfy its profit-maximising goal, harming employees and customers, and damaging the environment. The corporation manipulates everything. It is grandiose, always insisting that it is the best, or number one. It has no empathy, refuses to accept responsibility for its actions and feels no remorse. It relates to others only superficially, via make-believe versions of itself manufactured by public-relations consultants and marketing men. In short, if the metaphor of the firm as person is a valid one, then the corporation is clinically insane.
posted Thu 13 May 2004 in /issues/business | link
Shingleback skinks
posted Wed 12 May 2004 in /photo/nature | link
groggy on C++, etc
Groggy has some thoughts on C vs C++ and on revisiting old code.
(Isn't it strange that job ads ask for experience in "C/C++"? It's a bit like asking for candidates who speak French/Italian.)
posted Wed 12 May 2004 in /software/languages/cplusplus | link
SCO infringes copyright; settles case
Controversial UNIX vendor The SCO Group apparently has paid to settle a copyright infringement complaint from San Francisco publisher No Starch Press.
"We have no issues with SCO at this time", said No Starch founder Bill Pollock in a telephone interview Tuesday. However, Pollock said last fall that he would insist on a payment from SCO in order to resolve the copyright dispute.
Some time before mid-2003, SCO copied entire chapters of a No Starch book, The Book of Webmin by Joe Cooper, into SCO's on-line documentation. The infringement was described last summer as "an open-and-shut case" by a person familiar with the facts. The Book of Webmin, originally copyrighted in 2000, is available on the Web, but it is not licensed for redistribution.
posted Wed 12 May 2004 in /issues/sco-vs-linux | link
On comparisons of version control systems
In the last couple of years there has been a sudden blossoming of new open source version control systems: Arch, BitKeeper (originally kind of open source, now not), Codeville, Darcs, (can't think of one with E :-), Monotone, Quilt, Subversion, and more.
I like thinking and writing about version control because it has an interesting mix of technical problems and human/social problems, and because it's still very much an open question. For something like a file server you can cap the basic performance in terms like "should be able to saturate GbE on such a machine and should correctly implement these RFCs." But for a more human tool like a version control system, the basic measurement is how much it improves the productivity of developers, which is pretty much unbounded.
Indeed, the surge of open source software in the last 10 years, enabled by many people having cheap modems and PCs and knowing Unix, is in a sense a productivity booster far beyond the wildest claims of Rational salespeople. Imagine: "Just publish all your source, and people will semi-magically fix your bugs while you sleep!"
Anyhow: vc is interesting to talk about. It can also be useful to talk about, because people have a lot of choices and it's hard to see which one is best, but CVS chafes and people might like to move. So I think a lot of writers have a desire to help out in making that choice.
One way people do this is by building feature comparison tables, such as this one at Berlios. You will find something similar in the documentation or web sites for many of the systems.
When you have basically similar products like telephones or printers or cars, comparison tables can be useful: as a potential user, you work out what feature set you want and trade it off against price and whatever else.
But I think for version control at the moment these tables are so unhelpful as to be almost misleading. The differences are not just about which features happen to have been implemented yet; they're more fundamental differences in conception of how version control ought to be done.
It's not like comparing different cars in terms of power or economy or price. It's more like comparing cars versus motorbikes versus bicycles versus wheelbarrows. You *can* have a line saying that you can carry two people on a motorbike and one in a wheelbarrow, or that a motorbike is faster over short distances than a car, but I don't think it helps you work out which one is right for you.
The analogy, lame though it is, suggests to me that it is also interesting to talk about how they coexist to solve different problems well: you can put a wheelbarrow in a ute, or a bicycle on a car... The Quilt user manual suggests that you might like to check your Quilt patches into CVS. Luke says that Arch seems to require more discipline and optionally bondage than Darcs does, but some people (and projects) like that.
So I think I would like to write a little more about what the different systems feel like, and what problems they seem like they might solve well. Stay tuned.
posted Tue 11 May 2004 in /software/vc | link
spam statistics; spam as steganographic cover
In a 24-hour period on samba.org, we received about 12751 messages, of which about 10950 were blocked by the system as either spam or viruses. So roughly 86% of incoming messages are trash. A bit more than half of them were blocked by blacklists such as Spamhaus and a third were rejected for containing malware signatures such as PE headers. Many of the remaining ones are bounced because they're going to invalid addresses, presumably coming either from dictionary attacks or spammers who collected random strings containing @. SpamAssassin deals with the remaining 528.
(SpamAssassin could probably pick out many more, but it's relatively expensive so we only run it on things that are not obviously bad.)
In fact, the fraction of spam is probably a bit higher because the system-wide filters are pretty conservative, and I am not counting messages filtered out by individual users. I think we're certainly over 90% spam/malware; possibly over 95%. It's a bit like John Birmingham's description of a sewer of pure shit coming straight into our living room.
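As a quick check on those numbers, here is a small sketch that uses only the two counts quoted above. The function names are mine, and the "extra spam needed" figures are just arithmetic on those counts, not measurements.

```python
# Counts quoted in the post for one 24-hour period on samba.org.
TOTAL_MESSAGES = 12751
BLOCKED = 10950  # rejected by the system-wide filters as spam or viruses


def blocked_percent(blocked=BLOCKED, total=TOTAL_MESSAGES):
    """Share of incoming mail stopped by the system-wide filters."""
    return 100.0 * blocked / total


def extra_trash_needed(target_percent, blocked=BLOCKED, total=TOTAL_MESSAGES):
    """How many of the messages that got through would also have to be
    trash for the overall fraction to reach target_percent."""
    needed = -(-target_percent * total // 100)  # integer ceiling
    return max(0, needed - blocked)


if __name__ == "__main__":
    delivered = TOTAL_MESSAGES - BLOCKED
    print(f"blocked by system filters: {blocked_percent():.1f}%")
    print(f"passed on to users:        {delivered}")
    for target in (90, 95):
        print(f"to reach {target}%: {extra_trash_needed(target)} more of those")
```

So the 90% guess only needs a bit under a third of the mail that got through to be junk, which seems entirely believable.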
On the other hand, it rather reminds me of Rivest's great Chaffing and Winnowing: Confidentiality without Encryption paper, and of the idea of steganography in general. Hiding messages is technically easy; the hard part is finding cover traffic. (In the standard example, the FBI wonders why Alice and Bob are posting each other so many pictures of puppies.) Spam is the perfect background noise to send invisible steganographic messages, as long as you can agree on a method for your eligible receiver to pick out the good bits.
Rivest writes
We could thus have the following intriguing scenario: Alice is communicating with Bob using a standard packet-based communication scheme. Each packet is authenticated with a MAC created using a secret authentication key known only to Alice and Bob. (In practice, they might use a different key for packets in each direction, although this is not necessary if the packet contents identify sender and receiver.) Furthermore, each packet happens to contain only a single "message bit." (Alice wrote their software, and it contained a bug that caused this unusual behavior.)
So far, Alice and Bob are not encrypting anything, and are using standard messaging techniques that would not be considered as encryption and that would not be export-controlled. Alice and Bob have no intention of achieving confidentiality of their messages from an eavesdropper.
Now, Alice's packets to Bob may be routed from her computer through the computer of her Internet service provider, run by Charles, on another floor of her building, before being sent on to more major trunks of the Internet and then on to Bob.
Charles' computer, for whatever reason, then adds "chaff" packets to the packet sequence from Alice to Bob. All of a sudden, Charles' activities provide a very high degree of confidentiality for the communications between Alice and Bob! Alice's and Bob's software have not been modified in the least to achieve this confidentiality! Charles does not know the secret authentication key used between Alice and Bob! Alice and Bob did not even want or care to have confidential communications! Charles is not using encryption and does not know any encryption key! Amazing!
In this case, Charles is COL CHARLES MOGUBE of the LIBERIAN ARMY.
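For the curious, the scenario Rivest describes can be sketched in a few lines. This is a toy under my own assumptions, not code from the paper: the packet layout (serial number, bit, MAC), the choice of HMAC-SHA256, and all the function names are invented for illustration.

```python
import hashlib
import hmac
import os
import random


def mac(key, serial, bit):
    """Authenticate one (serial, bit) pair. Nothing is encrypted."""
    return hmac.new(key, f"{serial}:{bit}".encode(), hashlib.sha256).digest()


def alice_send(key, bits):
    """Alice: one authenticated packet per message bit, sent in the clear."""
    return [(serial, bit, mac(key, serial, bit)) for serial, bit in enumerate(bits)]


def charles_add_chaff(packets):
    """Charles: for every packet, add one carrying the opposite bit and a
    random MAC. He never sees or needs the key."""
    mixed = []
    for serial, bit, tag in packets:
        mixed.append((serial, bit, tag))
        mixed.append((serial, 1 - bit, os.urandom(len(tag))))
    random.shuffle(mixed)
    return mixed


def bob_winnow(key, packets):
    """Bob: keep only packets whose MAC verifies, then read bits in order."""
    wheat = {}
    for serial, bit, tag in packets:
        if hmac.compare_digest(tag, mac(key, serial, bit)):
            wheat[serial] = bit
    return [wheat[serial] for serial in sorted(wheat)]


if __name__ == "__main__":
    key = os.urandom(32)
    message = [1, 0, 1, 1, 0, 0, 1]
    wire = charles_add_chaff(alice_send(key, message))
    print("packets on the wire:", len(wire))
    print("Bob recovers the message:", bob_winnow(key, wire) == message)
```

An eavesdropper sees both a 0 and a 1 for every serial number and, without the key, cannot tell which MAC is the real one. Using spam as cover would need something rather cleverer than this, but the winnowing step is the same idea.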
posted Thu 6 May 2004 in /issues/spam | link
Some thoughts on arch security
GNU Arch has some pretty powerful and novel security properties for a version control system.
I have been helping maintain cvs.samba.org for a few years, so perhaps I have a pretty good idea of what matters here, at least from the perspective of people doing open source or distributed development.
The word "security" means different things to different people. Some organizations, for example, would like to make sure that unauthorized persons don't see the source code, or even that developers who are allowed to see one part can't see another part. Others might want to make sure that any changes which are committed pass all the appropriate reviews and quality checks. I think Arch could probably do a pretty good job in helping with those, but they're not really the facet of security that I want to write about tonight.
What I am concerned with is that in recent years there have been quite a few criminal intrusions into development systems. Somebody tried to get a change into the Linux kernel source through the bitkeeper-cvs gateway. Somebody had a trojan installed on the machine of a senior developer at Valve Software. Someone else got part of the Windows source code through compromising a developer's machine. Even if the source is not confidential, unauthorized changes can be enormously disruptive.
CVS and Subversion are both commonly operated by free software projects in this mode: developers have SSH access to the server, and everyone else has anonymous read access.
It was originally planned that Subversion would run as an Apache 2 module using SSL and Apache authentication, so that there would be no need for developers to have local accounts. For various reasons this turned out to be pretty unpopular, and my impression is that almost all free software projects are using svn+ssh.
CVS, Subversion and co require a special server process both for committers and anonymous users. Arch does not: archives can be published just as read-only directories on a web or ftp server. This is a good thing: one less program to worry about, one less listening port. You can use whatever web server you think is least likely to be compromised.
Using SSH as a transport is one of my favourite Unix design patterns, and it is certainly much better than each SCM system inventing its own authentication protocol. But it does have several problems. Firstly, you need to be able to create Unix accounts for contributors. This has been a method of entry for attackers on open source projects before. Administrators can try to limit which commands can run, but there is a risk that contributors might break out of such a jail. On an older project, many dormant contributors may still have shell accounts, and these remain a possible avenue of intrusion.
Arch doesn't require that any two developers have access to a single system. This isn't just a theoretical possibility; it is the standard way of using it.
One good consequence is that there doesn't need to be any assessment of whether a contributor is "good enough" or "trusted enough" to have commit access. For CVS this is a big deal: someone who has commit access could destroy the whole repository or rewrite history, but people without commit access can't really work well. With Arch, there is no such decision point: anyone can work comfortably without needing to be specially trusted, and every change can be considered on its merits.
Most version control systems, including Arch, present the user model that once revisions are committed, they cannot be changed. The archive is conceptually read-only. However, as far as I know, only Arch makes the revisions physically read-only: each one is a directory containing a few files, such as this one. There is little chance of a later update changing or corrupting any previous work. (I have seen svn need to have its database rebuilt from time to time, but arch never has.) This seems to have several really good properties against either accidental or intentional damage. On a machine that is shared by several developers, one might have a cron script that chowned and chmodded committed revisions as extra protection. Tools like tripwire would immediately pick up any new additions or changes (although changeset signing would probably trap that already).
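A rough sketch of what such a cron job might run, assuming each committed revision is simply a directory of files. The function name is made up, and it only does the chmod half; chown would need root and a decision about which user should own the archive.

```python
import os
import stat
import sys

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(revision_dir):
    """Strip write permission from a revision directory and its contents.

    Returns the number of paths whose mode was changed, so a second run
    over an already locked revision reports 0.
    """
    changed = 0
    # Walk bottom-up so a directory is locked after the files inside it.
    for root, _dirs, files in os.walk(revision_dir, topdown=False):
        for path in [os.path.join(root, name) for name in files] + [root]:
            if os.path.islink(path):
                continue  # leave symlinks alone rather than chmod their targets
            mode = stat.S_IMODE(os.stat(path).st_mode)
            locked = mode & ~WRITE_BITS
            if locked != mode:
                os.chmod(path, locked)
                changed += 1
    return changed


if __name__ == "__main__":
    # Usage (hypothetical): lock_revisions.py /path/to/archive/.../patch-12
    for revision in sys.argv[1:]:
        print(revision, make_read_only(revision))
```

This is belt and braces rather than real protection: anyone who owns the files can chmod them back, which is why signed checksums are the more interesting defence.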
Arch stores checksums for each commit, which should trap accidental or hardware damage. These can then be gpg-signed, which should give pretty good assurance that, at least, the changeset came from a developer's machine.
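To make the idea concrete, here is a minimal sketch of a per-revision checksum manifest. It is not Arch's actual file format: the manifest name, the line layout and the use of SHA-256 are all my own choices for illustration. Signing would then just mean running gpg over the manifest file.

```python
import hashlib
import os

MANIFEST = "checksums.txt"  # hypothetical name, not the file Arch writes


def file_digest(path):
    """SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def revision_files(revision_dir):
    """Regular files in a revision directory, excluding the manifest itself."""
    return sorted(
        name for name in os.listdir(revision_dir)
        if name != MANIFEST and os.path.isfile(os.path.join(revision_dir, name))
    )


def write_manifest(revision_dir):
    """Record one 'digest  name' line per file in the revision."""
    with open(os.path.join(revision_dir, MANIFEST), "w") as handle:
        for name in revision_files(revision_dir):
            handle.write(f"{file_digest(os.path.join(revision_dir, name))}  {name}\n")


def verify_manifest(revision_dir):
    """Return the names that are altered, missing, or not in the manifest."""
    expected = {}
    with open(os.path.join(revision_dir, MANIFEST)) as handle:
        for line in handle:
            recorded, name = line.rstrip("\n").split("  ", 1)
            expected[name] = recorded
    present = set(revision_files(revision_dir))
    problems = []
    for name in sorted(set(expected) | present):
        if name not in expected or name not in present:
            problems.append(name)
        elif file_digest(os.path.join(revision_dir, name)) != expected[name]:
            problems.append(name)
    return problems
```

An empty list from `verify_manifest` means every file still matches what was recorded at commit time. Without a signature this only catches accidents, since an attacker who can rewrite a file can rewrite the manifest too.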
If the worst happens and a machine is compromised, Arch's distributed design makes it likely that the affected or destroyed archive will be widely mirrored, so the changes can be detected and an older version can be restored.
[draft, more to come]
posted Wed 5 May 2004 in /software/vc/arch | link
From: Jesus
Subject: help
To: postmaster.samba.org
Hi:
My name is Jesus and I want to subscribe to the list samba....
posted Wed 5 May 2004 in /random | link
How Arch Works
Photo from jdub.
Tom Lord wrote a good whitepaper on How Arch Works. I think this answers a lot of questions about why it is the way it is that might be troubling people who have just read the tutorial.
Doing powerful distributed version control with *no* server-side computation is just brilliant, with great results for scalability, security, reliability and simplicity. One idea that good makes a good year.
I should really write something about arch security vs CVS.
posted Tue 4 May 2004 in /software/vc/arch | link
2004 resolutions
Someone asked me the other night what I would like to do in free software in the next couple of years. My rough plan for 2004/2005 is:
- Finish distcc. It's really almost complete already, but there are a few more things to do to make it scale well.
- Clean up, refactor and document librsync.
- Using librsync, write superlifter/rsync3 as a toolkit of flexible composable parts.
I think that probably takes a year or two. It's good to have some kind of plan and focus, even for things you are doing mostly for personal interest.
posted Tue 4 May 2004 in /projects | link
distcc for Java
Anthony Green is trying to hack (as with an axe) distcc into something that can do distributed compiles for Java. Good luck!
Tom Tromey has some comments in response.
I think the compilation processes for Java and C are so different that they may not fit into the same program, but perhaps it can work.
I had wondered in the past whether I should have made the distcc server and protocol completely independent of the work of compilation, so that they just distribute arbitrary work. The client needs some special intelligence about how to interpret gcc command lines and run the preprocessor. But I think this can be done purely on the client, in a gcc-specific skin.
There really are a remarkable number of bright people at Red Hat these days.
posted Tue 4 May 2004 in /projects/distcc | link
Why Free Software?
In the bigger picture, I don't care why IBM wants Java to be free software. Their motivations are their own. Mine are the same ones that have been my passion for nearly 14 years now: the ability to fix bugs that affect me, to add features I think are needed, to work on projects in cooperation with like-minded comrades, to avoid being locked in to any one company's view of the world. Personally I find APIs wonderful, but even good documentation is no substitute for an occasional (contamination-free) look underneath the hood.
My feelings are similar.
I feel a bit uncomfortable when I see news stories hyping Linux and free software as being more secure, cheaper, etc etc. It often is good in these ways but I don't think it is globally optimal. In any particular situation some closed software might well be a better solution for an immediate practical problem.
The trade press and the industry as a whole have this stupid habit of hyping things up to a ridiculous degree, then realizing how stupid they've been and swinging back the other way. So in a way I hate to see gushing articles about Linux because I think they will just make the recoil worse.
Anyhow: personally, I think it would be really nice to have a free system, with all the good aspects that Tom mentions. I think it is worth spending a little time to get it. It's not so important how many people want to use it, except if that helps it develop faster.
posted Mon 3 May 2004 in /software/freedom | link
LDB, the ldap-like database
tridge has a cool plan for ldb. It's a simple database like gdbm or Sleepycat DB, but it stores formatted data similar to LDAP.
posted Sat 1 May 2004 in /software | link
(fwd) kernel BUG at page_alloc.c:98 -- compiling with distcc
I am so good at tweaking kernel bugs. I think this is the third or fourth one distcc has trapped.
posted Sat 1 May 2004 in /projects/distcc | link
The c10k document
Every Linux network programmer ought to read Dan Kegel's great page the C10k problem:
It's time for web servers to handle ten thousand clients simultaneously, don't you think? After all, the web is a big place now.
And computers are big, too. You can buy a 1000MHz machine with 2 gigabytes of RAM and an 1000Mbit/sec Ethernet card for $1200 or so. Let's see - at 20000 clients, that's 50KHz, 100Kbytes, and 50Kbits/sec per client. It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of twenty thousand clients. (That works out to $0.08 per client, by the way. Those $100/client licensing fees some operating systems charge are starting to look a little heavy!) So hardware is no longer the bottleneck. [....]
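The per-client figures in that passage are easy to re-derive. A small sketch, assuming decimal units throughout (1 gigabyte = 10^9 bytes), which is what makes the RAM figure come out at 100Kbytes; the variable and function names are mine.

```python
# Hardware figures quoted in the passage above.
CPU_HZ = 1000 * 10**6           # 1000MHz
RAM_BYTES = 2 * 10**9           # 2 gigabytes, decimal
NET_BITS_PER_SEC = 1000 * 10**6 # 1000Mbit/sec Ethernet
PRICE_DOLLARS = 1200
CLIENTS = 20000


def per_client(clients=CLIENTS):
    """Split the machine evenly across the given number of clients."""
    return {
        "cpu_hz": CPU_HZ // clients,
        "ram_bytes": RAM_BYTES // clients,
        "net_bits_per_sec": NET_BITS_PER_SEC // clients,
        "dollars": PRICE_DOLLARS / clients,
    }


if __name__ == "__main__":
    share = per_client()
    print(f"CPU:     {share['cpu_hz'] // 1000} KHz")
    print(f"RAM:     {share['ram_bytes'] // 1000} Kbytes")
    print(f"Network: {share['net_bits_per_sec'] // 1000} Kbits/sec")
    print(f"Cost:    ${share['dollars']:.2f}")
```

The first three come out as quoted: 50KHz, 100Kbytes and 50Kbits/sec. With a $1200 machine the cost is about $0.06 per client, slightly under the $0.08 in the quote, which only strengthens the point.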
posted Sat 1 May 2004 in /software/linux | link
Copyright (C) 1999-2007 Martin Pool.

