Martin Pool's blog

Dissecting Ken Brown's "Samizdat"

Ken Brown and Justin Orndorff from the "Alexis de Tocqueville Institution" have written a paper entitled Samizdat. (I have more on the origin and meaning of the name here.) In it, they state that Linus did not write Linux, and they suggest that he must have cheated by copying from Minix or Unix. They make various other allegations against open source developers that are similar to those seen in recent Microsoft and SCO press releases.

AdTI and Microsoft have confirmed that Microsoft provides funding to AdTI.

My opinion is:

I'm working with a friend on a review copy of Brown's Samizdat paper, obtained directly from AdTI. This is a work in progress; if you have comments please mail me.

I see Brown forgot to put a copyright statement on his paper, which is slightly amusing for someone getting so hot under the collar about copyright without actually understanding it. Nevertheless, I suppose it has an implicit copyright by the Alexis de Tocqueville Institution.

A few paragraphs are reproduced here under fair use rights for the purpose of criticism. The copy I got is labelled "final", so I think it's fair to assume this is what they will go to press with: typos, gaping errors and all. According to AdTI, this is part of a “soon-to-be-published book on operating systems and open source.” I hope that AdTI will feel the rest of the book can stand up to rigorous scrutiny and review. On the basis of Samizdat, I really cannot suggest that anyone spend any money to acquire a copy.

There is a fine discussion by Roaring Penguin of AdTI's previous paper. There too, AdTI seems to ask a lot of rhetorical questions without any evidence or logical argument.

On to the paper:

pg6:

"Samizdat: The Source of Open Source Code", discusses the controversial production factory of "free" computer source code. While the literal meaning of Samizdat refers to a period of freedom-fighting publishers in early Russia, the term has been borrowed by programmers that engage in the practice of surreptitiously circulating and/or using software source code that belongs to other individuals or companies. Whether it is reverse engineering, employee theft, or Rembrandt-like copying, plagiarism in software programming has become the proud flag of many in the `open source movement'.

I know a pretty good selection of prominent open source programmers, and I don't know of anyone who approves of plagiarism, let alone bears it as a “proud flag”. On the other hand, I find piracy commonly accepted by users of proprietary operating systems, and I would estimate the majority of Windows machines hold unlicenced software. (How many of you are still "evaluating" WinZip?)

Brown does not quote even a single person stating that position, let alone evidence that it is representative of the movement as a whole. He is entitled to make the claim if he can substantiate it, but this is mere assertion.

pg9:

Software is a business. But ironically, free software is a business too. The free software model provides users with accompanying source code for modification or development of the original software. The business logic for providing free source code is to enable clients to modify/customize aspects of the accompanying software.

I count 11 instances of "irony", or roughly one every 6 pages. Most are misused, and Brown really means to say "unexpectedly", "improbably", etc. In this case, he seems to be trying to suggest that it is inconsistent to sell free software. Of course there is an explanation for the apparent contradiction ("free as in speech, not beer") but Brown doesn't share it with his audience. One suspects that Brown wants to give the impression that free software companies are hypocritical, but he can't actually say it.

The only ironic thing here is that Brown doesn't seem to know much about the business of free software either. Enabling clients to modify or customize the software is one advantage, but not the most important for most users. Other important ones are that customers are not locked to a particular vendor, often can obtain free software at lower cost than proprietary software, can fix their own bugs, have the source as the ultimate reference documentation, will never be orphaned, can learn from examining the source, can check there are no security backdoors, will never be forced to a new licencing plan, and so on.

Linux and many other products are referred to as open source. But in fact they would more properly be referred to as hybrid source, products that attempt to offer the benefit of true open source, but operate in a commercial world like traditional proprietary products. For example Apache is a true open source product. In contrast, the Red Hat Linux operating system is a hybrid product. It is very important to differentiate between the two.

In fact Linux is not just "referred to as open source", it is open source. The term has a formal definition, which Brown has apparently not even read. If the formal definition is not enough, consider that Linux is overwhelmingly the most commonly cited example of open source.

Nevertheless, throughout the paper Brown continues to insist that "Linux is not true open source". It's not only disingenuous, but also makes the remainder of the paper harder to read, since we must always mentally search-and-replace to get the true meaning.

Brown is also deeply confused about the market: Apache operates in a commercial world; is sold, serviced and supported; and competes with proprietary offerings. Just like Linux. So does BSD, for that matter.

True open source is software and source code that can be used for any reason, for any use. If you get it with a license, it only requires attribution, or a copyright notice. You can modify it in any way and sell it as your own, without any additional requirements.

The second type, hybrid source, gets the lion's share of attention. It is software that is also no cost or free, but any modification to it becomes the `equal property' of the original author and any user that is interested in it.

Brown seems to be trying to distinguish BSD/MIT-style licences which allow for code to be "taken proprietary", from reciprocal licences such as the GPL which preserve openness in later distribution. His characterization of both licences is inaccurate in the details.

It's fine that Brown, as a Microsoft apologist, would rather my free software could be reused in closed software without me being paid. I can understand him wanting that. But I don't have to allow it. Brown calling my friends a pack of thieves is hardly persuasive.

pg10:

Although introduced at a much later date, ironically, hybrid source has become the largest pool of free open source software.

Again with the irony, Mr Brown?

I can't even see a way in which the emergence of GPL'd software would be unexpected, let alone ironic. Probably part of it is just chance, but I think it's reasonable to believe that the GPL helps prevent fragmentation and therefore gives a better chance of long-term success than do BSD licences.

The empirical success of open source is integral to the promotion of all science and technology.

How kind of him to say so!

However, it is unquestionable that the hybrid source code model is having a deleterious effect on both true open source, research and development, and the commercial intellectual property economy.

Unquestionable, eh? It is a shame that he doesn't provide some kind of substantiation for the benefit of those so cheeky as to question his statement.

pg11:

Linux and other hybrid source code products 3, commonly referred to as `open source' software, have steadily migrated into the IT departments of both private and public institutions.

Yes, thank you. Roughly $8,000,000,000 last year, by some estimates.

As usage of the non-proprietary model of selling software and software services grows, like any other new technology, it is important to continually analyze its accompanying opportunities and consequences to best implement and shape relevant public policy.

What a remarkable assertion from a self-described liberal/libertarian thinktank: any new technology needs a public policy, and government intervention. I suppose the idea that free people might decide to release and procure software under the terms they think best is a bit too subversive for Brown. Why, next they will be deciding for themselves what books to read!

Software is source code - and the topic of the `source' of the code is as big as the billion dollar industry itself.

I think there's a lot of truth in the "software is source code", and I hear "if it isn't source, it isn't software" is a rule of thumb in NASA procurement. That seems to imply that Microsoft don't release any software, aside from a little build tool. But they do make good ergonomic keyboards.

An issue that flies beneath the radar is the question: where does the successful Linux product come from?

Brown seems to be setting up for the idea that Linux source code was really stolen from SCO or someone else. Good luck: IBM, SCO and the court system have already spent quite a lot of time and effort establishing that isn't the case. But I'd like to see him try.

The origin of true open source code doesn't really matter, because a) it does not have many significant legal consequences of misuse b) it has almost no use restriction. It is definitely free--commercial products such as Linux are entirely different.

By "true open source", he means BSD-licenced source. I suppose Brown didn't do his homework enough to realize there was a big court case a few years ago about BSD, despite the licence.

We know where traditional commercial proprietary source code comes from. We also know who its original owners are.

Where? Again, mere unfounded assertion contrary to the evidence. Proprietary software customers have no idea where the source came from: what country it was written in; who wrote it; who it was licenced from; what trade secrets or patents it may embody; what security backdoors it may contain.

However, we don't really know what the origin of the bulk of hybrid source code is. We don't know much about this pool of software, other than what we are told. The assumption is, there is no cause to ask---For example, we know that Linux is a free public domain product, given to us by its inventor Linus Torvalds. But not many people ask where did it come from?

This is really getting silly. Before he said the GPL is almost a proprietary licence. Now he says Linux is in the public domain. Which is it? It can't be both.

The difference between "public domain" and "free software" is one of the most basic points in understanding software licencing. Clearly Brown does not.

Is it a dumb question to ask, "what is the origin, the `source' of this pool of source code?"

No, it's not a dumb question. Ask nicely, as Boston Consulting Group did a couple of years ago, and you'll get a detailed and quite fascinating answer. But assume your conclusions, and you make yourself a laughingstock.

Some critics are even unlucky enough to receive widespread excoriation in public forums.

Well, I'm sure such a sterling paper would never deserve that.

Ch2:

Here follows a mediocre summary of the history of Unix. It's more or less correct, though you could get a more accurate and interesting description from Raymond or Salus. Even better, take Nick Moffitt to the Tied House, and hear all about it over beer.

David Bloch an attorney with McDermott, Will & Emery discussing the question hypothetically comments, [27 David Bloch interview with AdTI, April 9, 2004. Bloch was NOT asked about the Lions incident specifically, only legal questions about scenario.]

AdTI has a consistent pattern of asking people for comments on hypothetical scenarios and applying those comments out of context to Linux. It allows Brown to give the impression that Bloch, or Tanenbaum, or Ritchie is saying "Linux is X", when they said no such thing.

pg24:

"Sometimes a little theft is necessary".

"There is theft everywhere and the open source community should not be singled out."

"The samizdat exchange was outright theft but it was necessary."

Quotes supposed to be from open source programmers, but not attributed. Did they just make them up?

Perhaps we should attribute thoughts to "Factions within the AdTI" on whether wife-beating is "sometimes OK", "happens all the time", or "is absolutely necessary"?

A lot of context about the Lions book seems to be missing.

pg27:

Brown looks at the Unix history diagram by Eric Levenez. Despite a categorical statement by Levenez that the diagram is not a representation of copyrights or patents, Brown proceeds to assume that almost all Unix-like systems “originate from licensed Unix code, a Unix licensee, or a previous Unix licensee”.

Brown could have read in any number of books that Unix has been independently rewritten several times. Clearly he has not done his research or is wilfully ignoring the facts.

pg37:

Follows what is called "an argument from personal incredulity." Brown says, in nearly so many words: "I can't believe anyone can just sit down and write an operating system kernel. So it must not have happened." The same argument works equally well against heavier-than-air flight.

Writing 7000 lines of rough first-cut code for Linux 0.01 in a few months is entirely plausible. Brown doesn't seem to have consulted any programmers before deciding it's impossible.

Brown seems to have the idea that all operating systems are terribly large things. If it was expensive to write Windows 2000, it must have been equally expensive to write Linux 0.01! Therefore, Linus could not possibly have done it himself.

Linus could never have written the whole kernel that we have today by himself. What he could do in those first few months was to get enough of it going to act as a seed crystal for all the other people who wanted a high-quality free unix. Perhaps the first version wasn't very good, but it was a start. Plenty has been written about how good it is for open projects to release early and often, and to do other things to encourage contributions. Linux just did those things well — perhaps better than most people had before.

Contrary to Brown: it is possible to eat an elephant, if you do it in small pieces, and have a lot of hungry friends.

The whole question of whether Linux contained Unix source code is easy to answer: check whether there is any code in common. Brown should be replaced by a small Perl script. SCO tried and failed, but AdTI is welcome to try.

The source for ancient Unix, Minix and Linux is available, so it's easy to check. One might hope that if Brown could produce any evidence he would do so and not rant on about incredulity and irony.
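The check described above can be sketched in a few lines. What follows is a minimal illustration in Python (rather than the Perl suggested above), not the tool anyone actually used: the function names, the 20-character threshold, the file suffixes, and the directory names in the usage note are all assumptions made for the example. It indexes every non-trivial line of each source tree after collapsing whitespace, then reports the lines that appear in both.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: lines shorter than this (e.g. "}" or "return 0;") are too
# common to be meaningful evidence and are ignored.
MIN_LENGTH = 20


def normalize(line):
    """Collapse runs of whitespace so that reindented code still matches."""
    return re.sub(r"\s+", " ", line).strip()


def index_lines(texts):
    """Map each normalized, non-trivial line to the set of files containing it.

    `texts` is a dict of file name -> file contents.
    """
    index = defaultdict(set)
    for name, text in texts.items():
        for line in text.splitlines():
            key = normalize(line)
            if len(key) >= MIN_LENGTH:
                index[key].add(name)
    return index


def common_lines(texts_a, texts_b):
    """Return {line: (files_in_a, files_in_b)} for lines present in both trees."""
    a = index_lines(texts_a)
    b = index_lines(texts_b)
    return {line: (a[line], b[line]) for line in a.keys() & b.keys()}


def read_tree(root, suffixes=(".c", ".h", ".s")):
    """Read every source file under `root` into a dict of path -> contents."""
    texts = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            texts[str(path)] = path.read_text(errors="replace")
    return texts
```

With hypothetical directory names, `common_lines(read_tree("minix"), read_tree("linux-0.01"))` would list candidate overlaps. A match is only a lead to examine by hand: identical lines can come from shared standards or common idioms, and a line-by-line comparison will miss code whose identifiers were renamed.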

pg43:

A tedious recitation of how slow it is to develop large monolithic software systems such as "Windows NT 5.0", now known as Windows 2000. There's no explanation of how this is meant to be at all relevant to Linux 0.01, which was about 7000 lines. Aside from an enormous difference in scale, W2k was slowed down by maintaining backward binary compatibility, compatibility with a massive range of hardware, testing to a level appropriate for a mature rather than first-cut product, and the overheads of communication and organization between hundreds of developers. Linux 0.01 had none of these costs, and so was developed proportionally faster.

Brown cites sources such as The Mythical Man Month, but if he had truly read and understood them, he would have seen why it is entirely possible for one programmer to write 7,000 lines in six months. Linus was working under perfect conditions: a good programmer, a green-fields project, flexible requirements, a low quality bar, and no management overhead.
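To put the figure in perspective, here is a back-of-the-envelope calculation of the pace that 7,000 lines in six months implies. The line count and the six months come from the text above; the days-per-month values are assumptions chosen only for illustration.

```python
def lines_per_day(total_lines, months, days_per_month):
    """Average number of lines that must be written per day."""
    return total_lines / (months * days_per_month)


# Assumption: 30 calendar days per month (working every day).
every_day = lines_per_day(7000, 6, 30)

# Assumption: 22 working days per month (weekends off).
weekdays_only = lines_per_day(7000, 6, 22)

print(f"{every_day:.1f} lines/day working every day")
print(f"{weekdays_only:.1f} lines/day working weekdays only")
```

Under either assumption the required pace is a few dozen lines a day.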

pg49:

It is also important to note that if motivated parties with the power of subpoenas, witnesses, interviews, and evidence delved deeper into the development of Unix to Minix to Linux (UML) there are a number of reasons why there could potentially be problems.

Brown seems to have been asleep for the last couple of years, and to have not noticed that motivated parties (Microsoft and SCO) have in fact been trying to find copyright problems in Linux, with negligible success.

But to this day, Linux, a product known virtually around the world, still does not properly credit Minix for its source code, its derivative use or its influence. Arguably, this has cost Prentice Hall considerable book sales from the years 1987 to present. In addition, it also obviously cost Prentice Hall sales between 1987 and 2000. One reason is due to the loss of customers that would have bought the Prentice Hall publication for the Minix code.

I'm not sure what degree of credit is necessary for a program you used 12 years ago that made you feel like writing one of your own. I would guess that it is honest to mention it in books or interviews about the history, which is what Linus does.

As Tanenbaum points out, the purpose of the Minix books and software is to be a teaching tool. The purpose of Linux is to be a practical operating system, and it is perhaps now getting too large to be comfortably used for teaching. They are complementary, not substitutable.

pg49:

Instead of buying the Tanenbaum book for the Minix code, they could get a free copy of Linux.

Hypothetically, at $100 per book, at a loss of just 500 book sales a year, to date, Prentice Hall and Tanenbaum have lost almost $1,000,000 in revenues. This is of course only represents compensatory damages, not punitive. Arguably, Prentice Hall has lost out on tens of millions of dollars.

Let's just check this: $100 × 500 × 12 years is $600,000. That's a big rounding error! Should we really trust people who can't do grade-school arithmetic to give advice on public policy or corporate strategy?
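The same check, written out so the inputs are explicit. The price and the lost sales per year are Brown's hypothetical figures; the 12 years is the period used in the calculation above. The function name is just for illustration.

```python
def lost_revenue(price_per_book, lost_sales_per_year, years):
    """Hypothetical revenue lost, using Brown's own per-book and per-year figures."""
    return price_per_book * lost_sales_per_year * years


total = lost_revenue(100, 500, 12)
print(f"${total:,}")  # $600,000

# How many years of such losses it would take to reach $1,000,000.
years_to_reach_million = 1_000_000 / (100 * 500)
print(years_to_reach_million)  # 20.0
```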

The whole argument rests on the assumption that Linux incorporates Minix code, for which Brown presents not a shred of primary evidence and which has been resoundingly rebutted by the author of Minix.

I note that Prentice Hall currently publishes a number of books about Linux, including Understanding the Linux Virtual Memory Manager, Linux Programming by Example, Samba 3 by Example. Since their offerings cover a broader range than just the design of microkernel operating systems I would venture that Prentice Hall probably sell more Linux books than Brown's estimate of 500 lost Minix sales.

"Arguably" means "it may be argued that". But Brown doesn't actually argue it. Why didn't Brown ask Tanenbaum or Prentice Hall if they felt Linux infringed their copyrights, or if they felt bad about the idea of lost sales? Apparently either he didn't ask, or he didn't like the answer.

pg59:

Another interesting perspective on the credits files is the limited credit to members from developing countries in the Tuomi chart. 85 This can be explained away by simply suggesting that non-English speaking countries would have been slow to show interest in Linux development. However, by 2000, although it is widely known that China and India are heavy Linux developers, they both receive an insignificant amount of credit in the Linux credits files 86. In fact, India, an English-speaking country, is non-existent, while countries such as Mexico, Brazil, and Argentina are recorded with minimal presence. Amusingly, while the Tuomi chart studies Linux credits from 1991 to 2000 from over thirty countries, according to Tuomi's study of the credits files, Finland per million inhabitants, remains the number one source of original Linux code for the ten year project.

It is almost certain that Tuomi, a scientist and rigorous researcher, does not introduce this data to argue that there may be country bias in the credits files. However, Tuomi's point "History has a very selective memory..." could be relevant in this instance as well. After all, the Matthew Effect historically has been very effective in purging the origin of invention from developing countries for many years. We don't have any evidence that it occurred with Linux. However, it is conspicuous that an open source model, touted to uplift developing countries, does not seem to have contributions from the very countries Linux advocates are arguing they are interested in promoting.

Tuomi, a "rigorous researcher", doesn't see any bias here. But the less rigorous Mr Brown does, or at least tries to imply it.

Mind you, he doesn't actually name even one Indian developer who has been omitted, or say more than "it is widely known that India and China are heavy contributors." I think I'd question that point: I've personally only seen a lot of people from India coming into open source over the last two or three years, which is after this survey concluded. Even then, many of the contributions seem to be in other places than the kernel. (I use the circumlocution because of course there have been ethnic Indian and Chinese people living in other countries and working on free software for a rather longer time.)

Credits in the Linux kernel are maintained by the developers themselves, using patches like this. If anyone feels they have not been credited sufficiently or accurately, they can easily correct the record just by sending a patch or a request for their name to be added. This policy is stated in the file in question, linux-2.6/MAINTAINERS:

PLEASE try to include any credit lines you want added with the patch. It avoids people being missed off by mistake and makes it easier to know who wants adding and who doesn't.

I am not aware of any complaints from developers that Linus is ignoring those requests. On the contrary, there is plenty of evidence on the kernel list archives that the credits file is updated when people request it. For example, here is an excerpt of the changes in linux 2.6.5, showing updates to author information:

@@ -1875,6 +1864,13 @@
 S: D53424 Remagen
 S: Germany
 
+N: Colin Leroy
+E: colin@colino.net
+W: http://www.geekounet.org/
+D: PowerMac adt7467 fan driver
+S: Toulouse
+S: France
+
 N: Achim Leubner
 E: achim_leubner@adaptec.com
 D: GDT Disk Array Controller/Storage RAID controller driver

It seems to me the onus is on Brown to prove that there is a conspiracy to not give credit to Indian developers. He does not provide any evidence.
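Anyone who wants to test the country claim can tally the records themselves, since the format in the excerpt above (one `N:` name line followed by `E:`, `W:`, `D:` and `S:` lines, with records separated by blank lines) is easy to parse. The following is a rough sketch in Python, not the method Tuomi used: the function names are made up for the example, and treating the last `S:` line of a record as the country is an assumption that will not hold for every entry.

```python
from collections import Counter


def parse_credits(text):
    """Split credits-style text into records.

    Each record is a dict mapping a field letter (N, E, W, D, S, ...)
    to the list of values given for that field.
    """
    records = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current record.
            if current:
                records.append(current)
                current = {}
            continue
        if len(line) >= 2 and line[1] == ":":
            current.setdefault(line[0], []).append(line[2:].strip())
    if current:
        records.append(current)
    return records


def count_by_country(records):
    """Tally records by country.

    Assumption: the last S: (address) line of a record names the country.
    Records without any S: line are skipped.
    """
    return Counter(r["S"][-1] for r in records if "S" in r)
```

Running `count_by_country(parse_credits(text))` on the full file gives a per-country tally that can be compared against any claim of omission. It says nothing, of course, about contributors who never asked to be listed.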

pg62:

Brown says that "reverse engineering" (of what?) is a possible explanation for how Linux developed so quickly.

Anyone who has ever seen how slow and painful reverse engineering can be is welcome to laugh at this point. Given the choice between reverse engineering a program and writing one from scratch, I'd always go from scratch.

Brown also proposes that the reason why we can't find Minix code in Linux is that it was obfuscated. Never mind that the designs are completely different. Brown doesn't actually point to any code in Linux which shows signs of obfuscation.

pg72:

Handwave at outsourcing. Of course.

pg74:

Paradoxically, every dollar of advertising and promotion corporations such as IBM and Oracle contribute to increasing customer interest in ree Platforms' respectively will cost these companies lucrative accounts.

IBM is fortunate to have Microsoft/AdTI be so concerned for IBM's welfare. No doubt IBM and Oracle would have been better off to forgo their billions of dollars in Linux-related revenue.

pg75:

To defend themselves, all open source organizations are slowly becoming more bureaucratic and more closed--more like proprietary software companies.

I'm sure Microsoft/AdTI wishes that were the case, but I don't see it happening. On the contrary, open source processes seem to be getting more efficient all the time: GNOME and Fedora for example now ship on a regular schedule, whereas Microsoft's Longhorn has slipped another couple of years.

If Brown has evidence of open source projects becoming bureaucratic I would be interested to read it, but it seems he does not.

pg77:

A section titled "Achieving Balance". Given that the entire essay to date has been Brown's fervid imaginings without a shred of evidence, it is hard to see why any adjustment is required.

Corporate interests cannot fund truly free software because their interests are tied to the promotion of their business.

Is he saying that corporations are not able to fund free software, or that they must not be allowed? If the first, who is Brown to tell people how to run their business? If the second, on what grounds does he propose to outlaw cooperation between free parties? I see no answer here.

It is in the best interest for the federal government to take the lead on funding a bigger open source project at universities. The commercial open source model is 1) depreciating the value of U.S. proprietary software 2) depreciating the value of U.S. investment in the IT industry 3) diminishing the returns of the IT industry which is in turn send U.S. jobs overseas to make up for losses. 4) funding the devolution of the U.S. intellectual property rights economy.

On the other hand, free software is making some American businesses more efficient, and offering better products to some American consumers. Perhaps we could try to estimate the costs and benefits, or perhaps we could just say that the free market will work it out. But Brown just rather tediously makes the assertion and moves on.

(At this point in the essay Brown seems to switch from "ironic" to "inane" as a favourite word to wave around when he can't think of anything better to say.)

1) The government should support R&D at universities with open source projects that produce research that all parties can use. This includes developers and commercial interests. However, taxpayer dollars cannot support open source projects that are tied to commercial open source models that compete with the private sector.

2) Universities and colleges that receive government grants should not be able use taxpayer dollars to generate source code that is restrictive. Both individuals and business should be able subsequently to develop free software and protect it as its own intellectual property

Here, at last, two consecutive paragraphs(!) that are concrete and not contradictory. He seems to argue that all government projects should produce code that is MIT-licenced. I can see some sense in that, though there are some difficulties.

If consistently applied, it would also prevent universities from working on proprietary code, as they might currently do in joint ventures. I'm not sure if Brown sees that as a positive good or if he just didn't think of it. It might be a reasonable tradeoff. One might equally argue that all work should be GPL'd, so as to guarantee ongoing public access. Or one might argue that, as at present, it should be decided by the university case-by-case.

If AdTI wants to persuade universities to release their work under the MIT licence, they need to make a stronger argument than they have to date.

To be clear, the hybrid open source model encourages conspicuous development and proprietary software models.

I'm not sure exactly what that is meant to mean, but I like the sound of it. "Conspicuous contribution", rather than "conspicuous consumption." Nice.

Finally, U.S. corporations, especially in today's economy, would only benefit by more research and development assistance.

Oh, so AdTI is not so liberal after all. Here we get close to the truth: a plea from Microsoft to the government to make the nasty free-speech subversives go away, and to give Microsoft more public money. How sad. Adapt or die.

pg87:

Here is included a list of many papers by Andrew Tanenbaum. An impressive record, but most of them have little to do with Linux. Who knows what it's doing here? If it's an attempt to rub some of Professor Tanenbaum's credibility off onto AdTI, it seems they failed miserably.

In conclusion, I can do no better than to repeat David Skoll's summary of the previous AdTI fluff:

The entire AdTI study is a commercial funded by Microsoft, whose sole aim is to counter the growing adoption of GPL'd software. The report contains nothing constructive or useful. It is a sham.
