Martin Pool's blog

More on changelogs

Colin replied to my entry on ChangeLogs.

I have to disagree with Martin Pool's reasoning on ChangeLogs, although not the conclusion. I think ChangeLogs were created mainly as a workaround for the lack of atomic commits, and the file orientation of RCS and CVS. Since my logical change could span several files, the ChangeLog (as used in CVS) serves to mark what files were involved in the change, as well as duplicate the log entry for the commit.

However, tree-oriented version control systems with atomic commits (e.g. GNU Arch) don't have this problem. There is no direct notion of individual file history - every change affects the entire tree, atomically. Which makes sense - a change, in general, is associated with the tree, and not an individual file.

Yes, a lack of good tree-wide revision control is probably why ChangeLogs were originally invented. Indeed I wouldn't be surprised if some people started using ChangeLogs before they had any version control at all, let alone tree-wide changesets. (Intercontinental CVS is pretty tedious even today; I don't know if many people would have been doing it ten years ago.) It was for just this reason that I was previously pretty skeptical about the idea of using them: all the information is in CVS or Subversion, so why bother? But what Ben eventually convinced me of, and what I was trying to communicate, is that there is still some value in keeping them, even if you have a good vc system.

The ChangeLog standard is good in itself. The GNU Standards give fairly well considered requirements for what ought to be in log entries: for example, they say you shouldn't abbreviate function names, so that people can easily grep for them. There's consistent formatting, which is understood by various programs. And when somebody sends you a patch, if they send you ChangeLog entries for it, then you have a detailed description in their own words of what they changed. This may not be an argument for literally having a file called ./ChangeLog, but I think it's good to think about writing commit messages in this kind of style, however you record them. Or at least read the GNU manual entries on this, and take from them whatever seems valuable.

It supports promiscuous forking. It's a good thing if people can fork a project without needing permission from the author, go and do their own thing, and then perhaps come back and merge later. CVS does this poorly; BitKeeper and Arch do it very well, and the informal mailing-patches-around system does it moderately well. Keeping ChangeLogs helps with that: when you later get a patch from somebody else, you have at least some record of what changes they made and (equally important) which of your changes they've already taken. So I'd say for this reason to at least consider using ChangeLogs if you're using a vc system that's not natively distributed.

Even if Arch supports promiscuous forking easily, the tool itself is still a barrier to entry. Personally I found it too underdocumented last time I tried. If I wanted to contribute to a project that used it, I'd just send a patch. So for users like me, having a ChangeLog in the tarball is pretty useful. If you always ship the ChangeLog when you ship source, then there is always a chance for sane forking and merging.

I note that even though the Subversion developers obviously have whole-tree commits, they keep a ChangeLog as well (with the same text.)

Version control is not forever. Many successful programs outlive their version control systems. There are a whole range of possible reasons:

And note that in many of these cases, by the time you need the ChangeLog it's too late to get it. It's no good to say "oh, you can generate it when you want it." By the time you want it, the CVS server may be long gone.

A ChangeLog is not a real substitute for a complete source history, but you can count on it always being there. You can't always get what you want, but sometimes you can get what you need. It's something that may pay off in the long term, rather than the short term. But then most documentation is like that.

Also like other documentation: some rough notes that are kept around and kept up to date are more useful in the long term than grandiose autogenerated phonebooks.

Given a ChangeLog, you can see what was changed, and perhaps why. Like a good comment, you get a higher-level overview of the textual changes from one version to another. You can see who might know about particular areas. By diffing two ChangeLogs, you can see what the difference is between their trees at a higher level than the raw diff. This is highly useful because of the tendency of diffs to become noisy when for example a global rename occurs, or a movement between files.
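As a rough illustration of diffing two ChangeLogs at the entry level, here is a minimal Python sketch. It assumes GNU-style entries whose header lines start in column 0 with an ISO date; the function names `split_entries` and `entries_only_in` are made up for this example and don't come from any existing tool.

```python
import re

# Assumption: each entry starts with a header line such as
# "2003-10-12  Some Name  <name@example.org>" in column 0.
HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}\s")


def split_entries(text):
    """Split ChangeLog text into a list of entries, one string per entry."""
    entries, current = [], []
    for line in text.splitlines():
        # A new header closes the entry collected so far.
        if HEADER.match(line) and current:
            entries.append("\n".join(current).strip())
            current = []
        current.append(line)
    last = "\n".join(current).strip()
    if last:
        entries.append(last)
    return entries


def entries_only_in(ours, theirs):
    """Return the entries present in `ours` but missing from `theirs`."""
    seen = set(split_entries(theirs))
    return [entry for entry in split_entries(ours) if entry not in seen]
```

Calling `entries_only_in(my_changelog_text, their_changelog_text)` would list the changes the other tree has not picked up, and swapping the arguments shows what they did that you haven't taken. It compares entries by exact text, so a reworded entry counts as different.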

It's cheap. With a little tool or editor macro, keeping the ChangeLog up to date is easy. So given that it's very useful, and it's cheap, why not do it?
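To give an idea of how little such a tool needs to do, here is a small Python sketch that formats an entry in roughly the GNU style (date, name and email on a header line, then tab-indented `* file (function): description` lines) and puts it at the top of the file. The helper names `format_entry` and `prepend_entry` are hypothetical, and the layout is my reading of the GNU conventions rather than a checked implementation of them.

```python
import datetime
import os


def format_entry(author, email, changes, date=None):
    """Return one GNU-style ChangeLog entry as a string.

    `changes` is a list of (filename, function_name, description) tuples;
    function_name may be None for changes not tied to one function.
    """
    date = date or datetime.date.today()
    lines = [f"{date.isoformat()}  {author}  <{email}>", ""]
    for filename, function_name, description in changes:
        where = f"{filename} ({function_name})" if function_name else filename
        lines.append(f"\t* {where}: {description}")
    return "\n".join(lines) + "\n\n"


def prepend_entry(path, entry):
    """Write `entry` at the top of the ChangeLog at `path`, creating it if needed."""
    old = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            old = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(entry + old)
```

A commit wrapper could call `prepend_entry("ChangeLog", format_entry(...))` and then reuse the same text as the commit message, so the description only has to be written once.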

It may seem a little ugly to record text twice but disk is so cheap these days...

Everyone can use them. I suppose it's possible that in the future Arch or BitKeeper or Buttkeeper really will become the universal version control system, perfect and unchanging like the face of god. Everyone who gets the source will get the whole history, and everybody else will be able to merge with them. You'll never lose history, and nobody will ever want to change vc systems. But until that day arrives, ChangeLogs are a reasonable common denominator that allows everyone to get at least a little bit of history about their tree.

I don't know about Arch, but the dependencies of Subversion make it nontrivial to compile on a new Linux architecture, let alone an alien OS. There are still people around who want to port things to OpenVMS or Netware, and in the future perhaps they'll be porting to EROS or TUNES. If you want truly promiscuous forking, it's good to support these people. If they can at least participate by making and reading ChangeLog entries, then we don't have to feel so guilty about using a vc tool that they can't run.

The good thing about a text file is that everyone can read it, regardless of what tools they use. Not everyone can swallow the BitKeeper licence, and possibly not everyone wants to use Arch. But everyone can read and write a ChangeLog.

It's handy. I was pretty skeptical about this when Ben told me, but it's true: it is very nice to just quickly C-s through the file, and to get grep hits when you search for a function name. Avoiding the few-second delay to run svn log or whatever is handy. And of course it's still there when you're disconnected.

I think ChangeLogs should be a complement to NEWS files, rather than a replacement for them. (GNU standards say so too.) The NEWS file is directed at users and contains things they need to know about in their language; the ChangeLog is directed at programmers. I think having both, and keeping them separate, is pretty valuable: looking through /usr/share/doc/ it is a bit annoying to see many programs that have only one and not both. If you're going to have both, then using the canonical names and contents helps make it clear.

I started wanting this after being handed three somewhat different versions of a tree that I was meant to maintain. It wasn't easy to reach the original developers, and it wasn't clear which of them was meant to be "most current" and whether the others were forks or earlier snapshots. A ChangeLog wouldn't have fixed everything, but at least it might have given some indication of the relationships.

I wish those developers had used ChangeLogs out of consideration for the people who came after them. So the golden rule suggests that I should try them myself, and it's surprisingly good.
