Mailing ChangeLog diffs
Luke Gorrie suggests:
One ChangeLog trick I like is to send a daily diff of the ChangeLog to the development mailing list. This saves the bother of writing "I just added.." mails. Like a digest version of a cvs-commit mailing list, though nicer if the ChangeLog is hand-written.
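Here's a rough sketch of how the daily diff might be computed, in Python. It assumes you keep yesterday's copy of the ChangeLog around somewhere; the function name changelog_additions is just made up for illustration, and actually mailing the result is left out.

```python
import difflib


def changelog_additions(old_text: str, new_text: str) -> str:
    """Return the lines that are in new_text but not in old_text."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff marks added lines with "+ "; drop the marker, keep the text.
    added = [line[2:] for line in diff if line.startswith("+ ")]
    return "\n".join(added)
```

Since new entries go at the top of a ChangeLog, the additions are normally just the entries written since the last mail.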
posted Fri 19 Dec 2003 in /software/vc/changelogs | link
Should you keep ChangeLogs in CVS?
Assuming you were persuaded by my previous argument that ChangeLogs are a good thing, we can consider whether it is good to keep ChangeLogs in CVS as a specific file. The alternatives here are to either put it in CVS and commit with every change, or to autogenerate the ChangeLog when you make a release tarball.
In favour of generating from CVS
If you want to autogenerate it, there are some tools such as cvs2cl which will produce something moderately close to GNU format, including joining up multiple files which have the same changes. This avoids needing to separately maintain and commit the changes.
It avoids a slight amount of work in getting the text into both the ChangeLog and the commit message. I don't think this is really compelling though: firstly, if you set up your environment properly then it should be nearly automatic, and secondly, I would hope that you're spending most of your time thinking or writing code rather than writing commit messages.
It avoids the possibility of forgetting to commit to the ChangeLog.
In favour of explicitly keeping it in CVS
Writing the ChangeLog by hand makes it more likely that people will stick carefully to the GNU standard for formatting it.
If there are mistakes or omissions in the ChangeLog, you can go back and correct them later. It's not easy (for good reasons) to amend commit messages in most version control systems.
One reason not to autogenerate the ChangeLog is that in the GNU way of doing things, the ChangeLog and the commit log operate at slightly different levels. For example, you might sometimes make a small typo in your commit and immediately go back and fix it. That needs a separate CVS commit of course, and some explanation of what you've done, but it's unlikely anyone will care in a month's time, and so it doesn't need to go into the ChangeLog. There is a hierarchy of detail, where the NEWS file describes only user-visible changes, the ChangeLog describes significant internal changes to the program text, and CVS describes every textual change. (I seem to recall that cvs2cl has some option to ignore particular changes, but I can't find it at the moment.)
If you lose your repository, then you'll at least have the ChangeLog in any checkouts or snapshots you may have around. (As it happens, a problem with LVM destroyed some of my Subversion repositories yesterday...) Hopefully you have backups, but they may not have captured the most recent changes. The ChangeLog gives you some description of what was done.
emacs has good facilities for forming ChangeLog entries, including inserting the function name and so on. (It might be nice if there were an option to write the same into a temporary file to be used as a commit message.)
ChangeLogs provide a good way to keep a record of uncommitted changes. Of course it's good to go back and review your diff before you commit, but if you're making an extended change it can help to keep notes while you're working. The ChangeLog file is probably as good as any.
If you are working offline with a non-distributed vc system like CVS or Subversion then ChangeLogs let you retain a history of what you changed and record a more detailed history when you commit.
Keeping the ChangeLog up to date with every CVS update means it's always available. You don't need to remember to update it before disconnecting.
Some projects use small libraries that are maintained separately, but also cloned into the project's tree. For example, distcc has a copy of the popt tree so that people without that library can easily build it. These subtrees have their own history, and they need to merge from upstream from time to time. A good way to support this is to keep a ChangeLog in that directory representing upstream changes. (This isn't really incompatible with using cvs2cl for the project itself as you could just exclude that directory from autogeneration.)
Comments?
posted Thu 18 Dec 2003 in /software/vc/changelogs | link
More on changelogs
Colin replied to my entry on ChangeLogs.
I have to disagree with Martin Pool's reasoning on ChangeLogs, although not the conclusion. I think ChangeLogs were created mainly as a workaround for the lack of atomic commits, and the file orientation of RCS and CVS. Since my logical change could span several files, the ChangeLog (as used in CVS) serves to mark what files were involved in the change, as well as duplicate the log entry for the commit.
However, tree-oriented version control systems with atomic commits (e.g. GNU Arch) don't have this problem. There is no direct notion of individual file history - every change affects the entire tree, atomically. Which makes sense - a change, in general, is associated with the tree, and not an individual file.
Yes, a lack of good tree-wide revision control is probably why ChangeLogs were originally invented. Indeed I wouldn't be surprised if some people started using ChangeLogs before they had any version control at all, let alone tree-wide changesets. (Intercontinental CVS is pretty tedious even today; I don't know if many people would have been doing it ten years ago.) It was for just this reason that I was previously pretty skeptical about the idea of using them: all the information is in CVS or Subversion, so why bother? But what Ben eventually convinced me of, and what I was trying to communicate, is that there is still some value in keeping them, even if you have a good vc system.
The ChangeLog standard is good in itself. The GNU Standards give fairly well considered requirements for what ought to be in log entries: for example, they say you shouldn't abbreviate function names, so that people can easily grep for them. There's consistent formatting, which is understood by various programs. And when somebody sends you a patch, if they send you ChangeLog entries for it, then you have a detailed description in their own words of what they changed. This may not be an argument for literally having a file called ./ChangeLog, but I think it's good to think about writing commit messages in this kind of style, however you record them. Or at least read the GNU manual entries on this, and take from them whatever seems valuable.
It supports promiscuous forking. It's a good thing if people can fork a project without needing permission from the author, go and do their own thing, and then perhaps come back and merge later. CVS does this poorly; BitKeeper and Arch do it very well, and the informal mailing-patches-around system does it moderately well. Keeping ChangeLogs helps with that: when you later get a patch from somebody else, you have at least some record of what changes they made and (equally important) which of your changes they've already taken. So I'd say for this reason to at least consider using ChangeLogs if you're using a vc system that's not natively distributed.
Even if arch supports promiscuous forking easily, it's still a barrier to entry. Personally I found it too underdocumented last time I tried. If I wanted to contribute to a project that used it, I'd just send a patch. So for users like me, having a ChangeLog in the tarball is pretty useful. If you always ship the ChangeLog when you ship source, then there is always a chance for sane forking and merging.
I note that even though the Subversion developers obviously have whole-tree commits, they keep a ChangeLog as well (with the same text.)
Version control is not forever. Many successful programs outlive their version control systems. There is a whole range of possible reasons:
- The program's abandoned by the original maintainer. Some time later, somebody else picks it up and takes it over. They have a tarball, but they don't have the original maintainer's repository.
- The maintainer just loses the history. Perhaps their machine crashes, and they have backups of tarballs but not CVS. Perhaps they want to carry it around with them from one job or project to another.
- The code gets imported as a library or utility into a project that's stored in a different VC system: perhaps CVS, or Perforce, or ClearCase. There might be some history-conversion tools, but they're rarely perfect, and even more rarely work well in both directions.
- The code's passed between companies or groups. They hand over a "finished" tarball, but it still needs to be maintained and history can illuminate the source.
- A better vc system comes out, and you want to switch to it.
- Or conversely, a worse vc system comes out, and you want to switch to it. :-) Perhaps you were using BitKeeper, but the licence changes and you can't swallow it anymore.
And note that in many of these cases, by the time you need the ChangeLog it's too late to get it. It's no good to say "oh, you can generate it when you want it." By the time you want it, the CVS server may be long gone.
A ChangeLog is not a real substitute for a complete source history, but you can count on it always being there. You can't always get what you want, but sometimes you can get what you need. It's something that may pay off in the long term, rather than the short term. But then most documentation is like that.
Also like other documentation: some rough notes that are kept around and kept up to date are more useful in the long term than grandiose autogenerated phonebooks.
Given a ChangeLog, you can see what was changed, and perhaps why. Like a good comment, it gives you a higher-level overview of the textual changes from one version to another. You can see who might know about particular areas. By diffing two ChangeLogs, you can see what the difference is between their trees at a higher level than the raw diff. This is very useful because raw diffs tend to become noisy when, for example, something is renamed globally or code moves between files.
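As a rough illustration of comparing two trees by their ChangeLogs, here is a small Python sketch. It assumes GNU-style entries whose header line starts in column 0 with an ISO date (older ChangeLogs with other date styles would need a different pattern); parse_entries and missing_from are hypothetical names, not part of any existing tool.

```python
import re

# Assumption: an entry header starts in column 0 with a YYYY-MM-DD date.
HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}\s+\S")


def parse_entries(text: str) -> list[tuple[str, str]]:
    """Split ChangeLog text into (header, body) pairs, in file order."""
    entries = []
    header, body = None, []
    for line in text.splitlines():
        if HEADER.match(line):
            if header is not None:
                entries.append((header, "\n".join(body).strip()))
            # Collapse runs of whitespace so spacing differences don't matter.
            header, body = " ".join(line.split()), []
        elif header is not None:
            body.append(line.strip())
    if header is not None:
        entries.append((header, "\n".join(body).strip()))
    return entries


def missing_from(ours: str, theirs: str) -> list[tuple[str, str]]:
    """Entries that appear in `theirs` but not in `ours`."""
    have = set(parse_entries(ours))
    return [entry for entry in parse_entries(theirs) if entry not in have]
```

Running missing_from in both directions gives a first guess at which changes each tree has that the other lacks, provided both sides wrote their entries carefully.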
It's cheap. With a little tool or editor macro keeping the ChangeLog up to date is easy. So given that it's very useful, and it's cheap, why not do it?
It may seem a little ugly to record text twice but disk is so cheap these days...
Everyone can use them. I suppose it's possible that in the future Arch or Bitkeeper or Buttkeeper really will become the universal version control system, perfect and unchanging like the face of god. Everyone who gets the source will get the whole history, and everybody else will be able to merge with them. You'll never lose history, and nobody will ever want to change vc systems. But until that day arrives, ChangeLogs are a reasonable common denominator that allows everyone to get at least a little bit of history about their tree.
I don't know about Arch, but the dependencies of Subversion make it nontrivial to compile on a new Linux architecture, let alone an alien OS. There are still people around who want to port things to OpenVMS or Netware, and in the future perhaps they'll be porting to EROS or TUNES. If you want truly promiscuous forking, it's good to support these people. If they can at least participate by making and reading ChangeLog entries then we don't have to feel so guilty about using a vc tool that they can't run.
The good thing about a text file is that everyone can read it, regardless of what tools they use. Not everyone can swallow the BitKeeper licence, and possibly not everyone wants to use Arch. But everyone can read and write a ChangeLog.
It's handy. I was pretty skeptical about this when Ben told me, but it's true: it is very nice to just quickly C-s through the file, and to get grep hits when you search for a function name. Avoiding the few-second delay to run svn log or whatever is handy. And of course it's still there when you're disconnected.
I think ChangeLogs should be a complement to NEWS files, rather than a replacement for them. (GNU standards say so too.) The NEWS file is directed at users and contains things they need to know about in their language; the ChangeLog is directed at programmers. I think having both, and keeping them separate, is pretty valuable: looking through /usr/share/doc/ it is a bit annoying to see many programs which have only one and not both. If you're going to have both, then using the canonical names and contents helps make it clear.
I started wanting this after being handed three somewhat different versions of a tree that I was meant to maintain. It wasn't easy to reach the original developers, and it wasn't clear which of them was meant to be "most current" and whether the others were forks or earlier snapshots. A ChangeLog wouldn't have fixed everything, but at least it might have given some indication of the relationships.
I wish those developers had used ChangeLogs out of consideration for the people who came after them. So the golden rule suggests that I should try them myself, and it's surprisingly good.
posted Mon 1 Dec 2003 in /software/vc/changelogs | link
Why Changelogs?
In many GNU packages like gcc or emacs you'll see a ChangeLog file containing a description of all of the changes to the package. I'd always thought of them as a vestige of a previous era before CVS, but bje recently made a pretty good argument for continuing to use them.
Here's a sample from emacs's ChangeLog, for any readers who aren't familiar with the form:
2003-09-23 Dave Love <fx@gnu.org>
* configure.in: Check members of struct ifreq.
2003-09-14 Kim F. Storm <storm@cua.dk>
* configure.in: Add checks for sys/ioctl.h and net/if.h.
2003-09-12 Luc Teirlinck <teirllm@mail.auburn.edu>
* Makefile.in (install-arch-indep, uninstall): Add SES manual.
2003-08-18 Lute Kamstra <Lute.Kamstra@cwi.nl>
* configure.in: Revert the change of 2003-07-29 as GTK+ 2.2 is not required anymore.
2003-08-07 Andrew Choi <akochoi@shaw.ca>
* configure.in [powerpc-apple-darwin*]: Use the -no-cpp-precomp option instead of -traditional-cpp for CPP.
There's a text format for listing changes and then also a description in the GNU coding standards of what information ought to be in the entries. This list of what was changed, when, and by whom is pretty similar to what you might see in a CVS history.
So if you're storing your project in CVS or some other revision control system, then keeping a ChangeLog as well might seem redundant. Indeed, if you keep the ChangeLog in CVS then every comment is literally being stored twice...
The main reason for using a ChangeLog is that it travels with the source it describes. Years in the future if somebody obtains the source they can still see its history, even if they no longer have access to the version control system. It's fairly common for projects to move between different vc systems over their lifetime. It's not unheard of for a vc system to be lost entirely, leaving only a set of tarballs as a record.
If some other party wants to fork the project, or develop it offline for a while before merging back, then the ChangeLog gives some hope that their history will be recorded.
Another advantage is that it's easy to scan through the log for mentions of a particular function. Easier than with any vc system I know of, at least if the log is well written.
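To make the scanning concrete, here's a minimal Python sketch that lists every ChangeLog line mentioning a given function, together with the header of the entry it belongs to. It assumes entry headers start in column 0 with a digit (the date), as in the emacs sample above; the function name mentions is made up for this example.

```python
import re


def mentions(changelog_text: str, name: str) -> list[tuple[str, str]]:
    """Return (entry header, matching line) pairs for a function name."""
    # Match the name as a whole word, so "install" doesn't hit "uninstall".
    word = re.compile(r"\b" + re.escape(name) + r"\b")
    header = ""
    hits = []
    for line in changelog_text.splitlines():
        if line[:1].isdigit():
            # Assumed to be an entry header: date, author, address.
            header = line.strip()
        elif word.search(line):
            hits.append((header, line.strip()))
    return hits
```

This only works well if function names are written out in full in the log, which is exactly what the GNU standards ask for.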
I've recently got hold of a few source tarballs written by parties unknown, and existing in different versions with no clear indication of when things were changed or why. If they'd come with ChangeLogs, things might be a bit easier. Of course the kind of person who omits even a README might not write a ChangeLog, but if somebody else had started one perhaps they might have continued it.
If we all used a version control system like arch or bitkeeper that carried history with the source then perhaps this might be less necessary. But even then there's no guarantee that every person who gets the source in the future will want to keep using that system...
If things are being kept in both CVS and a ChangeLog, it ought to be easy to use a little script or macro to keep them in sync.
There are scripts such as cvs2cl that produce a ChangeLog for CVS sources.
GNU emacs also has commands to integrate version control and ChangeLogs.
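As an example of the kind of little script I mean, here's a Python sketch that pulls the newest entry out of a ChangeLog and writes it to a temporary file to use as a commit message. It assumes new entries are added at the top and that headers start with an ISO date in column 0; the names top_entry_message and write_commit_message are invented for illustration.

```python
import re
import tempfile

# Assumption: an entry header starts in column 0 with a YYYY-MM-DD date.
HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}\s+\S")


def top_entry_message(changelog_text: str) -> str:
    """Return the body of the newest (first) entry, without its header."""
    body = []
    seen_header = False
    for line in changelog_text.splitlines():
        if HEADER.match(line):
            if seen_header:
                break  # reached the next, older entry
            seen_header = True
            continue
        if seen_header:
            body.append(line.strip())
    return "\n".join(body).strip()


def write_commit_message(changelog_text: str) -> str:
    """Write the newest entry to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(top_entry_message(changelog_text) + "\n")
        return f.name
```

The file it writes could then be given to the commit command, for instance with the -F option of cvs commit or svn commit, so the same text ends up in both places.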
Some projects require that ChangeLog entries be submitted with patches. This means an explanation of the change in the originator's own words always gets into the project history. If the ChangeLog standards are enforced then the entry will have a level of detail and preciseness that might not be present in an informal description of the change.
As with any history logs, there is a small challenge in writing descriptions that will be comprehensible and useful to people reading them months or years hence.
I'm going to try this for a while...
posted Mon 17 Nov 2003 in /software/vc/changelogs | link
Copyright (C) 1999-2007 Martin Pool.