[darcs-devel] darcs and git and the linux kernel

David Roundy droundy at darcs.net
Sun Apr 10 06:28:32 PDT 2005


On Sun, Apr 10, 2005 at 12:46:15AM +0100, Ian Lynagh wrote (offlist):
> By the way, I'm curious - have you been talking to Linus/other kernel
> developers about them using darcs, or are you just reacting to them
> saying they are looking around?

Just reacting to them.  But I've been following the threads on the linux
kernel mailing list, so I'm pretty much aware of the discussion.  There
currently isn't an SCM which qualifies for what they want.  Linus says
monotone is the front-runner, but darcs has also been mentioned, and
usually immediately dismissed because of performance issues.

I had an interesting idea this morning.  Linus has been working on a
stopgap measure called "git".  (see http://lwn.net/Articles/131313/ for the
first announcement--it's been pretty active since then) It's basically an
SCM without the CM... just a database that holds a sequence of versions and
associated changelog comments, but can't do anything interesting
(e.g. merging).  It stores its data as files in a directory, using hashes
as the filenames, and is extremely optimized--the whole point is to
speed up the common operations that Linus cares about.
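
To make the "hashes as filenames" idea concrete, here is a minimal sketch in Python.  It is not git's code or its on-disk format; the names store_blob and load_blob, the flat directory layout and the choice of SHA-1 over the raw bytes are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def store_blob(store_dir, data):
    """Write data to a file named by its SHA-1 digest and return the digest."""
    digest = hashlib.sha1(data).hexdigest()
    path = os.path.join(store_dir, digest)
    if not os.path.exists(path):  # identical content is only written once
        with open(path, "wb") as f:
            f.write(data)
    return digest


def load_blob(store_dir, digest):
    """Read back the content that was stored under the given digest."""
    with open(os.path.join(store_dir, digest), "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as store:
        key = store_blob(store, b"int main(void) { return 0; }\n")
        assert load_blob(store, key) == b"int main(void) { return 0; }\n"
        print(key)
```

Since the name is derived from the content, storing the same version of a file twice costs nothing extra, and looking up a known version is a single file open.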

My first thought was that we could (in theory) use git as a cache to store
older versions.  Then I got to thinking a bit more, and wondered if we
could (optionally) use git to manage the pristine tree.  It wouldn't make
"darcs record" any faster--if anything it'd cost us something--but then for
commands like

darcs diff --from-patch foo --to-patch bar file.c

we could just check out the relevant copies of file.c from those two
versions, and this would be instantaneous.  Obviously, this would involve
an extension of the pristine cache interface.  With care, that extended
interface could perhaps also be used beneficially with other pristine cache
options.
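
As a rough illustration of what such an extended interface might offer, here is a sketch in Python.  It is not darcs code (darcs is written in Haskell) and does not use git; SnapshotStore, record, checkout and diff_versions are hypothetical names, and an in-memory dictionary stands in for whatever would really hold the old versions.

```python
import difflib


class SnapshotStore:
    """Hypothetical store keeping a full copy of every file at each version."""

    def __init__(self):
        self._versions = {}  # version name -> {path: file text}

    def record(self, version, files):
        """Remember the complete tree (path -> text) for a named version."""
        self._versions[version] = dict(files)

    def checkout(self, version, path):
        """Return the text of one file as of the named version."""
        return self._versions[version][path]


def diff_versions(store, from_version, to_version, path):
    """Diff one file between two versions without applying any patches."""
    old = store.checkout(from_version, path).splitlines(keepends=True)
    new = store.checkout(to_version, path).splitlines(keepends=True)
    return "".join(difflib.unified_diff(
        old, new,
        fromfile=f"{path}@{from_version}",
        tofile=f"{path}@{to_version}"))


if __name__ == "__main__":
    store = SnapshotStore()
    store.record("foo", {"file.c": "int x;\n"})
    store.record("bar", {"file.c": "int y;\n"})
    print(diff_versions(store, "foo", "bar", "file.c"), end="")
```

The point of the sketch is only that the diff needs two lookups and one comparison, however many patches lie between the two versions.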

Also, git stores the modification time and creation time in nanoseconds,
along with the inode number, so that one can tell with great accuracy if a
file has *not* been modified.  That would be a nice improvement over the
second-granularity modification time check we currently use.  It's not
portable, but that's life.
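
A check of this kind could look roughly like the following Python sketch.  It is not taken from git or darcs; StatKey, stat_key and definitely_unmodified are made-up names, and the choice of exactly these three fields is an assumption.  Note that on Unix st_ctime_ns is the inode change time rather than a true creation time.

```python
import os
import tempfile
from collections import namedtuple

# Hypothetical record of what was seen when the file was last known clean.
StatKey = namedtuple("StatKey", "mtime_ns ctime_ns ino")


def stat_key(path):
    """Collect nanosecond timestamps and the inode number for a file."""
    st = os.stat(path)
    # On Unix, st_ctime_ns is the inode change time, not a creation time.
    return StatKey(st.st_mtime_ns, st.st_ctime_ns, st.st_ino)


def definitely_unmodified(path, recorded):
    """True if the file's stat data still matches the recorded key.

    A mismatch only means the content has to be compared; it does not
    prove that the content changed.
    """
    return stat_key(path) == recorded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "file.c")
        with open(path, "w") as f:
            f.write("int x;\n")
        key = stat_key(path)
        print(definitely_unmodified(path, key))
```

The nanosecond fields depend on what the filesystem actually records, which is part of why the approach is not portable.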

Git is written (as one might guess) in C, and I imagine it wouldn't be too
hard to make a little library out of it.  And since it looks like the
kernel will be stored in git until a decent SCM matures to the point where
it can take over, having git code in darcs might make conversion from git
to darcs easier.

On the down side, git stores more information than the pristine cache
currently does, so repositories would expand in size.  Also, git is under
rapid development.  In both cases, since darcs has good mechanisms for
switching pristine cache format, I don't see there being a real problem.

Anyhow, this was my idea.  If someone might be interested in working on
this (git integration for the pristine cache), I suggest either that you
ask on the linux-kernel list for thoughts on whether this would be a good
idea, or mention it to me, and I could ask.  I'm hesitant to ask there
myself first, since I don't expect to have the time to do this.

Juliusz, any chance you'd be interested? Since you wrote the pristine-cache
abstraction, and you're also a C coder with a unixy background, you'd
definitely be well qualified.  The question is one of time and interest...
-- 
David Roundy
http://www.darcs.net



