[darcs-users] Re: Limits of Darcs (the whole Linux kernel?)

Catalin Marinas catalin.marinas at arm.com
Mon Nov 8 10:52:21 UTC 2004


David Roundy <droundy at abridgegame.org> writes:
> We often need to commute patches, which means we need to know what files they
> modify, and which lines of those files, and what other
> (non-file-modification) changes they might make.  If we need to commute with
> a replace patch, we also need to know the contents of all the lines
> modified by that file.

The biggest part of a patch is the text it inserts into a file. It is
my impression that this text is only needed when applying the patch
onto a tree.

Let's consider only the patches that modify file contents (the
biggest ones). They can be split into 2 parts: the hunk information
(the line numbers at which a hunk adds or removes lines in a file) and
the data (the lines added by the hunks), which can be stored in a
separate file or concatenated at the end of the same file. Each hunk
that adds lines to a file also needs to record the position of its
lines in the data file. This position information is not modified by
any commutation operation, so there is no need to read the data file:
it will _not_ be changed by the commutation. Even when merging
patches, the data file can simply be copied into the new repository
and modified only if conflicts occur (that's the only case where the
data file needs to be modified).
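
To illustrate the split described above, here is a minimal sketch in
Python. It is not darcs code: the Hunk record, its field names and the
commute() helper are hypothetical, and the rule is a simplified one
that only handles hunks on the same file that do not touch each
other. The point is that swapping two hunks only adjusts line numbers,
while data_offset (the position in the data file) is carried along
unchanged and the data file itself is never opened.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Hunk:
    """Metadata only; the added text lives elsewhere (hypothetical layout)."""
    line: int         # line at which the hunk applies
    removed: int      # number of lines the hunk removes
    added: int        # number of lines the hunk adds
    data_offset: int  # position of the added lines in the data file


def commute(first: Hunk, second: Hunk) -> Optional[Tuple[Hunk, Hunk]]:
    """Swap two hunks applied in the order first, second.

    Returns the pair in its new application order, or None when the
    hunks touch each other and this simplified rule gives up.
    """
    if first.line + first.added < second.line:
        # second lies below the text written by first: undo first's shift.
        moved = replace(second, line=second.line - first.added + first.removed)
        return moved, first
    if second.line + second.removed < first.line:
        # second lies above first: first is shifted by second's net change.
        moved = replace(first, line=first.line + second.added - second.removed)
        return second, moved
    return None


if __name__ == "__main__":
    a = Hunk(line=10, removed=2, added=5, data_offset=0)
    b = Hunk(line=40, removed=1, added=1, data_offset=300)
    print(commute(a, b))
```

In this sketch only the line field ever changes, which is why the
data part would not need to be read for commutation under these
assumptions. A replace patch, as David notes, would still need the
line contents.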

> In short, in order to do commutation, we *may* need the entire contents of
> a patch.  We may be able to get by with a subset of that information, but
> it's not easy to figure out which subset is needed.

I could cope with the initial import, but later merging a patch took
much longer (about 3 hours, compared with 70min for an import). The
way I tried it was to have a main Linux repository, linux-2.6
(containing the 2.6.8 version), and to branch from it (darcs get) to
linux-2.6-mine. Work goes into the -mine tree, and later on I apply
the 2.6.9 patch (around 18MB) to linux-2.6. It finishes the record,
and I then want to pull this patch into the linux-2.6-mine tree (with
conflicts appearing). This is where it stayed for hours.

I didn't understand why darcs needs to parse the 2.6.8 patch
containing the initial import (I'm not sure what it did, but this was
my impression after it used ~600MB of memory), since that patch is
present in both the linux-2.6 and linux-2.6-mine trees and no
commuting with it has to be performed. Darcs should probably only
check dependencies against patches that are not present in the new
repository (and maybe only go back as far as the common ancestor of
the 2 repositories). This would improve its memory usage even if
whole patches are still loaded into memory.
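
A rough sketch of that suggestion, again in Python and not taken from
darcs: given the patch names recorded in each repository, only the
patches missing from one side would need to be loaded for a pull. The
function name and the patch names are made up for the example, and it
assumes the shared patches sit in the same order at the start of both
repositories, so that nothing has to commute past them.

```python
from typing import List, Tuple


def patches_to_examine(local: List[str],
                       remote: List[str]) -> Tuple[List[str], List[str]]:
    """Return (local_only, remote_only), keeping each repository's order.

    Patches present on both sides (such as the initial import) are left
    out, so they would never have to be parsed in this scheme.
    """
    common = set(local) & set(remote)
    local_only = [p for p in local if p not in common]
    remote_only = [p for p in remote if p not in common]
    return local_only, remote_only


if __name__ == "__main__":
    mine = ["import-2.6.8", "my-work-1", "my-work-2"]  # linux-2.6-mine
    main = ["import-2.6.8", "patch-2.6.9"]             # linux-2.6
    print(patches_to_examine(mine, main))
```

With the repositories from the test above, only the 2.6.9 patch and
the local work would be candidates for commutation; the large 2.6.8
import would be skipped.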

Catalin




