[darcs-users] CVS-style development with darcs

Juliusz Chroboczek jch at pps.jussieu.fr
Thu Jul 1 19:02:55 UTC 2004


After much struggling, I've finally managed to work out a layout that
fits my working habits with darcs.  Here's a summary of my findings.

To my great delight, it is possible to have a CVS-style work
environment with darcs if you know what you're doing.  I'm finding
darcs much more pleasant to use than either CVS or arch; however,
there are a few flaws that make me slightly uneasy.  Please scroll to
the end to see what the problems are.

There are just two things you need to understand in order to convert
from CVS to darcs[1]:

1. a CVS repository corresponds to a bunch of darcs repositories;
2. a CVS working directory corresponds to a single darcs repository.

While these things seem obvious with hindsight, it took me a while to
work them out.

1. A CVS repository is a bunch of repos

A CVS repository contains a bunch of modules, each of which contains a
bunch of branches, each of which contains a bunch of versions.

For every branch, you'll need a darcs repository.  Fortunately,
because darcs repositories live in the filesystem, you've got great
freedom organising them.  You could use a flat layout:

  /var/repos/
    hello-world/
    hello-world-stable/
    polipo/
    polipo-stable/

a ``branches within modules'' layout,

  /var/repos/
    hello-world/
      head/
      stable/
    polipo/
      head/
      stable/

or even, as darcs handles repos within repos just fine, something like

  /var/repos/
    hello-world/
      ./
      stable/
    polipo/
      ./
      stable/

Since I'm currently using darcs for only four projects, only one of
which has branched so far, I'm using the flat layout, but I might
switch to a nested layout when I find it helpful.
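
To make the three layouts concrete, here is a small Python sketch that
maps a (module, branch) pair to a repository path under each of them.
The function names and the convention that the main branch is called
"head" are assumptions made for the illustration; nothing here is part
of darcs itself.

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath("/var/repos")  # root directory used in the layouts above


def flat(module, branch):
    # head lives at <module>, any other branch at <module>-<branch>
    name = module if branch == "head" else f"{module}-{branch}"
    return ROOT / name


def branches_within_modules(module, branch):
    # every branch, head included, gets its own directory inside the module
    return ROOT / module / branch


def nested(module, branch):
    # head is the module directory itself; other branches sit inside it
    return ROOT / module if branch == "head" else ROOT / module / branch


for layout in (flat, branches_within_modules, nested):
    print(layout.__name__, layout("polipo", "head"), layout("polipo", "stable"))
```

Whichever layout is chosen, each path names one complete darcs
repository, so moving from one layout to another is a matter of moving
directories.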


2. A working directory is a darcs repo

Just like one never touches the files under /var/cvs except when
something's gone wrong, I never cd to /var/repos; when I want to work
on a project, I |darcs pull| into my home directory, and work from
there.  When I want to commit, I make sure all my changes have been
recorded, test that everything compiles, and then I |darcs push|.  If
something goes wrong before I push, I can always unrecord.

With this setup, I can also easily consult earlier versions: going
back in time is |darcs unpull|, going forward is |darcs pull|.
However, this is not as convenient as |cvs update -r|; more on this
below.
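
The workflow above can be pictured with a toy model in Python.  This is
only a sketch under strong assumptions: a repository is reduced to an
ordered list of patch names, and patch dependencies, commutation and
the working tree are all ignored.  The class and method names are made
up for the illustration and do not correspond to darcs internals.

```python
class Repo:
    """Toy model: a repository is just an ordered list of patch names."""

    def __init__(self, patches=()):
        self.patches = list(patches)

    def record(self, name):
        # add a new patch on top of the local history
        self.patches.append(name)

    def unrecord(self):
        # drop the most recent patch from the local history
        return self.patches.pop()

    def pull_from(self, other):
        # copy over every patch that the other repo has and we lack
        for patch in other.patches:
            if patch not in self.patches:
                self.patches.append(patch)

    def push_to(self, other):
        # pushing is pulling seen from the other side
        other.pull_from(self)


master = Repo(["initial", "fix-typo"])   # plays the role of /var/repos/...
work = Repo()                            # plays the role of the working directory
work.pull_from(master)
work.record("experiment")
work.unrecord()                          # changed my mind before pushing
work.record("new-feature")
work.push_to(master)
print(master.patches)
```

The point of the model is that the master and the working directory
are the same kind of object; only the direction in which patches move
distinguishes them.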


3. Darcs is good for you

Darcs gives you a lot of things that CVS doesn't.

The nicest thing is distributed work.  Whenever I take my laptop away
from the 'net, I make sure that it has an up-to-date clone[2] of
whichever repos I wish to work on.  I can then record as much as I
wish; when I'm back online, I |darcs send| rather than pushing: I'd
rather deal with the merging when I next read my mail.

Every working directory is a repository.  I often find myself
recording, trying something out, reverting, unrecording.  Or
recording, doing more work, unrecording, recording.

I'll barely mention the simplicity of branching and merging, and
especially the fact that you can merge incrementally without fear of
future conflicts.


4. Darcs doesn't like large trees

CVS loves large trees.  You can check out just part of a large tree,
and a CVS working directory doesn't cost much.  I've spent years
working on a 500MB tree using a machine with just 32MB of memory and
4GB of disk space (the repository was remote, obviously).

There's no way I could do that with darcs; the tree would need to be
split into multiple repos.  I guess that's not a problem, just
something you need to be aware of when setting up your repos, although
I wouldn't want to be the one moving the XFree86 tree into darcs.


5. Darcs branches are not cheap

With CVS, both branches and working dirs are cheap.  Not so with
darcs: every branch and working dir costs at least two full trees.
There are three solutions that I'm considering:

 - making darcs get and pull hard-link _darcs/current whenever
   possible;
 - using an (optional) different format for _darcs/current; I'm
   thinking of only storing hashes of files rather than the full
   files, which will be slow but good enough for the ``master'' repos
   under /var/repos;
 - allowing empty working dirs (a per-repo preference), which is fine
   for /var/repos.

Any good ideas on that subject are welcome.
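
To illustrate the second idea, here is a minimal Python sketch of a
hashes-only record of a tree.  It is not a proposal for the actual
on-disk format: the choice of SHA-1, the function names and the
dictionary representation are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def manifest(root):
    """Map each file's path (relative to root) to the SHA-1 of its contents."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha1(f.read()).hexdigest()
            result[os.path.relpath(path, root)] = digest
    return result


def changed_files(old, new):
    """Paths that were added, removed or modified between two manifests."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


with tempfile.TemporaryDirectory() as tree:
    with open(os.path.join(tree, "hello.c"), "w") as f:
        f.write("int main(void) { return 0; }\n")
    recorded = manifest(tree)            # what would be stored instead of full files
    with open(os.path.join(tree, "hello.c"), "a") as f:
        f.write("/* edited */\n")
    print(changed_files(recorded, manifest(tree)))
```

A manifest like this says which files differ from the recorded state
but not how they differ; getting the old contents back would
presumably mean reconstructing them from the patches, which is where
the slowness would come from.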


6. Darcs is unsafe

The reason why people (me included) love CVS is that CVS is safe.
There is no way you can ever lose data that has been committed without
doing manual surgery on /var/cvs/.  This gives you a peace of mind
that is difficult to understand if you haven't been exposed to CVS.

Darcs, on the other hand, doesn't enforce any invariants except for
honouring patch dependencies.  Patches flow randomly between repos,
and it takes a lot of discipline to ensure they flow the way you want
them to.

The new option --no-set-default has made this much better as it allows
you to set up a default dataflow that never changes.  However, I find
myself longing for ways to enforce dataflow by forbidding certain
flows.  I certainly never want a patch to flow from head to stable
without requiring a --force flag, and I would also like to make sure
that I cannot push to stable from a clone of head.

I think the right design is not to forbid individual flows, but to
have _darcs/level, an integer, and to forbid patches from flowing
from a lower-level to a higher-level repo (think water); then I could
set up stable to have level -1, and all'd be fine.  Again, any ideas
are welcome.
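
The level rule itself is a one-line comparison.  The Python sketch
below implements it exactly as worded above (no flow from a lower
level to a higher one) and adds a force override in the spirit of the
--force flag mentioned earlier.  The function name, the repo names and
the level values are hypothetical; darcs has no such feature.

```python
def flow_allowed(source_level, target_level, force=False):
    """Patches may not flow from a lower-level repo to a higher-level
    one unless the flow is explicitly forced."""
    return force or source_level >= target_level


levels = {"upper": 0, "lower": -1}   # hypothetical repos and levels

print(flow_allowed(levels["upper"], levels["lower"]))              # downhill
print(flow_allowed(levels["lower"], levels["upper"]))              # uphill
print(flow_allowed(levels["lower"], levels["upper"], force=True))  # uphill, forced
```

Repos at the same level may exchange patches freely under this rule,
so a clone would simply inherit the level of the repo it was cloned
from.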


An apology for switch

As I've mentioned before, darcs makes me uneasy because many commands
are potentially unsafe.  Hence, I'm looking for ways to extend it with
operations that have guaranteed safety properties.

In order to go back in time with darcs, you need to manually unpull a
bunch of patches.  Not only is that tedious, but it is a dangerous
operation: unless you are very careful, you run the risk of unpulling
a patch that you haven't pushed yet.

The solution is |switch|.  It roughly corresponds to |cvs update -r|
and |svn switch|.  In darcs, it would look like

  darcs switch [--match match] [--context context] [ [repo1] repo2]

which unpulls all the patches present in repo1 but not repo2, then
pulls all the patches present in repo2 but not repo1.  Both match and
context apply to repo2.
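
In terms of sets of patches, the proposed operation is two set
differences.  The Python sketch below computes them for repositories
modelled as ordered lists of patch names; the function name is made
up, patch dependencies are ignored, and unpulling most-recent-first is
an assumption about the sensible order rather than something specified
above.

```python
def switch_plan(repo1, repo2):
    """Return (patches to unpull, patches to pull) to turn repo1 into repo2."""
    in_repo1, in_repo2 = set(repo1), set(repo2)
    # patches present in repo1 but not repo2, most recent first
    to_unpull = [p for p in reversed(repo1) if p not in in_repo2]
    # patches present in repo2 but not repo1, in repo2's order
    to_pull = [p for p in repo2 if p not in in_repo1]
    return to_unpull, to_pull


current = ["a", "b", "c", "d"]
target = ["a", "b", "e"]
print(switch_plan(current, target))
```

Because both lists are computed from the two repositories rather than
typed by hand, the only patches unpulled are those that the target
repository really lacks.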

                                        Juliusz

[1] perhaps this should be made into a Wiki page?

[2] a clone is what |darcs get| produces.



