[darcs-users] DRAFT Proposal: Navigating the space of versions using Tree hashes

Jason Dagit dagit at codersbase.com
Sat Sep 4 07:47:17 UTC 2010

I would like to throw out an idea for a new darcs feature and start a
discussion about it.  Perhaps if this idea is accepted we can find
volunteers to do the work :)

Petr, correct me if I say something incorrect about hashed-storage.
There is still a lot I don't know about it (which is why I keep banging
on the more-documentation drum).

When the discussion, if any, converges I'll create a wishlist ticket
with the details.

= Overview and Motivation =

Something to keep in mind as you read this is that the value of
context dumps in darcs is that they provide us a way to ensure two
repositories are at the same pristine state without needing to create
a tag beforehand.

Hashed-storage provides us with a way to label pristine states.

For example, in the Agda darcs repository, if I type:
darcs show index

The first line reads:
71b36cfa618165da717aede1c2f08b0f6d02544f2cb3f169df8e03df6e22abd3 ./

Meaning, the root of this repository is at a state named by that hash.
That might be the state of my working tree though; I'm not 100%
confident either way.  Petr, could you please comment?
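As a rough illustration of why a single root hash can name a whole
tree, here is a minimal Python sketch of a Merkle-style tree hash.  It
is NOT the hashed-storage format; the function name `tree_hash`, the
nested-dict tree representation, and the entry encoding are all
assumptions made up for this example.

```python
import hashlib

def tree_hash(tree):
    """Hash a tree given as a nested dict (hypothetical representation).

    Keys are entry names; values are bytes (file contents) or another
    dict (a subdirectory).  Entries are visited in sorted order so the
    result does not depend on dict insertion order.
    """
    h = hashlib.sha256()
    for name in sorted(tree):
        entry = tree[name]
        if isinstance(entry, dict):
            kind, child = b"D", tree_hash(entry)
        else:
            kind, child = b"F", hashlib.sha256(entry).hexdigest()
        # One record per entry: kind, name, NUL, child hash, newline.
        h.update(kind + name.encode("utf-8") + b"\0"
                 + child.encode("ascii") + b"\n")
    return h.hexdigest()

# Two trees with the same contents, and one that differs in one file.
a = {"README": b"hello\n", "src": {"Main.hs": b"main = return ()\n"}}
b = {"src": {"Main.hs": b"main = return ()\n"}, "README": b"hello\n"}
c = {"README": b"hello!\n", "src": {"Main.hs": b"main = return ()\n"}}
```

Here `tree_hash(a)` equals `tree_hash(b)`, while `tree_hash(c)` differs
because a change in any file propagates up to the root hash.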

I could then communicate this label to other developers as a fast
check that we have the same set of changes in our repositories [1].
This can be done more easily than comparing contexts.  I would further
like to be able to tell other developers, "I'm at state X, and I see
the following [...]" and have those developers be able to type
something like [2]:
darcs move-to-state X

And their darcs would do one of two things [3]:
  1) Ideally recognize how to go from the currently active set and
order of patches to some other ordering and set of patches that
results in pristine state X.
  2) Tell me that X is unknown in this branch and that I likely need
patches that this branch has never seen before.
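
The two outcomes above can be sketched as a plain lookup that either
succeeds or fails immediately.  This is only a sketch under
assumptions: `move_to_state`, `UnknownState`, and the dict-based map
are hypothetical names for illustration, not existing darcs code.

```python
class UnknownState(Exception):
    """Raised when a state label is not in the local named state map."""

def move_to_state(named_states, label):
    """Return the patch ordering recorded for `label`, or fail fast.

    `named_states` is a hypothetical dict from state label to a list
    of patch names.  No search over subsets or orderings is done.
    """
    inventory = named_states.get(label)
    if inventory is None:
        raise UnknownState(
            "state %s is unknown in this branch; you likely need "
            "patches that this branch has never seen" % label)
    return list(inventory)

# Made-up labels and patch names, purely for illustration.
known = {"state-x": ["patch-a", "patch-b"]}
```

Calling `move_to_state(known, "state-x")` returns the recorded
ordering; asking for any other label raises `UnknownState` without
doing further work.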

The problem comes when the names of our pristine states differ.  Then
suddenly we have to do detective work to figure out why they are
different and how to make them match.  The key value added by this
proposed feature is the ability to "jump to" a named state quickly
and hassle-free.

By the way, Git supports this sort of "named state" navigation of the
version space and I think they've demonstrated that it has real value
for users.

= Rough Sketch of Implementation =

I was thinking that as darcs visits[4] pristine states it could
maintain/build a map from Hash -> Inventory, call it the "named state
map".  Perhaps when branches communicate they are able to union their
named state maps.  I expect these maps to be relatively small, but
that might require clever packing/compressing as inventories grow in
size and the number of visited states explodes combinatorially.
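
A minimal sketch of such a map and its union, assuming an inventory
can be represented as an ordered tuple of patch names.  The names
`record_state` and `union_maps` are hypothetical, and keeping the
local entry when both sides know a hash is just one possible policy.

```python
def record_state(state_map, state_hash, inventory):
    """Remember the inventory that produced `state_hash`.

    The first inventory seen for a hash is kept; later orderings that
    reach the same pristine state are ignored.
    """
    state_map.setdefault(state_hash, tuple(inventory))

def union_maps(ours, theirs):
    """Union two named state maps, preferring our own entries."""
    merged = dict(theirs)
    merged.update(ours)
    return merged

# Made-up hashes and patch names, purely for illustration.
ours = {}
record_state(ours, "hash-1", ["patch-a"])
record_state(ours, "hash-2", ["patch-a", "patch-b"])
theirs = {"hash-2": ("patch-b", "patch-a"),
          "hash-3": ("patch-a", "patch-c")}
merged = union_maps(ours, theirs)
```

After the union, `merged` knows all three states, and "hash-2" still
maps to the local ordering.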

I expect this next bit to be controversial, but keep reading for an
alternative.  The idea is that you can have patches (and inventories)
in your repository that are not the currently active ones.  I think
that's my preferred way to use this feature, but I would like it to be
non-destructive.  Visit a particular state, do something, then decide
you want to be back where you were so you go back to the previous
state/inventory.  This would need new machinery/commands and raise
questions about handling unrecorded changes.

I think a less controversial way (and one more consistent with the
current UI) would be to work analogously to 'darcs get foo --context=baz bar',
in that you always have to create a new branch in a new directory when
you use the "jump to state" feature.  This has the attractive benefit
of keeping with the existing branching model that darcs uses.

A possible pitfall for users is if darcs doesn't recognize a state in
its named state map.  Care is needed to ensure that error messages
give users the info they need to make the feature work and that it is
easy to update your repo to include the named states in other repos.

I think this is a place where we can reasonably avoid invoking patch
theory to construct the repo state.  It's sort of like having implicit
tags and doing a 'darcs get -t implicit_tag_X'.  We already know what
the requested state looks like, let's simply 'make it so'.

= Request for Feedback =
  * Is this silly?
  * Is it already possible?
  * What would you change?
  * What are must-haves and can't-stands for you if we had this feature?

= Positive Impact on other features =
  * Annotate can give out tree hashes (if that is useful)
  * People can compare pristine state by hash and actually synchronize
via hashes
  * Buildbot could report state instead of context

= Negative Impact on other features =

  * Will the size of named state maps grow too quickly?

If we go with being able to "jump to state" without creating a new branch:
  * Requires users to know which tree hash they are at relative to the
'tip' of their branch
  * Requires special handling of inventories and unrecorded changes


[1] Zooko has requested this feature on numerous occasions, and I
think we all deserve it.
[2] "move-to-state" is a terrible command name, please advise.
[3] This *needs* to fail fast or finish quickly.  It should definitely
not search all possible subsets and orderings of the repository.
[4] I don't know how to efficiently define "visiting" a state.
Roughly, my idea of "visited state" means any pristine state that has
existed in this branch or its parent at the time of 'get'.  Ideas?
