[darcs-devel] patch domains idea

Thu Jan 27 05:39:03 PST 2005

I have an idea, which originated from an off-list discussion with Shae,
which I wanted to put forward.  It's pretty much theoretical at this stage,
and I don't have implementation time for it any time soon.  Shae expressed
interest in perhaps working on this.

The idea is called "patch domains".  A patch domain is a description of
"what a patch acts on".  Like the concept of "patches" and "trees" in the
existing patch theory, a "domain" is an abstract concept, but with
reasonably strict laws governing the behavior of any implementation.

A domain is simply a coarse-grained description of what is modified by a
patch or set of patches.  There are few operations one can make on domains
with domains, and a few operations between domains and patches.  I'll write
types in haskell syntax

1. nullDomain :: Domain

This is the domain of an empty patch.

2. domainOf :: Patch -> Domain

Gives the domain of a given patch.

3. extendDomain :: Patch -> Domain -> Domain

This extends a Domain by "applying" a patch, so

domainOf (ComP (p:ps)) = dof (reverse ps) (domainOf p)
  where dof [] d = d
        dof (x:xs) d = dof xs (x `extendDomain` d)

4. intersects :: Domain -> Domain -> Bool

indicates whether two domains intersect.  It is always true that

if (domainOf p1 `intersects` domainOf p2)
  then commute (p1, p2) == Just (p2, p1)
  else True

5. commuteDomain :: (Patch, Domain) -> Maybe Domain

such that

case commute (p2, p1) of
Just (p1', p2') -> commuteDomain (p2, domainOf p1) == domainOf p1'

One could work out a few more laws (involving, for example, inverses) for
domains based on the laws for patches, but I think these five properties
may be sufficient.  I guess one would probably also want an invertDomain,
although perhaps it should be defined to be identity.

In practice, I figure that domains will consist of a list of filepaths
affected (or rather a FiniteMap or Data.Map of FileNames).  Given a domain
algebra (I may be using the wrong term here, I'm thinking something like
the Lie algebra), I think one could work out a set of efficient commutation
algorithms which would actually be independent of the actual choice of
domains.

Obviously there is the trivial "domain space" in which all domains
intersect, which obeys all these laws.  For debugging purposes, an
implemntation with "pluggable domains" might be a good idea.  Then one
could experiment with different domain realizations with the same set of
unit tests.  e.g. perhaps for seriously large trees one might want to
truncate the filepaths after two or three nested directories.

Anyhow, I thought I'd get these ideas out in there for anyone who's
interested.  If you haven't yet caught on (I'm not sure I was quite clear),
the idea here is to deal with the issue, for example, that the commutation
of two patch sequences of N patches each is currently an O(N^2) process.
It's not too slow currently, but with care (and patch domains) it could be
made an O(NlogN) process in many common cases (it'll always be O(N^2) in
the worst-case scenario).  And if we store and read the patch domains
(which isn't at all a bad idea--it could massively speed up annotate, for
example), perhaps patch commutation could be O(1) in the best cases and
O(logN) in other common cases.
-- 
David Roundy
http://www.darcs.net