[darcs-users] darcs conflicts/dependencies -- is patch theory the place to start?

AntC anthony_clayden at clear.net.nz
Tue Sep 18 03:47:57 UTC 2012


Kevin Quick <quick <at> sparq.org> writes:

> On Sun, 16 Sep 2012 01:37:15 -0700, AntC <anthony_clayden <at> clear.net.nz>
> wrote:
> 
> > ... test for observational equivalence ...
> 
> I believe the effort to determine equivalence becomes exponentially hard.
> ...

Thanks Kevin, I hadn't considered the exponential search side of it, but that 
strikes me as a solid pragmatic reason for preferring something more like the 
L/S/L model. (I certainly don't want to have to implement the process you 
describe of scanning back through the branch to see where a file came from, 
and guess whether two files are 'really' the same, especially where there's a 
bunch of intervening operations, including file add/removes.)

If the VCS gives each file an 'internal' id that is:
- unique across all repos; and
- persistent wherever the file goes, or however its location/name changes.

Then we don't have to go searching to establish pre-conditions on operations 
that touch the file -- we just ask if that file id exists in the target repo.

To spell out the consequences:
- if two repos each have a file at the same dir/name,
  those are not the same file
  unless one repo pulled the addfile from t'other
  (or both pulled from some common repo)
- if the programmer changes the dir/name of a file
  they must use darcs-like explicit move-file
  (or perhaps git-like detection, with a prompt for 'is that what you mean?')
  Otherwise the VCS will take it as remove and add of unrelated files.
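To make those consequences concrete, here is a minimal sketch in Python. It is not how darcs works; the Repo class, its methods, and the choice of (origin repo, patch id) as the file id are all hypothetical, made up only to illustrate the idea that identity follows the id and not the path.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical repo: only tracks which file ids it knows, and where they live."""
    name: str
    files: dict = field(default_factory=dict)  # file id -> current dir/name

    def add_file(self, path, patch_id):
        # Assumption: (origin repo, id of the adding patch) is unique across all repos.
        file_id = (self.name, patch_id)
        self.files[file_id] = path
        return file_id

    def pull_add(self, other, file_id):
        # Pulling the addfile brings the id along with the path.
        self.files[file_id] = other.files[file_id]

    def move_file(self, file_id, new_path):
        # Explicit move: the path changes, the id persists.
        self.files[file_id] = new_path

    def can_apply(self, file_id):
        # Pre-condition for any operation touching the file: no searching,
        # just ask whether the id exists in this repo.
        return file_id in self.files


def same_file(repo_a, repo_b, path):
    """True only if both repos hold the *same id* at that dir/name."""
    ids_a = {fid for fid, p in repo_a.files.items() if p == path}
    ids_b = {fid for fid, p in repo_b.files.items() if p == path}
    return bool(ids_a & ids_b)


a, b, c = Repo("A"), Repo("B"), Repo("C")
f = a.add_file("src/G", "p1")
b.pull_add(a, f)               # B pulled the addfile from A
g = c.add_file("src/G", "p1")  # C added an unrelated file at the same dir/name
```

With that setup, same_file(a, b, "src/G") holds but same_file(a, c, "src/G") does not, and a change to f is applicable in B even after b.move_file(f, "lib/G"), but never in C.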

This suggests we need an implementation where the VCS maintains an 'internal' 
map of file id <-> directory/name. At each record point, validate the map to 
detect added/moved/removed files and directories.
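As a rough sketch of what validating the map at a record point might look like (Python again; diff_maps and duplicate_paths are names invented for illustration, and I'm assuming explicit moves have already updated the working map):

```python
def diff_maps(recorded, working):
    """Compare the map at the last record point with the current one.

    Both arguments are dicts of file id -> dir/name.
    Returns (added, removed, moved), each keyed by file id.
    """
    added = {fid: working[fid] for fid in working.keys() - recorded.keys()}
    removed = {fid: recorded[fid] for fid in recorded.keys() - working.keys()}
    moved = {
        fid: (recorded[fid], working[fid])
        for fid in recorded.keys() & working.keys()
        if recorded[fid] != working[fid]
    }
    return added, removed, moved


def duplicate_paths(working):
    """The map must be one-to-one: report any dir/name claimed by two ids."""
    seen, dups = set(), set()
    for path in working.values():
        if path in seen:
            dups.add(path)
        seen.add(path)
    return dups


recorded = {"id1": "src/F", "id2": "src/G", "id3": "README"}
working = {"id1": "src/F", "id2": "lib/G", "id4": "README"}

added, removed, moved = diff_maps(recorded, working)
```

Here id2 shows up as a move, whereas README shows up as a remove (id3) plus an add (id4) of unrelated files, because nobody told the VCS it was the same file.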

I'm suggesting the file id be the ppid of when it got added, to help with the 
book-keeping. (I'm assuming this can also tell us in which repo the file 
started life.)

It's detecting at record points that the L/S/L paper doesn't tackle. At file 
system level, detecting and re-mapping file ids is relatively (!) 
straightforward. This discussion is all by way of warming-up for dealing with 
hunk changes, where we need to implement some sort of line-id, and detect line 
movements.

Stephen's suggested this line-level tracking might get into 
recursive/combinatorial/exponential trouble in following a line's identity. So 
avoiding that is what I'm scratching my head about.

> 
> > Possibly we could expose the non-equivalence to the programmer even
> > before pulling the hunk change, by the VCS linking B's file G to F to
> > branch A, but not linking C's file G.
> 
> Explicit dependencies like that (which are normally impossible or at best  
> over-restrictive in darcs because it forces explicit repo relationships) ...

Could you explain a bit more what you mean by 'explicit repo relationships', 
and what's bad about them?

Pulling a patch from one repo to another sets up a relationship anyway (so I 
understand?). One purpose of that is to detect duplicates. (That is, dependent 
upon patches already pulled to the target -- from Owen's description.) I don't 
think anything about ppid-as-file-id gets in the way of repos working 
standalone; that is until you want to pull/merge patches -- at which point 
surely a repo relationship and restrictions are exactly what you want(?)

 
AntC



