[darcs-users] Replacing the n^2 algorithm.

Michael Conrad conradme at email.uc.edu
Tue Jan 11 02:00:50 UTC 2005


Since this conflict merging problem seems to be the largest complaint
of new darcs users, it seems like we ought to talk about how to fix it
more often.  So, I'm trying to get the discussion started again.

I generated a test repo to see a N^2 conflict firsthand, and see if I
could see any way of resolving it that didn't involve the current merger
system.  I've shared my test repos (*0.1) but its easy enough to
generate an example like them.

Lets say that I create a new project:

(*1)

Then, a friend pulls from me so he can help with the development.
Then, someone posts a nifty source library on slashdot.
He sees it, and adds it to the project.

(*2)

I also see it, and add it to the project.  Additionally, I add a comment
 to foo.c

(*3)

Now, to keep things clear in my test repos, I will assume that a third
 person wants to help with the development.  He pulls from my friend,
 getting (*1) and (*2), and then pulls from me, and then gets the
 dreaded N^2 conflict (*4), which actually runs vary quickly in this
 simple example.  Also, notice the file name: its the same as the one
 in my original repo, although the contents are different.
 (I didn't think that was allowed?)

Anyway, to my point:
The problem is that files with duplicate names (and in this case,
contents) have been added to the repo.  The repos are in conflict.

Currently, darcs takes this odd approach of making automatic mergers
between conflicting directories, and putting them in the patch file
that caused them.  For files, it does the cvs-style merges.

Considering that darcs has no knowledge of whether these files are
actually meant to be the same, and considering that it doesn't really
know what to do about it, and considering that (in this situation, at
least) the files are left in an unrecorded state such that they must
be re-recorded, I think we can agree that this behavior isn't really
useful.
(I'm also thinking that things can't commute accross this mess
anymore, at least in this case making the whole merger thing kind of
pointless, but I'm not sure how to check.)

So, to offer something constructive to get discussion started, here's
my take on the situation:
    1.  There is no automatic way to resolve this such that it will
        always do what the user wants.
    2.  The files are either meant to be the same, or meant to be
        separate.
    3.  If the files are meant to be the same, and are the same, then
        simply recording this fact should be enough to let darcs know
        (in the future) that it is safe to commute accross the merger.
    4.  If the files are not meant to be the same, the user should be
        forced to rename one of them so that they don't conflict.
    5.  If the files are meant to be the same and are not compatible,
        then darcs should force the user to make them compatible, or
        unpull the patch.  Darcs should not allow any other operation
        until the conflict is resolved. (*0.2) Once the user has made
        the necessary changes, darcs can take these changes and write a
        patch containing them and declare the files to be merged.
    6.  The choices of the user ought to be written into the patch,
        rather than the resulting data.  To save space mostly, but
        also because I can't see how the choice could be a N^2 routine,
        and that would solve the problem.

I was going to elaborate quite a bit on point 5, but I got tangled up
in the logic of it all.  Basically I'd like darcs to give me the
"chance" to make my version compatible with the incoming patch,
(accomplished by returning me to the command line, and putting both
versions of the file/directory into the repo with mangled names, so
that I can merge them by hand and rename them to the official version)

Or let me declare "forget that one" to just use one version or the other.
(this could be accomplished by inserting "shim" patches on each side of
the offending patch which nullify the particular offending operation of
that patch)

Anyway, does this all make sense?  Am I forgetting something major?
Any major hurdles in this approach?
(And what about that deal below where the same patch name has different
contents?)

-Mike

0.1) The test repos (rather appropriately named, hehe):
http://conserv.silverdirk.com/testrepo/foo
http://conserv.silverdirk.com/testrepo/bar
http://conserv.silverdirk.com/testrepo/foobar  (bar, with foo merged in)

0.2) Don't let the user do other actions til resolved:
	If the user makes other changes, and records or pushes or pulls,
darcs might lose the ability to commute the shims to the sides of the
conflicted patch.  It really seems like the conflict should be resolved
immediately, and other changes shouldn't be recorded with it, maybe
even to the point of having darcs write a patch with an automatic title
that specifically contains only the shim patch data.

1) The first patch, which is sort of pointless.
20050110233413-46018-694c9da51a21fc52c4d07767ae03e5d37440e428.gz
  [Started Foo's Project
  silverdirk at conserv.silverdirk.com**20050110233413] {
  addfile ./Makefile
  hunk ./Makefile 1
  +foo: foo.c
  +       gcc -o foo foo.c
  addfile ./foo.c
  }

2) The second person's patch, adding the common code.
20050110234119-46018-99c1cb99abec58848be17699e097a540b6553b44.gz
  [Added a nifty utility library
  silverdirk at conserv.silverdirk.com**20050110234119] {
  adddir ./nifty_util
  addfile ./nifty_util/util.c
  hunk ./nifty_util/util.c 1
  +blah
  +blah
  +int foobar() {
  +       return -1;
  +}
  }

3) My patch, including common code.
20050110234359-46018-e3cd854f04d12d3a1e55dcdc5726db6af7949a40.gz
  [Added a comment to foo.c, and added a nifty utility library.
  silverdirk at conserv.silverdirk.com**20050110234359] {
  adddir ./nifty_util
  addfile ./nifty_util/util.c
  hunk ./foo.c 1
  +# Added a comment
  hunk ./nifty_util/util.c 1
  +blah
  +blah
  +int foobar() {
  +       return -1;
  +}
  }

4) The N^2 Conflict from pulling from repo #2, then from repo #1
And would I be corrent in saying that the merger patch is N^2 in size
 as well as time-to-generate?

20050110234359-46018-e3cd854f04d12d3a1e55dcdc5726db6af7949a40.gz
  [Added a comment to foo.c, and added a nifty utility library.
  silverdirk at conserv.silverdirk.com**20050110234359] {
  merger 0.0 (
  hunk ./nifty_util/util.c 1
  +blah
  +blah
  +int foobar() {
  +       return -1;
  +}
  merger 0.0 (
  addfile ./nifty_util/util.c
  merger 0.0 (
  adddir ./nifty_util
  adddir ./nifty_util
  )
  )
  )
  merger 0.0 (
  merger 0.0 (
  merger 0.0 (
  addfile ./nifty_util/util.c
  merger 0.0 (
  adddir ./nifty_util
  adddir ./nifty_util
  )
  )
  hunk ./nifty_util/util.c 1
  +blah
  +blah
  +int foobar() {
  +       return -1;
  +}
  )
  merger 0.0 (
  merger 0.0 (
  merger 0.0 (
  adddir ./nifty_util
  adddir ./nifty_util
  )
  addfile ./nifty_util/util.c
  )
  merger 0.0 (
  merger 0.0 (
  adddir ./nifty_util
  adddir ./nifty_util
  )
  addfile ./nifty_util/util.c
  )
  )
  merger 0.0 (
  merger 0.0 (
  addfile ./nifty_util/util.c
  merger 0.0 (
  adddir ./nifty_util
  adddir ./nifty_util
  )
  )
  hunk ./nifty_util/util.c 1
  +blah
  +blah
  +int foobar() {
  +       return -1;
  +}
  )
  )
  )
  }






More information about the darcs-users mailing list