[darcs-users] sharing files

David Roundy droundy at abridgegame.org
Fri Jul 25 11:06:12 UTC 2003


On Thu, Jul 24, 2003 at 08:06:37PM -0700, John Meacham wrote:
> I think we are talking about different things.  file renames and
> additions are nothing more than edits in the _darcs/map text file. since
> each repository's _darcs/map file is different (has a different guid)
> changes to one repository's version will not be propagated to
> another. any patches which reference the remote _darcs/map will be
> handled like any other and the irrelevant parts (those referring to the
> guid you don't care about, the other map file) will be ignored.

I see.  I didn't think of that.

> > It seems to me that with these caveats you haven't gained a whole lot.
> > You could get as much by modifying darcs to ignore any FilePatches to
> > files that don't exist (rather than choking as it does
> > now--intentionally, since such a situation would indicate a bug).
> > You'd also have to modify the commute routine to accept a directory of
> > which files exist so that it would know that FilePatches to files that
> > don't exist always commute.
> 
> it can handle several situations that the current scheme cannot.  imagine
> a project forks, both forks add a 'foo.c' file, suddenly, they cannot
> pull each other's patches which involve foo.c since they would conflict.

Yes, if they simultaneously add a foo.c file, that would be annoying.  But
if one adds the foo.c first, and then the other pulls that foo.c, removes
it and adds their own, there would be no problem.  And this could be done
afterwards when someone realizes that they have a conflict (in a temporary
repo, for example).  Or if they want both, they could rename one rather
than deleting it.
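
As a rough illustration of that sequence, here is a toy model in Python.
It is only a sketch under simplifying assumptions: a tree is a plain dict
of path to contents, and the names AddFile, Move, RemoveFile, Conflict,
apply_patch and apply_all are hypothetical, not darcs's own patch types
or merge code.

```python
from dataclasses import dataclass


class Conflict(Exception):
    """Raised when a patch cannot be applied to the current tree."""


@dataclass(frozen=True)
class AddFile:
    path: str
    contents: str


@dataclass(frozen=True)
class Move:
    old: str
    new: str


@dataclass(frozen=True)
class RemoveFile:
    path: str


def apply_patch(tree, patch):
    """Return a new tree (dict of path -> contents) with the patch applied."""
    tree = dict(tree)
    if isinstance(patch, AddFile):
        if patch.path in tree:
            raise Conflict(f"{patch.path} already exists")
        tree[patch.path] = patch.contents
    elif isinstance(patch, Move):
        if patch.old not in tree or patch.new in tree:
            raise Conflict(f"cannot move {patch.old} to {patch.new}")
        tree[patch.new] = tree.pop(patch.old)
    elif isinstance(patch, RemoveFile):
        if patch.path not in tree:
            raise Conflict(f"{patch.path} does not exist")
        del tree[patch.path]
    return tree


def apply_all(tree, patches):
    """Apply patches in order, stopping at the first conflict."""
    for patch in patches:
        tree = apply_patch(tree, patch)
    return tree


theirs = AddFile("foo.c", "their version")
mine = AddFile("foo.c", "my version")

# Applying both adds one after the other collides on the name.
try:
    apply_all({}, [theirs, mine])
    collided = False
except Conflict:
    collided = True

# Pulling their add, renaming it, and then adding my own file works.
resolved = apply_all({}, [theirs, Move("foo.c", "foo_theirs.c"), mine])
```

In this model the rename (or a RemoveFile in its place) is what frees the
name before the second add is applied.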

> things get more complicated as projects branch, you would be unable to
> pull patches back and forth which should not conflict but do.  the
> solutions are not easy since they involve knowing beforehand that other
> people branching your project are not creating files of the same name so
> you can 'break up' patches into separate ones which modify the contested
> files independently, you then must remember to not pull patches which
> reference them.

No, the only time to "break up" patches is if you want to do funky shared
files between totally different projects.  For simple branching, all you
need to do is use darcs normally, deleting files you don't want, or
renaming them to avoid conflicts.

> with the system i was talking about, this would be a non-issue. every
> 'darcs add' creates a new unique file, independent of the name given.
> two developers doing 'darcs add foo.c' creates two distinct files with
> distinct histories.

In darcs, every darcs add also creates a unique file.  True, two
simultaneous darcs adds create a conflict, but once you realize that, you
can either rename or delete one of them, and everything will go along
merrily.

> > But on top of this, you'd start running into all sorts of other
> > problems.  What happens when you add a file to a repository and you've
> > already pulled patches that would have modified those files? Do those
> > old patches get reinterpreted? One of the basic ideas of darcs is that
> > a patch always means the same thing (regardless, for example, of the
> > order in which patches are pulled and merged).
> 
> darcs add would ALWAYS create a new unique file. There is no way to
> darcs add something which will conflict with something in another
> repository since they would have different guids. the only way to get
> the 'same' file from another repository is to explicitly 'pull' it in
> which grabs all appropriate patches which must exist in the other
> repository for the file being pulled to exist there in the first place.

My question was what happens when you decide (after the repository is
created) that you want to include a file that is already in a different
repo (which shouldn't create a new unique ID).  If you've already pulled
patches from that repo (to get another file), then those patches would need
to be reinterpreted now that you have the new file.

> arch's file ids are used differently than in PRCS AFAICT. the important
> thing is not the file-ids, but the ability to store meta-info (the entire
> directory tree layout) in its own editable text file and abstract the
> concept of a file's contents away from how it is used in anyone's
> particular repository. tags and preferences also fall easily into this
> scheme.

I guess there are two issues here.  Certainly Tom is very proud of
treating all of his repository data as simply files stored in the
repository, which I consider one of the biggest design flaws in arch, since
it hides the real difference between metadata and data.  Data is what your
user wants to store and retrieve.  Metadata is what you (the SCM) need in
order to do so.  

> > In practice I think that forcing people to make patches to a shared
> > file not include changes to other (unshared) files is a good thing.  A
> > patch should (and this isn't revision control, but best use of revision
> > control) include only one logical change.  If the file is shared then
> > any changes to it *can't* depend on changes to an unshared file,
> > although it may require changes to unshared files if, for example, you
> > change an API.  But if you change the API provided by a shared file,
> > all the repositories using that file but one will have to have two
> > patches anyways, one to change the API and one to support the change to
> > the API.  So there's no compelling reason to put the two into one patch
> > (although if it weren't shared I'd definitely want to do so).
> 
> but this requires knowledge a priori that a file will be shared. as well
> as communication between distributed branches.

Well, I do happen to think that communication between developers working
with the same code tends to be a plus, rather than a minus.  Yes, this does
mean (for example) that if you want to use FastPackedString.lhs (from
darcs) you'd have to let me know so that I could separate its patches.  But
if it is going to be used as a separate library, it would be a good idea
for me to know that so I won't keep breaking it and changing its behavior,
or at least when I do I would put notes to that effect in the patch
comments.

And no, it doesn't require a priori knowledge that a file will be shared,
if you don't mind breaking the file's history.  Normally I wouldn't advocate
breaking a file's history, but in this case it seems reasonable.

> the safe way to do things is to create one patch per file which leads to
> patchspace crowding and a lot of effort on the developer's part to
> remember which patches they should not apply.

Darcs is designed to serve whole trees.  That is its purpose.  Sharing
single files is the exception rather than the rule, and should be done
deliberately.  Very few files in any given project are even appropriate to
be added to another project.

> > So to summarize what I think is the most important reason, I guess the
> > only advantage I see in your proposal would be for shared files (or
> > nested repos, which I haven't addressed), and the advantage here would
> > be simply to allow you to make a single patch modifying both shared and
> > unshared files.  Since I think this would be bad practice, it doesn't
> > stand up as an argument in favor of such a change.
> 
> I think we were talking about slightly different things, arch does not
> use file-id's to their real potential (among other things). part of the
> reasons i looked for something better, darcs :)

It sounds like what you are interested in is something that has a model
more similar to subversion (and in a sense, CVS).  In svn the repository
holds a sort of collection of files, each of which is version controlled,
some subset of which you may be interested in.  The repository holds many
different projects (potentially) and files can be included in any number of
projects and moved back and forth.  This relies on there being a single
monolithic repository.  And I don't mean to suggest you are interested in a
monolithic repository, I mean that you are thinking the way svn folks think
(which isn't bad), of there being a bunch of version controlled files, only
without the centralized repository.  (I should also warn that I have never
used svn or really researched it closely, so my impression could be wrong.)

Darcs does not take the view that there are just a bunch of files out
there.  The focus is not on the files, but on the changes.  You can pick
and choose changes to pull into your repository, but you can't pick and
choose files (except that you could always delete the files after pulling
them in...).  A change is intended to be a single logical unit.  It is poor
usage to include orthogonal changes in a single patch, and what you seem to
be advocating is what I would call poor usage of darcs.
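
A minimal sketch of "pick changes, not files", again as a toy Python
model rather than anything darcs actually does internally. It assumes a
patch is just a name plus a set of whole-file edits and ignores
dependencies between patches. Patch and pull are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    name: str
    edits: tuple  # ((path, new_contents), ...)


def pull(tree, available, wanted):
    """Apply every available patch whose name is in `wanted`.

    Selection is per patch: a chosen patch brings all of its edits,
    whichever files they touch.
    """
    result = dict(tree)
    for patch in available:
        if patch.name in wanted:
            for path, contents in patch.edits:
                result[path] = contents
    return result


available = [
    Patch("fix shared parser", (("shared.c", "parser v2"),)),
    Patch("mixed change", (("shared.c", "parser v3"), ("local.c", "caller v2"))),
]

# A patch holding one logical change can be pulled on its own.
only_parser = pull({}, available, {"fix shared parser"})

# A patch mixing shared and unshared edits drags local.c along with it.
mixed = pull({}, available, {"mixed change"})
```

The second pull shows why a patch that mixes orthogonal changes is
awkward for anyone who wants only the shared part.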

Just so you know, it also is totally contrary to the design and
implementation of darcs.  So while I'm happy to try to convince you that
darcs' design is better, I don't want to give you the impression that if
you try hard enough you can convince me to implement your suggestions in
darcs.  Even if I were convinced that your ideas would have been a better
way to do things, I still wouldn't implement them in darcs.  It would be
easier to just create a new SCM tool.
-- 
David Roundy
http://www.abridgegame.org



