[darcs-devel] Darcs and git: plan of action

Tue Apr 19 15:40:42 PDT 2005

(Sorry for the delayed reply -- I'm living on tape delay for a bit.)

On Mon, 2005-04-18 at 22:05 -0400, Kevin Smith wrote:
> >>>>The other is "replace every instance of identifier `foo` with identifier `bar`".
> >>>
> >>>That could be derived, however, by a particularly smart parser [1].
> >>
> >>No, it can't. Seriously. A darcs replace patch is encoded as rules, not
> >>effects, and it is impossible to derive the rules just by looking at the
> >>results. Not difficult. Impossible.
> >  
> > If I do a token replace in an editor (say one of those fancy new-fangled
> > refactoring thangs, or good ol' vi), a token-level comparator can
> > discover what I did. That link I sent is an example of one such beast.
> 
> The big feature of a darcs replace patch is that it works forward and
> backward in time.

That's *not* a feature of the token replace patch, however. That's a
feature of the darcs commutation machinery, correct? (With the obvious
caveat that darcs can only *do* the commutation if it has correctly
nuanced darcs-style token replace patches, rather than mere ASCII
textual diffs.)

> Let me try to come up with an example that can help
> explain it. Hopefully I'll get it right. Let's start with a file like
> this that exists in a project for which both you and I have darcs repos:
> 
> cat
> dog
> fish
> 
> Now, you change it to:
> 
> cat dog
> dog
> fish
> 
> while I simultaneously do a replace of "dog" with "plant", resulting in:
> 
> cat
> plant
> fish
> 
> We merge. The final result in both of our trees is:
> 
> cat plant
> plant
> fish

Okay, that all makes sense.
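
To make the example concrete, here is a minimal Python sketch. It is not
how darcs is implemented. It assumes tokens are runs of letters, digits,
and underscores, and it models the replace patch as a rule (the
hypothetical `apply_replace` function) that can be applied to whichever
version of the file it meets. Real darcs gets the same result by
commuting patches rather than by re-running a function.

```python
import re

def apply_replace(text, old, new):
    """Apply a token replace rule: every whole token `old` becomes `new`.

    Assumption: a token is a maximal run of [A-Za-z0-9_] characters.
    """
    pattern = r'(?<![A-Za-z0-9_])' + re.escape(old) + r'(?![A-Za-z0-9_])'
    # A function replacement keeps `new` literal (no backslash escapes).
    return re.sub(pattern, lambda m: new, text)

base = "cat\ndog\nfish\n"

# One side adds a second "dog" on the first line.
yours = "cat dog\ndog\nfish\n"

# The other side records the rule dog -> plant against the base.
mine = apply_replace(base, "dog", "plant")

# Merging applies the rule to the other side's version too, so the
# "dog" that did not exist when the rule was recorded is also replaced.
merged = apply_replace(yours, "dog", "plant")

print(mine)    # cat / plant / fish
print(merged)  # cat plant / plant / fish
```

A plain line diff of `base` against `mine` only says "line 2 changed",
which is why it could not rewrite the new "dog" on line 1.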

> Notice that just by looking at my diffs, you can't tell that I used a
> replace operation.

Here's where we disagree. If you checkpoint your tree before the
replace, and immediately after, the only differences in the
source-controlled files would be due to the replace. And since the
language of the file is known (and thereby the tokenization -- it *is*
well-defined), then a tokenizer that compares the before and after trees
(for just the files that changed, obviously), can discover what you did,
and promote the mere ASCII diff into a token-replace diff. (The same
sort of idea could be done for reindention, I'd hope.)
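
A rough sketch of such a comparator, in Python. The tokenizer, the
function name `infer_replace`, and the acceptance rules are all
assumptions made for illustration and are not taken from darcs or from
the tool linked earlier. It only reports a candidate rule that is
consistent with the two snapshots; it cannot tell whether the author
meant a replace or simply edited every occurrence by hand.

```python
import re

# Assumption: a token is a run of word characters, or any single other character.
TOKEN = re.compile(r'[A-Za-z0-9_]+|[^A-Za-z0-9_]')
WORD = re.compile(r'[A-Za-z0-9_]+')

def infer_replace(before, after):
    """Return (old, new) if `after` is `before` with one token renamed
    everywhere, otherwise None."""
    b = TOKEN.findall(before)
    a = TOKEN.findall(after)
    if len(b) != len(a):
        return None  # tokens were added or removed: not a pure replace
    changed = {(x, y) for x, y in zip(b, a) if x != y}
    if len(changed) != 1:
        return None  # no change, or more than one kind of change
    old, new = next(iter(changed))
    if not (WORD.fullmatch(old) and WORD.fullmatch(new)):
        return None  # only word-like tokens qualify
    if old in a or new in b:
        return None  # partial rename, or `new` already existed before
    return old, new

print(infer_replace("cat\ndog\nfish\n", "cat\nplant\nfish\n"))   # ('dog', 'plant')
print(infer_replace("cat\ndog\nfish\n", "cat dog\ndog\nfish\n")) # None
```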

> I didn't just replace the instances of "dog" that
> were in my file at that moment. I conceptually replaced all instances,
> including ones that aren't there yet.

Well yes, that's exactly what we want. And the key point of all of this
is that there's no magic here. The darcs machinery does all the
commutations such that the patches can wiggle together without
conflicts. To do its job, of course, it needs nuanced patches, rather
than the quite literal ones generated by diff.

We agree on everything except whether one can provably discover a
replace operation, given a before and after tree.

> Now, I should mention here that I personally dislike the replace
> operation, and I think it is more dangerous than helpful. However, other
> darcs users are quite happy with it, and it certainly is a creative and
> powerful feature.

It's creative alright, though I had the same misgivings. In my common
code workflow, I almost never have global tokens -- all my replaces
would be per function, so I never saw an opportunity to use it when I
was screwing around with darcs.

> Other creative patch types have also been dreamed of. For example, a
> powerful language-specific refactoring operation has been discussed as a
> far-future possibility. That would be safe, and cool.

<subliminal> indention patch type, indention patch type... </subliminal>

> > > Automated refactoring tools, for example, perform the
> > > rename+modify as an atomic operation.
> > [...]
> Although there are no such nifty refactoring tools available today, they
> will exist at some point.

Yeah, I spent some time drooling over the refactoring editors before
slapping myself and deciding I'd wait for others to live on that
bleeding edge for a while. I've had to clean up too much code from other
people.

> Even without tools, many shops have policies against checking in code
> that won't compile. If you rename a Java class, you must simultaneously
> perform the rename and modify the class name inside. If you commit
> between those steps, it's broken.

I'm trying hard to find a nice way to say that's silly. I'm failing. My
suggestion in that case would be that the local coder commit many
patches to a local repository, one of which is the rename. Then upon
completion of the refactoring, the set of patches is committed to the
group repository. Tags before and after preserve the repository's
precondition that it always compiles.

> [I do realize that the kernel doesn't have Java code, by the way.]

Don't worry, I didn't think that you did :-).

Ray