[darcs-users] Fwd: darcs patch: test that tentative leftover is cleared (and 2 more)

Kamil Dworakowski kamil at dworakowski.name
Sat Sep 12 22:37:22 UTC 2009

This was intented to go to the list, but went to Jason only due to me
clicking on reply instead of reply to all (old habits never die).

---------- Forwarded message ----------
From: Kamil Dworakowski <kamil at dworakowski.name>
Date: Sun, Sep 6, 2009 at 1:03 PM
Subject: Re: [darcs-users] darcs patch: test that tentative leftover
is cleared (and 2 more)
To: dagit at codersbase.com

On Sun, Sep 6, 2009 at 5:05 AM, Jason Dagit<dagit at codersbase.com> wrote:
> Kamil,
> You've really tackled an unruly bit of code.  It's forced me to read over
> this part and really try to understand it.  I hope my understanding is
> correct and hopefully you're also learning a lot from this too.  This
> version is better than the last, but I suspect I've found some problems with
> it (see below).


Thanks for the time taken on this review. Below I write how I
understand the transactions in darcs, which I hope will help to
resolve some of the doubts you have about the patch.

Tentative state corresponds to the real state of the repository. The
representation is the same except for the tentative_pristine, which is
stored as a sequence of patches, a diff from the real pristine.

Tentative state is cleared, or more precisely copied from the real
repo state, before a transaction is executed. This is done by
withRepoLock, which is used in amend-record command for example.
withRepoLock calls revertRepositoryChanges, which in turn calls

Operations in a transaction (one such operation is tentatvelyRemovePatches)
should operate on the tentative state only.

The test command is executed on a tentative version of the repository.
If the test fails then darcs simply exits with a failure code (no
cleanup to tentative happens at this point).

A transaction is committed by finalize_tentative_changes called by
finalizeRepositoryChanges, which you can see used in the
amend-record command's source code.

The failure described in the bug report was caused by one of the
functions called by tentativeRemovePatches, namely
DarcsRepo.remove_from_tentative_inventory, changing the real state of
the repository. As it was written it was preserving consistency but
breaking atomicity of any transaction it was used in. I have
rewritten the function to make changes only to the tentative state and
thus preserve both properties.

>> hunk ./src/Darcs/Repository/DarcsRepo.lhs 202
>> remove_from_tentative_inventory :: RepoPatch p => Bool -> [DarcsFlag] ->
>> FL (Named p) C(x y) -> IO ()
>> remove_from_tentative_inventory update_pristine opts to_remove =
>> -    do finalize_tentative_changes
>> -       Sealed allpatches <- read_repo opts "."
>> +    do Sealed allpatches <- read_tentative_repo opts "."
>>        skipped :< unmodified <- return $ commute_to_end (unsafeCoerceP
>> to_remove) allpatches
>>        sequence_ $ mapFL (write_patch opts) skipped
>> hunk ./src/Darcs/Repository/DarcsRepo.lhs 205
>> -       write_inventory "." $ deep_optimize_patchset
>> -                           $ mapRL_RL n2pia (reverseFL skipped) :<:
>> unmodified
>> -       remove_from_checkpoint_inventory to_remove
>> -       when update_pristine $
>> -            do pris <- identifyPristine
>> -               repairable $ applyPristine pris
>> -                              $ progressFL "Applying inverse to pristine"
>> $ invert to_remove
>> -       revert_tentative_changes
>> +       write_tentative_inventory "." $ deep_optimize_patchset
>> +                                     $ mapRL_RL n2pia (reverseFL skipped)
>> :<: unmodified
>> +       when update_pristine $ add_to_tentative_pristine
>> +                            $ progressFL "Applying inverse to pristine" $
>> invert to_remove
>> +       remove_from_tentative_checkpoint_inventory to_remove
> Some of these lines have been reordered and some now work with tentative
> variants of the inventory.  The stuff that has been reordered is applying to
> pristine and removing patches from the checkpoint.  I suspect it's better to
> update all inventories and then change pristine to match.  Imagine if your
> computer rebooted between those two operations.  Which order would leave
> your repository in a repairable state?  I'm not certain, but my intuition
> says that you can repair your pristine more easily than your inventory
> (using darcs repair).  Also, what happens if we fail during
> add_to_tentative_pristine?  In that case we won't make it to
> remove_from_tentative_checkpoint_inventory, but we will have already updated
> the tentative_inventory.  I think this means that the tentative_inventory
> will be out of sync with the checkpoint's tentative_inventory.  Therefore, I
> think the last thing this function should do is add_to_tentative_pristine.

The consistency of the tentative state after a failure does not
matter. The tentative state is cleared before every transaction, see
my description above.

> Looking at add_to_tentative_pristine, I see that it updates a file
> (tentative_pristine) instead of the actual pristine.  I guess that makes it
> safer than what I was talking about above.  It could be safer yet by writing
> to a new file and then move that to be tentative_pristine.

That would break consistency. remove_from_tentative_inventory should
not assume that it is the only or first operation in a transaction
(which actually holds, but nevertheless). There might have been an
arbitrary number of operations on the tentative state before it, each
of which could have modified the tentative_pristine. In other words,
we should update the tentative_pristine, not create it based on the
real pristine, which would drop all the changes previous operations in
the transaction made.

> If I recall
> correctly that's how you can do atomic file changes on unix.  Not sure how
> well that works on windows.  We already do this in
> finalize_tentative_changes for inventories.

I am not sure if we need the updating of the tentative_pristine file
to be atomic. We want to have atomicity, and other ACID properties, on
the level of performing an amend-record for example. Atomicity of
updating the tentative_pristing file does not matter for the ACID
properties of the of darcs transactions.

> Furthermore, I just realized with these changes we may actually call
> write_patch on patches that have yet to be recorded.  How could this happen?
>  Suppose there are patches in tentative_inventory that are not in to_remove.
>  The first thing we do is to read everything in the repo +
> tentative_inventory.  This would give us a sequence containing all the
> tentative patches (including the patches in to_remove) and the rest of the
> patches in the repo.  Then we commute to_remove to the end so they can be
> safely discarded.  But, some of the tentative patches which are not in the
> inventory and not to be removed could potentially also be in the sequence
> named 'skipped'.  Then we write out that whole sequence, potentially
> including patches which should be only be in a tentative_inventory but now
> they would be stored with the other patches yet do not appear in the
> inventory.  This essentially creates orphaned patches unless we just happen
> to record them soon.

Yes. Doesn't darcs optimize remove orphaned patches? I just assumed it
is the way with darcs because obliterate does not make any effort to
remove orphaned patches either. But now I can't easily find anything
in the code that does it. This might be a separate issue.

> I wouldn't have guessed this to be so tricky.

Indeed. The bug looked easy to begin with, but the code changes required
are quite extensive.


More information about the darcs-users mailing list