[darcs-users] [patch304] Reduce DarcsFlag usage in Darcs.Patch and Darcs.Repository

Petr Rockai me at mornfall.net
Fri Jul 16 20:56:47 UTC 2010


Eric Kow <kowey at darcs.net> writes:

> It's also a sign of why we should be a bit careful when making these
> kinds of changes.  I imagine that just passing in (diffingOpts opts)
> would have had the same effect.
It would, but as you point out, would be less transparent.

> Other slightly interesting cases.  The revert and unrevert take
> --ignore-times (for directory diffing), but not --look-for-adds

> (fst (diffingOpts opts), ScanKnown)?
> (second (const ScanKnown) (diffingOpts opts))?

The former looks OK, I guess. I am not opposed. Separate cleanup patch
welcome.

>> +data UseIndex = UseIndex | IgnoreIndex
>> +data ScanKnown = ScanKnown | ScanAll
>> +diffingOpts :: [DarcsFlag] -> (UseIndex, ScanKnown)
>> +diffingOpts opts = (if willIgnoreTimes opts then IgnoreIndex else UseIndex,
>> +                    if LookForAdds `elem` opts then ScanAll else ScanKnown)
>
> As you might imagine, I don't have any strong feelings on the matter,
> but I thought it would be useful for me to make a note of the choices
> you made here
>
> * Would naming the ADTs after one of the mutually exclusive choices
>   would lead to confused Darcs hackers?

I don't know. It's not easy to name those things. The best rationale I
can give is that I failed to come up with anything better. Using
synonyms to avoid a type/ctor name coincidence seems like a bad idea to
me (more things to remember). The result would be something like

data WantIndex = UseIndex | IgnoreIndex
data WhatToScan = ScanKnown | ScanAll

... probably not any better, maybe worse.

> * Also either --ignoretimes or UseIndex may be misleading names.
>   Which is it?  What is the effect of UseIndex when the repository
>   does not have an index (say it's old-fashioned)?  Then again, I
>   suppose the idea is that for hashed repositories, all repositories
>   have an index; they just don't know it yet.

Even old-fashioned repos have an index. It's just not very useful. The
implication is as if --ignore-times / IgnoreIndex was always in
effect. This can be remedied, but I don't have time. Volunteers are
however welcome to do so. (The solution is to keep a second index for
the pristine.)

> * I notice that you chose an core-facing implementation-oriented 
>   name (talking about indexes) and not a user-oriented name
>   (ignores the times on files)

Yes, since this is eventually going to be part of the libdarcs API.

>> hunk ./src/Darcs/Repository.hs 392
>>  addToPending :: RepoPatch p => Repository p C(r u t) -> FL Prim C(u y) -> IO ()
>>  addToPending (Repo _ opts _ _) _ | NoUpdateWorking `elem` opts = return ()
>>  addToPending repo@(Repo _ opts _ _) p =
>> -    do pend <- unrecordedChanges opts repo []
>> +    do pend <- unrecordedChanges (UseIndex, ScanKnown) repo []
>
> Again, here's a potential win from not relying on opts (I'm not sure
> where the opts that are baked into a repo come from.  Are they limited
> to some repository-specific preferences like the pristine type?)

I think it's the opts you pass to withRepository -- which is usually the
same thing that you get everywhere else. I want to remove them too,
though... They totally don't belong there.

> I'm guessing this is used for things like darcs pull and apply where
> patch effects have to be merged into the working directory

Not only working: pristine as well, and also it does the actual us /
them merging. This is the general high-level interface to merge. Used by
apply and pull as you say.

>> hunk ./src/Darcs/SelectChanges.hs 861
>> -          unrec <- fmap n2pia . (anonymous . fromPrims) =<< unrecordedChanges [] repository []
>> +          unrec <- fmap n2pia . (anonymous . fromPrims)
>> +                     =<< unrecordedChanges (UseIndex, ScanKnown) repository []
>
> Replacing the one sort of default [] with another.
>

> Makes me wonder if we need to store the pair (UseIndex, ScanKnown) in
> some sort of common variables.

Probably not worth it. Just makes the code less transparent in this
particular case, IMO.

> You might be able to make the case that --nolinks should simply be
> made to work with hashed repositories.  My natural inclination is,
> in case of doubt, to prune and simplify but I think it would only be
> fair if Kevin had a chance to weigh in first.

We would need a volunteer for that, though. Personally, I think it'd be
better served by a posthook.

>> -      copyFileOrUrl [NoLinks] (repodir </> prefsRelPath)
>> -         prefsRelPath Uncachable `catchall` return ()
>> +      (fetchFilePS (repodir </> prefsRelPath) Uncachable >>= B.writeFile prefsRelPath)
>> +       `catchall` return ()
>
> Wouldn't some sort of cloneFileOrUrl be generally useful instead?

Probably no, since this is the only place I found it used. Although
maybe there are such places elsewhere, implemented differently.

> Interestingly, in the 2007 threads I remarked that Kevin's work introduced a
> lot of splatter (functions that had to be updated to thread opts through).
> And in this patch, we only remove a very small amount of the splatter.  Why?
> It seems that we still need the threading of [DarcsFlag] so that we can keep
> track of the new --remote-darcs option that we later introduced.  Hmm!

Partially because I didn't remove everything I could, just what I
spotted.

> Remove [DarcsFlag] parameters from apply.
> -----------------------------------------
> The --set-scripts-executable flag is used as a sort of workaround for
> the fact that Darcs does not do any sort of permissions or metadata
> tracking.
>
> Commands which apply patches (get, pull, apply) etc could be instructed
> to look for files that start with a shebang and set those to executable.
> It's not a very exact way of doing things but gets the job done for some
> folks (limitations: no obliterate/rollback support, and perhaps no
> support for eg. adding a shebang line to a file that wasn't a script
> before)
>
> We have two ways of implementing this. For get/put, we had this
> setScriptsExecutable function which just traverses the working directory
> and hunts for scripts.  For apply/pull, we had an apply function which
> inspects hunk patches affecting the first line of a file.

> Petr's patch switch everything to the first approach.  From the user
> perspective, one small change is that now applying any file which
> affects a shebang file will cause that file to be set executable
> (whereas previously it was just patches that added a shebang first
> line).  I think this is the sort of change that nobody is actually going
> to notice and is more consistent anyway.
This is not true -- the behaviour did not change. There's
setScriptsExecutable like before, and makeScriptsExecutable which takes
a patch (sequence): the latter is used in place of the hack previously
residing in apply -- it extracts the paths that were touched by a given
patch and only looks at those files to check whether they need to be
made executables.

> From the code perspective, now when you apply patches to the working
> directory, you have to make a conscious effort to call
> setScriptsExecutable.  I'm guessing this is acceptable because the
> number of places where we actually apply patches to the working
> directory is small?
Well, it's certainly more acceptable than making apply, which is the
core of the core, aware of this convenience hack... Of course, a later
refactor can produce a nicer applyToWorking that would take this into
account. Maybe even more consolidated, like applyToRepo. Nevertheless,
whatever it is, it should live in the Commands layer. Marking scripts as
executable is certainly not a fundamental darcs concept...

> Clarification requested
> ~~~~~~~~~~~~~~~~~~~~~~~
> 1. I think you may be missing a makeScriptsExecutable in Darcs.Commands.Convert
Not needed, since
      when (SetScriptsExecutable `elem` opts) $ setScriptsExecutable
is called later in the sequence, which makes any previous
makeScriptsExecutable moot.

> 2. Does copyPackedRepository in Darcs.Repository need this too?
Probably no, since none of the other copy*Repository do this. This is
however another topic for a far-fetched refactor: the repository copying
code is unsightly and sprawled.

> 3. repoRepo stuff?
>
>> -readRepo :: RepoPatch p => [DarcsFlag] -> String -> IO (SealedPatchSet p C(Origin))
>> -readRepo _ d = do
>> +readRepo :: RepoPatch p => String -> IO (SealedPatchSet p C(Origin))
>> +readRepo d = do
>
>> -readTentativeRepo :: RepoPatch p => [DarcsFlag] -> String -> IO (SealedPatchSet p C(Origin))
>> -readTentativeRepo _ d = do
>> +readTentativeRepo :: RepoPatch p => String -> IO (SealedPatchSet p C(Origin))
>> +readTentativeRepo d = do
>
> While this and its consequences look good, I wonder: is this related to the
> apply stuff in any way?  Perhaps this change should be made in a separate
> patch?

Dunno. Probably not very related. It was enough work already to sort
this out into this many patches though, so it's no wonder some things
ended up a bit mingled.

>> -trackBisect :: (Invert p, ShowPatch p, Apply p) => [DarcsFlag] -> IO ExitCode -> RL p C(x y) -> IO ()
>> +trackBisect :: (Patchy p) => [DarcsFlag] -> IO ExitCode -> RL p C(x y) -> IO ()
>
>> -patchTreeFromRL :: (Invert p, ShowPatch p, Apply p) => RL p C(x y) -> PatchTree p C(x y)
>> +patchTreeFromRL :: (Patchy p) => RL p C(x y) -> PatchTree p C(x y)
>
>> -patchTree2RL :: (Invert p) => PatchTree p C(x y) -> RL p C(x y)
>> +patchTree2RL :: (Patchy p) => PatchTree p C(x y) -> RL p C(x y)
>
>> -trackNextBisect :: (Invert p, ShowPatch p, Apply p) => [DarcsFlag] -> BisectState -> IO ExitCode -> BisectDir -> PatchTree p C(x y) -> IO ()
>> +trackNextBisect :: (Patchy p) => [DarcsFlag] -> BisectState -> IO ExitCode -> BisectDir -> PatchTree p C(x y) -> IO ()
>
> Do these changes really belong in the patch?  Also do we really want to
> effectively increase the number of constraints on the patch types?
> Since this high-level code, I guess it doesn't really matter and simpler
> is better..
Not irrelevant, see below.

> 5. question about trackdown --bisect
>> -jumpHalfOnRight :: (Invert p, ShowPatch p, Apply p) => [DarcsFlag] -> PatchTree p C(x y) -> IO ()
>> -jumpHalfOnRight opts l = unapplyRL opts (patchTree2RL l)
>> +jumpHalfOnRight :: (Patchy p) => [DarcsFlag] -> PatchTree p C(x y) -> IO ()
>> +jumpHalfOnRight opts l = unapplyRL ps -- >> makeScriptsExecutable opts ps
>> +  where ps = patchTree2RL l
>
>> hunk ./src/Darcs/Commands/TrackDown.lhs 173
>> -jumpHalfOnLeft :: (Invert p, ShowPatch p, Apply p) => [DarcsFlag] -> PatchTree p C(x y) -> IO ()
>> -jumpHalfOnLeft  opts r = applyRL opts (patchTree2RL r)
>> +jumpHalfOnLeft :: (Patchy p) => [DarcsFlag] -> PatchTree p C(x y) -> IO ()
>> +jumpHalfOnLeft opts r = applyRL p -- >> makeScriptsExecutable opts p
>> +  where p = patchTree2RL r
>
> Why aren't we calling makeScriptsExecutable here? Petr definitely noticed that
> we ought to do it and we did seem to be threading the opts through in the past.
Right, we ought and I wanted to, but GHC gave me super-mysterious error
about constraints (even though I believe that with the about
non-irrelevant change to Patchy, the contexts all match up, GHC
disagrees and wants Conflict added to some contexts... I have no idea
why. I'll look into it in a bit...)

[snip]

> Back to work
> ~~~~~~~~~~~~
>> hunk ./src/Darcs/Commands/Apply.lhs 200
>>      withSignalsBlocked $ do finalizeRepositoryChanges repository
>>                              applyToWorking repository opts pw `catch` \(e :: SomeException) ->
>>                                  fail ("Error applying patch to working dir:\n" ++ show e)
>> +                            makeScriptsExecutable opts pw
>
>> hunk ./src/Darcs/Commands/Get.lhs 298
>>             do tentativelyRemovePatches repository opts us'
>>                tentativelyAddToPending repository opts $ invert $ effect us'
>>                finalizeRepositoryChanges repository
>> -              apply opts (invert $ effect ps) `catch` \e ->
>> +              apply (invert $ effect ps) `catch` \e ->
> ...
>> +              makeScriptsExecutable opts (invert $ effect ps)
>
> This must the case where we're getting --to-match and unapplying patches.
Indeed.

>> -  Sealed local_patches <- DR.readRepo opts "." :: IO (SealedPatchSet Patch C(Origin))
>> +  Sealed local_patches <- DR.readRepo "." :: IO (SealedPatchSet Patch C(Origin))
>
>> hunk ./src/Darcs/Commands/Get.lhs 348
>> -                  apply opts p_ch `catch`
>> +                  apply p_ch `catch`
>>                        \e -> fail ("Bad checkpoint!!!\n" ++ prettyError e)
>> hunk ./src/Darcs/Commands/Get.lhs 350
>> -                  applyPatches opts (reverseRL needed_patches)
>> -          else applyPatches opts $ reverseRL $ newset2RL local_patches
>> +                  applyPatches (reverseRL needed_patches)
>> +                  when (SetScriptsExecutable `elem` opts) setScriptsExecutable
>> +          else applyPatches $ reverseRL $ newset2RL local_patches
>
> This block of code pertains to applying patches after fetching the
> partial repository checkpoint.
Aye. It's all a total mess. :(

> Replace some [DarcsFlag] uses with newly introduced RemoteDarcs.
> ----------------------------------------------------------------
>
> Request for future work?
> ~~~~~~~~~~~~~~~~~~~~~~~~
>> hunk ./src/Ssh.hs 4
>> -copySSH :: String -> String -> FilePath -> IO ()
>> -copySSH rdarcs uRaw f = withSSHConnection rdarcs uRaw (\c -> grabSSH uRaw c >>= B.writeFile f) $
>> +copySSH :: RemoteDarcs -> String -> FilePath -> IO ()
>> +copySSH remote uRaw f | rdarcs <- remoteDarcs remote =
>
>> -copySSHs :: String -> String -> [String] -> FilePath -> IO ()
>> -copySSHs rdarcs u ns d =
>> +copySSHs :: RemoteDarcs -> String -> [String] -> FilePath -> IO ()
>> +copySSHs remote u ns d | rdarcs <- remoteDarcs remote =
>
> Nice, but why use this pattern guard syntax when we could just as easily have
> written where rdarcs = remoteDarcs remote?
Hm, I think it seemed as the least intrusive way at that moment. Looking
at the patch's timestamp (2:30 am) makes me think that I wasn't quite as
alert as I maybe ought to. So yes, where would work for sure.

> Also, how about changing applyViaSsh to take RemoteDarcs?
It needs to pass opts to the remote darcs (after filtering), so it'd be
in addition to opts and not instead.

> The basic change
> ~~~~~~~~~~~~~~~~
>> hunk ./src/Darcs/Flags.hs 61
>> -               | RemoteDarcs String
>> +               | RemoteDarcsOpt String
>
> Rename the surface level flag so that we can re-use the RemoteDarcs name
> for the deeper stuff.
>
> Note that it'd be good to keep an eye out for consistency and friendliness in
> naming the ADTs and the flags.  This particular case seems friendly enough.
>
>> +remoteDarcs :: [DarcsFlag] -> RemoteDarcs
>> +remoteDarcs f | (x:_) <- [ c | RemoteDarcsOpt c <- f ] = RemoteDarcs x
>> +              | otherwise = DefaultRemoteDarcs
>

> (For review, I pasted in your later version from 'Fix "head: empty list" bug in
>  Darcs.Flags.RemoteDarcs.')
Aye. It's separate because of intervening depending patches (bummer).

> These looked tricky until I realised it was a refactor
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +copyAndReadFile :: (FilePath -> IO a) -> String -> Cachable -> IO a
>> +copyAndReadFile readfn fou _ | isFile fou = readfn fou
>> +copyAndReadFile readfn fou cache = withTemp $ \t -> do copyFileOrUrl DefaultRemoteDarcs fou t cache
>> +                                                       readfn t
>>
>> +fetchFilePS = copyAndReadFile B.readFile
>> +fetchFileLazyPS = copyAndReadFile BL.readFile
>> +gzFetchFilePS = copyAndReadFile gzReadFilePS
>
> Nice little refactor of fetchFilePS, fetchFileLazyPS, and gzFetchFilePS
> Higher order functions are your friends.
>
> This made it a bit easier to take the RemoteDarcs stuff into account
> and perhaps extend it in the future for issue1736
>
> May be worth adding a comment to copyAndReadFile just marking it as
> a helper function
>
>> +-- | @fetchFile fileOrUrl cache@ returns the content of its argument (either a
>> +-- file or an URL). If it has to download an url, then it will use a cache as
>> +-- required by its second argument.
>> +--
>> +-- We always use default remote darcs, since it is not fatal if the remote
>> +-- darcs does not exist or is too old -- anything that supports transfer-mode
>> +-- should do, and if not, we will fall back to SFTP or SCP.
>
> It's not fatal, but slightly unfortunate.
> See http://bugs.darcs.net/issue1736
It's been like this forever. It may be unfortunate, but at least for now
I'd avoid adding new stuff to the already messy base.

>> hunk ./src/Darcs/External.hs 143
>> --- | @fetchFileLazyPS fileOrUrl cache@ lazily reads the content of
>> --- its argument (either a file or an URL). Warning: this function may
>> --- constitute a fd leak; make sure to force consumption of file contents
>> --- to avoid that.
>> +-- | @fetchFileLazyPS fileOrUrl cache@ lazily reads the content of its argument
>> +-- (either a file or an URL). Warning: this function may constitute a fd leak;
>> +-- make sure to force consumption of file contents to avoid that. See
>> +-- "fetchFilePS" for details.
>
> Minor grumble about minimal patches: the whitespace changes were not really
> necessary were they?
Sorry about that. I am too lazy to fill manually (so using M-q in emacs
which fills blocks)... It'd be easier if all comment blocks would fill
at the same column...

> The extra comment is added to hopefully point users at the remote darcs stuff
> in the comment Petr added about always using default remote darcs
That too.



More information about the darcs-users mailing list