[darcs-users] [patch16] Bump the hashed-storage dependency to >=... (and 1 more)

Wed Nov 4 10:43:01 UTC 2009

Reinier Lamers <tux_rocker at reinier.de> writes:

>>unrecordedChanges :: (RepoPatch p) => [DarcsFlag] -> Repository p C(r u t)
>>                  -> [SubPath] -> IO (FL Prim C(r y))
>>unrecordedChanges opts repo paths = do
>>  (all_current, _) <- readPending repo
>>  Sealed pending <- pendingChanges repo paths
>>
>>  relevant <- restrictSubpaths repo paths
>>  nonboring <- restrictBoring
>>
>>  let current = relevant all_current
>>  working <- case (LookForAdds `elem` opts, IgnoreTimes `elem` opts) of
>>               (False, False) -> do
>>                 I.updateIndex =<< (relevant <$> readIndex repo)
>>               (False, True) -> do
>>                  guide <- expand current
>>                  all_working <- readPlainTree "."
>>                  return $ relevant $ (restrict guide) all_working
>>-               -- TODO (True, False) could use a more efficient implementation... 
>>-               (True, _) -> do
>>+               (True, False) -> do
>>+                 plain <- relevant <$> nonboring <$> readPlainTree "."
>>+                 index <- I.updateIndex =<< (relevant <$> readIndex repo)
>>+                 return $ plain `overlay` index
>>+               (True, True) -> do
>>                  all_working <- readPlainTree "."
>>                  return $ relevant $ nonboring all_working
>
> I see why this is correct, but not why it is faster. Is it that readIndex uses
> data from the index when the timestamp on the working directory file is no
> newer than the timestamp of that file in the index? But how would that help?
> Is reading a file from the index faster than reading it from disk?
The trick is that with the index, we don't read the files at all. The index
caches a hash of each working file, which (if the timestamps match) is checked
against the pristine hash. If the two are equal, neither the working nor the
pristine file is opened at all.
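
To make the shortcut concrete, here is a minimal sketch of the decision made per
file. It is not the hashed-storage implementation: the names IndexEntry,
Decision and decide are hypothetical, hashes are plain strings, and the
"timestamp mismatch means rehash" branch is my assumption about what a cache
miss implies.

```haskell
-- Hypothetical stand-ins; the real index format and hash type differ.
type Hash = String
type Timestamp = Integer

-- What the index remembers about one working file.
data IndexEntry = IndexEntry
  { cachedTime :: Timestamp  -- mtime seen when the hash was computed
  , cachedHash :: Hash       -- hash of the file content at that time
  }

data Decision
  = Unchanged   -- cached hash equals pristine hash: open neither file
  | Changed     -- cached hash is trusted but differs: a real diff is needed
  | MustRehash  -- timestamps differ: the cached hash cannot be trusted
  deriving (Eq, Show)

-- Decide from metadata alone, given the mtime from a stat call and
-- the hash recorded for the pristine version of the file.
decide :: Timestamp -> Hash -> IndexEntry -> Decision
decide mtime pristineHash entry
  | mtime /= cachedTime entry        = MustRehash
  | cachedHash entry == pristineHash = Unchanged
  | otherwise                        = Changed
```

Only the MustRehash and Changed outcomes cost any file reads, so in a mostly
unchanged tree nearly every file is settled by a stat and a hash comparison.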

>> +-- | Lay one tree over another. The resulting Tree will look like the base (1st
>> +-- parameter) Tree, although any items also present in the overlay Tree will be
>> +-- taken from the overlay. It is not allowed to overlay a different kind of
>> +-- object, nor is it allowed for the overlay to add new objects to base.  This
>> +-- means that the overlay Tree should be a subset of the base Tree (although
>> +-- any extraneous items will be ignored by the implementation).
>> +overlay :: (Functor m, Monad m) => Tree m -> Tree m -> Tree m
>> +overlay base over = Tree { items = M.fromList immediate
>> +                         , listImmediate = immediate
>> +                         , treeHash = NoHash }
>> +    where immediate = [ (n, get n) | (n, _) <- listImmediate base ]
>> +          get n = case (M.lookup n $ items base, M.lookup n $ items over) of
>> +                    (Just (File _), Just f@(File _)) -> f
>> +                    (Just (SubTree b), Just (SubTree o)) -> SubTree $ overlay b o
>> +                    (Just (Stub b _), Just (SubTree o)) -> Stub (flip overlay o `fmap` b) NoHash
>> +                    (Just (SubTree b), Just (Stub o _)) -> Stub (overlay b `fmap` o) NoHash
>> +                    (Just (Stub b _), Just (Stub o _)) -> Stub (do o' <- o
>> +                                                                   b' <- b
>> +                                                                   return $ overlay b' o') NoHash
>> +                    (Just x, _) -> x
>> +                    (_, _) -> error $ "Unexpected case in overlay at get " ++ show n ++ "."
>
> I suppose you know that you'll never get a Nothing from the lookup because you
> get n via listImmediate, which apparently always returns valid indexes.
Yes, the error there is pure paranoia.
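
For anyone following along, the overlay semantics can be illustrated on a
simplified, pure tree. This is a sketch only: the Item and Tree types below are
hypothetical stand-ins with no Stub constructor, no monad and no hashes, and
the example trees plain and index are made up.

```haskell
import qualified Data.Map as M

-- Simplified stand-ins for the real Tree and its items.
data Item = File String | SubTree Tree deriving (Eq, Show)
newtype Tree = Tree (M.Map String Item) deriving (Eq, Show)

-- The result has exactly the names of the base tree. An item is taken
-- from the overlay only when both trees hold the same kind of object
-- under that name; otherwise the base item is kept.
overlay :: Tree -> Tree -> Tree
overlay (Tree base) (Tree over) = Tree (M.mapWithKey pick base)
  where
    pick name b = case (b, M.lookup name over) of
      (File _, Just f@(File _))      -> f
      (SubTree b', Just (SubTree o')) -> SubTree (overlay b' o')
      _                               -> b

-- Made-up example: "new" exists only in the working directory,
-- "gone" exists only in the index.
plain, index :: Tree
plain = Tree (M.fromList
  [ ("a", File "plain-a")
  , ("new", File "plain-new")
  , ("d", SubTree (Tree (M.fromList [("x", File "plain-x")])))
  ])
index = Tree (M.fromList
  [ ("a", File "index-a")
  , ("gone", File "index-gone")
  , ("d", SubTree (Tree (M.fromList [("x", File "index-x")])))
  ])
```

Here overlay plain index keeps "new" from the plain tree, takes "a" and "d/x"
from the index, and drops "gone", which is the shape the (True, False) case in
unrecordedChanges relies on.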

Yours,
   Petr.