[darcs-devel] FastPackedString issues

Donald Bruce Stewart dons at cse.unsw.edu.au
Sat Aug 27 00:10:50 PDT 2005


Further on the FastPackedString code, I've reimplemented pscmp as
comparePS, to use memcmp. It is much faster (96%) on large (ie.. 100M
strings), and uses around 1/5 the space (though space before wasn't
bad). Performance of == is basically unchanged.

I thought this worth submitting back to darcs, although I'm not sure how
important a fast compare is to darcs.

-- Don
-------------- next part --------------

New patches:

[New implementation of comparePS, based on memcmp. 1/5 space usage, 96% faster
dons at cse.unsw.edu.au**20050827070030] {
hunk ./FastPackedString.hs 223
-   (==) = psniceq
+   (==) = eqPS
hunk ./FastPackedString.hs 228
-{-# INLINE psniceq #-}
-psniceq :: PackedString -> PackedString -> Bool
-psniceq a b | nullPS a && nullPS b = True
-psniceq (PS x1 s1 l1) (PS x2 s2 l2) =
-    ((l1 == l2) &&) $ unsafePerformIO $ withForeignPtr x1 $ \p1->
-    withForeignPtr x2 $ \p2 ->
-        if p1 == p2 && s1 == s2
-        then return True
-        else liftM (==0) $ c_memcmp (p1 `plusPtr` s1) (p2 `plusPtr` s2) l1
+{-# INLINE eqPS #-}
+eqPS :: PackedString -> PackedString -> Bool
+eqPS a b = (comparePS a b) == EQ
hunk ./FastPackedString.hs 233
-    compare = pscmp
+    compare = comparePS
hunk ./FastPackedString.hs 235
-pscmp :: PackedString -> PackedString -> Ordering
-pscmp (PS x1 s1 l1) (PS x2 s2 l2) =
-    unsafePerformIO $ withForeignPtr x1 $ \p1->
-        withForeignPtr x2 $ \p2 ->
-    let doc :: Ptr Word8 -> Ptr Word8 -> IO Ordering
-        st1 = p1 `plusPtr` s1 `plusPtr` l1
-        st2 = p2 `plusPtr` s2 `plusPtr` l2
-        doc w1 w2 =
-            if w1 == st1 && w2 == st2 then return EQ
-            else if w1 == st1 then return LT
-                 else if w2 == st2 then return GT
-                      else do h1 <- peek w1
-                              h2 <- peek w2
-                              if h1 < h2
-                                 then return LT
-                                 else if h1 == h2
-                                      then doc (w1 `plusPtr` 1) (w2 `plusPtr` 1)
-                                      else return GT
-        in
-        doc (p1 `plusPtr` s1) (p2 `plusPtr` s2)
+-- | 'comparePS' provides an 'Ordering' for 'PackedStrings' supporting slices.
+comparePS :: PackedString -> PackedString -> Ordering
+comparePS (PS _ _ 0) (PS _ _ 0) = EQ    -- short cut for empty strings
+comparePS (PS x1 s1 l1) (PS x2 s2 l2) = unsafePerformIO $ 
+    withForeignPtr x1 $ \p1 -> 
+        withForeignPtr x2 $ \p2 -> do 
+            i <- c_memcmp (p1 `plusPtr` s1) (p2 `plusPtr` s2) (min l1 l2)
+            return $ case i `compare` 0 of
+                EQ  -> l1 `compare` l2
+                x   -> x
}

Context:

[remove hideous malloc hack.
David Roundy <droundy at darcs.net>**20050818161411] 
[change my AUTHORS email to droundy at darcs.net.
David Roundy <droundy at abridgegame.org>**20050808124703] 
[fix mkstemp implementation for win32
Peter Strand <peter at zarquon.se>**20050810211303] 
[Implement parts of System.Posix.(IO|Files) for win32
peter at zarquon.se**20050809200433] 
[implement RawMode with library functions instead of ffi
peter at zarquon.se**20050809200148] 
[call hsc2hs without output filename argument
peter at zarquon.se**20050808220444] 
[Rename compat.c to c_compat.c to avoid object filename conflict with Compat.hs
peter at zarquon.se**20050731114011] 
[Move atomic_create/sloppy_atomic_create to Compat
Ian Lynagh <igloo at earth.li>**20050730141703] 
[Split the raw mode stuff out into its own .hsc file. Windows needs some TLC
Ian Lynagh <igloo at earth.li>**20050730134030] 
[Move maybe_relink out of compat.c
Ian Lynagh <igloo at earth.li>**20050730131205] 
[Remove is_symlink
Ian Lynagh <igloo at earth.li>**20050730122255] 
[Move mkstemp to Compat.hs
Ian Lynagh <igloo at earth.li>**20050730020918] 
[Remove unused function
Ian Lynagh <igloo at earth.li>**20050730010118] 
[Start Compat.hs, and move stdout_is_a_pipe from compat.c
Ian Lynagh <igloo at earth.li>**20050730004829] 
[fix for bug Ian found in apply.
David Roundy <droundy at abridgegame.org>**20050811162558
 This is the bug manifested in the cabal repository.
] 
[fix compilation errors with ghc-6.2.2 on win32
Peter Strand <peter at zarquon.se>**20050809192759] 
[Retain both Git's author and committer.
Juliusz Chroboczek <jch at pps.jussieu.fr>**20050810000820] 
[Move slurping into syncPristine.
Juliusz Chroboczek <jch at pps.jussieu.fr>**20050809232101
 Avoids creating a useless pristine tree when there is none.  Thanks to
 Ian for pointing this out.
] 
[Split --relink into --relink and --relink-pristine.
Juliusz Chroboczek <jch at pps.jussieu.fr>**20050809230951
 Relinking the pristine tree breaks handling of timestamps, which causes
 Darcs to compare file contents.  It should not be used unless you know
 what you are doing.
] 
[make repair work on partial repositories.
David Roundy <droundy at abridgegame.org>**20050805113001] 
[Cleanup --verbose handling in repair command
Matt Lavin <matt.lavin at gmail.com>**20050805020630] 
[clean up Printer.wrap_text.
David Roundy <droundy at abridgegame.org>**20050808114844] 
[add several changelog entries.
David Roundy <droundy at abridgegame.org>**20050808114800] 
[improve EOD message a tad.
David Roundy <droundy at abridgegame.org>**20050807112644
 This change also introduces a "wrapped_text" function in Printer, so we
 won't have to worry so often about manually wrapping lines.
] 
[changed ***DARCS*** to ***END OF DESCRIPTION***
Jason Dagit <dagit at codersbase.com>**20050729032543] 
[remove unused opts argument from apply_patches and apply_patches_with_feedback
Matt Lavin <matt.lavin at gmail.com>**20050807031038] 
[Use apply_patch_with_feedback from check and repair commands
Matt Lavin <matt.lavin at gmail.com>**20050805020830] 
[add code to read patch bundles with added CRs.
David Roundy <droundy at abridgegame.org>**20050806222631
 I think this'll address bug #291.
] 
[accept command-line flags in any order.
David Roundy <droundy at abridgegame.org>**20050806211828
 In particular, we no longer require that --flags precede filename and
 repository arguments.
] 
[show patch numbers instead of dots on get
Matt Lavin <matt.lavin at gmail.com>**20050804013649] 
[add obliterate command as alias for unpull.
David Roundy <droundy at abridgegame.org>**20050804104929] 
[Do not ask confirmation for revert -a
jani at iv.ro**20050627124011
 Giving -a as a parameter means the user expects all changes to be reverted.
 Just like for unrevert and record go ahead with it do not ask for confirmation.
] 
[clarify help text for 'd' in SelectPatches.
David Roundy <droundy at abridgegame.org>**20050806231117] 
[Add --with-static-libs configure flag for linking static versions of libraries.
v.haisman at sh.cvut.cz**20050729015539] 
[add changelog entry for bug #477.
David Roundy <droundy at abridgegame.org>**20050806212148] 
[changelog entry for bug #189.
David Roundy <droundy at abridgegame.org>**20050731132624] 
[add description of how to add changelog entries to ChangeLog.README.
David Roundy <droundy at abridgegame.org>**20050806225901] 
[Explain the missing ChangeLog
Mark Stosberg <mark at summersault.com>**20050526135421
 
 It should be easy for casual users and contributors to view and update the
 ChangeLog.
 
 Providing a README file in the place where people are most likely to look
 provides a very useful clue.
 
 However, it's still not clear to me exactly how the system works, so I have
 left a stub to complete that documentation.
 
     Mark
 
] 
[fix obsolete error explanation in get_extra bug.
David Roundy <droundy at abridgegame.org>**20050804130610] 
[simplify fix for bug 463; reuse /// from FilePathUtils
Matt Lavin <matt.lavin at gmail.com>**20050804021130] 
[Make curl exit with error on failed downloads
peter at zarquon.se**20050802192833] 
[Bump up AC_PREREQ version to 2.59.
v.haisman at sh.cvut.cz**20050801173925] 
[fix for bug 463 (with new test)
Matt Lavin <matt.lavin at gmail.com>**20050802002116] 
[bump version number, since I just made a release.
David Roundy <droundy at abridgegame.org>**20050731190756] 
[Use simpler curl_version() function to get version string.
Kannan Goundan <kannan at cakoose.com>**20050322221027] 
[fix documentation on --reorder-patches.
David Roundy <droundy at abridgegame.org>**20050731185406] 
[add changelog entry for bug #224.
David Roundy <droundy at abridgegame.org>**20050731133942] 
[fix bug when editing long comment leaves empty file.
David Roundy <droundy at abridgegame.org>**20050731133612] 
[TAG 1.0.4pre2
David Roundy <droundy at abridgegame.org>**20050731121029] 
Patch bundle hash:
87fff466f8c99db3b2dab79edadc50648dab7624


More information about the darcs-devel mailing list