[darcs-devel] FastPackedString issues

Donald Bruce Stewart dons at cse.unsw.edu.au
Sun Sep 4 22:54:08 PDT 2005


Hey,

Here are some items from my port of the FastPackedString library, after
doing some more profiling against Data.PackedString. I'm posting only
the functions used in darcs (not all of which are /often/ used in darcs,
however).  These changes may be interesting nonetheless, particularly
the improved pack and unpack, and the *packAddress constructors (feel
free to ignore these if they seem too scary -- though an O(1)
PackedString constructor is pretty nice, imo ;).

Let me know if any of these are worth integrating into darcs, and I'll
repost as individual darcs patches. 

Note that the darcs testsuite seems to run ok with these patches, and
they mostly have QuickCheck and HUnit properties in the FPS library
itself.  There are several other improved functions in FPS, so if you
start using other functions from FastPackedString, you may want to check
to see if FPS has an improved implementation.

Cheers,
   Don

------------------------------------------------------------------------

As well as the attached functions, here's an unboxed reverse, but this adds
some C code. I'm not sure if this is politically correct:

  -- | /O(n)/ 'reverse' @xs@ efficiently returns the elements of @xs@ in reverse order.
  reverse :: FastString -> FastString
  reverse (PS x s l) = createPS l $ \p -> withForeignPtr x $ \f ->
          c_reverse p (f `plusPtr` s) l -- 99% less space, very much faster

  /* copy a string in reverse */
  void reverse(unsigned char *dest, unsigned char *from, int len)
  {
      unsigned char *p, *q;
      p = from + len - 1;
      q = dest;

      while (p >= from)
          *q++ = *p--;
  }    
-------------- next part --------------

New patches:

[Improved versions of some common FastPackedString functions
Don Stewart <dons at cse.unsw.edu.au>**20050905054237] {
hunk ./FastPackedString.hs 1
-{-# OPTIONS -fffi -cpp #-}
+{-# OPTIONS -fffi -cpp -fglasgow-exts #-}
hunk ./FastPackedString.hs 37
+#if defined(__GLASGOW_HASKELL__)
+        packAddress,          -- :: Addr# -> FastString
+        unsafePackAddress,    -- :: Int -> Addr# -> FastString
+#endif
hunk ./FastPackedString.hs 118
-import Foreign.Storable ( peekElemOff, peek, poke )
+import Foreign.Storable ( peekElemOff, peekByteOff, peek, poke )
hunk ./FastPackedString.hs 139
-                           newForeignPtr )
+                           newForeignPtr, newForeignPtr_ )
hunk ./FastPackedString.hs 143
+
+import GHC.Ptr              (Ptr(..))
+import GHC.Prim
+import GHC.ST               (ST(..))
+import GHC.Base             (Char(..))
+import Control.Monad.ST     (stToIO)
hunk ./FastPackedString.hs 272
--- | Convert a 'String' into a 'PackedString'
+-- | /O(n)/ Convert a 'String' into a 'FastString'
hunk ./FastPackedString.hs 274
-packString str = createPS (length str) $ \p -> pokeArray p $ map c2w str
+# if !defined(__GLASGOW_HASKELL__)
+packString str = createPS (length str) $ \p -> go p str
+  where
+      go _ []     = return ()
+      go p (x:xs) = poke p (c2w x) >> go (p `plusPtr` 1) xs -- less space than pokeElemOff
+# else /* hack away */
+packString str = createPS (length str) $ \(Ptr p) -> stToIO (go p 0# str)
+  where
+      go _ _ []        = return ()
+      go p i (C# c:cs) = writeByte p i c >> go p (i +# 1#) cs
+
+      writeByte p i c = ST $ \s# ->
+          case writeCharOffAddr# p i c s# of s2# -> (# s2#, () #)
+#endif
+
hunk ./FastPackedString.hs 321
+#if defined(__GLASGOW_HASKELL__)
+-- | /O(n)/ Pack a null-terminated sequence of bytes, pointed to by and
+-- Addr\# (an arbitrary machine address assumed to point outside the
+-- garbage-collected heap) into a PackedString. A useful way to
+-- create an Addr\# is with an unboxed string literal, which is compiled
+-- to a @char []@. Establishing the length of the string requires a call
+-- to /strlen(3)/. Use 'unsafePackAddress' if you know the length of the
+-- string statically. Requires '-fglasgow-exts' for the 'Addr\#' type.
+--
+-- An example:
+--
+-- > literalFS = packAddress "literal"#
+--
+packAddress :: Addr# -> PackedString
+packAddress addr# = unsafePerformIO $ do
+    p <- newForeignPtr_ cstr
+    i <- c_strlen cstr
+    return $ PS p 0 (fromIntegral i)
+    where
+      cstr = Ptr addr# 
+{-# INLINE packAddress #-}
+
+-- | /O(1)/ Pack a null-terminated sequence of bytes into a
+-- 'PackedString', given a raw 'Addr\#' to the string, and the
+-- length of the string. Make sure the length is correct, otherwise use
+-- the safer 'packAddress' (where the length will be calculated once at
+-- runtime).
+unsafePackAddress :: Int -> Addr# -> PackedString
+unsafePackAddress len addr# = unsafePerformIO $ do
+    p <- newForeignPtr_ cstr
+    return $ PS p 0 len
+    where
+      cstr = Ptr addr# 
+#endif
+
hunk ./FastPackedString.hs 359
--- | Convert a 'PackedString' into a 'String'
+-- | /O(n)/ Convert a 'PackedString' into a 'String'
hunk ./FastPackedString.hs 361
-unpackPS (PS ps s l)
- = map w2c $ unsafePerformIO
-           $ withForeignPtr ps $ \p -> peekArray l (p `plusPtr` s)
+unpackPS (PS _  _ 0) = []
+unpackPS (PS ps s l) = unsafePerformIO $ withForeignPtr ps $ \p ->
+      go (p `plusPtr` s) (l - 1) []
+  where
+      go p 0 acc = liftM w2c (peekByteOff p 0) >>= \e -> return (e : acc)
+      go p n acc = liftM w2c (peekByteOff p n) >>= \e -> go p (n-1) (e : acc)
+
hunk ./FastPackedString.hs 435
-  | otherwise  = w2c $ unsafePerformIO $ withForeignPtr x $ \p -> peekElemOff p s
+  | otherwise  = w2c $ unsafePerformIO $ withForeignPtr x $ \p -> peekByteOff p s
}

Context:

[-add test script for --set-scripts-executable
Mark Stosberg <mark at summersault.com>**20050901015046
 
 It's currently failing because darcs is currently broken in this regard. I commented
 out a "TODO" test in case you want to make to this a TODO test until
 someone gets to it.
] 
[clean up docs on flags directly to darcs (not to a darcs command).
David Roundy <droundy at darcs.net>**20050903124050] 
[bump version to 1.0.4rc1.
David Roundy <droundy at darcs.net>**20050903114002] 
[update the web page to direct new users first to the precompiled binaries rather than first to the source
zooko at zooko.com**20050902162737] 
[add test script that displays --no-pristine test-related bug.
David Roundy <droundy at darcs.net>**20050903132906] 
[fix bug triggered by --no-pristine-tree and running test.
David Roundy <droundy at darcs.net>**20050903132055
 The trouble was that the NoPristine version of createPristineDirectoryTree
 would fail if the directory already exists, which isn't the intended
 behavior.  I also took this opportunity to remove the "stubbornly" function
 and replace some stubborn directory creation with
 createDirectoryIfMissing.
] 
[don't create test directory if we don't want to actually run test.
David Roundy <droundy at darcs.net>**20050903130722] 
[Change an rm_rf to a cleanup in tests/disable.pl
Ian Lynagh <igloo at earth.li>**20050902024711] 
[TAG 1.0.4pre4
David Roundy <droundy at darcs.net>**20050901110418] 
[add changelog entry for makefile fix.
David Roundy <droundy at darcs.net>**20050901110353] 
[bump version to 1.0.4pre4.
David Roundy <droundy at darcs.net>**20050901110210] 
[fix DESTDIR syntax errors in makefile
Andres Loeh <loeh at iai.uni-bonn.de>**20050831192410] 
[fix "No root path(s) specified at ..." testsuite problem.
David Roundy <droundy at darcs.net>**20050830121603] 
[add test that triggers "too many open files" bug.
David Roundy <droundy at darcs.net>**20050827192215
 We just need to pull over 1024 patches at once to trigger this bug on my
 linux system.
] 
[TAG 1.0.4pre3
David Roundy <droundy at darcs.net>**20050831115448] 
[add two changelog entries.
David Roundy <droundy at darcs.net>**20050831113335] 
[only create directories on install if they don't exist (bug #494)
David Roundy <droundy at darcs.net>**20050831113142] 
[fix bug in whatsnew -l -l (rt#501).
David Roundy <droundy at darcs.net>**20050831110552] 
[fix typo in docs.
David Roundy <droundy at darcs.net>**20050831002520] 
[fix --posthook code to pass tests.
David Roundy <droundy at darcs.net>**20050830132225] 
[add test for --disable.
David Roundy <droundy at darcs.net>**20050830132122] 
[add changelog entry for --posthook.
David Roundy <droundy at darcs.net>**20050830132110] 
[add skeleton posthook test.
David Roundy <droundy at darcs.net>**20050827123744] 
[posthook documentation
Jason Dagit <dagit at codersbase.com>**20050825045706] 
[changed from --posthook-command to posthook
Jason Dagit <dagit at codersbase.com>**20050825043414] 
[now the posthook options appear for each command
Jason Dagit <dagit at codersbase.com>**20050825043305] 
[posthook for apply
Jason Dagit <dagit at codersbase.com>**20050803070343
 With this patch it is now possible to specify a command to run after every
 successful apply.
] 
[added run_posthook for actually running posthooks
Jason Dagit <dagit at codersbase.com>**20050803070156
 This adds the function run_posthook which should be used to run posthooks.
 The code was added to Test.lhs, but there may be a better place for this code.
] 
[added posthook command line switches
Jason Dagit <dagit at codersbase.com>**20050803065956
 Added generic posthook command line switches.  This patch does not add any
 posthooks to any command.
] 
[Rewrite gcau, add explanatory comment from David and some TODO notes
Ian Lynagh <igloo at earth.li>**20050830020943] 
[update building darcs section of manual.
David Roundy <droundy at darcs.net>**20050829120152] 
[add bench directory with a single script in it.
David Roundy <droundy at darcs.net>**20050828114118
 See bench/README for discussion of the idea behind this.
] 
[New implementation of comparePS, based on memcmp. 1/5 space usage, 96% faster
dons at cse.unsw.edu.au**20050827070030] 
[Use substrPS-less versions of initPS and tailPS
dons at cse.unsw.edu.au**20050827023214] 
[remove hideous malloc hack.
David Roundy <droundy at darcs.net>**20050818161411] 
[change my AUTHORS email to droundy at darcs.net.
David Roundy <droundy at abridgegame.org>**20050808124703] 
[fix mkstemp implementation for win32
Peter Strand <peter at zarquon.se>**20050810211303] 
[Implement parts of System.Posix.(IO|Files) for win32
peter at zarquon.se**20050809200433] 
[implement RawMode with library functions instead of ffi
peter at zarquon.se**20050809200148] 
[call hsc2hs without output filename argument
peter at zarquon.se**20050808220444] 
[Rename compat.c to c_compat.c to avoid object filename conflict with Compat.hs
peter at zarquon.se**20050731114011] 
[Move atomic_create/sloppy_atomic_create to Compat
Ian Lynagh <igloo at earth.li>**20050730141703] 
[Split the raw mode stuff out into its own .hsc file. Windows needs some TLC
Ian Lynagh <igloo at earth.li>**20050730134030] 
[Move maybe_relink out of compat.c
Ian Lynagh <igloo at earth.li>**20050730131205] 
[Remove is_symlink
Ian Lynagh <igloo at earth.li>**20050730122255] 
[Move mkstemp to Compat.hs
Ian Lynagh <igloo at earth.li>**20050730020918] 
[Remove unused function
Ian Lynagh <igloo at earth.li>**20050730010118] 
[Start Compat.hs, and move stdout_is_a_pipe from compat.c
Ian Lynagh <igloo at earth.li>**20050730004829] 
[fix for bug Ian found in apply.
David Roundy <droundy at abridgegame.org>**20050811162558
 This is the bug manifested in the cabal repository.
] 
[fix compilation errors with ghc-6.2.2 on win32
Peter Strand <peter at zarquon.se>**20050809192759] 
[Retain both Git's author and committer.
Juliusz Chroboczek <jch at pps.jussieu.fr>**20050810000820] 
[Move slurping into syncPristine.
Juliusz Chroboczek <jch at pps.jussieu.fr>**20050809232101
 Avoids creating a useless pristine tree when there is none.  Thanks to
 Ian for pointing this out.
] 
[Split --relink into --relink and --relink-pristine.
Juliusz Chroboczek <jch at pps.jussieu.fr>**20050809230951
 Relinking the pristine tree breaks handling of timestamps, which causes
 Darcs to compare file contents.  It should not be used unless you know
 what you are doing.
] 
[make repair work on partial repositories.
David Roundy <droundy at abridgegame.org>**20050805113001] 
[Cleanup --verbose handling in repair command
Matt Lavin <matt.lavin at gmail.com>**20050805020630] 
[clean up Printer.wrap_text.
David Roundy <droundy at abridgegame.org>**20050808114844] 
[add several changelog entries.
David Roundy <droundy at abridgegame.org>**20050808114800] 
[improve EOD message a tad.
David Roundy <droundy at abridgegame.org>**20050807112644
 This change also introduces a "wrapped_text" function in Printer, so we
 won't have to worry so often about manually wrapping lines.
] 
[changed ***DARCS*** to ***END OF DESCRIPTION***
Jason Dagit <dagit at codersbase.com>**20050729032543] 
[remove unused opts argument from apply_patches and apply_patches_with_feedback
Matt Lavin <matt.lavin at gmail.com>**20050807031038] 
[Use apply_patch_with_feedback from check and repair commands
Matt Lavin <matt.lavin at gmail.com>**20050805020830] 
[add code to read patch bundles with added CRs.
David Roundy <droundy at abridgegame.org>**20050806222631
 I think this'll address bug #291.
] 
[accept command-line flags in any order.
David Roundy <droundy at abridgegame.org>**20050806211828
 In particular, we no longer require that --flags precede filename and
 repository arguments.
] 
[show patch numbers instead of dots on get
Matt Lavin <matt.lavin at gmail.com>**20050804013649] 
[add obliterate command as alias for unpull.
David Roundy <droundy at abridgegame.org>**20050804104929] 
[Do not ask confirmation for revert -a
jani at iv.ro**20050627124011
 Giving -a as a parameter means the user expects all changes to be reverted.
 Just like for unrevert and record go ahead with it do not ask for confirmation.
] 
[clarify help text for 'd' in SelectPatches.
David Roundy <droundy at abridgegame.org>**20050806231117] 
[Add --with-static-libs configure flag for linking static versions of libraries.
v.haisman at sh.cvut.cz**20050729015539] 
[add changelog entry for bug #477.
David Roundy <droundy at abridgegame.org>**20050806212148] 
[changelog entry for bug #189.
David Roundy <droundy at abridgegame.org>**20050731132624] 
[add description of how to add changelog entries to ChangeLog.README.
David Roundy <droundy at abridgegame.org>**20050806225901] 
[Explain the missing ChangeLog
Mark Stosberg <mark at summersault.com>**20050526135421
 
 It should be easy for casual users and contributors to view and update the
 ChangeLog.
 
 Providing a README file in the place where people are most likely to look
 provides a very useful clue.
 
 However, it's still not clear to me exactly how the system works, so I have
 left a stub to complete that documentation.
 
     Mark
 
] 
[fix obsolete error explanation in get_extra bug.
David Roundy <droundy at abridgegame.org>**20050804130610] 
[simplify fix for bug 463; reuse /// from FilePathUtils
Matt Lavin <matt.lavin at gmail.com>**20050804021130] 
[Make curl exit with error on failed downloads
peter at zarquon.se**20050802192833] 
[Bump up AC_PREREQ version to 2.59.
v.haisman at sh.cvut.cz**20050801173925] 
[fix for bug 463 (with new test)
Matt Lavin <matt.lavin at gmail.com>**20050802002116] 
[bump version number, since I just made a release.
David Roundy <droundy at abridgegame.org>**20050731190756] 
[Use simpler curl_version() function to get version string.
Kannan Goundan <kannan at cakoose.com>**20050322221027] 
[fix documentation on --reorder-patches.
David Roundy <droundy at abridgegame.org>**20050731185406] 
[add changelog entry for bug #224.
David Roundy <droundy at abridgegame.org>**20050731133942] 
[fix bug when editing long comment leaves empty file.
David Roundy <droundy at abridgegame.org>**20050731133612] 
[TAG 1.0.4pre2
David Roundy <droundy at abridgegame.org>**20050731121029] 
Patch bundle hash:
8b068ef17449443601e0210aa4a029d2d0dbcf70


More information about the darcs-devel mailing list