[darcs-devel] [patch1083] name change Lcs.hs -> Diff/Myers.hs (and 2 more)

Tue Jul 30 10:17:40 UTC 2013

I looked at the benchmark results and managed to optimize the
implementation a lot using hashes and assigning an Int to each
different line for the comparison. Please test the new implementation
from http://hub.darcs.net/jlneder/darcs-patience-benchmarks.
This is my criterion report:
http://hub.darcs.net/jlneder/darcs-patience-benchmarks/raw/CriterionBenchmarks.html.
I saw a 7 times slowdown my machine in the test case you mention with
the old implementation and i managed to make it a 2x slowdown but i
would like to see how it runs in your machine given it runs the old
implementation a lot faster.
There is a little more room for optimization using arrays instead of
list like the myers implementation do, but it is almost a total
rewrite. I think a patience diff implementation using arrays would be
equal, if not faster than myers implementation because the algorithm
is less complex than the myers implementation with Darcs tweaks.
I test the hash from Data.Hashable and i found it's better than the
hashPS implementation of Darcs(performance wise). I don't know if it's
worth to add a dependency to make this change but is a small
dependency anyway.

2013/7/26 Guillaume Hoffmann <bugs at darcs.net>:
>
> Guillaume Hoffmann <guillaumh at gmail.com> added the comment:
>
> José, you wrote about the performance of Patience vs Myers diff in your
> blog post here:
> http://blog.jlneder.com.ar/2013/07/integrating-patience-diff-and-some.html
>
> You conclude that their performance are close one to another.
>
> However, on some tests I ran, it is possible to see a quite big
> difference in a realistic case: a relatively big file (6000 lines) with
> a small hunk deletion and another small hunk insertion: we get a 4 times
> slowdown (from ~1 to ~4 miliseconds).
>
> I generated random files using this script:
>     http://hub.darcs.net/gh/bench/browse/genDiff.sh
>
> And the criterion report I got was:
>     http://hub.darcs.net/gh/bench/raw/crit.html
>
> Do you get similar results on your machine with this benchmark?
>
> I think the performance is still acceptable but would it be possible to
> enhance it? Or is it by design a more complex algorithm?
>
> __________________________________
> Darcs bug tracker <bugs at darcs.net>
> <http://bugs.darcs.net/patch1083>
> __________________________________