[darcs-devel] [patch2166] cleaning up Arbitrary instances

Ganesh Sittampalam bugs at darcs.net
Tue May 4 21:30:18 UTC 2021


Ganesh Sittampalam <ganesh at earth.li> added the comment:

On 04/05/2021 21:37, Ben Franksen wrote:

> I don't quite understand the need for these wrappers. Could you explain?
> Perhaps using the simplest one, the Pair type, as example?

One of the problems with the existing code is that there are a lot of
instances with quite specific types. For example there's this comment in
Darcs.Test.Patch.Arbitrary.RepoPatchV1:

> Until this is cleaned up, we take some care that the Arbitrary instances
> do not overlap and are only used for tests from the respective
> modules.

which refers to various very specialised instances like:

> instance Arbitrary (Sealed2 (Prim :> Prim)) where

whereas in D.T.P.A.PrimV1 we have:

> instance Arbitrary (Sealed2 (Prim1 :> Prim1)) where
>   arbitrary = mapSeal2 wsPatch <$> arbitraryPair

which gets used for a completely different set of tests.

It would be much better if these instances were all built in a
compositional way, with each type constructor like Sealed2, :>, etc,
just having a single instance that builds on the instances for its
parameters. That way it's easy to look at a type and work backwards to
see how its instance will behave, and also to add new instances for new
things without fear of overlap.

Essentially I'm trying to untangle the mess bit by bit. V1Gen is maybe
the simplest example of this as it's very local to RepoPatchV1.
Introducing that means that instead of carefully crafting instances like
the above, we can make the specific tests in D.T.P.RepoPatchV1 that want
those instances always work via the V1Gen wrapper. I found the specific
tests by changing the instances to use V1Gen and then working through
the compiler errors: but now that it's done, we can be confident of
adding instances that don't mention V1Gen without affecting those tests.

Similarly the introduction of classes like ArbitraryS2 means that
instead of having instances like Arbitrary (Sealed2 Prim), Arbitrary
(Sealed2 (Prim :> Prim)) etc, we can have a single instance ArbitraryS2
p => Arbitrary (Sealed2 p). With just that change, we still have
ArbitraryS2 Prim, ArbitraryS2 (Prim :> Prim) but then further
refactorings along similar lines can make those better too.

Instances for things like (Prim :> Prim) are problematic because they
often rely on having the same type on both sides of the :> to do
something special, but overlap with the more generic (p :> q).
Introducing the 'Pair' type means that the two can be clearly
disambiguated at the point the test is run.

>   * generalise the NamedPrim instances for ArbitraryWS[Pair]
> 
> Does this one (which gets rid of a lot of code duplication, so much
> appreciated) actually depend on the previous refactors? If yes, why is
> it possible now but wasn't earlier?

There are a few different pieces of duplication getting removed from
D.T.P.A.NamedPrimV1 etc, not all of them in this specific patch.
Probably the trickiest bit is this pattern:

[NamedPrimV1]
> instance Arbitrary (Sealed2 (Prim :> Prim)) where
>   arbitrary = mapSeal2 wsPatch <$> arbitraryPair

and again in NamedPrimFileUUID.

Then the underlying types have these instances:

[PrimV1]
> instance Arbitrary (Sealed2 (Prim1 :> Prim1)) where
>   arbitrary = mapSeal2 wsPatch <$> arbitraryPair

> instance Arbitrary (Sealed2 (Prim2 :> Prim2)) where
>   arbitrary = mapSeal2 wsPatch <$> arbitraryPair

and again in PrimFileUUID.

Each of these refers to a copy and pasted 'arbitraryPair', which in turn
referred to different 'aPrimPair's. The 'aPrimPair' for NamedPrimV1 and
NamedFileUUID delegate to the ones for PrimV1 and PrimFileUUID
respectively. By moving to a compositional instance structure, the code
for all the NamedPrims is now:

[NamedPrim]
> instance ArbitraryState prim => ArbitraryState (NamedPrim prim) where
>   [..]
>   arbitraryStatePair = ... use arbitraryStatePair from prim ...

and the instance works for NamedPrimFileUUID and NamedPrimV1.

Now that I know it's possible, I probably could have achieved just this
specific goal by a different and shorter route.

> I am trying to understand the big picture...

I guess there isn't a clear big picture that describes my end goals.
What happened was that I started exploring cleanups by making small and
(hopefully) obviously correct refactorings that tried to reduce the mess
of instances. In that process, getting rid of boilerplate just came out
naturally.

I think that even apart from the boilerplate removal, the simpler
instances are a net win because they make the code more modular and make
further cleanups easier. For example another outcome of my patches is
that I've cut down 4 different Arbitrary instances for each of Prim1 and
Prim2 to a single instance of ArbitraryWS along with a new member in
ArbitraryState.

__________________________________
Darcs bug tracker <bugs at darcs.net>
<http://bugs.darcs.net/patch2166>
__________________________________


More information about the darcs-devel mailing list