[darcs-devel] confused questions about versioning structured data

Matthias Fischmann mf at zerobuzz.net
Sun Sep 1 12:28:17 UTC 2013


Hi Ganesh,

Thanks for the feedback - yes, this is very helpful!

I am still incubating more ideas.  Should anything ever come of them,
I'll post an update here (or probably on darcs-users?).

cheers,
m.


On Fri, Aug 23, 2013 at 07:38:15AM +0100, Ganesh Sittampalam wrote:
> Date: Fri, 23 Aug 2013 07:38:15 +0100
> From: Ganesh Sittampalam <ganesh at earth.li>
> To: Matthias Fischmann <mf at zerobuzz.net>
> CC: darcs-devel at darcs.net
> Subject: Re: [darcs-devel] confused questions about versioning structured
>  data
>
> Hi Matthias,
>
> Some general thoughts which might provoke further discussion:
>
> - Using a proper VCS seems like the right thing to do particularly if you
> want a history that's branchable and mergeable.
>
> - Darcs lib is still somewhat immature, but it should give you a
> programmatic interface to creating and commuting/merging patches if
> that's what you want: http://darcs.net/Library
>
> - Tagging every patch is the way I would personally recommend for
> identifying specific versions. I don't know of any specific performance
> problems that would create.
>
> - Pretty printing to get a canonical JSON string would work, but one thing
> you would need to think about is whether diffs of those strings would
> merge nicely with each other.
>
> - Darcs won't help you much with your external rules about changes
> causing other things to change, except that you might encode them with
> explicit Darcs patch dependencies.
>
> - You could create your own patch types - it's "just" a matter of
> implementing various type classes. That would be more up-front work to
> encode your system but might get you something with nicer commute and
> merge behaviour.
>
> Hope this is of some help!
>
> Cheers,
>
> Ganesh
>
> On 19/08/2013 16:52, Matthias Fischmann wrote:
> >
> > Dear darcs community,
> >
> > I had a discussion in a software project that needs versioning on
> > structured data (think json objects).  So far, my favorite solution is
> > hacking together a restful repository server using darcs and snap, but
> > we are still struggling with understanding what we actually need.
> > This is a summary of the status of the discussion and some open
> > questions, in the hope that at least some of it makes sense.  Thanks
> > for listening!  :-)
> >
> > The data we are dealing with is content objects of a web application
> > written in Python.  There are documents, paragraphs, comments, users,
> > groups, authorizations, votes, collections of votes and a lot more.
> >
> > Objects evolve along patch trees.  Also, they are associated with each
> > other in a graph with different edge types (a document consists of
> > several paragraphs; a vote is made by a user on a document).  The type
> > of an edge decides what the source will do if the target grows a new
> > version (if a user votes on a document and the document is updated,
> > the vote relates to the old version; if a paragraph changes in a
> > document, the document grows a new version in parallel with the
> > paragraph).
> >
> > The application will visualize patches, version trees, and arbitrary
> > versions of objects.  (We have considered allowing for patches of
> > patches, but have abandoned the idea as unnecessarily complex.)
> >
> > Now, how would you do this using darcs as a library?
> >
> > What would I have to do to process patches on sets of json objects
> > rather than file trees?
> >
> > What are the drawbacks of using a pretty-printer to get a canonical
> > string representation of json objects, write them all to files, and
> > use darcs to version-control the files?  (It seems like a weird idea
> > to me, but I can only think of performance reasons not to do it.)
> >
> > I have rules like this: "If attribute X in object A changes, then the
> > object referenced in attribute Y of A must get a new version in which
> > it refers to the new version of A."  (This implies that the contents
> > of json objects are aware of darcs patches.)  Does darcs offer any
> > tools to implement rules like this, or do I have to do this by hand,
> > before I present new versions or patches to darcs lib?
> >
> > To extract arbitrary versions (instead of patches), I would need to
> > either retrieve all patches by timestamp filter, or tag every patch.
> > Is either of the two a good idea?  Did I miss a better one?
> >
> > If every patch comes with a tag that makes the associated state
> > pullable, would that be a performance issue, or would it actually
> > mitigate potential performance issues on large patch sets?  My
> > uninformed guess is that the situations where complexity bites you
> > with darcs involve large, unordered sub-sets of patches.
> >
> > In general, do you have any opinion (emotional or rational) whether
> > darcslib+snap is right for my use case?  The alternative would be to
> > use an object database with linear version history (most likely ZODB)
> > and implement additional version control features on top of that.
> >
> > Looking forward to your feedback,
> > cheers,
> > matthias

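As an illustration of the canonical pretty-printing idea discussed in the thread, here is a minimal sketch. It assumes Python (the language of the web application) and only the standard json module; the function name canonical_json and the sample object are hypothetical and not from the thread. Sorting the keys makes the output independent of insertion order, and one attribute per line means a change to a single attribute touches a single line, which is what a line-based diff needs in order to merge edits to different attributes. Whether this is enough for real data to merge nicely under darcs is left open here.

```python
import json


def canonical_json(obj):
    """Serialize obj deterministically, one attribute per line."""
    # sort_keys removes dependence on insertion order; indent puts each
    # entry on its own line so a line-based diff sees small changes.
    return json.dumps(obj, sort_keys=True, indent=1,
                      separators=(",", ": "), ensure_ascii=True) + "\n"


# Hypothetical content object, written with two different key orders.
a = {"title": "Intro", "author": "mf", "votes": 3}
b = {"votes": 3, "author": "mf", "title": "Intro"}
same = canonical_json(a) == canonical_json(b)

# Changing one attribute changes exactly one line of the output.
changed = canonical_json(dict(a, votes=4))
diff_lines = [(x, y)
              for x, y in zip(canonical_json(a).splitlines(),
                              changed.splitlines())
              if x != y]
```

One caveat with this layout: adding a key that sorts last also touches the previous last line, because that line gains a trailing comma.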

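The edge-type rules from the original question (a vote stays attached to the old document version, while a document grows a new version when one of its paragraphs changes) can be sketched in plain Python. This is only an in-memory model under stated assumptions: the names Store, Ref, PIN and FOLLOW are hypothetical, no darcs API is involved, and "follow" edges are assumed to form no cycles. Since darcs is not expected to help much with such rules, something along these lines would have to run before new versions or patches are handed to darcs.

```python
from dataclasses import dataclass, field

PIN = "pin"        # source keeps pointing at the old target version
FOLLOW = "follow"  # source grows a new version that points at the new one


@dataclass(frozen=True)
class Ref:
    target: str   # id of the referenced object
    version: int  # index into the target's version list
    kind: str     # PIN or FOLLOW


@dataclass
class Store:
    # object id -> list of versions; each version maps attribute -> value/Ref
    versions: dict = field(default_factory=dict)

    def create(self, oid, attrs):
        self.versions[oid] = [dict(attrs)]
        return 0

    def latest(self, oid):
        return len(self.versions[oid]) - 1

    def update(self, oid, changes):
        """Append a new version of oid, then propagate along FOLLOW edges."""
        old = self.latest(oid)
        new_attrs = dict(self.versions[oid][old])
        new_attrs.update(changes)
        self.versions[oid].append(new_attrs)
        new = old + 1
        for src in list(self.versions):
            if src == oid:
                continue
            attrs = self.versions[src][self.latest(src)]
            # FOLLOW references to the version that was just superseded.
            moved = {name: Ref(oid, new, FOLLOW)
                     for name, ref in attrs.items()
                     if isinstance(ref, Ref) and ref.kind == FOLLOW
                     and ref.target == oid and ref.version == old}
            if moved:
                # Recursion terminates only if FOLLOW edges are acyclic.
                self.update(src, moved)
        return new


store = Store()
store.create("para1", {"text": "Hello"})
store.create("doc1", {"body": Ref("para1", 0, FOLLOW)})
store.create("vote1", {"on": Ref("doc1", 0, PIN), "value": 1})
store.update("para1", {"text": "Hello, world"})
```

After the update, para1 and doc1 each have two versions, and the new doc1 version refers to the new para1 version. vote1 is unchanged and still refers to version 0 of doc1.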