[darcs-devel] confused questions about versioning structured data
Ganesh Sittampalam
ganesh at earth.li
Fri Aug 23 06:38:15 UTC 2013
Hi Matthias,
Some general thoughts which might provoke further discussion:
- Using a proper VCS seems like the right thing to do, particularly if you
want a history that's branchable and mergeable.
- Darcs lib is still somewhat immature, but it should give you a
programmatic interface to creating and commuting/merging patches if
that's what you want: http://darcs.net/Library
- Tagging every patch is the way I would personally recommend for
identifying specific versions. I don't know of any specific performance
problems that would create.
- Pretty printing to get a canonical JSON string would work, but one thing
you would need to think about is whether diffs of those strings would
merge nicely with each other.
- Darcs won't help you much with your external rules about changes
causing other things to change, except that you might encode them with
explicit Darcs patch dependencies.
- You could create your own patch types - it's "just" a matter of
implementing various type classes. That would be more up-front work to
encode your system but might get you something with nicer commute and
merge behaviour.
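To make the pretty-printing point a bit more concrete, here is a minimal
sketch in Python (since your application is written in it). The function
names and the sample objects are made up for illustration and this is not
anything Darcs itself provides. It renders objects with sorted keys and one
entry per line, so that a line-based diff has a stable layout to work on:

```python
import json


def canonical_json(obj):
    """Render obj deterministically: sorted keys, one entry per line."""
    return json.dumps(obj, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


def changed_lines(old, new):
    """Lines that appear in only one of two canonical renderings."""
    return sorted(set(old.splitlines()) ^ set(new.splitlines()))


# Hypothetical content objects; the key order differs but the data is equal.
a = {"title": "Intro", "votes": 3, "paragraphs": ["p1", "p2"]}
b = {"paragraphs": ["p1", "p2"], "votes": 3, "title": "Intro"}

# Equal data gives byte-identical text, so no spurious patches are recorded.
assert canonical_json(a) == canonical_json(b)

# Adding a key that sorts last also rewrites the previous line, because
# that line gains a trailing comma.
before = canonical_json({"title": "Intro"})
after = canonical_json({"title": "Intro", "votes": 3})
for line in changed_lines(before, after):
    print(repr(line))
```

The trailing-comma effect in the last step is the kind of thing I mean by
merging nicely: two independent changes that each add a key can end up
touching the same line of text, so they may conflict in a line-based diff
even though they are logically unrelated.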
Hope this is of some help!
Cheers,
Ganesh
On 19/08/2013 16:52, Matthias Fischmann wrote:
>
> Dear darcs community,
>
> I had a discussion in a software project that needs versioning on
> structured data (think json objects). So far, my favorite solution is
> hacking together a restful repository server using darcs and snap, but
> we are still struggling with understanding what we actually need.
> This is a summary of the status of the discussion and some open
> questions, in the hope that at least some of it makes sense. Thanks
> for listening! :-)
>
> The data we are dealing with is content objects of a web application
> written in Python. There are documents, paragraphs, comments, users,
> groups, authorizations, votes, collections of votes and a lot more.
>
> Objects evolve along patch trees. Also, they are associated with each
> other in a graph with different edge types (a document consists of
> several paragraphs; a vote is made by a user on a document). The type
> of an edge decides what the source will do if the target grows a new
> version (if a user votes on a document and the document is updated,
> the vote relates to the old version; if a paragraph changes in a
> document, the document grows a new version in parallel with the
> paragraph).
>
> The application will visualize patches, version trees, and arbitrary
> versions of objects. (We have considered allowing for patches of
> patches, but have abandoned the idea as unnecessarily complex.)
>
> Now, how would you do this using darcs as a library?
>
> What would I have to do to process patches on sets of json objects
> rather than file trees?
>
> What are the drawbacks of using a pretty-printer to get a canonical
> string representation of json objects, write them all to files, and
> use darcs to version-control the files? (It seems like a weird idea
> to me, but I can only think of performance reasons not to do it.)
>
> I have rules like this: "If attribute X in object A changes, then the
> object referenced in attribute Y of A must get a new version in which
> it refers to the new version of A." (This implies that the contents
> of json objects are aware of darcs patches.) Does darcs offer any
> tools to implement rules like this, or do I have to do this by hand,
> before I present new versions or patches to darcs lib?
>
> To extract arbitrary versions (instead of patches), I would need to
> either retrieve all patches by timestamp filter, or tag every patch.
> Is either of the two a good idea? Did I miss a better one?
>
> If every patch comes with a tag that makes the associated state
> pullable, would that be a performance issue, or would it actually
> mitigate potential performance issues on large patch sets? My
> uninformed guess is that the situations where complexity bites you
> with darcs involve large, unordered sub-sets of patches.
>
> In general, do you have any opinion (emotional or rational) whether
> darcslib+snap is right for my use case? The alternative would be to
> use an object database with linear version history (most likely ZODB)
> and implement additional version control features on top of that.
>
> Looking forward to your feedback,
> cheers,
> matthias