[darcs-users] Re: Whitespace in filenames

Zack Brown zbrown at tumblerings.org
Sun Aug 3 21:18:52 UTC 2003


On Sun, Aug 03, 2003 at 03:10:43PM -0400, David Roundy wrote:
> On Sun, Aug 03, 2003 at 11:40:08AM -0700, Zack Brown wrote:
> > On Sat, Aug 02, 2003 at 06:12:59AM -0400, David Roundy wrote:
> > > Well, there's a problem in that I don't know that it's possible or
> > > practical to check against the conventions of the filesystem you're pulling
> > > to.  Among other things, the repository might span more than one
> > > filesystem, each of which has different filename restrictions.  And on the
> > > principle of letting users do whatever they want, that should be ok.  So I
> > > don't see myself trying to figure out what the filename restrictions of a
> > > given filesystem are to check on pull.  Patches to do this would be
> > > welcome, but it doesn't interest me, and I don't see how it can guarantee
> > > that the patch will apply properly (that is, that it will write to the
> > > desired file).
> > 
> > How about this: try to create the file, then check to see if it was
> > properly created. If not, you know there's an incompatibility, and you
> > can punt to the user. That way you don't need to know exactly which
> > filesystem you're dealing with, but you still catch all violations.
> 
> The problem is that you can only do that by actually modifying the
> directory, which is well after I'd like to do my checking.  I like being
> able to do my check phase before touching the repo, so if there is a
> problem there is no chance of the repo being corrupted.  If you create
> files in the repo while still checking, then if darcs crashes (the power
> dies, or whatever) before you get a chance to delete the files, you've got
> a corrupt repo.  It would be nice to keep the window of potential
> corruption as small as possible.  Also, you'd have to create all the
> potential test files before deleting any of them, since two files in the
> same patch may conflict with one another.

I can see why you'd want to do the checking before actually modifying the
filesystem, and I agree about the corruption dangers. But just to follow
the train of thought a little farther:

Would you be able to avoid the corruption problem above by keeping a
metafile that lists the test files that still need to be deleted? Then,
assuming the check phase has to occur after writing, darcs would be able
to recognize and recover from any system crash.
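
Here is a rough sketch of the metafile idea, in Python rather than the
Haskell darcs is written in, just to make it concrete. Everything in it is
made up for illustration: the journal name `_probe_journal`, the functions
`begin_probe` and `recover`, and the JSON format are my assumptions, not
anything darcs has. The journal is written and synced before any test file
is created, so after a crash the list of possible leftovers is already on
disk.

```python
import json
import os

JOURNAL = "_probe_journal"  # hypothetical metafile name, not a real darcs file


def begin_probe(repo_dir, names):
    """Journal the test-file names durably, then create the test files.

    Returns the names that were journaled.
    """
    # Only journal names that do not exist yet, so that recovery can never
    # delete a file that was already in the working directory.
    fresh = [n for n in names
             if not os.path.lexists(os.path.join(repo_dir, n))]

    journal = os.path.join(repo_dir, JOURNAL)
    with open(journal + ".tmp", "w", encoding="utf-8") as f:
        json.dump(fresh, f)
        f.flush()
        os.fsync(f.fileno())  # force the list to disk before creating anything
    os.replace(journal + ".tmp", journal)  # rename into place in one step

    for name in fresh:
        try:
            # "x" fails instead of overwriting if the name already exists.
            with open(os.path.join(repo_dir, name), "x"):
                pass
        except OSError:
            pass  # the filesystem refused this name; nothing to clean up
    return fresh


def recover(repo_dir):
    """Delete test files listed in a leftover journal, then the journal.

    Returns the names that were actually removed.
    """
    journal = os.path.join(repo_dir, JOURNAL)
    try:
        with open(journal, encoding="utf-8") as f:
            names = json.load(f)
    except FileNotFoundError:
        return []  # no journal means no interrupted check

    removed = []
    for name in names:
        path = os.path.join(repo_dir, name)
        if os.path.lexists(path):
            os.remove(path)
            removed.append(name)
    os.remove(journal)  # the journal goes last, once every test file is gone
    return removed
```

The normal path would call `begin_probe`, do its checks, and then call
`recover` to clean up; after a crash, running `recover` at startup does the
same cleanup. A real version would probably also have to sync the directory
itself, which this sketch skips.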

The potential justification for this is that it's an easy way to deal with
many different filesystems. The alternative is to have darcs understand
the limitations of all filesystems. That seems like a big can o' worms,
so if there's a way to just identify when a violation has occurred, while
protecting the repo from corruption, it might be a good thing, at least until
some insane hacker dude decides to add darcs support for tons of filesystems.
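
To show what I mean by "just identify when a violation has occurred", here
is a minimal Python sketch of the create-then-check idea. The function name
`find_violations` is hypothetical, and it assumes plain filenames in a
single directory. It creates every file before deleting any of them, as you
pointed out it must, so that two names in the same patch that land on the
same file are caught.

```python
import os


def find_violations(directory, names):
    """Create every name, then report names the filesystem rejected or altered.

    Returns a dict mapping each offending name to a short reason.
    """
    created = []
    violations = {}
    try:
        for name in names:
            path = os.path.join(directory, name)
            try:
                # "x" refuses to reuse an existing file, so a second name
                # that maps onto an earlier one shows up as an error here.
                with open(path, "x"):
                    pass
                created.append(path)
            except OSError as err:
                violations[name] = err.strerror or "could not create"

        # Compare what was requested with what the directory actually lists.
        listed = set(os.listdir(directory))
        for name in names:
            if name not in violations and name not in listed:
                violations[name] = "stored under a different name"
    finally:
        # Delete only after every name has been tried.
        for path in created:
            if os.path.lexists(path):
                os.remove(path)
    return violations
```

For example, `find_violations(some_dir, ["Makefile", "makefile"])` should
come back empty on a case-sensitive filesystem and flag the second name on
a case-insensitive one, without darcs having to know which kind it is on.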

> > OK, let's say that files that existed once-upon-a-time in the repository
> > are called old files. Do old files still have an impact on the actual
> > files in a current repository? i.e. will an old file cause problems
> > because of files that actually exist in a current repository, or because
> > of problems that would only exist if someone tried to recreate the
> > earlier version of the repository that contained the old file?
> 
> It would mean that darcs get couldn't retrieve that repository on any
> platform that has the problem with an old filename.  This could be worked
> around by writing a version of get that doesn't require that the repository
> be consistent.

But aside from that workaround, *why* would the problem occur? I'm confused. If
the old file only existed as part of the history of the project's development,
but not as any file actually on disk, would it still be impossible to retrieve
the repository on that platform? And if so, why?

Be well,
Zack

> 
> One way to do this would be to implement a sort of checkpointing-like
> scheme in which we would store (optionally and perhaps only occasionally)
> "snapshots" of the repository at tags.  This would allow doing a darcs get
> without downloading the entire repository history, which would be nice,
> since I don't care for the idea that get is an O(n) process where n is the
> age of the repository.  Actually, I think that it is likely that eventually
> I'll get around to implementing this idea, which could be used to
> effectively throw away old repository history in a nice controlled manner.
> 
> I had originally been thinking that I'd want to implement this by storing a
> snapshot tarball, but after my recent testing with large repos I think I
> can store the snapshot as one big patch, which is much nicer, since it
> means that it can store any data that darcs is made to support (and no
> more--which is as it should be).  I've been looking into using zlib to
> compress patches, which would make the storing and transferring of large
> snapshot patches somewhat less painful...
> 
> But for the moment, the painful situation is that a repository with an
> invalid filename anywhere in its history cannot effectively be used.  You
> could hack around this, for example by copying the repository manually,
> but then you wouldn't be able to use check to see if you had done so
> correctly (since you'd still have a corrupt repo).
> -- 
> David Roundy
> http://www.abridgegame.org
> 

-- 
Zack Brown



