[darcs-users] Bitkeeper and Eclipse questions
Sean Perry
shaleh at speakeasy.net
Sun Mar 20 17:59:04 UTC 2005
Juliusz Chroboczek wrote:
>>bk wins for having nice merge tools, revision history tools and
>>trigger / script support. In general bk has nice tools (gui or not)
>>for getting at a lot of information.
>
>
> Sean,
>
> Could you please expand on that (either here or on the Wiki)? As you
> are surely aware, the Darcs developers cannot legally play with the
> free version of BK, and hence we might not know what's missing.
>
Even if you can not use bk, you can read their docs. (-:
Let me give an idea of what my work scenario was like. Then we can
discuss what this can mean for darcs.
Background. The repos have revision history from now to circa 2001 in
bk. The section I dealt with most was a C++ based control daemon
weighing in around 20 - 50 kloc. 20+ programmers had come and gone, most
had pushed changes to this repo (let's call it MAIN). A *LOT* of churn
had occurred. People had added entire source directories from other
projects to MAIN and then found out it was a bad idea or it was no
longer used. For instance, someone added a GPL library so it had to be
removed. A clone of MAIN was several gigs in size. Of course, it also
contained the entire 7.2 Postgres source tree as well, shrug.
We had a scripted system which:
* prevented pushes to the tree unless your user name was in a permissions
file. This was used once the product hit alpha to control who could
change the tree.
* propagated a change pushed to version 1.0 of the tree on to 1.1, 1.5,
2.0, etc. as defined by a propagation file. If a merge broke, an email
went out and the person who pushed the code and the tree maintainer
were in charge of triage.
* automatically sent out change notification emails.
* automatically scheduled a fresh checkout for the build system and
notified the build system that a new checkout was in the queue. The
queue, build and test system was 100% automated and worked well.
Unfortunately there were no unit tests; this was all black box testing.
* read changelog messages and updated bugzilla if there was a Fixes: #XXX
entry.
* read the changelog and refused a commit without a "ReviewedBy" field.
I feel like I am missing something, but this should give you the idea.
Scripting is fundamental to automation. Without the automation the
product would not have made it anywhere near as far as it has.
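As a rough illustration of the kind of checks listed above, here is a minimal Python sketch. It is not the actual trigger code, which is not shown here. The function names, the in-memory form of the permissions and propagation files, and the exact "Fixes:" / "ReviewedBy:" line formats are all assumptions made for the example.

```python
import re

# Assumed changelog conventions: one "Fixes: #<number>" line per bug and a
# "ReviewedBy: <name>" line. The real formats may have differed.
FIXES_RE = re.compile(r"^Fixes:[ \t]*#(\d+)[ \t]*$", re.MULTILINE)
REVIEWED_RE = re.compile(r"^ReviewedBy:[ \t]*(\S.*)$", re.MULTILINE)


def propagation_targets(propagation, version):
    """Return the trees a push to `version` should be merged into, in order.

    `propagation` is a hypothetical dict standing in for the propagation file.
    """
    return propagation.get(version, [])


def check_push(user, changelog, allowed_users):
    """Return (ok, reason, bug_ids) for a proposed push.

    `allowed_users` stands in for the contents of the permissions file.
    """
    if user not in allowed_users:
        return False, f"{user} is not in the permissions file", []
    if not REVIEWED_RE.search(changelog):
        return False, "changelog has no ReviewedBy field", []
    # Bug numbers a later step could use to update the bug tracker.
    bug_ids = [int(n) for n in FIXES_RE.findall(changelog)]
    return True, "ok", bug_ids
```

A trigger script would call check_push before accepting a push. On success it would walk propagation_targets to merge the change forward, and mail the pusher and tree maintainer if a merge failed.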
Notifications via email, irc bots, IM, etc. are also key. We had
developers in different states here in the US as well as in India. So it
was not as easy as walking over to someone's cube to find out information.
Because of the large amount of history, deleted files, etc. clones are
slow and large. This is where a distributed system shows its weakness. bk
until recently did not have a good way to cut off this detritus. Even
with BitMover's help in cleaning up the tree, handling the merge
from versionA -> versionB -> versionC is going to take work so it keeps
getting pushed further away. Disk may be cheap, but not when you have
20+ people who average 40+GB home directories. The home directories were
managed on a SAN but adding new space is not easy or cheap when you
reach the terabyte level. "Cheap" is also relative. When a company
refuses to spend money in a particular quarter on infrastructure nothing
is cheap enough.
Revision history. Ok, bk uses ugly, 1995 style tk apps. However, they
work and work well. bk revtool launches a GUI browser which shows you
the entire repo's history as multiple timelines, so branches, multiple
users, etc. are all represented. You can ask for the difference between
node N and node M. You can ask for revision X.Y.Z. When looking at an
individual file, you get something analogous to what Mr. Schwern is
asking for on the annotate thread. Really handy during a bug hunt.
Merging. bk did a really good job of dealing with conflicts. When an
issue arose, bk mergetool was a real help. For each file that a conflict
occurred in, you received a choice:
* keep the local version
* take the new version
* run diff
* exit to shell
* perform a 2 way merge using the GUI difftool
* perform a 3 way merge using the GUI difftool
* (one or two others I am forgetting)
The gui tools let you cherry pick diffs from both files as well as hand
edit the resulting file. Their UI was obviously made by a coder and took
quite a bit of learning. But once you became used to them they were very
useful.
Now there is one place where bk did a horrible job at merges. If I
added a function to the bottom of a file and someone else added a
function to the bottom of the same file, bk would not realize that the
changes could stack:
    A
    B
or even
    B
    A
so it caused a merge conflict which 99.9% of the time involved you
saying "take my change, then take their change, commit".
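The "both sides appended to the end" case can be detected mechanically. The Python sketch below shows one way to do it. It is an illustration only, not how bk or darcs actually merge. The function name stack_appends is hypothetical, and files are modelled as plain lists of lines.

```python
def stack_appends(base, mine, theirs):
    """Merge two versions that each only appended lines to `base`.

    Returns the merged list of lines, or None when either side did more
    than append, in which case a real merge tool is still needed.
    """
    n = len(base)
    # Each side must start with the unchanged base for this shortcut to apply.
    if mine[:n] != base or theirs[:n] != base:
        return None
    # "Take my change, then take their change."
    return base + mine[n:] + theirs[n:]
```

Putting my lines first is an arbitrary choice. The opposite order (B then A) is equally valid for independent functions, which is why a tool could pick one without asking.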
All is not roses with bk. But it is the only system I have seen keep up
with the pain we put it through. subversion looks like it is getting
close but the binary db aspect is worrying. If things go really, really
wrong bk uses SCCS and you can find a way to fix it. Last I heard the
svn people were developing a simple, file based system.
The GUIs are ugly and look worse on Windows. *BUT* they do run on
Windows, Unix with the X Window System, and Mac OS X.
Workflow. You have to: bk edit, <make changes>, bk checkin, bk commit,
bk push. The bk edit part makes applying patches a bit of a pain.
However, bk unedit returns the file to the state it was in before you ran
bk edit so it made quick testing of ideas easy and safe. I suppose you
could do the same thing in darcs with undo / unrecord, etc. checkin
takes a changelog message. commit wraps up individual changes into a
changeset and adds a changelog message to the changeset. While handy,
you have to always remember that bk wants changesets, not patches and
not file revisions.
bk also makes it hard to generate a patch once you have checked in a
revision (analogue of darcs' record). Until the 3.0 series generating a
patch after commit was REALLY a daunting task because you had to tell bk
which file revision to start from. Finding the right revision number is
often not a trivial task.
Summary:
bk is robust and dependable. It handles really, really big repos. The
GUIs work and really make certain jobs easy. Scripting and automation
are vital in groups bigger than a few people especially with distributed
programmers.
darcs *MUST* have a way to clean out the revision tree. There should be
a way to get rid of 4 year old garbage. I am not saying darcs can not do
it today, maybe it can. If it can, this ability should be very clearly
spelled out in the docs.
bk's GUIs are possible because some of the functionality lives in
libraries that the CLI and GUIs share. Simply wrapping shell programs in
GUIs is a recipe for pain, ask the arch people. I believe this means
that either we move some of the darcs core to a C library or all of the
GUI work will need to occur in Haskell perhaps using wxHaskell.
I hope this helps. If I have not sufficiently explained something,
please let me know.