[mnet-devel] Subversion vs DARCS (was: Moving sf.net CVS to cryptomonkey.net Subversion)

Zooko zooko at zooko.com
Sun Jul 6 16:00:46 BST 2003


[This message has four sections and a conclusion.]
[Myers wrote the lines prepended with "> ".  I wrote the lines prepended 
with "> > ".]
[Cc:'ing the darcs mailing list.]


Section 1: sysadmin evaluation of Subversion

> As a sysadmin I'm *really* happy with subversion.

Good to know!  I think Subversion is less stable than CVS, and that could give 
us problems, but it might be worth the risk of using Subversion instead of CVS 
in order to have convenience and security for the server admin.


Section 2: DARCS transport, frequency of merge conflicts

> DARCS looks interesting.  I can't tell from the docs
> 
> http://abridgegame.org/darcs/manual/node3.html#SECTION00330000000000000000
> 
> if you can commit changes via HTTP, or only via email.  Only via email is
> bad if your outgoing mailserver takes forever to deliver messages.  You could
> be waiting for your email to go thru while some conflicting patch gets
> committed first.

I don't think this would be a problem.  Real merge conflicts are rare, and 
would be even rarer using DARCS.  For example, DARCS has a "token replacement" 
kind of patch in addition to the normal "hunk was changed" kind of patch, so 
if I rename some variables at the same time as you are changing the way those 
variables are used, it won't conflict.
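
As a rough illustration of why a "token replacement" patch merges where a 
line-based patch does not, here is a toy Python sketch.  It is not how DARCS 
implements patches; the function names and the two-line example file are 
made up for illustration.

```python
import re

def token_replace(lines, old, new):
    """Toy "token replacement" patch: rename whole-word `old` to `new` everywhere."""
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    return [pattern.sub(new, line) for line in lines]

def replace_line(lines, index, expected, replacement):
    """Toy "hunk" patch: replace one line, refusing if it no longer matches."""
    if lines[index] != expected:
        raise ValueError("conflict at line %d" % index)
    return lines[:index] + [replacement] + lines[index + 1:]

def rename_as_hunks(lines):
    """The same rename, but expressed as ordinary line-based hunks."""
    lines = replace_line(lines, 0, "count = 0", "total = 0")
    return replace_line(lines, 1, "print(count)", "print(total)")

base = ["count = 0", "print(count)"]

# Your change: alter how the variable is used.
yours = replace_line(base, 1, "print(count)", "print(count + 1)")

# My rename as a token replacement applies on top of your change.
merged = token_replace(yours, "count", "total")

# My rename as plain hunks hits a line you already changed.
try:
    rename_as_hunks(yours)
    hunk_conflict = False
except ValueError:
    hunk_conflict = True
```

In this toy model the token replacement carries the intent ("rename count to 
total") rather than the affected lines, so it still applies after the other 
change has touched one of those lines.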

And when there *is* a merge conflict, it isn't a big disaster if the patch 
that got committed first is the one that reaches the repository second, 
because we're going to merge the two patches so that they are both accepted, 
and the order in which they arrive makes no difference to whether they will 
merge correctly or how they will be merged.  (But it might make a difference 
to *who* merges them.  :-))

But if it *were* a problem, I think it wouldn't be too hard to switch the 
transport from e-mail to scp or something.


Section 3: merging branches

> > Subversion is the only "CVS successor" listed in [1] that *doesn't* solve the 
> > biggest problem with CVS: that when you merge branch A onto HEAD, then make 
> > more changes on branch A, and then merge it onto HEAD again, it tries to merge 
> > all the parts of branch A that it *already* merged, causing all sorts of 
> > spurious changes and unnecessary merge conflicts.
> 
> Why wouldn't you merge the branch, and then start another? branch-2.0 or
> something?  Would that take care of the problem?

Hm.  That's a good point.  It would be a bit tedious, and people who wanted to 
find the current version of the branch would have to look at all the tags and 
find the one named "ent-X" with the highest value for X.  And if two people 
were working on a branch, I think they would both have to switch to the new 
branch-X simultaneously.  Still, it might work for our needs.

However, it would *not* work if we weren't willing to merge both branches onto 
each other every time we merged!

For example, suppose that you were working on branch_twisted while I was 
working on branch_ent, neither of which was the HEAD branch.

Now suppose that I want to get the current Twisted integration merged onto 
branch_ent, but you do *not* want the current ent hackery merged onto 
branch_twisted.  

With CVS or Subversion, I think the best I can do in this case is manually 
make a tag and write it down on a piece of paper taped to the wall whenever 
I merge branch_twisted onto branch_ent.  Then the next time I want to merge 
the *new* parts of branch_twisted onto branch_ent, I make a new tag, tell the 
revision control system to merge branch_twisted [from old_tag to new_tag], and 
write the new tag on the piece of paper on the wall.

Bram Cohen and I did exactly that (using CVS) at Signet Assurance Corporation 
in 1999.  It was a major hassle and we made mistakes which caused spurious 
merge conflicts that screwed up our code.  (Around that time we went out for 
coffee and had discussions about newfangled revision control systems, and some 
years later Bram extended and reified those ideas into Codeville.)

Bottom line: we *could* do it with CVS or Subversion, using either the "paper 
taped to wall" technique or the "new branch every time you merge" technique. 
(The latter technique is available only if you are willing to merge both branches 
onto each other every time you merge.)  But hopefully one of the other systems 
would make it easier and safer.

All of the systems listed in [1] other than CVS and Subversion offer 
"merge-history tracking", which directly addresses this problem by having the 
tool implement the "paper taped to the wall" technique for you, behind the 
scenes.
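
To make "merge-history tracking" concrete, here is a minimal Python sketch of 
the bookkeeping involved.  It is a toy model, not the implementation of any of 
the systems in [1]: a branch is just a list of change names, and the Repo 
class and its method names are hypothetical.

```python
class Repo:
    """Toy repository: each branch is a list of change names in commit order."""

    def __init__(self):
        self.branches = {}
        # The "paper on the wall": for each (source, target) pair, how many
        # of source's changes have already been merged onto target.
        self.last_merged = {}

    def commit(self, branch, change):
        self.branches.setdefault(branch, []).append(change)

    def merge(self, source, target):
        """Merge only the changes on `source` that `target` has not seen yet."""
        start = self.last_merged.get((source, target), 0)
        new_changes = self.branches[source][start:]
        self.branches.setdefault(target, []).extend(new_changes)
        self.last_merged[(source, target)] = len(self.branches[source])
        return new_changes

repo = Repo()
repo.commit("branch_twisted", "t1")
repo.commit("branch_ent", "e1")

first = repo.merge("branch_twisted", "branch_ent")   # merges t1
repo.commit("branch_twisted", "t2")
second = repo.merge("branch_twisted", "branch_ent")  # merges only t2, not t1 again
```

Note that the merge is one-directional: branch_twisted never receives the ent 
changes, which is the case that the "new branch every time you merge" 
technique cannot handle.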

But DARCS goes another step further than the others...


Section 4: why DARCS is speeecial

The magical features of DARCS are enabled by the fact that it doesn't consider 
a file to be a sequence of bytes, but instead to be a collection of all of the 
patches that have ever been applied to that file.

Section 4.a: conflict management at the level of patches instead of branches

Therefore, if there is a merge conflict, DARCS doesn't say "this branch 
changed line 200 in a different way than that branch changed line 200".  It 
says "this patch from this branch conflicts with that patch from that branch", 
even if the patches were committed many days ago onto their respective branches.

Basically, DARCS shows you the conflict at the level of patches (which 
specific patch from branch A conflicted with which specific patch from 
branch B), where Subversion shows you the conflict at the level of branches 
(treating all of the patches that have gone into branch A as a single huge 
patch, it conflicts with the single huge patch made up of all of the patches 
that have gone into branch B).
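
A toy Python sketch of the difference, assuming (purely for illustration) that 
each patch is a name plus the set of line numbers it touches.  Real DARCS 
patches are richer than this, and the patch names below are invented.

```python
from itertools import product

def conflicting_pairs(branch_a, branch_b):
    """Return the (patch on A, patch on B) name pairs whose touched lines overlap."""
    return [
        (name_a, name_b)
        for (name_a, lines_a), (name_b, lines_b) in product(branch_a, branch_b)
        if lines_a & lines_b
    ]

def branches_conflict(branch_a, branch_b):
    """Branch-level view: lump each branch into one huge patch and compare."""
    touched_a = set().union(*(lines for _, lines in branch_a))
    touched_b = set().union(*(lines for _, lines in branch_b))
    return bool(touched_a & touched_b)

branch_a = [("fix typo", {12}), ("rework parser", {200, 201})]
branch_b = [("add logging", {57}), ("tune timeout", {200})]

pairs = conflicting_pairs(branch_a, branch_b)
whole = branches_conflict(branch_a, branch_b)
```

The branch-level check can only say that the two branches collide somewhere; 
the patch-level check names the two patches responsible.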

Section 4.b: cherry-picking of patches

Furthermore, DARCS allows you to "cherry-pick" patches.  Suppose you are 
reading the mnet-cvs list, and you see a bugfix patch go by that you think 
should be applied to your branch.  You try to apply it.  With any of 
CVS, Subversion, or DARCS, if the patch applies cleanly then you are done.  
(Although as I mentioned DARCS has a more nuanced notion of diffs which 
allows more patches to apply cleanly than does Subversion.)

With any of CVS, Subversion, or DARCS, if the patch *doesn't* apply cleanly, 
then you get an error message.  With CVS and Subversion, the error message is 
based on the text.  It is the familiar "merge conflict" error message that 
we've all seen with CVS -- the one that shows the two different changes 
separated by "<<<<<<<" and ">>>>>>>".

With DARCS, the error message is based on the patch history.  It says "this 
bugfix patch that you pulled can't be applied to your branch, because it 
depends on the XYZ refactoring patch, which was applied to the other branch 
but not to your branch".

(See this e-mail [2] for a concrete example.)

Now with DARCS you have the choice of:

(a) forgetting about it.  Perhaps learning that it depends on the XYZ 
    refactoring patch shows you that you don't even *have* that bug on your 
    branch after all.

(b) manually merging the bugfix onto your branch.  This is easier with DARCS 
    since you have the "logical" information about which refactoring patch 
    caused the conflict, rather than just the "textual" information about 
    which lines have changed.

(c) pulling the refactoring patch in addition to the bugfix patch.
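
The dependency check and option (c) can be sketched in Python as follows.  
This is a toy model under the assumption that each patch simply lists the 
patches it depends on; the function names, the exception, and the patch names 
are hypothetical and are not DARCS commands or DARCS output.

```python
class MissingDependency(Exception):
    pass

def cherry_pick(branch, patch, depends_on):
    """Add `patch` to `branch` (a set of patch names), or say what is missing."""
    missing = sorted(set(depends_on.get(patch, ())) - branch)
    if missing:
        raise MissingDependency("%s depends on %s" % (patch, ", ".join(missing)))
    branch.add(patch)

def pull_with_dependencies(branch, patch, depends_on):
    """Option (c): pull the patch together with everything it depends on."""
    for dep in depends_on.get(patch, ()):
        if dep not in branch:
            pull_with_dependencies(branch, dep, depends_on)
    branch.add(patch)

depends_on = {"bugfix": ["xyz-refactoring"]}
my_branch = {"initial-import"}

# Cherry-picking the bugfix alone fails with a "logical" explanation.
try:
    cherry_pick(my_branch, "bugfix", depends_on)
    error = None
except MissingDependency as exc:
    error = str(exc)

# Option (c): take the refactoring patch along with the bugfix.
pull_with_dependencies(my_branch, "bugfix", depends_on)
```

Options (a) and (b) need no tool support beyond the error message itself: the 
message tells you which patch to go and read.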



Conclusion:

Bottom line: DARCS promises some delicious features that no other Free 
Software revision control system has even conceived of, as far as I am 
aware [3].  However, I haven't tried it yet, and there could well be hassles 
of administration, UI, or stability that would render it unusable.  I want to 
find out.

Our current needs for sophisticated branching and merging are high.  We want 
to implement disruptive new changes without disrupting the HEAD branch, and we 
want to be sure that those changes can be merged onto the HEAD branch once 
they are ready.  HiveCache wants to maintain its own ultra-top-secret patches, 
and they want to cherry-pick bugfixes, or perhaps track HEAD, or perhaps track 
a stable branch instead of HEAD.  Nobody involved has the manpower or the 
communication bandwidth to deal with Hassle, such as the "paper taped to the 
wall" approach.

The Mnet project has already suffered considerably when these needs weren't 
met.  A better tool isn't a silver bullet of course, but if the tool makes 
these tasks just a little bit easier, it could have a very large beneficial 
effect on our ability to cooperate and move the project forward.


Regards,

Zooko

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links

> > [1] http://zooko.com/revision_control_quick_ref.html
[2] http://www.abridgegame.org/pipermail/darcs-users/2003/000146.html
[3] Perhaps ClearCase and such commercial systems that sell for $10,000 per 
    seat license have also conceived of this idea.  I wouldn't know.

