From: Chris Shoemaker <c.shoemaker@cox.net>
To: Keith Packard <keithp@keithp.com>
Cc: Linus Torvalds <torvalds@osdl.org>,
David Mansfield <centos@dm.cobite.com>,
David Mansfield <cvsps@dm.cobite.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: Fix branch ancestry calculation
Date: Fri, 24 Mar 2006 20:45:32 -0500
Message-ID: <20060325014532.GB32522@pe.Belkin>
In-Reply-To: <1143218338.6850.68.camel@neko.keithp.com>

On Fri, Mar 24, 2006 at 08:38:58AM -0800, Keith Packard wrote:
> On Fri, 2006-03-24 at 07:46 -0800, Linus Torvalds wrote:
> >
> > On Fri, 24 Mar 2006, David Mansfield wrote:
> > >
> > > Anyway, I'd like to nail down some of the other nagging ancestry/branch point
> > > problems if possible.
> >
> > What I considered doing was to just ignore the branch ancestry that cvsps
> > gives us, and instead use whatever branch that is closest (ie generates
> > the minimal diff). That's really wrong too (the data just _has_ to be in
> > CVS somehow), but I just don't know how CVS handles branches, and it's how
> > we'd have to do merges if we were to ever support them (since afaik, the
> > merge-back information simply doesn't exists in CVS).
>
> cvsps is more of a problem than cvs itself. Per-file branch information
> is readily available in the ,v files; each version has a list of
> branches from that version, and there are even tags marking the names of
> them. One issue that I've discovered is when files have differing branch
> structure in the same repository. That happens when a branch is created
> while files are checked out on different branches. I'm not quite sure
> what to do in this case; I've been trying several approaches and none
> seem optimal. One remaining plan is to just attach such branches by
> date, but that assumes that the first commit along a branch occurs
> shortly after the branch is created (which isn't required).
>
> Of course, this branch information is only created when a change is made
> to the file along said branch, so most of the repository will lack
> precise branch information for each branch. When you create a child
> branch, the files with no commits in the parent branch will never get
> branch information, so the child branch will be numbered as if it were a
> branch off of the grandparent. Globally, it is possible to reconstruct
> the entire branch structure.

If that last sentence was a typo, then you already know this, but
otherwise you may be disappointed to learn that it's not _always_
possible to discern the correct ancestry tree.

The simplest counter-example is two branches where each adds one file
and no files in common are modified. If A and B both branched off of
HEAD and each adds one file, then each should contain only its own
new file. But if B branched from A, which branched from HEAD, then B
should also have the file that was added to A. (*) However, the
information to distinguish these two cases isn't recorded in CVS.
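
A minimal sketch of this counter-example follows. It is a toy model and
not how CVS stores anything: the function name, the file names, and the
simplification that a branch inherits every file added on its ancestors
are all assumptions made for illustration. The per-branch record of
added files is identical in both cases; only the assumed parent map
differs, and with it the contents of B.

```python
# Toy model of the counter-example. All names here are hypothetical.
def files_on_branch(branch, parent, added):
    """Files visible on a branch: everything added on it or on any ancestor."""
    files = set()
    while branch is not None:
        files |= added.get(branch, set())
        branch = parent.get(branch)
    return files

# What was committed where. This is the same under both hypotheses.
added = {"HEAD": {"common.c"}, "A": {"a.c"}, "B": {"b.c"}}

# Hypothesis 1: A and B both branched off of HEAD.
siblings = {"HEAD": None, "A": "HEAD", "B": "HEAD"}
# Hypothesis 2: B branched from A, which branched from HEAD.
chain = {"HEAD": None, "A": "HEAD", "B": "A"}

print(sorted(files_on_branch("B", siblings, added)))  # ['b.c', 'common.c']
print(sorted(files_on_branch("B", chain, added)))     # ['a.c', 'b.c', 'common.c']
```

Nothing in `added` says which parent map is the right one, which is the
ambiguity described above.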

I seem to have described this example more fully in the notes I took
while writing the patch to cvsps that does the global inference
you're describing. You _usually_ can make a very good guess, and the
more files that are modified, the better you can do.
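
One way such a global guess might be made is sketched below. This is
not the algorithm used by the cvsps patch, which is not shown in this
message; it only illustrates the idea. It assumes each file reports an
"apparent" parent for a branch, that a file with no commits on the true
parent reports a more distant ancestor instead (as described in the
quoted text above), and that the evidence contains no cycles. The name
`infer_parents` and the shape of `evidence` are hypothetical.

```python
# Hypothetical sketch: choose the deepest apparent parent seen in any file.
def infer_parents(evidence, root="HEAD"):
    """evidence maps each branch to the set of apparent parents seen across files.

    Assumes the evidence is acyclic and every candidate is the root or a key.
    """
    parent = {}

    def depth(branch):
        # Distance from the root along the inferred parent links.
        if branch == root:
            return 0
        return 1 + depth(resolve(branch))

    def resolve(branch):
        # The deepest candidate is taken as the most precise evidence.
        if branch not in parent:
            parent[branch] = max(evidence[branch], key=depth)
        return parent[branch]

    for branch in evidence:
        resolve(branch)
    return parent

# Some file was modified on A before B branched, so it reports B's parent as A.
print(infer_parents({"A": {"HEAD"}, "B": {"HEAD", "A"}}))  # {'A': 'HEAD', 'B': 'A'}
```

If no file was modified on A before B branched, the evidence for B is
just `{"HEAD"}` and the sketch guesses HEAD, which may be wrong. More
modified files mean more evidence and a better guess.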

BTW, those notes are still available here:
http://www.codesifter.com/cvsps-notes.txt

If you end up comparing the ancestry tree discovered by your tool and
the tree output by a patched cvsps, I would be very interested in the
results.

-chris

(*) You can distinguish between A->B->head and B->A->head simply by
date.