git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marco Costalba <mcostalba@yahoo.it>
To: Junio C Hamano <junkio@cox.net>, Daniel Barkalow <barkalow@iabervon.org>
Cc: Linus Torvalds <torvalds@osdl.org>, git@vger.kernel.org
Subject: Re: Last mile for 1.0 again
Date: Tue, 5 Jul 2005 06:34:48 -0700 (PDT)	[thread overview]
Message-ID: <20050705133448.21778.qmail@web26309.mail.ukl.yahoo.com> (raw)
In-Reply-To: <7v3bqthiso.fsf@assigned-by-dhcp.cox.net>



--- Junio C Hamano <junkio@cox.net> wrote:

> >>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> DB> [perl script]
> 
> >> How does this work, and what do we do about merges?
> 

Checking diffs of all the parents can be computational expensive. 

I'am developing a different alghoritm for qgit, where I know,
before to run annotate all the revs that changed a given file.

The revs are saved in the shaHist list according to git-rev-list --merge-order.

The algoritm operates form the oldest to the newst calculating annotation for each 
following rev.

This alghoritm takes advantage of the fact that mostly revs belong to 2 categories:

1) revs of the same development-sequence (e.g. connected without merges)

2) empty merges, i.e. merges where the file we are interested in
 is listed in only one merging branch

If a rev in shaHist does not belong to above categories a reach analisys must be performed
in order to find the direct ancestor among all the previous revs in the list.


So starting from shaHist a ReachList rl is populated by getReachability(): 


void Annotate::getReachability(ReachList& rl, const QString& file, const QStringList& shaHist) {

	rl.clear();
	QStringList::const_iterator it = shaHist.begin();
	Rev* r = revLookup(*it);
	rl.append(ReachInfo(*it, r->id, INITIAL));
	for (it++; it != shaHist.end(); it++) {
		
		Rev* r = revLookup(*it);

		// initial revision case
		if (r->parents.count() == 0) {
			rl.append(ReachInfo(*it, r->id, INITIAL));
			continue;
		}
		// merge case
		if (r->parents.count() > 1) { 

			// check empty merges diffing against previous in list
			QString prevSha = rl.last().sha;
			QString d = getFileDiff(*it, prevSha, file);

			if (d.isEmpty()) { 
				rl.append(ReachInfo(*it, r->id, EMPTY_MERGE));
				rl.last().roots.append(prevSha);
			} else {
				// real merge: we need reach test.
				QStringList roots;
				roots.clear();
				for (uint i = 0; i < r->parents.count(); i++) {

					// here tree walking is performed
					QString root = getRoot(r->parents[i], rl);				
					if (!root.isEmpty())
						roots.append(root);
				}
				rl.append(ReachInfo(*it, r->id, MERGE));
				rl.last().roots = roots;
				}
			}
			continue;
		}
		// normal case: a revision with a single parent
		if (r->id == rl.last().id) { // same development sequence

			QString prevSha = rl.last().sha;
			rl.append(ReachInfo(*it, r->id, SAME_BR)); // same branch
			rl.last().roots.append(prevSha);

		} else { // different development sequence

			QString root = getRoot(*it, rl); // <-- here tree walking is performed
			if (!root.isEmpty()) {
				rl.append(ReachInfo(*it, r->id, DIFF_BR));
				rl.last().roots.append(root);
			} else
				qDebug("ASSERT getReachability: rev %s not reachable",
					(*it).latin1());
		}
	}
}


Then ReachList rl is then used to compute correct annotations:


void Annotate::run(const QString& file, const QStringList& shaHist, AnnotateHistory& ah) {

	QString d, diffTarget;
	QStringList t;
	int pos;
	ReachList rl;
	ah.clear();

	getReachability(rl, file, shaHist);  // <-- here ReachList rl is calculated

	for (uint i = 0; i < rl.count(); i++) {

		switch(rl[i].type) { // <-- here ReachList rl is used to find annotations

		case SAME_BR:
		case DIFF_BR:
			diffTarget = rl[i].roots.first();
			d = getFileDiff(rl[i].sha, diffTarget, file);
			pos = shaHist.findIndex(diffTarget);
			ah.append(processDiff(d, ah[pos], getAuthor(rl[i].sha, shaHist)));
			break;
		case INITIAL:
			pos = shaHist.findIndex(rl[i].sha);
			ah.append(getFirstAnnotation(file, shaHist, pos));
			break;
		case EMPTY_MERGE:
			diffTarget = rl[i].roots.first();
			pos = shaHist.findIndex(diffTarget);
			t = ah[pos]; // copy annotation from previous
			ah.append(t);
			break;
		case MERGE:
			// append annotation calculated on first root
			diffTarget = rl[i].roots.first();
			d = getFileDiff(rl[i].sha, diffTarget, file);
			pos = shaHist.findIndex(diffTarget);
			ah.append(processDiff(d, ah[pos], "Merge"));

			// update annotation with all the others roots
			for (uint j = 1; j < rl[i].roots.count(); j++) {

				diffTarget = rl[i].roots[j];
				d = getFileDiff(rl[i].sha, diffTarget, file);
				pos = shaHist.findIndex(diffTarget);

				// create an annotation calculated between root[i] and
				// the merge.
				QStringList scndAnn = processDiff(d, ah[pos],
						 getAuthor(diffTarget, shaHist));

				// finally we unify the two annotations, so we catch
				// also the case of a 3-way merge
				unify(ah.last(), scndAnn);
			}
			break;
		}
	}
}


Advantage of this algorithm is that reach analisys, i.e. tree walking, is
done, statistically, only for a small subset of revs.

Another side benefit is that correct annotation is calculated for each rev and
not only for newest one.

Of course there are issues:

1) A ordered list of revs (file history) must be known in advance. This is not a problem
   for qgit because file history is calculated while loading revs.

2) Does not takes in account file renames.


Hope this notes can help in ongoing discussion

Marco


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

  reply	other threads:[~2005-07-05 13:35 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-06-25  4:20 kernel.org and GIT tree rebuilding David S. Miller
2005-06-25  4:40 ` Jeff Garzik
2005-06-25  5:23   ` Linus Torvalds
2005-06-25  5:48     ` Jeff Garzik
2005-06-25  6:16       ` Linus Torvalds
2005-06-26 16:41         ` Linus Torvalds
2005-06-26 18:39           ` Junio C Hamano
2005-06-26 19:19             ` Linus Torvalds
2005-06-26 19:45               ` Junio C Hamano
     [not found]                 ` <7v1x6om6o5.fsf@assigned-by-dhcp.cox.net>
     [not found]                   ` <Pine.LNX.4.58.0506271227160.19755@ppc970.osdl.org>
     [not found]                     ` <7v64vzyqyw.fsf_-_@assigned-by-dhcp.cox.net>
2005-06-28  6:56                       ` [PATCH] Obtain sha1_file_info() for deltified pack entry properly Junio C Hamano
2005-06-28  6:58                         ` Junio C Hamano
2005-06-28  6:58                         ` [PATCH 2/3] git-cat-file: use sha1_object_info() on '-t' Junio C Hamano
2005-06-28  6:59                         ` [PATCH 3/3] git-cat-file: '-s' to find out object size Junio C Hamano
2005-06-26 20:52           ` kernel.org and GIT tree rebuilding Chris Mason
2005-06-26 21:03             ` Chris Mason
2005-06-26 21:40             ` Linus Torvalds
2005-06-26 22:34               ` Linus Torvalds
2005-06-28 18:06           ` Nicolas Pitre
2005-06-28 19:28             ` Linus Torvalds
2005-06-28 21:08               ` Nicolas Pitre
2005-06-28 21:27                 ` Linus Torvalds
2005-06-28 21:55                   ` [PATCH] Bugfix: initialize pack_base to NULL Junio C Hamano
2005-06-29  3:55                   ` kernel.org and GIT tree rebuilding Nicolas Pitre
2005-06-29  5:16                     ` Nicolas Pitre
2005-06-29  5:43                       ` Linus Torvalds
2005-06-29  5:54                         ` Linus Torvalds
2005-06-29  7:16                           ` Last mile for 1.0 again Junio C Hamano
2005-06-29  9:51                             ` [PATCH] Add git-verify-pack command Junio C Hamano
2005-06-29 16:15                               ` Linus Torvalds
2005-07-04 21:40                             ` Last mile for 1.0 again Daniel Barkalow
2005-07-04 21:45                               ` Junio C Hamano
2005-07-04 21:59                               ` Linus Torvalds
2005-07-04 22:41                                 ` Daniel Barkalow
2005-07-04 23:06                                   ` Junio C Hamano
2005-07-05  1:54                                     ` Daniel Barkalow
2005-07-05  6:24                                       ` Junio C Hamano
2005-07-05 13:34                                         ` Marco Costalba [this message]
2005-06-25  5:04 ` kernel.org and GIT tree rebuilding Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050705133448.21778.qmail@web26309.mail.ukl.yahoo.com \
    --to=mcostalba@yahoo.it \
    --cc=barkalow@iabervon.org \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).