Git development
 help / color / mirror / Atom feed
* [PATCH] don't load objects needlessly when repacking
From: Nicolas Pitre @ 2006-07-01  2:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

If no delta is attempted on some objects then it is useless to load them 
in memory, neither create any delta index for them.  The best thing to 
do is therefore to load and index them only when really needed.

Signed-off-by: Nicolas Pitre <nico@cam.org>

---

With this patch, a git-repack -a on the Linux kernel repo takes 19 
seconds instead of 25 seconds on my machine.  At this point the cost of 
creating a pack is largely dominated by git-rev-list alone while the 
actual pack creation is basically free.

diff --git a/pack-objects.c b/pack-objects.c
index 47da33b..b486ea5 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -970,11 +970,12 @@ struct unpacked {
  * one.
  */
 static int try_delta(struct unpacked *trg, struct unpacked *src,
-		     struct delta_index *src_index, unsigned max_depth)
+		     unsigned max_depth)
 {
 	struct object_entry *trg_entry = trg->entry;
 	struct object_entry *src_entry = src->entry;
-	unsigned long size, src_size, delta_size, sizediff, max_size;
+	unsigned long trg_size, src_size, delta_size, sizediff, max_size, sz;
+	char type[10];
 	void *delta_buf;
 
 	/* Don't bother doing diffs between different types */
@@ -1009,19 +1010,38 @@ static int try_delta(struct unpacked *tr
 		return 0;
 
 	/* Now some size filtering heuristics. */
-	size = trg_entry->size;
-	max_size = size/2 - 20;
+	trg_size = trg_entry->size;
+	max_size = trg_size/2 - 20;
 	max_size = max_size * (max_depth - src_entry->depth) / max_depth;
 	if (max_size == 0)
 		return 0;
 	if (trg_entry->delta && trg_entry->delta_size <= max_size)
 		max_size = trg_entry->delta_size-1;
 	src_size = src_entry->size;
-	sizediff = src_size < size ? size - src_size : 0;
+	sizediff = src_size < trg_size ? trg_size - src_size : 0;
 	if (sizediff >= max_size)
 		return 0;
 
-	delta_buf = create_delta(src_index, trg->data, size, &delta_size, max_size);
+	/* Load data if not already done */
+	if (!trg->data) {
+		trg->data = read_sha1_file(trg_entry->sha1, type, &sz);
+		if (sz != trg_size)
+			die("object %s inconsistent object length (%lu vs %lu)",
+			    sha1_to_hex(trg_entry->sha1), sz, trg_size);
+	}
+	if (!src->data) {
+		src->data = read_sha1_file(src_entry->sha1, type, &sz);
+		if (sz != src_size)
+			die("object %s inconsistent object length (%lu vs %lu)",
+			    sha1_to_hex(src_entry->sha1), sz, src_size);
+	}
+	if (!src->index) {
+		src->index = create_delta_index(src->data, src_size);
+		if (!src->index)
+			die("out of memory");
+	}
+
+	delta_buf = create_delta(src->index, trg->data, trg_size, &delta_size, max_size);
 	if (!delta_buf)
 		return 0;
 
@@ -1054,8 +1074,6 @@ static void find_deltas(struct object_en
 	while (--i >= 0) {
 		struct object_entry *entry = list[i];
 		struct unpacked *n = array + idx;
-		unsigned long size;
-		char type[10];
 		int j;
 
 		if (!entry->preferred_base)
@@ -1082,11 +1100,8 @@ static void find_deltas(struct object_en
 		free_delta_index(n->index);
 		n->index = NULL;
 		free(n->data);
+		n->data = NULL;
 		n->entry = entry;
-		n->data = read_sha1_file(entry->sha1, type, &size);
-		if (size != entry->size)
-			die("object %s inconsistent object length (%lu vs %lu)",
-			    sha1_to_hex(entry->sha1), size, entry->size);
 
 		j = window;
 		while (--j > 0) {
@@ -1097,7 +1112,7 @@ static void find_deltas(struct object_en
 			m = array + other_idx;
 			if (!m->entry)
 				break;
-			if (try_delta(n, m, m->index, depth) < 0)
+			if (try_delta(n, m, depth) < 0)
 				break;
 		}
 		/* if we made n a delta, and if n is already at max
@@ -1107,10 +1122,6 @@ static void find_deltas(struct object_en
 		if (entry->delta && depth <= entry->depth)
 			continue;
 
-		n->index = create_delta_index(n->data, size);
-		if (!n->index)
-			die("out of memory");
-
 		idx++;
 		if (idx >= window)
 			idx = 0;

^ permalink raw reply related

* Re: [PATCH] Enable tree (directory) history display
From: Junio C Hamano @ 2006-07-01  3:01 UTC (permalink / raw)
  To: ltuikov; +Cc: git, Linus Torvalds
In-Reply-To: <20060701024309.63001.qmail@web31805.mail.mud.yahoo.com>

Luben Tuikov <ltuikov@yahoo.com> writes:

> I took a comparative look with and without "--full-history",
> and FWIW, enabling full history just clobbers the output with a lot
> of unnecessary information.  I.e. it shows merges which do not have
> direct consequence or change to the files in the path spec specified
> after the "--".
>...
> FWIW, I think that the original intention (had there been a choice)
> would've been to show only most relevant history, i.e. changes
> directly related to paths and files after the "--".

Sounds sane.  Maybe we should clearly explain this behaviour
change in the commit log and claim it as an improvement.  We
might even want to do --no-merges if we go that route.

^ permalink raw reply

* Re: A note on merging conflicts..
From: Junio C Hamano @ 2006-07-01  3:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0606301927260.12404@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> Now, the downside is that the above is both a pain to type, and we don't 
> actually even save the MERGE_BASE as a head, so you actually have to 
> compute it yourself. It's easy enough to do:
>
> 	git-merge-base HEAD MERGE_HEAD > .git/MERGE_BASE
>
> will do it, but the fact is, we should make this even easier.

Heh, that's why I kept saying I want somebody to teach rev-list
a new notation, A...B, to mean $(merge-base A B)..B ;-).

> In fact, after writing the above a few times, I really think there's a 
> case for making a helper function that does exactly the above for us. 
> Including all the "conflicting-filename" thing. It would be nice if
>
> 	git log -p --merge [[--] filenames...]
>
> would basically expand to
>
> 	git log -p HEAD MERGE_HEAD
> 		^$(git-merge-base HEAD MERGE_HEAD)
> 		-- $(git-ls-files -u [filenames...])
>
> so that I wouldn't have to type that by hand ever again, and doing a
>
> 	git log -p --merge drivers/
>
> would automatically give me exactly that for all the unmerged files in 
> drivers/.

> Anybody want to try to make me happy, and learn some git internals at the 
> same time?

I fall in the former category but as the current maintainer I
feel I should leave chance to do the latter to others first.  I
wouldn't call it "trivial" but it is not that hard -- I think I
can write it in my head (as Linus can).

^ permalink raw reply

* Re: [PATCH] Enable tree (directory) history display
From: Luben Tuikov @ 2006-07-01  3:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds
In-Reply-To: <7v3bdmf2p6.fsf@assigned-by-dhcp.cox.net>

--- Junio C Hamano <junkio@cox.net> wrote:
> Luben Tuikov <ltuikov@yahoo.com> writes:
> 
> > I took a comparative look with and without "--full-history",
> > and FWIW, enabling full history just clobbers the output with a lot
> > of unnecessary information.  I.e. it shows merges which do not have
> > direct consequence or change to the files in the path spec specified
> > after the "--".
> >...
> > FWIW, I think that the original intention (had there been a choice)
> > would've been to show only most relevant history, i.e. changes
> > directly related to paths and files after the "--".
> 
> Sounds sane.  Maybe we should clearly explain this behaviour
> change in the commit log and claim it as an improvement.  We
> might even want to do --no-merges if we go that route.

I agree.  Indeed, this would definitely be an improvement.

   Luben

^ permalink raw reply

* Re: [PATCH] Enable tree (directory) history display
From: Linus Torvalds @ 2006-07-01  3:21 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Junio C Hamano, git
In-Reply-To: <20060701024309.63001.qmail@web31805.mail.mud.yahoo.com>



On Fri, 30 Jun 2006, Luben Tuikov wrote:
> 
> I took a comparative look with and without "--full-history",
> and FWIW, enabling full history just clobbers the output with a lot
> of unnecessary information.  I.e. it shows merges which do not have
> direct consequence or change to the files in the path spec specified
> after the "--".

Oh, you're right - "--full-history" right now is literally geared solely 
towards the graphical kind, which needs all the merges (regardless of 
whether they change a file or not) to make sense of the history.

> I.e. no new, relevant information is being shown when "--full-history"
> is enabled.  In fact the default git-rev-list case, simplify_history=1,
> still shows a merge here and there which doesn't have any direct
> changes into what is being sought, but the difference is
> about 48% less clobber.

Well, with history simplification, we still show merges that are required 
to make the history _complete_, ie say that you had

	  a
	  |
	  b
	 / \
	c   d
	|   |

and neither "a" nor "b" actually changed the file, but both "c" and "d" 
did: in this case we have to leave "b" around just because otherwise there 
would be no way to show the _relationship_, even if "b" itself doesn't 
actually change the tree in any way what-so-ever.

> Can you consider the default case to be simplify_history=1,
> which is currently the default behaviour of git-rev-list.

Actually, for your case, you don't want _any_ merges, unless those merges 
literally changed the tree from all of the parents.

I think it would make sense to make that further simplification if the 
"--parents" flag wasn't present. 

Hmm. Maybe something like this..

BTW! Junio, I think this patch actually fixes a real bug.

Without this patch, the "--parents --full-history" combination (which 
you'd get if you do something like

	gitk --full-history Makefile

or similar) will actually _drop_ merges where all children are identical. 
That's wrong in the --full-history case, because it measn that the graph 
ends up missing lots of entries.

In the process, this also should make

	git-rev-list --full-history Makefile

give just the _true_ list of all commits that changed Makefile (and 
properly ignore merges that were identical in one parent), because now 
we're not asking for "--parent", so we don't need the unnecessary merge 
commits to keep the history together.

Luben, does this fix the problem for you?

		Linus

----
diff --git a/revision.c b/revision.c
index b963f2a..1cf6276 100644
--- a/revision.c
+++ b/revision.c
@@ -280,7 +280,7 @@ int rev_same_tree_as_empty(struct rev_in
 static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 {
 	struct commit_list **pp, *parent;
-	int tree_changed = 0;
+	int tree_changed = 0, tree_same = 0;
 
 	if (!commit->tree)
 		return;
@@ -298,6 +298,7 @@ static void try_to_simplify_commit(struc
 		parse_commit(p);
 		switch (rev_compare_tree(revs, p->tree, commit->tree)) {
 		case REV_TREE_SAME:
+			tree_same = 1;
 			if (!revs->simplify_history || (p->object.flags & UNINTERESTING)) {
 				/* Even if a merge with an uninteresting
 				 * side branch brought the entire change
@@ -334,7 +335,7 @@ static void try_to_simplify_commit(struc
 		}
 		die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1));
 	}
-	if (tree_changed)
+	if (tree_changed && !tree_same)
 		commit->object.flags |= TREECHANGE;
 }
 
@@ -896,6 +897,8 @@ static int rewrite_one(struct rev_info *
 		struct commit *p = *pp;
 		if (!revs->limited)
 			add_parents_to_list(revs, p, &revs->commits);
+		if (p->parents && p->parents->next)
+			return 0;
 		if (p->object.flags & (TREECHANGE | UNINTERESTING))
 			return 0;
 		if (!p->parents)
@@ -988,8 +991,15 @@ struct commit *get_revision(struct rev_i
 		    commit->parents && commit->parents->next)
 			continue;
 		if (revs->prune_fn && revs->dense) {
-			if (!(commit->object.flags & TREECHANGE))
-				continue;
+			/* Commit without changes? */
+			if (!(commit->object.flags & TREECHANGE)) {
+				/* drop merges unless we want parenthood */
+				if (!revs->parents)
+					continue;
+				/* non-merge - always ignore it */
+				if (commit->parents && !commit->parents->next)
+					continue;
+			}
 			if (revs->parents)
 				rewrite_parents(revs, commit);
 		}

^ permalink raw reply related

* Re: [PATCH] Enable tree (directory) history display
From: Linus Torvalds @ 2006-07-01  3:45 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0606301954140.12404@g5.osdl.org>



On Fri, 30 Jun 2006, Linus Torvalds wrote:
> 
> BTW! Junio, I think this patch actually fixes a real bug.
> 
> Without this patch, the "--parents --full-history" combination (which 
> you'd get if you do something like
> 
> 	gitk --full-history Makefile
> 
> or similar) will actually _drop_ merges where all children are identical. 
> That's wrong in the --full-history case, because it measn that the graph 
> ends up missing lots of entries.

Here's some real numbers.

Before this patch

	git-rev-list --full-history HEAD -- Makefile | wc -l
	git-rev-list --parents --full-history HEAD -- Makefile | wc -l

both returned 971 commits on my current kernel tree, while

	git-rev-list --parents HEAD -- Makefile | wc -l

returned 145 commits, and

	git-rev-list --parents --no-merges HEAD -- Makefile | wc -l

returns 136.

That count of 145 is the number of commits that actually _change_ Makefile 
some way - and some of them really are merges, because they have a content 
merge, and the merge result is thus different from any of the children. So 
that's a real number. So is 136, in some sense - it just says that we 
don't care about commits, even if those commits _do_ end up changing the 
file.

But the important part to realize is that the "971" number is always 
wrong. It's never a really valid number. It contains a lot of extra 
merges, but it does _not_ contain enough of them to connect all the dots, 
and it's thus never correct. Either you should drop merges that don't 
change things (in which case you cannot have full connectivity, and 
"--parents" doesn't make sense), or you should keep them all (or at least 
enough to get full connectivity).

Now, AFTER this patch

	git-rev-list --full-history HEAD -- Makefile | wc -l

returns 145 commits (the same 145 commits that really change Makefile, but 
we've now properly decided that because we don't have "--parents" and 
don't need to keep connectivity we drop _all_ of the merges that have a 
child that is identical), while

	git-rev-list --parents --full-history HEAD -- Makefile | wc -l

returns 2323 commits, and now really has _all_ the merges (because it 
needs to include every single merge in the tree - otherwise the 
connectivity doesn't make sense).

Now, that 2323 is a bit unnecessary - we end up having merges to merges 
that don't actually have any changes at all in between, and it might be 
nice to simplify the merge history to create a minimal tree that still has 
all potential changes in it, but that's a much harder problem.

Anyway, the patch definitely makes a difference, and I think it's correct. 
The effects might be a bit easier to visualize on a smaller tree than the 
kernel ;)

We could still potentially improve on the "--parents --full-history" case, 
but --full-history currently means always walking all possible chains, and 
that will be shown in the output (ie we will have all possible paths in 
the result if "--parents" is used, even if those paths end up being 
totally uninteresting)

  "Hey - you asked for full history, you got it. Don't blame me if you got 
   a lot of totally uninteresting crud"

				Linus

^ permalink raw reply

* Re: A note on merging conflicts..
From: Linus Torvalds @ 2006-07-01  3:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vy7vedntn.fsf@assigned-by-dhcp.cox.net>



On Fri, 30 Jun 2006, Junio C Hamano wrote:
> 
> Heh, that's why I kept saying I want somebody to teach rev-list
> a new notation, A...B, to mean $(merge-base A B)..B ;-).

I actually don't think that expression makes any sense.

	$(merge-base A B)..B

as an expression only makes sense if there is a single point of forking, 
and no contact apart from that. In that case, what you suggest makes 
sense, because doing

	git diff A...B

is exactly what you want. 

HOWEVER. If there has been any other merges in between (but they aren't 
merge-bases because either branch _also_ did other things), your A...B
expression is meaningless, I think. To do a diff in that case, you really 
need to do my "merge+diff" thing, and no amount of "A...B" expressions on 
a commit relationship level can be meaningful.

Now, the expression

	A...B == B...A == A B --not $(git-merge-base --all A B)

is meaningful (and the one I want for merges), but it's largely useless 
for anything else. It just means "the set of all commits that aren't 
trivially in both" (it's not strictly a valid set operation, but it
approaches being an "xor" instead of a union or an intersection or a 
difference).

But the above isn't useful for "git diff" and friends any more, it's 
mainly just for merging.

			Linus

^ permalink raw reply

* Re: A note on merging conflicts..
From: Linus Torvalds @ 2006-07-01  3:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0606302046230.12404@g5.osdl.org>



On Fri, 30 Jun 2006, Linus Torvalds wrote:
>
> (it's not strictly a valid set operation, but it approaches being an 
> "xor" instead of a union or an intersection or a difference).

Oh, I guess it _is_ perfectly valid. It's called a "symmetric difference" 
in set theory.

So from a set standpoint:

	Git op:			Set theory:

	git-rev-list a..b	// difference: B - A
	git-rev-list b..a	// difference: A - B

	git-rev-list a b	// union of A B (order doesn't matter)

	git-rev-list a...b	// symmetric difference A B (order doesn't matter)

	git-rev-list $(git-merge-base --all a b)
				// intersection of A and B

I think.

		Linus

^ permalink raw reply

* [PATCH] git-svn: allow a local target directory to be specified for init
From: Eric Wong @ 2006-07-01  4:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Luca Barbato, Eric Wong

Original patch by Luca Barbato, cleaned up and made to work for
the current version of git-svn by me (Eric Wong).

Luca Barbato <lu_zero@gentoo.org> wrote:
> Since I'm lazy I just hacked a bit git-svn in order to create a target
> dir and init it if is passed as second parameter.
>
> git-svn init url://to/the/repo local-repo
>
> will create the local-repo dir if doesn't exist yet and populate it as
> expected.
>
> Maybe someone else could find it useful

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 contrib/git-svn/git-svn.perl |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/contrib/git-svn/git-svn.perl b/contrib/git-svn/git-svn.perl
index b3d3f47..1e19aa1 100755
--- a/contrib/git-svn/git-svn.perl
+++ b/contrib/git-svn/git-svn.perl
@@ -264,9 +264,19 @@ when you have upgraded your tools and ha
 }
 
 sub init {
-	$SVN_URL = shift or die "SVN repository location required " .
+	my $url = shift or die "SVN repository location required " .
 				"as a command-line argument\n";
-	$SVN_URL =~ s!/+$!!; # strip trailing slash
+	$url =~ s!/+$!!; # strip trailing slash
+
+	if (my $repo_path = shift) {
+		unless (-d $repo_path) {
+			mkpath([$repo_path]);
+		}
+		$GIT_DIR = $ENV{GIT_DIR} = $repo_path . "/.git";
+		init_vars();
+	}
+
+	$SVN_URL = $url;
 	unless (-d $GIT_DIR) {
 		my @init_db = ('git-init-db');
 		push @init_db, "--template=$_template" if defined $_template;
-- 
1.4.1.rc1.g7fe3

^ permalink raw reply related

* Re: Problem with GITK
From: Paul Mackerras @ 2006-07-01 10:49 UTC (permalink / raw)
  To: Paolo Ciarrocchi; +Cc: Git Mailing List
In-Reply-To: <4d8e3fd30606281340n147821e2hcbd4ccaf9551173f@mail.gmail.com>

Paolo Ciarrocchi writes:

> there is a strange orange line in the following screenshot from gitk:
> http://paolo.ciarrocchi.googlepages.com/Screenshotgit.png

That's an X server bug, and just today I found it.  The bug
description and proposed patch are at:

https://bugs.freedesktop.org/show_bug.cgi?id=7381

Paul.

^ permalink raw reply

* Re: [RFC] gitweb wishlist and TODO list
From: Paul Mackerras @ 2006-07-01 10:35 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Jakub Narebski, git
In-Reply-To: <46a038f90606201417k71c4c43ak59204774bcfe8246@mail.gmail.com>

Martin Langhoff writes:

> > If I remember correctly, it was done in the background, and it was done
> > at least partially _in_ gitk (Tcl/Tk).
> 
> I suspect it is doing a whole lot of git-merge-base invocations, which
> are rather costly. I don't know of any cheaper way to ask that
> question.

There's no git-merge-base involved.  Gitk does a

git-rev-list --all --topo-order --parents

and reads the output of that, and then traverses the entire graph
forwards and backwards (in Tcl).  (This is after gitk has read the
output of git ls-remote $GIT_DIR, so it knows which commits have
tags.)

Paul.

^ permalink raw reply

* git-send-email: mail send at 1st december 2005
From: Bertrand Jacquin @ 2006-07-01 13:41 UTC (permalink / raw)
  To: git

Hi,

Since git 1.4.0 all mail send with git-send-email have a date of
01/12/2005. Is it a known problem ?

Thanks
-- 
# Beber : beber@gna.org
# IM : beber@jabber.fr
# http://guybrush.ath.cx, irc://irc.freenode.net/#{e.fr,gentoofr}

^ permalink raw reply

* Re: [PATCH] autoconf: Use autoconf to write installation directories to config.mak
From: Jakub Narebski @ 2006-07-01 13:58 UTC (permalink / raw)
  To: git
In-Reply-To: <44A51693.5020501@op5.se>

Andreas Ericsson wrote:

> This is bad, since it forces users to do one thing first and then do 
> what they're used to. Better to have the script add
> 
> -include config.mak.autogen
> 
> LAST in config.mak, unless it's already in the file and generate 
> config.mak.autogen with configure.
> 
> Since Make does things bottoms-up (much like swedish students and 
> midsummer celebrators), the previous hand-edited defaults in config.mak 
> will beat the ones in config.mak.autogen (a good thing).

That's not true, unless we use '?=' assignment in Makefile. And we _want_ to
override defaults provided in main git Makefile, so in config.mak.autoconf,
and probably in user's own config.mak we use '=' overriding assignment. So
it would be better to just add "-include config.mak.autogen" to makefile
before "-include config.mak", as in patch below.

diff --git a/Makefile b/Makefile
index ccd7c62..a37d400 100644
--- a/Makefile
+++ b/Makefile
@@ -333,6 +333,7 @@ ifneq (,$(findstring arm,$(uname_M)))
        ARM_SHA1 = YesPlease
 endif
 
+-include config.mak.autogen
 -include config.mak
 
 ifdef WITH_OWN_SUBPROCESS_PY


-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply related

* Re: [PATCH 1/3] Add read_cache_from() and discard_cache()
From: Johannes Schindelin @ 2006-07-01 15:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Alex Riesen
In-Reply-To: <7v3bdmk2zj.fsf@assigned-by-dhcp.cox.net>

Hi,

On Fri, 30 Jun 2006, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > +int discard_cache()
> > +{
> > +	int ret;
> > +	
> > +	if (cache_mmap == NULL)
> > +		return 0;
> > +	ret = munmap(cache_mmap, cache_mmap_size);
> > +	cache_mmap = NULL;
> > +	cache_mmap_size = 0;
> > +	active_nr = active_cache_changed = 0;
> > +	/* no need to throw away allocated active_cache */
> > +	return ret;
> > +}
> > +
> 
> I haven't been following the details of the patches in this
> thread while they are being cooked actively, but two things to
> look out for are:
> 
>  - I am guessing you run discard_cache() because you want to
>    read in a new cache (or start from a clean slate).  I am not
>    sure what you are doing with the old cache tree data
>    structure.  If you are starting from a clean slate
>    (i.e. there is no read_cache_from() after you call
>    discard_cache), you would probably need to discard the old
>    cache tree; otherwise your next write-tree may produce an
>    incorrect index file.  If you keep the old one and later
>    swap it in, the problem might be even more severe.

True, I missed that one. But it is just a call to 
cache_tree_free(active_cache_tree); in discard_cache(), right?

>  - index_timestamp is left as the old value in this patch when
>    you switch cache using read_cache_from() directly.  I have a
>    suspicion you may be bitten by "Racy Git" problem, especially
>    because the operations are supposed to happen quickly thanks
>    to the effort of you two ;-) increasing the risks that the
>    file timestamp of the working tree file and the cached entry
>    match.

Yes. Again, just one line to discard_cache(), right?

	index_file_timestamp = 0;

If there is more to it, please don't let me die dumb.

Ciao,
Dscho

^ permalink raw reply

* Re: A note on merging conflicts..
From: Rene Scharfe @ 2006-07-01 15:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0606302046230.12404@g5.osdl.org>

On Fri, Jun 30, 2006 at 08:54:33PM -0700, Linus Torvalds wrote:
> Now, the expression
> 
> 	A...B == B...A == A B --not $(git-merge-base --all A B)
> 
> is meaningful (and the one I want for merges), but it's largely useless 
> for anything else. It just means "the set of all commits that aren't 
> trivially in both" (it's not strictly a valid set operation, but it
> approaches being an "xor" instead of a union or an intersection or a 
> difference).

You mean something like the following patch on top of the 'next' branch?
It also documents the --not switch because I needed it for the example.

TODO: There are still a few undocumented options left and setup_revisions()
is fat and ugly.  Any volunteers?  I'd clean it up if I only could
write comprehensible documentation and wasn't that lazy..

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>

diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index ad6d14c..ffbf0c9 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -15,6 +15,7 @@ SYNOPSIS
 	     [ \--sparse ]
 	     [ \--no-merges ]
 	     [ \--remove-empty ]
+	     [ \--not ]
 	     [ \--all ]
 	     [ \--topo-order ]
 	     [ \--parents ]
@@ -37,6 +38,14 @@ not in 'baz'".
 A special notation <commit1>..<commit2> can be used as a
 short-hand for {caret}<commit1> <commit2>.
 
+Another special notation is <commit1>...<commit2> which is useful for
+merges.  The resulting set of commits is the symmetric difference
+between the two operands.  The following two commands are equivalent:
+
+------------
+$ git-rev-list A B --not $(git-merge-base --all A B)
+$ git-rev-list A...B
+------------
 
 OPTIONS
 -------
@@ -93,6 +102,11 @@ OPTIONS
 --remove-empty::
 	Stop when a given path disappears from the tree.
 
+--not::
+	Reverses the meaning of the '{caret}' prefix (or lack
+	thereof) for all following revision specifiers, up to
+	the next `--not`.
+
 --all::
 	Pretend as if all the refs in `$GIT_DIR/refs/` are
 	listed on the command line as <commit>.
diff --git a/revision.c b/revision.c
index ae4ca82..d4224a1 100644
--- a/revision.c
+++ b/revision.c
@@ -766,6 +766,47 @@ int setup_revisions(int argc, const char
 			left++;
 			continue;
 		}
+		dotdot = strstr(arg, "...");
+		if (dotdot) {
+			unsigned char other_sha1[20];
+			const char *one = arg;
+			const char *two = dotdot + 3;
+			*dotdot = '\0';
+			if (dotdot == arg)
+				one = "HEAD";
+			if (!*two)
+				two = "HEAD";
+			if (!get_sha1(one, sha1) &&
+			    !get_sha1(two, other_sha1)) {
+				struct commit *a, *b;
+				struct commit_list *exclude;
+
+				a = lookup_commit_reference(sha1);
+				b = lookup_commit_reference(other_sha1);
+				if (!a || !b)
+					die("Invalid symmetric difference expression %s...%s", arg, two);
+
+				if (!seen_dashdash) {
+					*dotdot = '.';
+					verify_non_filename(revs->prefix, arg);
+
+				}
+				exclude = get_merge_bases(a, b);
+				while (exclude) {
+					struct object *object =
+						&exclude->item->object;
+					object->flags |= flags ^ UNINTERESTING;
+					add_pending_object(revs, object, sha1_to_hex(object->sha1));
+					exclude = exclude->next;
+				}
+				a->object.flags |= flags;
+				add_pending_object(revs, &a->object, one);
+				b->object.flags |= flags;
+				add_pending_object(revs, &b->object, two);
+				continue;
+			}
+			*dotdot = '.';
+		}
 		dotdot = strstr(arg, "..");
 		if (dotdot) {
 			unsigned char from_sha1[20];

^ permalink raw reply related

* Re: A note on merging conflicts..
From: Johannes Schindelin @ 2006-07-01 15:23 UTC (permalink / raw)
  To: Rene Scharfe; +Cc: Linus Torvalds, Junio C Hamano, git
In-Reply-To: <20060701150926.GA25800@lsrfire.ath.cx>

Hi,

On Sat, 1 Jul 2006, Rene Scharfe wrote:

> +				exclude = get_merge_bases(a, b);

Aaah! Junio, Linus, I see the light now.

Ciao,
Dscho

^ permalink raw reply

* Re: A note on merging conflicts..
From: Linus Torvalds @ 2006-07-01 16:25 UTC (permalink / raw)
  To: Rene Scharfe; +Cc: Junio C Hamano, git
In-Reply-To: <20060701150926.GA25800@lsrfire.ath.cx>



On Sat, 1 Jul 2006, Rene Scharfe wrote:
> 
> You mean something like the following patch on top of the 'next' branch?
> It also documents the --not switch because I needed it for the example.

Yes. 

However, I think that 90% of the code for the ".." and "..." case are the 
same, as is largely the finding of it.

So why not just do this all inside the already existing

	dotdot = strstr(arg, "..");
	if (dotdot) {
		unsigned char other_sha1[20];
		const char *one = arg;
		const char *two = arg + 2;
		int symmetric = *two == '.';

		*dotdot = '\0';
		two += symmetric;

		if (one == arg)
			one = "HEAD";
		if (!*two)
			two = "HEAD";
		...

because the only difference is really at the very end.

Did you test that it looks correct too?

		Linus

^ permalink raw reply

* Re: [RFC/PATCH 14] autoconf: Added --with/--without for openssl, curl, expat to ./configure
From: Jakub Narebski @ 2006-07-01 17:55 UTC (permalink / raw)
  To: Pavel Roskin, git
In-Reply-To: <20060630233004.7xckw444g4g0gcs8@webmail.spamcop.net>

Pavel Roskin wrote:
> Hi Jakub,
> 
> you lost cc: but please feel free to return to the list.

Sending reply to GMane newsgroup tied to mailing list, and via email 
to author, but _not_ via mail to mailing list would do that... 
especially if the author I'm replying to doesn't use newsgroup 
interface.
 
> Quoting Jakub Narebski <jnareb@gmail.com>:

>> I suspect that AS_HELP_WITH does some strange quoting, or stripping. Both
>> [=PATH] and [[=PATH]] produces =PATH in ./configure --help output.
>> When using @<:@=PATH@:>@ I get [=PATH], but the description of option begins
>> line below.
> 
> If you are not following quoting rules, every macro can do strange things!

Well, [=PATH] or [[=PATH]] doesn't work even if GIT_APPEND_LINE is without
double quotes. Besides, that doesn't matter because this is inside of 
AS_HELP_STRING macro. No combination of quoting (I think I tried all)
works... I guess I would just not use AS_HELP_STRING, and format help 
message "by hand".

>> Any ideas for name of MY_APPEND_LINE(LINE)/GIT_APPEND_LINE(LINE) macro,
>> which as a result adds line to output (e.g. LINE = "NO_CURL=YesPlease")?
> 
> GIT_LIB_CURL
> 
> Generally, please try to avoid negations.  They are confising to the end users. 
> Lack of curl may be an anomaly to git developers, but it is not an anomaly for a
> user who has never heard of curl.  If you can, try to use positive logic, like
> CURL=1, and translate it to negative logic only as a temporary solution and far
> away from the user's eyes.

I'm following example set by main Makefile. That, and I tried to make configure.ac
as simple as possible...


By the way, if you know autoconf well, perhaps you could tell me how to write
tests for the following programs: ar, tar, rpmbuild, how to write test for
Python version (or rather for WITH_OWN_SUBPROCESS_PY) and other test autoconf.ac
lacks now (NEEDS_SSL_WITH_CRYPTO, NEEDS_LIBICONV, NEEDS_SOCKET, NO_MMAP,
NO_IPV6, NO_ICONV, NO_ACCURATE_DIFF unless that was removed or changed name).

-- 
Jakub Narebski
ShadeHawk on #git

^ permalink raw reply

* Re: A note on merging conflicts..
From: J. Bruce Fields @ 2006-07-01 18:01 UTC (permalink / raw)
  To: Rene Scharfe; +Cc: Linus Torvalds, Junio C Hamano, git
In-Reply-To: <20060701150926.GA25800@lsrfire.ath.cx>

On Sat, Jul 01, 2006 at 05:09:26PM +0200, Rene Scharfe wrote:
> +Another special notation is <commit1>...<commit2> which is useful for
> +merges.  The resulting set of commits is the symmetric difference
> +between the two operands.  The following two commands are equivalent:

What's the logic behind naming the operator "..."?

Seems like asking for trouble to have two visually similar operators (".." and
"...") with different meanings, and "..." seems like kind of an arbitrary
choice anyway.

A symmetric difference is basically equivalent to an xor--would a carat ("^")
work?  Or could we just stick a word there instead of using some tricky
notation?

--b.

^ permalink raw reply

* Re: A note on merging conflicts..
From: Rene Scharfe @ 2006-07-01 18:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0607010921450.12404@g5.osdl.org>

On Sat, Jul 01, 2006 at 09:25:43AM -0700, Linus Torvalds wrote:
> 
> 
> On Sat, 1 Jul 2006, Rene Scharfe wrote:
> > 
> > You mean something like the following patch on top of the 'next' branch?
> > It also documents the --not switch because I needed it for the example.
> 
> Yes. 
> 
> However, I think that 90% of the code for the ".." and "..." case are the 
> same, as is largely the finding of it.
> 
> So why not just do this all inside the already existing
> 
> 	dotdot = strstr(arg, "..");
> 	if (dotdot) {
> 		unsigned char other_sha1[20];
> 		const char *one = arg;
> 		const char *two = arg + 2;
> 		int symmetric = *two == '.';
> 
> 		*dotdot = '\0';
> 		two += symmetric;
> 
> 		if (one == arg)
> 			one = "HEAD";
> 		if (!*two)
> 			two = "HEAD";
> 		...
> 
> because the only difference is really at the very end.

Hrm, I'm not sure this is really cleaner.  The two operators consist
of all dots only coincidentally, this is not functionally inherent.
So I think it's better to keep them apart.

Let's see..  [Time passes.  A patch materializes at six o'clock.]

With a little helper factored out this doesn't look as bad as I
imagined.  Maybe we can take it.  What do you think?

> Did you test that it looks correct too?

Sort of; I checked that the two forms (with ... and $(git-merge-base))
gave the same results for 7b8cf0cf and 51d1e83f, that's all.  For a
proper test script we'd need to create a repo for which git-merge-base
can report multiple results.  I wasn't able to come up with the needed
commands without thinking and gave up for now.  Am working on it..

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>

diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index ad6d14c..6c370e1 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -15,6 +15,7 @@ SYNOPSIS
 	     [ \--sparse ]
 	     [ \--no-merges ]
 	     [ \--remove-empty ]
+	     [ \--not ]
 	     [ \--all ]
 	     [ \--topo-order ]
 	     [ \--parents ]
@@ -37,6 +38,14 @@ not in 'baz'".
 A special notation <commit1>..<commit2> can be used as a
 short-hand for {caret}<commit1> <commit2>.
 
+Another special notation is <commit1>...<commit2> which is useful for
+merges.  The resulting set of commits is the symmetric difference
+between the two operands.  The following two commands are equivalent:
+
+------------
+$ git-rev-list A B --not $(git-merge-base --all A B)
+$ git-rev-list A...B
+------------
 
 OPTIONS
 -------
@@ -93,6 +102,11 @@ OPTIONS
 --remove-empty::
 	Stop when a given path disappears from the tree.
 
+--not::
+	Reverses the meaning of the '{caret}' prefix (or lack
+	thereof) for all following revision specifiers, up to
+	the next `--not`.
+
 --all::
 	Pretend as if all the refs in `$GIT_DIR/refs/` are
 	listed on the command line as <commit>.
diff --git a/revision.c b/revision.c
index ae4ca82..bcedf66 100644
--- a/revision.c
+++ b/revision.c
@@ -536,6 +536,18 @@ void init_revisions(struct rev_info *rev
 	diff_setup(&revs->diffopt);
 }
 
+static void add_pending_commit_list(struct rev_info *revs,
+                                    struct commit_list *commit_list,
+                                    unsigned int flags)
+{
+	while (commit_list) {
+		struct object *object = &commit_list->item->object;
+		object->flags |= flags;
+		add_pending_object(revs, object, sha1_to_hex(object->sha1));
+		commit_list = commit_list->next;
+	}
+}
+
 /*
  * Parse revision information, filling in the "rev_info" structure,
  * and removing the used arguments from the argument list.
@@ -771,27 +783,45 @@ int setup_revisions(int argc, const char
 			unsigned char from_sha1[20];
 			const char *next = dotdot + 2;
 			const char *this = arg;
+			int symmetric = *next == '.';
+			unsigned int flags_exclude = flags ^ UNINTERESTING;
+
 			*dotdot = 0;
+			next += symmetric;
+
 			if (!*next)
 				next = "HEAD";
 			if (dotdot == arg)
 				this = "HEAD";
 			if (!get_sha1(this, from_sha1) &&
 			    !get_sha1(next, sha1)) {
-				struct object *exclude;
-				struct object *include;
-
-				exclude = get_reference(revs, this, from_sha1, flags ^ UNINTERESTING);
-				include = get_reference(revs, next, sha1, flags);
-				if (!exclude || !include)
-					die("Invalid revision range %s..%s", arg, next);
+				struct commit *a, *b;
+				struct commit_list *exclude;
+
+				a = lookup_commit_reference(from_sha1);
+				b = lookup_commit_reference(sha1);
+				if (!a || !b) {
+					die(symmetric ?
+					    "Invalid symmetric difference expression %s...%s" :
+					    "Invalid revision range %s..%s",
+					    arg, next);
+				}
 
 				if (!seen_dashdash) {
 					*dotdot = '.';
 					verify_non_filename(revs->prefix, arg);
 				}
-				add_pending_object(revs, exclude, this);
-				add_pending_object(revs, include, next);
+
+				if (symmetric) {
+					exclude = get_merge_bases(a, b);
+					add_pending_commit_list(revs, exclude,
+					                        flags_exclude);
+					a->object.flags |= flags;
+				} else
+					a->object.flags |= flags_exclude;
+				b->object.flags |= flags;
+				add_pending_object(revs, &a->object, this);
+				add_pending_object(revs, &b->object, next);
 				continue;
 			}
 			*dotdot = '.';

^ permalink raw reply related

* Re: A note on merging conflicts..
From: Linus Torvalds @ 2006-07-01 18:20 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Rene Scharfe, Junio C Hamano, git
In-Reply-To: <20060701180125.GA27550@fieldses.org>



On Sat, 1 Jul 2006, J. Bruce Fields wrote:
> 
> What's the logic behind naming the operator "..."?

Well, if ".." is set difference, why not "..." for symmetric set 
difference.

The operations really _are_ related.

Also, the parse syntax and logic really is the same, even if Rene's patch 
didn't take advantage of that.

That said, it does have a real downside, and that's simply that it can 
take a long time to compute.

Somebody should also verify that there are no interesting interaction with 
the fact that we end up traversing the commit lists twice (no object flag 
interactions etc) with the new "get_merge_bases()"

		Linus

^ permalink raw reply

* Re: A note on merging conflicts..
From: Jakub Narebski @ 2006-07-01 18:22 UTC (permalink / raw)
  To: git
In-Reply-To: <20060701180125.GA27550@fieldses.org>

J. Bruce Fields wrote:

> On Sat, Jul 01, 2006 at 05:09:26PM +0200, Rene Scharfe wrote:
>> +Another special notation is <commit1>...<commit2> which is useful for
>> +merges.  The resulting set of commits is the symmetric difference
>> +between the two operands.  The following two commands are equivalent:
> 
> What's the logic behind naming the operator "..."?
> 
> Seems like asking for trouble to have two visually similar operators (".." and
> "...") with different meanings, and "..." seems like kind of an arbitrary
> choice anyway.

Because A...B is extension of A..B for merges.

> A symmetric difference is basically equivalent to an xor--would a carat ("^")
> work?  Or could we just stick a word there instead of using some tricky
> notation?

Caret is used twice, with different meaning. As prefix operator "^" means 
"exclude lineage of commit" (while commit without "^" in front means:
"include lineage of commit and commit itself"). BTW. why we don't use '!'
for that?

As postfix operator "^" means "dereference", i.e. parent in the case 
of commit; allows choosing a parent (commit^n) and listing all parents 
(commit^@). Using it as binary infix operator that would be I think 
too much. 

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply

* Re: A note on merging conflicts..
From: Junio C Hamano @ 2006-07-01 18:37 UTC (permalink / raw)
  To: Rene Scharfe; +Cc: git, Linus Torvalds
In-Reply-To: <20060701150926.GA25800@lsrfire.ath.cx>

I suspect this has the same problem I pointed out to Kristian's 
attempt to make git-branch a built-in.

    Subject: Re: [PATCH] Implement git-branch and git-merge-base as built-ins.
    Date: Thu, 08 Jun 2006 11:53:48 -0700
    Message-ID: <7vverbsclf.fsf@assigned-by-dhcp.cox.net>

Namely, merge-base code is not set up to be called more than
once without cleaning things up.

^ permalink raw reply

* Re: [PATCH 1/3] Add read_cache_from() and discard_cache()
From: Junio C Hamano @ 2006-07-01 18:51 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0607011657460.29667@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> True, I missed that one. But it is just a call to 
> cache_tree_free(active_cache_tree); in discard_cache(), right?

On the codepath to write out the new index file, calling
cache_free_tree(&active_cache_tree) before write_cache() is all
that should be needed.  When "active_cache_tree == NULL",
write_cache() would write out an index file without the cached
tree information.

Currently not many things take advantage of cached tree
information to optimize its operation.  But I'd like to change
that.  For example, tree merges by read-tree should be able to
take advantage of the fact that a cached tree read from the
index and three trees being read all match for a subdirectory
and do the merge of the directory without descending into it.

>>  - index_timestamp is left as the old value in this patch when
>>    you switch cache using read_cache_from() directly.  I have a
>>    suspicion you may be bitten by "Racy Git" problem, especially
>>    because the operations are supposed to happen quickly thanks
>>    to the effort of you two ;-) increasing the risks that the
>>    file timestamp of the working tree file and the cached entry
>>    match.
>
> Yes. Again, just one line to discard_cache(), right?
>
> 	index_file_timestamp = 0;

This one I am not sure.  Read the comment in ce_match_stat() and
see the problematic sequence of events that this variable tries
to help resolve applies to your use.

^ permalink raw reply

* Re: A note on merging conflicts..
From: Linus Torvalds @ 2006-07-01 18:52 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e86ega$gnc$1@sea.gmane.org>



On Sat, 1 Jul 2006, Jakub Narebski wrote:
> 
> Caret is used twice, with different meaning. As prefix operator "^" means 
> "exclude lineage of commit" (while commit without "^" in front means:
> "include lineage of commit and commit itself"). BTW. why we don't use '!'
> for that?

Using '!' is really nasty with most shells. Avoid, avoid, avoid.

It would be more sensible to use ~ (mathematical negation), but that also 
has magic meaning for shell at the beginning of a word..

			Linus

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox