[RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
@ 2008-07-27  3:33 Christian Couder
  2008-07-27  4:47 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Christian Couder @ 2008-07-27  3:33 UTC (permalink / raw)
  To: git, Junio C Hamano, Johannes Schindelin; +Cc: Miklos Vajna

Before this patch "git merge-base" accepted only 2 arguments, so
only merge bases between 2 references could be computed.

The purpose of this patch is to make "git merge-base" accept more
than 2 arguments, so that the merge bases between the first given
reference and all the other references can be computed.

This is easily implemented because the "get_merge_bases_many"
function in "commit.c" already implements the needed computation.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/git-merge-base.txt |   27 +++++++++++++++-------
 builtin-merge-base.c             |   45 ++++++++++++++++++++++++++-----------
 commit.h                         |    2 +
 3 files changed, 51 insertions(+), 23 deletions(-)

	Hi,

	No tests yet, but the following changes since the first version:

	- added documentation
	- don't use static variables anymore

	Thanks for your comments,
	Christian.

diff --git a/Documentation/git-merge-base.txt b/Documentation/git-merge-base.txt
index 1a7ecbf..463c230 100644
--- a/Documentation/git-merge-base.txt
+++ b/Documentation/git-merge-base.txt
@@ -8,26 +8,35 @@ git-merge-base - Find as good common ancestors as possible for a merge
 
 SYNOPSIS
 --------
-'git merge-base' [--all] <commit> <commit>
+'git merge-base' [--all] <commit> <commit>...
 
 DESCRIPTION
 -----------
 
-'git-merge-base' finds as good a common ancestor as possible between
-the two commits. That is, given two commits A and B, `git merge-base A
-B` will output a commit which is reachable from both A and B through
-the parent relationship.
+'git-merge-base' finds as good common ancestors as possible between
+the first commit and the other commits. The default behavior is to
+output only one as good as possible common ancestor, called a merge
+base.
+
+For example, given two commits A and B, `git merge-base A B` will
+output a commit which is reachable from both A and B through the
+parent relationship.
+
+Given three commits A, B and C, `git merge-base A B C` will output a
+commit which is reachable through the parent relationship from both A
+and B, or from both A and C.
 
 Given a selection of equally good common ancestors it should not be
 relied on to decide in any particular way.
 
-The 'git-merge-base' algorithm is still in flux - use the source...
-
 OPTIONS
 -------
 --all::
-	Output all common ancestors for the two commits instead of
-	just one.
+	Output all merge bases for the commits instead of just one. No
+	merge bases in the output should be an ancestor of another
+	merge base in the output. Each common ancestor of the first
+	commit and any of the other commits passed as arguments,
+	should be an ancestor of one of the merge bases in the output.
 
 Author
 ------
diff --git a/builtin-merge-base.c b/builtin-merge-base.c
index 1cb2925..f2c9756 100644
--- a/builtin-merge-base.c
+++ b/builtin-merge-base.c
@@ -2,9 +2,11 @@
 #include "cache.h"
 #include "commit.h"
 
-static int show_merge_base(struct commit *rev1, struct commit *rev2, int show_all)
+static int show_merge_base(struct commit *rev1, int prev2_nr,
+			   struct commit **prev2, int show_all)
 {
-	struct commit_list *result = get_merge_bases(rev1, rev2, 0);
+	struct commit_list *result = get_merge_bases_many(rev1, prev2_nr,
+							  prev2, 0);
 
 	if (!result)
 		return 1;
@@ -20,12 +22,20 @@ static int show_merge_base(struct commit *rev1, struct commit *rev2, int show_al
 }
 
 static const char merge_base_usage[] =
-"git merge-base [--all] <commit-id> <commit-id>";
+"git merge-base [--all] <commit-id> <commit-id>...";
+
+static struct commit *get_commit_reference(const char *arg)
+{
+	unsigned char revkey[20];
+	if (get_sha1(arg, revkey))
+		die("Not a valid object name %s", arg);
+	return lookup_commit_reference(revkey);
+}
 
 int cmd_merge_base(int argc, const char **argv, const char *prefix)
 {
-	struct commit *rev1, *rev2;
-	unsigned char rev1key[20], rev2key[20];
+	struct commit *rev1, **prev2 = NULL;
+	size_t prev2_nr = 0, prev2_alloc = 0;
 	int show_all = 0;
 
 	git_config(git_default_config, NULL);
@@ -38,15 +48,22 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix)
 			usage(merge_base_usage);
 		argc--; argv++;
 	}
-	if (argc != 3)
+	if (argc < 3)
 		usage(merge_base_usage);
-	if (get_sha1(argv[1], rev1key))
-		die("Not a valid object name %s", argv[1]);
-	if (get_sha1(argv[2], rev2key))
-		die("Not a valid object name %s", argv[2]);
-	rev1 = lookup_commit_reference(rev1key);
-	rev2 = lookup_commit_reference(rev2key);
-	if (!rev1 || !rev2)
+
+	rev1 = get_commit_reference(argv[1]);
+	if (!rev1)
 		return 1;
-	return show_merge_base(rev1, rev2, show_all);
+	argc--; argv++;
+
+	do {
+		struct commit *rev2 = get_commit_reference(argv[1]);
+		if (!rev2)
+			return 1;
+		ALLOC_GROW(prev2, prev2_nr + 1, prev2_alloc);
+		prev2[prev2_nr++] = rev2;
+		argc--; argv++;
+	} while (argc > 1);
+
+	return show_merge_base(rev1, prev2_nr, prev2, show_all);
 }
diff --git a/commit.h b/commit.h
index 77de962..4829a5c 100644
--- a/commit.h
+++ b/commit.h
@@ -121,6 +121,8 @@ int read_graft_file(const char *graft_file);
 struct commit_graft *lookup_commit_graft(const unsigned char *sha1);
 
 extern struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2, int cleanup);
+extern struct commit_list *get_merge_bases_many(struct commit *one,
+		int n, struct commit **twos, int cleanup);
 extern struct commit_list *get_octopus_merge_bases(struct commit_list *in);
 
 extern int register_shallow(const unsigned char *sha1);
-- 
1.6.0.rc0.43.g00eb.dirty

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-27  3:33 [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments Christian Couder
@ 2008-07-27  4:47 ` Junio C Hamano
  2008-07-28  6:10   ` Christian Couder
  2008-07-27 14:38 ` Johannes Schindelin
  2008-07-27 15:02 ` Jakub Narebski
  2 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2008-07-27  4:47 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Johannes Schindelin, Miklos Vajna

Christian Couder <chriscool@tuxfamily.org> writes:

>  OPTIONS
>  -------
>  --all::
> -	Output all common ancestors for the two commits instead of
> -	just one.
> +	Output all merge bases for the commits instead of just one. No
> +	merge bases in the output should be an ancestor of another
> +	merge base in the output. Each common ancestor of the first
> +	commit and any of the other commits passed as arguments,
> +	should be an ancestor of one of the merge bases in the output.

The point of merge_bases_many() is that it allows you to compute a merge
base between a commit and another commit that does not yet exist which is
a merge across all others.

               o---o---o---o---C
              /                 :
             /   o---o---o---B..(M)
            /   /                 :
	---o---*---o---o---o---A..(X)

Suppose you have commits A, B and C, and you would want to come up with an
Octopus merge X across these three commits.  Because our low-level merge
machinery works always on two trees with one common ancestor tree, we
would first create a tree that is a merge between B and C (which is marked
as (M) in the picture), and then merge the tree of (M) and A using common
ancestor between (M) and A.

If we did not have merge_bases_many(), we would actually create (M) as a
real commit, and compute merge base between A and (M), which is marked as
"*" in the picture.  The use of merge_bases_many() allows us to run the
same merge base computation without actually creating commit (M).  Instead
of computing merge-base between A and (M), you can ask for the merge base
between A ("the first commit") and B, C ("the other commits") to obtain
the same answer "*".

Base between A and B is that "*", and you are correct to say that it is an
ancestor of the "*" that is output from the command; base between A and C
is the parent of "*", and again you are correct to say it is ancestor of
the "*" that is output from the command.

But if we output any other commit between "*" and A from the command, it
still satisifies your condition.  "The merge base between A and each of B,
C,... should be an ancestor of what is output".  In order to satisify your
condition, in the extreme case, we could even output A.  Both the merge
base between A and B, and the merge base between A and C, would be an
ancestor of A.

So your description may not be incorrect, but I think it completely misses
the point of what is being computed.

>  Author
>  ------
> diff --git a/builtin-merge-base.c b/builtin-merge-base.c
> index 1cb2925..f2c9756 100644
> --- a/builtin-merge-base.c
> +++ b/builtin-merge-base.c
> @@ -2,9 +2,11 @@
>  #include "cache.h"
>  #include "commit.h"
>  
> -static int show_merge_base(struct commit *rev1, struct commit *rev2, int show_all)
> +static int show_merge_base(struct commit *rev1, int prev2_nr,
> +			   struct commit **prev2, int show_all)
>  {
> -	struct commit_list *result = get_merge_bases(rev1, rev2, 0);
> +	struct commit_list *result = get_merge_bases_many(rev1, prev2_nr,
> +							  prev2, 0);

This is just style, but if you must break lines somewhere, I'd prefer to
have prev2_nr and prev2 on the same line, like this:

	struct commit_list *result = get_merge_bases_many(rev1,
							  prev2_nr, prev2, 0);

because they logically belong to each other.  Further, I think this
84-column single-line statement is perfectly fine as well in this case:

	struct commit_list *result = get_merge_bases_many(rev1, prev2_nr, prev2, 0);

I would probably do this myself in this case, though:

	struct commit_list *result;

	result  = get_merge_bases_many(rev1, prev2_nr, prev2, 0);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-27  3:33 [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments Christian Couder
  2008-07-27  4:47 ` Junio C Hamano
@ 2008-07-27 14:38 ` Johannes Schindelin
  2008-07-27 15:28   ` Miklos Vajna
                     ` (2 more replies)
  2008-07-27 15:02 ` Jakub Narebski
  2 siblings, 3 replies; 11+ messages in thread
From: Johannes Schindelin @ 2008-07-27 14:38 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Junio C Hamano, Miklos Vajna

Hi,

On Sun, 27 Jul 2008, Christian Couder wrote:

> diff --git a/builtin-merge-base.c b/builtin-merge-base.c
> index 1cb2925..f2c9756 100644
> --- a/builtin-merge-base.c
> +++ b/builtin-merge-base.c
> @@ -38,15 +48,22 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix)
>  			usage(merge_base_usage);
>  		argc--; argv++;
>  	}
> -	if (argc != 3)
> +	if (argc < 3)
>  		usage(merge_base_usage);
> -	if (get_sha1(argv[1], rev1key))
> -		die("Not a valid object name %s", argv[1]);
> -	if (get_sha1(argv[2], rev2key))
> -		die("Not a valid object name %s", argv[2]);
> -	rev1 = lookup_commit_reference(rev1key);
> -	rev2 = lookup_commit_reference(rev2key);
> -	if (!rev1 || !rev2)
> +
> +	rev1 = get_commit_reference(argv[1]);
> +	if (!rev1)
>  		return 1;

Why do you special case rev1?  Is it so special?  Just handle it together 
with all of the other arguments!

IOW have one commit array, and do not call it "prev".

> -	return show_merge_base(rev1, rev2, show_all);
> +	argc--; argv++;
> +
> +	do {
> +		struct commit *rev2 = get_commit_reference(argv[1]);
> +		if (!rev2)
> +			return 1;
> +		ALLOC_GROW(prev2, prev2_nr + 1, prev2_alloc);
> +		prev2[prev2_nr++] = rev2;
> +		argc--; argv++;
> +	} while (argc > 1);

Now, this is ugly.  You know beforehand the _exact_ number of arguments, 
and yet you dynamically grow the array?  Also, why do you use a do { } 
while(), when a for () would be much, much clearer?

BTW I seem to recall that get_merge_bases_many() was _not_ the same as 
get_merge_octopus().  Could you please remind me what _many() does?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-27  3:33 [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments Christian Couder
  2008-07-27  4:47 ` Junio C Hamano
  2008-07-27 14:38 ` Johannes Schindelin
@ 2008-07-27 15:02 ` Jakub Narebski
  2 siblings, 0 replies; 11+ messages in thread
From: Jakub Narebski @ 2008-07-27 15:02 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Junio C Hamano, Johannes Schindelin, Miklos Vajna

Christian Couder <chriscool@tuxfamily.org> writes:

> Before this patch "git merge-base" accepted only 2 arguments, so
> only merge bases between 2 references could be computed.

> +'git-merge-base' finds as good common ancestors as possible between
> +the first commit and the other commits. The default behavior is to
> +output only one as good as possible common ancestor, called a merge
> +base.
> +
> +For example, given two commits A and B, `git merge-base A B` will
> +output a commit which is reachable from both A and B through the
> +parent relationship.
> +
> +Given three commits A, B and C, `git merge-base A B C` will output a
> +commit which is reachable through the parent relationship from both A
> +and B, or from both A and C.

I don't understand this complication.  Isn't merge base (merge bases)
for commits A, B and C simply least common ancestor (or ancestors)
of commits A, B and C (with commits being included as their own
ancestors)?

What are the results of "git merge-base" and "git merge-base --all"
in the following situations?

For two commits:

      .---.---*---.---.---A
               \
                \-.---B


      .---m-----b---.---.---A
           \     \ /
            \     X
             \   / \
              \-a---.---.---B


For three commits:

      .---.---1---2---.---.---A
               \   \
                \   \-.---B
                 \
                  \---.---.---C


                    /-.---.---A
                   /
      .---.---1---2---.---.---B
               \
                \-.---.---.---C 
-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-27 14:38 ` Johannes Schindelin
@ 2008-07-27 15:28   ` Miklos Vajna
  2008-07-27 18:11   ` Junio C Hamano
  2008-07-28  5:49   ` Christian Couder
  2 siblings, 0 replies; 11+ messages in thread
From: Miklos Vajna @ 2008-07-27 15:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Christian Couder, git, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 676 bytes --]

On Sun, Jul 27, 2008 at 04:38:05PM +0200, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> BTW I seem to recall that get_merge_bases_many() was _not_ the same as 
> get_merge_octopus().  Could you please remind me what _many() does?

get_octopus_merge_bases() takes a commit_list, and tries to figure out
the common bases of those commits.

get_merge_bases_many(), which is used by reduce_heads() takes a commit
and a commit_list, and counts the bases of the heads against the commit
specified as the first parameter.

I think get_merge_bases_many() could be even static, it is not really
interesting for anything else than reduce_heads() at the moment.

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-27 14:38 ` Johannes Schindelin
  2008-07-27 15:28   ` Miklos Vajna
@ 2008-07-27 18:11   ` Junio C Hamano
  2008-07-27 20:44     ` Johannes Schindelin
  2008-07-27 20:47     ` Re* " Junio C Hamano
  2008-07-28  5:49   ` Christian Couder
  2 siblings, 2 replies; 11+ messages in thread
From: Junio C Hamano @ 2008-07-27 18:11 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Christian Couder, git, Miklos Vajna

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> BTW I seem to recall that get_merge_bases_many() was _not_ the same as 
> get_merge_octopus().  Could you please remind me what _many() does?

I explained what merge-bases-many gives in a separate message last night
with pictures.

get_merge_octopus() is a more or less useless function.  It is there only
because the protocol between "merge" and strategies requires that the
former have to pass _some_ bases to the latter.  In fact, the octopus
strategy implementation completely ignores the heads given by "merge";
a single set of merge base given from outside is not even useful when you
build octopus by repeatedly running pairwise three-way merges.

With Christian's git-merge-base enhancement, the big comment at the end of
git-merge-octopus's main loop can go with a much improved "next" merge
base computation.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-27 18:11   ` Junio C Hamano
@ 2008-07-27 20:44     ` Johannes Schindelin
  2008-07-27 20:47     ` Re* " Junio C Hamano
  1 sibling, 0 replies; 11+ messages in thread
From: Johannes Schindelin @ 2008-07-27 20:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Christian Couder, git, Miklos Vajna

Hi,

On Sun, 27 Jul 2008, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > BTW I seem to recall that get_merge_bases_many() was _not_ the same as 
> > get_merge_octopus().  Could you please remind me what _many() does?
> 
> I explained what merge-bases-many gives in a separate message last night 
> with pictures.

I missed that, alright.

> get_merge_octopus() is a more or less useless function.  It is there 
> only because the protocol between "merge" and strategies requires that 
> the former have to pass _some_ bases to the latter.

Does it?  I thought that e.g. merge-recursive accepts an empty set of 
merge bases?  AFAIR that was the reason why gitk could be merged so well.

> In fact, the octopus strategy implementation completely ignores the 
> heads given by "merge"; a single set of merge base given from outside is 
> not even useful when you build octopus by repeatedly running pairwise 
> three-way merges.
> 
> With Christian's git-merge-base enhancement, the big comment at the end 
> of git-merge-octopus's main loop can go with a much improved "next" 
> merge base computation.

Okay.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re* [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-27 18:11   ` Junio C Hamano
  2008-07-27 20:44     ` Johannes Schindelin
@ 2008-07-27 20:47     ` Junio C Hamano
  1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2008-07-27 20:47 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Christian Couder, git, Miklos Vajna

Junio C Hamano <gitster@pobox.com> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> BTW I seem to recall that get_merge_bases_many() was _not_ the same as 
>> get_merge_octopus().  Could you please remind me what _many() does?
>
> I explained what merge-bases-many gives in a separate message last night
> with pictures.
>
> get_merge_octopus() is a more or less useless function.  It is there only
> because the protocol between "merge" and strategies requires that the
> former have to pass _some_ bases to the latter.  In fact, the octopus
> strategy implementation completely ignores the heads given by "merge";
> a single set of merge base given from outside is not even useful when you
> build octopus by repeatedly running pairwise three-way merges.

This part may benefit from a bit of clarification, so I'll borrow the
picture from my other message.

               o---o---o---o---C
              /
             /   o---o---o---B
            /   /
        ---1---2---o---o---o---A

Suppose you have this topology, and you are trying to make an octopus
across A, B and C (you are at C and merging A and B into your branch).
The protoccol between "git merge" and merge strategies is for the former
to pass common ancestor(s), '--' and then commits being merged.

I think "merge", both in scripted version and Miklos's C version, gives
"1" because that is the common across all the heads.  This choice of merge
base in itself is correct (that is why I said that git_merge_octopus() is
"more or less useless", not "totally pointless").  If an implementation of
strategy that is capable of producing an octopus wanted to merge arbitrary
number of trees using a single merge base in one-go, that is the merge
base it would want to use.

HOWEVER.

The particular implementation of git-merge-octopus does not produce the
final merge in one-go.  It iteratively produces pairwise merges.  So the
first step might be to come up with a merge between B and C:

               o---o---o---o---C
              /                 :
             /   o---o---o---B..(M)
            /   /
        ---1---2---o---o---o---A

and for that, "1" is used as the merge base, not because it is the base
across A, B and C but because it is the base between B and C.  For this
merge, A does not matter.

I drew M in parentheses and lines between B and C to it in dotted line
because we actually do _not_ create a real commit --- the only thing we
need is a tree object, in order to proceed to the next step.

Then the final merge result is obtained by merging tree of (M) and A using
their common ancestor.  For that, we _could_ still use "1" as the merge
base.

But if you imagine a case where you started from A and M, you would _not_
pick "1" as the merge base; you would rather use "2" which is a better
base for this merge.

That is why git-merge-octopus ignores the merge base given by "merge" but
computes its own.

        Side note.  If your first step merge is between A and B, then you
        would also use "2" as the merge base.

                   o---o---o---o---C
                  /
                 /   o---o---o---B..(M)
                /   /               : 
            ---1---2---o---o---o---A

The comment at the end of git-merge-octopus talks about "merge reference
commit", that we used to update it to common found in this round, and that
that updating was pointless.  After the first round of merge to produce
the tree for M (but without actually creating the commit object M itself),
in order to figure out the merge base used to merge that with A in the
second round, we used to use A and "1" (which was merge base between B and
C).  That was pointless --- "merge-base A 1" is guaranteed to give a base
that is no better than either "merge-base A B" or "merge-base A C".  So
the current code keeps using the original head (iow, MRC=Ci, because n
this case we are starting from C and merging B and then A into it).

This trickerly was necessary only because we avoided creating the extra
merge commit object M.

	Side note.  An alternative implementation could have been to
	actually record it as a real merge commit M, and then let the
	two-commit merge-base compute the base between A and M when
	merging A to the result of the previous round, but we avoided
	creating M, at the expense of potentially using suboptimal base in
	the later rounds.

But we do not have to be that pessimistic.  We can instead accumulate the
commits we have merged so far in MRC, and have merge_bases_many() compute
the merge base for the new head being merged and the heads we have
processed so far, which can give a better base than what we currently do.

That is what Christian's patch enables us to do.

In other words, we can do something like this.

 git-merge-octopus.sh |   11 ++---------
 1 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
index 645e114..1dadbb4 100755
--- a/git-merge-octopus.sh
+++ b/git-merge-octopus.sh
@@ -61,7 +61,7 @@ do
 		exit 2
 	esac

-	common=$(git merge-base --all $MRC $SHA1) ||
+	common=$(git merge-base --all $SHA1 $MRC) ||
 		die "Unable to find common commit with $SHA1"

 	case "$LF$common$LF" in
@@ -100,14 +100,7 @@ do
 		next=$(git write-tree 2>/dev/null)
 	fi

-	# We have merged the other branch successfully.  Ideally
-	# we could implement OR'ed heads in merge-base, and keep
-	# a list of commits we have merged so far in MRC to feed
-	# them to merge-base, but we approximate it by keep using
-	# the current MRC.  We used to update it to $common, which
-	# was incorrectly doing AND'ed merge-base here, which was
-	# unneeded.
-
+	MRC="$MRC $SHA1"
 	MRT=$next
 done

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-27 14:38 ` Johannes Schindelin
  2008-07-27 15:28   ` Miklos Vajna
  2008-07-27 18:11   ` Junio C Hamano
@ 2008-07-28  5:49   ` Christian Couder
  2 siblings, 0 replies; 11+ messages in thread
From: Christian Couder @ 2008-07-28  5:49 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Miklos Vajna

Le dimanche 27 juillet 2008, Johannes Schindelin a écrit :
> Hi,
>
> On Sun, 27 Jul 2008, Christian Couder wrote:
> > diff --git a/builtin-merge-base.c b/builtin-merge-base.c
> > index 1cb2925..f2c9756 100644
> > --- a/builtin-merge-base.c
> > +++ b/builtin-merge-base.c
> > @@ -38,15 +48,22 @@ int cmd_merge_base(int argc, const char **argv,
> > const char *prefix) usage(merge_base_usage);
> >  		argc--; argv++;
> >  	}
> > -	if (argc != 3)
> > +	if (argc < 3)
> >  		usage(merge_base_usage);
> > -	if (get_sha1(argv[1], rev1key))
> > -		die("Not a valid object name %s", argv[1]);
> > -	if (get_sha1(argv[2], rev2key))
> > -		die("Not a valid object name %s", argv[2]);
> > -	rev1 = lookup_commit_reference(rev1key);
> > -	rev2 = lookup_commit_reference(rev2key);
> > -	if (!rev1 || !rev2)
> > +
> > +	rev1 = get_commit_reference(argv[1]);
> > +	if (!rev1)
> >  		return 1;
>
> Why do you special case rev1?  Is it so special?  Just handle it together
> with all of the other arguments!
>
> IOW have one commit array, and do not call it "prev".

Ok, I have done that in v3.

> > -	return show_merge_base(rev1, rev2, show_all);
> > +	argc--; argv++;
> > +
> > +	do {
> > +		struct commit *rev2 = get_commit_reference(argv[1]);
> > +		if (!rev2)
> > +			return 1;
> > +		ALLOC_GROW(prev2, prev2_nr + 1, prev2_alloc);
> > +		prev2[prev2_nr++] = rev2;
> > +		argc--; argv++;
> > +	} while (argc > 1);
>
> Now, this is ugly.  You know beforehand the _exact_ number of arguments,
> and yet you dynamically grow the array?

You are right. I guess I dealt too much with complex parsing of arguments 
where you can have options everywhere, and perhaps I also felt somewhat 
guilty for not having used ALLOC_GROW before, or something.

> Also, why do you use a do { } 
> while(), when a for () would be much, much clearer?

Well, we already know that there are at least 2 commits to parse, so it 
feels a little bit wastefull to check first again. Also the code to parse 
the [--all|-a] option before use a while loop with "argc--; argv++;" at the 
end, so I think it is more coherent like that.

Thanks,
Christian.

> BTW I seem to recall that get_merge_bases_many() was _not_ the same as
> get_merge_octopus().  Could you please remind me what _many() does?
>
> Ciao,
> Dscho

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-27  4:47 ` Junio C Hamano
@ 2008-07-28  6:10   ` Christian Couder
  2008-07-28  6:18     ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Couder @ 2008-07-28  6:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Johannes Schindelin, Miklos Vajna

Le dimanche 27 juillet 2008, Junio C Hamano a écrit :
>
> The point of merge_bases_many() is that it allows you to compute a merge
> base between a commit and another commit that does not yet exist which is
> a merge across all others.
>
>                o---o---o---o---C
>               /                 :
>              /   o---o---o---B..(M)
>             /   /                 :
> 	---o---*---o---o---o---A..(X)
>
> Suppose you have commits A, B and C, and you would want to come up with
> an Octopus merge X across these three commits.  Because our low-level
> merge machinery works always on two trees with one common ancestor tree,
> we would first create a tree that is a merge between B and C (which is
> marked as (M) in the picture), and then merge the tree of (M) and A using
> common ancestor between (M) and A.
>
> If we did not have merge_bases_many(), we would actually create (M) as a
> real commit, and compute merge base between A and (M), which is marked as
> "*" in the picture.  The use of merge_bases_many() allows us to run the
> same merge base computation without actually creating commit (M). 
> Instead of computing merge-base between A and (M), you can ask for the
> merge base between A ("the first commit") and B, C ("the other commits")
> to obtain the same answer "*".

Yeah, but what is quite confusing I think is that a merge base seems to be 
defined as "an as good as possible _common_ ancestor" but in this case, the 
result of "git merge-base A B C" is not an ancestor of C (even with --all 
option). So perhaps we need a better definition.

> Base between A and B is that "*", and you are correct to say that it is
> an ancestor of the "*" that is output from the command; base between A
> and C is the parent of "*", and again you are correct to say it is
> ancestor of the "*" that is output from the command.
>
> But if we output any other commit between "*" and A from the command, it
> still satisifies your condition.  "The merge base between A and each of
> B, C,... should be an ancestor of what is output".  In order to satisify
> your condition, in the extreme case, we could even output A.  Both the
> merge base between A and B, and the merge base between A and C, would be
> an ancestor of A.
>
> So your description may not be incorrect, but I think it completely
> misses the point of what is being computed.
>
> >  Author
> >  ------
> > diff --git a/builtin-merge-base.c b/builtin-merge-base.c
> > index 1cb2925..f2c9756 100644
> > --- a/builtin-merge-base.c
> > +++ b/builtin-merge-base.c
> > @@ -2,9 +2,11 @@
> >  #include "cache.h"
> >  #include "commit.h"
> >
> > -static int show_merge_base(struct commit *rev1, struct commit *rev2,
> > int show_all) +static int show_merge_base(struct commit *rev1, int
> > prev2_nr, +			   struct commit **prev2, int show_all)
> >  {
> > -	struct commit_list *result = get_merge_bases(rev1, rev2, 0);
> > +	struct commit_list *result = get_merge_bases_many(rev1, prev2_nr,
> > +							  prev2, 0);
>
> This is just style, but if you must break lines somewhere, I'd prefer to
> have prev2_nr and prev2 on the same line, like this:
>
> 	struct commit_list *result = get_merge_bases_many(rev1,
> 							  prev2_nr, prev2, 0);
>
> because they logically belong to each other.  Further, I think this
> 84-column single-line statement is perfectly fine as well in this case:
>
> 	struct commit_list *result = get_merge_bases_many(rev1, prev2_nr, prev2,
> 0);
>
> I would probably do this myself in this case, though:
>
> 	struct commit_list *result;
>
> 	result  = get_merge_bases_many(rev1, prev2_nr, prev2, 0);

I used that in v3.

Thanks,
Christian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments
  2008-07-28  6:10   ` Christian Couder
@ 2008-07-28  6:18     ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2008-07-28  6:18 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Johannes Schindelin, Miklos Vajna

Christian Couder <chriscool@tuxfamily.org> writes:

> Yeah, but what is quite confusing I think is that a merge base seems to be 
> defined as "an as good as possible _common_ ancestor" but in this case, the 
> result of "git merge-base A B C" is not an ancestor of C (even with --all 
> option). So perhaps we need a better definition.

See my other message.  "git merge-base A B C" is not the best common
ancestors across A B and C.  It is the best common ancestors between A and
a commit that is a merge between B and C.

By defining/explaining it that way, you can avoid the "Huh, are you
talking about OR or AND" question you would inevitably get when you say
"Common ancestor of A and B or A and C".  Also in "git merge-base A B C",
A is fundamentally different from any other commit; a commit being (or not
being) common between A and B (or A and C) is what we care about a lot,
but a commit being (or not being) common between B and C does not matter
in this computation.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-07-28  6:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-27  3:33 [RFC/PATCH v2] merge-base: teach "git merge-base" to accept more than 2 arguments Christian Couder
2008-07-27  4:47 ` Junio C Hamano
2008-07-28  6:10   ` Christian Couder
2008-07-28  6:18     ` Junio C Hamano
2008-07-27 14:38 ` Johannes Schindelin
2008-07-27 15:28   ` Miklos Vajna
2008-07-27 18:11   ` Junio C Hamano
2008-07-27 20:44     ` Johannes Schindelin
2008-07-27 20:47     ` Re* " Junio C Hamano
2008-07-28  5:49   ` Christian Couder
2008-07-27 15:02 ` Jakub Narebski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).