git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* breakage in revision traversal with pathspec
@ 2013-09-10 17:19 Junio C Hamano
  2013-09-10 21:27 ` Kevin Bracey
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2013-09-10 17:19 UTC (permalink / raw)
  To: Kevin Bracey; +Cc: git

I am grumpy X-<.

It appears that we introduced a large breakage during 1.8.4 cycle to
the revision traversal machinery and made pathspec-limited "git log"
pretty much useless.

This command

    $ git log v1.8.3.1..v1.8.4 -- git-cvsserver.perl

reports that a merge 766f0f8ef7 (which did not touch the specified
path at all) touches it.

Bisecting points at d0af663e (revision.c: Make --full-history
consider more merges, 2013-05-16).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-10 17:19 breakage in revision traversal with pathspec Junio C Hamano
@ 2013-09-10 21:27 ` Kevin Bracey
  2013-09-10 22:23   ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Kevin Bracey @ 2013-09-10 21:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 10/09/2013 20:19, Junio C Hamano wrote:
> I am grumpy X-<.
>
> It appears that we introduced a large breakage during 1.8.4 cycle to
> the revision traversal machinery and made pathspec-limited "git log"
> pretty much useless.
>
> This command
>
>      $ git log v1.8.3.1..v1.8.4 -- git-cvsserver.perl
>
> reports that a merge 766f0f8ef7 (which did not touch the specified
> path at all) touches it.
>
> Bisecting points at d0af663e (revision.c: Make --full-history
> consider more merges, 2013-05-16).
>
That merge appearing *with* --full-history would seem like correct 
behaviour to me. Or at least it's what I intended.

That merge *did* touch that path - that file differs between the two 
parents, and it resolved on one of them.

This goes back to my original motivation for the change - the ability to 
find merges that resolved in unexpected ways - I want "--full-history" 
to show every merge where the end result is not identical to every parent.

This does mean that "--full-history" will show more merges than it used 
to for a pathspec - it will show merges in from topic branches which 
didn't touch that pathspec, but where the mainline did change it.

These extra merges can be pared back by "--simplify-merges", which will 
generally eliminate any irrelevant topic branches, although not for 
topic branches that are rooted older than your bottom commit, like in 
this example.


However, your particular example occurs *without*--full-history, which 
suggests a problem.

That merge is right near the bottom of the range. Its first parent is 
v1.8.3. It has an incoming pre-1.8.3 topic branch 
(fc/transport-helper-error-reporting) with an old version of 
git-cvsserver.perl (which the merge correctly didn't take). Display of 
that sort of bottom-of-range merge is a problem area I did try to 
address - it was problematic in older Git versions, but that problem was 
partially concealed by the overly-permissive "hide any merge identical 
to any one parent" logic, and became more exposed by my "show more merges".

I'm pretty certain non-full "git log v1.8.3..v1.8.4" shouldn't show that 
merge. And indeed it doesn't. Yay! At this point the various "follow 
first parent if identical", "prioritise on-graph treesame" and "treat 
bottom commits as on-graph" rules are working. That merge is identical 
to v1.8.3, its first parent, and we've expressed an interest in v1.8.3, 
so that's treated as on-graph, so it doesn't get shown.

(Oddly, "gitk v1.8.3..v1.8.4" fails and shows the merge. It seems gitk 
fails here because of the annotated tag: "gitk v1.8.3^0..v1.8.4" 
correctly shows nothing. So one apparent "failure to peel" bug 
somewhere. Can't seem to provoke this with git log.)

"git log v1.8.3.1..v1.8.4" on the other hand I'm not so sure about. The 
"follow first treesame parent" logic doesn't kick in because the merge's 
only treesame parent (v1.8.3) is off-graph. The merge is not treesame to 
its only on-graph parent (fc/transport-helper-error-reporting), so that 
"default following" rule doesn't activate.

I'm going to have to think a bit. "git log (--ancestry-path?) 
fc/transport-helper-error-reporting..v1.8.4" should definitely show that 
merge - we want to see how we got from the version of the file on the 
specified topic branch to the different version in v1.8.4.

Maybe if the first-treesame-parent rule was reapplied again later after 
rewriting? After we've pruned away the topic branch by rewriting because 
its commits don't touch the pathspec, then our merge is left with two 
off-graph rewritten parents. At which point maybe it would be reasonable 
for the default log to reapply the "follow first treesame parent" rule 
if all remaining parents are off-graph.

Does that make sense? Going to have to think harder.

I note that "gitk v1.8.3^0..v1.8.4" and "git log --parents 
v1.8.3..v1.8.4" show that merge in Git 1.8.3, but not in Git 1.8.4. So 
we're going partially forwards, at least.

Kevin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-10 21:27 ` Kevin Bracey
@ 2013-09-10 22:23   ` Junio C Hamano
  2013-09-11 17:49     ` Kevin Bracey
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2013-09-10 22:23 UTC (permalink / raw)
  To: Kevin Bracey; +Cc: git

Kevin Bracey <kevin@bracey.fi> writes:

> On 10/09/2013 20:19, Junio C Hamano wrote:
>> I am grumpy X-<.
>>
>> It appears that we introduced a large breakage during 1.8.4 cycle to
>> the revision traversal machinery and made pathspec-limited "git log"
>> pretty much useless.
>>
>> This command
>>
>>      $ git log v1.8.3.1..v1.8.4 -- git-cvsserver.perl
>>
>> reports that a merge 766f0f8ef7 (which did not touch the specified
>> path at all) touches it.
>>
>> Bisecting points at d0af663e (revision.c: Make --full-history
>> consider more merges, 2013-05-16).
>>
> That merge appearing *with* --full-history would seem like correct
> behaviour to me. Or at least it's what I intended.

Oh, of course.  "--full-history" is about showing any pointless
change, "the mainline was a lot more up-to-date and there were
changes relative to a fork based on an older baseline", so your
updated "log" should show that in the mainline git-cvsserver.perl
has been more fresh when that merge happened.  But it shouldn't
appear if the user does not ask for "--full-history".

> However, your particular example occurs *without*--full-history, which
> suggests a problem.

Yes.

> I note that "gitk v1.8.3^0..v1.8.4" and "git log --parents
> v1.8.3..v1.8.4" show that merge in Git 1.8.3, but not in Git 1.8.4. So
> we're going partially forwards, at least.

With the testcases demonstrating the cases your series fixed that
all look sensible, I think it is not really an option for us to
revert them; you do not have to defend it with "we are going
partially forwards" ;-).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-10 22:23   ` Junio C Hamano
@ 2013-09-11 17:49     ` Kevin Bracey
  2013-09-11 18:24       ` Jonathan Nieder
  0 siblings, 1 reply; 14+ messages in thread
From: Kevin Bracey @ 2013-09-11 17:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 11/09/2013 01:23, Junio C Hamano wrote:
> Kevin Bracey <kevin@bracey.fi> writes:
>
>> On 10/09/2013 20:19, Junio C Hamano wrote:
>>> This command
>>>
>>>       $ git log v1.8.3.1..v1.8.4 -- git-cvsserver.perl
>>>
>>> reports that a merge 766f0f8ef7 (which did not touch the specified
>>> path at all) touches it.
>>>
>>> Bisecting points at d0af663e (revision.c: Make --full-history
>>> consider more merges, 2013-05-16).
>>>
>> That merge appearing *with* --full-history would seem like correct
>> behaviour to me. Or at least it's what I intended.
> ... But it shouldn't
> appear if the user does not ask for "--full-history".

Well, there is a functioning semi-work-around for now: avoid difficult 
non-linear questions like "v1.8.3.1..v1.8.4". A question like 
"v1.8.3..v1.8.4" is a lot easier to visualise, and it does already omit 
the merge.

On reflection I'm not sure what we should for the "simple history" view 
of v1.8.3.1..v1.8.4. We're not rewriting parents, so we don't get a 
chance to reconsider the merge as being zero-parent, and we do have this 
little section of graph to traverse at the bottom:

           1.8.3
             o----x----x----x----x---x---     (x = included, o = 
excluded, *=!treesame)
                 /
                /*
   o--x--x--x--x

In effect, we do have a linear section of history to follow, and the 
file does change in the middle of that line. It may be quite hard to 
come up with a solid rule to hide the merge that doesn't go wrong 
somewhere else.

The current rules for this are

1) if identical to any on-graph parent, follow that one, and rewrite the 
merge as a non-merge. We currently do not follow to an identical 
off-graph parent. This long-standing comment in try_to_simplify_commit 
applies: "Even if a merge with an uninteresting side branch brought the 
entire change we are interested in, we do not want to lose the other 
branches of this merge, so we just keep going." For this query, the 
mainline link to 1.8.3 is the "uninteresting side branch"! If you do 
specify v1.8.3..v1.8.4, then v1.8.3 becomes "on-graph" thanks to other 
new rules, and this rule does kick in, hiding the merge.

2) If rule 1 doesn't activate, and it remains as a merge, hide it if 
treesame to all on-graph parents. Previously this rule was "hide if 
treesame to any parent", and so that would have hidden the merge.

Now, when I changed rule 2, I did not think this would affect the 
default log. See my commit message:

     "Now redefine a commit's TREESAME flag to be true only if a commit is
     TREESAME to _all_ of its [later: on-graph] parent. This doesn't 
affect ... the default
     simplify_history behaviour (because partially TREESAME merges are 
turned
     into normal commits)..."

Whoops - partially TREESAME merges are not always turned into normal 
commits.

Maybe the fix is to define TREESAME differently for simplify_history - 
to use the old definition of "identical to any parent" in that case. I'm 
not sure that's right though.

I currently feel instinctively more disposed to dropping the older 
"don't follow off-graph identical parents" rule. Let the default history 
go straight to v1.8.3 even though it goes off the graph, stopping us 
traversing the topic branch.

Kevin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-11 17:49     ` Kevin Bracey
@ 2013-09-11 18:24       ` Jonathan Nieder
  2013-09-11 19:21         ` Junio C Hamano
  2013-09-11 19:39         ` Kevin Bracey
  0 siblings, 2 replies; 14+ messages in thread
From: Jonathan Nieder @ 2013-09-11 18:24 UTC (permalink / raw)
  To: Kevin Bracey; +Cc: Junio C Hamano, git

Kevin Bracey wrote:

> On reflection I'm not sure what we should for the "simple history"
> view of v1.8.3.1..v1.8.4. We're not rewriting parents, so we don't
> get a chance to reconsider the merge as being zero-parent, and we do
> have this little section of graph to traverse at the bottom:
>
>           1.8.3
>             o----x----x----x----x---x---     (x = included, o = excluded, *=!treesame)
>                 /
>                /*
>   o--x--x--x--x
[...]
> 1) if identical to any on-graph parent, follow that one, and rewrite
> the merge as a non-merge. We currently do not follow to an identical
> off-graph parent. This long-standing comment in try_to_simplify_commit
> applies: "Even if a merge with an uninteresting side branch brought
> the entire change we are interested in, we do not want to lose the
> other branches of this merge, so we just keep going."
[...]
> 2) If rule 1 doesn't activate, and it remains as a merge, hide it if
> treesame to all on-graph parents. Previously this rule was "hide if
> treesame to any parent", and so that would have hidden the merge.
>
> Now, when I changed rule 2, I did not think this would affect the
> default log. See my commit message:
[...]
> I currently feel instinctively more disposed to dropping the older
> "don't follow off-graph identical parents" rule. Let the default
> history go straight to v1.8.3 even though it goes off the graph,
> stopping us traversing the topic branch.

Thanks for this analysis.  Interesting.

The rule (1) comes from v1.3.0-rc1~13^2~6:

	commit f3219fbbba32b5100430c17468524b776eb869d6
	Author: Junio C Hamano <junkio@cox.net>
	Date:   Fri Mar 10 21:59:37 2006 -0800

	    try_to_simplify_commit(): do not skip inspecting tree change at boundary.
	    
	    When git-rev-list (and git-log) collapsed ancestry chain to
	    commits that touch specified paths, we failed to inspect and
	    notice tree changes when we are about to hit uninteresting
	    parent.  This resulted in "git rev-list since.. -- file" to
	    always show the child commit after the lower bound, even if it
	    does not touch the file.  This commit fixes it.
	    
	    Thanks for Catalin for reporting this.
	    
	    See also:
		461cf59f8924f174d7a0dcc3d77f576d93ed29a4
	    
	    Signed-off-by: Junio C Hamano <junkio@cox.net>

I think you're right that dropping the "don't follow off-graph
treesame parents" rule would be a sensible change.  The usual point of
the "follow the treesame parent" rule is to avoid drawing undue
attention to merges of ancient history where some of the parents are
side-branches with an old version of the files being tracked and did
not actually change those files.  That rationale applies just as much
for a merge on top of an UNINTERESTING rev as any other merge.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-11 18:24       ` Jonathan Nieder
@ 2013-09-11 19:21         ` Junio C Hamano
  2013-09-11 19:39         ` Kevin Bracey
  1 sibling, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2013-09-11 19:21 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Kevin Bracey, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> I think you're right that dropping the "don't follow off-graph
> treesame parents" rule would be a sensible change.  The usual point of
> the "follow the treesame parent" rule is to avoid drawing undue
> attention to merges of ancient history where some of the parents are
> side-branches with an old version of the files being tracked and did
> not actually change those files.  That rationale applies just as much
> for a merge on top of an UNINTERESTING rev as any other merge.

Sounds sensible.  Thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-11 18:24       ` Jonathan Nieder
  2013-09-11 19:21         ` Junio C Hamano
@ 2013-09-11 19:39         ` Kevin Bracey
  2013-09-11 21:15           ` Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: Kevin Bracey @ 2013-09-11 19:39 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, git

On 11/09/2013 21:24, Jonathan Nieder wrote:
> Kevin Bracey wrote:
>
>> On reflection I'm not sure what we should for the "simple history"
>> view of v1.8.3.1..v1.8.4. We're not rewriting parents, so we don't
>> get a chance to reconsider the merge as being zero-parent, and we do
>> have this little section of graph to traverse at the bottom:
>>
>>            1.8.3
>>              o----x----x----x----x---x---     (x = included, o = excluded, *=!treesame)
>>                  /
>>                 /*
>>    o--x--x--x--x
> [...]
>> 1) if identical to any on-graph parent, follow that one, and rewrite
>> the merge as a non-merge. We currently do not follow to an identical
>> off-graph parent. This long-standing comment in try_to_simplify_commit
>> applies: "Even if a merge with an uninteresting side branch brought
>> the entire change we are interested in, we do not want to lose the
>> other branches of this merge, so we just keep going."
>>
> [...]
>> I currently feel instinctively more disposed to dropping the older
>> "don't follow off-graph identical parents" rule. Let the default
>> history go straight to v1.8.3 even though it goes off the graph,
>> stopping us traversing the topic branch.
> Thanks for this analysis.  Interesting.
>
> The rule (1) comes from v1.3.0-rc1~13^2~6: ...
>
> I think you're right that dropping the "don't follow off-graph
> treesame parents" rule would be a sensible change.  The usual point of
> the "follow the treesame parent" rule is to avoid drawing undue
> attention to merges of ancient history where some of the parents are
> side-branches with an old version of the files being tracked and did
> not actually change those files.  That rationale applies just as much
> for a merge on top of an UNINTERESTING rev as any other merge.
I agree about the rationale still applying - why not follow off-graph, 
unless you're doing --ancestry-path? (Fortunately ancestry_path already 
disables simplify_history). That makes more sense if you try to ignore 
the misleading comment. In a typical "v1..v3" range, the temporal 
limiting means that it's paths to the mainline that will tend to be 
marked UNINTERESTING, not to the topic branches...

But I can imagine going off graph it may previously have tripped up 
other parts of the code. It could be that this Git 1.3.0 rule ended up 
covering over some of the older merge hiding logic flakiness. Maybe it's 
no longer necessary. I'll do some experiments.

Now, one bit of news - I have just figured out why gitk is behaving 
differently. It transforms ".." before it reaches git.

To see the effect at the command line: "git log v1.8.3..v.1.8.4" hides 
the merge, but "git log ^v1.8.3 v1.8.4" shows it. Whoops. A new example 
of a dotty shorthand not being exactly equivalent.

In the ".." case the v1.8.3 tag gets peeled before being sent to 
add_rev_cmdline , and the "mark bottom commits" logic works. But in the 
"^" case, the v1.8.3 doesn't get peeled. Junio - any thoughts on the 
correct place to fix that? (And gitk actually does ^<tag-sha>, just to 
be odd, so that needs to be handled too). Should these things be peeled 
in revs->cmdline or not? We should be consistent.

Kevin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-11 19:39         ` Kevin Bracey
@ 2013-09-11 21:15           ` Junio C Hamano
  2013-09-19 21:35             ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2013-09-11 21:15 UTC (permalink / raw)
  To: Kevin Bracey; +Cc: Jonathan Nieder, git

Kevin Bracey <kevin@bracey.fi> writes:

> To see the effect at the command line: "git log v1.8.3..v.1.8.4" hides
> the merge, but "git log ^v1.8.3 v1.8.4" shows it. Whoops. A new
> example of a dotty shorthand not being exactly equivalent.
>
> In the ".." case the v1.8.3 tag gets peeled before being sent to
> add_rev_cmdline , and the "mark bottom commits" logic works. But in
> the "^" case, the v1.8.3 doesn't get peeled.

That sounds like a bug.  ^v1.8.3 should mark v1.8.3^0 as
uninteresting.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-11 21:15           ` Junio C Hamano
@ 2013-09-19 21:35             ` Junio C Hamano
  2013-09-20  3:35               ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2013-09-19 21:35 UTC (permalink / raw)
  To: Kevin Bracey; +Cc: Jonathan Nieder, git

Junio C Hamano <gitster@pobox.com> writes:

> Kevin Bracey <kevin@bracey.fi> writes:
>
>> To see the effect at the command line: "git log v1.8.3..v.1.8.4" hides
>> the merge, but "git log ^v1.8.3 v1.8.4" shows it. Whoops. A new
>> example of a dotty shorthand not being exactly equivalent.
>>
>> In the ".." case the v1.8.3 tag gets peeled before being sent to
>> add_rev_cmdline , and the "mark bottom commits" logic works. But in
>> the "^" case, the v1.8.3 doesn't get peeled.
>
> That sounds like a bug.  ^v1.8.3 should mark v1.8.3^0 as
> uninteresting.

OK, so "git rev-list ^v1.8.3 v1.8.4" throws two objects into
revs->pending.objects[] array.  Two tags, v1.8.3 marked as
UNINTERESTING and v1.8.4.  The revision walking machinery will peel
the tag by calling handle_commit() (which by the way arguably is
misnamed because it has to be called for any type of object) when it
starts to walk in prepare_revision_walk().

But "git rev-list v1.8.3..v1.8.4" throws two commits (v1.8.3^0 with
UNINTERESTING bit and v1.8.4^0) to revs->pending.objects[] after
peeling.  I _think_ it is wrong.  Because the range is only defined
over commit DAG, and because the same codepath handles the symmetric
difference v1.8.3...v1.8.4 as well, both ends of dots operator do
need to be peeled to commits, but I think it is wrong to throw these
peeled results into revs->pending.objects[].

Where it makes a difference is when rev-list is used with --objects.

    $ git rev-list --objects v1.8.4^1..v1.8.4 | grep $(git rev-parse v1.8.4)
    $ git rev-list --objects v1.8.4 ^v1.8.4^1 | grep $(git rev-parse v1.8.4)
    04f013dc38d7512eadb915eba22efc414f18b869 v1.8.4

-- >8 --
Subject: revision: do not peel tags used in range notation

A range notation "A..B" means exactly the same thing as what "^A B"
means, i.e. the set of commits that are reachable from B but not
from A.  But the internal representation after the revision parser
parsed these two notations are subtly different.

 - "rev-list ^A B" leaves A and B in the revs->pending.objects[]
   array, with the former marked as UNINTERESTING and the revision
   traversal machinery propagates the mark to underlying commit
   objects A^0 and B^0.

 - "rev-list A..B" peels tags and leaves A^0 (marked as
   UNINTERESTING) and B^0 in revs->pending.objects[] array before
   the traversal machinery kicks in.

This difference usually does not matter, but starts to matter when
the --objects option is used.  For example, we see this:

    $ git rev-list --objects v1.8.4^1..v1.8.4 | grep $(git rev-parse v1.8.4)
    $ git rev-list --objects v1.8.4 ^v1.8.4^1 | grep $(git rev-parse v1.8.4)
    04f013dc38d7512eadb915eba22efc414f18b869 v1.8.4

With the former invocation, the revision traversal machinery never
hears about the tag v1.8.4 (it only sees the result of peeling it,
i.e. the commit v1.8.4^0), and the tag itself does not appear in the
output.  The latter does send the tag object itself to the output.

Make the range notation keep the unpeeled objects and feed them to
the traversal machinery to fix this inconsistency.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * Made against maint-1.8.0 just in case we may want to fix older
   maintenance series.

 revision.c               | 19 +++++++++++++------
 t/t6000-rev-list-misc.sh |  8 ++++++++
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/revision.c b/revision.c
index 68545c8..a6d2150 100644
--- a/revision.c
+++ b/revision.c
@@ -1159,6 +1159,7 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi
 		    !get_sha1_committish(next, sha1)) {
 			struct commit *a, *b;
 			struct commit_list *exclude;
+			struct object *a_obj, *b_obj;
 
 			a = lookup_commit_reference(from_sha1);
 			b = lookup_commit_reference(sha1);
@@ -1184,14 +1185,20 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi
 				a_flags = flags | SYMMETRIC_LEFT;
 			} else
 				a_flags = flags_exclude;
-			a->object.flags |= a_flags;
-			b->object.flags |= flags;
-			add_rev_cmdline(revs, &a->object, this,
+			a_obj = (!hashcmp(a->object.sha1, from_sha1)
+				 ? &a->object
+				 : lookup_object(from_sha1));
+			b_obj = (!hashcmp(b->object.sha1, sha1)
+				 ? &b->object
+				 : lookup_object(sha1));
+			a_obj->flags |= a_flags;
+			b_obj->flags |= flags;
+			add_rev_cmdline(revs, a_obj, this,
 					REV_CMD_LEFT, a_flags);
-			add_rev_cmdline(revs, &b->object, next,
+			add_rev_cmdline(revs, b_obj, next,
 					REV_CMD_RIGHT, flags);
-			add_pending_object(revs, &a->object, this);
-			add_pending_object(revs, &b->object, next);
+			add_pending_object(revs, a_obj, this);
+			add_pending_object(revs, b_obj, next);
 			return 0;
 		}
 		*dotdot = '.';
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index b10685a..15e3d64 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -48,4 +48,12 @@ test_expect_success 'rev-list --objects with pathspecs and copied files' '
 	! grep one output
 '
 
+test_expect_success 'rev-list A..B and rev-list ^A B are the same' '
+	git commit --allow-empty -m another &&
+	git tag -a -m "annotated" v1.0 &&
+	git rev-list --objects ^v1.0^ v1.0 >expect &&
+	git rev-list --objects v1.0^..v1.0 >actual &&
+	test_cmp expect actual
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-19 21:35             ` Junio C Hamano
@ 2013-09-20  3:35               ` Jeff King
  2013-09-20  4:58                 ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2013-09-20  3:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kevin Bracey, Jonathan Nieder, git

On Thu, Sep 19, 2013 at 02:35:40PM -0700, Junio C Hamano wrote:

> -- >8 --
> Subject: revision: do not peel tags used in range notation
> 
> A range notation "A..B" means exactly the same thing as what "^A B"
> means, i.e. the set of commits that are reachable from B but not
> from A.  But the internal representation after the revision parser
> parsed these two notations are subtly different.
> [...]

Thanks for a very clear explanation. This definitely seems like an
improvement, and the patch looks good to me.

One question, though. With your patch, if I do "tag1..tag2", I get both
the tags and the peeled commits in the pending object list. Whereas with
"^tag1 tag2", we put only the tags into the list, and we expect the
traversal machinery to peel them later. I cannot off-hand think of a
reason this difference should be a problem, but I am wondering if there
is some code path that does not traverse, but just looks at pending
objects, that might care.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-20  3:35               ` Jeff King
@ 2013-09-20  4:58                 ` Junio C Hamano
  2013-09-20  5:11                   ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2013-09-20  4:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Kevin Bracey, Jonathan Nieder, git

Jeff King <peff@peff.net> writes:

> One question, though. With your patch, if I do "tag1..tag2", I get both
> the tags and the peeled commits in the pending object list. Whereas with
> "^tag1 tag2", we put only the tags into the list, and we expect the
> traversal machinery to peel them later. I cannot off-hand think of a
> reason this difference should be a problem, but I am wondering if there
> is some code path that does not traverse, but just looks at pending
> objects, that might care.

Did I really do that?

I thought that the original was pushing peeled tag1^0 and tag2^0
(and nothing else) for "tag1..tag2", and the intent of the patch was
to see if "a" (which is "tag1^0" in this case) has the same object
name as the object originally given on the side of the dots
(i.e. "tag1").  If they differ, that means "a" is the peeled object,
and instead use the original "tag1" for "a_obj" that is pushed into
the pending (and if they are the same, "a_obj" is just "&a->object",
the object itself).  The same for "b", "tag2" and "b_obj".  So at
least I didn't mean to push four objects into the pending list
before prepare_revision_walk() kicks in.

Perhaps I missed something?

Now, when prepare_revision_walk() picks up objects from the pending
list, they are fed to handle_commit(), and these two tags will be
peeled and their commits are returned to be queued in revs->commits
linked list, while the tags themselves are sent to the pending list
to be emitted in "--objects" output. But that should be the same
between "tag1..tag2" and "^tag1 tag2".

A possible difference in behaviour is that with "^tag1 tag2", we do
not instantiate the commit objects pointed at by these tags until
prepare_revision_walk() sends these tags to handle_commit(), while
with "tag1..tag2", these tags and the commit objects would already
be parsed when setup_revisions() returns (and the updated code does
rely on this behaviour by saying "if a->object.sha1 and from_sha1
are different, we know the tag whose name is from_sha1 is already
parsed, so we can just call lookup_object() on from_sha1 to grab
it").  But I do not think any code just tries to grab an object
using a random object name outside the revision traversal and decide
to do things that results in semantically different behaviour if the
resulting object has (or has not) already been parsed.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-20  4:58                 ` Junio C Hamano
@ 2013-09-20  5:11                   ` Jeff King
  2013-09-20 17:51                     ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2013-09-20  5:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kevin Bracey, Jonathan Nieder, git

On Thu, Sep 19, 2013 at 09:58:23PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > One question, though. With your patch, if I do "tag1..tag2", I get both
> > the tags and the peeled commits in the pending object list. Whereas with
> > "^tag1 tag2", we put only the tags into the list, and we expect the
> > traversal machinery to peel them later. I cannot off-hand think of a
> > reason this difference should be a problem, but I am wondering if there
> > is some code path that does not traverse, but just looks at pending
> > objects, that might care.
> 
> Did I really do that?
> 
> I thought that the original was pushing peeled tag1^0 and tag2^0
> (and nothing else) for "tag1..tag2", and the intent of the patch was
> to see if "a" (which is "tag1^0" in this case) has the same object
> name as the object originally given on the side of the dots
> (i.e. "tag1").  If they differ, that means "a" is the peeled object,
> and instead use the original "tag1" for "a_obj" that is pushed into
> the pending (and if they are the same, "a_obj" is just "&a->object",
> the object itself).  The same for "b", "tag2" and "b_obj".  So at
> least I didn't mean to push four objects into the pending list
> before prepare_revision_walk() kicks in.
> 
> Perhaps I missed something?

Hrm, no, it is me misreading the diff.

My original question was going to be: why bother peeling at all if we
are just going to push the outer objects, anyway?

And after staring at it, I somehow convinced myself that the answer was
that you were pushing both. But that is not the case. Sorry for the
noise.

The other reason I considered is that we want to make sure they do peel
to commits. I do not think that is technically required for "A..B",
which can operate on non-commits. But it is for "A...B", and I do not
see any advantage in loosening "A..B" for the non-commit case. It would
just complicate the code.

> Now, when prepare_revision_walk() picks up objects from the pending
> list, they are fed to handle_commit(), and these two tags will be
> peeled and their commits are returned to be queued in revs->commits
> linked list, while the tags themselves are sent to the pending list
> to be emitted in "--objects" output. But that should be the same
> between "tag1..tag2" and "^tag1 tag2".

Yes, I was specifically concerned about sites that did not call
prepare_revision_walk(), but since the state is the same for both cases,
it's a non-issue.

> it").  But I do not think any code just tries to grab an object
> using a random object name outside the revision traversal and decide
> to do things that results in semantically different behaviour if the
> resulting object has (or has not) already been parsed.

Yeah, I think any code relying on that would be insane, from a
modularity perspective. The caching of parsed object state is an
optimization, and callers have no business making assumptions about it.
Otherwise they are fragile with respect to a previous traversal being
added in the same in-memory process.

So I think your patch is doing the right thing, and my concern was just
from mis-reading. Again, sorry for the confusion.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-20  5:11                   ` Jeff King
@ 2013-09-20 17:51                     ` Junio C Hamano
  2013-09-25  9:12                       ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2013-09-20 17:51 UTC (permalink / raw)
  To: Jeff King; +Cc: Kevin Bracey, Jonathan Nieder, git

Jeff King <peff@peff.net> writes:

> My original question was going to be: why bother peeling at all if we
> are just going to push the outer objects, anyway?
>
> And after staring at it, I somehow convinced myself that the answer was
> that you were pushing both. But that is not the case. Sorry for the
> noise.

But that is still a valid point, and the patch to avoid peeling for
non symmetric diff does not look too bad, either.

 revision.c               | 59 ++++++++++++++++++++++++++++++------------------
 t/t6000-rev-list-misc.sh |  8 +++++++
 2 files changed, 45 insertions(+), 22 deletions(-)

diff --git a/revision.c b/revision.c
index 68545c8..7010aff 100644
--- a/revision.c
+++ b/revision.c
@@ -1157,41 +1157,56 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi
 		}
 		if (!get_sha1_committish(this, from_sha1) &&
 		    !get_sha1_committish(next, sha1)) {
-			struct commit *a, *b;
-			struct commit_list *exclude;
-
-			a = lookup_commit_reference(from_sha1);
-			b = lookup_commit_reference(sha1);
-			if (!a || !b) {
-				if (revs->ignore_missing)
-					return 0;
-				die(symmetric ?
-				    "Invalid symmetric difference expression %s...%s" :
-				    "Invalid revision range %s..%s",
-				    arg, next);
-			}
+			struct object *a_obj, *b_obj;
 
 			if (!cant_be_filename) {
 				*dotdot = '.';
 				verify_non_filename(revs->prefix, arg);
 			}
 
-			if (symmetric) {
+			a_obj = parse_object(from_sha1);
+			b_obj = parse_object(sha1);
+			if (!a_obj || !b_obj) {
+			missing:
+				if (revs->ignore_missing)
+					return 0;
+				die(symmetric
+				    ? "Invalid symmetric difference expression %s"
+				    : "Invalid revision range %s", arg);
+			}
+
+			if (!symmetric) {
+				/* just A..B */
+				a_flags = flags_exclude;
+			} else {
+				/* A...B -- find merge bases between the two */
+				struct commit *a, *b;
+				struct commit_list *exclude;
+
+				a = (a_obj->type == OBJ_COMMIT
+				     ? (struct commit *)a_obj
+				     : lookup_commit_reference(a_obj->sha1));
+				b = (b_obj->type == OBJ_COMMIT
+				     ? (struct commit *)b_obj
+				     : lookup_commit_reference(b_obj->sha1));
+				if (!a || !b)
+					goto missing;
 				exclude = get_merge_bases(a, b, 1);
 				add_pending_commit_list(revs, exclude,
 							flags_exclude);
 				free_commit_list(exclude);
+
 				a_flags = flags | SYMMETRIC_LEFT;
-			} else
-				a_flags = flags_exclude;
-			a->object.flags |= a_flags;
-			b->object.flags |= flags;
-			add_rev_cmdline(revs, &a->object, this,
+			}
+
+			a_obj->flags |= a_flags;
+			b_obj->flags |= flags;
+			add_rev_cmdline(revs, a_obj, this,
 					REV_CMD_LEFT, a_flags);
-			add_rev_cmdline(revs, &b->object, next,
+			add_rev_cmdline(revs, b_obj, next,
 					REV_CMD_RIGHT, flags);
-			add_pending_object(revs, &a->object, this);
-			add_pending_object(revs, &b->object, next);
+			add_pending_object(revs, a_obj, this);
+			add_pending_object(revs, b_obj, next);
 			return 0;
 		}
 		*dotdot = '.';
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index b10685a..15e3d64 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -48,4 +48,12 @@ test_expect_success 'rev-list --objects with pathspecs and copied files' '
 	! grep one output
 '
 
+test_expect_success 'rev-list A..B and rev-list ^A B are the same' '
+	git commit --allow-empty -m another &&
+	git tag -a -m "annotated" v1.0 &&
+	git rev-list --objects ^v1.0^ v1.0 >expect &&
+	git rev-list --objects v1.0^..v1.0 >actual &&
+	test_cmp expect actual
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: breakage in revision traversal with pathspec
  2013-09-20 17:51                     ` Junio C Hamano
@ 2013-09-25  9:12                       ` Jeff King
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2013-09-25  9:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kevin Bracey, Jonathan Nieder, git

On Fri, Sep 20, 2013 at 10:51:55AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > My original question was going to be: why bother peeling at all if we
> > are just going to push the outer objects, anyway?
> >
> > And after staring at it, I somehow convinced myself that the answer was
> > that you were pushing both. But that is not the case. Sorry for the
> > noise.
> 
> But that is still a valid point, and the patch to avoid peeling for
> non symmetric diff does not look too bad, either.
> 
>  revision.c               | 59 ++++++++++++++++++++++++++++++------------------
>  t/t6000-rev-list-misc.sh |  8 +++++++
>  2 files changed, 45 insertions(+), 22 deletions(-)

FWIW, the flow of this version makes more sense to me. It also allows
things like:

  git rev-list --objects $blob..$tree

which I cannot see anybody actually wanting, but it somehow seems
simpler to me to say "A..B" is syntactic sugar for "^B A", without
qualifying "except that A and B must be commit-ishes".

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-09-25  9:13 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-10 17:19 breakage in revision traversal with pathspec Junio C Hamano
2013-09-10 21:27 ` Kevin Bracey
2013-09-10 22:23   ` Junio C Hamano
2013-09-11 17:49     ` Kevin Bracey
2013-09-11 18:24       ` Jonathan Nieder
2013-09-11 19:21         ` Junio C Hamano
2013-09-11 19:39         ` Kevin Bracey
2013-09-11 21:15           ` Junio C Hamano
2013-09-19 21:35             ` Junio C Hamano
2013-09-20  3:35               ` Jeff King
2013-09-20  4:58                 ` Junio C Hamano
2013-09-20  5:11                   ` Jeff King
2013-09-20 17:51                     ` Junio C Hamano
2013-09-25  9:12                       ` Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).