git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Bracey <kevin@bracey.fi>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH v2 8/8] revision.c: discount UNINTERESTING parents
Date: Thu, 02 May 2013 20:52:12 +0300	[thread overview]
Message-ID: <5182A7CC.4040009@bracey.fi> (raw)
In-Reply-To: <7vmwsfbtu7.fsf@alter.siamese.dyndns.org>

On 01/05/2013 00:18, Junio C Hamano wrote:

>> These rules paying more attention to UNINTERESTING do add a tricky
>> wrinkle to behaviour. Because limited revision lists are conventionally
>> expressed as A..B (ie B ^A), the bottom commit is UNINTERESTING.
> OK.
>
>> Thus
>> its connection to the INTERESTING graph is not privileged over side
>> branches,
> I take that "its connection" refers to the "===" link below, the
> nodes connected with "---" form the "INTERESTING graph", and
>
>       ....Z...A===X---o---o---B
>            \\    /
>             W---Y
>      
> "side branches" refer to the development that built W and Y and
> merged at X.  And you are saying that A===X is not "privileged" over
> "Y---X", with some definition of "privileged" I am not sure about.

Okay, that's a good graph. The basic problem is that all the rules above 
try to identify a merge from an irrelevant branch and eliminate it. But 
how can we define what a side branch is?

I think the rules I state are conceptually sound - a side branch is an 
extra parent coming in from outside our limited history graph. But the 
problem is at the bottom. In the event that someone specifies "A..B" 
with the above history, we get the resultant graph "W-Y-X-o-o-B". A is 
not on that graph. So with the rules as they stand, "A" being 
UNINTERESTING makes it get treated as a side branch of X, which isn't 
good. A needs to be INTERESTING for the purpose of side-branch logic.

So when someone says "A..B" and generates "W-Y-X-o-o-B", we want to know 
that X's parent path "(Z)-W-Y-X" is the (possibly irrelevant) side 
branch, not "(A)-X".

Example undesirable behaviour, with A treated as a side branch:

1) If only commit A changed our file, and merge X took "old" version Y 
for some reason, under these rules "--full-history A..B" would show 
nothing. X doesn't consider A because it's UNINTERESTING. If there had 
been an intervening (even irrelevant) commit A1 between A1 and X, X 
would have been shown.

2) If only commit A changed our file, and merge X took A, with this 
rules"--simplify-merges A..B" would show X, with two rewritten 
UNINTERESTING parents A and Z. That's not what we want - Z is an 
irrelevant side-branch in this case. If there had been an intervening 
(unsimplifiable) commit A1 between A and X, X would have had INTERESTING 
parent A1, and WY would have been successfully discarded.

3) Even before my new rules, there is one existing place trying to spot 
side branches, and it can fail here for the same reasons. See 
simplify_history's "even if a merge with an uninteresting side branch 
brought the entire change we are interested in" test. It would do the 
wrong thing at X, if W had made a change and Y reverted it. "git log 
A..B" would show W and Y, which is not what we want. As the scan hits X, 
it follows parent Y rather than just go for first parent A, because it 
thinks A is "an uninteresting side branch". Again, if there had been an 
intervening commit A1 before X, X would have followed A1 and W and Y 
would not have been shown.

All 3 cases work if A is treated as "INTERESTING" for side-branch rules. 
We shouldn't have needed to put in an extra A1 commit to make them work.

>>   #          D---E-------F
>>   #         /     \       \
>> +#    B---C-G0-G--H---I---J
>>   #   /                     \
>>   #  A-------K---------------L--M
>>   #
>>
>>
>> Conceptually, the "ancestry-path" shouldn't get affected by any
>> pathspec. The range "--ancestry-path G0..M" should be equivalent to
>> the range "G0..M ^F ^E ^K", and with the rule to ignore non-sameness
>> with uninteresting side branches, I would have expected that H and J
>> would be equally irrelevant, because E and F would be outside the
>> graph we would want to look at sameness.

Those two pathspecs produce the same graph of commits, and yes, they've 
always produced the same thing up until now. But we're trying to do 
something new(ish) here. We're trying to define "side branch", to allow 
us to make more useful pruning comparisons. And we can't reliably define 
"side branch" unless we can reliably define where we're coming from.

Looking at this case, given that graph of commits G-H-I-J-L-M (produced 
by any pathspec/flags), that is "easy" because it's linear. The bottom 
of the INTERESTING graph has a single parent, and we can follow it 
straight from bottom to top. We can deduce that non-merge "G" is bottom, 
and thus call the connections to E and F "sides". (But that could have 
been wrong if the graph had been made some other way, and the user 
wasn't asking for history since G0).

But if presented with H-I-J-L-M we get stuck. The lowest commit in our 
graph has 2 parents. How do we distinguish between E and G? Which is the 
side, and which is the bottom? We can only define "side branch" here if 
we know where our bottom is. The version of this patch as posted can't 
distinguish whether E or G is more important, so merge H always gets 
shown. And that makes me unhappy. And note that normal 
merge-simplification will often prune away boring non-merge commits, 
leaving the user-specified UNINTERESTING bottom attached to an 
INTERESTING graph by a merge commit like this. So it would be very 
common to stumble over this first connection onto the INTERESTING graph 
with --simplify-merges.

So my proposal here is that for pruning purposes, we need to mark the 
bottom(s), so we can reliably identify the sides.A side branch is 
spotted by being UNINTERESTING && !BOTTOM.

Doing this means that "--ancestry-path E..M", "--ancestry-path G..M" and 
"G..M ^F ^E ^K" would all produce a walk starting at H, but for pruning 
purposes they will prioritise differences against specified bottom 
commits, and discount them against non-specified boundary commits.

So "E..M" will treat "G" as a side branch, and "G..M" will treat "D-E" 
as a side branch. "--ancestry-path E..M" will show H iff it differs from 
E, and "--ancestry-path G..M" will show H iff it differs from G.

And as a result, manually specifying with "G..M ^F ^E ^K" will be paying 
extra attention to merges H, J and L, caring if they differ from listed 
commits E F and K, and not treating them as ignorable side branches. H 
will be shown if it differs from either G or E.

All of which feels right and good to me, but it does mean the neat 
mathematical purity of the commit set model is somewhat disrupted - 
different pathspecs specifying the same set of commits will prune 
differently. But I think the behaviour fits intuitive expectation well. 
When a user specifies "A..B" or "B ^A", they almost always mean that 
they want the change since A, and they’re not thinking of A as a “commit 
set subtraction”. The specific edge to A is important to them. So let's 
be helpful and meet normal expectation, and treat the edge between the 
INTERESTING graph and the specified bottom commit as important.

Revised patch 8/8 doing this follows. The fact it adds a BOTTOM flag 
potentially has an impact on other areas, as noted in the comments. If 
we went down this route, the BOTTOM flag addition would obviously want 
to be a separate preceding patch, covering impact on 
collect_bottom_commits etc.

And the thing about "only need merges with 2+ INTERESTING parents for 
topology" should really be separated out too - that's actually quite 
distinct from the TREESAME side branch stuff, and rather more 
straightforward.

Kevin

  reply	other threads:[~2013-05-02 19:09 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 1/8] decorate.c: compact table when growing Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 2/8] t6019: test file dropped in -s ours merge Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 3/8] rev-list-options.txt: correct TREESAME for P Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 4/8] revision.c: Make --full-history consider more merges Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 5/8] t6012: update test for tweaked full-history traversal Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 6/8] simplify-merges: never remove all TREESAME parents Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 7/8] simplify-merges: drop merge from irrelevant side branch Kevin Bracey
2013-04-30 20:54   ` Junio C Hamano
2013-04-30 17:26 ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Kevin Bracey
2013-04-30 21:18   ` Junio C Hamano
2013-05-02 17:52     ` Kevin Bracey [this message]
2013-05-02 17:58       ` [PATCH v2.1 8/8] revision.c: discount side branches when computing TREESAME Kevin Bracey
2013-05-02 18:26       ` [PATCH v2.2 " Kevin Bracey
2013-05-02 20:05       ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Junio C Hamano
2013-05-04 20:18         ` Kevin Bracey
2013-04-30 21:28 ` [PATCH v2 0/8] History traversal refinements Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5182A7CC.4040009@bracey.fi \
    --to=kevin@bracey.fi \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).