From: Kevin Bracey <kevin@bracey.fi>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [RFC/PATCH 1/3] revision.c: tighten up TREESAME handling of merges
Date: Sun, 28 Apr 2013 10:03:10 +0300
Message-ID: <517CC9AE.30407@bracey.fi>
In-Reply-To: <7vppxfsirl.fsf@alter.siamese.dyndns.org>

On 28/04/2013 01:36, Junio C Hamano wrote:
> Kevin Bracey <kevin@bracey.fi> writes:
>
>> Historically TREESAME was set on a commit if it was TREESAME to _any_ of
>> its parents. This is not optimal, as such a merge could still be worth
>> showing, particularly if it is an odd "-s ours" merge that (possibly
>> accidentally) dropped a change.
> "... and with options like --full-history or --simplify-merges are
> used to get more complete history", I think.  "git log path" without
> these options is a tool to get one version of simplified history
> that explains the end result, and by definition, the side branch
> merged by "-s ours" did _not_ contribute anything to the end result.

Yeah, I'm not happy with this commit message - I knocked it up 
separately from my first pass, which I didn't have to hand. Next version 
will combine it with the original, which better distinguished the 
default mode, and specifically addressed the "--full-history -S" search 
problem.

That's key - that I really want such searches to be able to track the 
entire life of a change on a side branch, not potentially showing just 
its birth as now, but also always including any ultimate merge death. (I 
think that we may be able to refine --ancestry-path to give an even 
tighter pinpoint, but --full-history should definitely include the 
information, as per its name).

>
> Do we want to discard the decoration data when the commit becomes a
> non-merge?

Would seem reasonable, and would also help make concrete why we update 
TREESAME immediately, and not in update_treesame(), but I didn't spot a 
mechanism to discard decoration. I'll recheck.

>
>> +		commit->object.flags |= TREESAME;
>> +		for (n = 0; n < st->nparents; n++) {
>> +			if (!st->treesame[n]) {
>> +				commit->object.flags &= ~TREESAME;
>> +				break;
>> +			}
>> +		}
> Can a commit that earlier was marked as TREESAME become not TREESAME?
> Wouldn't simplification only increase sameness, never decrease?

That's true - I paid attention to that earlier when it really mattered 
due to the cost of recalculating it with try_to_simplify_commit(). Not 
sure that it matters so much any more, and I don't see how we can use 
that information to change this "scan for !treesame" loop.

I could insert an "if (!(commit->object.flags & TREESAME))" test to skip 
the entire update. I'd be inclined to do that as the caller of 
update_treesame(). I think update_treesame() itself should be 
general-purpose without assumptions about what changes have been made, 
so it's a pure treesame[]->TREESAME calculation, without TREESAME as an 
input.
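
A minimal sketch of that split, assuming a cut-down treesame_state and an 
illustrative TREESAME bit value (neither is the real revision.c 
definition; compute_treesame() and update_if_needed() are hypothetical 
names). The calculation never reads the old flag, and the "sameness only 
increases" shortcut sits in the caller:

```c
#include <assert.h>
#include <stddef.h>

#define TREESAME (1u << 2)	/* illustrative bit value for this sketch */
#define MAX_PARENTS 8		/* fixed size only to keep the sketch short */

/* Simplified stand-in for the per-parent state discussed above. */
struct treesame_state {
	unsigned int nparents;
	unsigned char treesame[MAX_PARENTS];
};

/* Pure treesame[] -> TREESAME: the incoming TREESAME bit is never consulted. */
unsigned int compute_treesame(unsigned int flags,
			      const struct treesame_state *st)
{
	unsigned int n;

	assert(st != NULL && st->nparents <= MAX_PARENTS);
	flags |= TREESAME;
	for (n = 0; n < st->nparents; n++) {
		if (!st->treesame[n]) {
			flags &= ~TREESAME;
			break;
		}
	}
	return flags;
}

/*
 * Caller-side shortcut: if sameness can only increase, a commit that is
 * already TREESAME needs no recalculation.
 */
unsigned int update_if_needed(unsigned int flags,
			      const struct treesame_state *st)
{
	if (!(flags & TREESAME))
		flags = compute_treesame(flags, st);
	return flags;
}
```

Note the parentheses in the caller's test: "!flags & TREESAME" would 
negate the whole flags word first and then mask it.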

(Aside - just occurred to me we could swap the loop for 
"strlen(st->treesame) == st->nparents", if we kept a zero terminator in 
the array. Maybe a bit too smart-ass?)
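
For what it's worth, a sketch of the two forms side by side. It assumes 
treesame[] is a char array whose "same" entries are non-zero and which 
carries one extra zero byte at index nparents; the function names are 
made up for illustration:

```c
#include <assert.h>
#include <string.h>

/* Loop form: scan for the first parent we are not TREESAME to. */
int all_treesame_loop(const char *treesame, size_t nparents)
{
	size_t n;

	assert(treesame != NULL);
	for (n = 0; n < nparents; n++)
		if (!treesame[n])
			return 0;
	return 1;
}

/*
 * strlen form: relies on treesame[nparents] == 0. strlen() stops at the
 * first zero byte, so it only reaches nparents when no earlier entry is 0.
 */
int all_treesame_strlen(const char *treesame, size_t nparents)
{
	assert(treesame != NULL);
	return strlen(treesame) == nparents;
}
```

Both return the same answer as long as the terminator is maintained; the 
strlen form just hides the scan.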


>
>> +	for (pp = &commit->parents;
>> +	     (parent = *pp) != NULL;
>> +	     pp = &parent->next, nth_parent++) {
> I see the reason to change from while to for is because you wanted
> to count, and I think it makes sense; but it is more readable to
> initialise the counter here, too, if that is the case. I.e.
>
> 	for (pp = &commit->parents, nth_parent = 0;
> 	     !(parent = *pp);
> 	     pp = &parent->next, nth_parent++) {

Agree on nth_parent, but "!(parent = *pp)" isn't "(parent = *pp) != NULL", 
mind. Did you mean "!!"? In which case I still prefer it my way.
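
To make the difference concrete, a small sketch using a cut-down 
commit_list (an int payload instead of a commit pointer; the counting 
helpers are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in: the real list carries a struct commit pointer. */
struct commit_list {
	int item;
	struct commit_list *next;
};

/* Explicit comparison: walks the whole list. */
int count_parents(struct commit_list **head)
{
	struct commit_list **pp, *parent;
	int nth_parent;

	for (pp = head, nth_parent = 0;
	     (parent = *pp) != NULL;
	     pp = &parent->next, nth_parent++)
		;
	return nth_parent;
}

/* "!!" is equivalent to "!= NULL": also walks the whole list. */
int count_parents_double_bang(struct commit_list **head)
{
	struct commit_list **pp, *parent;
	int nth_parent;

	for (pp = head, nth_parent = 0;
	     !!(parent = *pp);
	     pp = &parent->next, nth_parent++)
		;
	return nth_parent;
}

/*
 * Single "!" inverts the test: on a non-empty list the loop stops before
 * its first iteration. On an empty list it would go on to dereference
 * NULL, hence the precondition.
 */
int count_parents_negated(struct commit_list **head)
{
	struct commit_list **pp, *parent;
	int nth_parent;

	assert(*head != NULL);
	for (pp = head, nth_parent = 0;
	     !(parent = *pp);
	     pp = &parent->next, nth_parent++)
		;
	return nth_parent;
}
```

With a two-parent list the first two helpers count both parents, while 
the negated form never enters the loop body.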

>
> +				if (!tree_changed)
> +					ts->treesame[0] = 1;
> Have we made any two tree comparison at this point to set this one?
> Ahh, this is tricky.  You do this in the _second_ iteration of the
> loop, so tree_changed here is from inspecting the first parent, not
> the one we are looking at (i.e. *p).

Yes, this is the "we've reached our second iteration, so from now on 
we're dealing with a merge" if {} block. I'll clarify this in the comment at 
the top, and note that we're populating the newly-allocated treesame[] 
from our first iteration.
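
Roughly, the shape is as in this standalone sketch. It is not the patch 
itself: scan_parents() is a hypothetical helper, the changed[] array 
stands in for the real tree comparison, and the early exits of the real 
loop are left out:

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified per-parent state, allocated only once we know it's a merge. */
struct treesame_state {
	unsigned int nparents;
	unsigned char treesame[];	/* one entry per parent */
};

/*
 * changed[i] != 0 means "our tree differs from parent i" (stand-in for
 * the real diff). Returns NULL for non-merges or on allocation failure;
 * the caller frees the result.
 */
struct treesame_state *scan_parents(const int *changed, unsigned int nparents)
{
	struct treesame_state *ts = NULL;
	int tree_changed = 0;
	unsigned int nth_parent;

	assert(changed != NULL || nparents == 0);
	for (nth_parent = 0; nth_parent < nparents; nth_parent++) {
		if (nth_parent == 1) {
			/* Second iteration: from now on this is a merge. */
			ts = calloc(1, sizeof(*ts) + nparents);
			if (!ts)
				return NULL;
			ts->nparents = nparents;
			/* tree_changed still describes only the first parent. */
			if (!tree_changed)
				ts->treesame[0] = 1;
		}
		if (changed[nth_parent])
			tree_changed = 1;
		else if (ts)
			ts->treesame[nth_parent] = 1;
	}
	return ts;
}
```

The point is only that treesame[0] is filled in retroactively, from the 
result the first iteration left behind in tree_changed.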

>
>>   
>> @@ -773,6 +861,9 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
>>   	 * NEEDSWORK: decide if we want to remove parents that are
>>   	 * not marked with TMP_MARK from commit->parents for commits
>>   	 * in the resulting list.  We may not want to do that, though.
>> +	 *
>> +	 * Maybe it should be considered if we are TREESAME to such
>> +	 * parents - now possible with stored per-parent flags.
>>   	 */
> Hmm, that is certainly a thought.

My comment's wrong though. Reconsidering, what I think needs removing is 
actually off-ancestry parents that we are !TREESAME to, when we are 
TREESAME on the ancestry path.

I've realised while testing this that there's been one thing that's 
confused me repeatedly, and I think this comment was an example of it. 
The example in the rev-list-options manual is wrong.

           .-A---M---N---O---P
          /     /   /   /   /
         I     B   C   D   E
          \   /   /   /   /
           `-------------'

Contrary to the manual, merge P is !TREESAME to E (or I).  E's base is 
old enough that E isn't up-to-date w.r.t. "foo". Thus merge "P" is no 
longer TREESAME and does become subject to display with the new 
--full-history:

    I  A  B  N  D  O  P

I believe this is correct, because P is a merge that determined the fate 
of "foo", so merits --full-history inspection. (--simplify-merges 
obviously knocks P back out again: --simplify-merges becomes more 
important if --full-history gets fuller).
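
As a toy model of the old and new rules, assuming each commit is reduced 
to the contents of "foo" as a string (the helper names and any concrete 
contents fed to them are placeholders, not taken from the manual):

```c
#include <assert.h>
#include <string.h>

/* Old rule: a merge is TREESAME if "foo" matches ANY parent. */
int treesame_to_any(const char *self, const char *const *parents, int nparents)
{
	int n;

	assert(self != NULL && parents != NULL);
	for (n = 0; n < nparents; n++)
		if (!strcmp(self, parents[n]))
			return 1;
	return 0;
}

/* New rule: a merge is TREESAME only if "foo" matches ALL parents. */
int treesame_to_all(const char *self, const char *const *parents, int nparents)
{
	int n;

	assert(self != NULL && parents != NULL);
	for (n = 0; n < nparents; n++)
		if (strcmp(self, parents[n]))
			return 0;
	return 1;
}
```

Feed it P with O's version of "foo" and parents O and E, where E still 
carries I's version: the "any" rule calls P TREESAME, the "all" rule does 
not, which is why P now shows up.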

Given this error, and this change, I think this example may want a 
slight rethink. Do we want a proper "messing with other paths but 
TREESAME merge" example? Say if E's parent was O, P would be TREESAME to 
both parents and so not included in --full-history.

>
> OK, even though the use of TMP_MARK (meant to be very localized)
> across two functions feel somewhat yucky, they are file scope
> statics next to each other and hopefully are called back to back.

Well, by the end of the series you've got two functions setting it, in 
preparation for later input to this function. And what's the upper bound 
on complexity of functions that may want to mark removal? They may need 
TMP_MARK to do the job. I'm beginning to think that it should be a 
dedicated REMOVE bit.

Kevin
