From: Kevin Bracey <kevin@bracey.fi>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [RFC/PATCH 1/3] revision.c: tighten up TREESAME handling of merges
Date: Sun, 28 Apr 2013 10:03:10 +0300
Message-ID: <517CC9AE.30407@bracey.fi>
In-Reply-To: <7vppxfsirl.fsf@alter.siamese.dyndns.org>

On 28/04/2013 01:36, Junio C Hamano wrote:
> Kevin Bracey <kevin@bracey.fi> writes:
>
>> Historically TREESAME was set on a commit if it was TREESAME to _any_ of
>> its parents. This is not optimal, as such a merge could still be worth
>> showing, particularly if it is an odd "-s ours" merge that (possibly
>> accidentally) dropped a change.
> "... and with options like --full-history or --simplify-merges are
> used to get more complete history", I think. "git log path" without
> these options is a tool to get one version of simplified history
> that explains the end result, and by definition, the side branch
> merged by "-s ours" did _not_ contribute anything to the end result.
Yeah, I'm not happy with this commit message - I knocked it up
separately from my first pass, which I didn't have to hand. Next version
will combine it with the original, which better distinguished the
default mode, and specifically addressed the "--full-history -S" search
problem.
That's key - that I really want such searches to be able to track the
entire life of a change on a side branch, not potentially showing just
its birth as now, but also always including any ultimate merge death. (I
think that we may be able to refine --ancestry-path to give an even
tighter pinpoint, but --full-history should definitely include the
information, as per its name).
>
> Do we want to discard the decoration data when the commit becomes a
> non-merge?
Would seem reasonable, and would also help make concrete why we update
TREESAME immediately, and not in update_treesame(), but I didn't spot a
mechanism to discard decoration. I'll recheck.
>
>> + commit->object.flags |= TREESAME;
>> + for (n = 0; n < st->nparents; n++) {
>> + if (!st->treesame[n]) {
>> + commit->object.flags &= ~TREESAME;
>> + break;
>> + }
>> + }
> Can a commit that earlier was marked as TREESAME become not TREESAME?
> Wouldn't simplification only increase sameness, never decrease?
That's true - I paid attention to that earlier when it really mattered
due to the cost of recalculating it with try_to_simplify_commit(). Not
sure that it matters so much any more, and I don't see how we can use
that information to change this "scan for !treesame" loop.
I could insert an "if (!(commit->object.flags & TREESAME))" test to skip
the entire update. I'd be inclined to do that as the caller of
update_treesame(). I think update_treesame() itself should be
general-purpose without assumptions about what changes have been made,
so it's a pure treesame[]->TREESAME calculation, without TREESAME as an
input.
(Aside - just occurred to me we could swap the loop for
"strlen(st->treesame) == st->nparents", if we kept a zero terminator in
the array. Maybe a bit too smart-ass?)
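To make the split concrete, here is a minimal sketch of a pure treesame[]->TREESAME calculation with the skip left to the caller. The struct layout, the flag value and both function names are stand-ins invented for illustration, not the definitions in revision.c. The caller-side shortcut assumes, as discussed above, that simplification only ever increases sameness. Note the parentheses around the mask test, since "!" binds tighter than "&".

```c
#include <assert.h>
#include <stddef.h>

#define TREESAME (1u << 2)	/* illustrative bit value only */

/* Hypothetical stand-in for the per-merge decoration. */
struct treesame_state {
	unsigned int nparents;
	const unsigned char *treesame;	/* one 0/1 entry per parent */
};

/* Pure calculation: TREESAME only if every parent is treesame. */
static unsigned int compute_treesame(const struct treesame_state *st,
				     unsigned int flags)
{
	unsigned int n;

	assert(st != NULL);
	flags |= TREESAME;
	for (n = 0; n < st->nparents; n++) {
		if (!st->treesame[n]) {
			flags &= ~TREESAME;
			break;
		}
	}
	return flags;
}

/* Caller-side shortcut: skip the scan if the commit is already TREESAME. */
static unsigned int update_if_needed(const struct treesame_state *st,
				     unsigned int flags)
{
	if (!(flags & TREESAME))	/* parenthesised mask test */
		flags = compute_treesame(st, flags);
	return flags;
}
```

Keeping the shortcut out of compute_treesame() means the function gives the same answer whatever the previous flag state was.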
>
>> + for (pp = &commit->parents;
>> + (parent = *pp) != NULL;
>> + pp = &parent->next, nth_parent++) {
> I see the reason to change from while to for is because you wanted
> to count, and I think it makes sense; but it is more readable to
> initialise the counter here, too, if that is the case. I.e.
>
> for (pp = &commit->parents, nth_parent = 0;
> !(parent = *pp);
> pp = &parent->next, nth_parent++) {
Agree on nth_parent, but "!(parent = *pp)" isn't "(parent = *pp) !=
NULL", mind. Did you mean "!!"? In which case I still prefer it my way.
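The difference between the two conditions is easy to check in isolation. The sketch below uses a minimal stand-in list type (not git's struct commit_list) and two hypothetical helper functions. One walks the list with the "!= NULL" test, the other uses the negated test from the suggestion.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a parent list; not git's struct commit_list. */
struct node {
	struct node *next;
};

/* Walks the list through a pointer-to-pointer, counting as it goes. */
static int count_parents(struct node **head)
{
	struct node **pp, *parent;
	int nth_parent;

	assert(head != NULL);
	for (pp = head, nth_parent = 0;
	     (parent = *pp) != NULL;
	     pp = &parent->next, nth_parent++)
		; /* a real loop body could unlink *pp here */
	return nth_parent;
}

/* Same loop with the negated test: the body runs only when *pp is NULL. */
static int count_parents_negated(struct node **head)
{
	struct node **pp, *parent;
	int nth_parent;

	assert(head != NULL);
	for (pp = head, nth_parent = 0;
	     !(parent = *pp);
	     pp = &parent->next, nth_parent++)
		return -1; /* bail out before stepping through a NULL parent */
	return nth_parent;
}
```

With a three-element list the first function counts 3. The negated one never enters its body and reports 0, and on an empty list it enters the body straight away.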
>
>> + if (!tree_changed)
>> + ts->treesame[0] = 1;
> Have we made any two tree comparison at this point to set this one?
> Ahh, this is tricky. You do this in the _second_ iteration of the
> loop, so tree_changed here is from inspecting the first parent, not
> the one we are looking at (i.e. *p).
Yes, this is the "we've reached our second iteration, so from now on
we're dealing with a merge" if {} block. I'll clarify this in the comment
at the top, and note that we're populating the newly-allocated treesame[]
from our first iteration.
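A stripped-down sketch of that control flow, to show why slot 0 is filled in late. It assumes the per-parent comparison results arrive in a plain int array. The record_treesame() helper and the struct layout are hypothetical and much simpler than the real try_to_simplify_commit().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-merge record: one treesame byte per parent. */
struct treesame_state {
	unsigned int nparents;
	unsigned char treesame[];	/* C99 flexible array member */
};

/*
 * changed[n] is nonzero if the commit differs from parent n on the
 * paths of interest. Returns NULL for non-merges (fewer than two
 * parents) or on allocation failure; the caller frees the result.
 */
static struct treesame_state *record_treesame(const int *changed,
					      unsigned int nparents)
{
	struct treesame_state *ts = NULL;
	unsigned int nth_parent;
	int tree_changed = 0;

	assert(changed != NULL);
	for (nth_parent = 0; nth_parent < nparents; nth_parent++) {
		if (nth_parent == 1) {
			/*
			 * Second iteration: this is a merge. tree_changed
			 * still describes the first parent only, so use it
			 * to fill in slot 0 of the fresh array.
			 */
			ts = calloc(1, sizeof(*ts) + nparents);
			if (!ts)
				return NULL;
			ts->nparents = nparents;
			if (!tree_changed)
				ts->treesame[0] = 1;
		}
		if (changed[nth_parent])
			tree_changed = 1;
		else if (ts)
			ts->treesame[nth_parent] = 1;
	}
	return ts;
}
```

Single-parent commits never reach the allocation, so they carry no per-parent array.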
>
>>
>> @@ -773,6 +861,9 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
>> * NEEDSWORK: decide if we want to remove parents that are
>> * not marked with TMP_MARK from commit->parents for commits
>> * in the resulting list. We may not want to do that, though.
>> + *
>> + * Maybe it should be considered if we are TREESAME to such
>> + * parents - now possible with stored per-parent flags.
>> */
> Hmm, that is certainly a thought.
My comment's wrong though. Reconsidering, what I think needs removing is
actually off-ancestry parents that we are !TREESAME to, when we are
TREESAME on the ancestry path.
I've realised while testing this that there's been one thing that's
confused me repeatedly, and I think this comment was an example of it.
The example in the rev-list-options manual is wrong.
	  .-A---M---N---O---P
	 /     /   /   /   /
	I     B   C   D   E
	 \   /   /   /   /
	  `-------------'
Contrary to the manual, merge P is !TREESAME to E (or I). E's base is
old enough that E isn't up-to-date w.r.t. "foo". Thus merge "P" is no
longer TREESAME and does become subject to display with the new
--full-history:
I A B N D O P
I believe this is correct, because P is a merge that determined the fate
of "foo", so merits --full-history inspection. (--simplify-merges
obviously knocks P back out again: --simplify-merges becomes more
important if --full-history gets fuller).
Given this error, and this change, I think this example may want a
slight rethink. Do we want a proper "messing with other paths but
TREESAME merge" example? Say if E's parent was O, P would be
TREESAME and not included in --full-history.
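The two readings of P can be compared with a small sketch of the old "any parent" rule against the tightened "every parent" rule. The function names are hypothetical, and each string stands in for the contents of "foo" at a commit. The values used in the note below are placeholders, not the manual's actual contents.

```c
#include <assert.h>
#include <string.h>

/* Old rule: a merge is TREESAME if it matches any parent. */
static int treesame_any(const char *self, const char **parents, int n)
{
	int i;

	assert(self && parents);
	for (i = 0; i < n; i++)
		if (!strcmp(self, parents[i]))
			return 1;
	return 0;
}

/* Tightened rule: a merge is TREESAME only if it matches every parent. */
static int treesame_all(const char *self, const char **parents, int n)
{
	int i;

	assert(self && parents);
	for (i = 0; i < n; i++)
		if (strcmp(self, parents[i]))
			return 0;
	return 1;
}
```

With P's foo as "merged" and parents {O: "merged", E: "original"}, treesame_any() says TREESAME but treesame_all() does not, which is the case in the graph. If E had forked from O, both parents would carry "merged" and both rules would agree that P is TREESAME.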
>
> OK, even though the use of TMP_MARK (meant to be very localized)
> across two functions feel somewhat yucky, they are file scope
> statics next to each other and hopefully are called back to back.
Well, by the end of the series you've got two functions setting it, in
preparation for later input to this function. And what's the upper bound
on complexity of functions that may want to mark removal? They may need
TMP_MARK to do the job. I'm beginning to think that it should be a
dedicated REMOVE bit.
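As a rough illustration of what a dedicated bit would buy, the sketch below prunes a parent list by its own removal flag and leaves TMP_MARK free for other uses. The REMOVE_MARK name, both bit values, the struct and the drop_marked() helper are all hypothetical; nothing here is taken from revision.h.

```c
#include <assert.h>
#include <stddef.h>

#define TMP_MARK	(1u << 0)	/* illustrative values only */
#define REMOVE_MARK	(1u << 1)	/* hypothetical dedicated bit */

/* Minimal stand-in for a flagged parent list entry. */
struct parent {
	unsigned int flags;
	struct parent *next;
};

/* Unlinks every parent carrying REMOVE_MARK; returns how many were dropped. */
static int drop_marked(struct parent **head)
{
	struct parent **pp = head, *p;
	int dropped = 0;

	assert(head != NULL);
	while ((p = *pp) != NULL) {
		if (p->flags & REMOVE_MARK) {
			*pp = p->next;	/* unlink; TMP_MARK is left untouched */
			dropped++;
		} else {
			pp = &p->next;
		}
	}
	return dropped;
}
```

Any function that decides on removal can then set REMOVE_MARK while still using TMP_MARK for its own short-lived bookkeeping.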
Kevin