From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Piotr Krukowiecki <piotr.krukowiecki@gmail.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: [bug] git checkout lies about number of ahead commits when switching from detached HEAD
Date: Mon, 21 Mar 2011 06:35:49 -0400 [thread overview]
Message-ID: <20110321103549.GA16334@sigill.intra.peff.net> (raw)
In-Reply-To: <7vd3llwrah.fsf@alter.siamese.dyndns.org>
On Sun, Mar 20, 2011 at 12:00:38PM -0700, Junio C Hamano wrote:
> "Let the one that has more information to do the clearing" is not about
> performance but about correctness and reusability potential.
>
> When you insert a new traversal B in the existing codeflow before another
> traversal C, B knows not just which commits it started from (hence knows
> which commits were marked by it), but more importantly it also knows what
> mark bits were potentially modified. If the existing codeflow had another
> traversal A before you added B, and C took the marks A left on the objects
> into account while doing its work, the only sensible clean-up is to clear
> marks B touched (and no other marks) immediately after B, and we obviously
> do not want C (and any later traversals in other codepaths that we may
> later want to insert B) to have too much knowledge of which marks may have
> been clobbered by B.
I see what you are saying, but from a software engineering perspective,
I doubt it works very well in practice. The commit flags are basically
global variables. So imagine you have two traversals, A and C, and C
cares about some intermediate result from A. You want to add a new one,
B, in between the two, and have it clean up after itself. This is going
to work only if one of the two is the case:
1. B does not use any of the same marks as A. Otherwise, it will end
up clearing part of A's result during cleanup (not to mention that
it may get a bogus result because of what was left from A).
2. B works on a totally disjoint set of history from A and C.
I don't think (1) is that realistic in the general case. Sure, you might
only want to use TMP_MARK. But the revision walker behind the scenes is
using SEEN and UNINTERESTING, which is going to bring you into conflict.
For (2), there are cases where it works. In this recent bug, for
example, it would sometimes produce the correct results depending on the
exact traversal done in the orphan-warning (e.g., if you were exploring
way back in history near a tag, the traversal wouldn't come near the
commits needed to get the tracking info for the new branch).
But you can only know that it's a reasonable thing to do if you manually
analyze A, B, and C as a set, and even then only if you can know pretty
specifically what the inputs are. So there's no modularity, and in any
reasonably complex case you are not going to have enough control of the
inputs to be sure it isn't buggy.
So in my mind, the only way to keep sanity is to never insert a "B"
between an A and C who care about each other's results. And unless you
are specifically depending on a previous walk, each walk will want all
marks cleared (whether it's because the previous walker cleaned up, or
because we are clearing them before starting).
This whole global marks thing is really pretty nasty when you come right
down to it. I know why we do it; it's faster to dereference the pointer
rather than looking up the flags in some external per-revision-walker
hash table. But I wonder if we could come up with something else close
to it in speed that would allow per-walker flags.
We allocate commits from a custom allocator pool. I wonder if we could
use the offset from the commit pointer to the pool base to assign an
index to each commit, and then use that index to access an array of
flags in struct rev_info. There are a few problems:
1. Our allocator has many pools, and we don't know which one is the
right base. So I don't think the offset thing works, and we would
be stuck assigning some index into each allocated object.
2. We'd have to grow the rev_info array dynamically. The growth itself
is probably not a huge deal, as its amortized constant effort. But
on each flag access we'd have to do a bounds-check. I don't know if
that would be measurable or not.
> > So how about this?
> >
> > [1/3]: checkout: add basic tests for detached-orphan warning
> > [2/3]: checkout: clear commit marks after detached-orphan check
> > [3/3]: checkout: tweak detached-orphan warning format
>
> Looks very sensible, except for the clear_marks(-1) that clears everything
> I have a slight doubt about.
I can change it to UNINTERESTING. As far as I can tell, that is the only
one set by prepare_revision_walk. It just felt like a leaky abstraction
for this code to have to know which flags might get set by the
underlying code. But obviously callers do need to know and care which
flags might be set and when.
-Peff
next prev parent reply other threads:[~2011-03-21 10:35 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-19 21:53 [bug] git checkout lies about number of ahead commits when switching from detached HEAD Piotr Krukowiecki
2011-03-19 22:28 ` Jeff King
2011-03-19 22:47 ` Jeff King
2011-03-20 0:35 ` Junio C Hamano
2011-03-20 9:01 ` Jeff King
2011-03-20 9:04 ` [PATCH 1/3] checkout: add basic tests for detached-orphan warning Jeff King
2011-03-20 9:09 ` [PATCH 2/3] checkout: clear commit marks after detached-orphan check Jeff King
2011-03-20 19:05 ` Junio C Hamano
2012-04-13 22:59 ` [PATCH] i18n: use test_i18ncmp in t2020 (checkout --detach) Jonathan Nieder
2012-04-13 23:25 ` [PATCH/RFC] checkout --detached test: write supporting files before start of tests Jonathan Nieder
2012-04-13 23:33 ` Jeff King
2012-04-13 23:49 ` Jonathan Nieder
2012-04-14 2:26 ` Jeff King
2012-04-13 23:30 ` [PATCH] i18n: use test_i18ncmp in t2020 (checkout --detach) Jeff King
2012-04-13 23:46 ` Jonathan Nieder
2012-04-14 2:24 ` Jeff King
2012-04-14 4:44 ` [PATCH v2 0/3] " Jonathan Nieder
2012-04-14 4:45 ` [PATCH 1/3] test: do not rely on US English tracking-info messages Jonathan Nieder
2012-04-14 4:46 ` [PATCH 2/3] test: use test_i18ncmp for "Patch format detection failed" message Jonathan Nieder
2012-04-14 4:48 ` [PATCH 3/3] test: am of empty patch should not succeed Jonathan Nieder
2012-04-14 8:15 ` [PATCH v2 0/3] Re: i18n: use test_i18ncmp in t2020 (checkout --detach) Jeff King
2012-04-14 5:02 ` [PATCH] " Jonathan Nieder
2012-04-14 8:22 ` Jeff King
2012-04-14 12:47 ` Jonathan Nieder
2011-03-20 9:19 ` [PATCH 3/3] checkout: tweak detached-orphan warning format Jeff King
2011-03-20 19:00 ` [bug] git checkout lies about number of ahead commits when switching from detached HEAD Junio C Hamano
2011-03-21 10:35 ` Jeff King [this message]
2011-03-21 15:17 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110321103549.GA16334@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=piotr.krukowiecki@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).