From: "Jakub Narębski" <jnareb@gmail.com>
To: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Cc: Stefan Beller <sbeller@google.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Marc Strapetz <marc.strapetz@syntevo.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: topological index field for commit objects
Date: Wed, 29 Jun 2016 23:49:35 +0200
Message-ID: <5774426F.3090000@gmail.com>
In-Reply-To: <20160629205647.GA25987@sigill.intra.peff.net>
On 2016-06-29 at 22:56, Jeff King wrote:
> On Wed, Jun 29, 2016 at 01:39:17PM -0700, Junio C Hamano wrote:
>
>>> Would it make sense to refuse creating commits that have a commit date
>>> prior to its parents commit date (except when the user gives a
>>> `--dammit-I-know-I-break-a-wildy-used-heuristic`)?
>>
>> I think that has also been discussed in the past. I do not think it
>> would help very much in practice, as projects already have up to 10
>> years (and the ones migrated from CVS, even more) worth of commits
>> they cannot rewrite that may record incorrect committer dates.
>
> Yep, it has been discussed and I agree it runs into a lot of corner
> cases.
>
>> If the use of generation number can somehow be limited narrowly, we
>> may be able to incrementally introduce it only for new commits, but
>> I haven't thought things through, so let me do so aloud here ;-)
>
> I think the problem is that you really _do_ want generation numbers for
> old commits. One of the most obvious cases is something like "tag
> --contains HEAD", because it has to examine older tags.
>
> So your history looks something like:
>
>   A -- B -- ... Z
>         \        \
>          v1.0     HEAD
>
> Without generation numbers (or some proxy), you have to walk the history
> between B..Z to find the answer. With generation numbers, it is
> immediately obvious.
>
> So this is the ideal case for generation numbers (the worst cases are
> when the things you are looking for are in branchy, close history where
> the generation numbers don't tell you much; but in such cases the
> walking is usually not too bad).
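
To make the quoted point concrete, here is a minimal sketch of how a
generation number lets a "--contains" style check answer without walking.
It uses a hypothetical in-memory history (a dict mapping each commit to its
parents); the function names are made up for illustration and this is not
Git's actual code.

```python
def generation_numbers(parents):
    """Return {commit: generation}; roots get 1, others 1 + max over parents.

    `parents` is a hypothetical toy history: {commit: [parent, ...]}.
    The walk is iterative so that long linear histories do not hit
    Python's recursion limit.
    """
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            c = stack[-1]
            missing = [p for p in parents[c] if p not in gen]
            if missing:
                # Parents must be numbered before the child.
                stack.extend(missing)
            else:
                stack.pop()
                gen[c] = 1 + max((gen[p] for p in parents[c]), default=0)
    return gen


def contains(parents, gen, tip, target):
    """Is `target` reachable from `tip`? Returns (answer, commits_visited)."""
    if gen[tip] < gen[target]:
        # An ancestor always has a strictly smaller generation number,
        # so `target` cannot be reachable from `tip`: no walk needed.
        return False, 0
    seen, stack, visited = {tip}, [tip], 0
    while stack:
        c = stack.pop()
        visited += 1
        if c == target:
            return True, visited
        for p in parents[c]:
            # Never descend below the generation of the target.
            if p not in seen and gen[p] >= gen[target]:
                seen.add(p)
                stack.append(p)
    return False, visited


# The linear history from the diagram: A -- B -- ... -- Z,
# with v1.0 pointing at B and HEAD at Z.
names = [chr(ord("A") + i) for i in range(26)]
history = {n: ([names[i - 1]] if i else []) for i, n in enumerate(names)}
gen = generation_numbers(history)
print(contains(history, gen, "B", "Z"))  # does v1.0 contain HEAD?
```

In this toy history the check for v1.0 is rejected by the single
comparison gen[B] < gen[Z], which is the "immediately obvious" case.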

There are other approaches (special indices) that help with reachability
queries besides generation numbers.

>
> So I think you really do want to be able to generate and store
> generation numbers after the fact. That has an added bonus that you do
> not have to worry about baking incorrect values into your objects; you
> do the topological walk once, and you _know_ it is correct (at least as
> correct as the parent links, but that is our source of truth).

By the way, what should happen if you add a replacement (in the git-replace
sense) that creates a shortcut and therefore invalidates the stored generation
numbers, at least in the strict sense? Committer date used as a generation
number would still be good, I think?

> I have patches that generate and store the numbers at pack time, similar
> to the way we do the reachability bitmaps. They're not production ready,
> but they could probably be made so without too much effort. You wouldn't
> have ready-made generation numbers for commits since the last full
> repack, but you can compute them incrementally based on what you do have
> at a cost linear to the unpacked commits (this is the same for bitmaps).
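
A rough sketch of that incremental step, as I understand it: given stored
numbers for the packed commits, only the commits without a stored value need
to be walked. The dict-based cache and the names below are assumptions for
illustration only, not the actual patches.

```python
def extend_generations(parents, stored):
    """Fill in generation numbers for commits missing from `stored`.

    `parents` is a hypothetical toy history {commit: [parent, ...]} and
    `stored` a hypothetical cache {commit: generation} written at the last
    full repack. Returns (all_generations, newly_computed_count); the walk
    stops as soon as it reaches a commit with a stored number.
    """
    gen = dict(stored)
    computed = 0
    for start in parents:
        if start in gen:
            continue
        stack = [start]
        while stack:
            c = stack[-1]
            if c in gen:
                stack.pop()
                continue
            missing = [p for p in parents[c] if p not in gen]
            if missing:
                # Only parents without a stored number are visited.
                stack.extend(missing)
            else:
                stack.pop()
                gen[c] = 1 + max((gen[p] for p in parents[c]), default=0)
                computed += 1
    return gen, computed


# Linear history A..Z where A..W were numbered at the last repack
# and X, Y, Z were added since.
names = [chr(ord("A") + i) for i in range(26)]
history = {n: ([names[i - 1]] if i else []) for i, n in enumerate(names)}
stored = {n: i + 1 for i, n in enumerate(names[:23])}
gen, computed = extend_generations(history, stored)
print(gen["Z"], computed)
```

The work done is proportional to the number of commits that lack a stored
value (three here), not to the size of the whole history.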

Does Git use EWAH / EWOK bitmaps for reachability analysis, or is their use
still limited to object counting?

--
Jakub Narębski