From: "Jakub Narębski" <jnareb@gmail.com>
To: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Cc: Stefan Beller <sbeller@google.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Marc Strapetz <marc.strapetz@syntevo.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: topological index field for commit objects
Date: Wed, 29 Jun 2016 23:49:35 +0200
Message-ID: <5774426F.3090000@gmail.com>
In-Reply-To: <20160629205647.GA25987@sigill.intra.peff.net>
On 2016-06-29 at 22:56, Jeff King wrote:
> On Wed, Jun 29, 2016 at 01:39:17PM -0700, Junio C Hamano wrote:
>
>>> Would it make sense to refuse creating a commit that has a commit date
>>> prior to its parent's commit date (except when the user gives a
>>> `--dammit-I-know-I-break-a-widely-used-heuristic`)?
>>
>> I think that has also been discussed in the past. I do not think it
>> would help very much in practice, as projects already have up to 10
>> years (and the ones migrated from CVS, even more) worth of commits
>> they cannot rewrite that may record incorrect committer dates.
>
> Yep, it has been discussed and I agree it runs into a lot of corner
> cases.
>
>> If the use of generation number can somehow be limited narrowly, we
>> may be able to incrementally introduce it only for new commits, but
>> I haven't thought things through, so let me do so aloud here ;-)
>
> I think the problem is that you really _do_ want generation numbers for
> old commits. One of the most obvious cases is something like "tag
> --contains HEAD", because it has to examine older tags.
>
> So your history looks something like:
>
> A -- B -- ... Z
>  \             \
>   v1.0          HEAD
>
> Without generation numbers (or some proxy), you have to walk the history
> between B..Z to find the answer. With generation numbers, it is
> immediately obvious.
>
> So this is the ideal case for generation numbers (the worst cases are
> when the things you are looking for are in branchy, close history where
> the generation numbers don't tell you much; but in such cases the
> walking is usually not too bad).
There are other approaches (special indices) that help reachability
queries besides generation numbers.
>
> So I think you really do want to be able to generate and store
> generation numbers after the fact. That has an added bonus that you do
> not have to worry about baking incorrect values into your objects; you
> do the topological walk once, and you _know_ it is correct (at least as
> correct as the parent links, but that is our source of truth).
By the way, what should happen if you add a replacement (in the git-replace
sense) that creates a shortcut, thereby invalidating generation numbers,
at least in the strict sense? I think committer date used as a generation
number would still be good.
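A small sketch of the concern, assuming a replacement can be modelled as swapping one commit's parent list and assuming gen(c) = 1 + max(gen(parents)). The history and all names here are hypothetical.

```python
from typing import Dict, List


def generations(parents: Dict[str, List[str]]) -> Dict[str, int]:
    """Assumed definition: roots get 1, otherwise 1 + max over parents."""
    gen: Dict[str, int] = {}

    def visit(c: str) -> int:
        if c not in gen:
            gen[c] = 1 + max((visit(p) for p in parents[c]), default=0)
        return gen[c]

    for c in parents:
        visit(c)
    return gen


# Linear toy history A <- B <- C <- D <- E.
ORIGINAL = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"]}
# Hypothetical replacement: D now points straight at A (a shortcut).
REPLACED = dict(ORIGINAL, D=["A"])

before = generations(ORIGINAL)
after = generations(REPLACED)

# Commits whose stored number no longer matches the replaced view.
changed = sorted(c for c in ORIGINAL if before[c] != after[c])

# Do the stale numbers still satisfy parent < child in the replaced view?
stale_still_ordered = all(
    before[p] < before[c] for c in REPLACED for p in REPLACED[c]
)
```

In this particular toy case the stored numbers for D and everything after it stop matching the strict definition, yet they still order every parent before its child. Other replacements need not preserve that ordering.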
> I have patches that generate and store the numbers at pack time, similar
> to the way we do the reachability bitmaps. They're not production ready,
> but they could probably be made so without too much effort. You wouldn't
> have ready-made generation numbers for commits since the last full
> repack, but you can compute them incrementally based on what you do have
> at a cost linear to the unpacked commits (this is the same for bitmaps).
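The incremental step described in the quoted text could look roughly like the sketch below: numbers stored at pack time are reused, and only commits created since then are visited. The stored table, the function name, and the toy history are assumptions for illustration, not the actual patches.

```python
from typing import Dict, List, Tuple


def extend_generations(parents: Dict[str, List[str]],
                       stored: Dict[str, int],
                       tip: str) -> Tuple[Dict[str, int], List[str]]:
    """Number every commit reachable from `tip`, reusing `stored` values.

    Returns the combined table and the list of commits that actually had
    to be computed, which is the part that costs time.
    """
    gen = dict(stored)
    newly_computed: List[str] = []
    stack = [tip]
    while stack:
        c = stack[-1]
        if c in gen:
            # Already known (stored at pack time or computed just now).
            stack.pop()
            continue
        missing = [p for p in parents[c] if p not in gen]
        if missing:
            stack.extend(missing)
        else:
            gen[c] = 1 + max((gen[p] for p in parents[c]), default=0)
            newly_computed.append(c)
            stack.pop()
    return gen, newly_computed


# Hypothetical history: A, B, C were packed; D, E and merge M are new.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"],
    "D": ["C"], "E": ["C"], "M": ["D", "E"],
}
STORED = {"A": 1, "B": 2, "C": 3}

gen, new = extend_generations(PARENTS, STORED, "M")
```

The walk stops as soon as it reaches a commit with a stored number, so in this sketch only D, E and M are visited.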
Does Git use EWAH / EWOK bitmaps for reachability analysis, or is their
use still limited to object counting?
--
Jakub Narębski