From: Johan Herland <johan@herland.net>
To: Theodore Tso <tytso@mit.edu>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
Will Palmer <wmpalmer@gmail.com>,
Avery Pennarun <apenwarr@gmail.com>
Subject: Re: Why is "git tag --contains" so slow?
Date: Thu, 8 Jul 2010 16:35:25 +0200
Message-ID: <201007081635.25381.johan@herland.net>
In-Reply-To: <11D5771D-EB47-42E9-BCC3-69C8FE1999EC@MIT.EDU>
On Thursday 08 July 2010, Theodore Tso wrote:
> On Jul 7, 2010, at 1:45 PM, Jeff King wrote:
> > And of course it's just complex, and I tend to shy away from
> > complexity when I can. The question to me comes back to (1)
> > above. Is massive clock skew a breakage that should produce a few
> > incorrect results, or is it something we should always handle?
>
> Going back to the question that kicked off this thread, I wonder if
> there is some way that caching could be used to speed up all the
> cases, or at least the edge cases, without imposing as much latency as
> tracking the max skew? i.e., something like gitk's gitk.cache
> file. For bonus points, it could be a cache file that is used by
> both gitk and git tag --contains, git branch --contains, and git
> name-rev.
>
> Does that sound like a reasonable idea?
Here's a quick-and-dirty POC which builds a mapping from commits to
their children and stores it using git notes [1], and then uses that to
implement 'git tag --contains <commit>' by traversing _forwards_ from
<commit> and printing all tags we encounter along the way [2].
[1]: The attached "build_childnotes.py" script builds this mapping.
     Invoke as follows:
       git log --all --format="%H,%P" |
         ./build_childnotes.py |
         git fast-import
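A minimal sketch of the mapping step, assuming input lines in the
"%H,%P" format shown above. This is not the attached script: it only
builds the parent-to-children table in memory and does not emit the
fast-import stream that stores the result as notes. The function name
build_child_map is made up for illustration.

```python
from collections import defaultdict


def build_child_map(lines):
    """Map each parent commit to the list of its children.

    Each input line is assumed to look like
    "<commit>,<parent1> <parent2> ...", i.e. one line of
    git log --all --format="%H,%P" output.
    """
    children = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Everything before the first comma is the commit itself;
        # the rest is a space-separated list of parents (possibly empty).
        commit, _, parents = line.partition(",")
        for parent in parents.split():
            children[parent].append(commit)
    return dict(children)


# Tiny hand-written history: a <- b <- c, plus a <- a2 <- c (c is a merge).
sample = ["c,b a2", "b,a", "a2,a", "a,"]
child_map = build_child_map(sample)
```

Root commits have no parents, so they add no entries of their own;
they only show up as keys once one of their children is seen.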
[2]: The attached "git_tag_contains.py" script traverses the notes,
     printing out tags along the way. Invoke it as follows:
       git_tag_contains.py <commit>
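The forward traversal can be sketched roughly as below. This is an
illustration under simplifying assumptions, not the attached script:
the commit-to-children mapping and the commit-to-tags mapping are
taken to be plain in-memory dicts rather than read from notes and
refs, and the names tags_containing, children and tags_by_commit are
hypothetical.

```python
from collections import deque


def tags_containing(start, children, tags_by_commit):
    """Return the sorted tag names reachable by walking forwards
    (from parent to child) starting at the commit `start`."""
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        commit = queue.popleft()
        # A tag pointing at this commit contains `start`.
        found.update(tags_by_commit.get(commit, ()))
        for child in children.get(commit, ()):
            if child not in seen:  # merges make commits reachable twice
                seen.add(child)
                queue.append(child)
    return sorted(found)


# Hand-written example history: a <- b <- c, and a <- x.
children = {"a": ["b", "x"], "b": ["c"]}
tags_by_commit = {"b": ["v1.0"], "c": ["v1.1"], "z": ["v0.9"]}
result = tags_containing("a", children, tags_by_commit)
```

The walk only visits descendants of <commit>, so its cost depends on
how much history lies after the commit rather than before it.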
The second script is way too slow, and really needs to use "git
cat-file --batch" to not fork a process for every commit in history...
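A rough sketch of what using "git cat-file --batch" could look like:
one long-running process is fed object names on stdin and answers
with a "<sha> <type> <size>" header followed by the contents, or
"<name> missing" if the object does not exist. The helper names
parse_batch_header and BatchReader are hypothetical, and other
replies (such as an ambiguous name) are not handled here.

```python
import subprocess


def parse_batch_header(header):
    """Parse one header line printed by 'git cat-file --batch'.

    Returns (sha, type, size), or None if git reported the
    requested object as missing.
    """
    fields = header.decode().split()
    if fields and fields[-1] == "missing":
        return None
    sha, objtype, size = fields
    return sha, objtype, int(size)


class BatchReader:
    """Keep a single 'git cat-file --batch' process around and
    reuse it for every lookup instead of forking per object."""

    def __init__(self, repo="."):
        self.proc = subprocess.Popen(
            ["git", "cat-file", "--batch"], cwd=repo,
            stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def read(self, name):
        """Return the raw contents of object `name`, or None."""
        self.proc.stdin.write(name.encode() + b"\n")
        self.proc.stdin.flush()
        parsed = parse_batch_header(self.proc.stdout.readline())
        if parsed is None:
            return None
        _sha, _objtype, size = parsed
        data = self.proc.stdout.read(size)
        self.proc.stdout.read(1)  # newline that follows the contents
        return data

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()
```

A caller would create one BatchReader, call read() once per object it
needs, and close() it at the end.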
...Johan
--
Johan Herland, <johan@herland.net>
www.herland.net
[-- Attachment #2: build_childnotes.py --]
[-- Type: application/x-python, Size: 551 bytes --]
[-- Attachment #3: git_tag_contains.py --]
[-- Type: application/x-python, Size: 836 bytes --]