From: Jeff King <peff@peff.net>
To: Duy Nguyen <pclouds@gmail.com>
Cc: git@vger.kernel.org, "Shawn O. Pearce" <spearce@spearce.org>
Subject: Re: [PATCH 4/6] introduce a commit metapack
Date: Mon, 18 Mar 2013 08:20:11 -0400
Message-ID: <20130318122011.GE14789@sigill.intra.peff.net>
In-Reply-To: <CACsJy8CPXFhUYz2f1wuxJvqwknJr5VFNFrs3b_4pS14cxf=3Wg@mail.gmail.com>

On Sun, Mar 17, 2013 at 08:21:13PM +0700, Nguyen Thai Ngoc Duy wrote:
> On Thu, Jan 31, 2013 at 6:06 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> > On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote:
> >> Perhaps we could store abbrev sha-1 instead of full sha-1. Nice
> >> space/time trade-off.
> >
> > Following the on-disk format experiment yesterday, I changed the
> > format to:
> >
> > - a list of _short_ SHA-1s of cached commits
> > ..
> >
> > The length of SHA-1 is chosen to be able to unambiguously identify any
> > cached commits. Full SHA-1 check is done after to catch false
> > positives. For linux-2.6, SHA-1 length is 6 bytes, git and many
> > moderate-sized projects are 4 bytes.
>
> And if we are going to create index v3, the same trick could be used
> for the sha-1 table in the index. We use the short sha-1 table for
> binary search and put the rest of sha-1 in a following table (just
> like file offset table). The advantage is a denser search space, about
> 1/4-1/3 the size of full sha-1 table.

You can make it even smaller at some (potential) run-time cost.

Keep in mind that you are just repeating information that is already in
the full sha1 list in the index. So you could store a fixed-size offset
into that list (e.g., 32-bit), and then instead of comparing the stored
sha1s during a binary search, you would dereference the offset to the
real sha1 and compare against that.
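
To make the indirection concrete, here is a rough, untested-in-git
sketch of such a lookup. The struct, its field names, and the function
name are all made up for illustration and are not part of the patch
series. It assumes the full sha1 list is a sorted array of 20-byte
entries, and that the offset table is ordered by the sha1s it points to.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHA1_RAWSZ 20

/*
 * Hypothetical layout, for illustration only:
 *   full_sha1 - the sorted table of 20-byte sha1s (as in a pack index)
 *   offsets   - one 32-bit position into full_sha1 per cached commit,
 *               in ascending sha1 order
 */
struct offset_table {
	const unsigned char *full_sha1;
	const uint32_t *offsets;
	size_t nr;
};

/*
 * Binary search over the offsets; each probe dereferences the offset
 * and compares against the real sha1. Returns the position within
 * "offsets", or -1 if the sha1 is not in the cache.
 */
static long offset_table_lookup(const struct offset_table *t,
				const unsigned char *sha1)
{
	size_t lo = 0, hi = t->nr;

	while (lo < hi) {
		size_t mi = lo + (hi - lo) / 2;
		/* this dereference is the extra (likely cold) memory access */
		const unsigned char *cand =
			t->full_sha1 + (size_t)t->offsets[mi] * SHA1_RAWSZ;
		int cmp = memcmp(sha1, cand, SHA1_RAWSZ);

		if (!cmp)
			return (long)mi;
		if (cmp < 0)
			hi = mi;
		else
			lo = mi + 1;
	}
	return -1;
}
```

Each entry costs 4 bytes regardless of how long a prefix would be
needed to disambiguate, and no separate full-sha1 check is needed
afterwards, since every comparison is already against the full sha1.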

The run-time cost is not any worse in a big-O sense, but your cache
locality is much worse (you hit a second random page for each sha1
comparison), which might be noticeable. You'd have to benchmark to see
how big the impact is.

-Peff