From: Junio C Hamano <gitster@pobox.com>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Jeff King <peff@peff.net>,
git@vger.kernel.org
Subject: Re: RFC: Flat directory for notes, or fan-out? Both!
Date: Tue, 10 Feb 2009 10:35:39 -0800
Message-ID: <7vocxam96s.fsf@gitster.siamese.dyndns.org>
In-Reply-To: <20090210165610.GP30949@spearce.org> (Shawn O. Pearce's message of "Tue, 10 Feb 2009 08:56:10 -0800")
"Shawn O. Pearce" <spearce@spearce.org> writes:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>> On Tue, 10 Feb 2009, Junio C Hamano wrote:
>> >
>> > I could do a revert on 'master' if it is really needed, but I found that
>> > the above reasoning is a bit troublesome. The thing is, if a tree to hold
>> > the notes would be huge to be unmanageable, then it would still be huge to
>> > be unmanageable if you split it into 256 pieces.
>>
>> The thing is, a tree object of 17 megabytes is unmanageably large if you
>> have to read it whenever you access even a single node. Having 256 trees
>> instead, each of which is about 68 kilobytes, is much nicer.
>
> See my other email on this thread; we'd probably need to unpack
> all 256 subtrees *anyway* due to the distribution of SHA-1 names
> for commits.

I wonder if we can solve this by introducing a local cache that is a flat
file that looks like:

    magic number for /usr/bin/file
    tree object SHA-1 the file caches
    Number of entries in this file
    256 fan-out offsets into this file
    N entries of <SHA-1, SHA-1>, sorted
    Checksum of the file itself

and use it when available (otherwise optionally create it upon the first
lookup). The file can be used by mmap()ing it and then doing a
Newton-Raphson or binary search, similar to the way patch-ids.c does.
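
To make the lookup concrete, here is a minimal sketch of searching the
sorted entries through the 256-slot fan-out table. It is only an
illustration under assumptions not stated above: the fan-out slots are
treated as cumulative entry counts (the way pack index files do it)
rather than byte offsets, keys are raw 20-byte SHA-1s, and the entries
live in an in-memory array standing in for the mmap()ed file. The names
cache_entry, build_fanout and cache_find are hypothetical, and it uses a
plain binary search rather than a Newton-Raphson style one.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define KEY_LEN 20	/* raw SHA-1 length */

/* One <key, value> pair as it might sit in the cache file (assumed layout). */
struct cache_entry {
	unsigned char key[KEY_LEN];
	unsigned char val[KEY_LEN];
};

/*
 * Assumed fan-out semantics: fanout[b] is the number of entries whose
 * first key byte is <= b, so bucket b spans [fanout[b - 1], fanout[b]).
 * The entries must already be sorted by key, without duplicates.
 */
static void build_fanout(const struct cache_entry *ents, uint32_t n,
			 uint32_t fanout[256])
{
	uint32_t i = 0;
	int b;

	for (b = 0; b < 256; b++) {
		while (i < n && ents[i].key[0] <= b) {
			/* each entry must sort strictly after its predecessor */
			assert(i == 0 ||
			       memcmp(ents[i - 1].key, ents[i].key, KEY_LEN) < 0);
			i++;
		}
		fanout[b] = i;
	}
}

/* Binary search inside the bucket selected by the first key byte. */
static const struct cache_entry *cache_find(const struct cache_entry *ents,
					    const uint32_t fanout[256],
					    const unsigned char *key)
{
	uint32_t lo = key[0] ? fanout[key[0] - 1] : 0;
	uint32_t hi = fanout[key[0]];

	while (lo < hi) {
		uint32_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(key, ents[mid].key, KEY_LEN);

		if (!cmp)
			return &ents[mid];
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return NULL;	/* not in the cache */
}
```

With the fan-out narrowing the range to a single bucket first, only the
pages of the file that hold that bucket need to be touched, no matter how
many entries the whole file has.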

The top-level API for such a hash-map would perhaps look like:

    /*
     * take the object name of a tree object that is a hash map,
     * return an opaque struct.
     */
    struct hashmap *hashmap_open(const unsigned char *);

    /*
     * find the value given the key and return 0, or return negative
     * if not found.
     */
    int hashmap_lookup(struct hashmap *map, const unsigned char *key,
                       unsigned char *val);

    /* discard the thing */
    void hashmap_close(struct hashmap *map);
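
A caller of the three functions above might look roughly like the
following. This is a toy sketch, not a proposed implementation: the
struct hashmap body, the toy_pairs table and the note_for_commit()
helper are all hypothetical, and the map is backed by a hard-coded table
with a linear scan instead of a tree object or cache file. It is only
meant to show the open / lookup / close calling convention and the
0-or-negative return value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define KEY_LEN 20	/* raw SHA-1 length */

struct toy_pair {
	unsigned char key[KEY_LEN], val[KEY_LEN];
};

/* Hypothetical stand-in data; a real map would come from the tree or cache. */
static const struct toy_pair toy_pairs[] = {
	{ { 0x11 }, { 0xa1 } },
	{ { 0x22 }, { 0xb2 } },
};

struct hashmap {
	unsigned char tree[KEY_LEN];	/* tree object the map was opened for */
	const struct toy_pair *pairs;
	size_t nr;
};

struct hashmap *hashmap_open(const unsigned char *tree_sha1)
{
	struct hashmap *map = malloc(sizeof(*map));

	if (!map)
		return NULL;
	memcpy(map->tree, tree_sha1, KEY_LEN);
	map->pairs = toy_pairs;
	map->nr = sizeof(toy_pairs) / sizeof(toy_pairs[0]);
	return map;
}

int hashmap_lookup(struct hashmap *map, const unsigned char *key,
		   unsigned char *val)
{
	size_t i;

	assert(map && key && val);
	for (i = 0; i < map->nr; i++) {
		if (!memcmp(map->pairs[i].key, key, KEY_LEN)) {
			memcpy(val, map->pairs[i].val, KEY_LEN);
			return 0;
		}
	}
	return -1;	/* not found */
}

void hashmap_close(struct hashmap *map)
{
	free(map);
}

/* Hypothetical caller: fetch the note object name for one commit. */
int note_for_commit(const unsigned char *notes_tree,
		    const unsigned char *commit, unsigned char *note)
{
	struct hashmap *map = hashmap_open(notes_tree);
	int ret;

	if (!map)
		return -1;
	ret = hashmap_lookup(map, commit, note);
	hashmap_close(map);
	return ret;
}
```

A long-running caller such as "git log" would presumably open the map
once and keep it around for all lookups, rather than reopening it per
commit as this helper does.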

We should be able to use these in "git log" and friends where Dscho added
the hook in his git-notes topic.

I am hoping that I could eventually rewrite rerere to use something like
this, so that the rerere database can be shared across repositories, just
like the way notes can be shared.