From: linux@horizon.com
To: paulus@samba.org, torvalds@osdl.org
Cc: git@vger.kernel.org, jonsmirl@gmail.com, linux@horizon.com
Subject: Re: Change set based shallow clone
Date: 11 Sep 2006 10:26:44 -0400
Message-ID: <20060911142644.32313.qmail@science.horizon.com>
In-Reply-To: <17669.8191.778645.311304@cargo.ozlabs.ibm.com>
> Could we do a cache of the refs that stores the stat information for
> each of the files under .git/refs plus the sha1 that the ref points
> to? In other words this cache would do for the refs what the index
> does for the working directory. Reading all the refs would mean we
> still had to stat each of the files, but that's much quicker than
> reading them in the cold-cache case. In the common case when most of
> the stat information matches, we don't have to read the file because
> we have the sha1 that the file contains right there in the cache.
Well, that could save one of two seeks, but that's not *much* quicker.
(Indeed, a git ref would fit into the 60 bytes of block pointer space
in an ext2/3 inode if regular files were stuffed there as well as symlinks.)
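
For concreteness, the stat-validated cache proposed in the quoted text might look roughly like the sketch below. It is only an illustration: the class name `RefCache`, the choice of stat fields, and the in-memory dictionary are assumptions, not anything git implements in this form.

```python
import os


def stat_key(path):
    """Return the stat fields used to decide whether a cached entry is still valid."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size, st.st_ino)


class RefCache:
    """Hypothetical cache mapping a ref file path to (stat key, sha1)."""

    def __init__(self):
        self.entries = {}
        self.file_reads = 0  # counts how often a ref file actually had to be read

    def read_ref(self, path):
        key = stat_key(path)  # the stat is always needed
        cached = self.entries.get(path)
        if cached is not None and cached[0] == key:
            return cached[1]  # stat matches: no need to open the file
        with open(path) as f:
            sha = f.read().strip()
        self.file_reads += 1
        self.entries[path] = (key, sha)
        return sha
```

Every lookup still pays for the stat; only the read of the file contents is avoided when the stat information matches.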
> Ideally we would have two sha1 values in the cache - the sha1 in the
> file, and if that is the ID of a tag object, we would also put the
> sha1 of the commit that the tag points to in the cache.
Now that's not a bad idea. Hacking it in to Linus's scheme, that's
<foo sha>\t<foo^{} sha>\tfoo
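
A minimal sketch of reading and writing lines in that three-column layout follows. The function names are made up for illustration, and the sketch assumes that a ref which is not a tag simply repeats its own sha in the second column (for a commit, foo^{} is the commit itself).

```python
def format_ref_line(name, sha, peeled_sha):
    """Build one '<sha>\\t<peeled sha>\\t<name>' line (without the newline)."""
    return f"{sha}\t{peeled_sha}\t{name}"


def parse_ref_line(line):
    """Split one line back into (name, sha, peeled_sha).

    The name comes last, so limiting the split to two tabs keeps it intact.
    """
    sha, peeled_sha, name = line.rstrip("\n").split("\t", 2)
    return name, sha, peeled_sha


# Example with placeholder hashes: a tag object and the commit it points to.
tag_sha = "1" * 40
commit_sha = "2" * 40
line = format_ref_line("tags/v1.0", tag_sha, commit_sha)
name, sha, peeled = parse_ref_line(line + "\n")
```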
A couple of thoughts:
1) I bet Hans Reiser is enjoying this; he's been agitating for better
lots-of-small-files support for years.
2) Since I've written about two caches in a few minutes (here
and in git-rev-list), a standardized cache validation hook for
git-fsck-objects and git-prune's use might be useful.
3) If we use Linus's idea of a flat "static refs" file overridden by loose
refs (presumably, refs would be stuffed in if their mod times got old
enough, and on initial import you'd use the timestamp of the commit
they point to), we'll have to do a bit of a dance to move refs to and
from it.
Basically, to move refs into the refs file, it's
- Read all the old refs and loose refs and write the new refs file.
- Rename the new refs file into place.
- For each loose ref moved in, lock it, verify it hasn't changed,
  and delete it.
with some more locking to prevent two people from doing this at once.
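
The three steps above can be sketched as follows. This is a rough illustration under several assumptions: the refs file uses a simplified two-column "<sha>\t<name>" layout, a loose ref is locked by exclusively creating "<ref>.lock" next to it, and the extra locking against two concurrent packers is left out. All function names are hypothetical.

```python
import os


def read_loose_refs(refs_dir):
    """Return {name: sha} for every loose ref file under refs_dir."""
    refs = {}
    for root, _dirs, files in os.walk(refs_dir):
        for filename in files:
            if filename.endswith(".lock"):
                continue  # skip lock files left by concurrent updaters
            path = os.path.join(root, filename)
            name = os.path.relpath(path, refs_dir).replace(os.sep, "/")
            with open(path) as f:
                refs[name] = f.read().strip()
    return refs


def read_refs_file(refs_file):
    """Return {name: sha} from the flat refs file, or {} if it does not exist."""
    refs = {}
    try:
        with open(refs_file) as f:
            for line in f:
                sha, name = line.rstrip("\n").split("\t", 1)
                refs[name] = sha
    except FileNotFoundError:
        pass
    return refs


def pack_refs(refs_dir, refs_file):
    # Step 1: read the old refs and the loose refs; loose refs override.
    merged = read_refs_file(refs_file)
    loose = read_loose_refs(refs_dir)
    merged.update(loose)
    tmp = refs_file + ".new"
    with open(tmp, "w") as f:
        for name in sorted(merged):
            f.write(f"{merged[name]}\t{name}\n")
        f.flush()
        os.fsync(f.fileno())
    # Step 2: rename the new refs file into place.
    os.replace(tmp, refs_file)
    # Step 3: lock each loose ref, verify it is unchanged, and delete it.
    for name, sha in loose.items():
        path = os.path.join(refs_dir, *name.split("/"))
        lock = path + ".lock"
        try:
            fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # someone else holds the lock; leave the ref loose
        try:
            with open(path) as f:
                unchanged = f.read().strip() == sha
            if unchanged:
                os.remove(path)
        except FileNotFoundError:
            pass  # already gone; nothing to delete
        finally:
            os.close(fd)
            os.remove(lock)
```

A loose ref that changed after step 1 is simply left in place, where it keeps overriding the stale entry in the refs file.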
Folks looking up tags will do an FS search, then validate their refs
file cache, then if necessary, suck in the refs file.
Now, exploding a refs file into loose refs is tricky. There's
the possible race condition with a reader:
A: Looks for loose ref "foo", doesn't find it.
B: Writes out loose ref "foo"
B: Deletes now-unpacked refs file
A: Looks for refs file, doesn't find it.
A: Concludes that ref "foo" doesn't exist.
The only solution I can think of is to stat the refs file at the
start of the operation and restart from the beginning if it changes
by the time it actually opens and reads it.
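
A sketch of that stat-and-restart lookup is below. The function names, the two-column "<sha>\t<name>" refs file layout, the stat fields compared, and the retry limit are all assumptions made for illustration.

```python
import os


def stat_or_none(path):
    """Return identifying stat fields for path, or None if it does not exist."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_mtime_ns, st.st_size)


def lookup_ref(name, refs_dir, refs_file, max_retries=10):
    """Return the sha for name, or None if the ref does not exist."""
    for _ in range(max_retries):
        # Stat the refs file before looking at anything else.
        before = stat_or_none(refs_file)
        # Loose refs take precedence over the refs file.
        try:
            with open(os.path.join(refs_dir, *name.split("/"))) as f:
                return f.read().strip()
        except FileNotFoundError:
            pass
        # Fall back to the refs file, noting what was actually opened.
        found = None
        try:
            with open(refs_file) as f:
                st = os.fstat(f.fileno())
                after = (st.st_ino, st.st_mtime_ns, st.st_size)
                for line in f:
                    sha, ref = line.rstrip("\n").split("\t", 1)
                    if ref == name:
                        found = sha
        except FileNotFoundError:
            after = None
        if after == before:
            return found  # refs file did not change under us
        # The refs file changed (or vanished) mid-lookup: start over.
    raise RuntimeError("refs file kept changing during lookup")
```

In the race described above, reader A would see the refs file disappear between its initial stat and its open, and would restart and find the loose ref that B wrote.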