From: Chris Mason <mason@suse.com>
To: Krzysztof Halasa <khc@pm.waw.pl>
Cc: Linus Torvalds <torvalds@osdl.org>, git@vger.kernel.org
Subject: Re: [PATCH] multi item packed files
Date: Thu, 21 Apr 2005 16:22:46 -0400
Message-ID: <200504211622.48065.mason@suse.com>
In-Reply-To: <m3u0m0q69a.fsf@defiant.localdomain>

On Thursday 21 April 2005 15:28, Krzysztof Halasa wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
> > Wrong. You most definitely _can_ lose: you end up having to optimize for
> > one particular filesystem blocking size, and you'll lose on any other
> > filesystem. And you'll lose on the special filesystem of "network
> > traffic", which is byte-granular.
>
> If someone needs better on-disk ratio, (s)he can go with 1 KB filesystem
> or something like that, without all the added complexity of packing.
>
> If we want to optimize that further, I would try doing it at the
> underlying filesystem level. For example, loop-mounted one.

Shrug, we shouldn't need help from the kernel for something like this. git as
a database hits worst case scenarios for almost every FS.

We've got:
1) subdirectories with lots of files
2) wasted space for tiny files
3) files that are likely to be accessed together spread across the whole disk
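
To put rough numbers on point 2, here is a minimal sketch of the slack left
over when every object lives in its own file. It assumes each file occupies a
whole number of blocks (no tail packing, inode overhead ignored), and the
object sizes are made up purely for illustration; `slack_bytes` is a
hypothetical helper, not part of git.

```python
def slack_bytes(file_sizes, block_size):
    """Bytes allocated but unused when each file occupies whole blocks.

    Assumption: no tail packing, and per-file metadata is ignored.
    """
    total = 0
    for size in file_sizes:
        blocks = -(-size // block_size)  # ceiling division
        total += blocks * block_size - size
    return total


# Hypothetical object sizes in bytes, chosen only for illustration.
sizes = [200, 700, 1500, 5000]

slack_4k = slack_bytes(sizes, 4096)              # one file per object, 4 KB blocks
slack_1k = slack_bytes(sizes, 1024)              # one file per object, 1 KB blocks
slack_packed = slack_bytes([sum(sizes)], 4096)   # all objects in one 4 KB-block file

print(slack_4k, slack_1k, slack_packed)
```

Under these assumptions a smaller block size cuts the waste considerably, and
so does packing; neither says anything about points 1 and 3.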

One compromise for SCM use would be one packed file per commit, with an index
that lets us quickly figure out which commit has a particular version of a
given file. My hack gets something close to that (broken into 32k chunks for
no good reason), and the index to find a given file is just the git directory
tree.
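
The general idea of a packed file plus an index can be sketched as below. This
is only an illustration, not the format used by the patch: the in-memory dict
stands in for whatever index is used (in the hack described above, the git
directory tree plays that role), and `pack_objects` and `read_object` are
hypothetical names.

```python
import zlib


def pack_objects(objects):
    """Concatenate zlib-compressed objects into one buffer.

    Returns (pack_bytes, index), where index maps each object name to the
    (offset, length) of its compressed data inside the pack.
    """
    pack = bytearray()
    index = {}
    for name, data in objects.items():
        blob = zlib.compress(data)
        index[name] = (len(pack), len(blob))
        pack += blob
    return bytes(pack), index


def read_object(pack, index, name):
    """Look up one object in the index and decompress only its slice."""
    offset, length = index[name]
    return zlib.decompress(pack[offset:offset + length])


# Hypothetical objects belonging to a single commit.
objects = {
    "blob-a": b"int main(void) { return 0; }\n",
    "blob-b": b"all: main\n",
    "tree-1": b"blob-a main.c\nblob-b Makefile\n",
}

pack, index = pack_objects(objects)
restored = read_object(pack, index, "blob-b")
```

Objects written together end up adjacent in one file, which is what helps with
points 2 and 3; the cost is the extra lookup step through the index.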

But my code does hide the fact that we're packing things from most of the git
interfaces. So I can almost keep a straight face while claiming to be true
to the original git design...almost. The whole setup is far from perfect,
but it is one option for addressing points 2 & 3 above.

-chris