git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chris Mason <mason@suse.com>
To: Krzysztof Halasa <khc@pm.waw.pl>
Cc: Linus Torvalds <torvalds@osdl.org>, git@vger.kernel.org
Subject: Re: [PATCH] multi item packed files
Date: Thu, 21 Apr 2005 16:22:46 -0400	[thread overview]
Message-ID: <200504211622.48065.mason@suse.com> (raw)
In-Reply-To: <m3u0m0q69a.fsf@defiant.localdomain>

On Thursday 21 April 2005 15:28, Krzysztof Halasa wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
> > Wrong. You most definitely _can_ lose: you end up having to optimize for
> > one particular filesystem blocking size, and you'll lose on any other
> > filesystem. And you'll lose on the special filesystem of "network
> > traffic", which is byte-granular.
>
> If someone needs better on-disk ratio, (s)he can go with 1 KB filesystem
> or something like that, without all the added complexity of packing.
>
> If we want to optimize that further, I would try doing it at the
> underlying filesystem level. For example, loop-mounted one.

Shrug, we shouldn't need help from the kernel for something like this.  git as 
a database hits worst case scenarios for almost every FS.

We've got:

1) subdirectories with lots of files
2) wasted space for tiny files
3) files that are likely to be accessed together spread across the whole disk

One compromise for SCM use would be one packed file per commit, with an index 
that lets us quickly figure out which commit has a particular version of a 
given file.  My hack gets something close to that (broken into 32k chunks for 
no good reason), and the index to find a given file is just the git directory 
tree.

But my code does hide the fact that we're packing things from most of the git 
interfaces.  So I can almost keep a straight face while claiming to be true 
to the original git design...almost.  The whole setup is far from perfect, 
but it is one option for addressing points 2 & 3 above.

-chris

  parent reply	other threads:[~2005-04-21 20:18 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-04-21 15:13 [PATCH] multi item packed files Chris Mason
2005-04-21 15:41 ` Linus Torvalds
2005-04-21 16:23   ` Chris Mason
2005-04-21 19:28   ` Krzysztof Halasa
2005-04-21 20:07     ` Linus Torvalds
2005-04-22  9:40       ` Krzysztof Halasa
2005-04-22 18:12         ` Martin Uecker
2005-04-21 20:22     ` Chris Mason [this message]
2005-04-21 22:47       ` Linus Torvalds
2005-04-22  0:16         ` Chris Mason
2005-04-22 16:22           ` Linus Torvalds
2005-04-22 18:58             ` Chris Mason
2005-04-22 19:43               ` Linus Torvalds
2005-04-22 20:32                 ` Chris Mason
2005-04-22 23:55                   ` Chris Mason
2005-04-25 22:20                     ` Chris Mason
2005-04-22  9:48       ` Krzysztof Halasa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200504211622.48065.mason@suse.com \
    --to=mason@suse.com \
    --cc=git@vger.kernel.org \
    --cc=khc@pm.waw.pl \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).