git.vger.kernel.org archive mirror
From: Chris Mason <mason@suse.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Krzysztof Halasa <khc@pm.waw.pl>, git@vger.kernel.org
Subject: Re: [PATCH] multi item packed files
Date: Thu, 21 Apr 2005 20:16:16 -0400
Message-ID: <200504212016.16729.mason@suse.com>
In-Reply-To: <Pine.LNX.4.58.0504211530370.2344@ppc970.osdl.org>

On Thursday 21 April 2005 18:47, Linus Torvalds wrote:
> On Thu, 21 Apr 2005, Chris Mason wrote:
> > Shrug, we shouldn't need help from the kernel for something like this. 
> > git as a database hits worst case scenarios for almost every FS.

[ ... ]

We somewhat agree on most of this, so I snipped out the parts that aren't
worth nitpicking over.  git is really fast right now, and I'm all for throwing
drive space at things to solve problems.  I just don't think we have to throw
as much space at it as we are.

> The _seek_ issue is real, but git actually has a very nice architecture
> even there: not only does it cache really really well (and you can do a
> simple "ls-tree $(cat .git/HEAD)" and populate the cache from the results),
> but the low level of indirection in a git archive means that it's almost
> totally prefetchable with near-perfect access patterns.

We can sort the files before reading them in, but even if we order things
perfectly, we're spreading the I/O out too much across the drive.  It works
right now because the git archive is relatively dense.  At a few hundred MB,
when we order things properly, the drive head isn't moving that much.
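
A rough sketch of what "sort the files before reading them in" could look
like.  The message does not say which sort key is meant; sorting by inode
number is an assumption here, used as a cheap stand-in for on-disk position,
and the helper names are hypothetical.  Inode order only loosely tracks
physical layout on many filesystems, so this illustrates the idea and is not
a measured optimization.

```python
import os


def paths_in_inode_order(paths):
    """Return the paths sorted by inode number.

    Assumption: inode order is a rough proxy for on-disk order.
    """
    return sorted(paths, key=lambda p: os.stat(p).st_ino)


def read_all(paths):
    """Read every file, visiting them in inode order instead of name order."""
    contents = {}
    for path in paths_in_inode_order(paths):
        with open(path, "rb") as f:
            contents[path] = f.read()
    return contents
```

Even with a perfect ordering, this only removes back-and-forth seeking.  It
does nothing about the total distance the head has to cover once the objects
are spread over a large part of the drive.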

At 3-6 GB this hurts more.  The data gets farther apart as things age, and 
drive performance rots away.  I'll never convince you without numbers, which 
means I'll have to wait for the full load of old history and try it out ;)

-chris

