From: Linus Torvalds <torvalds@osdl.org>
To: "C. Scott Ananian" <cscott@cscott.net>
Cc: Petr Baudis <pasky@ucw.cz>, Tom Lord <lord@emf.net>,
gnu-arch-users@gnu.org, gnu-arch-dev@lists.seyza.com,
Git Mailing List <git@vger.kernel.org>,
talli@museatech.net
Subject: chunking (Re: [ANNOUNCEMENT] /Arch/ embraces `git')
Date: Wed, 20 Apr 2005 15:22:12 -0700 (PDT)
Message-ID: <Pine.LNX.4.58.0504201510520.6467@ppc970.osdl.org>
In-Reply-To: <Pine.LNX.4.61.0504201754450.2630@cag.csail.mit.edu>
On Wed, 20 Apr 2005, C. Scott Ananian wrote:
>
> I'm hoping my 'chunking' patches will fix this. This ought to reduce the
> size of the object store by (in effect) doing delta compression; rsync
> will then Do The Right Thing and only transfer the needed deltas.
> Running some benchmarks right now to see how well it lives up to this
> promise...
What are the disk usage results? I'm on ext3, for example, which means that
even small files invariably take up 4.125kB on disk (with the inode).
Even uncompressed, most source files tend to be small. Compressed, I'm
seeing the median blob size being ~1.6kB in my trivial checks. That's
blobs only, btw.
My point being that about 75% of all blobs already take up less than the
minimal amount of space that most filesystems can sanely allocate. And I'm
_not_ going to say "you have to use reiserfs" with git.
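As a rough illustration of the arithmetic above, here is a minimal sketch of how a small file rounds up to a whole block plus its inode. The 4096-byte block and 128-byte inode are assumptions chosen to match the 4.125kB figure, and `on_disk_bytes` is a hypothetical helper, not anything from git. It ignores directory entries and indirect blocks.

```python
import math

BLOCK_SIZE = 4096   # assumed filesystem block size, in bytes
INODE_SIZE = 128    # assumed on-disk inode size, in bytes


def on_disk_bytes(file_size: int) -> int:
    """Approximate bytes a file of `file_size` bytes occupies on disk.

    Data is rounded up to whole blocks, then one inode is added.
    """
    blocks = math.ceil(file_size / BLOCK_SIZE)
    return blocks * BLOCK_SIZE + INODE_SIZE


median_blob = 1638  # roughly 1.6kB, the median compressed blob size quoted above
print(on_disk_bytes(median_blob) / 1024)  # 4.125
```

Under these assumptions any blob from 1 byte up to 4096 bytes costs the same 4.125kB, so shrinking a blob that is already below one block saves nothing on disk.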
So the disk fragmentation really does matter. Making a file smaller than
4kB doesn't help, it hurts: each chunk still costs a full block. Sharing
chunks can offset that, but it might not.
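To make the trade-off concrete, the sketch below compares storing ten versions of a small blob whole against storing them as chunks, one file per chunk. The block and inode sizes, the blob and chunk sizes, and the sharing patterns are all made-up assumptions for illustration. The helpers are hypothetical, and the model ignores any index a chunked store would need to map blobs to chunks.

```python
import math

BLOCK_SIZE = 4096   # assumed filesystem block size, in bytes
INODE_SIZE = 128    # assumed on-disk inode size, in bytes


def on_disk_bytes(file_size):
    """Data rounded up to whole blocks, plus one inode."""
    return math.ceil(file_size / BLOCK_SIZE) * BLOCK_SIZE + INODE_SIZE


def store_bytes(file_sizes):
    """Total on-disk footprint when every entry is stored as its own file."""
    return sum(on_disk_bytes(size) for size in file_sizes)


# Ten versions of a ~1.6kB blob stored whole: ten files.
whole = store_bytes([1638] * 10)

# The same ten versions split into two ~0.8kB chunks each, where one chunk
# is identical in every version and stored once: 1 shared + 10 distinct files.
some_sharing = store_bytes([819] * 11)

# Heavy sharing: the twenty chunks collapse to only 4 distinct files.
heavy_sharing = store_bytes([819] * 4)

print(whole, some_sharing, heavy_sharing)  # 42240 46464 16896
```

In this toy model chunking only wins once the number of distinct chunk files drops below the number of whole blobs, because each file costs a full block regardless of how little it holds.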
Also, while network performance is important, so is the handshaking on
which objects to get. Lots of small objects potentially need lots of
handshaking to figure out _which_ of the objects to get.
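A toy model of that handshaking cost, assuming each object has to be named once by its 20-byte SHA-1 and that names are sent in fixed-size batches, one batch per round trip. The batch size of 64 and the function `negotiation_cost` are hypothetical. This is not rsync's or git's actual protocol, only a sketch of how the cost scales with object count.

```python
import math


def negotiation_cost(num_objects, ids_per_round_trip=64, id_bytes=20):
    """Return (round_trips, id_bytes_sent) for naming every object once.

    ids_per_round_trip is a hypothetical batch size; id_bytes is the
    length of a binary SHA-1 digest.
    """
    round_trips = math.ceil(num_objects / ids_per_round_trip)
    return round_trips, num_objects * id_bytes


print(negotiation_cost(1000))      # (16, 20000)
print(negotiation_cost(1000 * 4))  # (63, 80000)
```

Splitting each of 1000 blobs into four chunks roughly quadruples both the round trips and the bytes spent just agreeing on what to transfer, before any object data moves.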
Linus