From: Theodore Tso <tytso@mit.edu>
To: Linus Torvalds <torvalds@osdl.org>
Cc: git@vger.kernel.org
Subject: Re: Unresolved issues #2 (shallow clone again)
Date: Sun, 7 May 2006 21:26:32 -0400
Message-ID: <20060508012632.GD17138@thunk.org>
In-Reply-To: <Pine.LNX.4.64.0605071744210.3718@g5.osdl.org>
On Sun, May 07, 2006 at 05:50:42PM -0700, Linus Torvalds wrote:
>
>
> On Sun, 7 May 2006, Theodore Tso wrote:
> >
> > If there are 233338 objects, then the average wasted space due to
> > internal fragmentation is 233338 * 2k, or 466676 kilobytes, or only
> > 36% of the wasted space.
>
> That's not necessarily true.
>
> That assumes a randomly distributed filesize. File sizes are _not_ random,
> and in particular if you have the distribution leaning towards <2kB being
> common, you can actually get >50% fragmentation.
>
> Btw, I hit this when some people argued that the page size should be made
> 64kB. The above (incorrect) logic implies that you waste 32kB on average
> per file. That's not true, if a large fraction of your files are small, in
> which case you may actually be wasting closer to 60kB on average from
> using a big page-size, because about half of the kernel files are actually
> smaller than 4kB (or something. I forget the exact statistics, I did them
> with a script at some point).
>
> Anyway, with inode overhead and a lot of objects being just a couple of
> hundred bytes, I think I estimated at some point that you actually lost
> closer to 3kB per object.
I just ran the numbers on the file sizes of a kernel tree I had handy,
which happened to be 2.6.16.11. Excluding object files, git files,
etc., the average loss was 2351 bytes, not that far from the
expected average of 2048 bytes. Granted, the 2.6 git repository may
hold many different versions of small objects, which would skew the
distribution of git object sizes, but I'm not familiar enough with the
git porcelain to make it disgorge the object sizes in the repository
and do the math.
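
For anyone who wants to repeat this kind of measurement, here is a
minimal sketch in Python. It is not the script used for the numbers
above. It assumes a 4096-byte filesystem block, counts only the unused
tail of each file's last block (no inode overhead), and the function
names and the skip lists are made up for illustration.

```python
import os


def slack(size, block_size=4096):
    """Bytes left unused in the last block of a file of `size` bytes."""
    # -size % block_size is 0 for empty files and exact multiples,
    # otherwise the distance up to the next block boundary.
    return -size % block_size


def average_slack(sizes, block_size=4096):
    """Mean per-file internal fragmentation over an iterable of sizes."""
    sizes = list(sizes)
    if not sizes:
        return 0.0
    return sum(slack(s, block_size) for s in sizes) / len(sizes)


def tree_file_sizes(root, skip_dirs=(".git",), skip_suffixes=(".o",)):
    """Yield sizes of regular files under `root`.

    The skip lists are an assumption standing in for "no object files,
    git files, etc."; adjust them to taste.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            if name.endswith(skip_suffixes):
                continue
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield os.path.getsize(path)
```

Something like average_slack(tree_file_sizes("/path/to/tree")) gives the
per-file average for a checked-out tree. The same function also shows
the effect of a skewed distribution: a set of files that are all a few
hundred bytes long averages well above block_size / 2, and with
block_size=65536 a 4000-byte file alone wastes 61536 bytes.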
- Ted