From: Hugo Mills <hugo@carfax.org.uk>
To: Hans Stimer <hans.stimer@gmail.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Packed small files
Date: Tue, 31 Jan 2012 16:46:10 +0000
Message-ID: <20120131164610.GE5531@carfax.org.uk>
In-Reply-To: <CAJJ3ruHxPqL6YRktpK+5V7Hgn2aAe=BCJLyicoLmYkxO4sVh1g@mail.gmail.com>
On Tue, Jan 31, 2012 at 08:21:30AM -0800, Hans Stimer wrote:
> How space efficient are packed small files? For instance, if I have a
> 10,000 16 byte files with 16 byte file names, how much disk space will
> they use? Are there any caveats or gotchas around using btrfs to store
> millions of small (size<64bytes) files?
For small files, the file content may be kept inline in the btree
leaves of the FS tree, inside the file's EXTENT_DATA item. The data is
packed in alongside the associated keys and inline extent metadata,
and the btree implementation can be somewhat variable in its space
usage as well, so it's hard to put fixed values on these things.
There's metadata overhead on storing files as well -- each file
will be represented by a number of keys (17 bytes each) and metadata
structures to go with them. In the FS tree:
    DIR_ITEM       13
    DIR_INDEX      13
    INODE_ITEM    162
    INODE_REF      10 + size(name)
    EXTENT_DATA    52
    Keys           85
                  ---
                  335 + size(name)
And in the extent tree:
    EXTENT_ITEM          24 + 17 for key
    extent_inline_ref     9
    extent_data_ref      28
                        ---
                         78
So you're looking at a minimum of 413 bytes of metadata overhead
for an inline file, plus the length of the filename.
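
As a rough sketch of the arithmetic above: the byte counts below are
simply the figures from the two tables, and the names (KEY_SIZE,
metadata_overhead, and so on) are made up for illustration. It only
re-adds the numbers; it doesn't measure anything on a real filesystem.

```python
# Sizes in bytes, copied from the tables above. All names here are
# illustrative and are not btrfs identifiers (apart from the item names).
KEY_SIZE = 17

FS_TREE_ITEMS = {
    "DIR_ITEM": 13,
    "DIR_INDEX": 13,
    "INODE_ITEM": 162,
    "INODE_REF": 10,     # plus the length of the file name
    "EXTENT_DATA": 52,
}

EXTENT_TREE_ITEMS = {
    "EXTENT_ITEM": 24,
    "extent_inline_ref": 9,
    "extent_data_ref": 28,
}


def metadata_overhead(name_len):
    """Estimated metadata bytes for one inline file, per the tables above."""
    # One key per FS tree item, plus the name stored with INODE_REF.
    fs_tree = sum(FS_TREE_ITEMS.values()) + KEY_SIZE * len(FS_TREE_ITEMS) + name_len
    # Only the EXTENT_ITEM carries a key in the extent tree estimate.
    extent_tree = sum(EXTENT_TREE_ITEMS.values()) + KEY_SIZE
    return fs_tree + extent_tree


if __name__ == "__main__":
    print(metadata_overhead(0))    # the fixed part of the estimate
    print(metadata_overhead(16))   # with a 16-byte file name
```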
Also note that the file is stored in the metadata, so by default
it's stored with DUP or RAID-1 replication (even if data is set to be
"single"). This means that you'll actually use up twice this amount of
space on the disks, unless you create the FS with metadata set to
"single".
I don't know how these figures compare with other filesystems. My
entirely uneducated guess is that they're probably comparable, with
the exception of the DUP effect.
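
To put a number on the DUP effect for the case asked about (10,000
files, 16 bytes of data each, 16-byte names), here is a minimal sketch.
It assumes the 413-byte figure estimated above, adds the name and the
inline data on top, and multiplies by the number of metadata copies.
The function and parameter names are hypothetical, and the result
ignores btree node headers and any slack space, so treat it as a lower
bound rather than a measurement.

```python
# Fixed per-file metadata estimate from the message above, in bytes.
METADATA_OVERHEAD = 413


def estimated_bytes(n_files, name_len, data_len, metadata_copies=2):
    """Rough lower bound on disk usage for n_files inline files.

    metadata_copies=2 models DUP or RAID-1 metadata (the default
    described above); metadata_copies=1 models metadata set to "single".
    """
    per_file = METADATA_OVERHEAD + name_len + data_len
    return n_files * per_file * metadata_copies


if __name__ == "__main__":
    print(estimated_bytes(10_000, 16, 16))                     # DUP / RAID-1
    print(estimated_bytes(10_000, 16, 16, metadata_copies=1))  # single
```

On these assumptions that comes to a little under 9 MB with duplicated
metadata, or about half that with metadata set to "single".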
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- I know of three kinds: hot, ---
cool, and what-time-does-the-tune-start?