linux-btrfs.vger.kernel.org archive mirror
* Packed small files
From: Hans Stimer @ 2012-01-31 16:21 UTC (permalink / raw)
  To: linux-btrfs

How space efficient are packed small files? For instance, if I have
10,000 16-byte files with 16-byte file names, how much disk space will
they use? Are there any caveats or gotchas around using btrfs to store
millions of small (size < 64 bytes) files?

* Re: Packed small files
From: Hugo Mills @ 2012-01-31 16:46 UTC (permalink / raw)
  To: Hans Stimer; +Cc: linux-btrfs

On Tue, Jan 31, 2012 at 08:21:30AM -0800, Hans Stimer wrote:
> How space efficient are packed small files? For instance, if I have
> 10,000 16-byte files with 16-byte file names, how much disk space will
> they use? Are there any caveats or gotchas around using btrfs to store
> millions of small (size < 64 bytes) files?

   For small files, the file content may be kept inline in the btree
blocks of the FS tree, as part of the file's EXTENT_DATA item. The
data is packed in alongside the
associated keys and inline extent metadata, and the btree
implementation can be somewhat variable in its space usage as well, so it's
hard to put fixed values on these things.

   There's metadata overhead on storing files as well -- each file
will be represented by a number of keys (17 bytes each) and metadata
structures to go with them. In the FS tree:

   DIR_ITEM      13
   DIR_INDEX     13
   INODE_ITEM   162
   INODE_REF     10 + size(name)
   EXTENT_DATA   52
   Keys          85
                ---
                335 + size(name)

   And in the extent tree:

   EXTENT_ITEM        24 + 17 for key
   extent_inline_ref   9
   extent_data_ref    28
                     ---
                      78

   So you're looking at a minimum of 413 bytes of metadata overhead
for an inline file, plus the length of the filename.
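
   As a cross-check of the arithmetic, the breakdown above can be summed
in a few lines of Python. This is only a sketch: the sizes are copied
from the two lists, while the dictionary and function names are made up
for illustration and are not btrfs APIs. It assumes one 17-byte key per
FS-tree item and one for the EXTENT_ITEM, as listed.

```python
# Per-item sizes in bytes, copied from the breakdown above.
# All names here are illustrative; none of this is a btrfs interface.
KEY_SIZE = 17

FS_TREE_ITEMS = {
    "DIR_ITEM": 13,
    "DIR_INDEX": 13,
    "INODE_ITEM": 162,
    "INODE_REF": 10,      # the file name length is added separately
    "EXTENT_DATA": 52,
}

EXTENT_TREE_ITEMS = {
    "EXTENT_ITEM": 24,
    "extent_inline_ref": 9,
    "extent_data_ref": 28,
}


def metadata_overhead(name_len: int) -> int:
    """Estimated metadata bytes for one inline file, using the figures above."""
    fs_tree = (
        sum(FS_TREE_ITEMS.values())
        + KEY_SIZE * len(FS_TREE_ITEMS)   # one key per FS-tree item
        + name_len
    )
    extent_tree = sum(EXTENT_TREE_ITEMS.values()) + KEY_SIZE  # key for EXTENT_ITEM
    return fs_tree + extent_tree


print(metadata_overhead(0))    # fixed part only
print(metadata_overhead(16))   # with a 16-byte file name
```

   The result is a floor rather than a prediction, since it ignores how
full the btree blocks end up being.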

   Also note that the file is stored in the metadata, so by default
it's stored with DUP or RAID-1 replication (even if data is set to be
"single"). This means that you'll actually use up twice this amount of
space on the disks, unless you create the FS with metadata set to
"single".

   I don't know how these figures compare with other filesystems. My
entirely uneducated guess is that they're probably comparable, with
the exception of the DUP effect.
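
   To put rough numbers on the original question, here is a small
Python sketch of the DUP effect. It assumes the 413-byte fixed overhead
quoted above, counts the name and the inline content once each, and
treats DUP and RAID-1 as two copies and "single" as one. The function
and table names are hypothetical, and btree slack is ignored, so read
the output as a lower bound.

```python
# Assumption: 413 bytes of fixed metadata per inline file (figure quoted above).
FIXED_OVERHEAD = 413

# Simplified copy counts per metadata profile (an assumption of this sketch).
COPIES = {"single": 1, "dup": 2, "raid1": 2}


def estimated_bytes(n_files, name_len, data_len, metadata_profile="dup"):
    """Lower-bound estimate of on-disk bytes for n_files inline files."""
    per_file = FIXED_OVERHEAD + name_len + data_len
    return n_files * per_file * COPIES[metadata_profile]


# 10,000 files of 16 bytes each, with 16-byte names.
for profile in ("single", "dup"):
    print(profile, estimated_bytes(10_000, 16, 16, profile))
```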

   Hugo.

* Re: Packed small files
From: Phillip Susi @ 2012-02-10 19:02 UTC (permalink / raw)
  To: Hugo Mills, Hans Stimer, linux-btrfs

On 1/31/2012 11:46 AM, Hugo Mills wrote:
> So you're looking at a minimum of 413 bytes of metadata overhead 
> for an inline file, plus the length of the filename.
> 
> Also note that the file is stored in the metadata, so by default 
> it's stored with DUP or RAID-1 replication (even if data is set to
> be "single"). This means that you'll actually use up twice this
> amount of space on the disks, unless you create the FS with
> metadata set to "single".
> 
> I don't know how these figures compare with other filesystems. My 
> entirely uneducated guess is that they're probably comparable,
> with the exception of the DUP effect.

On ext4 you are looking at 256 bytes for the inode, the name length plus
a few bytes for the directory entry, another few bytes for the hashed
directory entry, and a whole 4k block to hold the data, so ~4300 bytes
(+ name length) of overhead to store a 64-byte file.
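
A rough side-by-side of the two estimates in this thread can be sketched
in Python. The ext4 side uses only the 256-byte inode and the 4k block
named above; the "few bytes" for the directory entries are not specified
in this thread, so they are left as a parameter defaulting to 0. The
btrfs side uses the quoted 413-byte figure, doubled for DUP metadata.
Function names are hypothetical.

```python
EXT4_INODE_SIZE = 256      # inode size given above
EXT4_BLOCK_SIZE = 4096     # one whole block holds the file data
BTRFS_FIXED_OVERHEAD = 413 # figure quoted above


def ext4_estimate(name_len, dirent_extra=0):
    """Bytes used on ext4; dirent_extra stands in for the unspecified 'few bytes'."""
    return EXT4_INODE_SIZE + name_len + dirent_extra + EXT4_BLOCK_SIZE


def btrfs_estimate(name_len, data_len, copies=2):
    """Bytes used on btrfs for an inline file; copies=2 models DUP metadata."""
    return (BTRFS_FIXED_OVERHEAD + name_len + data_len) * copies


# A 64-byte file with a 16-byte name.
print("ext4 :", ext4_estimate(16))
print("btrfs:", btrfs_estimate(16, 64))
```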
