From: Hubert Kario <hka@qbs.com.pl>
To: Alessio Focardi <alessiof@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: btrfs and 1 billion small files
Date: Mon, 07 May 2012 11:58:47 +0200
Message-ID: <7802030.zmTeQQDGHD@bursa01>
In-Reply-To: <711331964.2091.1336382892940.JavaMail.root@zimbra.interconnessioni.it>
On Monday 07 of May 2012 11:28:13 Alessio Focardi wrote:
> Hi,
>
> I need some help in designing a storage structure for 1 billion of small
> files (<512 Bytes), and I was wondering how btrfs will fit in this
> scenario. Keep in mind that I never worked with btrfs - I just read some
> documentation and browsed this mailing list - so forgive me if my questions
> are silly! :X
>
>
> On with the main questions, then:
>
> - What's the advice to maximize disk capacity using such small files, even
> sacrificing some speed?
>
> - Would you store all the files "flat", or would you build a hierarchical
> tree of directories to speed up file lookups? (basically duplicating the
> filesystem Btree indexes)
>
>
> I tried to answer those questions, and here is what I found:
>
> it seems that the smallest block size is 4K. So, in this scenario, if every
> file uses a full block I will end up with lots of space wasted. Wouldn't
> change much if block was 2K, anyhow.
>
> I tough about compression, but is not clear to me the compression is handled
> at the file level or at the block level.
>
> Also I read that there is a mode that uses blocks for shared storage of
> metadata and data, designed for small filesystems. Haven't found any other
> info about it.
>
>
> Still is not yet clear to me if btrfs can fit my situation, would you
> recommend it over XFS?
>
> XFS has a minimum block size of 512, but BTRFS is more modern and, given the
> fact that is able to handle indexes on his own, it could help us speed up
> file operations (could it?)
>
> Thank you for any advice!
>
btrfs will inline such small files in metadata blocks.
I'm not sure about the limits on directory size, but I'd guess that going
over a few tens of thousands of files in a single flat directory will carry
speed penalties.
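
To make the flat-versus-hierarchy trade-off concrete, here is a minimal
sketch of one common way to spread files over a directory tree: derive the
subdirectories from a hash of the file name. Everything in it is an
assumption for illustration (the function names, the use of SHA-1, and the
two-level, two-hex-character layout); none of it is something btrfs requires
or provides.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(name: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Return a relative path such as 'ab/cd/<name>' for a file name.

    The directory components are consecutive slices of the hex SHA-1 digest
    of the name, so the same name always maps to the same directory.
    (Hypothetical helper; layout parameters are illustrative.)
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, name)


def leaf_directories(levels: int = 2, width: int = 2) -> int:
    """Number of leaf directories the layout can produce."""
    return (16 ** width) ** levels


def average_files_per_leaf(total_files: int, levels: int = 2, width: int = 2) -> float:
    """Average leaf directory size, assuming the hash spreads names evenly."""
    return total_files / leaf_directories(levels, width)


if __name__ == "__main__":
    print(sharded_path("example-object-0001"))
    print(leaf_directories())
    print(average_files_per_leaf(10**9))
```

With two levels of two hex characters there are 65,536 leaf directories, so
1 billion files average out to roughly 15,000 per directory. Whether that is
small enough is exactly the part I'm guessing at above; adding a third level
or widening the slices shrinks the directories further at the cost of more
directory inodes.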
Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl