Re: btrfs and 1 billion small files

linux-btrfs.vger.kernel.org archive mirror

From: Alessio Focardi <alessiof@gmail.com>
To: Hugo Mills <hugo@carfax.org.uk>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: btrfs and 1 billion small files
Date: Mon, 7 May 2012 13:15:26 +0200 (CEST)
Message-ID: <1429905255.3406.1336389326378.JavaMail.root@zimbra.interconnessioni.it>
In-Reply-To: <20120507105552.GC8938@carfax.org.uk>

> This is a lot more compact (as you can have several files' data in a
> single block), but by default will write two copies of each file, even
> on a single disk.

Great, no (or less) space wasted, then! If I understand correctly, I will have a filesystem composed mostly of metadata blocks. Will this cause any problems?

>    So, if you want to use some form of redundancy (e.g. RAID-1), then
> that's great, and you need to do nothing unusual. However, if you want
> to maximise space usage at the expense of robustness in a device
> failure, then you need to ensure that you only keep one copy of your
> data. This will mean that you should format the filesystem with the
> -m single option.

That's a very clever suggestion. I'm preparing a test server right now and will format it with the -m single option. Any other suggestions regarding format options?

pagesize? leafsize?
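
To put the -m single trade-off in numbers, here is a rough back-of-the-envelope sketch in Python. It assumes the file data is stored inline in metadata and therefore written once per metadata copy (two copies by default, one with -m single, as described above). The 512-byte average file size is only a placeholder assumption, and the estimate ignores all per-file item and tree overhead, so real usage would be higher.

```python
def inline_payload_bytes(file_count, avg_file_bytes, metadata_copies):
    """Raw bytes needed for the file contents alone.

    Assumption: every file is stored inline in metadata, so its data is
    written once per metadata copy. Tree and item overhead is ignored.
    """
    return file_count * avg_file_bytes * metadata_copies


FILES = 10**9          # "1 billion small files" from the subject line
AVG_FILE_BYTES = 512   # hypothetical average size, not a measured value

for label, copies in (("two copies (default)", 2), ("-m single", 1)):
    total = inline_payload_bytes(FILES, AVG_FILE_BYTES, copies)
    print(f"{label}: {total / 2**30:.0f} GiB of file data")
```

Under these assumptions, keeping a single copy halves the space taken by the file contents, at the cost of having no second copy to fall back on.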

> > XFS has a minimum block size of 512, but BTRFS is more modern and,
> > given the fact that it is able to handle indexes on its own, it could
> > help us speed up file operations (could it?)
> 
>    Not sure what you mean by "handle indexes on its own". XFS will
> have its own set of indexes and file metadata -- it wouldn't be much
> of a filesystem if it didn't.

Yes, you are perfectly right. I thought that recreating a tree like /d/u/m/m/y/ to store "dummy" would be redundant, since the whole filesystem is already based on trees. I don't need to "ls" directories, because we use PHP to write and read the files. I will have to find a compromise between the number of directory levels and the number of files in each directory.
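
To make that compromise concrete, here is a minimal sketch of one common approach: derive the directory levels from a hash of the file name instead of from its letters, so files spread evenly. It is written in Python for illustration only (our real code is PHP). The function names, the /data root, the use of MD5 and the level and width values are all assumptions of this sketch, not anything btrfs requires.

```python
import hashlib
from pathlib import PurePosixPath


def shard_path(name, levels=2, width=2, root="/data"):
    """Map a file name to root/<h1>/<h2>/.../name.

    Each directory level is `width` hex characters taken from the MD5
    digest of the name. MD5 is used only to spread names evenly, not
    for security. `root` is a hypothetical mount point.
    """
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, name)


def files_per_directory(total_files, levels=2, width=2):
    """Average number of files per leaf directory for a given fan-out."""
    leaf_dirs = (16 ** width) ** levels
    return total_files / leaf_dirs


print(shard_path("dummy"))
for levels in (1, 2, 3):
    avg = files_per_directory(10**9, levels=levels)
    print(f"{levels} level(s): about {avg:,.0f} files per directory")
```

With 256 entries per level, two levels give 65,536 leaf directories (roughly 15,000 files each for one billion files) and three levels give about 16.7 million leaf directories (roughly 60 files each), so the choice is between fewer, larger directories and many more, smaller ones.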

May I ask you about compression? Would you use it in the scenario I described?

Thank you for your help!
