From: Chris Mason <chris.mason@oracle.com>
To: ashford@whisperpc.com
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [Discussion] Extent Block Group allocations
Date: Tue, 20 Jan 2009 18:07:21 -0500
Message-ID: <1232492841.16352.2.camel@think.oraclecorp.com>
In-Reply-To: <52232.75.80.183.92.1232484873.squirrel@www.whisperpc.com>

On Tue, 2009-01-20 at 12:54 -0800, ashford@whisperpc.com wrote:
> Hi all,
> 
> I searched the archives, and didn't find any answers to my questions, so I
> think it's time to ask.
> 
> From:  http://btrfs.wiki.kernel.org/index.php/Btrfs_design#Extent_Block_Groups
> 
>         Block groups have a flag that indicates if they are preferred for data
>         or metadata allocations, and at mkfs time the disk is broken up into
>         alternating metadata (33% of the disk) and data groups (66% of the
>         disk). As the disk fills, a group's preference may change back and
>         forth, but Btrfs always tries to avoid intermixing data and metadata
>         extents in the same group. This substantially improves fsck throughput,
>         and reduces seeks during writeback while the FS is mounted. It does
>         slightly increase the seeks while reading.
> 

I missed this paragraph when I last updated the design doc.  Allocation
is much more flexible now: there is no fixed 33%/66% split at mkfs time.
Instead, chunks of storage are allocated from each device for use as
data or metadata as required.

> Based on this, it appears that there is a semi-fixed allocation of 33% of the
> disk to metadata, but that this allocation can change dynamically as the disk
> fills.  It would appear that if the metadata approaches/exceeds its
> allocation, a data group will be reallocated to it, and the same with the data
> (an extent group would be reallocated).
> 
> At the present, there is only one logical device per file-system (single,
> RAID-0, RAID-1 or RAID-10 - each is one logical device).  Based on the
> documentation, there appears to be an intent to support RAID-6 (and optionally
> RAID-5 - I believe this would be good) as logical devices.
> 

There is one logical address space per FS right now.  Each device in the
FS can contribute to the logical address space.
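
As a rough illustration only, here is a toy model of that idea.  It is
not the btrfs allocator: the names Device, Chunk and ToyFS, the 1 GiB
chunk size, and the "pick the device with the most free space" rule are
all assumptions made up for this sketch.  It only shows chunks being
carved from devices on demand and mapped into one logical address space.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 1 << 30  # assumed 1 GiB chunk size, for illustration only


@dataclass
class Device:
    name: str
    size: int      # total bytes on the device
    used: int = 0  # bytes already handed out as chunks

    def free(self) -> int:
        return self.size - self.used


@dataclass
class Chunk:
    logical: int   # start offset in the FS-wide logical address space
    kind: str      # "data" or "metadata", decided at allocation time
    device: str    # name of the device backing this chunk
    physical: int  # start offset on that device


@dataclass
class ToyFS:
    devices: list
    chunks: list = field(default_factory=list)
    next_logical: int = 0

    def alloc_chunk(self, kind: str) -> Chunk:
        """Carve one chunk from the device with the most free space."""
        dev = max(self.devices, key=Device.free)
        if dev.free() < CHUNK_SIZE:
            raise RuntimeError("no space left for a new chunk")
        chunk = Chunk(self.next_logical, kind, dev.name, dev.used)
        dev.used += CHUNK_SIZE
        self.next_logical += CHUNK_SIZE
        self.chunks.append(chunk)
        return chunk
```

In this model nothing is reserved up front: a chunk becomes "data" or
"metadata" only when it is requested, and every device feeds the same
logical address space.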

> From what I see in the Multiple Device Support page
> (http://btrfs.wiki.kernel.org/index.php/Multiple_Device_Support), it appears
> that the intent in the future is to allow a BTRFS file-system to reside on
> multiple logical devices.  This is the starting point for my questions.
> 
> In an installation where a large number of physical devices are available for
> use (something like a Sun Thumper - 48 total disks, or a server connected to a
> SAN), the optimum configuration might be to dedicate certain logical devices
> (small/fast disks in RAID-1) to metadata, and other devices (large/slow disks
> in RAID-5 or RAID-6) to data.  To perform this, the metadata allocation
> percentage would need to be tunable (0% for data-only and 100% for
> metadata-only), and it would have to be able to be locked, so that the block
> group reallocation between metadata and data would be disabled (another option
> might be to allow metadata to reallocate data block groups, but not the other
> way around).
> 

Yes, we definitely want to be able to tie metadata or data to specific
drives.  The disk format has what it needs for this, but it hasn't been
coded up yet.
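
Purely as a sketch of what such a policy could look like (nothing like
this exists in the code today; pick_device, the "allowed" table and the
device names are hypothetical), a chunk type can be restricted to a set
of devices and never spill onto the others:

```python
def pick_device(free_bytes, allowed, kind, chunk_size):
    """Return the allowed device with the most free space, or None.

    free_bytes: dict mapping device name -> free bytes
    allowed:    dict mapping chunk kind -> list of device names it may use;
                a kind missing from the table may use any device
    """
    names = allowed.get(kind, list(free_bytes))
    candidates = [n for n in names if free_bytes[n] >= chunk_size]
    # Returning None (instead of falling back to another device) is what
    # keeps the data/metadata split "locked" in this sketch.
    return max(candidates, key=free_bytes.get, default=None)


free = {"ssd0": 10, "ssd1": 20, "hdd0": 100}
policy = {"metadata": ["ssd0", "ssd1"], "data": ["hdd0"]}
metadata_dev = pick_device(free, policy, "metadata", 5)
data_dev = pick_device(free, policy, "data", 5)
```

With the example table above, metadata lands on the small/fast devices
and data on the large/slow one, which is the layout described in the
question.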

> I believe that a configuration like this would be more flexible than having
> the metadata block groups interleaved with the data block groups.  I also
> believe that this should be able to provide better overall response and
> throughput on a large multi-user server.
> 
> Is something like this intended to be possible?

Definitely ;)  Thanks for these comments.

-chris
