From: Stan Hoeppner <stan@hardwarefreak.com>
To: NeilBrown <neilb@suse.de>
Cc: Aaron Scheiner <blue@aquarat.za.net>, linux-raid@vger.kernel.org
Subject: Re: Grub-install, superblock corrupted/erased and other animals
Date: Tue, 02 Aug 2011 03:01:44 -0500
Message-ID: <4E37AEE8.1080108@hardwarefreak.com>
In-Reply-To: <20110802163907.29fc40b4@notabene.brown>

On 8/2/2011 1:39 AM, NeilBrown wrote:
> On Wed, 27 Jul 2011 14:16:52 +0200 Aaron Scheiner <blue@aquarat.za.net> wrote:
>> Do these segments follow on from each other without interruption or is
>> there some other data in-between (like metadata? I'm not sure where
>> that resides).
>
> That depends on how XFS lays out the data. It will probably be mostly
> contiguous, but no guarantees.

Looks like he's still under the 16TB limit (8*2TB drives) so this is an
'inode32' XFS filesystem. inode32 and inode64 have very different
allocation behavior. I'll take a stab at an answer, and though the
following is not "short" by any means, it's not nearly long enough to
fully explain how XFS lays out data on disk.

With inode32, all inodes (metadata) are stored in the first allocation
group, maximum 1TB, with file extents in the remaining AGs. When the
original array was created (and this depends a bit on how old his
kernel/xfs module/xfsprogs are) mkfs.xfs would have queried mdraid for
the existence of a stripe layout. If found, mkfs.xfs would have created
16 allocation groups of 500GB each, the first 500GB AG being reserved
for inodes. inode32 writes all inodes to the first AG and distributes
files fairly evenly across top-level directories in the remaining 15 AGs.
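
To make the arithmetic above concrete, here is a minimal sketch of that layout. It uses the figures from this message (16 AGs of 500GB each, the first one holding inodes); it does not read anything from a real filesystem. The function name `ag_layout` is made up for illustration, and the clean split between an "inode" AG and "file extent" AGs is the simplified picture given above, not a byte-accurate map of XFS.

```python
# Sketch of the layout described above: equal-sized allocation groups (AGs),
# with the first one holding inodes under inode32. The numbers come from the
# email, not from a real filesystem.
GB = 10**9  # decimal gigabytes, as drive vendors count them


def ag_layout(ag_count, ag_size_bytes):
    """Return a list of (index, start_byte, end_byte, role), one per AG."""
    layout = []
    for i in range(ag_count):
        start = i * ag_size_bytes
        role = "inodes" if i == 0 else "file extents"
        layout.append((i, start, start + ag_size_bytes, role))
    return layout


layout = ag_layout(16, 500 * GB)
for index, start, end, role in layout:
    print(f"AG {index:2d}: {start // GB:5d} GB .. {end // GB:5d} GB  {role}")
```

On a live system the real AG count and size are reported by xfs_info, so those values would replace the assumed 16 and 500GB.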
This allocation parallelism is driven by directory count. The more
top-level directories, the greater the filesystem write parallelism.
inode64 is much better as inodes are spread across all AGs instead of
being limited to the first AG, giving metadata-heavy workloads a boost
(e.g. maildir). inode32 filesystems are limited to 16TB in size, while
inode64 is limited to 16 exabytes. inode64 requires a fully 64-bit
Linux operating system, and though inode64 scales far beyond 16TB, one
can use inode64 on much smaller filesystems for the added benefits.
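
The point about directory count driving parallelism can be pictured with a toy model. The sketch below assumes a plain round-robin rotation of new top-level directories through the 15 data AGs. That rotation is an assumption made for illustration (the text above only says "fairly evenly"), and `assign_directories` is a hypothetical helper, not anything in XFS or xfsprogs.

```python
def assign_directories(top_level_dirs, ag_count=16):
    """Toy model: rotate top-level directories through the data AGs.

    AG 0 is skipped because, in the inode32 picture above, it holds inodes.
    Round-robin is an assumed stand-in for the real allocator.
    """
    data_ags = list(range(1, ag_count))
    return {d: data_ags[i % len(data_ags)] for i, d in enumerate(top_level_dirs)}


# Four top-level directories touch only four of the fifteen data AGs.
few = assign_directories([f"dir{n}" for n in range(4)])
few_ags_in_use = len(set(few.values()))

# Thirty directories cover all fifteen data AGs.
many = assign_directories([f"dir{n}" for n in range(30)])
many_ags_in_use = len(set(many.values()))

print(few_ags_in_use, many_ags_in_use)
```

In this model a filesystem with only a handful of top-level directories leaves most AGs idle during writes, which is the sense in which more directories means more write parallelism.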

This allocation behavior is what allows XFS to have high performance
with large files, as free space management within and across multiple
allocation groups keeps file fragmentation to a minimum. Thus, there
are normally large spans of free space between AGs on a partially
populated XFS filesystem.

So, to answer the question, if I understood it correctly, there will
indeed be data spread across all of the disks with large free space
chunks in between. The pattern of files on disk will not be contiguous.
Again, this is by design, and yields superior performance for large
file workloads, the design goal of XFS. It doesn't do badly with
many-small-file workloads either.

--
Stan