From: Stan Hoeppner <stan@hardwarefreak.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jimmy Thrasibule <thrasibule.jimmy@gmail.com>,
Linux RAID <linux-raid@vger.kernel.org>,
"xfs@oss.sgi.com" <xfs@oss.sgi.com>
Subject: Re: ARC-1120 and MD very sloooow
Date: Tue, 26 Nov 2013 02:03:23 -0600
Message-ID: <529455CB.6050907@hardwarefreak.com>
In-Reply-To: <20131126061458.GM8803@dastard>
On 11/26/2013 12:14 AM, Dave Chinner wrote:
> On Mon, Nov 25, 2013 at 09:58:21PM -0600, Stan Hoeppner wrote:
>> On 11/25/2013 8:52 PM, Dave Chinner wrote:
>> ...
>>> sunit/swidth is in filesystem blocks, not sectors. Hence
>>> sunit is 1MB, swidth = 2MB. While it's not quite correct
>>> (su=512k,sw=1m), it's not actually a problem...
>>
>> Well that's what I thought as well, and I was puzzled by the 8 blocks
>> value for the log sunit. So I double checked before posting, and 'man
>> mkfs.xfs' told me
>>
>> sunit=value
>> This is used to specify the stripe unit for a RAID device
>> or a logical volume. The value has to be specified in
>> 512-byte block units.
>>
>> So apparently the units of 'sunit' are different depending on which XFS
>> tool one is using.
>
> No they don't. sunit as a mkfs input value is determined by 512 byte
> units. The output is given in units of "blks" i.e. the log block
> size:
Yes. That's pretty clear now. And I've figured out why this is...
> $ mkfs.xfs -N -l sunit=64 /dev/vdb
> ....
> log =internal log bsize=4096 blocks=12800, version=2
> = sectsz=512 sunit=8 blks, lazy-count=1
>
> Which is given by the "bsize=4096" variable and so are, in this
> case, 4k in size. input = 64 * 512 bytes = 8 * 4096 bytes = output
>
> Remember, you can specify su rather than sunit, and they are
> specified in sectors, filesystem blocks or bytes, and the output is
> still in units of log block size:
I never used IRIX, but I've deduced that this made sense there because
the filesystem block size could vary at mkfs time. On Linux the
filesystem block size is in practice 4KB: that is the default, it can't
exceed the page size, and from everything I've read the page size isn't
going to change any time soon. Thus for Linux-only users, specifying
creation values in 512 byte blocks, or bytes, or multiples of the fs
block size can be very confusing when the output is always a multiple
of filesystem blocks, in practice always a multiple of 4KB.
> # mkfs.xfs -N -b size=4096 -l su=8b /dev/vdb
                                ^^^^^
I never noticed this until now because I've never used an external log,
nor needed an internal log with different geometry than the data section.
But why do we have different input values for su in the data (bytes) and
log (blocks) sections? I hope to learn something from your answer, as I
usually do. :)
> ....
> log =internal log bsize=4096 blocks=12800, version=2
> = sectsz=512 sunit=8 blks, lazy-count=1
>
> # mkfs.xfs -N -l su=32k /dev/vdb
> ....
> log =internal log bsize=4096 blocks=12800, version=2
> = sectsz=512 sunit=8 blks, lazy-count=1
>
> IOws, the input units can vary, but the output units are always the
> same.
>
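To convince myself of the arithmetic in the three examples above, here
is a rough sketch in Python. It is not xfsprogs code: the function
names are made up for illustration, and the suffix handling is a
simplified assumption covering only s (512 byte sectors), b (filesystem
blocks), k, m and plain bytes.

```python
SECTOR_BYTES = 512  # sunit= input is in 512-byte units, per mkfs.xfs(8)


def bytes_to_blks(nbytes, bsize):
    """Express a byte count in blocks of bsize bytes."""
    if nbytes % bsize:
        raise ValueError("%d bytes is not a multiple of bsize=%d" % (nbytes, bsize))
    return nbytes // bsize


def sunit_to_blks(sunit, bsize=4096):
    """sunit= input (512-byte units) -> value reported in 'blks'."""
    return bytes_to_blks(sunit * SECTOR_BYTES, bsize)


def su_to_blks(su, bsize=4096):
    """su= input such as '8b', '32k' or '32768' -> value reported in 'blks'.

    Hypothetical, simplified suffix parsing for illustration only.
    """
    multipliers = {"s": SECTOR_BYTES, "b": bsize, "k": 1024, "m": 1024 * 1024}
    suffix = su[-1].lower()
    if suffix in multipliers:
        nbytes = int(su[:-1]) * multipliers[suffix]
    else:
        nbytes = int(su)  # no suffix: plain bytes
    return bytes_to_blks(nbytes, bsize)


print(sunit_to_blks(64), su_to_blks("8b"), su_to_blks("32k"))
```

With bsize=4096, all three inputs (sunit=64, su=8b, su=32k) come out as
the same number of blocks, which matches the "sunit=8 blks" lines in the
mkfs.xfs output quoted above.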
>> That's a bit confusing. And 'man xfs_info'
>> (xfs_growfs) doesn't tell us that sunit is given in filesystem blocks.
>> I'm using xfsprogs 3.1.4 so maybe these have been corrected since.
>
> It might seem confusing at first, but it's actually quite
> consistent...
At first? Dang Dave, you've been mentoring me for something like 3+
years now. :) I don't deal with alignment issues very often, but this
isn't my first rodeo. I had my answer based on 4KB blocks, and went to
the docs to verify it before posting. That's the logical thing to do.
In this case, the docs led me astray. That shouldn't happen.
It won't happen to me again, but if it happened once, after using the
software and documentation for over 4 years, it may well happen to
someone else. So I'm thinking a short caveat/note might be in order in
mkfs.xfs(8). Something like
"Note: During filesystem creation, data section stripe alignment values
(sunit/swidth/su/sw) are specified in units other than filesystem
blocks. After creation, sunit/swidth values are referenced in multiples
of filesystem blocks by the xfsprogs tools."
>>> Again, lsunit is in filesystem blocks, so it is 32k, not 4k. And
>>> yes, the default lsunit when the sunit > 256k is 32k. So, nothing
>>> wrong there, either.
>>
>> So where should I have looked to confirm sunit reported by xfs_info is
>> in fs block (4KB) multiples, not in the 512B multiples of mkfs.xfs?
>
> Explained above.
Thanks Dave. Hopefully others learn from this as well.
--
Stan