From: David Brown <david.brown@hesbynett.no>
To: Peter Grandi <pg@lxra2.for.sabi.co.UK>
Cc: Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: XFS on top RAID10 with odd drives count and 2 near copies
Date: Sat, 18 Feb 2012 14:59:35 +0100
Message-ID: <4F3FAEC7.9000505@hesbynett.no>
In-Reply-To: <20286.43753.55416.715662@tree.ty.sabi.co.UK>

On 17/02/12 20:30, Peter Grandi wrote:
> [ ... ]
>
>> To my mind, stripe width applies to reads and writes.  For
>> reads, it is the number of spindles that are used in parallel
>> while reading larger blocks of data.  For writes, it is in
>> addition the width of a parity stripe for raid5 or raid6.
>
> In the XFS case that's completely wrong, and irrelevant: in the
> XFS case it is the number of sectors/blocks that IO has to be
> _aligned_ to avoid read-modify-write, if there is the risk for
> that.
>
> The stripe width per se matters less than aligned writes as to
> avoiding read-modify-write impact: if one does IO in stripe
> width units but they are not aligned, performance will be
> terrible as double read-modify-write will not be prevented.
>
> What is the stripe width does not matter to applications like a
> filesystem other than for read-modify-write avoidance because
> how many sectors/blocks are/can be read in parallel depends
> primarily on application access patterns, and secondarily on how
> good is the IO subsystem scheduling.
>
>> [ ... ] Some filesystems care a /little/ about stripe width in
>> that they align certain structures to stripe boundaries to
>> make accesses more efficient.
>
> That in the case where read-modify-write cannot happen, if
> read-modify write can happen, unaligned or non-full-width writes
> are very costly, and not just for arrays; it happens in RAM too,
> and for 4KiB physical sector drives simulating 512B logical
> sectors.

I see your point here - when using a raid that requires 
read-modify-write (such as raid5 or raid6), then having the filesystem 
optimise writes by aligning them to RMW stripes is critical to avoiding 
poor write performance.  I'll remember that for future cases with XFS 
over raid5 or raid6.
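
As a rough illustration of that alignment condition, here is a minimal
sketch. It is not anything XFS or md actually runs; the function name,
the 64 KiB chunk size and the count of 4 data disks are all made up for
the example. It only checks whether a write starts on a stripe boundary
and covers whole stripes, which is the case where a parity raid can skip
read-modify-write.

```python
def is_full_stripe_write(offset, length, chunk_bytes, data_disks):
    """Return True if a write covers only whole, aligned stripes.

    Hypothetical helper: a full stripe is taken to be
    chunk_bytes * data_disks (parity disks are not counted).
    """
    stripe_bytes = chunk_bytes * data_disks
    return (
        length > 0
        and offset % stripe_bytes == 0   # starts on a stripe boundary
        and length % stripe_bytes == 0   # covers whole stripes only
    )


CHUNK = 64 * 1024      # assumed chunk size
DATA_DISKS = 4         # assumed number of data disks
STRIPE = CHUNK * DATA_DISKS

aligned = is_full_stripe_write(0, STRIPE, CHUNK, DATA_DISKS)
shifted = is_full_stripe_write(4096, STRIPE, CHUNK, DATA_DISKS)
partial = is_full_stripe_write(0, CHUNK, CHUNK, DATA_DISKS)
```

The second call is the case Peter describes: the write is one stripe
width long but not aligned, so it straddles two stripes and neither is
written in full.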

In this case (raid10), RMW is not relevant.  So the effect of stripes is 
to allow single reads (or writes) to make use of as many spindles as 
possible in parallel.  Since this is a read-heavy application, the speed 
of reads is important - thus stripe widths /are/ important for performance.
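
To make the spindle argument concrete, here is a small sketch of how
chunks land on drives with 2 near copies. It is a simplified model of
the near layout (copies of a chunk are placed on consecutive drives,
wrapping to the next row), not the kernel's code. The function names,
the 3-drive array and the 64 KiB chunk size are assumptions for
illustration only.

```python
def near_copies(chunk_index, n_drives, copies=2):
    """Return (drive, row) for each copy of a logical chunk.

    Simplified model: copies fill consecutive slots across the
    drives, row by row, so odd drive counts wrap mid-chunk.
    """
    placements = []
    for c in range(copies):
        slot = chunk_index * copies + c
        placements.append((slot % n_drives, slot // n_drives))
    return placements


def candidate_drives(offset, length, chunk_bytes, n_drives, copies=2):
    """Return the set of drives holding any chunk touched by a read."""
    first = offset // chunk_bytes
    last = (offset + length - 1) // chunk_bytes
    drives = set()
    for k in range(first, last + 1):
        for drive, _row in near_copies(k, n_drives, copies):
            drives.add(drive)
    return drives


CHUNK = 64 * 1024   # assumed chunk size
N_DRIVES = 3        # assumed odd drive count

one_chunk = candidate_drives(0, CHUNK, CHUNK, N_DRIVES)
two_chunks = candidate_drives(0, 2 * CHUNK, CHUNK, N_DRIVES)
```

In this model a one-chunk read can only be served by two of the three
drives, while a read spanning two chunks has all three drives available
to it. Which copy is actually read from is up to the raid layer.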

As far as I understand you, XFS doesn't care about the stripe width for
reading, so it doesn't matter whether you give it the correct width when
creating the filesystem.  But that's a different matter from whether the
/actual/ stripe width is relevant for read performance - it is just
XFS's idea of the stripe width that is irrelevant for reading, while the
real-world underlying stripe width on the raid array /is/ relevant.

mvh.,

David
