linux-raid.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Wols Lists <antlists@youngman.org.uk>
Cc: linux-xfs@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: Growing RAID10 with active XFS filesystem
Date: Wed, 10 Jan 2018 09:25:23 +1100
Message-ID: <20180109222523.GJ16421@dastard>
In-Reply-To: <5A548D31.4000002@youngman.org.uk>

On Tue, Jan 09, 2018 at 09:36:49AM +0000, Wols Lists wrote:
> On 08/01/18 22:01, Dave Chinner wrote:
> > Yup, 21 devices in a RAID 10. That's a really nasty config for
> > RAID10 which requires an even number of disks to mirror correctly.
> > Why does MD even allow this sort of whacky, sub-optimal
> > configuration?
> 
> Just to point out - if this is raid-10 (and not raid-1+0 which is a
> completely different beast) this is actually a normal linux config. I'm
> planning to set up a raid-10 across 3 devices. What happens is that
> raid-10 writes X copies across Y devices. If X = Y then it's a
> normal mirror config, if X < Y it makes good use of space (and if X > Y
> it doesn't make sense :-)
> 
> SDA: 1, 2, 4, 5
> 
> SDB: 1, 3, 4, 6
> 
> SDC: 2, 3, 5, 6
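
The table above can be reproduced with a simple round-robin placement
rule. This is a minimal sketch, assuming two copies of each chunk are
written to consecutive slots that wrap around the devices; the function
name near_layout and its parameters are hypothetical, and this is not
MD's actual code.

```python
def near_layout(n_chunks, n_devices, copies=2):
    """Map each device index to the 1-based chunk numbers it stores.

    Assumption: copies of a chunk occupy consecutive slots, and slots
    are dealt out round-robin across the devices.
    """
    devices = {d: [] for d in range(n_devices)}
    for chunk in range(n_chunks):
        for copy in range(copies):
            slot = chunk * copies + copy
            devices[slot % n_devices].append(chunk + 1)
    return devices


layout = near_layout(n_chunks=6, n_devices=3)
for dev, chunks in layout.items():
    # Label the devices SDA, SDB, SDC to match the table above.
    print(f"SD{'ABC'[dev]}: {chunks}")
```

With 6 chunks on 3 devices each device ends up holding 4 chunk copies,
and every chunk lives on exactly two different devices.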

It's nice to know that MD has redefined RAID-10 to be different to
the industry standard definition that has been in use for 20 years and
that filesystem layouts have been optimised for.  Rotoring data across
odd numbers of disks like this is going to really, really suck on
filesystems that are stripe layout aware.

For example, XFS has hot-spot prevention algorithms in its
internal physical layout for striped devices. It aligns AGs across
different stripe units so that metadata and data don't all get
aligned to the one disk in a RAID0/5/6 stripe. If the stripes are
rotoring across disks themselves, then we're going to end up back in
the same position we started with - multiple AGs aligned to the
same disk.
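
The RAID0 half of this argument can be illustrated with a toy model.
It is a sketch under stated assumptions, not XFS's allocator: AG sizes
are expressed in stripe units, stripe unit k is assumed to live on disk
k % n_disks, and the function name ag_start_disks and all numbers are
hypothetical.

```python
def ag_start_disks(ag_count, ag_size_su, n_disks):
    """Return the RAID0 disk holding the first stripe unit of each AG.

    Assumption: plain RAID0, where stripe unit k sits on disk k % n_disks.
    """
    return [(ag * ag_size_su) % n_disks for ag in range(ag_count)]


# AG size is a multiple of the stripe width: every AG starts on one disk.
aligned = ag_start_disks(ag_count=4, ag_size_su=16, n_disks=4)

# AG size is one stripe unit smaller: AG starts spread over all disks.
skewed = ag_start_disks(ag_count=4, ag_size_su=15, n_disks=4)

print(aligned)
print(skewed)
```

This models only the fixed RAID0 mapping. It says nothing about how an
odd-disk RAID-10 places stripe units, which is the case at issue here.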

The result is that many XFS workloads are going to hotspot disks and
result in unbalanced load when there is an odd number of disks in a
RAID-10 array.  Actually, it's probably worse than having no
alignment, because it makes hotspot occurrence and behaviour very
unpredictable.

Worse is the fact that there's absolutely nothing we can do to
optimise allocation alignment or IO behaviour at the filesystem
level. We'll have to make mkfs.xfs aware of this clusterfuck and
turn off stripe alignment when we detect such a layout, but that
doesn't help all the existing user installations out there right
now.

IMO, odd-numbered disks in RAID-10 should be considered harmful and
never used....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
