From: Dave Chinner <david@fromorbit.com>
To: Stefan Ring <stefanrin@gmail.com>
Cc: stan@hardwarefreak.com, Linux fs XFS <xfs@oss.sgi.com>
Subject: Re: XFS: Abysmal write performance because of excessive seeking (allocation groups to blame?)
Date: Mon, 9 Apr 2012 10:19:43 +1000
Message-ID: <20120409001943.GI18323@dastard>
In-Reply-To: <CAAxjCEyJW1b4dbKctbrgdWjykQt8Hb4Sw1RKdys3oUsehNHCcQ@mail.gmail.com>
On Sat, Apr 07, 2012 at 09:27:50AM +0200, Stefan Ring wrote:
> > Instead, a far more optimal solution would be to set aside 4 spares per
> > chassis and create 14 four drive RAID10 arrays. This would yield ~600
> > seeks/sec and ~400MB/s sequential throughput performance per 2 spindle
> > array. We'd stitch the resulting 56 hardware RAID10 arrays together in
> > an mdraid linear (concatenated) array. Then we'd format this 112
> > effective spindle linear array with simply:
> >
> > $ mkfs.xfs -d agcount=56 /dev/md0
> >
> > Since each RAID10 is 900GB capacity, we have 56 AGs of just under the
> > 1TB limit, 1 AG per 2 physical spindles. Due to the 2 stripe spindle
> > nature of the constituent hardware RAID10 arrays, we don't need to worry
> > about aligning XFS writes to the RAID stripe width. The hardware cache
> > will take care of filling the small stripes. Now we're in the opposite
> > situation of having too many AGs per spindle. We've put 2 spindles in a
> > single AG and turned the seek starvation issues on its head.
>
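As a quick sanity check on the numbers in the quoted layout, here is a
small sketch. The array count, drive count and 900GB capacity come from
the quoted text; treating the "1TB limit" as 1000 decimal GB, and
ignoring metadata overhead and rounding in mkfs, are simplifying
assumptions. The function and constant names are made up for
illustration.

```python
# Figures taken from the quoted message.
ARRAYS = 56               # hardware RAID10 arrays in the linear concat
DRIVES_PER_ARRAY = 4      # four-drive RAID10
ARRAY_CAPACITY_GB = 900   # usable capacity of each RAID10
AG_LIMIT_GB = 1000        # the "1TB limit", taken as decimal GB (assumption)


def layout(arrays=ARRAYS, agcount=ARRAYS):
    """Rough per-AG figures for the concatenated array (overhead ignored)."""
    total_gb = arrays * ARRAY_CAPACITY_GB
    ag_size_gb = total_gb / agcount
    # RAID10 mirrors pairs of drives, so half of them count as
    # effective spindles.
    effective_spindles = arrays * DRIVES_PER_ARRAY // 2
    return {
        "total_gb": total_gb,
        "ag_size_gb": ag_size_gb,
        "effective_spindles": effective_spindles,
        "spindles_per_ag": effective_spindles / agcount,
        "fits_ag_limit": ag_size_gb < AG_LIMIT_GB,
    }


if __name__ == "__main__":
    for key, value in layout().items():
        print(f"{key}: {value}")
```

With agcount equal to the number of arrays, each AG works out to about
one array's worth of space, which is where the "1 AG per 2 physical
spindles" figure comes from.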
> So it sounds like, for poor guys like us, who can’t afford the
> hardware to have dozens of spindles, the best option would be to
> create the XFS file system with agcount=1?
No, because then you have no redundancy in metadata structures, so
if you lose or corrupt the superblock you can more easily lose the
entire filesystem. Not to mention you have no allocation parallelism
in the filesystem, so you'll get terrible performance in many common
workloads. IO fairness will also be a big problem.
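To make the allocation parallelism point concrete, here is a toy model.
It is only a sketch and not how the allocator is implemented: it
assumes allocations within a single AG are fully serialised, that
different AGs can allocate independently, and that writers spread
evenly across AGs. The function name is hypothetical.

```python
import math


def allocation_rounds(writers: int, agcount: int) -> int:
    """Rounds needed for every writer to complete one allocation.

    Toy assumption: one allocation per AG per round, with writers
    spread evenly across the AGs.
    """
    if writers < 0 or agcount < 1:
        raise ValueError("writers must be >= 0 and agcount >= 1")
    return math.ceil(writers / agcount)


if __name__ == "__main__":
    for agcount in (1, 4, 16):
        rounds = allocation_rounds(16, agcount)
        print(f"agcount={agcount:2d}: {rounds} rounds for 16 writers")
```

Under those assumptions, agcount=1 makes every concurrent writer queue
behind the same AG, whatever the hardware underneath can do.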
> That seems to be the only reasonable conclusion to me, since a
> single RAID device, like a single disk, cannot write in parallel
> anyway.
A decent RAID controller with a BBWC and a single LUN benefits from
parallelism just as much as large disk arrays do, because the BBWC
minimises write IO latency and allows the controller to do a better
job of scheduling its IO.
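A rough illustration of why having more writes in flight helps the
scheduling side: if a batch of writes is held in cache, it can be
reordered before it reaches the disk. The sketch below compares head
travel for one batch serviced in arrival order against a single
ascending sweep. The offsets are invented, and the sweep is the
simplest possible policy, not a description of what any real
controller firmware does.

```python
def head_travel(offsets, start=0):
    """Total distance the head moves servicing offsets in the given order."""
    travel, pos = 0, start
    for off in offsets:
        travel += abs(off - pos)
        pos = off
    return travel


def compare(offsets, start=0):
    """Return (arrival_order_travel, sorted_sweep_travel) for one batch."""
    fifo = head_travel(offsets, start)
    # With the whole batch cached, the order can be chosen freely;
    # a single ascending sweep stands in for a real scheduler here.
    swept = head_travel(sorted(offsets), start)
    return fifo, swept


if __name__ == "__main__":
    batch = [900, 100, 800, 200, 700, 300]  # made-up block offsets
    fifo, swept = compare(batch)
    print(f"arrival order: {fifo}")
    print(f"sorted sweep:  {swept}")
```

The deeper the queue the filesystem can keep supplied, the more room
there is for this kind of reordering, which is one reason multiple AGs
still matter on a single LUN.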
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs