public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Stan Hoeppner <stan@hardwarefreak.com>
Cc: xfs@oss.sgi.com
Subject: Re: storage, libaio, or XFS problem?  3.4.26
Date: Mon, 1 Sep 2014 09:57:49 +1000	[thread overview]
Message-ID: <20140831235749.GH20518@dastard> (raw)
In-Reply-To: <d20fe777ec1fd318ae5d4054dffda3f4@localhost>

On Fri, Aug 29, 2014 at 09:55:53PM -0500, Stan Hoeppner wrote:
> On Sat, 30 Aug 2014 09:55:38 +1000, Dave Chinner <david@fromorbit.com> wrote:
> > On Fri, Aug 29, 2014 at 11:38:16AM -0500, Stan Hoeppner wrote:
> >> 
> >> Another storage crash yesterday.  xfs_repair output inline below for the 7
> >> filesystems.  I'm also pasting the dmesg output.  This time there is no
> >> oops, no call traces.  The filesystems mounted fine after mounting,
> >> replaying, and repairing. 
> > 
> > Ok, what version of xfs_repair did you use?
> 
> 3.1.4 which is a little long in the tooth.

And so not useful for the purposes of finding free space tree
corruptions. Old xfs_repair versions only rebuild the freespace
trees - they don't check them first. IOWs, silence from an old
xfs_repair does not mean the filesystem was free of errors.
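[A sketch of the check-before-rebuild distinction: newer xfs_repair
supports a no-modify dry run via -n, which reports corruption without
writing anything and exits non-zero if damage is found. The device
path below is a placeholder, not one from this thread.]

```shell
# No-modify mode: scan and report only, never write to the device.
# /dev/sdX1 is a placeholder for the actual LUN device node.
xfs_repair -n /dev/sdX1

# With -n, a non-zero exit status means corruption was detected,
# so the check can be scripted:
if ! xfs_repair -n /dev/sdX1; then
    echo "corruption detected on /dev/sdX1, schedule a real repair"
fi
```

This only applies to repair versions new enough to verify the free
space btrees rather than unconditionally rebuilding them.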

> >> This is because some of our writes for a given low rate stream
> >> are as low as 32KB and may be 2-3 seconds apart.  With a 64-128KB
> >> chunk, 768 to 1536KB stripe width, we'd get massive RMW without
> >> this feature.  Testing thus far shows it is fairly effective,
> >> though we still get pretty serious RMW due to the fact we're
> >> writing 350 of these small streams per array at ~72 KB/s max,
> >> along with 2 streams at ~48 MB/s, and 50 streams at ~1.2 MB/s.
> >> 
> >> Multiply this by 7 LUNs per controller and it becomes clear we're
> >> putting a pretty serious load on the firmware and cache.
> > 
> > Yup, so having the array cache do the equivalent of sequential
> > readahead multi-stream detection for writeback would make a big
> > difference. But not simple to do....
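[For back-of-envelope context, the RMW penalty described above can be
modelled roughly. This assumes a 12-data-disk RAID6 layout, which is
implied by the 64-128KB chunk / 768-1536KB stripe width figures but
never stated outright; controller caching will change the real
numbers considerably.]

```python
# Rough model of RAID6 read-modify-write amplification for writes
# smaller than a full stripe. Assumes 12 data disks (64 KB chunk *
# 12 = 768 KB stripe width, matching the figures quoted above).

def rmw_io(write_kib, chunk_kib, data_disks=12):
    """Return (data_io, parity_io) in KiB for one write.

    A sub-stripe write forces the controller to read the old data
    and old P/Q parity, then write the new data and new parity:
    roughly 2x the data touched plus read+write of two parity
    chunks. A full-stripe write needs no reads at all.
    """
    stripe_kib = chunk_kib * data_disks
    if write_kib >= stripe_kib:
        return write_kib, 0      # full-stripe: parity computed fresh
    data_io = 2 * write_kib      # read old data + write new data
    parity_io = 2 * 2 * chunk_kib  # P and Q chunks, read + write
    return data_io, parity_io

for chunk in (128, 64, 32):
    data_io, parity_io = rmw_io(32, chunk)
    total = data_io + parity_io
    print(f"chunk {chunk:>3} KiB: 32 KiB write -> {total} KiB of I/O "
          f"({total / 32:.1f}x amplification)")
```

Under this (crude) model a 32 KiB write behind a 128 KiB chunk costs
roughly 18x its size in disk I/O, dropping to about 6x at a 32 KiB
chunk, which is consistent with the motivation for testing smaller
chunk sizes below.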
> 
> Not at all, especially with only 3 GB of RAM to work with, as I'm told. 
> Seems low for a high end controller with 4x 12G SAS ports.  We're only able
> to achieve ~250 MB/s per array at the application due to the access pattern
> being essentially random, and still with a serious quantity of RMWs.  Which
> is why we're going to test with an even smaller chunk of 32KB.  I believe
> that's the lower bound on these controllers.  For this workload 16KB or
> maybe even 8KB would likely be more optimal.  We're also going to test with
> bcache and a 400 GB Intel 3700 (datacenter grade) SSD backing two LUNs. 
> But with bcache chunk size should be far less relevant.  I'm anxious to
> kick those tires, but it'll be a couple of weeks.
> 
> Have you played with bcache yet?

Enough to scare me. So many ways for things to go wrong, no easy way
to recover when things go wrong. And that's before I even get to
performance warts, like having systems stall completely because
there's tens or hundreds of GB of 4k random writes that have to be
flushed to slow SATA RAID6 in the cache....
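[The stall scenario above - a huge pool of dirty 4k writes draining
to slow RAID6 - is exactly what bcache's writeback knobs exist to
bound. A hedged sketch of the relevant sysfs tunables, assuming a
registered bcache0 device; these are the standard bcache sysfs
attributes, but defaults and paths may vary by kernel version:]

```shell
# Cap how much of the cache may hold dirty data before writeback
# is throttled (percentage of cache size).
echo 5 > /sys/block/bcache0/bcache/writeback_percent

# Or sidestep the dirty-data exposure entirely: writethrough mode
# never leaves dirty data in the cache, at the cost of losing the
# write-latency benefit.
echo writethrough > /sys/block/bcache0/bcache/cache_mode
```

Neither knob addresses the recovery problem when the cache device
itself dies with dirty data on it, which is the scarier failure mode.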

Cheers,

Dave.

PS: can you wrap your text at 68 or 72 columns so quoted text
doesn't overflow 80 columns and get randomly wrapped and messed up?

-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


Thread overview: 19+ messages
2014-08-26  6:18 storage, libaio, or XFS problem? 3.4.26 Stan Hoeppner
2014-08-26  6:25 ` Stan Hoeppner
2014-08-26  7:53 ` Dave Chinner
2014-08-26 17:19   ` Stan Hoeppner
2014-08-28  0:32     ` Dave Chinner
2014-08-28 22:31       ` Stan Hoeppner
2014-08-28 23:08         ` Dave Chinner
2014-08-29 16:38           ` Stan Hoeppner
2014-08-29 23:55             ` Dave Chinner
2014-08-30  2:55               ` Stan Hoeppner
2014-08-31 23:57                 ` Dave Chinner [this message]
2014-09-01  3:36                   ` stan hoeppner
2014-09-01 23:45                     ` Dave Chinner
2014-09-02 17:15                       ` stan hoeppner
2014-09-02 22:19                         ` Dave Chinner
2014-09-07  5:23                           ` stan hoeppner
2014-09-07 23:39                             ` Dave Chinner
2014-09-08 15:13                               ` stan hoeppner
2014-09-20 19:47                                 ` stan hoeppner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the message as an mbox file, import it into your mail client,
  and reply-to-all from there.

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140831235749.GH20518@dastard \
    --to=david@fromorbit.com \
    --cc=stan@hardwarefreak.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

Be sure your reply has a Subject: header at the top and a blank
line before the message body.