Re: Filesystem writes on RAID5 too slow

From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Martin Boutin <martboutin@gmail.com>,
	"Kernel.org-Linux-RAID" <linux-raid@vger.kernel.org>,
	Eric Sandeen <sandeen@redhat.com>,
	"Kernel.org-Linux-EXT4" <linux-ext4@vger.kernel.org>,
	xfs-oss <xfs@oss.sgi.com>
Subject: Re: Filesystem writes on RAID5 too slow
Date: Mon, 25 Nov 2013 10:21:37 +1100
Message-ID: <20131124232137.GA8803@dastard>
In-Reply-To: <20131123084106.GA19088@infradead.org>

On Sat, Nov 23, 2013 at 12:41:06AM -0800, Christoph Hellwig wrote:
> On Sat, Nov 23, 2013 at 09:40:38AM +1100, Dave Chinner wrote:
> > > geometry, and we already have it wired to large sector size
> > > testing in xfstests.
> > 
> > We don't need to screw around with the sector size - that is
> > irrelevant to the problem, and we have an allocation alignment
> > test that is supposed to catch these issues: generic/223.
> 
> It didn't imply we need large sector sizes, but the same mechanism
> to expose a large sector size can also be used to present large
> stripe units/width.
> 
> > As I said, I have seen occasional failures of that test (once a
> > month, on average) as a result of this bug. It was simply not often
> > enough - running in a hard loop didn't increase the frequency of
> > failures - to be able to debug it or to reach my "there's a regression
> > I need to look at" threshold. Perhaps we need to revisit that test
> > and see if we can make it more likely to trigger failures...
> 
> Seems like 233 should have caught it regularly with the explicit
> alignment options at mkfs time.  Maybe we also need a test mirroring
> the plain dd more closely?

Preallocation showed the problem, too, so we probably don't even
need dd to check whether allocation alignment is working properly.
We should probably write a test that specifically checks all the
different alignment/extent size combinations we can use.

Preallocation should behave very similarly to direct IO, but I'm
pretty sure that it won't do things like round up allocations to
stripe unit/widths like direct IO does. The fact that we do
allocation sunit/swidth size alignment for direct IO outside the
allocator and sunit/swidth offset alignment inside the allocator is
kinda funky....
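
To make the size/offset distinction concrete, here is a rough
illustrative sketch in Python. It is not XFS code: the function names
and the example geometries are hypothetical, and all values are
assumed to be in filesystem blocks. round_up() models size alignment
(growing a request to a multiple of the stripe unit), while
misaligned_extents() models the offset check a test would apply to
the extents it gets back.

```python
from itertools import product


def round_up(length, unit):
    """Size alignment: round length up to the next multiple of unit."""
    return -(-length // unit) * unit


def is_aligned(offset, unit):
    """Offset alignment: true if offset is a multiple of unit."""
    return offset % unit == 0


def misaligned_extents(extents, sunit):
    """Return the (start, length) extents whose start is not sunit aligned."""
    return [(start, length) for start, length in extents
            if not is_aligned(start, sunit)]


if __name__ == "__main__":
    # Hypothetical stripe units and request sizes, in filesystem blocks.
    sunits = [16, 32, 128]
    request_sizes = [1, 16, 100]
    for sunit, size in product(sunits, request_sizes):
        print(f"sunit={sunit} request={size} rounded={round_up(size, sunit)}")
```

A test along these lines would feed the extent list reported for each
file into misaligned_extents() for every sunit/extent size
combination and fail if anything comes back.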

> I've not seen 233 fail for a long time..

Not surprising, it is a one in several hundred test runs occurrence
here...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com