From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christoph Hellwig
Subject: Re: mkfs.xfs states log stripe unit is too large
Date: Mon, 2 Jul 2012 02:18:27 -0400
Message-ID: <20120702061827.GB16671@infradead.org>
References: <20120623234445.GZ19223@dastard> <4FE67970.2030008@sandeen.net> <4FE710B7.5010704@hardwarefreak.com> <20120626023059.GC19223@dastard> <20120626080217.GA30767@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <20120626080217.GA30767@infradead.org>
Sender: linux-raid-owner@vger.kernel.org
To: Dave Chinner
Cc: Ingo Jürgensmann, xfs@oss.sgi.com, linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Ping to Neil / the raid list.

On Tue, Jun 26, 2012 at 04:02:17AM -0400, Christoph Hellwig wrote:
> On Tue, Jun 26, 2012 at 12:30:59PM +1000, Dave Chinner wrote:
> > You can't, simple as that. The maximum supported log stripe unit is
> > 256k. As it is, a default chunk size of 512k is probably harmful to
> > most workloads - large chunk sizes mean that just about every write
> > will trigger a RMW cycle in the RAID because it is pretty much
> > impossible to issue full stripe writes. Writeback doesn't do any
> > alignment of IO (the generic page cache writeback path is the problem
> > here), so we will almost always be doing unaligned IO to the RAID,
> > and there will be little opportunity for sequential IOs to merge and
> > form full stripe writes (24 disks @ 512k each on RAID6 is an 11MB
> > full stripe write).
> >
> > IOWs, every time you do a small isolated write, the MD RAID volume
> > will do a RMW cycle, reading 11MB and writing 12MB of data to disk.
> > Given that most workloads are not doing lots and lots of large
> > sequential writes this is, IMO, a pretty bad default given the
> > typical RAID5/6 volume configurations we see....
>
> Not too long ago I benchmarked mdraid stripe sizes, and at least for
> XFS 32kb was a clear winner; anything larger decreased performance.
>
> ext4 didn't get hit as badly by larger stripe sizes, probably because
> it still internally bumps the writeback size like crazy, but it did
> not actually get faster with larger stripes either.
>
> This was a streaming-data-heavy workload; anything more metadata heavy
> will probably suffer from larger stripes even more.
>
> CCing the linux-raid list to ask whether there actually is any reason
> for these defaults - something I have wanted to ask for a long time
> but never got around to.
>
> Also I'm pretty sure back then the md default was 256kb, not 512kb, so
> it seems the default has increased further.

---end quoted text---
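
[Editor's note: the following is a minimal illustrative sketch, not part of the original thread. It assumes the 24-disk RAID6 layout and 512k chunk size from Dave's example, a RAID6 parity count of 2, and the 256k log stripe unit cap that the subject line refers to; the function and constant names are hypothetical.]

    # Back-of-the-envelope arithmetic behind the full-stripe / RMW numbers
    # discussed above. Assumptions: 24 drives in RAID6 (2 parity drives),
    # md chunk sizes given in KiB, and a 256 KiB log stripe unit limit.

    XFS_MAX_LOG_SUNIT_KB = 256  # largest log stripe unit mkfs.xfs accepts

    def stripe_numbers(ndisks: int, chunk_kb: int, parity_disks: int = 2):
        """Return (data KiB per full stripe, total KiB written per full stripe)."""
        data_disks = ndisks - parity_disks
        data_kb = data_disks * chunk_kb   # payload carried by one full stripe
        written_kb = ndisks * chunk_kb    # payload plus parity actually written
        return data_kb, written_kb

    for chunk_kb in (32, 64, 128, 256, 512):
        data_kb, written_kb = stripe_numbers(ndisks=24, chunk_kb=chunk_kb)
        log_ok = "ok" if chunk_kb <= XFS_MAX_LOG_SUNIT_KB else "too large"
        print(f"chunk {chunk_kb:>3}k: full stripe {data_kb / 1024:>5.1f} MiB data, "
              f"{written_kb / 1024:>5.1f} MiB written, log stripe unit {log_ok}")

    # With 512k chunks this prints an 11.0 MiB / 12.0 MiB full stripe - the
    # read-modify-write cost Dave describes for every small isolated write -
    # and flags the chunk size as exceeding the log stripe unit limit, which
    # is what triggers the mkfs.xfs warning in the subject.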