From mboxrd@z Thu Jan 1 00:00:00 1970
From: Neil Brown
Subject: Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
Date: Thu, 5 Aug 2010 08:24:38 +1000
Message-ID: <20100805082438.0b476adb@notabene>
References: <20100804073546.GA7494@comet.dominikbrodowski.net> <20100804085039.GA11671@infradead.org> <20100804091317.GA27779@isilmar-3.linta.de> <20100804092122.GA2998@infradead.org> <20100804073546.GA7494@comet.dominikbrodowski.net> <201008041116.09822@zmi.at> <20100804102526.GB13766@isilmar-3.linta.de> <20100804111803.GA32643@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: linux-raid-owner@vger.kernel.org
To: Mikael Abrahamsson
Cc: Christoph Hellwig, Dominik Brodowski, Michael Monnerie, linux-raid@vger.kernel.org, xfs@oss.sgi.com, linux-kernel@vger.kernel.org, dm-devel@redhat.com
List-Id: dm-devel.ids

On Wed, 4 Aug 2010 13:53:03 +0200 (CEST)
Mikael Abrahamsson wrote:

> On Wed, 4 Aug 2010, Christoph Hellwig wrote:
>
> > The good news is that you have it tracked down, the bad news is that I
> > know very little about dm-crypt. Maybe the issue is the single threaded
> > decryption in dm-crypt? Can you check how much CPU time the dm crypt
> > kernel thread uses?
>
> I'm not sure it's that. I have a Core i5 with AES-NI and that didn't
> significantly increase my overall performance, as it's not there the
> bottleneck is (at least in my system).
>
> I earlier sent out an email wondering if someone could shed some light on
> how scheduling, block caching and read-ahead works together when one does
> disks->md->crypto->lvm->fs, because that's a lot of layers and potentially
> a lot of unneeded buffering, readahead and scheduling magic?
>

Both page-cache and read-ahead work at the filesystem level, so only the
device in the stack that the filesystem mounts from is relevant for these.
Any read-ahead settings on other devices are ignored.
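
To make the read-ahead point concrete, here is a minimal Python sketch of
picking out the one value that matters for a stack like
disks->md->crypto->lvm->fs. It assumes the stack is listed bottom to top and
that the filesystem is mounted from the last entry. The function names and
the device names (sda, md0, dm-0, dm-1) are hypothetical, chosen only for
illustration; the only kernel interface relied on is the
/sys/block/<dev>/queue/read_ahead_kb file.

```python
from pathlib import Path
from typing import Callable, Sequence


def sysfs_read_ahead_kb(device: str, sysfs_root: str = "/sys/block") -> int:
    """Read read_ahead_kb for one block device, e.g. 'sda', 'md0' or 'dm-1'."""
    path = Path(sysfs_root) / device / "queue" / "read_ahead_kb"
    return int(path.read_text().strip())


def effective_read_ahead_kb(
    stack: Sequence[str],
    read_kb: Callable[[str], int] = sysfs_read_ahead_kb,
) -> int:
    """Return the read-ahead value that matters for a bottom-to-top stack.

    Only the top device (the one the filesystem mounts from) is consulted;
    the values set on the lower devices are deliberately not looked at.
    """
    if not stack:
        raise ValueError("device stack is empty")
    return read_kb(stack[-1])


if __name__ == "__main__":
    # Hypothetical per-device settings standing in for real sysfs files.
    settings = {"sda": 128, "md0": 4096, "dm-0": 256, "dm-1": 512}
    stack = ["sda", "md0", "dm-0", "dm-1"]  # disk, md, crypto, lvm
    print(effective_read_ahead_kb(stack, settings.__getitem__))
```

Passing the reader as a parameter is only there so the sketch can be tried
without touching a real machine; on a live system the default reader would
be used with the actual device names.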
Other levels only have a cache if they explicitly need one, e.g. raid5 has
a stripe-cache to allow parity calculations across all blocks in a stripe.

Scheduling can potentially happen at every layer, but it takes very
different forms.
Crypto, lvm, raid0 etc. don't do any scheduling - it is just
first-in-first-out.
RAID5 does some scheduling for writes (but not reads) to try to gather full
stripes. If you write 2 of 3 blocks in a stripe, then 3 of 3 in another
stripe, the 3 of 3 will be processed immediately while the 2 of 3 might be
delayed a little in the hope that the third will arrive.

The /sys/block/XXX/queue/scheduler setting only applies at the bottom of the
stack (though when you have dm-multipath it is actually one step above the
bottom).

Hope that helps,
NeilBrown
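
As a footnote to the scheduler remark above: the queue/scheduler file lists
the available elevators on one line and marks the active one with square
brackets, e.g. "noop deadline [cfq]". The following is a small Python sketch
for reading it; the function name is hypothetical, and returning None when
nothing is bracketed (as with a plain "none" line) is an assumption of this
sketch rather than anything the kernel defines.

```python
from typing import Optional


def active_scheduler(scheduler_line: str) -> Optional[str]:
    """Return the bracketed entry of a queue/scheduler line, if there is one."""
    for name in scheduler_line.split():
        # The active elevator is the entry wrapped in [ ].
        if len(name) > 2 and name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None  # assumption: no bracketed entry means nothing to report


if __name__ == "__main__":
    # XXX is the placeholder used above; substitute a bottom-level disk.
    with open("/sys/block/XXX/queue/scheduler") as f:
        print(active_scheduler(f.read()))
```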