From: Neil Brown <neilb@suse.de>
To: Mikael Abrahamsson <swmike@swm.pp.se>
Cc: Christoph Hellwig <hch@infradead.org>,
	Dominik Brodowski <linux@dominikbrodowski.net>,
	Michael Monnerie <michael.monnerie@is.it-management.at>,
	linux-raid@vger.kernel.org, xfs@oss.sgi.com,
	linux-kernel@vger.kernel.org, dm-devel@redhat.com
Subject: Re: How to track down abysmal performance ata - raid1 - crypto - vg/lv - xfs
Date: Thu, 5 Aug 2010 08:24:38 +1000
Message-ID: <20100805082438.0b476adb@notabene>
In-Reply-To: <alpine.DEB.1.10.1008041351100.19930@uplift.swm.pp.se>

On Wed, 4 Aug 2010 13:53:03 +0200 (CEST)
Mikael Abrahamsson <swmike@swm.pp.se> wrote:

> On Wed, 4 Aug 2010, Christoph Hellwig wrote:
> 
> > The good news is that you have it tracked down, the bad news is that I 
> > know very little about dm-crypt.  Maybe the issue is the single threaded 
> > decryption in dm-crypt?  Can you check how much CPU time the dm crypt 
> > kernel thread uses?
> 
> I'm not sure it's that. I have a Core i5 with AES-NI and that didn't 
> significantly increase my overall performance, as it's not there the 
> bottleneck is (at least in my system).
> 
> I earlier sent out an email wondering if someone could shed some light on 
> how scheduling, block caching and read-ahead works together when one does 
> disks->md->crypto->lvm->fs, because that's a lot of layers and potentially 
> a lot of unneeded buffering, readahead and scheduling magic?
> 

Both page-cache and read-ahead work at the filesystem level, so only the
device in the stack that the filesystem mounts from is relevant for these.
Any read-ahead settings on other devices are ignored.
Other levels only have a cache if they explicitly need one.  e.g. raid5 has a
stripe-cache to allow parity calculations across all blocks in a stripe.
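
To see which read-ahead value is the one that matters, it can help to print
the setting for every device in the stack.  The following is only a rough
sketch: it assumes the value is exposed as queue/read_ahead_kb under
/sys/block/<dev>/, and the device names in the usage example (sda, md0, dm-0,
dm-1) are hypothetical placeholders for a disks->md->crypto->lvm stack.  The
helper names read_ahead_kb() and report() are made up for this illustration.

```python
from pathlib import Path


def read_ahead_kb(device, sysfs_root="/sys"):
    """Return the read-ahead size in KiB for a block device, or None if it
    cannot be read (device absent, no permission, unexpected contents)."""
    path = Path(sysfs_root) / "block" / device / "queue" / "read_ahead_kb"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def report(stack, sysfs_root="/sys"):
    """Map each device name in the stack to its read-ahead value.

    By convention of this sketch, the last entry is the device the
    filesystem is mounted from, i.e. the only one whose value is used.
    """
    return {dev: read_ahead_kb(dev, sysfs_root) for dev in stack}


if __name__ == "__main__":
    # Hypothetical stack, bottom to top; substitute your own device names.
    stack = ["sda", "md0", "dm-0", "dm-1"]
    for dev, kb in report(stack).items():
        marker = "  <- filesystem device" if dev == stack[-1] else ""
        print(f"{dev}: {kb} KiB{marker}")
```

Only the value printed for the last device should influence read-ahead for
files on that filesystem; the others are shown for comparison.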

Scheduling can potentially happen at every layer, but it takes very different
forms.  Crypto, lvm, raid0 etc don't do any scheduling - it is just
first-in-first-out.
RAID5 does some scheduling for writes (but not reads) to try to gather full
stripes.  If you write 2 of 3 blocks in a stripe, then 3 of 3 in another
stripe, the 3 of 3 will be processed immediately while the 2 of 3 might be
delayed a little in the hope that the third will arrive.
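
The 2-of-3 versus 3-of-3 example can be illustrated with a toy model.  This
is not the md/raid5 code and makes no attempt to model its timing; it only
shows the ordering effect, assuming 3 data blocks per stripe and assuming
that partial stripes are simply flushed once no more writes arrive.  The
function name schedule_writes() is invented for this sketch.

```python
from collections import defaultdict


def schedule_writes(writes, blocks_per_stripe=3):
    """Toy model of gathering full stripes.

    `writes` is a sequence of (stripe, block) pairs in arrival order.
    A stripe is emitted as soon as all of its data blocks have been
    written; stripes that never fill up are emitted at the end.
    """
    pending = defaultdict(set)
    order = []
    for stripe, block in writes:
        pending[stripe].add(block)
        if len(pending[stripe]) == blocks_per_stripe:
            order.append((stripe, "full"))
            del pending[stripe]
    # Remaining stripes stayed partial; in this toy they are flushed last.
    for stripe in sorted(pending):
        order.append((stripe, "partial"))
    return order


if __name__ == "__main__":
    # 2 of 3 blocks in stripe 0, then 3 of 3 blocks in stripe 1.
    writes = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2)]
    print(schedule_writes(writes))  # [(1, 'full'), (0, 'partial')]
```

Stripe 1 arrives later but completes, so it is handled first; stripe 0 waits
in case its third block shows up.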

The /sys/block/XXX/queue/scheduler setting only applies at the bottom of the
stack (though when you have dm-multipath it is actually one step above the
bottom).
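
To check which scheduler is active on the bottom devices, the scheduler file
can be parsed: it lists the available schedulers with the active one in
square brackets (for example "noop deadline [cfq]"), while stacked devices
typically just report "none".  A small sketch, assuming that format; the
helper names active_scheduler() and scheduler_for() and the device name sda
are only for illustration.

```python
import re
from pathlib import Path


def active_scheduler(text):
    """Extract the active scheduler from the contents of a
    queue/scheduler file.  The active entry is the bracketed one;
    if nothing is bracketed, return the stripped text (e.g. "none"),
    or None if the text is empty."""
    match = re.search(r"\[([^\]]+)\]", text)
    if match:
        return match.group(1)
    text = text.strip()
    return text or None


def scheduler_for(device, sysfs_root="/sys"):
    """Read and parse the scheduler file for one device; None on error."""
    path = Path(sysfs_root) / "block" / device / "queue" / "scheduler"
    try:
        return active_scheduler(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    # "sda" is a placeholder for a disk at the bottom of the stack.
    print(scheduler_for("sda"))
```

For example, active_scheduler("noop deadline [cfq]") gives "cfq".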

Hope that helps,
NeilBrown
