From: Jens Axboe <jens.axboe@oracle.com>
To: "Alan D. Brunelle" <Alan.Brunelle@hp.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH 0/3] Skip I/O merges when disabled
Date: Thu, 24 Apr 2008 09:09:23 +0200
Message-ID: <20080424070923.GQ12774@kernel.dk>
In-Reply-To: <480F8936.5030406@hp.com>

On Wed, Apr 23 2008, Alan D. Brunelle wrote:
> The block I/O + elevator + I/O scheduler code spends a lot of time
> trying to merge I/Os -- rightfully so under "normal" circumstances.
> However, if one were to know that the incoming I/O stream was /very/
> random in nature, the cycles are wasted. (This can be the case, for
> example, during OLTP-type runs.)
> 
> This patch stream adds a per-request_queue tunable that (when set)
> disables merge attempts, thus freeing up a non-trivial amount of CPU cycles.
> 
> I'll be doing some more benchmarking, but this is a representative set
> of data on a two-way Opteron box w/ 4 SATA drives. 'fio' was used to
> generate random 4k asynchronous direct I/Os over the 128GiB of each SATA
> drive.  Oprofile was used to collect the results, and we collected
> CPU_CLK_UNHALTED (CPU) and DATA_CACHE_MISSES (DCM) events. The data
> extracted below shows both the percentage for all samples (including
> non-kernel) as well as just those from the block I/O layer + elevator +
> deadline I/O scheduler + SATA modules.
> 
> v2.6.25 (not patched):  CPU: 5.8330% (total)  7.5644% (I/O code only)
> v2.6.25 + nomerges = 0: CPU: 5.8008% (total)  7.5806% (I/O code only)
> v2.6.25 + nomerges = 1: CPU: 4.5404% (total)  5.9416% (I/O code only)
> 
> v2.6.25 (not patched):  DCM: 8.1967% (total) 10.5188% (I/O code only)
> v2.6.25 + nomerges = 0: DCM: 7.2291% (total)  9.4087% (I/O code only)
> v2.6.25 + nomerges = 1: DCM: 6.1989% (total)  8.0155% (I/O code only)
> 
> I've typically been seeing a good 20-25% reduction in CPU samples, and
> 10-15% in DCM samples for the random load w/ nomerges set to 1 compared
> to set to 0 (looking at just the block code).
> 
> [BTW: The I/O performance doesn't change much between the 3 sets of data
> - the seek + I/O times themselves dominate things to such a large
> extent.  There is a very small improvement seen w/ nomerges=1, but <<1%.]
> 
> It's not clear to me why 2.6.25 (not patched) requires /more/ cycles
> than does the patched kernel w/ nomerges=0 -- it's been consistent in
> the handful of runs I've done. I'm going to do a large set of runs for
> each condition (not patched, nomerges=0 & nomerges=1) to verify that
> this holds over multiple runs. I'm also going to check out sequential
> loads to see what (if any) penalty the extra couple of checks incurs on
> those (probably not noticeable).
> 
> The first patch in the series adds the tunable; the second adds in the
> check to skip the merge code; and the third adds in the check to skip
> adding requests to hash lists for merging.
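
As a quick sanity check on the quoted figures, the relative reduction between two sample percentages can be computed as below. This is only an illustrative sketch: the helper name relative_reduction_pct is hypothetical and not part of the patches; the inputs are the "I/O code only" numbers quoted above.

```c
/*
 * Relative reduction, in percent, when a sample share drops from
 * `before` to `after`. Hypothetical helper, for illustration only.
 */
static double relative_reduction_pct(double before, double after)
{
    return (before - after) / before * 100.0;
}
```

With the nomerges=0 and nomerges=1 rows, relative_reduction_pct(7.5806, 5.9416) comes to roughly 21.6 for CPU samples, and relative_reduction_pct(9.4087, 8.0155) to roughly 14.8 for DCM samples, in line with the 20-25% and 10-15% ranges reported.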

The functionality is fine with me. Merging obviously costs a non-zero
number of cycles per IO, and if you know it's in vain, you may as well
turn it off. One suggestion, though: if you add this as a performance
rather than a functionality change, I would suggest keeping the one-hit
cache merge, as that is essentially free. Better than free, actually,
since if you hit that merge point you'll spend far fewer cycles
than you would allocating and setting up a new request.

-- 
Jens Axboe



Thread overview: 30+ messages
2008-04-23 19:08 [RFC][PATCH 0/3] Skip I/O merges when disabled Alan D. Brunelle
2008-04-23 19:12 ` [RFC][PATCH 1/3] Add flag and sysfs interfaces Alan D. Brunelle
2008-04-23 19:14 ` [RFC][PATCH 2/3] Have __make_request skip merges when disabled Alan D. Brunelle
2008-04-23 19:15 ` [RFC][PATCH 3/3] Do not use rqhash when merges disabled Alan D. Brunelle
2008-04-24  0:37   ` Aaron Carroll
2008-04-24  0:59     ` Alan D. Brunelle
2008-04-24  2:07       ` Aaron Carroll
2008-04-24  7:09 ` Jens Axboe [this message]
2008-04-24 12:09   ` [RFC][PATCH 0/3] Skip I/O merges when disabled Alan D. Brunelle
2008-04-25  8:38     ` Jens Axboe
2008-04-25 11:17       ` Alan D. Brunelle
2008-04-25 11:25         ` Jens Axboe
2008-04-25 12:06           ` Aaron Carroll
2008-04-25 12:14             ` Jens Axboe
2008-04-25 12:17         ` Alan D. Brunelle
2008-04-28 16:36           ` Alan D. Brunelle
2008-04-29  7:37             ` Jens Axboe
2008-04-24 20:38   ` Alan D. Brunelle
2008-04-24 13:29 ` Andi Kleen
2008-04-24 13:59   ` Jens Axboe
2008-04-24 14:13     ` Alan D. Brunelle
2008-04-24 15:05       ` Jens Axboe
2008-04-24 22:04       ` Carl Henrik Lunde
2008-04-25  7:13       ` Andi Kleen
2008-04-24 14:15     ` Andi Kleen
2008-04-24 15:04       ` Jens Axboe
2008-04-24 15:53         ` David Collier-Brown
2008-04-24 16:29           ` Alan D. Brunelle
2008-04-24 13:31 ` Alan D. Brunelle
2008-04-24 13:43   ` Alan D. Brunelle
