Re: [RFC][PATCH 0/3] Skip I/O merges when disabled


From: "Alan D. Brunelle" <Alan.Brunelle@hp.com>
To: linux-kernel@vger.kernel.org
Cc: Jens Axboe <jens.axboe@oracle.com>
Subject: Re: [RFC][PATCH 0/3] Skip I/O merges when disabled
Date: Thu, 24 Apr 2008 08:09:53 -0400
Message-ID: <48107891.5000308@hp.com>
In-Reply-To: <20080424070923.GQ12774@kernel.dk>

Jens Axboe wrote:
> On Wed, Apr 23 2008, Alan D. Brunelle wrote:
>> The block I/O + elevator + I/O scheduler code spends a lot of time
>> trying to merge I/Os -- rightfully so under "normal" circumstances.
>> However, if one were to know that the incoming I/O stream was /very/
>> random in nature, the cycles are wasted. (This can be the case, for
>> example, during OLTP-type runs.)
>>
>> This patch stream adds a per-request_queue tunable that (when set)
>> disables merge attempts, thus freeing up a non-trivial amount of CPU cycles.
>>
>> I'll be doing some more benchmarking, but this is a representative set
>> of data on a two-way Opteron box w/ 4 SATA drives. 'fio' was used to
>> generate random 4k asynchronous direct I/Os over the 128GiB of each SATA
>> drive.  Oprofile was used to collect the results, and we collected
>> CPU_CLK_UNHALTED (CPU) and DATA_CACHE_MISSES (DCM) events. The data
>> extracted below shows both the percentage for all samples (including
>> non-kernel) as well as just those from the block I/O layer + elevator +
>> deadline I/O scheduler + SATA modules.
>>
>> v2.6.25 (not patched):  CPU: 5.8330% (total)  7.5644% (I/O code only)
>> v2.6.25 + nomerges = 0: CPU: 5.8008% (total)  7.5806% (I/O code only)
>> v2.6.25 + nomerges = 1: CPU: 4.5404% (total)  5.9416% (I/O code only)
>>
>> v2.6.25 (not patched):  DCM: 8.1967% (total) 10.5188% (I/O code only)
>> v2.6.25 + nomerges = 0: DCM: 7.2291% (total)  9.4087% (I/O code only)
>> v2.6.25 + nomerges = 1: DCM: 6.1989% (total)  8.0155% (I/O code only)
>>
>> I've typically been seeing a good 20-25% reduction in CPU samples, and
>> 10-15% in DCM samples for the random load w/ nomerges set to 1 compared
>> to set to 0 (looking at just the block code).
>>
>> [BTW: The I/O performance doesn't change much between the 3 sets of data
>> - the seek + I/O times themselves dominate things to such a large
>> extent.  There is a very small improvement seen w/ nomerges=1, but <<1%.]
>>
>> It's not clear to me why 2.6.25 (not patched) requires /more/ cycles
>> than does the patched kernel w/ nomerges=0 -- it's been consistent in
>> the handful of runs I've done. I'm going to do a large set of runs for
>> each condition (not patched, nomerges=0 & nomerges=1) to verify that
>> this holds over multiple runs. I'm also going to check out sequential
>> loads to see what (if any) penalty the extra couple of checks incurs on
>> those (probably not noticeable).
>>
>> The first patch in the series adds the tunable; the second adds the
>> check to skip the merge code; and the third adds the check to skip
>> adding requests to hash lists for merging.
> 
> The functionality is fine with me, merging is obviously a non-zero
> amount of cycles spent on IO and if you know it's in vain, may as well
> turn it off. One suggestion, though - if you add this as a performance
> rather than functionality change, I would suggest keeping the one-hit
> cache merge as that is essentially free. Better than free actually,
> since if you hit that merge point you'll be spending way less cycles
> than allocating+setting up a new request.
> 

Hi Jens -

I'll look into retaining the one-hit cache merge functionality, remove
the errant elv_rqhash_del code, and repost w/ the results from the other
tests I've run.
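
To make the suggestion concrete, here is a toy model of the decision order being discussed: try the one-hit cache first, consult the full merge lookup only when merging is enabled, and otherwise allocate a new request. This is a sketch only, not the kernel code. ToyQueue, Request, by_end and submit are names made up for illustration, only back merges are modeled, and a plain dict stands in for the request hash.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Request:
    start: int  # first sector
    nsect: int  # length in sectors

    @property
    def end(self) -> int:
        return self.start + self.nsect


@dataclass
class ToyQueue:
    nomerges: bool = False                   # hypothetical stand-in for the tunable
    last_merge: Optional[Request] = None     # the "one-hit cache"
    by_end: Dict[int, Request] = field(default_factory=dict)  # stand-in for the merge hash
    requests: List[Request] = field(default_factory=list)

    def _extend(self, rq: Request, nsect: int) -> None:
        # Re-key the request under its new end sector (only tracked when merging is on).
        self.by_end.pop(rq.end, None)
        rq.nsect += nsect
        if not self.nomerges:
            self.by_end[rq.end] = rq

    def submit(self, start: int, nsect: int) -> str:
        # 1. One-hit cache: a single comparison, kept even when nomerges is set.
        rq = self.last_merge
        if rq is not None and rq.end == start:
            self._extend(rq, nsect)
            return "cache-merge"
        # 2. Full lookup: skipped entirely when nomerges is set.
        if not self.nomerges:
            rq = self.by_end.get(start)
            if rq is not None:
                self._extend(rq, nsect)
                self.last_merge = rq
                return "hash-merge"
        # 3. No merge candidate: set up a new request.
        rq = Request(start, nsect)
        self.requests.append(rq)
        if not self.nomerges:
            self.by_end[rq.end] = rq  # the hash insert is also skipped when disabled
        self.last_merge = rq
        return "new"


if __name__ == "__main__":
    q = ToyQueue(nomerges=True)
    for start in (0, 8, 100, 16):
        print(start, q.submit(start, 8))
```

In this model a sequential follow-on I/O still merges via the cache with nomerges set, while anything that would need the lookup becomes a new request.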

Thanks,
Alan
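
For reference, the relative reductions implied by the quoted table (nomerges=1 versus nomerges=0) can be checked with a few lines of Python. The numbers are copied from the table above; the helper name reduction_pct is just for illustration.

```python
def reduction_pct(before: float, after: float) -> float:
    """Relative reduction of `after` versus `before`, in percent."""
    return (before - after) / before * 100.0


# Sample percentages from the quoted table: (nomerges=0, nomerges=1)
samples = {
    "CPU total":         (5.8008, 4.5404),
    "CPU I/O code only": (7.5806, 5.9416),
    "DCM total":         (7.2291, 6.1989),
    "DCM I/O code only": (9.4087, 8.0155),
}

for name, (off, on) in samples.items():
    print(f"{name}: {reduction_pct(off, on):.1f}% fewer samples")
```

For the I/O-code-only columns this works out to roughly 21.6% for CPU and 14.8% for DCM, in line with the 20-25% and 10-15% ranges mentioned in the quoted text.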

Thread overview: 30+ messages
2008-04-23 19:08 [RFC][PATCH 0/3] Skip I/O merges when disabled Alan D. Brunelle
2008-04-23 19:12 ` [RFC][PATCH 1/3] Add flag and sysfs interfaces Alan D. Brunelle
2008-04-23 19:14 ` [RFC][PATCH 2/3] Have __make_request skip merges when disabled Alan D. Brunelle
2008-04-23 19:15 ` [RFC][PATCH 3/3] Do not use rqhash when merges disabled Alan D. Brunelle
2008-04-24  0:37   ` Aaron Carroll
2008-04-24  0:59     ` Alan D. Brunelle
2008-04-24  2:07       ` Aaron Carroll
2008-04-24  7:09 ` [RFC][PATCH 0/3] Skip I/O merges when disabled Jens Axboe
2008-04-24 12:09   ` Alan D. Brunelle [this message]
2008-04-25  8:38     ` Jens Axboe
2008-04-25 11:17       ` Alan D. Brunelle
2008-04-25 11:25         ` Jens Axboe
2008-04-25 12:06           ` Aaron Carroll
2008-04-25 12:14             ` Jens Axboe
2008-04-25 12:17         ` Alan D. Brunelle
2008-04-28 16:36           ` Alan D. Brunelle
2008-04-29  7:37             ` Jens Axboe
2008-04-24 20:38   ` Alan D. Brunelle
2008-04-24 13:29 ` Andi Kleen
2008-04-24 13:59   ` Jens Axboe
2008-04-24 14:13     ` Alan D. Brunelle
2008-04-24 15:05       ` Jens Axboe
2008-04-24 22:04       ` Carl Henrik Lunde
2008-04-25  7:13       ` Andi Kleen
2008-04-24 14:15     ` Andi Kleen
2008-04-24 15:04       ` Jens Axboe
2008-04-24 15:53         ` David Collier-Brown
2008-04-24 16:29           ` Alan D. Brunelle
2008-04-24 13:31 ` Alan D. Brunelle
2008-04-24 13:43   ` Alan D. Brunelle
