From: "Alan D. Brunelle" <Alan.Brunelle@hp.com>
To: linux-kernel@vger.kernel.org, Jens Axboe <jens.axboe@oracle.com>
Subject: [RFC][PATCH 0/3] Skip I/O merges when disabled
Date: Wed, 23 Apr 2008 15:08:38 -0400
Message-ID: <480F8936.5030406@hp.com>

The block I/O + elevator + I/O scheduler code spends a lot of time
trying to merge I/Os -- rightfully so under "normal" circumstances.
However, if the incoming I/O stream is known to be /very/ random in
nature, those cycles are wasted. (This can be the case, for example,
during OLTP-type runs.)
This patch series adds a per-request_queue tunable that (when set)
disables merge attempts, thus freeing up a non-trivial number of CPU cycles.
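As a rough illustration of how such a tunable might be toggled from user space, here is a minimal sketch. It assumes the attribute is exposed as /sys/block/<dev>/queue/nomerges, which is where it lives in later mainline kernels; this message does not spell out the path, and the helper names set_nomerges and get_nomerges are hypothetical.

```python
from pathlib import Path


def nomerges_path(device: str, sysfs_root: str = "/sys/block") -> Path:
    # ASSUMPTION: the tunable is exposed as <sysfs_root>/<dev>/queue/nomerges.
    return Path(sysfs_root) / device / "queue" / "nomerges"


def set_nomerges(device: str, enabled: bool, sysfs_root: str = "/sys/block") -> None:
    """Write 1 (skip merge attempts) or 0 (normal merging) to the attribute."""
    nomerges_path(device, sysfs_root).write_text("1\n" if enabled else "0\n")


def get_nomerges(device: str, sysfs_root: str = "/sys/block") -> int:
    """Read the current value of the attribute back as an integer."""
    return int(nomerges_path(device, sysfs_root).read_text().strip())
```

Writing to the real sysfs tree needs root, e.g. set_nomerges("sda", True) before starting a random-I/O run; the sysfs_root parameter exists only so the helpers can be exercised against a scratch directory.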
I'll be doing some more benchmarking, but this is a representative set
of data on a two-way Opteron box w/ 4 SATA drives. 'fio' was used to
generate random 4k asynchronous direct I/Os over the 128GiB of each SATA
drive. Oprofile was used to collect the results, and we collected
CPU_CLK_UNHALTED (CPU) and DATA_CACHE_MISSES (DCM) events. The data
extracted below shows both the percentage of all samples (including
non-kernel) and the percentage of just those from the block I/O layer +
elevator + deadline I/O scheduler + SATA modules.
v2.6.25 (not patched):   CPU: 5.8330% (total)   7.5644% (I/O code only)
v2.6.25 + nomerges = 0:  CPU: 5.8008% (total)   7.5806% (I/O code only)
v2.6.25 + nomerges = 1:  CPU: 4.5404% (total)   5.9416% (I/O code only)

v2.6.25 (not patched):   DCM: 8.1967% (total)  10.5188% (I/O code only)
v2.6.25 + nomerges = 0:  DCM: 7.2291% (total)   9.4087% (I/O code only)
v2.6.25 + nomerges = 1:  DCM: 6.1989% (total)   8.0155% (I/O code only)
I've typically been seeing a good 20-25% reduction in CPU samples, and
a 10-15% reduction in DCM samples, for the random load w/ nomerges set
to 1 compared to nomerges set to 0 (looking at just the block code).
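For reference, the relative reductions implied by the nomerges = 0 and nomerges = 1 rows can be worked out as below. This is only a sanity check on the figures quoted in this message; it treats the reported sample percentages as a stand-in for sample counts, which is an assumption.

```python
# Percentages copied from the oprofile figures in this message:
# (share of all samples, share attributed to the I/O code only)
SAMPLES = {
    "CPU": {"nomerges=0": (5.8008, 7.5806), "nomerges=1": (4.5404, 5.9416)},
    "DCM": {"nomerges=0": (7.2291, 9.4087), "nomerges=1": (6.1989, 8.0155)},
}


def relative_reduction(before: float, after: float) -> float:
    """Percentage drop going from `before` to `after`."""
    return 100.0 * (before - after) / before


def reductions(event: str) -> tuple:
    """Return (total, I/O-code-only) reductions for nomerges=1 vs nomerges=0."""
    off = SAMPLES[event]["nomerges=0"]
    on = SAMPLES[event]["nomerges=1"]
    return tuple(relative_reduction(b, a) for b, a in zip(off, on))


if __name__ == "__main__":
    for event in SAMPLES:
        total, io_only = reductions(event)
        print(f"{event}: {total:.1f}% (total), {io_only:.1f}% (I/O code only)")
```

For the I/O code alone this gives roughly 21.6% for CPU and 14.8% for DCM, in line with the 20-25% and 10-15% ranges quoted above.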
[BTW: The I/O performance doesn't change much between the 3 sets of data
- the seek + I/O times themselves dominate things to such a large
extent. There is a very small improvement seen w/ nomerges=1, but <<1%.]
It's not clear to me why 2.6.25 (not patched) requires /more/ cycles
than does the patched kernel w/ nomerges=0 -- it's been consistent in
the handful of runs I've done. I'm going to do a large set of runs for
each condition (not patched, nomerges=0 & nomerges=1) to verify that
this holds over multiple runs. I'm also going to check out sequential
loads to see what (if any) penalty the extra couple of checks incurs on
those (probably not noticeable).
The first patch in the series adds the tunable; the second adds the
check to skip the merge code; and the third adds the check to skip
adding requests to hash lists for merging.
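To make the three steps concrete, here is a toy user-space model of the idea. It is not the kernel code from the patches: ToyQueue, its fields, and the back-merge-by-end-sector lookup are all simplifications assumed for illustration.

```python
class ToyQueue:
    """Toy request queue with an optional 'nomerges' flag (illustrative only)."""

    def __init__(self, nomerges: bool = False):
        self.nomerges = nomerges   # step 1: the per-queue tunable
        self.requests = []         # pending requests as [start, length]
        self.by_end = {}           # end sector -> request, used only for merging
        self.merge_lookups = 0     # how many merge attempts were made

    def submit(self, start: int, length: int) -> list:
        if not self.nomerges:
            # Step 2: only attempt a (back) merge when merging is enabled.
            self.merge_lookups += 1
            rq = self.by_end.pop(start, None)
            if rq is not None:
                rq[1] += length                    # grow the existing request
                self.by_end[rq[0] + rq[1]] = rq    # re-index under its new end
                return rq
        rq = [start, length]
        self.requests.append(rq)
        if not self.nomerges:
            # Step 3: only maintain the merge hash when merging is enabled.
            self.by_end[start + length] = rq
        return rq
```

With the flag clear, submitting (0, 8) and then (8, 8) leaves a single request covering 16 sectors; with the flag set, both requests are queued as-is and neither a lookup nor a hash insertion takes place.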
Alan D. Brunelle
Hewlett-Packard