From: Suparna Bhattacharya <suparna@in.ibm.com>
To: linux-aio@kvack.org, linux-kernel@vger.kernel.org
Cc: linux-osdl@osdl.org
Subject: Re: [PATCH 11/22] Reduce AIO worker context switches
Date: Fri, 2 Jul 2004 18:56:18 +0530 [thread overview]
Message-ID: <20040702132618.GK4374@in.ibm.com> (raw)
In-Reply-To: <20040702130030.GA4256@in.ibm.com>
On Fri, Jul 02, 2004 at 06:30:30PM +0530, Suparna Bhattacharya wrote:
> The patchset contains modifications and fixes to the AIO core
> to support the full retry model, an implementation of AIO
> support for buffered filesystem AIO reads and O_SYNC writes
> (the latter courtesy O_SYNC speedup changes from Andrew Morton),
> an implementation of AIO reads and writes to pipes (from
> Chris Mason) and AIO poll (again from Chris Mason).
>
> Full retry infrastructure and fixes
> [1] aio-retry.patch
> [2] 4g4g-aio-hang-fix.patch
> [3] aio-retry-elevated-refcount.patch
> [4] aio-splice-runlist.patch
>
> FS AIO read
> [5] aio-wait-page.patch
> [6] aio-fs_read.patch
> [7] aio-upfront-readahead.patch
>
> AIO for pipes
> [8] aio-cancel-fix.patch
> [9] aio-read-immediate.patch
> [10] aio-pipe.patch
> [11] aio-context-switch.patch
>
Should also apply aio-context-stall (later in the patchset)
--
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Lab, India
-----------------------------------------------
From: Chris Mason
I compared the 2.6 pipetest results with the 2.4 suse kernel, and 2.6
was roughly 40% slower. During the pipetest run, 2.6 generates ~600,000
context switches per second while 2.4 generates 30 or so.
aio-context-switch (attached) has a few changes that reduces our context
switch rate, and bring performance back up to 2.4 levels. These have
only really been tested against pipetest, they might make other
workloads worse.
The basic theory behind the patch is that it is better for the userland
process to call run_iocbs than it is to schedule away and let the worker
thread do it.
1) on io_submit, use run_iocbs instead of run_iocb
2) on io_getevents, call run_iocbs if no events were available.
3) don't let two procs call run_iocbs for the same context at the same
time. They just end up bouncing on spinlocks.
The first three optimizations got me down to 360,000 context switches
per second, and they help build a little structure to allow optimization
#4, which uses queue_delayed_work(HZ/10) instead of queue_work.
That brings down the number of context switches to 2.4 levels.
On Tue, 2004-02-24 at 13:32, Suparna Bhattacharya wrote:
> On more thought ...
> The aio-splice-runlist patch runs counter-purpose to some of
> your optimizations. I put that one in to avoid starvation when
> multiple ioctx's are in use. But it means that ctx->running
> doesn't ensure that it will process the new request we just put on
> the run-list.
The ctx->running optimization probably isn't critical. It should be
enough to call run_iocbs from io_submit_one and getevents, which will
help make sure the process does its own retries whenever possible.
Doing the run_iocbs from getevents is what makes the queue_delayed_work
possible, since someone waiting on an event won't have to wait the extra
HZ/10 for the worker thread to schedule in.
Wow, 15% slower with ctx->running removed, but the number of context
switches is stays nice and low. We can play with ctx->running
variations later, here's a patch without them. It should be easier to
apply with the rest of your code.
aio.c | 18 ++++++++++++------
1 files changed, 12 insertions(+), 6 deletions(-)
--- aio/fs/aio.c 2004-06-18 09:48:00.712363848 -0700
+++ aio-context-switch.patch/fs/aio.c 2004-06-18 10:01:52.963842392 -0700
@@ -851,7 +851,7 @@ void queue_kicked_iocb(struct kiocb *ioc
run = __queue_kicked_iocb(iocb);
spin_unlock_irqrestore(&ctx->ctx_lock, flags);
if (run) {
- queue_work(aio_wq, &ctx->wq);
+ queue_delayed_work(aio_wq, &ctx->wq, HZ/10);
aio_wakeups++;
}
}
@@ -1079,13 +1079,14 @@ static int read_events(struct kioctx *ct
struct io_event ent;
struct timeout to;
int event_loop = 0; /* testing only */
+ int retry = 0;
/* needed to zero any padding within an entry (there shouldn't be
* any, but C is fun!
*/
memset(&ent, 0, sizeof(ent));
+retry:
ret = 0;
-
while (likely(i < nr)) {
ret = aio_read_evt(ctx, &ent);
if (unlikely(ret <= 0))
@@ -1114,6 +1115,13 @@ static int read_events(struct kioctx *ct
/* End fast path */
+ /* racey check, but it gets redone */
+ if (!retry && unlikely(!list_empty(&ctx->run_list))) {
+ retry = 1;
+ aio_run_iocbs(ctx);
+ goto retry;
+ }
+
init_timeout(&to);
if (timeout) {
struct timespec ts;
@@ -1505,11 +1513,9 @@ int fastcall io_submit_one(struct kioctx
goto out_put_req;
spin_lock_irq(&ctx->ctx_lock);
- ret = aio_run_iocb(req);
+ list_add_tail(&req->ki_run_list, &ctx->run_list);
+ __aio_run_iocbs(ctx);
spin_unlock_irq(&ctx->ctx_lock);
-
- if (-EIOCBRETRY == ret)
- queue_work(aio_wq, &ctx->wq);
aio_put_req(req); /* drop extra ref to req */
return 0;
next prev parent reply other threads:[~2004-07-02 13:17 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-07-02 13:00 [PATCH 0/22] fsaio, pipe aio and aio poll upgraded to 2.6.7 Suparna Bhattacharya
2004-07-02 13:07 ` [PATCH 1/22] High-level AIO retry infrastructure and fixes Suparna Bhattacharya
2004-07-02 13:11 ` [PATCH 2/22] use_mm fix (helps AIO hangs on 4:4 split) Suparna Bhattacharya
2004-07-02 13:14 ` [PATCH 3/22] Refcounting fixes Suparna Bhattacharya
2004-07-02 13:15 ` [PATCH 4/22] Splice ioctx runlist for fairness Suparna Bhattacharya
2004-07-02 13:16 ` [PATCH 5/22] AIO wait on page support Suparna Bhattacharya
2004-07-02 13:18 ` [PATCH 6/22] FS AIO read Suparna Bhattacharya
2004-07-02 13:19 ` [PATCH 7/22] Upfront readahead to help streaming AIO reads Suparna Bhattacharya
2004-07-02 13:20 ` [PATCH 8/22] AIO cancellation fix Suparna Bhattacharya
2004-07-02 13:23 ` [PATCH 9/22] AIO immediate read (needed for AIO pipes & sockets) Suparna Bhattacharya
2004-07-02 13:23 ` [PATCH 10/22] AIO pipe support Suparna Bhattacharya
2004-07-02 13:26 ` Suparna Bhattacharya [this message]
2004-07-02 16:05 ` [PATCH 12/22] Writeback page range hint Suparna Bhattacharya
2004-07-02 16:18 ` [PATCH 13/22] Fix writeback page range to use exact limits Suparna Bhattacharya
2004-07-02 16:22 ` [PATCH 14/22] mpage writepages range limit fix Suparna Bhattacharya
2004-07-02 16:25 ` [PATCH 15/22] filemap_fdatawrite range interface Suparna Bhattacharya
2004-07-02 16:27 ` [PATCH 16/22] Concurrent O_SYNC write support Suparna Bhattacharya
2004-07-02 16:31 ` [PATCH 17/22] AIO wait on writeback Suparna Bhattacharya
2004-07-02 16:33 ` [PATCH 18/22] AIO O_SYNC write Suparna Bhattacharya
2004-07-02 16:34 ` [PATCH 19/22] Fix math error in AIO wait on writeback Suparna Bhattacharya
2004-07-02 16:39 ` [PATCH 20/22] AIO poll Suparna Bhattacharya
2004-07-29 15:19 ` Jeff Moyer
2004-07-29 16:02 ` Avi Kivity
2004-07-29 16:16 ` Arjan van de Ven
2004-07-29 16:37 ` Benjamin LaHaise
2004-07-29 17:23 ` William Lee Irwin III
2004-07-29 17:10 ` William Lee Irwin III
2004-07-29 17:24 ` Avi Kivity
2004-07-29 17:26 ` William Lee Irwin III
2004-07-29 17:30 ` Avi Kivity
2004-07-29 17:32 ` William Lee Irwin III
2004-07-02 16:42 ` [PATCH 21/22] fix: flush workqueue on put_ioctx Suparna Bhattacharya
2004-07-02 16:44 ` [PATCH 22/22] Fix stalls with the AIO context switch patch Suparna Bhattacharya
2004-07-05 9:24 ` [PATCH 0/22] fsaio, pipe aio and aio poll upgraded to 2.6.7 Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20040702132618.GK4374@in.ibm.com \
--to=suparna@in.ibm.com \
--cc=linux-aio@kvack.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-osdl@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox