From: Chris Mason <mason@suse.com>
To: Nick Piggin <piggin@cyberone.com.au>
Cc: Andrea Arcangeli <andrea@suse.de>,
    Marc-Christian Petersen <m.c.p@wolk-project.de>,
    Jens Axboe <axboe@suse.de>,
    Marcelo Tosatti <marcelo@conectiva.com.br>,
    Georg Nikodym <georgn@somanetworks.com>,
    lkml <linux-kernel@vger.kernel.org>,
    Matthias Mueller <matthias.mueller@rz.uni-karlsruhe.de>
Subject: Re: [PATCH] io stalls
Date: 12 Jun 2003 12:06:43 -0400
Message-ID: <1055434002.23697.431.camel@tiny.suse.com>
In-Reply-To: <3EE7F18C.3010502@cyberone.com.au>
[-- Attachment #1: Type: text/plain, Size: 1782 bytes --]
On Wed, 2003-06-11 at 23:20, Nick Piggin wrote:
>
> I think the cpu utilization gain of waking a number of tasks
> at once would be outweighed by advantage of waking 1 task
> and not putting it to sleep again for a number of requests.
> You obviously are not claiming concurrency improvements, as
> your method would also increase contention on the io lock
> (or the queue lock in 2.5).
I've been trying variations on this for a few days. None have been
thrilling, but the end result is better dbench and iozone throughput
overall. For the 20 writer iozone test, rc7 got an average throughput
of 3MB/s, and yesterday's latency patch got 500k/s or so. Ouch.
This patch gets us up to 1.2MB/s. I'm keeping yesterday's
get_request_wait_wakeup, which wakes up a waiter instead of unplugging.
The basic idea here is that after a process is woken up and grabs a
request, it becomes the batch owner. Batch owners get to ignore the
q->full flag for either 1/5 second or 32 requests, whichever comes
first. The timer part is an attempt at preventing memory pressure
writers (who go 1 req at a time) from holding onto batch ownership for
too long.
Latency stats after dbench 50:
device 08:01: num_req 120077, total jiffies waited 663231
65538 forced to wait
1 min wait, 175 max wait
10 average wait
65296 < 100, 242 < 200, 0 < 300, 0 < 400, 0 < 500
0 waits longer than 500 jiffies
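As a quick sanity check on those numbers (a sketch, not part of the patch): the average should be total jiffies waited divided by the number of requests forced to wait, and the histogram buckets should add up to that same count. The helper names below are made up for illustration, and the HZ value of 100 used for the millisecond conversion is an assumption (the usual 2.4 x86 setting), not something stated in this mail.

```c
#include <assert.h>
#include <stddef.h>

/* Figures copied from the dbench 50 latency stats above. */
static const unsigned long total_jiffies_waited = 663231;
static const unsigned long forced_to_wait = 65538;
static const unsigned long buckets[] = { 65296, 242, 0, 0, 0 };

/* ASSUMPTION: HZ is 100; the mail does not say which HZ was used. */
#define ASSUMED_HZ 100

/* Sum of the "< 100, < 200, ..." histogram buckets. */
static unsigned long bucket_total(void)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < sizeof(buckets) / sizeof(buckets[0]); i++)
		sum += buckets[i];
	return sum;
}

/* Average wait in whole jiffies, truncated like integer division. */
static unsigned long average_wait_jiffies(void)
{
	assert(forced_to_wait != 0);
	return total_jiffies_waited / forced_to_wait;
}

/* Convert jiffies to milliseconds under the assumed HZ. */
static unsigned long jiffies_to_ms(unsigned long j)
{
	return j * 1000 / ASSUMED_HZ;
}
```

Under that HZ assumption, the 175 jiffy maximum would be 1.75 seconds and the 10 jiffy average would be about 100ms.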
Good system-wide latency comes from fair waiting, but it also comes from
how fast we can run write_some_buffers(), since that is the unit of
throttling. Hopefully this patch decreases the time write_some_buffers()
takes compared with the past latency patches, or gives someone else
a better idea ;-)
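For readers who want to poke at the batch-owner rules without a kernel tree, here is a minimal userspace model of them. It is only a sketch: struct batch_state, the model_* function names and the explicit "now" argument standing in for jiffies are all hypothetical, HZ is assumed to be 100, and the batch size is hard-coded to the 32 requests mentioned in the text rather than read from q->batch_requests. Locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define ASSUMED_HZ	100			/* assumption: not stated in the mail */
#define BATCH_JIFFIES	(ASSUMED_HZ / 5)	/* 1/5 second, as in the patch */
#define BATCH_REQUESTS	32			/* "32 requests" from the mail text */

/* One direction (read or write) of a queue; hypothetical stand-in. */
struct batch_state {
	const void *owner;	/* stands in for struct task_struct * */
	int remaining;		/* requests left in the current batch */
	unsigned long start;	/* "jiffies" when ownership was taken */
};

/* Models set_batch_owner(): a task that slept and then got a request. */
static void model_set_owner(struct batch_state *b, const void *task,
			    unsigned long now)
{
	if (b->owner == task)
		return;
	/* Someone else owns the batch and their time window is still open. */
	if (b->owner != NULL && now - b->start < BATCH_JIFFIES)
		return;
	b->start = now;
	b->owner = task;
	b->remaining = BATCH_REQUESTS;
}

/* Models check_batch_owner(): charge one request to the owner's batch. */
static void model_charge_request(struct batch_state *b, const void *task,
				 unsigned long now)
{
	if (b->owner != task)
		return;
	if (--b->remaining > 0 && now - b->start < BATCH_JIFFIES)
		return;
	/* Batch used up or timed out: give up ownership. */
	b->owner = NULL;
}

/* Models the test in get_request(): a full queue only blocks non-owners. */
static int model_may_allocate(const struct batch_state *b, const void *task,
			      int queue_full)
{
	return !queue_full || b->owner == task;
}
```

In this model an owner keeps its privilege for 32 charged requests or 20 assumed jiffies, whichever runs out first, and a second task cannot take over ownership until the first owner's window has expired or been given up.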
Attached is an incremental over yesterday's io-stalls-5.diff.
-chris
[-- Attachment #2: io-stalls-6-inc.diff --]
[-- Type: text/plain, Size: 3421 bytes --]
diff -u edited/drivers/block/ll_rw_blk.c edited/drivers/block/ll_rw_blk.c
--- edited/drivers/block/ll_rw_blk.c Wed Jun 11 13:36:10 2003
+++ edited/drivers/block/ll_rw_blk.c Thu Jun 12 11:53:03 2003
@@ -437,6 +437,12 @@
 	nr_requests = 128;
 	if (megs < 32)
 		nr_requests /= 2;
+	q->batch_owner[0] = NULL;
+	q->batch_owner[1] = NULL;
+	q->batch_remaining[0] = 0;
+	q->batch_remaining[1] = 0;
+	q->batch_jiffies[0] = 0;
+	q->batch_jiffies[1] = 0;
 	blk_grow_request_list(q, nr_requests);
 	init_waitqueue_head(&q->wait_for_requests[0]);
@@ -558,6 +564,31 @@
 	blk_queue_bounce_limit(q, BLK_BOUNCE_HIGH);
 }
+#define BATCH_JIFFIES (HZ/5)
+static void check_batch_owner(request_queue_t *q, int rw)
+{
+	if (q->batch_owner[rw] != current)
+		return;
+	if (--q->batch_remaining[rw] > 0 &&
+	    jiffies - q->batch_jiffies[rw] < BATCH_JIFFIES) {
+		return;
+	}
+	q->batch_owner[rw] = NULL;
+}
+
+static void set_batch_owner(request_queue_t *q, int rw)
+{
+	struct task_struct *tsk = current;
+	if (q->batch_owner[rw] == tsk)
+		return;
+	if (q->batch_owner[rw] &&
+	    jiffies - q->batch_jiffies[rw] < BATCH_JIFFIES)
+		return;
+	q->batch_jiffies[rw] = jiffies;
+	q->batch_owner[rw] = current;
+	q->batch_remaining[rw] = q->batch_requests;
+}
+
 #define blkdev_free_rq(list) list_entry((list)->next, struct request, queue);
 /*
  * Get a free request. io_request_lock must be held and interrupts
@@ -587,9 +618,13 @@
  */
 static inline struct request *get_request(request_queue_t *q, int rw)
 {
-	if (queue_full(q, rw))
+	struct request *rq;
+	if (queue_full(q, rw) && q->batch_owner[rw] != current)
 		return NULL;
-	return __get_request(q, rw);
+	rq = __get_request(q, rw);
+	if (rq)
+		check_batch_owner(q, rw);
+	return rq;
 }
 /*
@@ -657,9 +692,9 @@
 	add_wait_queue_exclusive(&q->wait_for_requests[rw], &wait);
+	spin_lock_irq(&io_request_lock);
 	do {
 		set_current_state(TASK_UNINTERRUPTIBLE);
-		spin_lock_irq(&io_request_lock);
 		if (queue_full(q, rw) || q->rq[rw].count == 0) {
 			if (q->rq[rw].count == 0)
 				__generic_unplug_device(q);
@@ -668,8 +703,9 @@
 			spin_lock_irq(&io_request_lock);
 		}
 		rq = __get_request(q, rw);
-		spin_unlock_irq(&io_request_lock);
 	} while (rq == NULL);
+	set_batch_owner(q, rw);
+	spin_unlock_irq(&io_request_lock);
 	remove_wait_queue(&q->wait_for_requests[rw], &wait);
 	current->state = TASK_RUNNING;
@@ -1010,6 +1046,7 @@
 	struct list_head *head, *insert_here;
 	int latency;
 	elevator_t *elevator = &q->elevator;
+	int need_unplug = 0;
 	count = bh->b_size >> 9;
 	sector = bh->b_rsector;
@@ -1145,8 +1182,8 @@
 			spin_unlock_irq(&io_request_lock);
 			freereq = __get_request_wait(q, rw);
 			head = &q->queue_head;
+			need_unplug = 1;
 			spin_lock_irq(&io_request_lock);
-			get_request_wait_wakeup(q, rw);
 			goto again;
 		}
 	}
@@ -1174,6 +1211,8 @@
 out:
 	if (freereq)
 		blkdev_release_request(freereq);
+	if (need_unplug)
+		get_request_wait_wakeup(q, rw);
 	spin_unlock_irq(&io_request_lock);
 	return 0;
 end_io:
diff -u edited/include/linux/blkdev.h edited/include/linux/blkdev.h
--- edited/include/linux/blkdev.h Wed Jun 11 09:56:55 2003
+++ edited/include/linux/blkdev.h Thu Jun 12 09:44:26 2003
@@ -92,6 +92,10 @@
 	 */
 	int batch_requests;
+	struct task_struct *batch_owner[2];
+	int batch_remaining[2];
+	unsigned long batch_jiffies[2];
+
 	/*
 	 * Together with queue_head for cacheline sharing
 	 */