From: Andrew Morton <akpm@digeo.com>
To: Andrea Arcangeli <andrea@suse.de>
Cc: Linus Torvalds <torvalds@transmeta.com>,
Marcelo Tosatti <marcelo@conectiva.com.br>,
linux-kernel@vger.kernel.org
Subject: Re: 2.[45] fixes for design locking bug in wait_on_page/wait_on_buffer/get_request_wait
Date: Tue, 12 Nov 2002 00:50:46 -0800 [thread overview]
Message-ID: <3DD0C0E6.4A8035A4@digeo.com> (raw)
In-Reply-To: 20021112035723.GA17642@dualathlon.random
Andrea Arcangeli wrote:
>
> the race looks like this:
>
> CPU0 CPU1
> ----------------- ------------------------
> reiserfs_writepage
> lock_buffer()
> fsync_buffers_list() under lock_super()
> wait_on_buffer()
> run_task_queue(&tq_disk) -> noop
> schedule() <- hang with lock_super acquired
> submit_bh()
> /* don't unplug here */
>
Or, more simply:
lock_buffer()
while (buffer_locked()) {
blk_run_queues(); /* Nothing happens */
if (buffer_locked(bh))
schedule();
submit_bh(); /* No unplug */
The fix seems reasonable to me. It would perhaps be better to just do:
+ if (waitqueue_active(wqh))
+ blk_run_queues();
in submit_bh(). To save the context switch.
Moving the blk_run_queues() inside the TASK_UNINTERRUPTIBLE region
is something which always worried me, because if something down
there sets TASK_RUNNING, we end up in a busy wait. But that's OK
for 2.5 and may be OK for 2.4's run_task_queue() - I haven't checked...
The multipage stuff in 2.5 does its own blk_run_queues() and looks to be
OK, which I assume is why you didn't touch that.
The little single-page reads like do_generic_mapping_read() look to be
OK because the process whcih waits is the one which submitted the IO.
wrt the get_request_wait changes: I never bothered about the barrier
because we know that there are tons of requests outstanding, and if
we don't do a wakeup the next guy will. Plus *this* request has to
be put back sometime too, which will deliver a wakeup. But whatever;
it's not exactly a fastpath.
However the function is still not watertight:
static struct request *get_request_wait(request_queue_t *q, int rw)
{
DEFINE_WAIT(wait);
struct request_list *rl = &q->rq[rw];
struct request *rq;
spin_lock_prefetch(q->queue_lock);
generic_unplug_device(q);
do {
prepare_to_wait_exclusive(&rl->wait, &wait,
TASK_UNINTERRUPTIBLE);
if (!rl->count)
io_schedule();
finish_wait(&rl->wait, &wait);
spin_lock_irq(q->queue_lock);
rq = get_request(q, rw);
spin_unlock_irq(q->queue_lock);
} while (rq == NULL);
return rq;
}
If someone has taken *all* the requests and hasn't submitted any of them
yet, there is nothing to unplug. We go to sleep, all the requests are
submitted (behind a plug) and it's game over. Could happen if the device
has a teeny queue...
I dunno. I bet there are still more holes, and I for one am heartily sick
of unplug bugs. Why not make the damn queue unplug itself after ten
milliseconds or 16 requests? I bet that would actually increase throughput,
especially in the presence of kernel preemption and scheduling points.
next prev parent reply other threads:[~2002-11-12 8:44 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-11-12 3:57 2.[45] fixes for design locking bug in wait_on_page/wait_on_buffer/get_request_wait Andrea Arcangeli
2002-11-12 8:50 ` Andrew Morton [this message]
2002-11-12 18:15 ` Andrea Arcangeli
2002-11-12 19:46 ` Andrew Morton
2002-11-13 0:33 ` Andrea Arcangeli
2002-11-13 1:01 ` Andrew Morton
-- strict thread matches above, loose matches on Subject: below --
2002-11-16 16:24 Marc-Christian Petersen
2002-11-16 16:59 ` Andrea Arcangeli
2002-11-16 17:23 ` Marc-Christian Petersen
2002-11-16 17:32 ` Andrea Arcangeli
2002-11-16 17:43 ` Marc-Christian Petersen
2002-11-16 18:46 ` Andrea Arcangeli
2002-11-16 18:58 Marc-Christian Petersen
2002-11-16 21:55 ` Marc-Christian Petersen
2002-11-17 17:37 ` Marc-Christian Petersen
2002-11-17 19:19 ` Folkert van Heusden
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3DD0C0E6.4A8035A4@digeo.com \
--to=akpm@digeo.com \
--cc=andrea@suse.de \
--cc=linux-kernel@vger.kernel.org \
--cc=marcelo@conectiva.com.br \
--cc=torvalds@transmeta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox