From: Lou Langholtz <ldl@aros.net>
To: Jens Axboe <axboe@suse.de>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
Andrea Arcangeli <andrea@suse.de>,
NeilBrown <neilb@cse.unsw.edu.au>,
Steven Whitehouse <steve@chygwyn.com>
Subject: Re: blk_stop_queue/blk_start_queue confusion, problem, or bug???
Date: Mon, 28 Jul 2003 01:51:05 -0600
Message-ID: <3F24D5E9.3090901@aros.net>
In-Reply-To: <20030728070150.GA25356@suse.de>
Jens Axboe wrote:
>On Sun, Jul 27 2003, Lou Langholtz wrote:
>
>
>>I've been trying to use the blk_start_queue and blk_stop_queue functions
>>in the network block device driver branch I'm working on. The stop works
>>as expected, but the start doesn't. Processes that have tried to read or
>>write to the device (after the queue was stopped) stay blocked in
>>io_schedule instead of getting woken up (after blk_start_queue was
>>called). Do I need to follow the call to blk_start_queue() with a call
>>to wake_up() on the correct wait queues? Why not have that functionality
>>be part of blk_start_queue()? Or was this an oversight/bug?
>>
>>
>
>blk_start_queue() should be enough. What kind of behaviour are you
>seeing? Is the request_fn() never called again?
>
Sorry. I've been so buried in this problem that I forgot others probably
can't read my mind ;-) The behavior I was seeing was that processes
blocked on I/O in io_schedule don't get woken up. After tracking
the problem down, I realized that once the queue was stopped (using
blk_stop_queue), any I/O request against an empty request queue would
plug the device. After the short timeout, generic_unplug would get
called; it would first try removing the plug and then, if that succeeded,
check QUEUE_FLAG_STOPPED. In my case QUEUE_FLAG_STOPPED hadn't been
cleared by the time generic_unplug was invoked. So the queue was left in
a state where it wasn't plugged any more but the request_fn wasn't
running either, and things hung that way (blocked in io_schedule).
Hopefully the patch I just sent out will make sense if my explanation
doesn't this time either. ;-)
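The sequence described above can be sketched as a small userspace model. This is only an illustration of the state transitions as stated in this message, not kernel code: struct model_queue and every model_* function are hypothetical names, and the real block-layer functions do considerably more. The model covers stop, submit, and unplug only; it does not model blk_start_queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_queue {
    bool plugged;     /* stands in for the queue's plugged state */
    bool stopped;     /* stands in for QUEUE_FLAG_STOPPED */
    int  pending;     /* requests waiting on the queue */
    int  dispatched;  /* requests handed to the request function */
};

/* Stands in for the driver's request function: drain everything. */
static void model_request_fn(struct model_queue *q)
{
    q->dispatched += q->pending;
    q->pending = 0;
}

/* Stands in for blk_stop_queue(): just mark the queue stopped. */
static void model_stop(struct model_queue *q)
{
    assert(q != NULL);
    q->stopped = true;
}

/* A request arriving on an empty queue plugs the device. */
static void model_submit(struct model_queue *q)
{
    assert(q != NULL);
    if (q->pending == 0)
        q->plugged = true;
    q->pending++;
}

/* Unplug as described in the message: remove the plug first, then
 * check the stopped flag and return without calling request_fn. */
static void model_unplug(struct model_queue *q)
{
    assert(q != NULL);
    if (!q->plugged)
        return;             /* nothing to unplug */
    q->plugged = false;     /* the plug is gone from here on */
    if (q->stopped)
        return;             /* request_fn is not run */
    model_request_fn(q);
}

/* True when requests are pending but no plug remains to trigger a
 * later unplug, i.e. nothing in this model will dispatch them. */
static bool model_is_stranded(const struct model_queue *q)
{
    return q->pending > 0 && !q->plugged;
}
```

Calling model_stop(), then model_submit(), then model_unplug() on a zeroed queue leaves it unplugged, still stopped, with one pending request and nothing dispatched, which is the hung state described above. Without the model_stop() call, the same submit and unplug dispatch the request.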
>>The reason I'm using blk_stop_queue and blk_start_queue is to stop the
>>request handling function (installed from blk_init_queue), from being
>>re-invoked and to return when the network block device server goes down.
>>That way, the driver doesn't need to block indefinitely within the
>>request handling function - which seems like it'd likely block other
>>block drivers if it did this - and doesn't need to be handled by
>>
>>
>
>It will, you should never block in your request function.
>
>
With the network block device driver, the only way to ensure the request
function *never* blocks is to have a separate dedicated kernel thread
handling the actual network I/O. Otherwise, the best I can do is use
MSG_DONTWAIT coupled with the blk_start_queue and blk_stop_queue
functions. Even then, the code must still drop the spin lock to make the
socket calls, since they may still sleep. At least when I try to call
sock_sendmsg/sendpage with the spin lock still held (and I'm using
CONFIG_DEBUG_SPINLOCK_SLEEP), I get "sleeping function called from
illegal context" messages. Is there another way? What approach would you
suggest?
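As a rough userspace analogue of the MSG_DONTWAIT idea above, the sketch below attempts a non-blocking send and records a "stopped" state when the socket would block. It uses the POSIX send() call rather than the in-kernel sock_sendmsg, takes no locks, and does not touch the block layer; struct send_state and try_send_nonblocking are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical stand-in for a driver's "queue stopped" state. */
struct send_state {
    bool stopped;
};

/* Try to send without blocking.
 * Returns the number of bytes sent (>= 0), or -1 on a real error.
 * If the socket would block, sets st->stopped and returns 0, which is
 * the point where a driver would stop its queue instead of waiting. */
static ssize_t try_send_nonblocking(int fd, const void *buf, size_t len,
                                    struct send_state *st)
{
    ssize_t n;

    assert(st != NULL);
    n = send(fd, buf, len, MSG_DONTWAIT);
    if (n >= 0)
        return n;
    if (errno == EAGAIN || errno == EWOULDBLOCK) {
        st->stopped = true;   /* caller retries once the socket drains */
        return 0;
    }
    return -1;
}
```

A caller would check st->stopped after each attempt and resume sending (the analogue of restarting the queue) once the socket becomes writable again, for example after poll() reports POLLOUT.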
>>. . . BTW: LKML has had a related thread on this some years ago in discussing
>>how the block layer system handles request functions that must drop the
>>spinlock and may block indefinitely. That never seemed to get resolved
>>though and makes me believe that's why Steven Whitehouse opted to use a
>>multi-threaded approach to the NBD driver at one point.
>>
>>
>
>That has never really been allowed, in that it is a Bad Thing to do
>something like that.
>
>
I want to make sure I don't misunderstand... you mean that dropping the
queue spin lock is a Bad Thing, correct? Is it bad enough to warrant
using a separate kernel thread for handling network sends to avoid it?
That would have to be a separate thread per network block device
to ensure the devices don't impede each other.
Thanks!