Re: blk_stop_queue/blk_start_queue confusion, problem, or bug???


From: Lou Langholtz <ldl@aros.net>
To: Jens Axboe <axboe@suse.de>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Andrea Arcangeli <andrea@suse.de>,
	NeilBrown <neilb@cse.unsw.edu.au>,
	Steven Whitehouse <steve@chygwyn.com>
Subject: Re: blk_stop_queue/blk_start_queue confusion, problem, or bug???
Date: Mon, 28 Jul 2003 01:51:05 -0600
Message-ID: <3F24D5E9.3090901@aros.net>
In-Reply-To: <20030728070150.GA25356@suse.de>

Jens Axboe wrote:

>On Sun, Jul 27 2003, Lou Langholtz wrote:
>  
>
>>I've been trying to use the blk_start_queue and blk_stop_queue functions 
>>in the network block device driver branch I'm working on. The stop works 
>>as expected, but the start doesn't. Processes that have tried to read or 
>>write to the device (after the queue was stopped) stay blocked in 
>>io_schedule instead of getting woken up (after blk_start_queue was 
>>called). Do I need to follow the call to blk_start_queue() with a call 
>>to wake_up() on the correct wait queues? Why not have that functionality 
>>be part of blk_start_queue()? Or was this an oversight/bug?
>>    
>>
>
>blk_start_queue() should be enough. What kind of behaviour are you
>seeing? Is the request_fn() never called again?
>
Sorry. I've been so buried in this problem that I forgot others probably
can't read my mind ;-) The behavior I was seeing was that processes
blocked on I/O in io_schedule don't get woken up. After tracking
the problem down, I realized that once the queue was stopped (using
blk_stop_queue), any I/O request against an empty request queue would
plug the device. After the short timeout, generic_unplug would get
called; it would first try to remove the plug and then, if that
succeeded, check QUEUE_FLAG_STOPPED. In my case QUEUE_FLAG_STOPPED hadn't
been cleared by the time generic_unplug was invoked. So the queue was
left in a state where it wasn't plugged any more but the request_fn
wasn't running either, and things hung that way (stuck in io_schedule).
Hopefully the patch I just sent out will make sense if my explanation
still doesn't this time. ;-)
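
To make the sequence above concrete, here is a toy userspace model of it in
C. It is only a sketch of the state machine as I described it, not the
2.6.0-test2 block layer source and not the patch I sent: every toyq_* name
is made up for illustration, and the two "start" variants are hypothetical
ways blk_start_queue could behave, chosen to show why clearing the flag
alone is not enough once the plug is already gone.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; all names here are hypothetical. */
struct toyq {
    bool plugged;    /* an unplug timer is pending */
    bool stopped;    /* stands in for QUEUE_FLAG_STOPPED */
    int  pending;    /* requests waiting on the queue */
    int  dispatched; /* requests handed to the request function */
};

/* Toy request function: consumes everything that is pending. */
static void toyq_request_fn(struct toyq *q)
{
    q->dispatched += q->pending;
    q->pending = 0;
}

static void toyq_stop(struct toyq *q)
{
    q->stopped = true;
}

/* A request against an empty queue plugs the device. */
static void toyq_submit(struct toyq *q)
{
    assert(q->pending >= 0);
    if (q->pending == 0)
        q->plugged = true;
    q->pending++;
}

/* Unplug as described: remove the plug first, then check "stopped". */
static void toyq_unplug(struct toyq *q)
{
    if (!q->plugged)
        return;
    q->plugged = false;
    if (q->stopped)
        return; /* plug is gone, but the request function never ran */
    toyq_request_fn(q);
}

/* Hypothetical start that only clears the flag. */
static void toyq_start_flag_only(struct toyq *q)
{
    q->stopped = false;
}

/* Hypothetical start that also runs the request function if work waits. */
static void toyq_start_and_run(struct toyq *q)
{
    q->stopped = false;
    if (q->pending > 0)
        toyq_request_fn(q);
}

/* Work is queued, yet no unplug timer is pending to kick it. */
static bool toyq_is_stalled(const struct toyq *q)
{
    return q->pending > 0 && !q->plugged;
}
```

In this model, calling toyq_stop, toyq_submit and then toyq_unplug leaves
the queue stalled; toyq_start_flag_only does not change that, while
toyq_start_and_run dispatches the waiting request.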

>>The reason I'm using blk_stop_queue and blk_start_queue is to stop the 
>>request handling function (installed from blk_init_queue), from being 
>>re-invoked and to return when the network block device server goes down. 
>>That way, the driver doesn't need to block indefinitely within the 
>>request handling function - which seems like it'd likely block other 
>>block drivers if it did this - and doesn't need to be handled by 
>>    
>>
>
>It will, you should never block in your request function.
>  
>
With the network block device driver, the only way to ensure the request
function *never* blocks is to have a separate, dedicated kernel thread
handle the actual network I/O. Otherwise, the best I can do is use
MSG_DONTWAIT coupled with the blk_start_queue and blk_stop_queue
functions; however, the code must still drop the spin lock to make the
socket calls (since they may still sleep). At least when I try to call
sock_sendmsg/sendpage with the spin lock still held (and I'm using
CONFIG_DEBUG_SPINLOCK_SLEEP), I get "sleeping function called from
illegal context" messages. Is there another way? What approach would
you suggest?
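
For illustration, the MSG_DONTWAIT idea can be sketched in userspace C
roughly as follows. This is an analogy only: it uses send(2) on an ordinary
socket rather than sock_sendmsg inside a driver, it does not model the
queue spin lock at all, and the toy_nbd_* names and the "stopped" field are
hypothetical stand-ins for the real queue state. The point is just the
control flow: send without waiting, and when the socket would block, mark
the queue stopped and return instead of sleeping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Toy per-device state; every name here is hypothetical. */
struct toy_nbd {
    int    sock;      /* connected socket to the (pretend) server */
    bool   stopped;   /* stands in for a stopped request queue */
    size_t sent_off;  /* bytes of the current request already sent */
};

/*
 * Try to send up to 'count' requests of 'len' bytes each without ever
 * sleeping. Returns the number of requests fully sent by this call, or
 * -1 on a hard socket error. If the socket would block, the device is
 * marked stopped and the partially sent request is resumed later.
 */
static int toy_nbd_request_fn(struct toy_nbd *d, const char *req,
                              size_t len, int count)
{
    int done = 0;

    assert(len > 0);
    if (d->stopped)
        return 0; /* a stopped queue must not invoke the send path */

    while (done < count) {
        ssize_t n = send(d->sock, req + d->sent_off, len - d->sent_off,
                         MSG_DONTWAIT | MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                d->stopped = true; /* analogous to stopping the queue */
                return done;
            }
            return -1;
        }
        d->sent_off += (size_t)n;
        if (d->sent_off == len) { /* whole request is on the wire */
            d->sent_off = 0;
            done++;
        }
    }
    return done;
}

/* Called when the socket is writable again; analogous to a restart. */
static void toy_nbd_socket_writable(struct toy_nbd *d)
{
    d->stopped = false;
}
```

A caller would invoke toy_nbd_socket_writable once the peer has drained
some data and then call toy_nbd_request_fn again to resume sending.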

>>. . . BTW: LKML has had a related thread on this some years ago in discussing 
>>how the block layer system handles request functions that must drop the 
>>spinlock and may block indefinitely. That never seemed to get resolved 
>>though and makes me believe that's why Steven Whitehouse opted to use a 
>>multi-threaded approach to the NBD driver at one point.
>>    
>>
>
>That has never really been allowed, in that it is a Bad Thing to do
>something like that.
>  
>
I want to make sure I don't misunderstand... you mean that dropping the
queue spin lock is a Bad Thing, correct? Is it bad enough to warrant
using a separate kernel thread for handling network sends to avoid it?
That would have to be a separate thread per network block device
to ensure the devices don't impede each other.

Thanks!!!!!
