public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Lou Langholtz <ldl@aros.net>
To: Jens Axboe <axboe@suse.de>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Andrea Arcangeli <andrea@suse.de>,
	NeilBrown <neilb@cse.unsw.edu.au>,
	Steven Whitehouse <steve@chygwyn.com>
Subject: Re: blk_stop_queue/blk_start_queue confusion, problem, or bug???
Date: Mon, 28 Jul 2003 01:51:05 -0600	[thread overview]
Message-ID: <3F24D5E9.3090901@aros.net> (raw)
In-Reply-To: <20030728070150.GA25356@suse.de>

Jens Axboe wrote:

>On Sun, Jul 27 2003, Lou Langholtz wrote:
>  
>
>>I've been trying to use the blk_start_queue and blk_stop_queue functions 
>>in the network block device driver branch I'm working on. The stop works 
>>as expected, but the start doesn't. Processes that have tried to read or 
>>write to the device (after the queue was stopped) stay blocked in 
>>io_schedule instead of getting woken up (after blk_start_queue was 
>>called). Do I need to follow the call to blk_start_queue() with a call 
>>to wake_up() on the correct wait queues? Why not have that functionality 
>>be part of blk_start_queue()? Or was this an oversight/bug?
>>    
>>
>
>blk_start_queue() should be enough. What kind of behaviour are you
>seeing? Is the request_fn() never called again?
>
Sorry. I've been so burried in this problem, I forgot others probably 
can't read my mind ;-) The behavior I was seeing was that processes 
blocked on I/O and in io_schedule, don't get woken up. After tracking 
the problem down, I realized that once the queue was stopped (using 
blk_stop_queue) any I/O requests against an empty request queue would 
plug the device. After the short timeout, generic_unplug would get 
called and would first try removing the plug then if it succeeded check 
QUEUE_FLAG_STOPPED. In my case QUEUE_FLAG_STOPPED hadn't gotten cleared 
by the time generic_unplug had gotten invoked. So the queue was left in 
a state where it wasn't plugged any more but the request_fn wasn't 
running either and things hung that way (locked in io_schedule). 
Hopefully the patch I just sent out will make sense if my explanation 
doesn't again this time. ;-)

>>The reason I'm using blk_stop_queue and blk_start_queue is to stop the 
>>request handling function (installed from blk_init_queue), from being 
>>re-invoked and to return when the network block device server goes down. 
>>That way, the driver doesn't need to block indefinately within the 
>>request handling function - which seems like it'd likely block other 
>>block drivers if it did this - and doesn't need to be handled by 
>>    
>>
>
>It will, you should never block in your request function/
>  
>
With the network block device driver, the only way to ensure the request 
function *never* blocks is to have a seperate dedicated kernel thread 
handling the actual network I/O. At best otherwise, I can use 
MSG_DONTWAIT coupled with the blk_start_queue and blk_stop_queue 
functions however the code must still drop the spin lock to make the 
socket calls (since they still may sleep). At least when I try to call 
sock_sendmsg/sendpage with the spin lock still held (and I'm using 
CONFIG_DEBUG_SPINLOCK_SLEEP) I get "sleeping function called from 
illegal context" messages. Is there another way? What's the way you 
would suggest?

>>. . . BTW: LKML has had a related thread on this some years ago in discussing 
>>how the block layer system handles request functions that must drop the 
>>spinlock and may block indefinately. That never seemed to get resolved 
>>though and makes me believe that's why Steven Whitehouse opted to use a 
>>multi-threaded approach to the NBD driver at one point.
>>    
>>
>
>That has never really been allowed, in that it is a Bad Thing to do
>something like that.
>  
>
Want to make sure I don't misunderstand... you mean that dropping the 
queue spin lock is a Bad Thing correct? Is it bad enough to warrant 
using a seperate kernel thread for handling network sends to avoid this 
then? This would have to be a seperate thread per network block device 
then to ensure the devices don't impede each other.

Thanks!!!!!


  reply	other threads:[~2003-07-28  7:35 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-07-27 18:24 blk_stop_queue/blk_start_queue confusion, problem, or bug??? Lou Langholtz
2003-07-28  6:43 ` Lou Langholtz
2003-07-28  7:00 ` [PATCH 2.6.0-test2] fix broken blk_start_queue behavior Lou Langholtz
2003-07-28  7:12   ` Jens Axboe
2003-07-28  7:01 ` blk_stop_queue/blk_start_queue confusion, problem, or bug??? Jens Axboe
2003-07-28  7:51   ` Lou Langholtz [this message]
2003-08-07 10:51     ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3F24D5E9.3090901@aros.net \
    --to=ldl@aros.net \
    --cc=andrea@suse.de \
    --cc=axboe@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@cse.unsw.edu.au \
    --cc=steve@chygwyn.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox