* [PATCH] block: Make blk_drain_queue() work for stopped queues @ 2012-03-18 13:18 Bart Van Assche 2012-03-18 15:57 ` Tejun Heo 0 siblings, 1 reply; 15+ messages in thread From: Bart Van Assche @ 2012-03-18 13:18 UTC (permalink / raw) To: Jens Axboe, Tejun Heo, Stanislaw Gruszka, linux-scsi [-- Attachment #1: Type: text/plain, Size: 920 bytes --] All queued requests must be processed eventually. Hence make sure that blk_drain_queue() drains the queue even if the queue is in the stopped state. This patch makes it safe to invoke blk_cleanup_queue() on a stopped queue. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Tejun Heo <tj@kernel.org> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: stable@vger.kernel.org --- block/blk-core.c | 6 ++---- 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 3a78b00..bdcec86 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue); */ void __blk_run_queue(struct request_queue *q) { - if (unlikely(blk_queue_stopped(q))) - return; - - q->request_fn(q); + if (!blk_queue_stopped(q) || blk_queue_dead(q)) + q->request_fn(q); } EXPORT_SYMBOL(__blk_run_queue); -- 1.7.7 [-- Attachment #2: Attached Message Part --] [-- Type: text/plain, Size: 0 bytes --] ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-18 13:18 [PATCH] block: Make blk_drain_queue() work for stopped queues Bart Van Assche @ 2012-03-18 15:57 ` Tejun Heo 2012-03-18 19:47 ` Bart Van Assche 0 siblings, 1 reply; 15+ messages in thread From: Tejun Heo @ 2012-03-18 15:57 UTC (permalink / raw) To: Bart Van Assche; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote: > All queued requests must be processed eventually. Hence make sure > that blk_drain_queue() drains the queue even if the queue is in the > stopped state. This patch makes it safe to invoke blk_cleanup_queue() > on a stopped queue. ... > diff --git a/block/blk-core.c b/block/blk-core.c > index 3a78b00..bdcec86 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue); > */ > void __blk_run_queue(struct request_queue *q) > { > - if (unlikely(blk_queue_stopped(q))) > - return; > - > - q->request_fn(q); > + if (!blk_queue_stopped(q) || blk_queue_dead(q)) > + q->request_fn(q); So, this allows calling request_fn for dead && stopped queue. Have you seen something which requires this? Thanks. -- tejun ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-18 15:57 ` Tejun Heo @ 2012-03-18 19:47 ` Bart Van Assche 2012-03-19 7:26 ` Stanislaw Gruszka 2012-03-19 17:04 ` Tejun Heo 0 siblings, 2 replies; 15+ messages in thread From: Bart Van Assche @ 2012-03-18 19:47 UTC (permalink / raw) To: Tejun Heo; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi On 03/18/12 15:57, Tejun Heo wrote: > On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote: >> All queued requests must be processed eventually. Hence make sure >> that blk_drain_queue() drains the queue even if the queue is in the >> stopped state. This patch makes it safe to invoke blk_cleanup_queue() >> on a stopped queue. > ... >> diff --git a/block/blk-core.c b/block/blk-core.c >> index 3a78b00..bdcec86 100644 >> --- a/block/blk-core.c >> +++ b/block/blk-core.c >> @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue); >> */ >> void __blk_run_queue(struct request_queue *q) >> { >> - if (unlikely(blk_queue_stopped(q))) >> - return; >> - >> - q->request_fn(q); >> + if (!blk_queue_stopped(q) || blk_queue_dead(q)) >> + q->request_fn(q); > So, this allows calling request_fn for dead && stopped queue. Have > you seen something which requires this? Not servicing queued SCSI requests can e.g. cause user space processes to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending SCSI commands to be killed just before blk_cleanup_queue() is invoked. However, there is still a tiny race window left by that patch - new requests can get queued after the SCSI request function has been invoked by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence the proposal to change the block layer to make sure that all queued requests get processed eventually. Bart. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-18 19:47 ` Bart Van Assche @ 2012-03-19 7:26 ` Stanislaw Gruszka 2012-03-19 17:03 ` Bart Van Assche 2012-03-19 17:04 ` Tejun Heo 1 sibling, 1 reply; 15+ messages in thread From: Stanislaw Gruszka @ 2012-03-19 7:26 UTC (permalink / raw) To: Bart Van Assche; +Cc: Tejun Heo, Jens Axboe, linux-scsi Hi Bart On Sun, Mar 18, 2012 at 07:47:47PM +0000, Bart Van Assche wrote: > On 03/18/12 15:57, Tejun Heo wrote: > > On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote: > >> All queued requests must be processed eventually. Hence make sure > >> that blk_drain_queue() drains the queue even if the queue is in the > >> stopped state. This patch makes it safe to invoke blk_cleanup_queue() > >> on a stopped queue. > > ... > >> diff --git a/block/blk-core.c b/block/blk-core.c > >> index 3a78b00..bdcec86 100644 > >> --- a/block/blk-core.c > >> +++ b/block/blk-core.c > >> @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue); > >> */ > >> void __blk_run_queue(struct request_queue *q) > >> { > >> - if (unlikely(blk_queue_stopped(q))) > >> - return; > >> - > >> - q->request_fn(q); > >> + if (!blk_queue_stopped(q) || blk_queue_dead(q)) > >> + q->request_fn(q); I'm not sure if that behaviour is correct, i.e. we can call q->request_fn(q) if someone stoped queue, but if it is why not just call q->request_fn(q) from blk_drain_queue() instead? > > So, this allows calling request_fn for dead && stopped queue. Have > > you seen something which requires this? > > Not servicing queued SCSI requests can e.g. cause user space processes > to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence > commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending > SCSI commands to be killed just before blk_cleanup_queue() is invoked. > However, there is still a tiny race window left by that patch - new > requests can get queued after the SCSI request function has been invoked > by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence > the proposal to change the block layer to make sure that all queued > requests get processed eventually. That behaviour I can confirm using this script [1] running with usb dongle. I applied this patch and second one: http://marc.info/?l=linux-scsi&m=133207725114386&w=2 (BTW: second one patch is mangled). My impression is, that the script run much longer before it finally hung at infinite loop in blk_drain_queue(). Thanks Stanislaw [1] !#/bin/bash DEV=/dev/sdb ENABLE=/sys/bus/usb/devices/1-2/bConfigurationValue function stop_me() { for i in `jobs -p` ; do kill $i 2> /dev/null ; done exit } trap stop_me SIGHUP SIGINT SIGTERM for ((i = 0; i < 10; i++)) ; do while true; do fdisk -l $DEV 2>&1 > /dev/null ; done & done while true ; do echo 1 > $ENABLE sleep 1 echo 0 > $ENABLE done ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-19 7:26 ` Stanislaw Gruszka @ 2012-03-19 17:03 ` Bart Van Assche [not found] ` <4F6766F0.1070805-HInyCGIudOg@public.gmane.org> 0 siblings, 1 reply; 15+ messages in thread From: Bart Van Assche @ 2012-03-19 17:03 UTC (permalink / raw) To: Stanislaw Gruszka; +Cc: Tejun Heo, Jens Axboe, linux-scsi On 03/19/12 07:26, Stanislaw Gruszka wrote: > On Sun, Mar 18, 2012 at 07:47:47PM +0000, Bart Van Assche wrote: >> On 03/18/12 15:57, Tejun Heo wrote: >>> On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote: >>>> All queued requests must be processed eventually. Hence make sure >>>> that blk_drain_queue() drains the queue even if the queue is in the >>>> stopped state. This patch makes it safe to invoke blk_cleanup_queue() >>>> on a stopped queue. >>> ... >>>> diff --git a/block/blk-core.c b/block/blk-core.c >>>> index 3a78b00..bdcec86 100644 >>>> --- a/block/blk-core.c >>>> +++ b/block/blk-core.c >>>> @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue); >>>> */ >>>> void __blk_run_queue(struct request_queue *q) >>>> { >>>> - if (unlikely(blk_queue_stopped(q))) >>>> - return; >>>> - >>>> - q->request_fn(q); >>>> + if (!blk_queue_stopped(q) || blk_queue_dead(q)) >>>> + q->request_fn(q); > I'm not sure if that behaviour is correct, i.e. we can call q->request_fn(q) > if someone stoped queue, but if it is why not just call q->request_fn(q) > from blk_drain_queue() instead? As far as I can see invoking q->request_fn(q) directly from blk_drain_queue() would be a valid alternative. > >>> So, this allows calling request_fn for dead && stopped queue. Have >>> you seen something which requires this? >> Not servicing queued SCSI requests can e.g. cause user space processes >> to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence >> commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending >> SCSI commands to be killed just before blk_cleanup_queue() is invoked. >> However, there is still a tiny race window left by that patch - new >> requests can get queued after the SCSI request function has been invoked >> by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence >> the proposal to change the block layer to make sure that all queued >> requests get processed eventually. > That behaviour I can confirm using this script [1] running with usb > dongle. I applied this patch and second one: > http://marc.info/?l=linux-scsi&m=133207725114386&w=2 > (BTW: second one patch is mangled). My impression is, that the script run > much longer before it finally hung at infinite loop in blk_drain_queue(). I'm not an USB expert but I've had a quick look at usb_stor_release_resources() in drivers/usb/storage/usb.c. As far as I can see that function will only stop the usb_stor_control_thread() if that thread has been scheduled after the last complete() call by the USB queuecommand() function and before the complete() call in usb_stor_release_resources() is executed. That looks like a race condition to me. Bart. ^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <4F6766F0.1070805-HInyCGIudOg@public.gmane.org>]
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues [not found] ` <4F6766F0.1070805-HInyCGIudOg@public.gmane.org> @ 2012-03-20 14:21 ` Stanislaw Gruszka 2012-03-20 14:31 ` Alan Stern 0 siblings, 1 reply; 15+ messages in thread From: Stanislaw Gruszka @ 2012-03-20 14:21 UTC (permalink / raw) To: Bart Van Assche Cc: Tejun Heo, Jens Axboe, linux-scsi, linux-usb-u79uwXL29TY76Z2rM5mHXA On Mon, Mar 19, 2012 at 05:03:44PM +0000, Bart Van Assche wrote: > On 03/19/12 07:26, Stanislaw Gruszka wrote: > > On Sun, Mar 18, 2012 at 07:47:47PM +0000, Bart Van Assche wrote: > >> On 03/18/12 15:57, Tejun Heo wrote: > >>> On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote: > >>>> All queued requests must be processed eventually. Hence make sure > >>>> that blk_drain_queue() drains the queue even if the queue is in the > >>>> stopped state. This patch makes it safe to invoke blk_cleanup_queue() > >>>> on a stopped queue. > >>> ... > >>>> diff --git a/block/blk-core.c b/block/blk-core.c > >>>> index 3a78b00..bdcec86 100644 > >>>> --- a/block/blk-core.c > >>>> +++ b/block/blk-core.c > >>>> @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue); > >>>> */ > >>>> void __blk_run_queue(struct request_queue *q) > >>>> { > >>>> - if (unlikely(blk_queue_stopped(q))) > >>>> - return; > >>>> - > >>>> - q->request_fn(q); > >>>> + if (!blk_queue_stopped(q) || blk_queue_dead(q)) > >>>> + q->request_fn(q); > > I'm not sure if that behaviour is correct, i.e. we can call q->request_fn(q) > > if someone stoped queue, but if it is why not just call q->request_fn(q) > > from blk_drain_queue() instead? > > As far as I can see invoking q->request_fn(q) directly from > blk_drain_queue() would be a valid alternative. > > > > >>> So, this allows calling request_fn for dead && stopped queue. Have > >>> you seen something which requires this? > >> Not servicing queued SCSI requests can e.g. cause user space processes > >> to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence > >> commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending > >> SCSI commands to be killed just before blk_cleanup_queue() is invoked. > >> However, there is still a tiny race window left by that patch - new > >> requests can get queued after the SCSI request function has been invoked > >> by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence > >> the proposal to change the block layer to make sure that all queued > >> requests get processed eventually. > > That behaviour I can confirm using this script [1] running with usb > > dongle. I applied this patch and second one: > > http://marc.info/?l=linux-scsi&m=133207725114386&w=2 > > (BTW: second one patch is mangled). My impression is, that the script run > > much longer before it finally hung at infinite loop in blk_drain_queue(). > > I'm not an USB expert but I've had a quick look at > usb_stor_release_resources() in drivers/usb/storage/usb.c. As far as I > can see that function will only stop the usb_stor_control_thread() if > that thread has been scheduled after the last complete() call by the USB > queuecommand() function and before the complete() call in > usb_stor_release_resources() is executed. That looks like a race > condition to me. CCing linux-usb, perhaps usb developers would like to take a look. Stanislaw -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-20 14:21 ` Stanislaw Gruszka @ 2012-03-20 14:31 ` Alan Stern 0 siblings, 0 replies; 15+ messages in thread From: Alan Stern @ 2012-03-20 14:31 UTC (permalink / raw) To: Stanislaw Gruszka Cc: Bart Van Assche, Tejun Heo, Jens Axboe, linux-scsi, linux-usb On Tue, 20 Mar 2012, Stanislaw Gruszka wrote: > On Mon, Mar 19, 2012 at 05:03:44PM +0000, Bart Van Assche wrote: ... > > I'm not an USB expert but I've had a quick look at > > usb_stor_release_resources() in drivers/usb/storage/usb.c. As far as I > > can see that function will only stop the usb_stor_control_thread() if > > that thread has been scheduled after the last complete() call by the USB > > queuecommand() function and before the complete() call in > > usb_stor_release_resources() is executed. That looks like a race > > condition to me. > > CCing linux-usb, perhaps usb developers would like to take a look. I don't understand what Bart wrote. Why is there a race condition and why is it a problem? Alan Stern ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-18 19:47 ` Bart Van Assche 2012-03-19 7:26 ` Stanislaw Gruszka @ 2012-03-19 17:04 ` Tejun Heo 2012-03-19 17:22 ` Bart Van Assche ` (2 more replies) 1 sibling, 3 replies; 15+ messages in thread From: Tejun Heo @ 2012-03-19 17:04 UTC (permalink / raw) To: Bart Van Assche; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi Hello, On Sun, Mar 18, 2012 at 07:47:47PM +0000, Bart Van Assche wrote: > Not servicing queued SCSI requests can e.g. cause user space processes > to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence > commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending > SCSI commands to be killed just before blk_cleanup_queue() is invoked. > However, Thanks for the pointer. It would be great if you can describe / link the actual case the patch is trying to solve. > there is still a tiny race window left by that patch - new > requests can get queued after the SCSI request function has been invoked > by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence > the proposal to change the block layer to make sure that all queued > requests get processed eventually. I don't think it's a good idea to push requests out to stopped queue. Wouldn't aborting all pending requests be better? Thanks. -- tejun ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-19 17:04 ` Tejun Heo @ 2012-03-19 17:22 ` Bart Van Assche 2012-03-20 20:04 ` Bart Van Assche 2012-03-20 20:06 ` Bart Van Assche 2 siblings, 0 replies; 15+ messages in thread From: Bart Van Assche @ 2012-03-19 17:22 UTC (permalink / raw) To: Tejun Heo; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi On 03/19/12 17:04, Tejun Heo wrote: > Thanks for the pointer. It would be great if you can describe / link > the actual case the patch is trying to solve. Another example is the sd scanning code (sd_probe_async()) which can queue SCSI commands concurrently with the blk_cleanup_queue() call called (indirectly) from scsi_remove_host(). > >> there is still a tiny race window left by that patch - new >> requests can get queued after the SCSI request function has been invoked >> by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence >> the proposal to change the block layer to make sure that all queued >> requests get processed eventually. > I don't think it's a good idea to push requests out to stopped queue. > Wouldn't aborting all pending requests be better? Sure, but is that possible from inside the block layer ? With patch http://marc.info/?l=linux-scsi&m=133207725114386 applied, the following code is present at the start of scsi_request_fn(): if (unlikely(blk_queue_dead(q))) { while ((req = blk_peek_request(q)) != NULL) scsi_kill_request(req, q); Bart. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-19 17:04 ` Tejun Heo 2012-03-19 17:22 ` Bart Van Assche @ 2012-03-20 20:04 ` Bart Van Assche 2012-03-20 20:06 ` Bart Van Assche 2 siblings, 0 replies; 15+ messages in thread From: Bart Van Assche @ 2012-03-20 20:04 UTC (permalink / raw) To: Tejun Heo; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi On 03/19/12 17:04, Tejun Heo wrote: > I don't think it's a good idea to push requests out to stopped queue. > Wouldn't aborting all pending requests be better? Thanks. How about the (lightly tested) patch below ? It combines three separate changes: - Making blk_drain_queue() ignore the stopped state of a queue. - Add struct request_queue.kill_all_requests_fn. - Fix a null pointer dereference triggered by sd during device removal. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-19 17:04 ` Tejun Heo 2012-03-19 17:22 ` Bart Van Assche 2012-03-20 20:04 ` Bart Van Assche @ 2012-03-20 20:06 ` Bart Van Assche 2012-03-20 21:01 ` Dan Williams 2 siblings, 1 reply; 15+ messages in thread From: Bart Van Assche @ 2012-03-20 20:06 UTC (permalink / raw) To: Tejun Heo; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi On 03/19/12 17:04, Tejun Heo wrote: > I don't think it's a good idea to push requests out to stopped queue. > Wouldn't aborting all pending requests be better? Thanks. How about the (lightly tested) patch below ? It combines three separate changes: - Making blk_drain_queue() ignore the stopped state of a queue. - Add request_queue.kill_all_requests_fn. - Fix a null pointer dereference triggered by sd during device removal. Thanks, Bart. diff --git a/block/blk-core.c b/block/blk-core.c index 3a78b00..9429f1b 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -375,8 +375,12 @@ void blk_drain_queue(struct request_queue *q, bool drain_all) * (e.g. fd) get unhappy in such cases. Kick queue iff * dispatch queue has something on it. */ - if (!list_empty(&q->queue_head)) - __blk_run_queue(q); + if (!list_empty(&q->queue_head)) { + if (blk_queue_dead(q) && q->kill_all_requests_fn) + q->kill_all_requests_fn(q); + else + q->request_fn(q); + } drain |= q->rq.elvpriv; diff --git a/block/blk-settings.c b/block/blk-settings.c index d3234fc..9ce3df6 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -191,6 +191,13 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn) } EXPORT_SYMBOL(blk_queue_make_request); +void blk_queue_kill_all_requests(struct request_queue *q, + kill_all_requests_fn *kfn) +{ + q->kill_all_requests_fn = kfn; +} +EXPORT_SYMBOL(blk_queue_kill_all_requests); + /** * blk_queue_bounce_limit - set bounce buffer limit for queue * @q: the request queue for the device diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 351dc0b..5cf3a92 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -296,6 +296,12 @@ static void scsi_host_dev_release(struct device *dev) destroy_workqueue(shost->work_q); q = shost->uspace_req_q; if (q) { + /* + * Note: freeing queuedata before invoking scsi_free_queue() + * is safe here because no request function is associated with + * uspace_req_q. See also the __scsi_alloc_queue() call in + * drivers/scsi/scsi_tgt_lib.c. + */ kfree(q->queuedata); q->queuedata = NULL; scsi_free_queue(q); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index b2c95db..21ede38 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1468,6 +1468,14 @@ static void scsi_softirq_done(struct request *rq) } } +static void scsi_kill_all_requests(struct request_queue *q) +{ + struct request *req; + + while ((req = blk_peek_request(q)) != NULL) + scsi_kill_request(req, q); +} + /* * Function: scsi_request_fn() * @@ -1486,11 +1494,7 @@ static void scsi_request_fn(struct request_queue *q) struct scsi_cmnd *cmd; struct request *req; - if (!sdev) { - while ((req = blk_peek_request(q)) != NULL) - scsi_kill_request(req, q); - return; - } + BUG_ON(!sdev); if(!get_device(&sdev->sdev_gendev)) /* We must be tearing the block queue down already */ @@ -1686,6 +1690,7 @@ struct request_queue *scsi_alloc_queue(struct scsi_device *sdev) if (!q) return NULL; + blk_queue_kill_all_requests(q, scsi_kill_all_requests); blk_queue_prep_rq(q, scsi_prep_fn); blk_queue_softirq_done(q, scsi_softirq_done); blk_queue_rq_timed_out(q, scsi_times_out); @@ -1695,15 +1700,6 @@ struct request_queue *scsi_alloc_queue(struct scsi_device *sdev) void scsi_free_queue(struct request_queue *q) { - unsigned long flags; - - WARN_ON(q->queuedata); - - /* cause scsi_request_fn() to kill all non-finished requests */ - spin_lock_irqsave(q->queue_lock, flags); - q->request_fn(q); - spin_unlock_irqrestore(q->queue_lock, flags); - blk_cleanup_queue(q); } diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 04c2a27..65801e9 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -971,9 +971,6 @@ void __scsi_remove_device(struct scsi_device *sdev) sdev->host->hostt->slave_destroy(sdev); transport_destroy_device(dev); - /* cause the request function to reject all I/O requests */ - sdev->request_queue->queuedata = NULL; - /* Freeing the queue signals to block that we're done */ scsi_free_queue(sdev->request_queue); put_device(dev); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 606cf33..5a31e4c 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -200,6 +200,7 @@ struct request_pm_state typedef void (request_fn_proc) (struct request_queue *q); typedef void (make_request_fn) (struct request_queue *q, struct bio *bio); +typedef void (kill_all_requests_fn) (struct request_queue *q); typedef int (prep_rq_fn) (struct request_queue *, struct request *); typedef void (unprep_rq_fn) (struct request_queue *, struct request *); @@ -282,6 +283,7 @@ struct request_queue { request_fn_proc *request_fn; make_request_fn *make_request_fn; + kill_all_requests_fn *kill_all_requests_fn; prep_rq_fn *prep_rq_fn; unprep_rq_fn *unprep_rq_fn; merge_bvec_fn *merge_bvec_fn; @@ -825,6 +827,8 @@ extern struct request_queue *blk_init_allocated_queue(struct request_queue *, request_fn_proc *, spinlock_t *); extern void blk_cleanup_queue(struct request_queue *); extern void blk_queue_make_request(struct request_queue *, make_request_fn *); +extern void blk_queue_kill_all_requests(struct request_queue *, + kill_all_requests_fn *); extern void blk_queue_bounce_limit(struct request_queue *, u64); extern void blk_limits_max_hw_sectors(struct queue_limits *, unsigned int); extern void blk_queue_max_hw_sectors(struct request_queue *, unsigned int); ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-20 20:06 ` Bart Van Assche @ 2012-03-20 21:01 ` Dan Williams 2012-03-21 3:37 ` Dan Williams 0 siblings, 1 reply; 15+ messages in thread From: Dan Williams @ 2012-03-20 21:01 UTC (permalink / raw) To: Bart Van Assche; +Cc: Tejun Heo, Jens Axboe, Stanislaw Gruszka, linux-scsi On Tue, Mar 20, 2012 at 1:06 PM, Bart Van Assche <bvanassche@acm.org> wrote: [..] > - Fix a null pointer dereference triggered by sd during device removal. Hi Bart, Do you have a log of the backtrace in this case? I'm going to put this patch into our libsas/isci test environment. Thanks, Dan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-20 21:01 ` Dan Williams @ 2012-03-21 3:37 ` Dan Williams 2012-03-21 18:35 ` Dan Williams 2012-03-24 18:49 ` Bart Van Assche 0 siblings, 2 replies; 15+ messages in thread From: Dan Williams @ 2012-03-21 3:37 UTC (permalink / raw) To: Bart Van Assche Cc: Tejun Heo, Jens Axboe, Stanislaw Gruszka, linux-scsi, Bartek Nowakowski, Jacek Danecki On Tue, Mar 20, 2012 at 2:01 PM, Dan Williams <dan.j.williams@intel.com> wrote: > On Tue, Mar 20, 2012 at 1:06 PM, Bart Van Assche <bvanassche@acm.org> wrote: > [..] >> - Fix a null pointer dereference triggered by sd during device removal. > > Hi Bart, > > Do you have a log of the backtrace in this case? I'm going to put > this patch into our libsas/isci test environment. We beat on this patch pretty severely in our environment and appeared to only trigger a hung_task timeout when our driver / libata took too long to recovery for a 15 device unplug. So... Acked-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Jacek Danecki <jacek.danecki@intel.com> Tested-by: Bartek Nowakowski <bartek.nowakowski@intel.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-21 3:37 ` Dan Williams @ 2012-03-21 18:35 ` Dan Williams 2012-03-24 18:49 ` Bart Van Assche 1 sibling, 0 replies; 15+ messages in thread From: Dan Williams @ 2012-03-21 18:35 UTC (permalink / raw) To: Bart Van Assche Cc: Tejun Heo, Jens Axboe, Stanislaw Gruszka, linux-scsi, Bartek Nowakowski, Jacek Danecki On Tue, Mar 20, 2012 at 8:37 PM, Dan Williams <dan.j.williams@intel.com> wrote: > On Tue, Mar 20, 2012 at 2:01 PM, Dan Williams <dan.j.williams@intel.com> wrote: >> On Tue, Mar 20, 2012 at 1:06 PM, Bart Van Assche <bvanassche@acm.org> wrote: >> [..] >>> - Fix a null pointer dereference triggered by sd during device removal. >> >> Hi Bart, >> >> Do you have a log of the backtrace in this case? I'm going to put >> this patch into our libsas/isci test environment. > > We beat on this patch pretty severely in our environment and appeared > to only trigger a hung_task timeout when our driver / libata took too > long to recovery for a 15 device unplug. So... > > Acked-by: Dan Williams <dan.j.williams@intel.com> > Tested-by: Jacek Danecki <jacek.danecki@intel.com> > Tested-by: Bartek Nowakowski <bartek.nowakowski@intel.com> So, we're still able to trigger what appears to be improper handling of kblockd teardown even with this patch applied: [ 1530.709739] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 1530.719307] IP: [<ffffffff8104455a>] __queue_work+0x2e5/0x332 [ 1530.726151] PGD 0 [ 1530.728807] Oops: 0000 [#1] SMP [ 1530.732860] CPU 5 [ 1530.734924] Modules linked in: isci libsas scsi_transport_sas ipv6 kvm uinput wmi sg iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core igb e1000e ioatdma dca sd_mod ahci libahci libata [last unloaded: scsi_wait_scan] [ 1530.758269] [ 1530.760337] Pid: 0, comm: swapper/5 Not tainted 3.3.0-rc7-isci-183_test+ #12 Intel Corporation S2600CP/S2600CP [ 1530.772353] RIP: 0010:[<ffffffff8104455a>] [<ffffffff8104455a>] __queue_work+0x2e5/0x332 [ 1530.782294] RSP: 0018:ffff88043fd43bf0 EFLAGS: 00010046 [ 1530.788630] RAX: ffff8803fc7d2c30 RBX: ffff8803fbb6e040 RCX: 0000000000000000 [ 1530.797008] RDX: ffff8803fc7d2c38 RSI: ffff88043fd4d378 RDI: 0000000000000000 [ 1530.805384] RBP: ffff88043fd43c30 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000001 [ 1530.813757] R10: 0000000000000000 R11: ffff88043fd4d358 R12: 0000000000000005 [ 1530.822132] R13: ffff88043fd56c00 R14: ffff88043fd4d340 R15: 0000000000000086 [ 1530.830503] FS: 0000000000000000(0000) GS:ffff88043fd40000(0000) knlGS:0000000000000000 [ 1530.840322] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1530.847136] CR2: 0000000000000008 CR3: 0000000001a05000 CR4: 00000000000406e0 [ 1530.855509] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1530.863882] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1530.872261] Process swapper/5 (pid: 0, threadinfo ffff880428cbe000, task ffff880428cc4160) [ 1530.882281] Stack: [ 1530.884913] ffff880400000000 0000000000000000 ffff88040000024b ffff8803fbb6dcd0 [ 1530.894018] ffff8803fb402bc0 ffff8803fb402bc0 0000000000000000 0000000000000001 [ 1530.903144] ffff88043fd43c40 ffffffff810445f0 ffff88043fd43c50 ffffffff81044745 [ 1530.912276] Call Trace: [ 1530.915406] <IRQ> [ 1530.918157] [<ffffffff810445f0>] queue_work_on+0x1b/0x22 [ 1530.924583] [<ffffffff81044745>] queue_work+0x1f/0x21 [ 1530.930745] [<ffffffff81247df1>] kblockd_schedule_work+0x15/0x17 [ 1530.937972] [<ffffffff812596dd>] cfq_schedule_dispatch+0x41/0x45 [ 1530.945177] [<ffffffff8125aacf>] cfq_completed_request+0x539/0x55d [ 1530.952594] [<ffffffff812436c2>] elv_completed_request+0x4a/0x4c [ 1530.959808] [<ffffffff812487dd>] __blk_put_request+0x38/0xba [ 1530.966624] [<ffffffff81248a72>] blk_finish_request+0x213/0x21f [ 1530.973740] [<ffffffff812496ec>] blk_end_bidi_request+0x42/0x5d [ 1530.980851] [<ffffffff81249743>] blk_end_request+0x10/0x12 [ 1530.987473] [<ffffffff81249781>] blk_end_request_err+0x3c/0x41 [ 1530.994507] [<ffffffff8132b244>] scsi_io_completion+0x46d/0x4cf [ 1531.001661] [<ffffffff81323e72>] scsi_finish_command+0xec/0xf5 [ 1531.008672] [<ffffffff8132b3ba>] scsi_softirq_done+0xff/0x108 [ 1531.015600] [<ffffffff8124ea08>] blk_done_softirq+0x84/0x98 [ 1531.022344] [<ffffffff81032786>] __do_softirq+0xe8/0x1e5 [ 1531.028772] [<ffffffff8148c014>] ? _raw_spin_unlock+0x2b/0x2f [ 1531.035687] [<ffffffff814941ec>] call_softirq+0x1c/0x26 [ 1531.042023] [<ffffffff81003bdb>] do_softirq+0x4b/0xa3 [ 1531.048175] [<ffffffff810324a8>] irq_exit+0x53/0xc8 [ 1531.054127] [<ffffffff81494295>] do_IRQ+0x9d/0xb4 [ 1531.059882] [<ffffffff8148c1f3>] common_interrupt+0x73/0x73 [ 1531.066620] <EOI> [ 1531.069407] [<ffffffff812d0ef2>] ? acpi_idle_enter_simple+0xee/0x12a [ 1531.077010] [<ffffffff812d0eee>] ? acpi_idle_enter_simple+0xea/0x12a [ 1531.084609] [<ffffffff8148f695>] ? notifier_call_chain+0x63/0x63 [ 1531.091835] [<ffffffff81397058>] cpuidle_idle_call+0x10b/0x1c7 [ 1531.098854] [<ffffffff81001e73>] cpu_idle+0xc0/0x10d [ 1531.104911] [<ffffffff8148585b>] start_secondary+0x277/0x279 [ 1531.111728] Code: 54 49 8b 45 08 49 8d 76 38 f6 00 10 75 05 48 89 f2 eb 40 49 8b 46 38 eb 1f 4c 8b 00 31 ff 41 f6 c0 04 74 07 4c 89 c7 40 80 e7 00 <48> 8b 7f 08 f6 07 10 74 11 48 8b 40 08 48 83 e8 08 48 8d 50 08 [ 1531.135084] RIP [<ffffffff8104455a>] __queue_work+0x2e5/0x332 [ 1531.142015] RSP <ffff88043fd43bf0> [ 1531.146301] CR2: 0000000000000008 [ 1531.150968] ---[ end trace 0711d1e925c5e6b1 ]--- -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues 2012-03-21 3:37 ` Dan Williams 2012-03-21 18:35 ` Dan Williams @ 2012-03-24 18:49 ` Bart Van Assche 1 sibling, 0 replies; 15+ messages in thread From: Bart Van Assche @ 2012-03-24 18:49 UTC (permalink / raw) To: Dan Williams Cc: Tejun Heo, Jens Axboe, Stanislaw Gruszka, linux-scsi, Bartek Nowakowski, Jacek Danecki On 03/21/12 03:37, Dan Williams wrote: > On Tue, Mar 20, 2012 at 2:01 PM, Dan Williams <dan.j.williams@intel.com> wrote: >> On Tue, Mar 20, 2012 at 1:06 PM, Bart Van Assche <bvanassche@acm.org> wrote: >> [..] >>> - Fix a null pointer dereference triggered by sd during device removal. >> Hi Bart, >> >> Do you have a log of the backtrace in this case? I'm going to put >> this patch into our libsas/isci test environment. > We beat on this patch pretty severely in our environment and appeared > to only trigger a hung_task timeout when our driver / libata took too > long to recovery for a 15 device unplug. Thanks for testing - that's appreciated. The null pointer dereference triggered during device removal was originally reported by Jun'ichi Nomura. A call stack can be found here: http://www.spinics.net/lists/linux-scsi/msg56254.html. Regarding invoking blk_cleanup_queue() on a stopped queue: some code I was testing could trigger this. But as far as I can see both the fc and iSCSI transport layer code take care to unblock a queue before destroying it, so these transports are not affected. There are other (non-SCSI) block drivers though that can stop and restart the queue. I haven't analyzed all of them. So I'm not sure there is currently any upstream code that invokes blk_cleanup_queue() on a stopped queue. Bart. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-03-24 18:49 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-18 13:18 [PATCH] block: Make blk_drain_queue() work for stopped queues Bart Van Assche
2012-03-18 15:57 ` Tejun Heo
2012-03-18 19:47 ` Bart Van Assche
2012-03-19 7:26 ` Stanislaw Gruszka
2012-03-19 17:03 ` Bart Van Assche
[not found] ` <4F6766F0.1070805-HInyCGIudOg@public.gmane.org>
2012-03-20 14:21 ` Stanislaw Gruszka
2012-03-20 14:31 ` Alan Stern
2012-03-19 17:04 ` Tejun Heo
2012-03-19 17:22 ` Bart Van Assche
2012-03-20 20:04 ` Bart Van Assche
2012-03-20 20:06 ` Bart Van Assche
2012-03-20 21:01 ` Dan Williams
2012-03-21 3:37 ` Dan Williams
2012-03-21 18:35 ` Dan Williams
2012-03-24 18:49 ` Bart Van Assche
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).