* [PATCH] block: Make blk_drain_queue() work for stopped queues
@ 2012-03-18 13:18 Bart Van Assche
2012-03-18 15:57 ` Tejun Heo
0 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2012-03-18 13:18 UTC (permalink / raw)
To: Jens Axboe, Tejun Heo, Stanislaw Gruszka, linux-scsi
[-- Attachment #1: Type: text/plain, Size: 920 bytes --]
All queued requests must be processed eventually. Hence make sure
that blk_drain_queue() drains the queue even if the queue is in the
stopped state. This patch makes it safe to invoke blk_cleanup_queue()
on a stopped queue.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: stable@vger.kernel.org
---
block/blk-core.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 3a78b00..bdcec86 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue);
*/
void __blk_run_queue(struct request_queue *q)
{
- if (unlikely(blk_queue_stopped(q)))
- return;
-
- q->request_fn(q);
+ if (!blk_queue_stopped(q) || blk_queue_dead(q))
+ q->request_fn(q);
}
EXPORT_SYMBOL(__blk_run_queue);
-- 1.7.7
[-- Attachment #2: Attached Message Part --]
[-- Type: text/plain, Size: 0 bytes --]
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-18 13:18 [PATCH] block: Make blk_drain_queue() work for stopped queues Bart Van Assche
@ 2012-03-18 15:57 ` Tejun Heo
2012-03-18 19:47 ` Bart Van Assche
0 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2012-03-18 15:57 UTC (permalink / raw)
To: Bart Van Assche; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi
On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote:
> All queued requests must be processed eventually. Hence make sure
> that blk_drain_queue() drains the queue even if the queue is in the
> stopped state. This patch makes it safe to invoke blk_cleanup_queue()
> on a stopped queue.
...
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 3a78b00..bdcec86 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue);
> */
> void __blk_run_queue(struct request_queue *q)
> {
> - if (unlikely(blk_queue_stopped(q)))
> - return;
> -
> - q->request_fn(q);
> + if (!blk_queue_stopped(q) || blk_queue_dead(q))
> + q->request_fn(q);
So, this allows calling request_fn for dead && stopped queue. Have
you seen something which requires this?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-18 15:57 ` Tejun Heo
@ 2012-03-18 19:47 ` Bart Van Assche
2012-03-19 7:26 ` Stanislaw Gruszka
2012-03-19 17:04 ` Tejun Heo
0 siblings, 2 replies; 15+ messages in thread
From: Bart Van Assche @ 2012-03-18 19:47 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi
On 03/18/12 15:57, Tejun Heo wrote:
> On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote:
>> All queued requests must be processed eventually. Hence make sure
>> that blk_drain_queue() drains the queue even if the queue is in the
>> stopped state. This patch makes it safe to invoke blk_cleanup_queue()
>> on a stopped queue.
> ...
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 3a78b00..bdcec86 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue);
>> */
>> void __blk_run_queue(struct request_queue *q)
>> {
>> - if (unlikely(blk_queue_stopped(q)))
>> - return;
>> -
>> - q->request_fn(q);
>> + if (!blk_queue_stopped(q) || blk_queue_dead(q))
>> + q->request_fn(q);
> So, this allows calling request_fn for dead && stopped queue. Have
> you seen something which requires this?
Not servicing queued SCSI requests can e.g. cause user space processes
to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence
commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending
SCSI commands to be killed just before blk_cleanup_queue() is invoked.
However, there is still a tiny race window left by that patch - new
requests can get queued after the SCSI request function has been invoked
by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence
the proposal to change the block layer to make sure that all queued
requests get processed eventually.
Bart.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-18 19:47 ` Bart Van Assche
@ 2012-03-19 7:26 ` Stanislaw Gruszka
2012-03-19 17:03 ` Bart Van Assche
2012-03-19 17:04 ` Tejun Heo
1 sibling, 1 reply; 15+ messages in thread
From: Stanislaw Gruszka @ 2012-03-19 7:26 UTC (permalink / raw)
To: Bart Van Assche; +Cc: Tejun Heo, Jens Axboe, linux-scsi
Hi Bart
On Sun, Mar 18, 2012 at 07:47:47PM +0000, Bart Van Assche wrote:
> On 03/18/12 15:57, Tejun Heo wrote:
> > On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote:
> >> All queued requests must be processed eventually. Hence make sure
> >> that blk_drain_queue() drains the queue even if the queue is in the
> >> stopped state. This patch makes it safe to invoke blk_cleanup_queue()
> >> on a stopped queue.
> > ...
> >> diff --git a/block/blk-core.c b/block/blk-core.c
> >> index 3a78b00..bdcec86 100644
> >> --- a/block/blk-core.c
> >> +++ b/block/blk-core.c
> >> @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue);
> >> */
> >> void __blk_run_queue(struct request_queue *q)
> >> {
> >> - if (unlikely(blk_queue_stopped(q)))
> >> - return;
> >> -
> >> - q->request_fn(q);
> >> + if (!blk_queue_stopped(q) || blk_queue_dead(q))
> >> + q->request_fn(q);
I'm not sure if that behaviour is correct, i.e. we can call q->request_fn(q)
if someone stoped queue, but if it is why not just call q->request_fn(q)
from blk_drain_queue() instead?
> > So, this allows calling request_fn for dead && stopped queue. Have
> > you seen something which requires this?
>
> Not servicing queued SCSI requests can e.g. cause user space processes
> to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence
> commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending
> SCSI commands to be killed just before blk_cleanup_queue() is invoked.
> However, there is still a tiny race window left by that patch - new
> requests can get queued after the SCSI request function has been invoked
> by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence
> the proposal to change the block layer to make sure that all queued
> requests get processed eventually.
That behaviour I can confirm using this script [1] running with usb
dongle. I applied this patch and second one:
http://marc.info/?l=linux-scsi&m=133207725114386&w=2
(BTW: second one patch is mangled). My impression is, that the script run
much longer before it finally hung at infinite loop in blk_drain_queue().
Thanks
Stanislaw
[1]
!#/bin/bash
DEV=/dev/sdb
ENABLE=/sys/bus/usb/devices/1-2/bConfigurationValue
function stop_me()
{
for i in `jobs -p` ; do kill $i 2> /dev/null ; done
exit
}
trap stop_me SIGHUP SIGINT SIGTERM
for ((i = 0; i < 10; i++)) ; do
while true; do fdisk -l $DEV 2>&1 > /dev/null ; done &
done
while true ; do
echo 1 > $ENABLE
sleep 1
echo 0 > $ENABLE
done
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-19 7:26 ` Stanislaw Gruszka
@ 2012-03-19 17:03 ` Bart Van Assche
[not found] ` <4F6766F0.1070805-HInyCGIudOg@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2012-03-19 17:03 UTC (permalink / raw)
To: Stanislaw Gruszka; +Cc: Tejun Heo, Jens Axboe, linux-scsi
On 03/19/12 07:26, Stanislaw Gruszka wrote:
> On Sun, Mar 18, 2012 at 07:47:47PM +0000, Bart Van Assche wrote:
>> On 03/18/12 15:57, Tejun Heo wrote:
>>> On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote:
>>>> All queued requests must be processed eventually. Hence make sure
>>>> that blk_drain_queue() drains the queue even if the queue is in the
>>>> stopped state. This patch makes it safe to invoke blk_cleanup_queue()
>>>> on a stopped queue.
>>> ...
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index 3a78b00..bdcec86 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue);
>>>> */
>>>> void __blk_run_queue(struct request_queue *q)
>>>> {
>>>> - if (unlikely(blk_queue_stopped(q)))
>>>> - return;
>>>> -
>>>> - q->request_fn(q);
>>>> + if (!blk_queue_stopped(q) || blk_queue_dead(q))
>>>> + q->request_fn(q);
> I'm not sure if that behaviour is correct, i.e. we can call q->request_fn(q)
> if someone stoped queue, but if it is why not just call q->request_fn(q)
> from blk_drain_queue() instead?
As far as I can see invoking q->request_fn(q) directly from
blk_drain_queue() would be a valid alternative.
>
>>> So, this allows calling request_fn for dead && stopped queue. Have
>>> you seen something which requires this?
>> Not servicing queued SCSI requests can e.g. cause user space processes
>> to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence
>> commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending
>> SCSI commands to be killed just before blk_cleanup_queue() is invoked.
>> However, there is still a tiny race window left by that patch - new
>> requests can get queued after the SCSI request function has been invoked
>> by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence
>> the proposal to change the block layer to make sure that all queued
>> requests get processed eventually.
> That behaviour I can confirm using this script [1] running with usb
> dongle. I applied this patch and second one:
> http://marc.info/?l=linux-scsi&m=133207725114386&w=2
> (BTW: second one patch is mangled). My impression is, that the script run
> much longer before it finally hung at infinite loop in blk_drain_queue().
I'm not an USB expert but I've had a quick look at
usb_stor_release_resources() in drivers/usb/storage/usb.c. As far as I
can see that function will only stop the usb_stor_control_thread() if
that thread has been scheduled after the last complete() call by the USB
queuecommand() function and before the complete() call in
usb_stor_release_resources() is executed. That looks like a race
condition to me.
Bart.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-18 19:47 ` Bart Van Assche
2012-03-19 7:26 ` Stanislaw Gruszka
@ 2012-03-19 17:04 ` Tejun Heo
2012-03-19 17:22 ` Bart Van Assche
` (2 more replies)
1 sibling, 3 replies; 15+ messages in thread
From: Tejun Heo @ 2012-03-19 17:04 UTC (permalink / raw)
To: Bart Van Assche; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi
Hello,
On Sun, Mar 18, 2012 at 07:47:47PM +0000, Bart Van Assche wrote:
> Not servicing queued SCSI requests can e.g. cause user space processes
> to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence
> commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending
> SCSI commands to be killed just before blk_cleanup_queue() is invoked.
> However,
Thanks for the pointer. It would be great if you can describe / link
the actual case the patch is trying to solve.
> there is still a tiny race window left by that patch - new
> requests can get queued after the SCSI request function has been invoked
> by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence
> the proposal to change the block layer to make sure that all queued
> requests get processed eventually.
I don't think it's a good idea to push requests out to stopped queue.
Wouldn't aborting all pending requests be better?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-19 17:04 ` Tejun Heo
@ 2012-03-19 17:22 ` Bart Van Assche
2012-03-20 20:04 ` Bart Van Assche
2012-03-20 20:06 ` Bart Van Assche
2 siblings, 0 replies; 15+ messages in thread
From: Bart Van Assche @ 2012-03-19 17:22 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi
On 03/19/12 17:04, Tejun Heo wrote:
> Thanks for the pointer. It would be great if you can describe / link
> the actual case the patch is trying to solve.
Another example is the sd scanning code (sd_probe_async()) which can
queue SCSI commands concurrently with the blk_cleanup_queue() call
called (indirectly) from scsi_remove_host().
>
>> there is still a tiny race window left by that patch - new
>> requests can get queued after the SCSI request function has been invoked
>> by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence
>> the proposal to change the block layer to make sure that all queued
>> requests get processed eventually.
> I don't think it's a good idea to push requests out to stopped queue.
> Wouldn't aborting all pending requests be better?
Sure, but is that possible from inside the block layer ? With patch
http://marc.info/?l=linux-scsi&m=133207725114386 applied, the following
code is present at the start of scsi_request_fn():
if (unlikely(blk_queue_dead(q))) {
while ((req = blk_peek_request(q)) != NULL)
scsi_kill_request(req, q);
Bart.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
[not found] ` <4F6766F0.1070805-HInyCGIudOg@public.gmane.org>
@ 2012-03-20 14:21 ` Stanislaw Gruszka
2012-03-20 14:31 ` Alan Stern
0 siblings, 1 reply; 15+ messages in thread
From: Stanislaw Gruszka @ 2012-03-20 14:21 UTC (permalink / raw)
To: Bart Van Assche
Cc: Tejun Heo, Jens Axboe, linux-scsi,
linux-usb-u79uwXL29TY76Z2rM5mHXA
On Mon, Mar 19, 2012 at 05:03:44PM +0000, Bart Van Assche wrote:
> On 03/19/12 07:26, Stanislaw Gruszka wrote:
> > On Sun, Mar 18, 2012 at 07:47:47PM +0000, Bart Van Assche wrote:
> >> On 03/18/12 15:57, Tejun Heo wrote:
> >>> On Sun, Mar 18, 2012 at 01:18:21PM +0000, Bart Van Assche wrote:
> >>>> All queued requests must be processed eventually. Hence make sure
> >>>> that blk_drain_queue() drains the queue even if the queue is in the
> >>>> stopped state. This patch makes it safe to invoke blk_cleanup_queue()
> >>>> on a stopped queue.
> >>> ...
> >>>> diff --git a/block/blk-core.c b/block/blk-core.c
> >>>> index 3a78b00..bdcec86 100644
> >>>> --- a/block/blk-core.c
> >>>> +++ b/block/blk-core.c
> >>>> @@ -300,10 +300,8 @@ EXPORT_SYMBOL(blk_sync_queue);
> >>>> */
> >>>> void __blk_run_queue(struct request_queue *q)
> >>>> {
> >>>> - if (unlikely(blk_queue_stopped(q)))
> >>>> - return;
> >>>> -
> >>>> - q->request_fn(q);
> >>>> + if (!blk_queue_stopped(q) || blk_queue_dead(q))
> >>>> + q->request_fn(q);
> > I'm not sure if that behaviour is correct, i.e. we can call q->request_fn(q)
> > if someone stoped queue, but if it is why not just call q->request_fn(q)
> > from blk_drain_queue() instead?
>
> As far as I can see invoking q->request_fn(q) directly from
> blk_drain_queue() would be a valid alternative.
>
> >
> >>> So, this allows calling request_fn for dead && stopped queue. Have
> >>> you seen something which requires this?
> >> Not servicing queued SCSI requests can e.g. cause user space processes
> >> to hang. See also http://lkml.org/lkml/2011/8/27/6 for an example. Hence
> >> commit 3308511c93e6ad0d3c58984ecd6e5e57f96b12c8 which causes pending
> >> SCSI commands to be killed just before blk_cleanup_queue() is invoked.
> >> However, there is still a tiny race window left by that patch - new
> >> requests can get queued after the SCSI request function has been invoked
> >> by scsi_free_queue() and before blk_cleanup_queue() gets invoked. Hence
> >> the proposal to change the block layer to make sure that all queued
> >> requests get processed eventually.
> > That behaviour I can confirm using this script [1] running with usb
> > dongle. I applied this patch and second one:
> > http://marc.info/?l=linux-scsi&m=133207725114386&w=2
> > (BTW: second one patch is mangled). My impression is, that the script run
> > much longer before it finally hung at infinite loop in blk_drain_queue().
>
> I'm not an USB expert but I've had a quick look at
> usb_stor_release_resources() in drivers/usb/storage/usb.c. As far as I
> can see that function will only stop the usb_stor_control_thread() if
> that thread has been scheduled after the last complete() call by the USB
> queuecommand() function and before the complete() call in
> usb_stor_release_resources() is executed. That looks like a race
> condition to me.
CCing linux-usb, perhaps usb developers would like to take a look.
Stanislaw
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-20 14:21 ` Stanislaw Gruszka
@ 2012-03-20 14:31 ` Alan Stern
0 siblings, 0 replies; 15+ messages in thread
From: Alan Stern @ 2012-03-20 14:31 UTC (permalink / raw)
To: Stanislaw Gruszka
Cc: Bart Van Assche, Tejun Heo, Jens Axboe, linux-scsi, linux-usb
On Tue, 20 Mar 2012, Stanislaw Gruszka wrote:
> On Mon, Mar 19, 2012 at 05:03:44PM +0000, Bart Van Assche wrote:
...
> > I'm not an USB expert but I've had a quick look at
> > usb_stor_release_resources() in drivers/usb/storage/usb.c. As far as I
> > can see that function will only stop the usb_stor_control_thread() if
> > that thread has been scheduled after the last complete() call by the USB
> > queuecommand() function and before the complete() call in
> > usb_stor_release_resources() is executed. That looks like a race
> > condition to me.
>
> CCing linux-usb, perhaps usb developers would like to take a look.
I don't understand what Bart wrote. Why is there a race condition and
why is it a problem?
Alan Stern
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-19 17:04 ` Tejun Heo
2012-03-19 17:22 ` Bart Van Assche
@ 2012-03-20 20:04 ` Bart Van Assche
2012-03-20 20:06 ` Bart Van Assche
2 siblings, 0 replies; 15+ messages in thread
From: Bart Van Assche @ 2012-03-20 20:04 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi
On 03/19/12 17:04, Tejun Heo wrote:
> I don't think it's a good idea to push requests out to stopped queue.
> Wouldn't aborting all pending requests be better? Thanks.
How about the (lightly tested) patch below ? It combines three separate
changes:
- Making blk_drain_queue() ignore the stopped state of a queue.
- Add struct request_queue.kill_all_requests_fn.
- Fix a null pointer dereference triggered by sd during device removal.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-19 17:04 ` Tejun Heo
2012-03-19 17:22 ` Bart Van Assche
2012-03-20 20:04 ` Bart Van Assche
@ 2012-03-20 20:06 ` Bart Van Assche
2012-03-20 21:01 ` Dan Williams
2 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2012-03-20 20:06 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jens Axboe, Stanislaw Gruszka, linux-scsi
On 03/19/12 17:04, Tejun Heo wrote:
> I don't think it's a good idea to push requests out to stopped queue.
> Wouldn't aborting all pending requests be better? Thanks.
How about the (lightly tested) patch below ? It combines three separate
changes:
- Making blk_drain_queue() ignore the stopped state of a queue.
- Add request_queue.kill_all_requests_fn.
- Fix a null pointer dereference triggered by sd during device removal.
Thanks,
Bart.
diff --git a/block/blk-core.c b/block/blk-core.c
index 3a78b00..9429f1b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -375,8 +375,12 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
* (e.g. fd) get unhappy in such cases. Kick queue iff
* dispatch queue has something on it.
*/
- if (!list_empty(&q->queue_head))
- __blk_run_queue(q);
+ if (!list_empty(&q->queue_head)) {
+ if (blk_queue_dead(q) && q->kill_all_requests_fn)
+ q->kill_all_requests_fn(q);
+ else
+ q->request_fn(q);
+ }
drain |= q->rq.elvpriv;
diff --git a/block/blk-settings.c b/block/blk-settings.c
index d3234fc..9ce3df6 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -191,6 +191,13 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
}
EXPORT_SYMBOL(blk_queue_make_request);
+void blk_queue_kill_all_requests(struct request_queue *q,
+ kill_all_requests_fn *kfn)
+{
+ q->kill_all_requests_fn = kfn;
+}
+EXPORT_SYMBOL(blk_queue_kill_all_requests);
+
/**
* blk_queue_bounce_limit - set bounce buffer limit for queue
* @q: the request queue for the device
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 351dc0b..5cf3a92 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -296,6 +296,12 @@ static void scsi_host_dev_release(struct device *dev)
destroy_workqueue(shost->work_q);
q = shost->uspace_req_q;
if (q) {
+ /*
+ * Note: freeing queuedata before invoking scsi_free_queue()
+ * is safe here because no request function is associated with
+ * uspace_req_q. See also the __scsi_alloc_queue() call in
+ * drivers/scsi/scsi_tgt_lib.c.
+ */
kfree(q->queuedata);
q->queuedata = NULL;
scsi_free_queue(q);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b2c95db..21ede38 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1468,6 +1468,14 @@ static void scsi_softirq_done(struct request *rq)
}
}
+static void scsi_kill_all_requests(struct request_queue *q)
+{
+ struct request *req;
+
+ while ((req = blk_peek_request(q)) != NULL)
+ scsi_kill_request(req, q);
+}
+
/*
* Function: scsi_request_fn()
*
@@ -1486,11 +1494,7 @@ static void scsi_request_fn(struct request_queue *q)
struct scsi_cmnd *cmd;
struct request *req;
- if (!sdev) {
- while ((req = blk_peek_request(q)) != NULL)
- scsi_kill_request(req, q);
- return;
- }
+ BUG_ON(!sdev);
if(!get_device(&sdev->sdev_gendev))
/* We must be tearing the block queue down already */
@@ -1686,6 +1690,7 @@ struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
if (!q)
return NULL;
+ blk_queue_kill_all_requests(q, scsi_kill_all_requests);
blk_queue_prep_rq(q, scsi_prep_fn);
blk_queue_softirq_done(q, scsi_softirq_done);
blk_queue_rq_timed_out(q, scsi_times_out);
@@ -1695,15 +1700,6 @@ struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
void scsi_free_queue(struct request_queue *q)
{
- unsigned long flags;
-
- WARN_ON(q->queuedata);
-
- /* cause scsi_request_fn() to kill all non-finished requests */
- spin_lock_irqsave(q->queue_lock, flags);
- q->request_fn(q);
- spin_unlock_irqrestore(q->queue_lock, flags);
-
blk_cleanup_queue(q);
}
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 04c2a27..65801e9 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -971,9 +971,6 @@ void __scsi_remove_device(struct scsi_device *sdev)
sdev->host->hostt->slave_destroy(sdev);
transport_destroy_device(dev);
- /* cause the request function to reject all I/O requests */
- sdev->request_queue->queuedata = NULL;
-
/* Freeing the queue signals to block that we're done */
scsi_free_queue(sdev->request_queue);
put_device(dev);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 606cf33..5a31e4c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -200,6 +200,7 @@ struct request_pm_state
typedef void (request_fn_proc) (struct request_queue *q);
typedef void (make_request_fn) (struct request_queue *q, struct bio *bio);
+typedef void (kill_all_requests_fn) (struct request_queue *q);
typedef int (prep_rq_fn) (struct request_queue *, struct request *);
typedef void (unprep_rq_fn) (struct request_queue *, struct request *);
@@ -282,6 +283,7 @@ struct request_queue {
request_fn_proc *request_fn;
make_request_fn *make_request_fn;
+ kill_all_requests_fn *kill_all_requests_fn;
prep_rq_fn *prep_rq_fn;
unprep_rq_fn *unprep_rq_fn;
merge_bvec_fn *merge_bvec_fn;
@@ -825,6 +827,8 @@ extern struct request_queue *blk_init_allocated_queue(struct request_queue *,
request_fn_proc *, spinlock_t *);
extern void blk_cleanup_queue(struct request_queue *);
extern void blk_queue_make_request(struct request_queue *, make_request_fn *);
+extern void blk_queue_kill_all_requests(struct request_queue *,
+ kill_all_requests_fn *);
extern void blk_queue_bounce_limit(struct request_queue *, u64);
extern void blk_limits_max_hw_sectors(struct queue_limits *, unsigned int);
extern void blk_queue_max_hw_sectors(struct request_queue *, unsigned int);
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-20 20:06 ` Bart Van Assche
@ 2012-03-20 21:01 ` Dan Williams
2012-03-21 3:37 ` Dan Williams
0 siblings, 1 reply; 15+ messages in thread
From: Dan Williams @ 2012-03-20 21:01 UTC (permalink / raw)
To: Bart Van Assche; +Cc: Tejun Heo, Jens Axboe, Stanislaw Gruszka, linux-scsi
On Tue, Mar 20, 2012 at 1:06 PM, Bart Van Assche <bvanassche@acm.org> wrote:
[..]
> - Fix a null pointer dereference triggered by sd during device removal.
Hi Bart,
Do you have a log of the backtrace in this case? I'm going to put
this patch into our libsas/isci test environment.
Thanks,
Dan
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-20 21:01 ` Dan Williams
@ 2012-03-21 3:37 ` Dan Williams
2012-03-21 18:35 ` Dan Williams
2012-03-24 18:49 ` Bart Van Assche
0 siblings, 2 replies; 15+ messages in thread
From: Dan Williams @ 2012-03-21 3:37 UTC (permalink / raw)
To: Bart Van Assche
Cc: Tejun Heo, Jens Axboe, Stanislaw Gruszka, linux-scsi,
Bartek Nowakowski, Jacek Danecki
On Tue, Mar 20, 2012 at 2:01 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Mar 20, 2012 at 1:06 PM, Bart Van Assche <bvanassche@acm.org> wrote:
> [..]
>> - Fix a null pointer dereference triggered by sd during device removal.
>
> Hi Bart,
>
> Do you have a log of the backtrace in this case? I'm going to put
> this patch into our libsas/isci test environment.
We beat on this patch pretty severely in our environment and appeared
to only trigger a hung_task timeout when our driver / libata took too
long to recovery for a 15 device unplug. So...
Acked-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Jacek Danecki <jacek.danecki@intel.com>
Tested-by: Bartek Nowakowski <bartek.nowakowski@intel.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-21 3:37 ` Dan Williams
@ 2012-03-21 18:35 ` Dan Williams
2012-03-24 18:49 ` Bart Van Assche
1 sibling, 0 replies; 15+ messages in thread
From: Dan Williams @ 2012-03-21 18:35 UTC (permalink / raw)
To: Bart Van Assche
Cc: Tejun Heo, Jens Axboe, Stanislaw Gruszka, linux-scsi,
Bartek Nowakowski, Jacek Danecki
On Tue, Mar 20, 2012 at 8:37 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Mar 20, 2012 at 2:01 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Tue, Mar 20, 2012 at 1:06 PM, Bart Van Assche <bvanassche@acm.org> wrote:
>> [..]
>>> - Fix a null pointer dereference triggered by sd during device removal.
>>
>> Hi Bart,
>>
>> Do you have a log of the backtrace in this case? I'm going to put
>> this patch into our libsas/isci test environment.
>
> We beat on this patch pretty severely in our environment and appeared
> to only trigger a hung_task timeout when our driver / libata took too
> long to recovery for a 15 device unplug. So...
>
> Acked-by: Dan Williams <dan.j.williams@intel.com>
> Tested-by: Jacek Danecki <jacek.danecki@intel.com>
> Tested-by: Bartek Nowakowski <bartek.nowakowski@intel.com>
So, we're still able to trigger what appears to be improper handling
of kblockd teardown even with this patch applied:
[ 1530.709739] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000008
[ 1530.719307] IP: [<ffffffff8104455a>] __queue_work+0x2e5/0x332
[ 1530.726151] PGD 0
[ 1530.728807] Oops: 0000 [#1] SMP
[ 1530.732860] CPU 5
[ 1530.734924] Modules linked in: isci libsas scsi_transport_sas ipv6
kvm uinput wmi sg iTCO_wdt iTCO_vendor_support i2c_i801 i2c_core igb
e1000e ioatdma dca sd_mod ahci libahci libata [last unloaded:
scsi_wait_scan]
[ 1530.758269]
[ 1530.760337] Pid: 0, comm: swapper/5 Not tainted
3.3.0-rc7-isci-183_test+ #12 Intel Corporation S2600CP/S2600CP
[ 1530.772353] RIP: 0010:[<ffffffff8104455a>] [<ffffffff8104455a>]
__queue_work+0x2e5/0x332
[ 1530.782294] RSP: 0018:ffff88043fd43bf0 EFLAGS: 00010046
[ 1530.788630] RAX: ffff8803fc7d2c30 RBX: ffff8803fbb6e040 RCX: 0000000000000000
[ 1530.797008] RDX: ffff8803fc7d2c38 RSI: ffff88043fd4d378 RDI: 0000000000000000
[ 1530.805384] RBP: ffff88043fd43c30 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000001
[ 1530.813757] R10: 0000000000000000 R11: ffff88043fd4d358 R12: 0000000000000005
[ 1530.822132] R13: ffff88043fd56c00 R14: ffff88043fd4d340 R15: 0000000000000086
[ 1530.830503] FS: 0000000000000000(0000) GS:ffff88043fd40000(0000)
knlGS:0000000000000000
[ 1530.840322] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1530.847136] CR2: 0000000000000008 CR3: 0000000001a05000 CR4: 00000000000406e0
[ 1530.855509] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1530.863882] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1530.872261] Process swapper/5 (pid: 0, threadinfo ffff880428cbe000,
task ffff880428cc4160)
[ 1530.882281] Stack:
[ 1530.884913] ffff880400000000 0000000000000000 ffff88040000024b
ffff8803fbb6dcd0
[ 1530.894018] ffff8803fb402bc0 ffff8803fb402bc0 0000000000000000
0000000000000001
[ 1530.903144] ffff88043fd43c40 ffffffff810445f0 ffff88043fd43c50
ffffffff81044745
[ 1530.912276] Call Trace:
[ 1530.915406] <IRQ>
[ 1530.918157] [<ffffffff810445f0>] queue_work_on+0x1b/0x22
[ 1530.924583] [<ffffffff81044745>] queue_work+0x1f/0x21
[ 1530.930745] [<ffffffff81247df1>] kblockd_schedule_work+0x15/0x17
[ 1530.937972] [<ffffffff812596dd>] cfq_schedule_dispatch+0x41/0x45
[ 1530.945177] [<ffffffff8125aacf>] cfq_completed_request+0x539/0x55d
[ 1530.952594] [<ffffffff812436c2>] elv_completed_request+0x4a/0x4c
[ 1530.959808] [<ffffffff812487dd>] __blk_put_request+0x38/0xba
[ 1530.966624] [<ffffffff81248a72>] blk_finish_request+0x213/0x21f
[ 1530.973740] [<ffffffff812496ec>] blk_end_bidi_request+0x42/0x5d
[ 1530.980851] [<ffffffff81249743>] blk_end_request+0x10/0x12
[ 1530.987473] [<ffffffff81249781>] blk_end_request_err+0x3c/0x41
[ 1530.994507] [<ffffffff8132b244>] scsi_io_completion+0x46d/0x4cf
[ 1531.001661] [<ffffffff81323e72>] scsi_finish_command+0xec/0xf5
[ 1531.008672] [<ffffffff8132b3ba>] scsi_softirq_done+0xff/0x108
[ 1531.015600] [<ffffffff8124ea08>] blk_done_softirq+0x84/0x98
[ 1531.022344] [<ffffffff81032786>] __do_softirq+0xe8/0x1e5
[ 1531.028772] [<ffffffff8148c014>] ? _raw_spin_unlock+0x2b/0x2f
[ 1531.035687] [<ffffffff814941ec>] call_softirq+0x1c/0x26
[ 1531.042023] [<ffffffff81003bdb>] do_softirq+0x4b/0xa3
[ 1531.048175] [<ffffffff810324a8>] irq_exit+0x53/0xc8
[ 1531.054127] [<ffffffff81494295>] do_IRQ+0x9d/0xb4
[ 1531.059882] [<ffffffff8148c1f3>] common_interrupt+0x73/0x73
[ 1531.066620] <EOI>
[ 1531.069407] [<ffffffff812d0ef2>] ? acpi_idle_enter_simple+0xee/0x12a
[ 1531.077010] [<ffffffff812d0eee>] ? acpi_idle_enter_simple+0xea/0x12a
[ 1531.084609] [<ffffffff8148f695>] ? notifier_call_chain+0x63/0x63
[ 1531.091835] [<ffffffff81397058>] cpuidle_idle_call+0x10b/0x1c7
[ 1531.098854] [<ffffffff81001e73>] cpu_idle+0xc0/0x10d
[ 1531.104911] [<ffffffff8148585b>] start_secondary+0x277/0x279
[ 1531.111728] Code: 54 49 8b 45 08 49 8d 76 38 f6 00 10 75 05 48 89
f2 eb 40 49 8b 46 38 eb 1f 4c 8b 00 31 ff 41 f6 c0 04 74 07 4c 89 c7
40 80 e7 00 <48> 8b 7f 08 f6 07 10 74 11 48 8b 40 08 48 83 e8 08 48 8d
50 08
[ 1531.135084] RIP [<ffffffff8104455a>] __queue_work+0x2e5/0x332
[ 1531.142015] RSP <ffff88043fd43bf0>
[ 1531.146301] CR2: 0000000000000008
[ 1531.150968] ---[ end trace 0711d1e925c5e6b1 ]---
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] block: Make blk_drain_queue() work for stopped queues
2012-03-21 3:37 ` Dan Williams
2012-03-21 18:35 ` Dan Williams
@ 2012-03-24 18:49 ` Bart Van Assche
1 sibling, 0 replies; 15+ messages in thread
From: Bart Van Assche @ 2012-03-24 18:49 UTC (permalink / raw)
To: Dan Williams
Cc: Tejun Heo, Jens Axboe, Stanislaw Gruszka, linux-scsi,
Bartek Nowakowski, Jacek Danecki
On 03/21/12 03:37, Dan Williams wrote:
> On Tue, Mar 20, 2012 at 2:01 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Tue, Mar 20, 2012 at 1:06 PM, Bart Van Assche <bvanassche@acm.org> wrote:
>> [..]
>>> - Fix a null pointer dereference triggered by sd during device removal.
>> Hi Bart,
>>
>> Do you have a log of the backtrace in this case? I'm going to put
>> this patch into our libsas/isci test environment.
> We beat on this patch pretty severely in our environment and appeared
> to only trigger a hung_task timeout when our driver / libata took too
> long to recovery for a 15 device unplug.
Thanks for testing - that's appreciated.
The null pointer dereference triggered during device removal was
originally reported by Jun'ichi Nomura. A call stack can be found here:
http://www.spinics.net/lists/linux-scsi/msg56254.html.
Regarding invoking blk_cleanup_queue() on a stopped queue: some code I
was testing could trigger this. But as far as I can see both the fc and
iSCSI transport layer code take care to unblock a queue before
destroying it, so these transports are not affected. There are other
(non-SCSI) block drivers though that can stop and restart the queue. I
haven't analyzed all of them. So I'm not sure there is currently any
upstream code that invokes blk_cleanup_queue() on a stopped queue.
Bart.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-03-24 18:49 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-18 13:18 [PATCH] block: Make blk_drain_queue() work for stopped queues Bart Van Assche
2012-03-18 15:57 ` Tejun Heo
2012-03-18 19:47 ` Bart Van Assche
2012-03-19 7:26 ` Stanislaw Gruszka
2012-03-19 17:03 ` Bart Van Assche
[not found] ` <4F6766F0.1070805-HInyCGIudOg@public.gmane.org>
2012-03-20 14:21 ` Stanislaw Gruszka
2012-03-20 14:31 ` Alan Stern
2012-03-19 17:04 ` Tejun Heo
2012-03-19 17:22 ` Bart Van Assche
2012-03-20 20:04 ` Bart Van Assche
2012-03-20 20:06 ` Bart Van Assche
2012-03-20 21:01 ` Dan Williams
2012-03-21 3:37 ` Dan Williams
2012-03-21 18:35 ` Dan Williams
2012-03-24 18:49 ` Bart Van Assche
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).