WARNING in block layer triggered in 3.17-rc3

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* WARNING in block layer triggered in 3.17-rc3
@ 2014-09-05 17:45 Alan Stern
  2014-09-07 10:43 ` Ming Lei
  2014-09-07 22:59 ` Shirish Pargaonkar
  0 siblings, 2 replies; 14+ messages in thread
From: Alan Stern @ 2014-09-05 17:45 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe
  Cc: SCSI development list, Kernel development list

James and Jens:

I got a WARNING when unbinding the sd driver from a USB flash drive and
then binding it back again.  Here's where the flash drive gets probed 
initially:

[  143.300886] usb-storage 4-8:1.0: usb_probe_interface
[  143.300911] usb-storage 4-8:1.0: usb_probe_interface - got id
[  143.300930] usb-storage 4-8:1.0: USB Mass Storage device detected
[  143.318239] scsi host0: usb-storage 4-8:1.0
[  143.359979] scsi 0:0:0:0: Direct-Access     Ut165    USB2FlashStorage 0.00 PQ: 0 ANSI: 2
[  143.376366] usbcore: registered new interface driver usb-storage
[  143.468464] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 GB/3.76 GiB)
[  143.481725] sd 0:0:0:0: [sda] Write Protect is off
[  143.485712] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[  143.487064] sd 0:0:0:0: [sda] Asking for cache data failed
[  143.498428] sd 0:0:0:0: [sda] Assuming drive cache: write through
[  143.656797]  sda: sda1
[  143.676922] sd 0:0:0:0: [sda] Attached SCSI removable disk

Then I did

	echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/unbind

followed by

	echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/bind

which resulted in:

[  165.079557] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 GB/3.76 GiB)
[  165.093510] sd 0:0:0:0: [sda] Write Protect is off
[  165.104388] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[  165.105632] sd 0:0:0:0: [sda] Asking for cache data failed
[  165.115136] sd 0:0:0:0: [sda] Assuming drive cache: write through
[  165.142950]  sda: sda1
[  165.156480] ------------[ cut here ]------------
[  165.159912] WARNING: CPU: 0 PID: 29 at block/blk-core.c:473 blk_queue_bypass_end+0x4d/0x62()
[  165.160030] Modules linked in: sd_mod usb_storage scsi_mod hid_generic usbhid hid pcspkr evdev i915 cfbfillrect cfbimgblt i2c_algo_bit cfbcopyarea video fbcon backlight bitblit softcursor font ehci_pci drm_kms_helper uhci_hcd ehci_hcd ohci_pci ohci_hcd drm i2ccore usbcore e100 mii usb_common fb fbdev fan processor button thermal_sys
[  165.160030] CPU: 0 PID: 29 Comm: kworker/u4:1 Not tainted 3.17.0-rc3AS-dirty #12
[  165.160030] Hardware name: Hewlett-Packard HP dx2000 MT (EE004AA)/08FCh, BIOS 1.17    11/24/2005
[  165.160030] Workqueue: events_unbound async_run_entry_fn
[  165.160030]  c105d2ef 00000000 ea569e10 c12771c0 00000000 ea569e28 c102c5fb c114907d
[  165.160030]  ecc18000 ea619c20 ecc18000 ea569e38 c102c676 00000009 00000000 ea569e44
[  165.160030]  c114907d ea619c20 ea569e5c c114b97d ea569e5c ea619c20 ed74e400 ea619c2c
[  165.160030] Call Trace:
[  165.160030]  [<c105d2ef>] ? console_unlock+0x37e/0x3ab
[  165.160030]  [<c12771c0>] dump_stack+0x49/0x73
[  165.160030]  [<c102c5fb>] warn_slowpath_common+0x5c/0x73
[  165.160030]  [<c114907d>] ? blk_queue_bypass_end+0x4d/0x62
[  165.160030]  [<c102c676>] warn_slowpath_null+0xf/0x13
[  165.160030]  [<c114907d>] blk_queue_bypass_end+0x4d/0x62
[  165.160030]  [<c114b97d>] blk_register_queue+0x8f/0xc4
[  165.160030]  [<c11548a6>] add_disk+0x2bc/0x3a8
[  165.160030]  [<efff40a5>] sd_probe_async+0xf5/0x17b [sd_mod]
[  165.160030]  [<c103fc08>] async_run_entry_fn+0x59/0xf9
[  165.160030]  [<c103ab22>] process_one_work+0x187/0x2ac
[  165.160030]  [<c103aac4>] ? process_one_work+0x129/0x2ac
[  165.160030]  [<c103ae19>] worker_thread+0x1b1/0x26b
[  165.160030]  [<c103ac68>] ? process_scheduled_works+0x21/0x21
[  165.160030]  [<c103e4ee>] kthread+0x82/0x87
[  165.160030]  [<c127b074>] ? _raw_spin_unlock_irq+0x22/0x3f
[  165.160030]  [<c1160000>] ? radix_tree_tag_set+0x3f/0xa5
[  165.160030]  [<c127b781>] ret_from_kernel_thread+0x21/0x30
[  165.160030]  [<c103e46c>] ? __kthread_parkme+0x50/0x50
[  165.160030] ---[ end trace 31df765b6ea80892 ]---
[  165.308986] sd 0:0:0:0: [sda] Attached SCSI removable disk

I don't know what's going on here, but it looks like something doesn't 
get cleaned up properly during the unbind operation.

This was in vanilla 3.17-rc3.

Alan Stern


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-05 17:45 WARNING in block layer triggered in 3.17-rc3 Alan Stern
@ 2014-09-07 10:43 ` Ming Lei
  2014-09-07 22:59 ` Shirish Pargaonkar
  1 sibling, 0 replies; 14+ messages in thread
From: Ming Lei @ 2014-09-07 10:43 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Jens Axboe, SCSI development list,
	Kernel development list

On Sat, Sep 6, 2014 at 1:45 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> James and Jens:
>
> I got a WARNING when unbinding the sd driver from a USB flash drive and
> then binding it back again.  Here's where the flash drive gets probed
> initially:
>
> [  143.300886] usb-storage 4-8:1.0: usb_probe_interface
> [  143.300911] usb-storage 4-8:1.0: usb_probe_interface - got id
> [  143.300930] usb-storage 4-8:1.0: USB Mass Storage device detected
> [  143.318239] scsi host0: usb-storage 4-8:1.0
> [  143.359979] scsi 0:0:0:0: Direct-Access     Ut165    USB2FlashStorage 0.00 PQ: 0 ANSI: 2
> [  143.376366] usbcore: registered new interface driver usb-storage
> [  143.468464] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 GB/3.76 GiB)
> [  143.481725] sd 0:0:0:0: [sda] Write Protect is off
> [  143.485712] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
> [  143.487064] sd 0:0:0:0: [sda] Asking for cache data failed
> [  143.498428] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [  143.656797]  sda: sda1
> [  143.676922] sd 0:0:0:0: [sda] Attached SCSI removable disk
>
> Then I did
>
>         echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/unbind
>
> followed by
>
>         echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/bind

I can reproduce it on virtio-blk too with scsi-mq, and can't duplicate
it on non-scsi devices, so looks there are refcounts leak in scsi subsystem?


Thanks,
-- 
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-05 17:45 WARNING in block layer triggered in 3.17-rc3 Alan Stern
  2014-09-07 10:43 ` Ming Lei
@ 2014-09-07 22:59 ` Shirish Pargaonkar
  2014-09-08 14:51   ` Alan Stern
  1 sibling, 1 reply; 14+ messages in thread
From: Shirish Pargaonkar @ 2014-09-07 22:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Jens Axboe, SCSI development list,
	Kernel development list

I think the problem is, when a gendisk is detached, its request queue
is not put in bypass mode
cause when it is re-attached, code tries to put it out of bypass mode,
hence the warning.

So either of these should work, I have not tested it, just coded it up.

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 4db5abf..f94c7ba 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -600,6 +600,8 @@ void blk_unregister_queue(struct gendisk *disk)
  if (q->request_fn)
  elv_unregister_queue(q);

+ blk_queue_bypss_start(q);
+
  kobject_uevent(&q->kobj, KOBJ_REMOVE);
  kobject_del(&q->kobj);
  blk_trace_remove_sysfs(disk_to_dev(disk));





diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2c2041c..1c7ef55f 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3054,15 +3054,26 @@ static int sd_probe(struct device *dev)
 static int sd_remove(struct device *dev)
 {
  struct scsi_disk *sdkp;
+ struct request_queue *q;
+ struct scsi_device *sdp;
  dev_t devt;

  sdkp = dev_get_drvdata(dev);
+ q = sdkp->disk->queue;
  devt = disk_devt(sdkp->disk);
  scsi_autopm_get_device(sdkp->device);

  async_synchronize_full_domain(&scsi_sd_pm_domain);
  async_synchronize_full_domain(&scsi_sd_probe_domain);
  device_del(&sdkp->dev);
+
+ if (q) {
+ spin_lock_irq(q->queue_lock);
+ q->bypass_depth++;
+ queue_flag_set(QUEUE_FLAG_BYPASS, q);
+ spin_unlock_irq(q->queue_lock);
+ }
+
  del_gendisk(sdkp->disk);
  sd_shutdown(dev);



On Fri, Sep 5, 2014 at 12:45 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> James and Jens:
>
> I got a WARNING when unbinding the sd driver from a USB flash drive and
> then binding it back again.  Here's where the flash drive gets probed
> initially:
>
> [  143.300886] usb-storage 4-8:1.0: usb_probe_interface
> [  143.300911] usb-storage 4-8:1.0: usb_probe_interface - got id
> [  143.300930] usb-storage 4-8:1.0: USB Mass Storage device detected
> [  143.318239] scsi host0: usb-storage 4-8:1.0
> [  143.359979] scsi 0:0:0:0: Direct-Access     Ut165    USB2FlashStorage 0.00 PQ: 0 ANSI: 2
> [  143.376366] usbcore: registered new interface driver usb-storage
> [  143.468464] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 GB/3.76 GiB)
> [  143.481725] sd 0:0:0:0: [sda] Write Protect is off
> [  143.485712] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
> [  143.487064] sd 0:0:0:0: [sda] Asking for cache data failed
> [  143.498428] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [  143.656797]  sda: sda1
> [  143.676922] sd 0:0:0:0: [sda] Attached SCSI removable disk
>
> Then I did
>
>         echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/unbind
>
> followed by
>
>         echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/bind
>
> which resulted in:
>
> [  165.079557] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 GB/3.76 GiB)
> [  165.093510] sd 0:0:0:0: [sda] Write Protect is off
> [  165.104388] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
> [  165.105632] sd 0:0:0:0: [sda] Asking for cache data failed
> [  165.115136] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [  165.142950]  sda: sda1
> [  165.156480] ------------[ cut here ]------------
> [  165.159912] WARNING: CPU: 0 PID: 29 at block/blk-core.c:473 blk_queue_bypass_end+0x4d/0x62()
> [  165.160030] Modules linked in: sd_mod usb_storage scsi_mod hid_generic usbhid hid pcspkr evdev i915 cfbfillrect cfbimgblt i2c_algo_bit cfbcopyarea video fbcon backlight bitblit softcursor font ehci_pci drm_kms_helper uhci_hcd ehci_hcd ohci_pci ohci_hcd drm i2ccore usbcore e100 mii usb_common fb fbdev fan processor button thermal_sys
> [  165.160030] CPU: 0 PID: 29 Comm: kworker/u4:1 Not tainted 3.17.0-rc3AS-dirty #12
> [  165.160030] Hardware name: Hewlett-Packard HP dx2000 MT (EE004AA)/08FCh, BIOS 1.17    11/24/2005
> [  165.160030] Workqueue: events_unbound async_run_entry_fn
> [  165.160030]  c105d2ef 00000000 ea569e10 c12771c0 00000000 ea569e28 c102c5fb c114907d
> [  165.160030]  ecc18000 ea619c20 ecc18000 ea569e38 c102c676 00000009 00000000 ea569e44
> [  165.160030]  c114907d ea619c20 ea569e5c c114b97d ea569e5c ea619c20 ed74e400 ea619c2c
> [  165.160030] Call Trace:
> [  165.160030]  [<c105d2ef>] ? console_unlock+0x37e/0x3ab
> [  165.160030]  [<c12771c0>] dump_stack+0x49/0x73
> [  165.160030]  [<c102c5fb>] warn_slowpath_common+0x5c/0x73
> [  165.160030]  [<c114907d>] ? blk_queue_bypass_end+0x4d/0x62
> [  165.160030]  [<c102c676>] warn_slowpath_null+0xf/0x13
> [  165.160030]  [<c114907d>] blk_queue_bypass_end+0x4d/0x62
> [  165.160030]  [<c114b97d>] blk_register_queue+0x8f/0xc4
> [  165.160030]  [<c11548a6>] add_disk+0x2bc/0x3a8
> [  165.160030]  [<efff40a5>] sd_probe_async+0xf5/0x17b [sd_mod]
> [  165.160030]  [<c103fc08>] async_run_entry_fn+0x59/0xf9
> [  165.160030]  [<c103ab22>] process_one_work+0x187/0x2ac
> [  165.160030]  [<c103aac4>] ? process_one_work+0x129/0x2ac
> [  165.160030]  [<c103ae19>] worker_thread+0x1b1/0x26b
> [  165.160030]  [<c103ac68>] ? process_scheduled_works+0x21/0x21
> [  165.160030]  [<c103e4ee>] kthread+0x82/0x87
> [  165.160030]  [<c127b074>] ? _raw_spin_unlock_irq+0x22/0x3f
> [  165.160030]  [<c1160000>] ? radix_tree_tag_set+0x3f/0xa5
> [  165.160030]  [<c127b781>] ret_from_kernel_thread+0x21/0x30
> [  165.160030]  [<c103e46c>] ? __kthread_parkme+0x50/0x50
> [  165.160030] ---[ end trace 31df765b6ea80892 ]---
> [  165.308986] sd 0:0:0:0: [sda] Attached SCSI removable disk
>
> I don't know what's going on here, but it looks like something doesn't
> get cleaned up properly during the unbind operation.
>
> This was in vanilla 3.17-rc3.
>
> Alan Stern
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-07 22:59 ` Shirish Pargaonkar
@ 2014-09-08 14:51   ` Alan Stern
  2014-09-08 15:05     ` Shirish Pargaonkar
  2014-09-08 15:51     ` James Bottomley
  0 siblings, 2 replies; 14+ messages in thread
From: Alan Stern @ 2014-09-08 14:51 UTC (permalink / raw)
  To: Shirish Pargaonkar
  Cc: James Bottomley, Jens Axboe, SCSI development list,
	Kernel development list

On Sun, 7 Sep 2014, Shirish Pargaonkar wrote:

> I think the problem is, when a gendisk is detached, its request queue
> is not put in bypass mode
> cause when it is re-attached, code tries to put it out of bypass mode,
> hence the warning.
> 
> So either of these should work, I have not tested it, just coded it up.

I'm pretty sure that both of your solutions are wrong.

Jens and James, it appears the problem is in blk_register_queue().  The 
code does this:

	/*
	 * Initialization must be complete by now.  Finish the initial
	 * bypass from queue allocation.
	 */
	queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
	blk_queue_bypass_end(q);

This doesn't work well if the queue is unregistered later and then
registered again -- which is what happens when the sd driver is unbound
from a device and then bound again.  It looks like the code should be:

	if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
		blk_queue_bypass_end(q);

Do you agree?  If so, I'll send in patch.

Alan Stern


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-08 14:51   ` Alan Stern
@ 2014-09-08 15:05     ` Shirish Pargaonkar
  2014-09-08 15:51     ` James Bottomley
  1 sibling, 0 replies; 14+ messages in thread
From: Shirish Pargaonkar @ 2014-09-08 15:05 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Jens Axboe, SCSI development list,
	Kernel development list

So should a request queue be in bypass mode when the device is being detached
and queue is being unregistereed because requests can get queued up?


On Mon, Sep 8, 2014 at 9:51 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Sun, 7 Sep 2014, Shirish Pargaonkar wrote:
>
>> I think the problem is, when a gendisk is detached, its request queue
>> is not put in bypass mode
>> cause when it is re-attached, code tries to put it out of bypass mode,
>> hence the warning.
>>
>> So either of these should work, I have not tested it, just coded it up.
>
> I'm pretty sure that both of your solutions are wrong.
>
> Jens and James, it appears the problem is in blk_register_queue().  The
> code does this:
>
>         /*
>          * Initialization must be complete by now.  Finish the initial
>          * bypass from queue allocation.
>          */
>         queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
>         blk_queue_bypass_end(q);
>
> This doesn't work well if the queue is unregistered later and then
> registered again -- which is what happens when the sd driver is unbound
> from a device and then bound again.  It looks like the code should be:
>
>         if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
>                 blk_queue_bypass_end(q);
>
> Do you agree?  If so, I'll send in patch.
>
> Alan Stern
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-08 14:51   ` Alan Stern
  2014-09-08 15:05     ` Shirish Pargaonkar
@ 2014-09-08 15:51     ` James Bottomley
  2014-09-08 16:11       ` Shirish Pargaonkar
  2014-09-08 17:42       ` Alan Stern
  1 sibling, 2 replies; 14+ messages in thread
From: James Bottomley @ 2014-09-08 15:51 UTC (permalink / raw)
  To: Alan Stern, Tejun Heo
  Cc: Shirish Pargaonkar, Jens Axboe, SCSI development list,
	Kernel development list

On Mon, 2014-09-08 at 10:51 -0400, Alan Stern wrote:
> On Sun, 7 Sep 2014, Shirish Pargaonkar wrote:
> 
> > I think the problem is, when a gendisk is detached, its request queue
> > is not put in bypass mode
> > cause when it is re-attached, code tries to put it out of bypass mode,
> > hence the warning.
> > 
> > So either of these should work, I have not tested it, just coded it up.
> 
> I'm pretty sure that both of your solutions are wrong.
> 
> Jens and James, it appears the problem is in blk_register_queue().  The 
> code does this:
> 
> 	/*
> 	 * Initialization must be complete by now.  Finish the initial
> 	 * bypass from queue allocation.
> 	 */
> 	queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
> 	blk_queue_bypass_end(q);
> 
> This doesn't work well if the queue is unregistered later and then
> registered again -- which is what happens when the sd driver is unbound
> from a device and then bound again.  It looks like the code should be:
> 
> 	if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
> 		blk_queue_bypass_end(q);
> 
> Do you agree?  If so, I'll send in patch.

This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
be unset on blk_unregister_queue() to match the teardown; it's only
accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
lot of queue stuff down.  However, the problem looks to be the mismatch
in assumptions.  The way SCSI binding works, the queue belongs to the
underlying device so we always assumed we could add and remove upper
drivers ... there's even a case for this if you don't want a disk but
want to attach sg instead.  However, it's not the common use case.

The block model now seems to tie a lot of queue set up and teardown to
add and remove of the gendisk which is counter to these assumptions.  As
long as we can go from del->add without calling the ->release function
on the queue, everything works.  Most of the operations seem
symmetrical, so perhaps this is only the bypass doing too much.

The ideal is that disk teardown only does as much as disk setup, so the
mid layer can still use the underlying queue on the device.

This bypass code is not very well documented.  However, your problem
seems to be caused by this change:

commit 776687bce42bb22cce48b5da950e48ebbb9a948f
Author: Tejun Heo <tj@kernel.org>
Date:   Tue Jul 1 10:29:17 2014 -0600

    block, blk-mq: draining can't be skipped even if bypass_depth was
non-zero

Your hack seems to indicate that this doesn't work on the add->del->add
transtion of a gendisk.

James

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-08 15:51     ` James Bottomley
@ 2014-09-08 16:11       ` Shirish Pargaonkar
  2014-09-08 17:42       ` Alan Stern
  1 sibling, 0 replies; 14+ messages in thread
From: Shirish Pargaonkar @ 2014-09-08 16:11 UTC (permalink / raw)
  To: James Bottomley
  Cc: Alan Stern, Tejun Heo, Jens Axboe, SCSI development list,
	Kernel development list

see this issue/warning in 3.12 also.

On Mon, Sep 8, 2014 at 10:51 AM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Mon, 2014-09-08 at 10:51 -0400, Alan Stern wrote:
>> On Sun, 7 Sep 2014, Shirish Pargaonkar wrote:
>>
>> > I think the problem is, when a gendisk is detached, its request queue
>> > is not put in bypass mode
>> > cause when it is re-attached, code tries to put it out of bypass mode,
>> > hence the warning.
>> >
>> > So either of these should work, I have not tested it, just coded it up.
>>
>> I'm pretty sure that both of your solutions are wrong.
>>
>> Jens and James, it appears the problem is in blk_register_queue().  The
>> code does this:
>>
>>       /*
>>        * Initialization must be complete by now.  Finish the initial
>>        * bypass from queue allocation.
>>        */
>>       queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
>>       blk_queue_bypass_end(q);
>>
>> This doesn't work well if the queue is unregistered later and then
>> registered again -- which is what happens when the sd driver is unbound
>> from a device and then bound again.  It looks like the code should be:
>>
>>       if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
>>               blk_queue_bypass_end(q);
>>
>> Do you agree?  If so, I'll send in patch.
>
> This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
> be unset on blk_unregister_queue() to match the teardown; it's only
> accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
> lot of queue stuff down.  However, the problem looks to be the mismatch
> in assumptions.  The way SCSI binding works, the queue belongs to the
> underlying device so we always assumed we could add and remove upper
> drivers ... there's even a case for this if you don't want a disk but
> want to attach sg instead.  However, it's not the common use case.
>
> The block model now seems to tie a lot of queue set up and teardown to
> add and remove of the gendisk which is counter to these assumptions.  As
> long as we can go from del->add without calling the ->release function
> on the queue, everything works.  Most of the operations seem
> symmetrical, so perhaps this is only the bypass doing too much.
>
> The ideal is that disk teardown only does as much as disk setup, so the
> mid layer can still use the underlying queue on the device.
>
> This bypass code is not very well documented.  However, your problem
> seems to be caused by this change:
>
> commit 776687bce42bb22cce48b5da950e48ebbb9a948f
> Author: Tejun Heo <tj@kernel.org>
> Date:   Tue Jul 1 10:29:17 2014 -0600
>
>     block, blk-mq: draining can't be skipped even if bypass_depth was
> non-zero
>
> Your hack seems to indicate that this doesn't work on the add->del->add
> transtion of a gendisk.
>
> James
>
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-08 15:51     ` James Bottomley
  2014-09-08 16:11       ` Shirish Pargaonkar
@ 2014-09-08 17:42       ` Alan Stern
  2014-09-09  1:19         ` Tejun Heo
  1 sibling, 1 reply; 14+ messages in thread
From: Alan Stern @ 2014-09-08 17:42 UTC (permalink / raw)
  To: James Bottomley
  Cc: Tejun Heo, Shirish Pargaonkar, Jens Axboe, SCSI development list,
	Kernel development list

On Mon, 8 Sep 2014, James Bottomley wrote:

> > Jens and James, it appears the problem is in blk_register_queue().  The 
> > code does this:
> > 
> > 	/*
> > 	 * Initialization must be complete by now.  Finish the initial
> > 	 * bypass from queue allocation.
> > 	 */
> > 	queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
> > 	blk_queue_bypass_end(q);
> > 
> > This doesn't work well if the queue is unregistered later and then
> > registered again -- which is what happens when the sd driver is unbound
> > from a device and then bound again.  It looks like the code should be:
> > 
> > 	if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
> > 		blk_queue_bypass_end(q);
> > 
> > Do you agree?  If so, I'll send in patch.
> 
> This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
> be unset on blk_unregister_queue() to match the teardown; it's only
> accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
> lot of queue stuff down.

It's not clear what the operative assumptions are.  The comment in
blk_register_queue() implies that bypass is active only because it was
set up that way when the queue was created.  The fact that
blk_unregister_queue() doesn't call blk_queue_bypass_start() seems to
support this view -- although it could also be a simple oversight.

Hopefully Tejun can clear this iup.

>  However, the problem looks to be the mismatch
> in assumptions.  The way SCSI binding works, the queue belongs to the
> underlying device so we always assumed we could add and remove upper
> drivers ... there's even a case for this if you don't want a disk but
> want to attach sg instead.  However, it's not the common use case.
> 
> The block model now seems to tie a lot of queue set up and teardown to
> add and remove of the gendisk which is counter to these assumptions.  As
> long as we can go from del->add without calling the ->release function
> on the queue, everything works.  Most of the operations seem
> symmetrical, so perhaps this is only the bypass doing too much.
> 
> The ideal is that disk teardown only does as much as disk setup, so the
> mid layer can still use the underlying queue on the device.
> 
> This bypass code is not very well documented.  However, your problem
> seems to be caused by this change:
> 
> commit 776687bce42bb22cce48b5da950e48ebbb9a948f
> Author: Tejun Heo <tj@kernel.org>
> Date:   Tue Jul 1 10:29:17 2014 -0600
> 
>     block, blk-mq: draining can't be skipped even if bypass_depth was
> non-zero
> 
> Your hack seems to indicate that this doesn't work on the add->del->add
> transtion of a gendisk.

Indeed, it does not work.

Tejun, for more details about the failure see the initial message in 
this thread:

	http://marc.info/?l=linux-kernel&m=140993911413862&w=2

Alan Stern


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-08 17:42       ` Alan Stern
@ 2014-09-09  1:19         ` Tejun Heo
  2014-09-09 15:08           ` Alan Stern
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2014-09-09  1:19 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Shirish Pargaonkar, Jens Axboe,
	SCSI development list, Kernel development list

Hello,

On Mon, Sep 08, 2014 at 01:42:44PM -0400, Alan Stern wrote:
> > This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
> > be unset on blk_unregister_queue() to match the teardown; it's only
> > accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
> > lot of queue stuff down.
> 
> It's not clear what the operative assumptions are.  The comment in
> blk_register_queue() implies that bypass is active only because it was
> set up that way when the queue was created.  The fact that
> blk_unregister_queue() doesn't call blk_queue_bypass_start() seems to
> support this view -- although it could also be a simple oversight.
> 
> Hopefully Tejun can clear this iup.

Maintaining the initial bypass till queue registration is an
optimization because shutting down a fully functional queue is a
costly operation and there are drivers which set and destroy queues
repeatedly while probing, so, yeah, it's really a special case for
when the queue is being registered for the first time.

> > Your hack seems to indicate that this doesn't work on the add->del->add
> > transtion of a gendisk.
> 
> Indeed, it does not work.

As such, the change you suggested makes perfect sense to me.  Why
wouldn't it work?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-09  1:19         ` Tejun Heo
@ 2014-09-09 15:08           ` Alan Stern
  2014-09-09 15:17             ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Stern @ 2014-09-09 15:08 UTC (permalink / raw)
  To: Tejun Heo
  Cc: James Bottomley, Shirish Pargaonkar, Jens Axboe,
	SCSI development list, Kernel development list

On Tue, 9 Sep 2014, Tejun Heo wrote:

> Hello,
> 
> On Mon, Sep 08, 2014 at 01:42:44PM -0400, Alan Stern wrote:
> > > This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
> > > be unset on blk_unregister_queue() to match the teardown; it's only
> > > accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
> > > lot of queue stuff down.
> > 
> > It's not clear what the operative assumptions are.  The comment in
> > blk_register_queue() implies that bypass is active only because it was
> > set up that way when the queue was created.  The fact that
> > blk_unregister_queue() doesn't call blk_queue_bypass_start() seems to
> > support this view -- although it could also be a simple oversight.
> > 
> > Hopefully Tejun can clear this iup.
> 
> Maintaining the initial bypass till queue registration is an
> optimization because shutting down a fully functional queue is a
> costly operation and there are drivers which set and destroy queues
> repeatedly while probing, so, yeah, it's really a special case for
> when the queue is being registered for the first time.
> 
> > > Your hack seems to indicate that this doesn't work on the add->del->add
> > > transtion of a gendisk.
> > 
> > Indeed, it does not work.
> 
> As such, the change you suggested makes perfect sense to me.  Why
> wouldn't it work?

Sorry, the meaning wasn't clear.  I meant that the existing code 
doesn't work.  The patch seems to work okay (except that I can't use 
queue_flag_test_and_set because the queue isn't locked at that point).  
I'll submit the patch shortly.

Alan Stern


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: WARNING in block layer triggered in 3.17-rc3
  2014-09-09 15:08           ` Alan Stern
@ 2014-09-09 15:17             ` Tejun Heo
  2014-09-09 15:50               ` [PATCH] Block: fix unbalanced bypass-disable in blk_register_queue Alan Stern
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2014-09-09 15:17 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Shirish Pargaonkar, Jens Axboe,
	SCSI development list, Kernel development list

Hello, Alan.

On Tue, Sep 09, 2014 at 11:08:22AM -0400, Alan Stern wrote:
> Sorry, the meaning wasn't clear.  I meant that the existing code 
> doesn't work.  The patch seems to work okay (except that I can't use 
> queue_flag_test_and_set because the queue isn't locked at that point).  
> I'll submit the patch shortly.

Given it's fully synchronized, the following should work fine and
probably is less misleading than using atomic test_and_set.

	if (!blk_queue_init_done) {
		queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
		blk_queue_bypass_end(q);
	}

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] Block: fix unbalanced bypass-disable in blk_register_queue
  2014-09-09 15:17             ` Tejun Heo
@ 2014-09-09 15:50               ` Alan Stern
  2014-09-09 16:43                 ` Tejun Heo
  2014-09-09 18:32                 ` Shirish Pargaonkar
  0 siblings, 2 replies; 14+ messages in thread
From: Alan Stern @ 2014-09-09 15:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: James Bottomley, Shirish Pargaonkar, Jens Axboe,
	SCSI development list, Kernel development list

When a queue is registered, the block layer turns off the bypass
setting (because bypass is enabled when the queue is created).  This
doesn't work well for queues that are unregistered and then registered
again; we get a WARNING because of the unbalanced calls to
blk_queue_bypass_end().

This patch fixes the problem by making blk_register_queue() call
blk_queue_bypass_end() only the first time the queue is registered.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Tejun Heo <tj@kernel.org>
CC: James Bottomley <James.Bottomley@HansenPartnership.com>
CC: Jens Axboe <axboe@kernel.dk>

---

[as1765]


 block/blk-sysfs.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Index: usb-3.17/block/blk-sysfs.c
===================================================================
--- usb-3.17.orig/block/blk-sysfs.c
+++ usb-3.17/block/blk-sysfs.c
@@ -554,8 +554,10 @@ int blk_register_queue(struct gendisk *d
 	 * Initialization must be complete by now.  Finish the initial
 	 * bypass from queue allocation.
 	 */
-	queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
-	blk_queue_bypass_end(q);
+	if (!blk_queue_init_done(q)) {
+		queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
+		blk_queue_bypass_end(q);
+	}
 
 	ret = blk_trace_init_sysfs(dev);
 	if (ret)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Block: fix unbalanced bypass-disable in blk_register_queue
  2014-09-09 15:50               ` [PATCH] Block: fix unbalanced bypass-disable in blk_register_queue Alan Stern
@ 2014-09-09 16:43                 ` Tejun Heo
  2014-09-09 18:32                 ` Shirish Pargaonkar
  1 sibling, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2014-09-09 16:43 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Shirish Pargaonkar, Jens Axboe,
	SCSI development list, Kernel development list

On Tue, Sep 09, 2014 at 11:50:58AM -0400, Alan Stern wrote:
> When a queue is registered, the block layer turns off the bypass
> setting (because bypass is enabled when the queue is created).  This
> doesn't work well for queues that are unregistered and then registered
> again; we get a WARNING because of the unbalanced calls to
> blk_queue_bypass_end().
> 
> This patch fixes the problem by making blk_register_queue() call
> blk_queue_bypass_end() only the first time the queue is registered.
> 
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> CC: Tejun Heo <tj@kernel.org>
> CC: James Bottomley <James.Bottomley@HansenPartnership.com>
> CC: Jens Axboe <axboe@kernel.dk>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Block: fix unbalanced bypass-disable in blk_register_queue
  2014-09-09 15:50               ` [PATCH] Block: fix unbalanced bypass-disable in blk_register_queue Alan Stern
  2014-09-09 16:43                 ` Tejun Heo
@ 2014-09-09 18:32                 ` Shirish Pargaonkar
  1 sibling, 0 replies; 14+ messages in thread
From: Shirish Pargaonkar @ 2014-09-09 18:32 UTC (permalink / raw)
  To: Alan Stern
  Cc: Tejun Heo, James Bottomley, Jens Axboe, SCSI development list,
	Kernel development list

Tested-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>

On Tue, Sep 9, 2014 at 10:50 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> When a queue is registered, the block layer turns off the bypass
> setting (because bypass is enabled when the queue is created).  This
> doesn't work well for queues that are unregistered and then registered
> again; we get a WARNING because of the unbalanced calls to
> blk_queue_bypass_end().
>
> This patch fixes the problem by making blk_register_queue() call
> blk_queue_bypass_end() only the first time the queue is registered.
>
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> CC: Tejun Heo <tj@kernel.org>
> CC: James Bottomley <James.Bottomley@HansenPartnership.com>
> CC: Jens Axboe <axboe@kernel.dk>
>
> ---
>
> [as1765]
>
>
>  block/blk-sysfs.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> Index: usb-3.17/block/blk-sysfs.c
> ===================================================================
> --- usb-3.17.orig/block/blk-sysfs.c
> +++ usb-3.17/block/blk-sysfs.c
> @@ -554,8 +554,10 @@ int blk_register_queue(struct gendisk *d
>          * Initialization must be complete by now.  Finish the initial
>          * bypass from queue allocation.
>          */
> -       queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
> -       blk_queue_bypass_end(q);
> +       if (!blk_queue_init_done(q)) {
> +               queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
> +               blk_queue_bypass_end(q);
> +       }
>
>         ret = blk_trace_init_sysfs(dev);
>         if (ret)
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-09-09 18:32 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-05 17:45 WARNING in block layer triggered in 3.17-rc3 Alan Stern
2014-09-07 10:43 ` Ming Lei
2014-09-07 22:59 ` Shirish Pargaonkar
2014-09-08 14:51   ` Alan Stern
2014-09-08 15:05     ` Shirish Pargaonkar
2014-09-08 15:51     ` James Bottomley
2014-09-08 16:11       ` Shirish Pargaonkar
2014-09-08 17:42       ` Alan Stern
2014-09-09  1:19         ` Tejun Heo
2014-09-09 15:08           ` Alan Stern
2014-09-09 15:17             ` Tejun Heo
2014-09-09 15:50               ` [PATCH] Block: fix unbalanced bypass-disable in blk_register_queue Alan Stern
2014-09-09 16:43                 ` Tejun Heo
2014-09-09 18:32                 ` Shirish Pargaonkar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox