public inbox for linux-scsi@vger.kernel.org
From: Guenter Roeck <linux@roeck-us.net>
To: Bart Van Assche <bvanassche@acm.org>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>,
	Jaegeuk Kim <jaegeuk@kernel.org>,
	linux-scsi@vger.kernel.org,
	Adrian Hunter <adrian.hunter@intel.com>,
	Ming Lei <ming.lei@redhat.com>, Christoph Hellwig <hch@lst.de>,
	Mike Christie <michael.christie@oracle.com>,
	Hannes Reinecke <hare@suse.de>,
	John Garry <john.garry@huawei.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>
Subject: Re: [PATCH v5 2/4] scsi: core: Make sure that hosts outlive targets
Date: Mon, 5 Sep 2022 10:40:47 -0700
Message-ID: <20220905173905.GA3405134@roeck-us.net>
In-Reply-To: <20220728221851.1822295-3-bvanassche@acm.org>

On Thu, Jul 28, 2022 at 03:18:49PM -0700, Bart Van Assche wrote:
> From: Ming Lei <ming.lei@redhat.com>
> 
> Fix the race conditions between SCSI LLD kernel module unloading and SCSI
> device and target removal by making sure that SCSI hosts are destroyed after
> all associated target and device objects have been freed.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Mike Christie <michael.christie@oracle.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: John Garry <john.garry@huawei.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> [ bvanassche: Reworked Ming's patch and split it ]

I know this has been reported before, but it is still seen in the
upstream kernel, so:

This patch results in a deadlock if a USB storage device is removed.

[   29.291148] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   29.300064] ci_hdrc ci_hdrc.1: remove, state 4
[   29.300317] usb usb2: USB disconnect, device number 1
[   29.305090] ci_hdrc ci_hdrc.1: USB bus 2 deregistered
[   29.307052] ci_hdrc ci_hdrc.0: remove, state 1
[   29.307214] usb usb1: USB disconnect, device number 1
[   29.307321] usb 1-1: USB disconnect, device number 2
[   29.344575] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   29.345323] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[   63.358569] INFO: task init:347 blocked for more than 30 seconds.
[   63.358928]       Tainted: G        W        N 6.0.0-rc4-00017-gcec18aa4b63a #1
[   63.359200] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   63.359600] task:init            state:D stack:    0 pid:  347 ppid:     1 flags:0x00000000
[   63.360104]  __schedule from schedule+0x60/0xbc
[   63.360368]  schedule from scsi_remove_host+0x154/0x1c0
[   63.360602]  scsi_remove_host from usb_stor_disconnect+0x4c/0xac
[   63.360852]  usb_stor_disconnect from usb_unbind_interface+0x74/0x268
[   63.361100]  usb_unbind_interface from device_release_driver_internal+0x1a0/0x22c
[   63.361383]  device_release_driver_internal from bus_remove_device+0xcc/0xfc
[   63.361651]  bus_remove_device from device_del+0x16c/0x3f8
[   63.361877]  device_del from usb_disable_device+0xcc/0x178
[   63.362097]  usb_disable_device from usb_disconnect+0xd0/0x230
[   63.362325]  usb_disconnect from usb_disconnect+0x9c/0x230
[   63.362536]  usb_disconnect from usb_remove_hcd+0xd0/0x16c
[   63.362741]  usb_remove_hcd from host_stop+0x38/0xa8
[   63.362946]  host_stop from ci_hdrc_remove+0x44/0x120
[   63.363148]  ci_hdrc_remove from platform_remove+0x20/0x4c
[   63.363367]  platform_remove from device_release_driver_internal+0x1a0/0x22c
[   63.363635]  device_release_driver_internal from bus_remove_device+0xcc/0xfc
[   63.363897]  bus_remove_device from device_del+0x16c/0x3f8
[   63.364117]  device_del from platform_device_del.part.0+0x10/0x74
[   63.364353]  platform_device_del.part.0 from platform_device_unregister+0x18/0x24
[   63.364623]  platform_device_unregister from ci_hdrc_remove_device+0xc/0x20
[   63.364886]  ci_hdrc_remove_device from ci_hdrc_imx_remove+0x28/0x110
[   63.365131]  ci_hdrc_imx_remove from device_shutdown+0x174/0x250
[   63.365372]  device_shutdown from __do_sys_reboot+0x124/0x270
[   63.365616]  __do_sys_reboot from ret_fast_syscall+0x0/0x1c
[   63.365849] Exception stack(0xd1859fa8 to 0xd1859ff0)
[   63.366054] 9fa0:                   01234567 000c623f fee1dead 28121969 01234567 00000000
[   63.366343] 9fc0: 01234567 000c623f 00000001 00000058 000d85c0 00000000 00000000 00000000
[   63.366620] 9fe0: 000d8298 bef49de4 000918bc b6e8cedc
[   63.366881] INFO: lockdep is turned off.
[   63.367069] Kernel panic - not syncing: hung_task: blocked tasks

The backtrace suggests that the problem is caused by the shutdown
function in the imx driver calling remove_device, but that is not the
root cause.

As can be seen in the backtrace, usb_stor_disconnect() calls
scsi_remove_host(). With this patch applied, scsi_remove_host() now
waits for the SCSI release function to be called. However,
usb_stor_disconnect() only calls release_everything(), and with it
scsi_host_put(), _after_ scsi_remove_host() has returned. Since
scsi_remove_host() now waits for a resource which is only released
by calling scsi_host_put(), this causes a deadlock.
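
To illustrate the ordering problem described above, here is a minimal
userspace sketch. It is a toy model only, not kernel code: all toy_*
names are hypothetical, the reference count is a plain int, and the
"wait" is modelled by returning false where a real wait would sleep
forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_host {
	int refcount;	/* models the reference dropped by the final put */
	bool released;	/* set when the modelled release function has run */
};

/* Models dropping a reference; the release runs when the count hits 0. */
void toy_host_put(struct toy_host *h)
{
	assert(h->refcount > 0);
	if (--h->refcount == 0)
		h->released = true;
}

/*
 * Models a remove function that waits for the release to happen.
 * Returns false where a real implementation would block indefinitely.
 */
bool toy_remove_host(struct toy_host *h)
{
	return h->released;
}

/* Ordering from the analysis above: remove first, put afterwards. */
bool toy_disconnect(struct toy_host *h)
{
	if (!toy_remove_host(h))
		return false;	/* stuck here: the put below is never reached */
	toy_host_put(h);
	return true;
}
```

With a host that starts with one outstanding reference,
toy_disconnect() returns false and the reference count is never
dropped, which is the circular wait described above in miniature.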

If my analysis is correct, any USB storage device removal should
result in the deadlock. My analysis may of course be wrong. If so,
please let me know what I missed.

Thanks,
Guenter
