From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: YangYang <yang.yang@vivo.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>
Subject: Re: block: del_gendisk() vs blk_queue_enter() race condition
Date: Tue, 8 Oct 2024 14:19:48 +0900
Message-ID: <20241008051948.GB10794@google.com>
In-Reply-To: <b3690d1b-3c4f-4ec0-9d74-e09addc322ff@vivo.com>
On (24/10/08 12:02), YangYang wrote:
> On 2024/10/3 16:56, Sergey Senozhatsky wrote:
> > Hello,
> >
> > I'm looking at a report from the fleet (don't have a reproducer)
> > and wondering what you and block folks might think / suggest.
> >
> > The problem is basically as follows
> >
> > CPU0
> >
> > do_syscall
> > sys_close
> > __fput
> > blkdev_release
> > blkdev_put grabs ->open_mutex
> > sr_block_release
> > scsi_set_medium_removal
> > ioctl_internal_command
> > scsi_execute_cmd
> > scsi_alloc_request
> > blk_mq_alloc_request
> > blk_queue_enter
> > schedule
> >
> > at the same time:
> >
> > CPU1
> >
> > usb_disconnect
> > usb_disable_device
> > device_del
> > usb_unbind_interface
> > usb_stor_disconnect
> > scsi_remove_host
> > scsi_forget_host
> > __scsi_remove_device
> > device_del
> > bus_remove_device
> > device_release_driver_internal
> > sr_remove
> > del_gendisk
> > mutex_lock attempts to grab ->open_mutex
> > schedule
> >
>
> I'm a little confused here. How is the queue getting frozen in this
> scenario?
I don't know. Could it be that it's PM, and not a frozen queue, that
makes the wait_event() condition false (if that's what you are pointing at)?
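
For reference, a minimal user-space sketch of the condition that
blk_queue_enter() sleeps on, assuming it matches block/blk-core.c and
block/blk-pm.h in recent kernels. The struct and the helper names are
hypothetical and exist only for illustration; the q->dev check and the
pm_request_resume() side effect are left out.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space model, not kernel code. Field names loosely
 * mirror the request_queue state that the wait condition looks at.
 */
struct queue_model {
	int  mq_freeze_depth;	/* > 0 while the queue is frozen */
	bool pm_only;		/* queue admits only PM requests */
	bool rpm_suspended;	/* device is runtime-suspended */
	bool dying;		/* queue has been marked dying */
};

/* Models blk_pm_resume_queue(): true if PM state does not block entry. */
static bool pm_allows_entry(const struct queue_model *q, bool pm_request)
{
	if (!q->pm_only)
		return true;
	return pm_request && !q->rpm_suspended;
}

/* Models the wait_event() predicate: true means the sleeper may proceed. */
static bool wait_condition(const struct queue_model *q, bool pm_request)
{
	return (q->mq_freeze_depth == 0 && pm_allows_entry(q, pm_request)) ||
	       q->dying;
}
```

In this model a non-PM request on a pm_only queue keeps the condition
false even with mq_freeze_depth == 0, and marking the queue dying makes
it true regardless of the other fields.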
I have several reports (various devices, various use-cases), and the ones
that I have looked at so far show the same pattern:

	usb_disconnect() vs blk_queue_enter()

E.g. one of the reports:
...
sd 1:0:0:0: [sdb] Attached SCSI removable disk
usb 3-4: USB disconnect, device number 29
sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=15s
sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 07 47 af fd 00 00 01 00
I/O error, dev sdb, sector 122138621 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
device offline error, dev sdb, sector 122138616 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev sdb, logical block 15267327, async page read
...
schedule+0x4f4/0x1540
del_gendisk+0x136/0x370
sd_remove+0x30/0x60
device_release_driver_internal+0x1a2/0x2a0
bus_remove_device+0x154/0x180
device_del+0x207/0x370
__scsi_remove_device+0xc0/0x170
scsi_forget_host+0x45/0x60
scsi_remove_host+0x87/0x170
usb_stor_disconnect+0x63/0xb0
usb_unbind_interface+0xbe/0x250
device_release_driver_internal+0x1a2/0x2a0
bus_remove_device+0x154/0x180
device_del+0x207/0x370
? kobject_release+0x56/0xb0
usb_disable_device+0x72/0x170
usb_disconnect+0xeb/0x280

schedule+0x4f4/0x1540
blk_queue_enter+0x172/0x250
blk_mq_alloc_request+0x167/0x210
scsi_execute_cmd+0x65/0x240
ioctl_internal_command+0x6c/0x150
scsi_set_medium_removal+0x63/0xc0
sd_release+0x42/0x50
blkdev_put+0x13b/0x1f0
blkdev_release+0x2b/0x40
__fput_sync+0x9b/0x2c0
__se_sys_close+0x69/0xc0
do_syscall_64+0x60/0x90
Or another report:
sr 1:0:0:0: Power-on or device reset occurred
sr 1:0:0:0: [sr0] scsi3-mmc drive: 8x/24x writer dvd-ram cd/rw xa/form2 cdda tray
usb 1-1.3.1: USB disconnect, device number 27
schedule+0x554/0x1218
schedule_preempt_disabled+0x30/0x50
mutex_lock+0x3c/0x70
del_gendisk+0xe8/0x370
sr_remove+0x30/0x58 [sr_mod (HASH:d5f2 4)]
device_release_driver_internal+0x1a0/0x278
device_release_driver+0x24/0x38
bus_remove_device+0x150/0x170
device_del+0x1d0/0x348
__scsi_remove_device+0xb4/0x198
scsi_forget_host+0x5c/0x80
scsi_remove_host+0x98/0x1c8
usb_stor_disconnect+0x74/0x110
usb_unbind_interface+0xcc/0x250
device_release_driver_internal+0x1a0/0x278
device_release_driver+0x24/0x38
bus_remove_device+0x150/0x170
device_del+0x1d0/0x348
usb_disable_device+0x88/0x190
usb_disconnect+0xf8/0x318

schedule+0x554/0x1218
blk_queue_enter+0xd0/0x170
blk_mq_alloc_request+0x138/0x1e8
scsi_execute_cmd+0x88/0x258
scsi_test_unit_ready+0x88/0x118
sr_drive_status+0x5c/0x160 [sr_mod (HASH:d5f2 4)]
cdrom_ioctl+0x7d4/0x2730 [cdrom (HASH:37c3 5)]
sr_block_ioctl+0xa8/0x110 [sr_mod (HASH:d5f2 4)]
blkdev_ioctl+0x468/0xbf0
__arm64_sys_ioctl+0x254/0x6d0