* [PATCH v2 0/3] Call blk_mq_free_tag_set() earlier
@ 2022-06-30 21:37 Bart Van Assche
2022-06-30 21:37 ` [PATCH v2 1/3] scsi: Simplify scsi_forget_host() Bart Van Assche
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Bart Van Assche @ 2022-06-30 21:37 UTC (permalink / raw)
To: Martin K . Petersen; +Cc: Jaegeuk Kim, linux-scsi, Bart Van Assche
Hi Martin,
This patch series fixes a recently reported use-after-free in the SRP driver.
Please consider this patch series for kernel v5.20.
Changes compared to v1:
- Expanded this series from one to three patches.
- Fixed the description of patch 3/3.
Bart Van Assche (3):
scsi: Simplify scsi_forget_host()
scsi: Make scsi_forget_host() wait for request queue removal
scsi: core: Call blk_mq_free_tag_set() earlier
drivers/scsi/hosts.c | 10 +++++-----
drivers/scsi/scsi_lib.c | 3 +++
drivers/scsi/scsi_scan.c | 34 +++++++++++++++++++++++++++++-----
3 files changed, 37 insertions(+), 10 deletions(-)
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH v2 1/3] scsi: Simplify scsi_forget_host() 2022-06-30 21:37 [PATCH v2 0/3] Call blk_mq_free_tag_set() earlier Bart Van Assche @ 2022-06-30 21:37 ` Bart Van Assche 2022-07-05 2:40 ` Ming Lei 2022-06-30 21:37 ` [PATCH v2 2/3] scsi: Make scsi_forget_host() wait for request queue removal Bart Van Assche 2022-06-30 21:37 ` [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier Bart Van Assche 2 siblings, 1 reply; 14+ messages in thread From: Bart Van Assche @ 2022-06-30 21:37 UTC (permalink / raw) To: Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Bart Van Assche, Christoph Hellwig, Ming Lei, Hannes Reinecke, John Garry scsi_forget_host() has only one caller, namely scsi_remove_host(). That function may sleep. Additionally, scsi_forget_host() calls a function that may sleep (__scsi_remove_device()). Simplify scsi_forget_host() by removing support for saving and restoring the interrupt state. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/scsi/scsi_scan.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 91ac901a6682..5c3bb4ceeac3 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1964,17 +1964,18 @@ EXPORT_SYMBOL(scsi_scan_host); void scsi_forget_host(struct Scsi_Host *shost) { struct scsi_device *sdev; - unsigned long flags; + + might_sleep(); restart: - spin_lock_irqsave(shost->host_lock, flags); + spin_lock_irq(shost->host_lock); list_for_each_entry(sdev, &shost->__devices, siblings) { if (sdev->sdev_state == SDEV_DEL) continue; - spin_unlock_irqrestore(shost->host_lock, flags); + spin_unlock_irq(shost->host_lock); __scsi_remove_device(sdev); goto restart; } - spin_unlock_irqrestore(shost->host_lock, flags); + spin_unlock_irq(shost->host_lock); } ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/3] scsi: Simplify scsi_forget_host() 2022-06-30 21:37 ` [PATCH v2 1/3] scsi: Simplify scsi_forget_host() Bart Van Assche @ 2022-07-05 2:40 ` Ming Lei 0 siblings, 0 replies; 14+ messages in thread From: Ming Lei @ 2022-07-05 2:40 UTC (permalink / raw) To: Bart Van Assche Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Christoph Hellwig, Hannes Reinecke, John Garry On Thu, Jun 30, 2022 at 02:37:31PM -0700, Bart Van Assche wrote: > scsi_forget_host() has only one caller, namely scsi_remove_host(). That > function may sleep. Additionally, scsi_forget_host() calls a function > that may sleep (__scsi_remove_device()). Simplify scsi_forget_host() by > removing support for saving and restoring the interrupt state. > > Cc: Christoph Hellwig <hch@lst.de> > Cc: Ming Lei <ming.lei@redhat.com> > Cc: Hannes Reinecke <hare@suse.de> > Cc: John Garry <john.garry@huawei.com> > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- Reviewed-by: Ming Lei <ming.lei@redhat.com> Thanks, Ming ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 2/3] scsi: Make scsi_forget_host() wait for request queue removal 2022-06-30 21:37 [PATCH v2 0/3] Call blk_mq_free_tag_set() earlier Bart Van Assche 2022-06-30 21:37 ` [PATCH v2 1/3] scsi: Simplify scsi_forget_host() Bart Van Assche @ 2022-06-30 21:37 ` Bart Van Assche 2022-07-01 16:25 ` Mike Christie 2022-07-05 3:38 ` Ming Lei 2022-06-30 21:37 ` [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier Bart Van Assche 2 siblings, 2 replies; 14+ messages in thread From: Bart Van Assche @ 2022-06-30 21:37 UTC (permalink / raw) To: Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Bart Van Assche, Christoph Hellwig, Ming Lei, Hannes Reinecke, John Garry Prepare for freeing the host tag set earlier by making scsi_forget_host() wait until all activity on the host tag set has stopped. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/scsi/scsi_scan.c | 25 ++++++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 5c3bb4ceeac3..c8331ccdde95 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1961,6 +1961,16 @@ void scsi_scan_host(struct Scsi_Host *shost) } EXPORT_SYMBOL(scsi_scan_host); +/** + * scsi_forget_host() - Remove all SCSI devices from a host. + * @shost: SCSI host to remove devices from. + * + * Removes all SCSI devices that have not yet been removed. For the SCSI devices + * for which removal started before scsi_forget_host(), wait until the + * associated request queue has reached the "dead" state. In that state it is + * guaranteed that no new requests will be allocated and also that no requests + * are in progress anymore. + */ void scsi_forget_host(struct Scsi_Host *shost) { struct scsi_device *sdev; @@ -1970,8 +1980,21 @@ void scsi_forget_host(struct Scsi_Host *shost) restart: spin_lock_irq(shost->host_lock); list_for_each_entry(sdev, &shost->__devices, siblings) { - if (sdev->sdev_state == SDEV_DEL) + if (sdev->sdev_state == SDEV_DEL && + blk_queue_dead(sdev->request_queue)) { continue; + } + if (sdev->sdev_state == SDEV_DEL) { + get_device(&sdev->sdev_gendev); + spin_unlock_irq(shost->host_lock); + + while (!blk_queue_dead(sdev->request_queue)) + msleep(10); + + spin_lock_irq(shost->host_lock); + put_device(&sdev->sdev_gendev); + goto restart; + } spin_unlock_irq(shost->host_lock); __scsi_remove_device(sdev); goto restart; ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 2/3] scsi: Make scsi_forget_host() wait for request queue removal 2022-06-30 21:37 ` [PATCH v2 2/3] scsi: Make scsi_forget_host() wait for request queue removal Bart Van Assche @ 2022-07-01 16:25 ` Mike Christie 2022-07-01 23:44 ` Bart Van Assche 2022-07-05 3:38 ` Ming Lei 1 sibling, 1 reply; 14+ messages in thread From: Mike Christie @ 2022-07-01 16:25 UTC (permalink / raw) To: Bart Van Assche, Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Christoph Hellwig, Ming Lei, Hannes Reinecke, John Garry On 6/30/22 4:37 PM, Bart Van Assche wrote: > struct scsi_device *sdev; > @@ -1970,8 +1980,21 @@ void scsi_forget_host(struct Scsi_Host *shost) > restart: > spin_lock_irq(shost->host_lock); > list_for_each_entry(sdev, &shost->__devices, siblings) { > - if (sdev->sdev_state == SDEV_DEL) > + if (sdev->sdev_state == SDEV_DEL && > + blk_queue_dead(sdev->request_queue)) { > continue; > + } > + if (sdev->sdev_state == SDEV_DEL) { > + get_device(&sdev->sdev_gendev); > + spin_unlock_irq(shost->host_lock); > + > + while (!blk_queue_dead(sdev->request_queue)) > + msleep(10); > + > + spin_lock_irq(shost->host_lock); > + put_device(&sdev->sdev_gendev); > + goto restart; > + } > spin_unlock_irq(shost->host_lock); > __scsi_remove_device(sdev); > goto restart; Are there 2 ways to hit your issue? 1. Normal case. srp_remove_one frees srp_device. Then all refs to host are dropped and we call srp_exit_cmd_priv which accesses the freed srp_device? You don't need the above patch for this case right. 2. Are you hitting a race? Something did a scsi_remove_device. It set the device state to SDEV_DEL. It's stuck in blk_cleanup_queue blk_freeze_queue. Now we do the above code. Without your patch we just move on by that device. Commands could still be accessed. With your patch we wait for for that other thread to complete the device destruction. Could you also solve both issues by adding a scsi_host_template scsi_host release function that's called from scsi_host_dev_release. A new srp_host_release would free structs like the srp device from there. Or would we still have an issue for #2 where we just don't know how far blk_freeze_queue has got so commands could still be hitting our queue_rq function when we are doing scsi_host_dev_release? I like how your patch makes it so we know after scsi_remove_host has returned that the device is really gone. Could we have a similar race as in #2 with someone doing a scsi_remove_device and transport doing scsi_remove_target? We would not hit the same use after free from the tag set exit_cmd_priv being called. But, for example, if we wanted to free some transport level resources that running commands reference after scsi_target_remove could we hit a use after free? If so maybe we want to move this wait/check to __scsi_remove_device? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 2/3] scsi: Make scsi_forget_host() wait for request queue removal 2022-07-01 16:25 ` Mike Christie @ 2022-07-01 23:44 ` Bart Van Assche 0 siblings, 0 replies; 14+ messages in thread From: Bart Van Assche @ 2022-07-01 23:44 UTC (permalink / raw) To: Mike Christie, Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Christoph Hellwig, Ming Lei, Hannes Reinecke, John Garry On 7/1/22 09:25, Mike Christie wrote: > On 6/30/22 4:37 PM, Bart Van Assche wrote: >> struct scsi_device *sdev; >> @@ -1970,8 +1980,21 @@ void scsi_forget_host(struct Scsi_Host *shost) >> restart: >> spin_lock_irq(shost->host_lock); >> list_for_each_entry(sdev, &shost->__devices, siblings) { >> - if (sdev->sdev_state == SDEV_DEL) >> + if (sdev->sdev_state == SDEV_DEL && >> + blk_queue_dead(sdev->request_queue)) { >> continue; >> + } >> + if (sdev->sdev_state == SDEV_DEL) { >> + get_device(&sdev->sdev_gendev); >> + spin_unlock_irq(shost->host_lock); >> + >> + while (!blk_queue_dead(sdev->request_queue)) >> + msleep(10); >> + >> + spin_lock_irq(shost->host_lock); >> + put_device(&sdev->sdev_gendev); >> + goto restart; >> + } >> spin_unlock_irq(shost->host_lock); >> __scsi_remove_device(sdev); >> goto restart; > > Are there 2 ways to hit your issue? > > 1. Normal case. srp_remove_one frees srp_device. Then all refs > to host are dropped and we call srp_exit_cmd_priv which accesses > the freed srp_device? > > You don't need the above patch for this case right. > > 2. Are you hitting a race? Something did a scsi_remove_device. It > set the device state to SDEV_DEL. It's stuck in blk_cleanup_queue > blk_freeze_queue. Now we do the above code. Without your patch > we just move on by that device. Commands could still be accessed. > With your patch we wait for for that other thread to complete the > device destruction. > > > Could you also solve both issues by adding a scsi_host_template > scsi_host release function that's called from scsi_host_dev_release. A > new srp_host_release would free structs like the srp device from there. > Or would we still have an issue for #2 where we just don't know how > far blk_freeze_queue has got so commands could still be hitting our > queue_rq function when we are doing scsi_host_dev_release? > > I like how your patch makes it so we know after scsi_remove_host > has returned that the device is really gone. Could we have a similar > race as in #2 with someone doing a scsi_remove_device and transport > doing scsi_remove_target? > > We would not hit the same use after free from the tag set exit_cmd_priv > being called. But, for example, if we wanted to free some transport level > resources that running commands reference after scsi_target_remove could > we hit a use after free? If so maybe we want to move this wait/check to > __scsi_remove_device? Hi Mike, Although I'm not sure that I completely understood the above: my goal with this patch is to make sure that all activity on the host tag set has stopped by the time scsi_forget_host() returns. I do not only want to cover the case where scsi_remove_host() is called by the ib_srp driver but also the case where a SCSI device that was created by the ib_srp driver is removed before scsi_remove_host() is called. Users can trigger this scenario by writing into /sys/class/scsi_device/*/*/delete. Adding a new callback into scsi_host_dev_release() would not help because the scsi_host_dev_release() call may happen long after scsi_forget_host() returned. Does this answer your question? Bart. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 2/3] scsi: Make scsi_forget_host() wait for request queue removal 2022-06-30 21:37 ` [PATCH v2 2/3] scsi: Make scsi_forget_host() wait for request queue removal Bart Van Assche 2022-07-01 16:25 ` Mike Christie @ 2022-07-05 3:38 ` Ming Lei 1 sibling, 0 replies; 14+ messages in thread From: Ming Lei @ 2022-07-05 3:38 UTC (permalink / raw) To: Bart Van Assche Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Christoph Hellwig, Hannes Reinecke, John Garry, ming.lei On Thu, Jun 30, 2022 at 02:37:32PM -0700, Bart Van Assche wrote: > Prepare for freeing the host tag set earlier by making scsi_forget_host() > wait until all activity on the host tag set has stopped. I think it is correct to free the host tag during removing host. > > Cc: Christoph Hellwig <hch@lst.de> > Cc: Ming Lei <ming.lei@redhat.com> > Cc: Hannes Reinecke <hare@suse.de> > Cc: John Garry <john.garry@huawei.com> > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/scsi/scsi_scan.c | 25 ++++++++++++++++++++++++- > 1 file changed, 24 insertions(+), 1 deletion(-) > > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c > index 5c3bb4ceeac3..c8331ccdde95 100644 > --- a/drivers/scsi/scsi_scan.c > +++ b/drivers/scsi/scsi_scan.c > @@ -1961,6 +1961,16 @@ void scsi_scan_host(struct Scsi_Host *shost) > } > EXPORT_SYMBOL(scsi_scan_host); > > +/** > + * scsi_forget_host() - Remove all SCSI devices from a host. > + * @shost: SCSI host to remove devices from. > + * > + * Removes all SCSI devices that have not yet been removed. For the SCSI devices > + * for which removal started before scsi_forget_host(), wait until the > + * associated request queue has reached the "dead" state. In that state it is > + * guaranteed that no new requests will be allocated and also that no requests > + * are in progress anymore. > + */ > void scsi_forget_host(struct Scsi_Host *shost) > { > struct scsi_device *sdev; > @@ -1970,8 +1980,21 @@ void scsi_forget_host(struct Scsi_Host *shost) > restart: > spin_lock_irq(shost->host_lock); > list_for_each_entry(sdev, &shost->__devices, siblings) { > - if (sdev->sdev_state == SDEV_DEL) > + if (sdev->sdev_state == SDEV_DEL && > + blk_queue_dead(sdev->request_queue)) { > continue; > + } > + if (sdev->sdev_state == SDEV_DEL) { > + get_device(&sdev->sdev_gendev); > + spin_unlock_irq(shost->host_lock); > + > + while (!blk_queue_dead(sdev->request_queue)) > + msleep(10); Thinking of further, this report on UAF on SRP resource and Changhui's report on UAF of host->hostt should belong to same kind of issue: 1) after scsi_remove_host() returns, either the HBA driver module can be unloaded and hostt can't be used, or any HBA resources can be freed, both may cause UAF in scsi_mq_exit_request, so moving freeing host tagset to scsi_remove_host() is correct. static void scsi_mq_exit_request() { ... if (shost->hostt->exit_cmd_priv) shost->hostt->exit_cmd_priv(shost, cmd); } 2) checking request queue dead here isn't good, which not only relies on block layer implementation detail, but also can't avoid UAF on ->hostt, I'd suggest to fix them all. The attached patch may drain all sdevs, which can replace the 2nd patch if you don't mind, but the 3rd patch is still needed: diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index e1cb187402fd..6445718c2b18 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -189,6 +189,14 @@ void scsi_remove_host(struct Scsi_Host *shost) transport_unregister_device(&shost->shost_gendev); device_unregister(&shost->shost_dev); device_del(&shost->shost_gendev); + + /* + * Once returning, the scsi LLD module can be unloaded, so we have + * to wait until our descendants are released, otherwise our host + * device reference can be grabbed by them, then use-after-free + * on shost->hostt will be triggered + */ + wait_for_completion(&shost->targets_completion); } EXPORT_SYMBOL(scsi_remove_host); @@ -393,6 +401,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize) INIT_LIST_HEAD(&shost->starved_list); init_waitqueue_head(&shost->host_wait); mutex_init(&shost->scan_mutex); + init_completion(&shost->targets_completion); index = ida_simple_get(&host_index_ida, 0, 0, GFP_KERNEL); if (index < 0) { diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 10e5bffc34aa..b6e6354da396 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -545,10 +545,8 @@ EXPORT_SYMBOL(scsi_device_get); */ void scsi_device_put(struct scsi_device *sdev) { - struct module *mod = sdev->host->hostt->module; - + module_put(sdev->host->hostt->module); put_device(&sdev->sdev_gendev); - module_put(mod); } EXPORT_SYMBOL(scsi_device_put); diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 2ef78083f1ef..931333a48372 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -406,9 +406,14 @@ static void scsi_target_destroy(struct scsi_target *starget) static void scsi_target_dev_release(struct device *dev) { struct device *parent = dev->parent; + struct Scsi_Host *shost = dev_to_shost(parent); struct scsi_target *starget = to_scsi_target(dev); kfree(starget); + + if (!atomic_dec_return(&shost->nr_targets)) + complete_all(&shost->targets_completion); + put_device(parent); } @@ -521,6 +526,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent, starget->state = STARGET_CREATED; starget->scsi_level = SCSI_2; starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED; + atomic_inc(&shost->nr_targets); retry: spin_lock_irqsave(shost->host_lock, flags); diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 94091d5281ba..28c51ef0ea54 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -449,12 +449,9 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work) struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL; struct scsi_vpd *vpd_pg0 = NULL, *vpd_pg89 = NULL; unsigned long flags; - struct module *mod; sdev = container_of(work, struct scsi_device, ew.work); - mod = sdev->host->hostt->module; - scsi_dh_release_device(sdev); parent = sdev->sdev_gendev.parent; @@ -505,17 +502,11 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work) if (parent) put_device(parent); - module_put(mod); } static void scsi_device_dev_release(struct device *dev) { struct scsi_device *sdp = to_scsi_device(dev); - - /* Set module pointer as NULL in case of module unloading */ - if (!try_module_get(sdp->host->hostt->module)) - sdp->host->hostt->module = NULL; - execute_in_process_context(scsi_device_dev_release_usercontext, &sdp->ew); } diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h index 37718373defc..d0baffd62287 100644 --- a/include/scsi/scsi_host.h +++ b/include/scsi/scsi_host.h @@ -710,6 +710,9 @@ struct Scsi_Host { /* ldm bits */ struct device shost_gendev, shost_dev; + atomic_t nr_targets; + struct completion targets_completion; + /* * Points to the transport data (if any) which is allocated * separately Thanks, Ming ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier 2022-06-30 21:37 [PATCH v2 0/3] Call blk_mq_free_tag_set() earlier Bart Van Assche 2022-06-30 21:37 ` [PATCH v2 1/3] scsi: Simplify scsi_forget_host() Bart Van Assche 2022-06-30 21:37 ` [PATCH v2 2/3] scsi: Make scsi_forget_host() wait for request queue removal Bart Van Assche @ 2022-06-30 21:37 ` Bart Van Assche 2022-07-01 3:44 ` Ming Lei 2 siblings, 1 reply; 14+ messages in thread From: Bart Van Assche @ 2022-06-30 21:37 UTC (permalink / raw) To: Martin K . Petersen Cc: Jaegeuk Kim, linux-scsi, Bart Van Assche, Christoph Hellwig, Ming Lei, Hannes Reinecke, John Garry, Li Zhijian There are two .exit_cmd_priv implementations. Both implementations use resources associated with the SCSI host. Make sure that these resources are still available when .exit_cmd_priv is called by moving the .exit_cmd_priv calls from scsi_host_dev_release() to scsi_forget_host(). Moving blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is safe because scsi_forget_host() drains all the request queues that use the host tag set. This guarantees that no requests are in flight and also that no new requests will be allocated from the host tag set. This patch fixes the following use-after-free: ================================================================== BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] Read of size 8 at addr ffff888100337000 by task multipathd/16727 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x5e/0x5db kasan_report+0xab/0x120 srp_exit_cmd_priv+0x27/0xd0 [ib_srp] scsi_mq_exit_request+0x4d/0x70 blk_mq_free_rqs+0x143/0x410 __blk_mq_free_map_and_rqs+0x6e/0x100 blk_mq_free_tag_set+0x2b/0x160 scsi_host_dev_release+0xf3/0x1a0 device_release+0x54/0xe0 kobject_put+0xa5/0x120 device_release+0x54/0xe0 kobject_put+0xa5/0x120 scsi_device_dev_release_usercontext+0x4c1/0x4e0 execute_in_process_context+0x23/0x90 device_release+0x54/0xe0 kobject_put+0xa5/0x120 scsi_disk_release+0x3f/0x50 device_release+0x54/0xe0 kobject_put+0xa5/0x120 disk_release+0x17f/0x1b0 device_release+0x54/0xe0 kobject_put+0xa5/0x120 dm_put_table_device+0xa3/0x160 [dm_mod] dm_put_device+0xd0/0x140 [dm_mod] free_priority_group+0xd8/0x110 [dm_multipath] free_multipath+0x94/0xe0 [dm_multipath] dm_table_destroy+0xa2/0x1e0 [dm_mod] __dm_destroy+0x196/0x350 [dm_mod] dev_remove+0x10c/0x160 [dm_mod] ctl_ioctl+0x2c2/0x590 [dm_mod] dm_ctl_ioctl+0x5/0x10 [dm_mod] __x64_sys_ioctl+0xb4/0xf0 dm_ctl_ioctl+0x5/0x10 [dm_mod] __x64_sys_ioctl+0xb4/0xf0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: Li Zhijian <lizhijian@fujitsu.com> Reported-by: Li Zhijian <lizhijian@fujitsu.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> Fixes: 65ca846a5314 ("scsi: core: Introduce {init,exit}_cmd_priv()") Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/scsi/hosts.c | 10 +++++----- drivers/scsi/scsi_lib.c | 3 +++ 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index ef6c0e37acce..74bfa187fe19 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -182,6 +182,8 @@ void scsi_remove_host(struct Scsi_Host *shost) mutex_unlock(&shost->scan_mutex); scsi_proc_host_rm(shost); + scsi_mq_destroy_tags(shost); + spin_lock_irqsave(shost->host_lock, flags); if (scsi_host_set_state(shost, SHOST_DEL)) BUG_ON(scsi_host_set_state(shost, SHOST_DEL_RECOVERY)); @@ -295,8 +297,8 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, return error; /* - * Any host allocation in this function will be freed in - * scsi_host_dev_release(). + * Any resources associated with the SCSI host in this function except + * the tag set will be freed by scsi_host_dev_release(). */ out_del_dev: device_del(&shost->shost_dev); @@ -312,6 +314,7 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, pm_runtime_disable(&shost->shost_gendev); pm_runtime_set_suspended(&shost->shost_gendev); pm_runtime_put_noidle(&shost->shost_gendev); + scsi_mq_destroy_tags(shost); fail: return error; } @@ -345,9 +348,6 @@ static void scsi_host_dev_release(struct device *dev) kfree(dev_name(&shost->shost_dev)); } - if (shost->tag_set.tags) - scsi_mq_destroy_tags(shost); - kfree(shost->shost_data); ida_free(&host_index_ida, shost->host_no); diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 6ffc9e4258a8..1aa1a279f8f3 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1990,7 +1990,10 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost) void scsi_mq_destroy_tags(struct Scsi_Host *shost) { + if (!shost->tag_set.tags) + return; blk_mq_free_tag_set(&shost->tag_set); + WARN_ON_ONCE(shost->tag_set.tags); } /** ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier 2022-06-30 21:37 ` [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier Bart Van Assche @ 2022-07-01 3:44 ` Ming Lei 2022-07-01 7:25 ` lizhijian ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Ming Lei @ 2022-07-01 3:44 UTC (permalink / raw) To: Bart Van Assche Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Christoph Hellwig, Hannes Reinecke, John Garry, Li Zhijian On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: > There are two .exit_cmd_priv implementations. Both implementations use > resources associated with the SCSI host. Make sure that these resources are Please document what the exact resources associated with this SCSI host is. We need the root cause. I understand it might be related with module unloading, since ib_srp may be gone already, but not sure if it is the exact one in this report. > still available when .exit_cmd_priv is called by moving the .exit_cmd_priv > calls from scsi_host_dev_release() to scsi_forget_host(). Moving > blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is > safe because scsi_forget_host() drains all the request queues that use the > host tag set. This guarantees that no requests are in flight and also that > no new requests will be allocated from the host tag set. > > This patch fixes the following use-after-free: > > ================================================================== > BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] > Read of size 8 at addr ffff888100337000 by task multipathd/16727 What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 point to? Thanks, Ming ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier 2022-07-01 3:44 ` Ming Lei @ 2022-07-01 7:25 ` lizhijian 2022-07-01 7:45 ` Li, Zhijian 2022-07-01 14:07 ` Bart Van Assche 2 siblings, 0 replies; 14+ messages in thread From: lizhijian @ 2022-07-01 7:25 UTC (permalink / raw) To: Ming Lei, Bart Van Assche Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi@vger.kernel.org, Christoph Hellwig, Hannes Reinecke, John Garry On 01/07/2022 11:44, Ming Lei wrote: > On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: >> There are two .exit_cmd_priv implementations. Both implementations use >> resources associated with the SCSI host. Make sure that these resources are > Please document what the exact resources associated with this SCSI host is. > > We need the root cause. > > I understand it might be related with module unloading, since ib_srp may > be gone already, but not sure if it is the exact one in this report. > >> still available when .exit_cmd_priv is called by moving the .exit_cmd_priv >> calls from scsi_host_dev_release() to scsi_forget_host(). Moving >> blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is >> safe because scsi_forget_host() drains all the request queues that use the >> host tag set. This guarantees that no requests are in flight and also that >> no new requests will be allocated from the host tag set. >> >> This patch fixes the following use-after-free: >> >> ================================================================== >> BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] >> Read of size 8 at addr ffff888100337000 by task multipathd/16727 > What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 > point to? This bug was reported by me, let's input some debug information. *Attention*: below debug info come from a modified source, so the offset it is a bit different from above one. [ 120.400572] ib_srp: lizhijian: srp_exit_cmd_priv:975: target_host ffff88810b8d6000, ffff88810b8d67e0 [ 120.400578] ib_srp: lizhijian: srp_exit_cmd_priv:977: target_host ffff88810b8d6000, ffff88810b8d67e0 [ 120.400590] ================================================================== [ 120.400592] BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x6c/0x109 [ib_srp] [ 120.400616] Read of size 8 at addr ffff8881076b1800 by task multipathd/1417 [ 120.400619] [ 120.400621] CPU: 0 PID: 1417 Comm: multipathd Not tainted 5.19.0-rc1-roce-flush+ #85 [ 120.400626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014 crash> struct srp_target_port.srp_host ffff88810b8d67e0 srp_host = 0xffff8881076b1800, crash> struct srp_target_port.srp_host struct srp_target_port { [104] struct srp_host *srp_host; } crash> struct srp_host.srp_dev 0xffff8881076b1800 srp_dev = 0xffff88800bcd1400, crash> struct srp_device 0xffff88800bcd1400 struct srp_device { dev_list = { next = 0xffff888106fafd00, prev = 0xb680010900000749 }, dev = 0x0, pd = 0x0, global_rkey = 0, mr_page_mask = 3, mr_page_size = 181960704, mr_max_size = -30592, max_pages_per_mr = 117112064, has_fr = 129, use_fast_reg = 136 } crash> dis -s srp_exit_cmd_priv+0x6c FILE: ../drivers/infiniband/ulp/srp/ib_srp.c LINE: 978 973 struct srp_request *req; 974 975 pr_info("lizhijian: %s:%d: target_host %px, %px\n", __func__, __LINE__, shost, shost->hostdata); 976 target = host_to_target(shost); 977 pr_info("lizhijian: %s:%d: target_host %px, %px\n", __func__, __LINE__, shost, shost->hostdata); * 978 dev = target->srp_host->srp_dev; 979 ibdev = dev->dev; 980 req = scsi_cmd_priv(cmd); 981 982 kfree(req->fr_list); 983 if (req->indirect_dma_addr) { 984 ib_dma_unmap_single(ibdev, req->indirect_dma_addr, 985 target->indirect_size, 986 DMA_TO_DEVICE); Thanks Zhijian > > Thanks, > Ming > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier 2022-07-01 3:44 ` Ming Lei 2022-07-01 7:25 ` lizhijian @ 2022-07-01 7:45 ` Li, Zhijian 2022-07-01 14:07 ` Bart Van Assche 2 siblings, 0 replies; 14+ messages in thread From: Li, Zhijian @ 2022-07-01 7:45 UTC (permalink / raw) To: Ming Lei, Bart Van Assche Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Christoph Hellwig, Hannes Reinecke, John Garry Send again, the format of previous one is wrong. on 7/1/2022 11:44 AM, Ming Lei wrote: > On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: >> There are two .exit_cmd_priv implementations. Both implementations use >> resources associated with the SCSI host. Make sure that these resources are > Please document what the exact resources associated with this SCSI host is. > > We need the root cause. > > I understand it might be related with module unloading, since ib_srp may > be gone already, but not sure if it is the exact one in this report. > >> still available when .exit_cmd_priv is called by moving the .exit_cmd_priv >> calls from scsi_host_dev_release() to scsi_forget_host(). Moving >> blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is >> safe because scsi_forget_host() drains all the request queues that use the >> host tag set. This guarantees that no requests are in flight and also that >> no new requests will be allocated from the host tag set. >> >> This patch fixes the following use-after-free: >> >> ================================================================== >> BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] >> Read of size 8 at addr ffff888100337000 by task multipathd/16727 > What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 > point to? This bug was reported by me, let's input some debug information. *Attention*: below debug info came from a modified source, so the offset it is a bit different from above one. [ 120.400572] ib_srp: lizhijian: srp_exit_cmd_priv:975: target_host ffff88810b8d6000, ffff88810b8d67e0 [ 120.400578] ib_srp: lizhijian: srp_exit_cmd_priv:977: target_host ffff88810b8d6000, ffff88810b8d67e0 [ 120.400590] ================================================================== [ 120.400592] BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x6c/0x109 [ib_srp] [ 120.400616] Read of size 8 at addr ffff8881076b1800 by task multipathd/1417 [ 120.400619] [ 120.400621] CPU: 0 PID: 1417 Comm: multipathd Not tainted 5.19.0-rc1-roce-flush+ #85 [ 120.400626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-27-g64f37cc530f1-prebuilt.qemu.org 04/01/2014 crash> struct srp_target_port.srp_host ffff88810b8d67e0 srp_host = 0xffff8881076b1800, crash> struct srp_target_port.srp_host struct srp_target_port { [104] struct srp_host *srp_host; } crash> struct srp_host.srp_dev 0xffff8881076b1800 srp_dev = 0xffff88800bcd1400, crash> struct srp_device 0xffff88800bcd1400 struct srp_device { dev_list = { next = 0xffff888106fafd00, prev = 0xb680010900000749 }, dev = 0x0, pd = 0x0, global_rkey = 0, mr_page_mask = 3, mr_page_size = 181960704, mr_max_size = -30592, max_pages_per_mr = 117112064, has_fr = 129, use_fast_reg = 136 } crash> dis -s srp_exit_cmd_priv+0x6c FILE: ../drivers/infiniband/ulp/srp/ib_srp.c LINE: 978 973 struct srp_request *req; 974 975 pr_info("lizhijian: %s:%d: target_host %px, %px\n", __func__, __LINE__, shost, shost->hostdata); 976 target = host_to_target(shost); 977 pr_info("lizhijian: %s:%d: target_host %px, %px\n", __func__, __LINE__, shost, shost->hostdata); * 978 dev = target->srp_host->srp_dev; 979 ibdev = dev->dev; 980 req = scsi_cmd_priv(cmd); 981 982 kfree(req->fr_list); 983 if (req->indirect_dma_addr) { 984 ib_dma_unmap_single(ibdev, req->indirect_dma_addr, 985 target->indirect_size, 986 DMA_TO_DEVICE); 987 } 988 kfree(req->indirect_desc); 989 990 return 0; 991 } Thanks Zhijian > > > Thanks, > Ming > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier 2022-07-01 3:44 ` Ming Lei 2022-07-01 7:25 ` lizhijian 2022-07-01 7:45 ` Li, Zhijian @ 2022-07-01 14:07 ` Bart Van Assche 2022-07-01 14:37 ` Ming Lei 2 siblings, 1 reply; 14+ messages in thread From: Bart Van Assche @ 2022-07-01 14:07 UTC (permalink / raw) To: Ming Lei Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Christoph Hellwig, Hannes Reinecke, John Garry, Li Zhijian On 6/30/22 20:44, Ming Lei wrote: > On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: >> There are two .exit_cmd_priv implementations. Both implementations use >> resources associated with the SCSI host. Make sure that these resources are > > Please document what the exact resources associated with this SCSI host is. > > We need the root cause. > > I understand it might be related with module unloading, since ib_srp may > be gone already, but not sure if it is the exact one in this report. It is not necessary to unload ib_srp to trigger this scenario. Hot-unplugging an RDMA adapter used by the ib_srp driver is sufficient. Hot-unplugging triggers a call of srp_remove_one(). srp_remove_one() itself and also its caller free resources used by srp_exit_cmd_priv(), e.g. struct ib_device. >> still available when .exit_cmd_priv is called by moving the .exit_cmd_priv >> calls from scsi_host_dev_release() to scsi_forget_host(). Moving >> blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is >> safe because scsi_forget_host() drains all the request queues that use the >> host tag set. This guarantees that no requests are in flight and also that >> no new requests will be allocated from the host tag set. >> >> This patch fixes the following use-after-free: >> >> ================================================================== >> BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] >> Read of size 8 at addr ffff888100337000 by task multipathd/16727 > > What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 > point to? I think that Li already answered this question. Thanks, Bart. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier 2022-07-01 14:07 ` Bart Van Assche @ 2022-07-01 14:37 ` Ming Lei 2022-07-01 23:58 ` Bart Van Assche 0 siblings, 1 reply; 14+ messages in thread From: Ming Lei @ 2022-07-01 14:37 UTC (permalink / raw) To: Bart Van Assche Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Christoph Hellwig, Hannes Reinecke, John Garry, Li Zhijian On Fri, Jul 01, 2022 at 07:07:13AM -0700, Bart Van Assche wrote: > On 6/30/22 20:44, Ming Lei wrote: > > On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: > > > There are two .exit_cmd_priv implementations. Both implementations use > > > resources associated with the SCSI host. Make sure that these resources are > > > > Please document what the exact resources associated with this SCSI host is. > > > > We need the root cause. > > > > I understand it might be related with module unloading, since ib_srp may > > be gone already, but not sure if it is the exact one in this report. > > It is not necessary to unload ib_srp to trigger this scenario. > Hot-unplugging an RDMA adapter used by the ib_srp driver is sufficient. > Hot-unplugging triggers a call of srp_remove_one(). srp_remove_one() itself > and also its caller free resources used by srp_exit_cmd_priv(), e.g. struct > ib_device. OK, looks it isn't same with Changhui's report. > > > > still available when .exit_cmd_priv is called by moving the .exit_cmd_priv > > > calls from scsi_host_dev_release() to scsi_forget_host(). Moving > > > blk_mq_free_tag_set() from scsi_host_dev_release() to scsi_forget_host() is > > > safe because scsi_forget_host() drains all the request queues that use the > > > host tag set. This guarantees that no requests are in flight and also that > > > no new requests will be allocated from the host tag set. > > > > > > This patch fixes the following use-after-free: > > > > > > ================================================================== > > > BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] > > > Read of size 8 at addr ffff888100337000 by task multipathd/16727 > > > > What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 > > point to? > > I think that Li already answered this question. OK, from Li's input, the UAF is on the following code: struct srp_device *dev = target->srp_host->srp_dev; So looks you meant target->srp_host is freed by srp_remove_one() before calling srp_exit_cmd_priv? Then when is srp_remove_one() triggered? And why is it called before scsi_remove_host()? Sorry for the stupid question since I am not familiar with srp. Thanks, Ming ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier 2022-07-01 14:37 ` Ming Lei @ 2022-07-01 23:58 ` Bart Van Assche 0 siblings, 0 replies; 14+ messages in thread From: Bart Van Assche @ 2022-07-01 23:58 UTC (permalink / raw) To: Ming Lei Cc: Martin K . Petersen, Jaegeuk Kim, linux-scsi, Christoph Hellwig, Hannes Reinecke, John Garry, Li Zhijian, Mike Christie On 7/1/22 07:37, Ming Lei wrote: > On Fri, Jul 01, 2022 at 07:07:13AM -0700, Bart Van Assche wrote: >> On 6/30/22 20:44, Ming Lei wrote: >>> On Thu, Jun 30, 2022 at 02:37:33PM -0700, Bart Van Assche wrote: >>>> BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] >>>> Read of size 8 at addr ffff888100337000 by task multipathd/16727 >>> >>> What is the 8bytes buffer which triggers UAF? what does srp_exit_cmd_priv+0x27 >>> point to? >> >> I think that Li already answered this question. > > OK, from Li's input, the UAF is on the following code: > > struct srp_device *dev = target->srp_host->srp_dev; > > So looks you meant target->srp_host is freed by srp_remove_one() before calling > srp_exit_cmd_priv? > > Then when is srp_remove_one() triggered? And why is it called before > scsi_remove_host()? Sorry for the stupid question since I am not familiar with srp. Hi Ming, I think that can happen as the result of the following sequence (will look into converting this into a blktests test): * The Soft-RoCE (or soft-iWARP) driver is bound to a network interface. This results in the instantation of an RDMA interface that supports RDMA loopback. * ib_srp and ib_srpt are told to connect to each other over that RDMA loopback interface. This results in the creation of a SCSI host and one or more SCSI devices. * The Soft-RoCE (or soft-iWARP) driver is dissociated from all network interfaces. This causes the RDMA core to report a hot-unplug event. That results in a call of srp_remove_one(). I think the call chain is as follows: rxe_notify() ib_unregister_device_queued() ib_unregister_work() __ib_unregister_device() disable_device() remove_client_context() srp_remove_one() Bart. ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2022-07-05 3:38 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-06-30 21:37 [PATCH v2 0/3] Call blk_mq_free_tag_set() earlier Bart Van Assche 2022-06-30 21:37 ` [PATCH v2 1/3] scsi: Simplify scsi_forget_host() Bart Van Assche 2022-07-05 2:40 ` Ming Lei 2022-06-30 21:37 ` [PATCH v2 2/3] scsi: Make scsi_forget_host() wait for request queue removal Bart Van Assche 2022-07-01 16:25 ` Mike Christie 2022-07-01 23:44 ` Bart Van Assche 2022-07-05 3:38 ` Ming Lei 2022-06-30 21:37 ` [PATCH v2 3/3] scsi: core: Call blk_mq_free_tag_set() earlier Bart Van Assche 2022-07-01 3:44 ` Ming Lei 2022-07-01 7:25 ` lizhijian 2022-07-01 7:45 ` Li, Zhijian 2022-07-01 14:07 ` Bart Van Assche 2022-07-01 14:37 ` Ming Lei 2022-07-01 23:58 ` Bart Van Assche
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox