From: James Bottomley <jejb@linux.vnet.ibm.com>
To: Jason Yan <yanaijie@huawei.com>, martin.petersen@oracle.com
Cc: linux-scsi@vger.kernel.org, Hannes Reinecke <hare@suse.de>,
Christoph Hellwig <hch@lst.de>,
Johannes Thumshirn <jthumshirn@suse.de>,
Zhaohongjiang <zhaohongjiang@huawei.com>,
Miao Xie <miaoxie@huawei.com>
Subject: Re: [PATCH] scsi: fix race condition when removing target
Date: Wed, 29 Nov 2017 08:31:48 -0800
Message-ID: <1511973108.3222.10.camel@linux.vnet.ibm.com>
In-Reply-To: <20171129030556.47833-1-yanaijie@huawei.com>
On Wed, 2017-11-29 at 11:05 +0800, Jason Yan wrote:
> In commit fbce4d97fd43 ("scsi: fixup kernel warning during rmmod()"),
> we removed scsi_device_get() and directly called get_device() to
> increase the refcount of the device. But actually scsi_device_get()
> will fail in three cases:
> 1. the scsi device is in SDEV_DEL or SDEV_CANCEL state
> 2. get_device() fails
> 3. the module is not alive
>
> The intended purpose was to remove the check of the module alive.
> Unfortunately the check of the device state was dropped too. And this
> introduced a race condition like this:
>
> CPU0                                                CPU1
> __scsi_remove_target()
>   ->iterate shost->__devices
>   ->scsi_remove_device()
>   ->put_device()
>     someone still hold a refcount
>                                                     sd_release()
>                                                       ->scsi_disk_put()
>                                                       ->put_device()
>                                                       last put and trigger the device release
>
>   ->goto restart
>   ->iterate shost->__devices and got the same device
>   ->get_device() while refcount is 0
This analysis fails here: get_device() on something with refcount 0
returns NULL. That triggers the if clause to ignore this device.
We may have a more complex way of triggering a dual put race as the
trace implies, but I don't think this is it.
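
To make the argument concrete, here is a userspace sketch of a "get" that
refuses an object whose count has already dropped to zero. It models only
the semantics described above, using C11 atomics. It is not the kernel's
get_device(), and whether get_device() really behaves this way should be
checked against the kobject code in the tree. All toy_* names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy refcounted object; hypothetical, not a kernel structure. */
struct toy_obj {
	atomic_int refcount;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns the object on success, NULL if the last reference is gone.
 */
static struct toy_obj *toy_get_unless_zero(struct toy_obj *obj)
{
	int old = atomic_load(&obj->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded, so the zero check repeats. */
		if (atomic_compare_exchange_weak(&obj->refcount, &old, old + 1))
			return obj;
	}
	return NULL;
}

/* Drop a reference; returns true when this was the last one. */
static bool toy_put(struct toy_obj *obj)
{
	int old = atomic_fetch_sub(&obj->refcount, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	return old == 1;
}
```

With a get of this shape, a list walker that finds an object after its
last put simply skips it, which is the "if clause ignores this device"
case.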
[...]
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 50e7d7e..d398894 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1398,6 +1398,15 @@ void scsi_remove_device(struct scsi_device *sdev)
>  }
>  EXPORT_SYMBOL(scsi_remove_device);
>
> +static int scsi_device_get_not_deleted(struct scsi_device *sdev)
> +{
> +	if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)
> +		return -ENXIO;
> +	if (!get_device(&sdev->sdev_gendev))
> +		return -ENXIO;
> +	return 0;
> +}
This is pretty much scsi_device_get() without the try_module_get(), so
they should probably be combined.
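
One possible shape for combining them, as a single-threaded userspace
sketch: a common helper does the state check and the device get, and a
flag decides whether the owning module is pinned as well. Every toy_*
name, the pin_module parameter and the simplified structures are
assumptions made for illustration; none of this is existing kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device states named in the patch. */
enum toy_state { TOY_RUNNING, TOY_CANCEL, TOY_DEL };

struct toy_module {
	bool alive;
	int users;
};

struct toy_sdev {
	enum toy_state state;
	int refcount;		/* plain int: the toy is not thread safe */
	struct toy_module *owner;
};

/* Pin the module; fails if it is missing or already going away. */
static bool toy_try_module_get(struct toy_module *m)
{
	if (!m || !m->alive)
		return false;
	m->users++;
	return true;
}

/* Take a device reference; refuse an object whose count is zero. */
static struct toy_sdev *toy_get_device(struct toy_sdev *sdev)
{
	if (sdev->refcount == 0)
		return NULL;
	sdev->refcount++;
	return sdev;
}

static void toy_put_device(struct toy_sdev *sdev)
{
	sdev->refcount--;
}

/* Shared body: state check, device get, optional module pin. */
static int toy_sdev_get_common(struct toy_sdev *sdev, bool pin_module)
{
	if (sdev->state == TOY_DEL || sdev->state == TOY_CANCEL)
		return -ENXIO;
	if (!toy_get_device(sdev))
		return -ENXIO;
	if (pin_module && !toy_try_module_get(sdev->owner)) {
		toy_put_device(sdev);	/* undo the device get on failure */
		return -ENXIO;
	}
	return 0;
}

/* Variant that also pins the module. */
static int toy_sdev_get(struct toy_sdev *sdev)
{
	return toy_sdev_get_common(sdev, true);
}

/* Variant for the removal path: no module pin. */
static int toy_sdev_get_not_deleted(struct toy_sdev *sdev)
{
	return toy_sdev_get_common(sdev, false);
}
```

Keeping one body means the state check cannot be dropped from one
variant by accident, which is how the regression described in the patch
came about.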
James
>  static void __scsi_remove_target(struct scsi_target *starget)
>  {
>  	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
> @@ -1415,7 +1424,7 @@ static void __scsi_remove_target(struct scsi_target *starget)
>  		 */
>  		if (sdev->channel != starget->channel ||
>  		    sdev->id != starget->id ||
> -		    !get_device(&sdev->sdev_gendev))
> +		    scsi_device_get_not_deleted(sdev))
>  			continue;
>  		spin_unlock_irqrestore(shost->host_lock, flags);
>  		scsi_remove_device(sdev);