* [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning
@ 2015-01-07 13:03 Bart Van Assche
2015-01-08 13:15 ` Christoph Hellwig
2015-01-15 15:23 ` Don Brace
0 siblings, 2 replies; 12+ messages in thread
From: Bart Van Assche @ 2015-01-07 13:03 UTC (permalink / raw)
To: James Bottomley, Christoph Hellwig
Cc: Hannes Reinecke, linux-scsi@vger.kernel.org
Since kernel v3.19-rc1 module_refcount() returns 1 instead of 0
when called from inside module_exit(). This breaks the
module_refcount() test in scsi_device_put() and hence causes the
following kernel warning to be reported when unloading the ib_srp
kernel module:
WARNING: CPU: 5 PID: 228 at kernel/module.c:954 module_put+0x207/0x220()
Call Trace:
[<ffffffff814d1fcf>] dump_stack+0x4c/0x65
[<ffffffff81053ada>] warn_slowpath_common+0x8a/0xc0
[<ffffffff81053bca>] warn_slowpath_null+0x1a/0x20
[<ffffffff810d0507>] module_put+0x207/0x220
[<ffffffffa000bea8>] scsi_device_put+0x48/0x50 [scsi_mod]
[<ffffffffa03676d2>] scsi_disk_put+0x32/0x50 [sd_mod]
[<ffffffffa0368d4c>] sd_shutdown+0x8c/0x150 [sd_mod]
[<ffffffffa0368e79>] sd_remove+0x69/0xc0 [sd_mod]
[<ffffffff813457ef>] __device_release_driver+0x7f/0xf0
[<ffffffff81345885>] device_release_driver+0x25/0x40
[<ffffffff81345134>] bus_remove_device+0x124/0x1b0
[<ffffffff8134189e>] device_del+0x13e/0x250
[<ffffffffa001cdcd>] __scsi_remove_device+0xcd/0xe0 [scsi_mod]
[<ffffffffa001b39f>] scsi_forget_host+0x6f/0x80 [scsi_mod]
[<ffffffffa000d5f6>] scsi_remove_host+0x86/0x140 [scsi_mod]
[<ffffffffa07d5c0b>] srp_remove_work+0x9b/0x210 [ib_srp]
[<ffffffff8106fd28>] process_one_work+0x1d8/0x780
[<ffffffff810703eb>] worker_thread+0x11b/0x4a0
[<ffffffff81075a6f>] kthread+0xef/0x110
[<ffffffff814dad6c>] ret_from_fork+0x7c/0xb0
See also patch "module: Remove stop_machine from module unloading"
(Masami Hiramatsu; commit e513cc1c07e2; kernel v3.19-rc1).
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
---
drivers/scsi/scsi.c | 63 ++++++++++++++++++++++++++++++++--------------
drivers/scsi/sd.c | 44 +++++++++++++++++---------------
include/scsi/scsi_device.h | 2 ++
3 files changed, 70 insertions(+), 39 deletions(-)
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index e028854..2cae46b 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -973,30 +973,63 @@ int scsi_report_opcode(struct scsi_device *sdev, unsigned char *buffer,
EXPORT_SYMBOL(scsi_report_opcode);
/**
- * scsi_device_get - get an additional reference to a scsi_device
+ * scsi_dev_get - get an additional reference to a scsi_device
* @sdev: device to get a reference to
+ * @get_lld: whether or not to increase the LLD kernel module refcount
*
- * Description: Gets a reference to the scsi_device and increments the use count
- * of the underlying LLDD module. You must hold host_lock of the
- * parent Scsi_Host or already have a reference when calling this.
+ * Description: Gets a reference to the scsi_device and optionally increments
+ * the use count of the associated LLDD module. You must hold host_lock of
+ * the parent Scsi_Host or already have a reference when calling this.
*/
-int scsi_device_get(struct scsi_device *sdev)
+int scsi_dev_get(struct scsi_device *sdev, bool get_lld)
{
if (sdev->sdev_state == SDEV_DEL)
return -ENXIO;
if (!get_device(&sdev->sdev_gendev))
return -ENXIO;
- /* We can fail this if we're doing SCSI operations
- * from module exit (like cache flush) */
- try_module_get(sdev->host->hostt->module);
+ /* Can fail if invoked during module exit (like cache flush) */
+ if (get_lld && !try_module_get(sdev->host->hostt->module)) {
+ put_device(&sdev->sdev_gendev);
+ return -ENXIO;
+ }
return 0;
}
+EXPORT_SYMBOL(scsi_dev_get);
+
+/**
+ * scsi_dev_put - release a reference to a scsi_device
+ * @sdev: device to release a reference on
+ * @put_lld: whether or not to decrease the LLD kernel module refcount
+ *
+ * Description: Release a reference to the scsi_device. The device is freed
+ * once the last user vanishes.
+ */
+void scsi_dev_put(struct scsi_device *sdev, bool put_lld)
+{
+ if (put_lld)
+ module_put(sdev->host->hostt->module);
+ put_device(&sdev->sdev_gendev);
+}
+EXPORT_SYMBOL(scsi_dev_put);
+
+/**
+ * scsi_device_get - get an additional reference to a scsi_device
+ * @sdev: device to get a reference to
+ *
+ * Description: Gets a reference to the scsi_device and increments the use count
+ * of the underlying LLDD module. You must hold host_lock of the
+ * parent Scsi_Host or already have a reference when calling this.
+ */
+int scsi_device_get(struct scsi_device *sdev)
+{
+ return scsi_dev_get(sdev, true);
+}
EXPORT_SYMBOL(scsi_device_get);
/**
- * scsi_device_put - release a reference to a scsi_device
- * @sdev: device to release a reference on.
+ * scsi_device_put - release a reference to a scsi_device
+ * @sdev: device to release a reference on
*
* Description: Release a reference to the scsi_device and decrements the use
* count of the underlying LLDD module. The device is freed once the last
@@ -1004,15 +1037,7 @@ EXPORT_SYMBOL(scsi_device_get);
*/
void scsi_device_put(struct scsi_device *sdev)
{
-#ifdef CONFIG_MODULE_UNLOAD
- struct module *module = sdev->host->hostt->module;
-
- /* The module refcount will be zero if scsi_device_get()
- * was called from a module removal routine */
- if (module && module_refcount(module) != 0)
- module_put(module);
-#endif
- put_device(&sdev->sdev_gendev);
+ scsi_dev_put(sdev, true);
}
EXPORT_SYMBOL(scsi_device_put);
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index fedab3c..05ab2ec 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -564,13 +564,13 @@ static int sd_major(int major_idx)
}
}
-static struct scsi_disk *__scsi_disk_get(struct gendisk *disk)
+static struct scsi_disk *__scsi_disk_get(struct gendisk *disk, bool get_lld)
{
struct scsi_disk *sdkp = NULL;
if (disk->private_data) {
sdkp = scsi_disk(disk);
- if (scsi_device_get(sdkp->device) == 0)
+ if (scsi_dev_get(sdkp->device, get_lld) == 0)
get_device(&sdkp->dev);
else
sdkp = NULL;
@@ -578,35 +578,36 @@ static struct scsi_disk *__scsi_disk_get(struct gendisk *disk)
return sdkp;
}
-static struct scsi_disk *scsi_disk_get(struct gendisk *disk)
+static struct scsi_disk *scsi_disk_get(struct gendisk *disk, bool get_lld)
{
struct scsi_disk *sdkp;
mutex_lock(&sd_ref_mutex);
- sdkp = __scsi_disk_get(disk);
+ sdkp = __scsi_disk_get(disk, get_lld);
mutex_unlock(&sd_ref_mutex);
return sdkp;
}
-static struct scsi_disk *scsi_disk_get_from_dev(struct device *dev)
+static struct scsi_disk *scsi_disk_get_from_dev(struct device *dev,
+ bool get_lld)
{
struct scsi_disk *sdkp;
mutex_lock(&sd_ref_mutex);
sdkp = dev_get_drvdata(dev);
if (sdkp)
- sdkp = __scsi_disk_get(sdkp->disk);
+ sdkp = __scsi_disk_get(sdkp->disk, get_lld);
mutex_unlock(&sd_ref_mutex);
return sdkp;
}
-static void scsi_disk_put(struct scsi_disk *sdkp)
+static void scsi_disk_put(struct scsi_disk *sdkp, bool put_lld)
{
struct scsi_device *sdev = sdkp->device;
mutex_lock(&sd_ref_mutex);
put_device(&sdkp->dev);
- scsi_device_put(sdev);
+ scsi_dev_put(sdev, put_lld);
mutex_unlock(&sd_ref_mutex);
}
@@ -1184,7 +1185,7 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
**/
static int sd_open(struct block_device *bdev, fmode_t mode)
{
- struct scsi_disk *sdkp = scsi_disk_get(bdev->bd_disk);
+ struct scsi_disk *sdkp = scsi_disk_get(bdev->bd_disk, true);
struct scsi_device *sdev;
int retval;
@@ -1239,7 +1240,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
return 0;
error_out:
- scsi_disk_put(sdkp);
+ scsi_disk_put(sdkp, true);
return retval;
}
@@ -1273,7 +1274,7 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
* XXX is followed by a "rmmod sd_mod"?
*/
- scsi_disk_put(sdkp);
+ scsi_disk_put(sdkp, true);
}
static int sd_getgeo(struct block_device *bdev, struct hd_geometry *geo)
@@ -1525,11 +1526,11 @@ static int sd_sync_cache(struct scsi_disk *sdkp)
static void sd_rescan(struct device *dev)
{
- struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+ struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev, true);
if (sdkp) {
revalidate_disk(sdkp->disk);
- scsi_disk_put(sdkp);
+ scsi_disk_put(sdkp, true);
}
}
@@ -3142,11 +3143,14 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start)
/*
* Send a SYNCHRONIZE CACHE instruction down to the device through
* the normal SCSI command structure. Wait for the command to
- * complete.
+ * complete. Since this function can be called during SCSI LLD kernel
+ * module unload and since try_module_get() fails after kernel module
+ * unload has started this function must not try to increase the SCSI
+ * LLD kernel module refcount.
*/
static void sd_shutdown(struct device *dev)
{
- struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+ struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev, false);
if (!sdkp)
return; /* this can happen */
@@ -3165,12 +3169,12 @@ static void sd_shutdown(struct device *dev)
}
exit:
- scsi_disk_put(sdkp);
+ scsi_disk_put(sdkp, false);
}
static int sd_suspend_common(struct device *dev, bool ignore_stop_errors)
{
- struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+ struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev, true);
int ret = 0;
if (!sdkp)
@@ -3196,7 +3200,7 @@ static int sd_suspend_common(struct device *dev, bool ignore_stop_errors)
}
done:
- scsi_disk_put(sdkp);
+ scsi_disk_put(sdkp, true);
return ret;
}
@@ -3212,7 +3216,7 @@ static int sd_suspend_runtime(struct device *dev)
static int sd_resume(struct device *dev)
{
- struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+ struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev, true);
int ret = 0;
if (!sdkp->device->manage_start_stop)
@@ -3222,7 +3226,7 @@ static int sd_resume(struct device *dev)
ret = sd_start_stop_device(sdkp, 1);
done:
- scsi_disk_put(sdkp);
+ scsi_disk_put(sdkp, true);
return ret;
}
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 3a4edd1..a4cb852 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -330,6 +330,8 @@ extern void scsi_remove_device(struct scsi_device *);
extern int scsi_unregister_device_handler(struct scsi_device_handler *scsi_dh);
void scsi_attach_vpd(struct scsi_device *sdev);
+extern int scsi_dev_get(struct scsi_device *, bool get_lld);
+extern void scsi_dev_put(struct scsi_device *, bool put_lld);
extern int scsi_device_get(struct scsi_device *);
extern void scsi_device_put(struct scsi_device *);
extern struct scsi_device *scsi_device_lookup(struct Scsi_Host *,
--
2.1.2
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning
2015-01-07 13:03 [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning Bart Van Assche
@ 2015-01-08 13:15 ` Christoph Hellwig
2015-01-12 16:29 ` Alan Stern
2015-01-15 15:23 ` Don Brace
1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2015-01-08 13:15 UTC (permalink / raw)
To: Bart Van Assche
Cc: James Bottomley, Christoph Hellwig, Hannes Reinecke, Alan Stern,
linux-scsi@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1122 bytes --]
On Wed, Jan 07, 2015 at 02:03:22PM +0100, Bart Van Assche wrote:
> Since kernel v3.19-rc1 module_refcount() returns 1 instead of 0
> when called from inside module_exit(). This breaks the
> module_refcount() test in scsi_device_put() and hence causes the
> following kernel warning to be reported when unloading the ib_srp
> kernel module:
This is getting better, but I still think we need to sort out the root
cause.
The problemt started with commit 39b7f1e25 ("[SCSI] sd: Fix refcounting"),
which added the calls to scsi_device_get in various struct
device_driver/scsi_driver methods. From the BZ it seems like the
rationale was to protect against races between ->rescan and ->remove,
but instead of doing that using refcounting we better ensure that
in the SCSI midlayer by taking scan_mutex around calls to ->rescan.
The first attached patch does that, which allows us to functionally
revert 39b7f1e25, which then also allows to revert 85b6c720
("[SCSI] sd: fix cache flushing on module removal (and individual
device removal)").
See the attached series to do that. Warnings: so far it only got
minimal testing.
[-- Attachment #2: 0001-scsi-serialize-rescan-against-remove.patch --]
[-- Type: text/plain, Size: 1183 bytes --]
>From f92b9687a9663d4bf304303658acfe03167e0667 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Thu, 8 Jan 2015 11:59:56 +0100
Subject: scsi: serialize ->rescan against ->remove
Take the scan_mutex when rescanning a device to protect against a
concurrent device removal.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/scsi/scsi_scan.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 0deb385..7d926cd 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1569,16 +1569,18 @@ EXPORT_SYMBOL(scsi_add_device);
void scsi_rescan_device(struct device *dev)
{
- if (!dev->driver)
- return;
+ struct scsi_device *sdev = to_scsi_device(dev);
+ struct Scsi_Host *shost = sdev->host;
- if (try_module_get(dev->driver->owner)) {
+ mutex_lock(&shost->scan_mutex);
+ if (dev->driver && try_module_get(dev->driver->owner)) {
struct scsi_driver *drv = to_scsi_driver(dev->driver);
if (drv->rescan)
drv->rescan(dev);
module_put(dev->driver->owner);
}
+ mutex_unlock(&shost->scan_mutex);
}
EXPORT_SYMBOL(scsi_rescan_device);
--
1.9.1
[-- Attachment #3: 0002-sd-don-t-grab-reference-device-references-from-drive.patch --]
[-- Type: text/plain, Size: 3836 bytes --]
>From c056e4f177617d095eca495f94753f16043eb38b Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Thu, 8 Jan 2015 12:07:05 +0100
Subject: sd: don't grab reference device references from driver methods
The device model already takes care of races from ->remove and ->shutdown
vs its other methods, and we now take care for the scsi-specific methods
as well.
This is a partial, manual revert of commit 39b7f1e25
("[SCSI] sd: Fix refcounting").
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/scsi/sd.c | 55 +++++++++++--------------------------------------------
1 file changed, 11 insertions(+), 44 deletions(-)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ebf35cb6..be76130 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -564,10 +564,12 @@ static int sd_major(int major_idx)
}
}
-static struct scsi_disk *__scsi_disk_get(struct gendisk *disk)
+static struct scsi_disk *scsi_disk_get(struct gendisk *disk)
{
struct scsi_disk *sdkp = NULL;
+ mutex_lock(&sd_ref_mutex);
+
if (disk->private_data) {
sdkp = scsi_disk(disk);
if (scsi_device_get(sdkp->device) == 0)
@@ -575,27 +577,6 @@ static struct scsi_disk *__scsi_disk_get(struct gendisk *disk)
else
sdkp = NULL;
}
- return sdkp;
-}
-
-static struct scsi_disk *scsi_disk_get(struct gendisk *disk)
-{
- struct scsi_disk *sdkp;
-
- mutex_lock(&sd_ref_mutex);
- sdkp = __scsi_disk_get(disk);
- mutex_unlock(&sd_ref_mutex);
- return sdkp;
-}
-
-static struct scsi_disk *scsi_disk_get_from_dev(struct device *dev)
-{
- struct scsi_disk *sdkp;
-
- mutex_lock(&sd_ref_mutex);
- sdkp = dev_get_drvdata(dev);
- if (sdkp)
- sdkp = __scsi_disk_get(sdkp->disk);
mutex_unlock(&sd_ref_mutex);
return sdkp;
}
@@ -610,8 +591,6 @@ static void scsi_disk_put(struct scsi_disk *sdkp)
mutex_unlock(&sd_ref_mutex);
}
-
-
static unsigned char sd_setup_protect_cmnd(struct scsi_cmnd *scmd,
unsigned int dix, unsigned int dif)
{
@@ -1525,12 +1504,9 @@ static int sd_sync_cache(struct scsi_disk *sdkp)
static void sd_rescan(struct device *dev)
{
- struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+ struct scsi_disk *sdkp = dev_get_drvdata(dev);
- if (sdkp) {
- revalidate_disk(sdkp->disk);
- scsi_disk_put(sdkp);
- }
+ revalidate_disk(sdkp->disk);
}
@@ -3147,13 +3123,13 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start)
*/
static void sd_shutdown(struct device *dev)
{
- struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+ struct scsi_disk *sdkp = dev_get_drvdata(dev);
if (!sdkp)
return; /* this can happen */
if (pm_runtime_suspended(dev))
- goto exit;
+ return;
if (sdkp->WCE && sdkp->media_present) {
sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
@@ -3164,14 +3140,11 @@ static void sd_shutdown(struct device *dev)
sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n");
sd_start_stop_device(sdkp, 0);
}
-
-exit:
- scsi_disk_put(sdkp);
}
static int sd_suspend_common(struct device *dev, bool ignore_stop_errors)
{
- struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+ struct scsi_disk *sdkp = dev_get_drvdata(dev);
int ret = 0;
if (!sdkp)
@@ -3197,7 +3170,6 @@ static int sd_suspend_common(struct device *dev, bool ignore_stop_errors)
}
done:
- scsi_disk_put(sdkp);
return ret;
}
@@ -3213,18 +3185,13 @@ static int sd_suspend_runtime(struct device *dev)
static int sd_resume(struct device *dev)
{
- struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
- int ret = 0;
+ struct scsi_disk *sdkp = dev_get_drvdata(dev);
if (!sdkp->device->manage_start_stop)
- goto done;
+ return 0;
sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
- ret = sd_start_stop_device(sdkp, 1);
-
-done:
- scsi_disk_put(sdkp);
- return ret;
+ return sd_start_stop_device(sdkp, 1);
}
/**
--
1.9.1
[-- Attachment #4: 0003-Revert-SCSI-sd-fix-cache-flushing-on-module-removal-.patch --]
[-- Type: text/plain, Size: 1944 bytes --]
>From 53378be96577c3fdb31dc1bdc0a73ec49171c763 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Thu, 8 Jan 2015 12:10:29 +0100
Subject: Revert "[SCSI] sd: fix cache flushing on module removal (and
individual device removal)"
This reverts commit 85b6c720b0931101c8bcc3a5abdc2b8514b0fb4b.
We now never call scsi_device_get from the shutdown path, and this
workaround turned out to create more problems than it solves. Move
back to properly checking the device state and carefully handle
module refcounting including checking the return value.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/scsi/scsi.c | 19 ++++++-------------
1 file changed, 6 insertions(+), 13 deletions(-)
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 08c90a7..ff8f888 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -975,14 +975,14 @@ EXPORT_SYMBOL(scsi_report_opcode);
*/
int scsi_device_get(struct scsi_device *sdev)
{
- if (sdev->sdev_state == SDEV_DEL)
+ if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)
return -ENXIO;
if (!get_device(&sdev->sdev_gendev))
return -ENXIO;
- /* We can fail this if we're doing SCSI operations
- * from module exit (like cache flush) */
- try_module_get(sdev->host->hostt->module);
-
+ if (!try_module_get(sdev->host->hostt->module)) {
+ put_device(&sdev->sdev_gendev);
+ return -ENXIO;
+ }
return 0;
}
EXPORT_SYMBOL(scsi_device_get);
@@ -997,14 +997,7 @@ EXPORT_SYMBOL(scsi_device_get);
*/
void scsi_device_put(struct scsi_device *sdev)
{
-#ifdef CONFIG_MODULE_UNLOAD
- struct module *module = sdev->host->hostt->module;
-
- /* The module refcount will be zero if scsi_device_get()
- * was called from a module removal routine */
- if (module && module_refcount(module) != 0)
- module_put(module);
-#endif
+ module_put(sdev->host->hostt->module);
put_device(&sdev->sdev_gendev);
}
EXPORT_SYMBOL(scsi_device_put);
--
1.9.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning
2015-01-08 13:15 ` Christoph Hellwig
@ 2015-01-12 16:29 ` Alan Stern
2015-01-14 9:33 ` Christoph Hellwig
0 siblings, 1 reply; 12+ messages in thread
From: Alan Stern @ 2015-01-12 16:29 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Bart Van Assche, James Bottomley, Hannes Reinecke,
linux-scsi@vger.kernel.org
On Thu, 8 Jan 2015, Christoph Hellwig wrote:
> On Wed, Jan 07, 2015 at 02:03:22PM +0100, Bart Van Assche wrote:
> > Since kernel v3.19-rc1 module_refcount() returns 1 instead of 0
> > when called from inside module_exit(). This breaks the
> > module_refcount() test in scsi_device_put() and hence causes the
> > following kernel warning to be reported when unloading the ib_srp
> > kernel module:
>
> This is getting better, but I still think we need to sort out the root
> cause.
>
> The problemt started with commit 39b7f1e25 ("[SCSI] sd: Fix refcounting"),
> which added the calls to scsi_device_get in various struct
> device_driver/scsi_driver methods. From the BZ it seems like the
> rationale was to protect against races between ->rescan and ->remove,
> but instead of doing that using refcounting we better ensure that
> in the SCSI midlayer by taking scan_mutex around calls to ->rescan.
> The first attached patch does that, which allows us to functionally
> revert 39b7f1e25, which then also allows to revert 85b6c720
> ("[SCSI] sd: fix cache flushing on module removal (and individual
> device removal)").
>
> See the attached series to do that. Warnings: so far it only got
> minimal testing.
This seems like a good idea and the obvious (once it has been pointed
out!) approach.
Perhaps not directly related to the issue at hand is this question: In
scsi_rescan_device() we will now have:
mutex_lock(&shost->scan_mutex);
if (dev->driver && try_module_get(dev->driver->owner)) {
struct scsi_driver *drv = to_scsi_driver(dev->driver);
if (drv->rescan)
drv->rescan(dev);
module_put(dev->driver->owner);
}
mutex_unlock(&shost->scan_mutex);
What prevents the device from being unbound from its driver while the
rescan runs? Evaluating the argument to the module_put() would then
dereference a NULL pointer.
Unbind events that happen through the normal scsi_remove_host()
mechanism are fine, because scsi_remove_host() locks the scan_mutex.
But what about writes to the driver's sysfs "unbind" attribute?
Alan Stern
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning
2015-01-12 16:29 ` Alan Stern
@ 2015-01-14 9:33 ` Christoph Hellwig
2015-01-14 15:07 ` Alan Stern
2015-01-20 15:11 ` [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning Alan Stern
0 siblings, 2 replies; 12+ messages in thread
From: Christoph Hellwig @ 2015-01-14 9:33 UTC (permalink / raw)
To: Alan Stern
Cc: Christoph Hellwig, Bart Van Assche, James Bottomley,
Hannes Reinecke, linux-scsi@vger.kernel.org, Greg Kroah-Hartman,
linux-kernel
On Mon, Jan 12, 2015 at 11:29:15AM -0500, Alan Stern wrote:
> This seems like a good idea and the obvious (once it has been pointed
> out!) approach.
>
> Perhaps not directly related to the issue at hand is this question: In
> scsi_rescan_device() we will now have:
>
> mutex_lock(&shost->scan_mutex);
> if (dev->driver && try_module_get(dev->driver->owner)) {
> struct scsi_driver *drv = to_scsi_driver(dev->driver);
>
> if (drv->rescan)
> drv->rescan(dev);
> module_put(dev->driver->owner);
> }
> mutex_unlock(&shost->scan_mutex);
>
> What prevents the device from being unbound from its driver while the
> rescan runs? Evaluating the argument to the module_put() would then
> dereference a NULL pointer.
>
> Unbind events that happen through the normal scsi_remove_host()
> mechanism are fine, because scsi_remove_host() locks the scan_mutex.
> But what about writes to the driver's sysfs "unbind" attribute?
Looks like we should still get an unconditional reference to
the device using get_device in scsi_rescan_device at least.
But this seems like a more generic problem, and at least a quick glance at
the pci_driver methods seems like others don't have a good
synchroniation of ->remove against random driver methods.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning
2015-01-14 9:33 ` Christoph Hellwig
@ 2015-01-14 15:07 ` Alan Stern
2015-01-15 16:06 ` Christoph Hellwig
2015-01-20 15:11 ` [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning Alan Stern
1 sibling, 1 reply; 12+ messages in thread
From: Alan Stern @ 2015-01-14 15:07 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Bart Van Assche, James Bottomley, Hannes Reinecke,
linux-scsi@vger.kernel.org, Greg Kroah-Hartman, linux-kernel
On Wed, 14 Jan 2015, Christoph Hellwig wrote:
> On Mon, Jan 12, 2015 at 11:29:15AM -0500, Alan Stern wrote:
> > This seems like a good idea and the obvious (once it has been pointed
> > out!) approach.
> >
> > Perhaps not directly related to the issue at hand is this question: In
> > scsi_rescan_device() we will now have:
> >
> > mutex_lock(&shost->scan_mutex);
> > if (dev->driver && try_module_get(dev->driver->owner)) {
> > struct scsi_driver *drv = to_scsi_driver(dev->driver);
> >
> > if (drv->rescan)
> > drv->rescan(dev);
> > module_put(dev->driver->owner);
> > }
> > mutex_unlock(&shost->scan_mutex);
> >
> > What prevents the device from being unbound from its driver while the
> > rescan runs? Evaluating the argument to the module_put() would then
> > dereference a NULL pointer.
> >
> > Unbind events that happen through the normal scsi_remove_host()
> > mechanism are fine, because scsi_remove_host() locks the scan_mutex.
> > But what about writes to the driver's sysfs "unbind" attribute?
>
> Looks like we should still get an unconditional reference to
> the device using get_device in scsi_rescan_device at least.
I'm not sure that's necessary. scsi_rescan_device is exposed by sysfs,
and the kernfs core insures that the underlying device won't be
deallocated while a sysfs method runs.
(scsi_rescan_device is also EXPORTed, so in theory it could be called
under less safe circumstances. Perhaps then the burden should fall on
the caller to guarantee that the device won't be deallocated.)
> But this seems like a more generic problem, and at least a quick glance at
> the pci_driver methods seems like others don't have a good
> synchroniation of ->remove against random driver methods.
Can you give one or two examples?
Alan Stern
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning
2015-01-07 13:03 [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning Bart Van Assche
2015-01-08 13:15 ` Christoph Hellwig
@ 2015-01-15 15:23 ` Don Brace
1 sibling, 0 replies; 12+ messages in thread
From: Don Brace @ 2015-01-15 15:23 UTC (permalink / raw)
To: Bart Van Assche, James Bottomley, Christoph Hellwig
Cc: Hannes Reinecke, linux-scsi@vger.kernel.org
On 01/07/2015 07:03 AM, Bart Van Assche wrote:
> Since kernel v3.19-rc1 module_refcount() returns 1 instead of 0
> when called from inside module_exit(). This breaks the
> module_refcount() test in scsi_device_put() and hence causes the
> following kernel warning to be reported when unloading the ib_srp
> kernel module:
>
Tested-by: Don Brace <don.brace@pmcs.com>
This patch fixed module removal crashes I was seeing with the hpsa
driver running on 3.19-rc4 kernel. Ran ~200 insmod/rmmod tests with no
issues.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning
2015-01-14 15:07 ` Alan Stern
@ 2015-01-15 16:06 ` Christoph Hellwig
2015-01-15 18:22 ` sysfs methods can race with ->remove Alan Stern
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2015-01-15 16:06 UTC (permalink / raw)
To: Alan Stern
Cc: Bart Van Assche, James Bottomley, Hannes Reinecke,
linux-scsi@vger.kernel.org, Greg Kroah-Hartman, linux-kernel
On Wed, Jan 14, 2015 at 10:07:00AM -0500, Alan Stern wrote:
> and the kernfs core insures that the underlying device won't be
> deallocated while a sysfs method runs.
It has a reference to keep it from beeing freed, but so far I can't find
anything that prevents ->remove from beeing called while we are in or
just before a method call.
> > But this seems like a more generic problem, and at least a quick glance at
> > the pci_driver methods seems like others don't have a good
> > synchroniation of ->remove against random driver methods.
>
> Can you give one or two examples?
I look at the sriov_configure PCI method, or the various sub-methods
under pci_driver.err_handler.
^ permalink raw reply [flat|nested] 12+ messages in thread
* sysfs methods can race with ->remove
2015-01-15 16:06 ` Christoph Hellwig
@ 2015-01-15 18:22 ` Alan Stern
2015-01-15 19:40 ` Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: Alan Stern @ 2015-01-15 18:22 UTC (permalink / raw)
To: Christoph Hellwig, Tejun Heo
Cc: Bart Van Assche, James Bottomley, Hannes Reinecke,
linux-scsi@vger.kernel.org, Greg Kroah-Hartman,
Kernel development list
Tejun:
The context is that we have been talking about
drivers/scsi/scsi_scan.c:scsi_rescan_device(), which is called by the
store_rescan_field() sysfs method in scsi_sysfs.c. The problem is
this: What happens in scsi_rescan_device if the device is unbound from
its driver before the module_put call? The dev->driver->owner
calculation would dereference a NULL pointer.
On Thu, 15 Jan 2015, Christoph Hellwig wrote:
> On Wed, Jan 14, 2015 at 10:07:00AM -0500, Alan Stern wrote:
> > and the kernfs core insures that the underlying device won't be
> > deallocated while a sysfs method runs.
>
> It has a reference to keep it from beeing freed, but so far I can't find
> anything that prevents ->remove from beeing called while we are in or
> just before a method call.
There are two types of methods to think about: Those registered by the
subsystem and those registered by the driver.
If a method is registered by the driver, then the driver will
unregister it when the ->remove routine runs. I don't know for
certain, but I would expect that the sysfs/kernfs core will make sure
that any existing method calls complete before unregister returns.
This would prevent races.
If a method is registered by the subsystem, and if the method runs
entirely within the subsystem's code, then ->remove doesn't matter.
The driver could be unbound while the method is running and it would be
okay.
The only time we have a problem is when the method is registered by the
subsystem and the method calls into the driver. (Note that this is
exactly what happens with scsi_rescan_device.)
> > > But this seems like a more generic problem, and at least a quick glance at
> > > the pci_driver methods seems like others don't have a good
> > > synchroniation of ->remove against random driver methods.
> >
> > Can you give one or two examples?
>
> I look at the sriov_configure PCI method, or the various sub-methods
> under pci_driver.err_handler.
The sriov_numvfs_store method does have the same problem, and so does
the reset_store method (by way of pci_reset_function ->
pci_dev_save_and_disable -> pci_reset_notify).
Tejun, is my analysis correct? How should we fix these races?
Alan Stern
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: sysfs methods can race with ->remove
2015-01-15 18:22 ` sysfs methods can race with ->remove Alan Stern
@ 2015-01-15 19:40 ` Tejun Heo
2015-01-26 17:19 ` Christoph Hellwig
0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2015-01-15 19:40 UTC (permalink / raw)
To: Alan Stern
Cc: Christoph Hellwig, Bart Van Assche, James Bottomley,
Hannes Reinecke, linux-scsi@vger.kernel.org, Greg Kroah-Hartman,
Kernel development list
Hello, Alan.
On Thu, Jan 15, 2015 at 01:22:03PM -0500, Alan Stern wrote:
> > It has a reference to keep it from beeing freed, but so far I can't find
> > anything that prevents ->remove from beeing called while we are in or
> > just before a method call.
>
> There are two types of methods to think about: Those registered by the
> subsystem and those registered by the driver.
>
> If a method is registered by the driver, then the driver will
> unregister it when the ->remove routine runs. I don't know for
> certain, but I would expect that the sysfs/kernfs core will make sure
> that any existing method calls complete before unregister returns.
> This would prevent races.
Yes, attribute deletions are blocked till the on-going sysfs
read/write operations are finished and further rw accesses are failed.
> If a method is registered by the subsystem, and if the method runs
> entirely within the subsystem's code, then ->remove doesn't matter.
> The driver could be unbound while the method is running and it would be
> okay.
>
> The only time we have a problem is when the method is registered by the
> subsystem and the method calls into the driver. (Note that this is
> exactly what happens with scsi_rescan_device.)
>
> > > > But this seems like a more generic problem, and at least a quick glance at
> > > > the pci_driver methods seems like others don't have a good
> > > > synchroniation of ->remove against random driver methods.
> > >
> > > Can you give one or two examples?
> >
> > I look at the sriov_configure PCI method, or the various sub-methods
> > under pci_driver.err_handler.
>
> The sriov_numvfs_store method does have the same problem, and so does
> the reset_store method (by way of pci_reset_function ->
> pci_dev_save_and_disable -> pci_reset_notify).
>
> Tejun, is my analysis correct? How should we fix these races?
I'm not really following what the actual problem case is, so SCSI
subsystem store methods are derefing dev->driver without synchronizing
against detach events? If that's the case, the solution would be
synchronizing against attach/detach events? Sorry if I'm being
totally idiotic. I'm having a bit of hard time jumping right in. :)
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning
2015-01-14 9:33 ` Christoph Hellwig
2015-01-14 15:07 ` Alan Stern
@ 2015-01-20 15:11 ` Alan Stern
1 sibling, 0 replies; 12+ messages in thread
From: Alan Stern @ 2015-01-20 15:11 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Bart Van Assche, James Bottomley, Hannes Reinecke,
linux-scsi@vger.kernel.org, Greg Kroah-Hartman,
Kernel development list
On Wed, 14 Jan 2015, Christoph Hellwig wrote:
> On Mon, Jan 12, 2015 at 11:29:15AM -0500, Alan Stern wrote:
> > This seems like a good idea and the obvious (once it has been pointed
> > out!) approach.
> >
> > Perhaps not directly related to the issue at hand is this question: In
> > scsi_rescan_device() we will now have:
> >
> > mutex_lock(&shost->scan_mutex);
> > if (dev->driver && try_module_get(dev->driver->owner)) {
> > struct scsi_driver *drv = to_scsi_driver(dev->driver);
> >
> > if (drv->rescan)
> > drv->rescan(dev);
> > module_put(dev->driver->owner);
> > }
> > mutex_unlock(&shost->scan_mutex);
> >
> > What prevents the device from being unbound from its driver while the
> > rescan runs? Evaluating the argument to the module_put() would then
> > dereference a NULL pointer.
> >
> > Unbind events that happen through the normal scsi_remove_host()
> > mechanism are fine, because scsi_remove_host() locks the scan_mutex.
> > But what about writes to the driver's sysfs "unbind" attribute?
>
> Looks like we should still get an unconditional reference to
> the device using get_device in scsi_rescan_device at least.
>
> But this seems like a more generic problem, and at least a quick glance at
> the pci_driver methods seems like others don't have a good
> synchroniation of ->remove against random driver methods.
This particular problem comes down to the fact that
scsi_rescan_device() accesses dev->driver without appropriate mutual
exclusion. SCSI's scan_mutex won't help because it doesn't protect
dev->driver. Rather, dev->driver is protected by dev->mutex, and so
scsi_rescan_device() needs to use device_lock/unlock.
This suggests that the scan_mutex may not be necessary at all.
Historically, it seems to be quite old, predating the device model.
Now that we have the device model, maybe scan_mutex simply isn't
needed.
Scanning for channels or targets beneath a host should be protected by
shost->gendev.mutex. Scanning for logical units beneath a target
should be protected by starget->dev.mutex. Scanning for partitions
beneath a SCSI drive should be protected by sdev->sdev_gendev.mutex.
James, here's a related question. Suppose userspace writes to the
rescan attribute file for a disk drive for sd_probe_async() has
started. What will happen? What _ought_ to happen? Do we need to
call
async_synchronize_full_domain(&scsi_sd_probe_domain);
somewhere in this pathway, or will it be okay?
Alan Stern
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: sysfs methods can race with ->remove
2015-01-15 19:40 ` Tejun Heo
@ 2015-01-26 17:19 ` Christoph Hellwig
2015-01-26 18:38 ` Alan Stern
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2015-01-26 17:19 UTC (permalink / raw)
To: Tejun Heo
Cc: Alan Stern, Bart Van Assche, James Bottomley, Hannes Reinecke,
linux-scsi@vger.kernel.org, Greg Kroah-Hartman,
Kernel development list
On Thu, Jan 15, 2015 at 02:40:31PM -0500, Tejun Heo wrote:
> > If a method is registered by the driver, then the driver will
> > unregister it when the ->remove routine runs. I don't know for
> > certain, but I would expect that the sysfs/kernfs core will make sure
> > that any existing method calls complete before unregister returns.
> > This would prevent races.
>
> Yes, attribute deletions are blocked till the on-going sysfs
> read/write operations are finished and further rw accesses are failed.
Btw, where do we do that? I did a walk through the code starting
from device_del, but must have missed the obvious.
> > The sriov_numvfs_store method does have the same problem, and so does
> > the reset_store method (by way of pci_reset_function ->
> > pci_dev_save_and_disable -> pci_reset_notify).
> >
> > Tejun, is my analysis correct? How should we fix these races?
>
> I'm not really following what the actual problem case is, so SCSI
> subsystem store methods are derefing dev->driver without synchronizing
> against detach events? If that's the case, the solution would be
> synchronizing against attach/detach events? Sorry if I'm being
> totally idiotic. I'm having a bit of hard time jumping right in. :)
No problem. That's the basic situation we are talking about. I have
a serie fixing some long standing issues in the device model integration
in SCSI, and pointed out a possible issue in that area.
So what is the proper lock to take to prevent ->remove from beeing
called while in such a method? A mentioned about I tried to peel
through all the layers of the onion^H^H^H^H^Hdriver core, but so far
couldn't find anything obvious.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: sysfs methods can race with ->remove
2015-01-26 17:19 ` Christoph Hellwig
@ 2015-01-26 18:38 ` Alan Stern
0 siblings, 0 replies; 12+ messages in thread
From: Alan Stern @ 2015-01-26 18:38 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Tejun Heo, Bart Van Assche, James Bottomley, Hannes Reinecke,
linux-scsi@vger.kernel.org, Greg Kroah-Hartman,
Kernel development list
On Mon, 26 Jan 2015, Christoph Hellwig wrote:
> On Thu, Jan 15, 2015 at 02:40:31PM -0500, Tejun Heo wrote:
> > > If a method is registered by the driver, then the driver will
> > > unregister it when the ->remove routine runs. I don't know for
> > > certain, but I would expect that the sysfs/kernfs core will make sure
> > > that any existing method calls complete before unregister returns.
> > > This would prevent races.
> >
> > Yes, attribute deletions are blocked till the on-going sysfs
> > read/write operations are finished and further rw accesses are failed.
>
> Btw, where do we do that? I did a walk through the code starting
> from device_del, but must have missed the obvious.
It happens in fs/kernfs/dir.c:kernfs_drain(). That routine is called
when a sysfs file is removed, and it waits until all ongoing read/write
operations are finished.
> > > The sriov_numvfs_store method does have the same problem, and so does
> > > the reset_store method (by way of pci_reset_function ->
> > > pci_dev_save_and_disable -> pci_reset_notify).
> > >
> > > Tejun, is my analysis correct? How should we fix these races?
> >
> > I'm not really following what the actual problem case is, so SCSI
> > subsystem store methods are derefing dev->driver without synchronizing
> > against detach events? If that's the case, the solution would be
> > synchronizing against attach/detach events? Sorry if I'm being
> > totally idiotic. I'm having a bit of hard time jumping right in. :)
>
> No problem. That's the basic situation we are talking about. I have
> a serie fixing some long standing issues in the device model integration
> in SCSI, and pointed out a possible issue in that area.
>
> So what is the proper lock to take to prevent ->remove from beeing
> called while in such a method? A mentioned about I tried to peel
> through all the layers of the onion^H^H^H^H^Hdriver core, but so far
> couldn't find anything obvious.
The proper lock is dev->mutex, as I mentioned in an earlier email
(http://marc.info/?l=linux-scsi&m=142176669519982&w=2). That lock is
held whenever a ->remove method is called.
Alan Stern
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-01-26 18:38 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-01-07 13:03 [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning Bart Van Assche
2015-01-08 13:15 ` Christoph Hellwig
2015-01-12 16:29 ` Alan Stern
2015-01-14 9:33 ` Christoph Hellwig
2015-01-14 15:07 ` Alan Stern
2015-01-15 16:06 ` Christoph Hellwig
2015-01-15 18:22 ` sysfs methods can race with ->remove Alan Stern
2015-01-15 19:40 ` Tejun Heo
2015-01-26 17:19 ` Christoph Hellwig
2015-01-26 18:38 ` Alan Stern
2015-01-20 15:11 ` [PATCH for v3.19, v2] Avoid that sd_shutdown() triggers a kernel warning Alan Stern
2015-01-15 15:23 ` Don Brace
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).