* [PATCH] sd: fix race between rescan and remove
@ 2005-11-03 15:54 Alan Stern
2005-11-04 2:08 ` James Bottomley
0 siblings, 1 reply; 4+ messages in thread
From: Alan Stern @ 2005-11-03 15:54 UTC (permalink / raw)
To: James Bottomley; +Cc: SCSI development list
James:
When the well-known open/remove race was fixed in sd.c, a closely-related
race between rescan and remove was ignored. Probably because rescan gets
called from an unusual pathway (via sysfs). It's easy for this race to
trigger an oops, especially if you unplug a USB storage device while a
user program continually writes to the "rescan" attribute file.
This patch (as597) fixes the race by making the rescan routine acquire the
appropriate references. It also cleans up the code scsi_disk_get
somewhat. This resolves Bugzilla entry #5237.
Alan Stern
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
---
Do you think this is appropriate for 2.6.14.stable?
Index: usb-2.6/drivers/scsi/sd.c
===================================================================
--- usb-2.6.orig/drivers/scsi/sd.c
+++ usb-2.6/drivers/scsi/sd.c
@@ -182,19 +182,29 @@ static struct scsi_disk *scsi_disk_get(s
struct scsi_disk *sdkp = NULL;
down(&sd_ref_sem);
- if (disk->private_data == NULL)
- goto out;
- sdkp = scsi_disk(disk);
- kref_get(&sdkp->kref);
- if (scsi_device_get(sdkp->device))
- goto out_put;
+ if (disk->private_data) {
+ sdkp = scsi_disk(disk);
+ if (scsi_device_get(sdkp->device) == 0)
+ kref_get(&sdkp->kref);
+ else
+ sdkp = NULL;
+ }
up(&sd_ref_sem);
return sdkp;
+}
- out_put:
- kref_put(&sdkp->kref, scsi_disk_release);
- sdkp = NULL;
- out:
+static struct scsi_disk *scsi_disk_get_from_dev(struct device *dev)
+{
+ struct scsi_disk *sdkp;
+
+ down(&sd_ref_sem);
+ sdkp = dev_get_drvdata(dev);
+ if (sdkp) {
+ if (scsi_device_get(sdkp->device) == 0)
+ kref_get(&sdkp->kref);
+ else
+ sdkp = NULL;
+ }
up(&sd_ref_sem);
return sdkp;
}
@@ -769,8 +779,13 @@ static int sd_prepare_flush(request_queu
static void sd_rescan(struct device *dev)
{
- struct scsi_disk *sdkp = dev_get_drvdata(dev);
- sd_revalidate_disk(sdkp->disk);
+ struct scsi_disk *sdkp;
+
+ sdkp = scsi_disk_get_from_dev(dev);
+ if (sdkp) {
+ sd_revalidate_disk(sdkp->disk);
+ scsi_disk_put(sdkp);
+ }
}
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] sd: fix race between rescan and remove 2005-11-03 15:54 [PATCH] sd: fix race between rescan and remove Alan Stern @ 2005-11-04 2:08 ` James Bottomley 2005-11-04 16:40 ` Alan Stern 0 siblings, 1 reply; 4+ messages in thread From: James Bottomley @ 2005-11-04 2:08 UTC (permalink / raw) To: Alan Stern; +Cc: SCSI development list On Thu, 2005-11-03 at 10:54 -0500, Alan Stern wrote: > When the well-known open/remove race was fixed in sd.c, a closely-related > race between rescan and remove was ignored. Probably because rescan gets > called from an unusual pathway (via sysfs). It's easy for this race to > trigger an oops, especially if you unplug a USB storage device while a > user program continually writes to the "rescan" attribute file. > > This patch (as597) fixes the race by making the rescan routine acquire the > appropriate references. It also cleans up the code scsi_disk_get > somewhat. This resolves Bugzilla entry #5237. Thanks for investigating this ... I really hate debugging these obscure object reference races. I'm afraid that your fix isn't quite right: The problem occurs because in most cases scsi_remove_device will call sd_remove (via the driver model call backs from device_del). The kref_put in sd_remove will (almost always) trigger scsi_disk_release which frees sdkp. So, I'm afraid you can't rely on sdkp->device being correct. The fix for this one is a bit nasty because it's a reversal in the logic. The usual logic is that the user does something, so we have a refcounted gendisk object that we tie to scsi_disk and then scsi_device. In this case, we actually get a scsi_device and we have to try to work out how to get a valid refcounted scsi_disk object out of it. Unfortunately, this problem doesn't just affect sd_rescan, it affects the flush functions too where they try to do the same thing. Could you take a look at the attached? I just threw it together on a 'plane, so it's completely untested. It tries to implement the secure get of struct scsi_disk by tying the nulling out of drvdata to the sd_ref_sem, so we can now either get a reference to the scsi_disk or return NULL. I also fixed all the other relevant driver data users to follow this model. Thanks, James diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -177,24 +177,38 @@ static inline struct scsi_disk *scsi_dis return container_of(disk->private_data, struct scsi_disk, driver); } -static struct scsi_disk *scsi_disk_get(struct gendisk *disk) +static struct scsi_disk *__scsi_disk_get(struct gendisk *disk) { struct scsi_disk *sdkp = NULL; - down(&sd_ref_sem); if (disk->private_data == NULL) goto out; sdkp = scsi_disk(disk); + if (scsi_device_get(sdkp->device)) { + sdkp = NULL; + goto out; + } kref_get(&sdkp->kref); - if (scsi_device_get(sdkp->device)) - goto out_put; + out: + return sdkp; +} + +static struct scsi_disk *scsi_disk_get(struct gendisk *disk) +{ + struct scsi_disk *sdkp; + down(&sd_ref_sem); + sdkp = __scsi_disk_get(disk); up(&sd_ref_sem); return sdkp; +} - out_put: - kref_put(&sdkp->kref, scsi_disk_release); - sdkp = NULL; - out: +static struct scsi_disk *scsi_disk_get_from_dev(struct device *dev) +{ + struct scsi_disk *sdkp; + down(&sd_ref_sem); + sdkp = dev_get_drvdata(dev); + if (sdkp) + sdkp = __scsi_disk_get(sdkp->disk); up(&sd_ref_sem); return sdkp; } @@ -716,16 +730,20 @@ static int sd_sync_cache(struct scsi_dev static int sd_issue_flush(struct device *dev, sector_t *error_sector) { + int ret = 0; struct scsi_device *sdp = to_scsi_device(dev); - struct scsi_disk *sdkp = dev_get_drvdata(dev); + struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); if (!sdkp) return -ENODEV; if (!sdkp->WCE) - return 0; + goto out; - return sd_sync_cache(sdp); + ret = sd_sync_cache(sdp); + out: + scsi_disk_put(sdkp); + return ret; } static void sd_end_flush(request_queue_t *q, struct request *flush_rq) @@ -754,13 +772,14 @@ static void sd_end_flush(request_queue_t static int sd_prepare_flush(request_queue_t *q, struct request *rq) { struct scsi_device *sdev = q->queuedata; - struct scsi_disk *sdkp = dev_get_drvdata(&sdev->sdev_gendev); + struct scsi_disk *sdkp = scsi_disk_get_from_dev(&sdev->sdev_gendev); - if (sdkp->WCE) { + if (sdkp && sdkp->WCE) { memset(rq->cmd, 0, sizeof(rq->cmd)); rq->flags |= REQ_BLOCK_PC | REQ_SOFTBARRIER; rq->timeout = SD_TIMEOUT; rq->cmd[0] = SYNCHRONIZE_CACHE; + scsi_disk_put(sdkp); return 1; } @@ -769,8 +788,11 @@ static int sd_prepare_flush(request_queu static void sd_rescan(struct device *dev) { - struct scsi_disk *sdkp = dev_get_drvdata(dev); - sd_revalidate_disk(sdkp->disk); + struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); + if (sdkp) { + sd_revalidate_disk(sdkp->disk); + scsi_disk_put(sdkp); + } } @@ -1641,6 +1663,7 @@ static int sd_remove(struct device *dev) del_gendisk(sdkp->disk); sd_shutdown(dev); down(&sd_ref_sem); + dev_set_drvdata(dev, NULL); kref_put(&sdkp->kref, scsi_disk_release); up(&sd_ref_sem); @@ -1680,18 +1703,20 @@ static void scsi_disk_release(struct kre static void sd_shutdown(struct device *dev) { struct scsi_device *sdp = to_scsi_device(dev); - struct scsi_disk *sdkp = dev_get_drvdata(dev); + struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); if (!sdkp) return; /* this can happen */ if (!sdkp->WCE) - return; + goto out; printk(KERN_NOTICE "Synchronizing SCSI cache for disk %s: \n", sdkp->disk->disk_name); sd_sync_cache(sdp); -} + out: + scsi_disk_put(sdkp); +} /** * init_sd - entry point for this driver (both when built in or when ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] sd: fix race between rescan and remove 2005-11-04 2:08 ` James Bottomley @ 2005-11-04 16:40 ` Alan Stern 2005-11-04 17:45 ` James Bottomley 0 siblings, 1 reply; 4+ messages in thread From: Alan Stern @ 2005-11-04 16:40 UTC (permalink / raw) To: James Bottomley; +Cc: SCSI development list On Thu, 3 Nov 2005, James Bottomley wrote: > Thanks for investigating this ... I really hate debugging these obscure > object reference races. Yeah, it was a real pain to find the cause. Took most of two days. And yet it should have been obvious where to look... > I'm afraid that your fix isn't quite right: The problem occurs because > in most cases scsi_remove_device will call sd_remove (via the driver > model call backs from device_del). The kref_put in sd_remove will > (almost always) trigger scsi_disk_release which frees sdkp. So, I'm > afraid you can't rely on sdkp->device being correct. Hmm. Surely that's a problem for the old scsi_disk_get routine rather than the new scsi_disk_get_from_dev routine? The problem I fixed was different: sdkp was getting freed while it was still in use by the rescan procedure. But I agree that after sd_remove returns, we can't depend on sdkp->device being correct. > The fix for this one is a bit nasty because it's a reversal in the > logic. The usual logic is that the user does something, so we have a > refcounted gendisk object that we tie to scsi_disk and then scsi_device. > In this case, we actually get a scsi_device and we have to try to work > out how to get a valid refcounted scsi_disk object out of it. > Unfortunately, this problem doesn't just affect sd_rescan, it affects > the flush functions too where they try to do the same thing. I wasn't sure about them. It wasn't clear if they could get called after sd_remove. However it certainly doesn't hurt to make them call the new routine. > Could you take a look at the attached? I just threw it together on a > 'plane, so it's completely untested. It tries to implement the secure > get of struct scsi_disk by tying the nulling out of drvdata to the > sd_ref_sem, so we can now either get a reference to the scsi_disk or > return NULL. I also fixed all the other relevant driver data users to > follow this model. I see three differences between your patch and mine: You refactored the common code in scsi_disk_get and scsi_disk_get_from_dev. Okay, that's fine, although it looks a little strange to be going all the way down to the disk just to go back up to the sdkp. Note that when scsi_disk_get_from_dev calls __scsi_disk_get, it's not possible for disk->private_data to be NULL. You added calls to scsi_disk_get_from_dev in the flush and shutdown routines. That's good; I would have done it originally had I known it was needed. You added dev_set_drvdata(dev, NULL) to sd_remove. Actually I'm astonished it wasn't there before and that I didn't realize it was missing. I guess nobody considered that the scsi_device might outlive the disk. This is all good. I've made a few stylistic changes (mostly to avoid goto statements). By the way, is there any reason that disk->private_data is set to &sdkp->driver instead of to sdkp? I don't see any, so I'll change it. However this still doesn't solve the problem you mentioned. Somehow sd_remove has to tell __scsi_disk_get that sdkp->device might be gone. The only way I can think of is to add an extra bitfield to the scsi_disk structure. This patch solves the original problem and works okay on my system. If it looks okay to you, I'll resend it as a proper patch submission. Alan Stern Index: usb-2.6/drivers/scsi/sd.c =================================================================== --- usb-2.6.orig/drivers/scsi/sd.c +++ usb-2.6/drivers/scsi/sd.c @@ -102,6 +102,7 @@ struct scsi_disk { u8 write_prot; unsigned WCE : 1; /* state of disk WCE bit */ unsigned RCD : 1; /* state of disk RCD bit, unused */ + unsigned gone : 1; /* the scsi_device has been removed */ }; static DEFINE_IDR(sd_index_idr); @@ -174,27 +175,38 @@ static int sd_major(int major_idx) static inline struct scsi_disk *scsi_disk(struct gendisk *disk) { - return container_of(disk->private_data, struct scsi_disk, driver); + return (struct scsi_disk *) (disk->private_data); +} + +static struct scsi_disk *__scsi_disk_get(struct gendisk *disk) +{ + struct scsi_disk *sdkp = scsi_disk(disk); + + if (sdkp && !sdkp->gone && scsi_device_get(sdkp->device) == 0) + kref_get(&sdkp->kref); + else + sdkp = NULL; + return sdkp; } static struct scsi_disk *scsi_disk_get(struct gendisk *disk) { - struct scsi_disk *sdkp = NULL; + struct scsi_disk *sdkp; down(&sd_ref_sem); - if (disk->private_data == NULL) - goto out; - sdkp = scsi_disk(disk); - kref_get(&sdkp->kref); - if (scsi_device_get(sdkp->device)) - goto out_put; + sdkp = __scsi_disk_get(disk); up(&sd_ref_sem); return sdkp; +} - out_put: - kref_put(&sdkp->kref, scsi_disk_release); - sdkp = NULL; - out: +static struct scsi_disk *scsi_disk_get_from_dev(struct device *dev) +{ + struct scsi_disk *sdkp; + + down(&sd_ref_sem); + sdkp = dev_get_drvdata(dev); + if (sdkp) + sdkp = __scsi_disk_get(sdkp->disk); up(&sd_ref_sem); return sdkp; } @@ -716,16 +728,18 @@ static int sd_sync_cache(struct scsi_dev static int sd_issue_flush(struct device *dev, sector_t *error_sector) { + int ret = 0; struct scsi_device *sdp = to_scsi_device(dev); - struct scsi_disk *sdkp = dev_get_drvdata(dev); + struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); if (!sdkp) return -ENODEV; - if (!sdkp->WCE) - return 0; + if (sdkp->WCE) + ret = sd_sync_cache(sdp); - return sd_sync_cache(sdp); + scsi_disk_put(sdkp); + return ret; } static void sd_end_flush(request_queue_t *q, struct request *flush_rq) @@ -754,13 +768,14 @@ static void sd_end_flush(request_queue_t static int sd_prepare_flush(request_queue_t *q, struct request *rq) { struct scsi_device *sdev = q->queuedata; - struct scsi_disk *sdkp = dev_get_drvdata(&sdev->sdev_gendev); + struct scsi_disk *sdkp = scsi_disk_get_from_dev(&sdev->sdev_gendev); - if (sdkp->WCE) { + if (sdkp && sdkp->WCE) { memset(rq->cmd, 0, sizeof(rq->cmd)); rq->flags |= REQ_BLOCK_PC | REQ_SOFTBARRIER; rq->timeout = SD_TIMEOUT; rq->cmd[0] = SYNCHRONIZE_CACHE; + scsi_disk_put(sdkp); return 1; } @@ -769,8 +784,12 @@ static int sd_prepare_flush(request_queu static void sd_rescan(struct device *dev) { - struct scsi_disk *sdkp = dev_get_drvdata(dev); - sd_revalidate_disk(sdkp->disk); + struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); + + if (sdkp) { + sd_revalidate_disk(sdkp->disk); + scsi_disk_put(sdkp); + } } @@ -1593,7 +1612,7 @@ static int sd_probe(struct device *dev) 'a' + m1, 'a' + m2, 'a' + m3); } - gd->private_data = &sdkp->driver; + gd->private_data = sdkp; sd_revalidate_disk(gd); @@ -1638,7 +1657,10 @@ static int sd_remove(struct device *dev) del_gendisk(sdkp->disk); sd_shutdown(dev); + down(&sd_ref_sem); + dev_set_drvdata(dev, NULL); + sdkp->gone = 1; kref_put(&sdkp->kref, scsi_disk_release); up(&sd_ref_sem); @@ -1664,7 +1686,6 @@ static void scsi_disk_release(struct kre spin_unlock(&sd_index_lock); disk->private_data = NULL; - put_disk(disk); kfree(sdkp); @@ -1678,18 +1699,18 @@ static void scsi_disk_release(struct kre static void sd_shutdown(struct device *dev) { struct scsi_device *sdp = to_scsi_device(dev); - struct scsi_disk *sdkp = dev_get_drvdata(dev); + struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); if (!sdkp) return; /* this can happen */ - if (!sdkp->WCE) - return; - - printk(KERN_NOTICE "Synchronizing SCSI cache for disk %s: \n", - sdkp->disk->disk_name); - sd_sync_cache(sdp); -} + if (sdkp->WCE) { + printk(KERN_NOTICE "Synchronizing SCSI cache for disk %s: \n", + sdkp->disk->disk_name); + sd_sync_cache(sdp); + } + scsi_disk_put(sdkp); +} /** * init_sd - entry point for this driver (both when built in or when ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] sd: fix race between rescan and remove 2005-11-04 16:40 ` Alan Stern @ 2005-11-04 17:45 ` James Bottomley 0 siblings, 0 replies; 4+ messages in thread From: James Bottomley @ 2005-11-04 17:45 UTC (permalink / raw) To: Alan Stern; +Cc: SCSI development list On Fri, 2005-11-04 at 11:40 -0500, Alan Stern wrote: > On Thu, 3 Nov 2005, James Bottomley wrote: > > > Thanks for investigating this ... I really hate debugging these obscure > > object reference races. > > Yeah, it was a real pain to find the cause. Took most of two days. And > yet it should have been obvious where to look... That's what I usually end up saying after days of staring at the code ... > > I'm afraid that your fix isn't quite right: The problem occurs because > > in most cases scsi_remove_device will call sd_remove (via the driver > > model call backs from device_del). The kref_put in sd_remove will > > (almost always) trigger scsi_disk_release which frees sdkp. So, I'm > > afraid you can't rely on sdkp->device being correct. > > Hmm. Surely that's a problem for the old scsi_disk_get routine rather > than the new scsi_disk_get_from_dev routine? No .. we rely on support from the way block operation works for the scsi_disk_get() routine to work. The problem I was trying to prevent is the race where scsi_remove_disk is freeing sdkp as sd_rescan is being called. > The problem I fixed was different: sdkp was getting freed while it was > still in use by the rescan procedure. But I agree that after sd_remove > returns, we can't depend on sdkp->device being correct. Yes, but I also wanted to fix the race between acquiring a reference to sdkp and having it freed. > > The fix for this one is a bit nasty because it's a reversal in the > > logic. The usual logic is that the user does something, so we have a > > refcounted gendisk object that we tie to scsi_disk and then scsi_device. > > In this case, we actually get a scsi_device and we have to try to work > > out how to get a valid refcounted scsi_disk object out of it. > > Unfortunately, this problem doesn't just affect sd_rescan, it affects > > the flush functions too where they try to do the same thing. > > I wasn't sure about them. It wasn't clear if they could get called after > sd_remove. However it certainly doesn't hurt to make them call the new > routine. > > > Could you take a look at the attached? I just threw it together on a > > 'plane, so it's completely untested. It tries to implement the secure > > get of struct scsi_disk by tying the nulling out of drvdata to the > > sd_ref_sem, so we can now either get a reference to the scsi_disk or > > return NULL. I also fixed all the other relevant driver data users to > > follow this model. > > I see three differences between your patch and mine: > > You refactored the common code in scsi_disk_get and > scsi_disk_get_from_dev. Okay, that's fine, although > it looks a little strange to be going all the way > down to the disk just to go back up to the sdkp. > Note that when scsi_disk_get_from_dev calls __scsi_disk_get, > it's not possible for disk->private_data to be NULL. Yes, I just wanted the tying in a single routine rather than being open coded in two different routines. > You added calls to scsi_disk_get_from_dev in the flush > and shutdown routines. That's good; I would have done > it originally had I known it was needed. > > You added dev_set_drvdata(dev, NULL) to sd_remove. > Actually I'm astonished it wasn't there before and that > I didn't realize it was missing. I guess nobody > considered that the scsi_device might outlive the disk. Yes, that's the key to mediating the final race. > This is all good. I've made a few stylistic changes (mostly to avoid goto > statements). By the way, is there any reason that disk->private_data is > set to &sdkp->driver instead of to sdkp? I don't see any, so I'll change > it. Actually, it is necessary, it's a way of overloading private_data so we can get the driver even if we don't know the ULD type (we use this in scsi_lib.c), but also so that the ULD can get its specific structure as well. > However this still doesn't solve the problem you mentioned. Somehow > sd_remove has to tell __scsi_disk_get that sdkp->device might be gone. > The only way I can think of is to add an extra bitfield to the scsi_disk > structure. Hmm, OK, but you don't need a gone field to mediate that. just do a get_device(&sdp->sdev_gendev) before setting the sdkp->device pointer and do a put_device() in the release method. That won't interfere with the scsi_remove_device logic, but it will ensure that the pointer is always valid until we release it. > This patch solves the original problem and works okay on my system. If it > looks okay to you, I'll resend it as a proper patch submission. Great, thanks. James ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2005-11-04 17:45 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-11-03 15:54 [PATCH] sd: fix race between rescan and remove Alan Stern 2005-11-04 2:08 ` James Bottomley 2005-11-04 16:40 ` Alan Stern 2005-11-04 17:45 ` James Bottomley
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox