public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drivers/scsi/sg.c: fix problems when sg_remove is called before sg_release
@ 2005-04-07 20:20 Dailey, Nate
  2005-04-12 11:03 ` Douglas Gilbert
  0 siblings, 1 reply; 4+ messages in thread
From: Dailey, Nate @ 2005-04-07 20:20 UTC (permalink / raw)
  To: 'dgilbert@interlog.com'; +Cc: 'linux-scsi@vger.kernel.org'

This is my first attempt at submitting a patch, so I hope I'm not making any
mistakes...

This patch fixes two problems I came across in sg, both of which occur when
sg_remove is called on a disk which hasn't yet been sg_release'd:

1. I got the following Oops in sg_remove:

--
Unable to handle kernel paging request at virtual address e0a58020
 printing eip:
e0a7ceba
*pde = 1c245c8b
Oops: 0000 [#1]
SMP
CPU:    3
EIP:    0060:[<e0a7ceba>]    Tainted: G  U
EFLAGS: 00210246   (2.6.5-7.139-bigsmp )
EIP is at sg_remove+0xba/0x240 [sg]
eax: dfee0500   ebx: 00000000   ecx: dfeefa50   edx: dfee05a8
esi: 00000000   edi: c7beb000   ebp: e0a58000   esp: daf4bec0
ds: 007b   es: 007b   ss: 0068
Process bash (pid: 27436, threadinfo=daf4a000 task=dbcd3350)
Stack: c10c1840 000000d0 00000001 01500002 00000000 00200246 df844000
e0a84d08
       df844240 e0857ec0 e0857f24 c0263dff e0857f0c df844240 df9d3028
df9d3000
       fffffffa c0263e78 df844000 e08449be df9d3000 df844000 00000000
fffffffa
Call Trace:
 [<c0263dff>] class_device_del+0x5f/0xd0
 [<c0263e78>] class_device_unregister+0x8/0x10
 [<e08449be>] scsi_remove_device+0x3e/0xc0 [scsi_mod]
 [<e0845b59>] proc_scsi_write+0x269/0x2a0 [scsi_mod]
 [<c0176136>] vfs_write+0xc6/0x160
 [<c011057d>] do_mmap2+0xdd/0x170
 [<c01763e1>] sys_write+0x91/0xf0
 [<c0109199>] sysenter_past_esp+0x52/0x71

Code: 8b 45 20 e8 ce 2a 70 df c7 45 20 00 00 00 00 8b 45 1c e8 ef
--

It looks like sg_remove needs to hold the sg_dev_arr_lock a little longer.
After sg_remove dropped the lock, sg_release/sg_remove_sfp came along and
freed the sdp... and then sg_remove tried to do cdev_del(sdp->cdev). I made
a change to get what we need from the sdp while holding the lock, and it
fixed the problem for me.

2. Scsi_device references are never released: after sg_remove sets the
detached flag, sg_release won't call scsi_device_put. But someone needs to
release the reference obtained in sg_open. A change also needs to be made in
sg_remove_sfp to call scsi_device_put if freeing sdp--since sg_release
wouldn't be able to do it in that case. I verified (by sticking a printk in
scsi_device_dev_release) that the scsi_device wasn't freed in this case,
without my change in place.


To provoke these, I used a few 'dd if=/dev/sg2 of=/dev/null' commands to
hold the device open. I then did 'echo "scsi remove-single-device 1 0 0 0" >
/proc/scsi/scsi' to trigger the sg_remove.

This patch is against 2.6.12-rc2, though I actually found/fixed the problems
in 2.6.5-7.139 (SLES9 SP1).

Do these look okay?

Nate Dailey
Stratus Technologies


Signed-off-by: Nate Dailey <nate.dailey@stratus.com>

--- linux-2.6.12-rc2/drivers/scsi/sg.c.orig	2005-04-06
14:17:51.000000000 -0400
+++ linux-2.6.12-rc2/drivers/scsi/sg.c	2005-04-06 15:25:08.000000000 -0400
@@ -318,9 +318,7 @@ sg_release(struct inode *inode, struct f
 	SCSI_LOG_TIMEOUT(3, printk("sg_release: %s\n",
sdp->disk->disk_name));
 	sg_fasync(-1, filp, 0);	/* remove filp from async notification list
*/
 	if (0 == sg_remove_sfp(sdp, sfp)) {	/* Returns 1 when sdp gone
*/
-		if (!sdp->detached) {
-			scsi_device_put(sdp->device);
-		}
+		scsi_device_put(sdp->device);
 		sdp->exclude = 0;
 		wake_up_interruptible(&sdp->o_excl_wait);
 	}
@@ -1574,19 +1572,28 @@ sg_remove(struct class_device *cl_dev)
 		sg_nr_dev--;
 		break;
 	}
-	write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
 
 	if (sdp) {
+		struct cdev *tmp_cdev;
+		struct gendisk *tmp_disk;
+		int free_sdp = 0;
+
+		tmp_cdev = sdp->cdev;
+		sdp->cdev = NULL;
+		tmp_disk = sdp->disk;
+		sdp->disk = NULL;
+		if (NULL == sdp->headfp) free_sdp = 1;
+		write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
+
 		sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
 		class_simple_device_remove(MKDEV(SCSI_GENERIC_MAJOR, k));
-		cdev_del(sdp->cdev);
-		sdp->cdev = NULL;
+		cdev_del(tmp_cdev);
 		devfs_remove("%s/generic", scsidp->devfs_name);
-		put_disk(sdp->disk);
-		sdp->disk = NULL;
-		if (NULL == sdp->headfp)
+		put_disk(tmp_disk);
+		if (free_sdp)
 			kfree((char *) sdp);
 	}
+	else write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
 
 	if (delay)
 		msleep(10);	/* dirty detach so delay device destruction
*/
@@ -2550,6 +2557,7 @@ sg_remove_sfp(Sg_device * sdp, Sg_fd * s
 			}
 			if (k < maxd)
 				sg_dev_arr[k] = NULL;
+			scsi_device_put(sdp->device);
 			kfree((char *) sdp);
 			res = 1;
 		}

^ permalink raw reply	[flat|nested] 4+ messages in thread
* RE: [PATCH] drivers/scsi/sg.c: fix problems when sg_remove is called before sg_release
@ 2005-07-11 13:59 Dailey, Nate
  0 siblings, 0 replies; 4+ messages in thread
From: Dailey, Nate @ 2005-07-11 13:59 UTC (permalink / raw)
  To: dougg; +Cc: linux-scsi, James.Bottomley

[-- Attachment #1: Type: text/plain, Size: 3840 bytes --]

Just following up on this patch... I haven't heard anything negative,
but also haven't noticed it in the latest kernel. Attached is a version
of my patch that applies against kernel 2.6.13rc2.

Nate Dailey


> -----Original Message-----
> From: Douglas Gilbert [mailto:dougg@torque.net] 
> Sent: Tuesday, April 12, 2005 7:04 AM
> To: Dailey, Nate
> Cc: 'linux-scsi@vger.kernel.org'
> Subject: Re: [PATCH] drivers/scsi/sg.c: fix problems when 
> sg_remove is called before sg_release
> 
> 
> Dailey, Nate wrote:
> > This is my first attempt at submitting a patch, so I hope 
> I'm not making any
> > mistakes...
> > 
> > This patch fixes two problems I came across in sg, both of 
> which occur when
> > sg_remove is called on a disk which hasn't yet been sg_release'd:
> > 
> > 1. I got the following Oops in sg_remove:
> > 
> > --
> > Unable to handle kernel paging request at virtual address e0a58020
> >  printing eip:
> > e0a7ceba
> > *pde = 1c245c8b
> > Oops: 0000 [#1]
> > SMP
> > CPU:    3
> > EIP:    0060:[<e0a7ceba>]    Tainted: G  U
> > EFLAGS: 00210246   (2.6.5-7.139-bigsmp )
> > EIP is at sg_remove+0xba/0x240 [sg]
> > eax: dfee0500   ebx: 00000000   ecx: dfeefa50   edx: dfee05a8
> > esi: 00000000   edi: c7beb000   ebp: e0a58000   esp: daf4bec0
> > ds: 007b   es: 007b   ss: 0068
> > Process bash (pid: 27436, threadinfo=daf4a000 task=dbcd3350)
> > Stack: c10c1840 000000d0 00000001 01500002 00000000 
> 00200246 df844000
> > e0a84d08
> >        df844240 e0857ec0 e0857f24 c0263dff e0857f0c 
> df844240 df9d3028
> > df9d3000
> >        fffffffa c0263e78 df844000 e08449be df9d3000 
> df844000 00000000
> > fffffffa
> > Call Trace:
> >  [<c0263dff>] class_device_del+0x5f/0xd0
> >  [<c0263e78>] class_device_unregister+0x8/0x10
> >  [<e08449be>] scsi_remove_device+0x3e/0xc0 [scsi_mod]
> >  [<e0845b59>] proc_scsi_write+0x269/0x2a0 [scsi_mod]
> >  [<c0176136>] vfs_write+0xc6/0x160
> >  [<c011057d>] do_mmap2+0xdd/0x170
> >  [<c01763e1>] sys_write+0x91/0xf0
> >  [<c0109199>] sysenter_past_esp+0x52/0x71
> > 
> > Code: 8b 45 20 e8 ce 2a 70 df c7 45 20 00 00 00 00 8b 45 1c e8 ef
> > --
> > 
> > It looks like sg_remove needs to hold the sg_dev_arr_lock a 
> little longer.
> > After sg_remove dropped the lock, sg_release/sg_remove_sfp 
> came along and
> > freed the sdp... and then sg_remove tried to do 
> cdev_del(sdp->cdev). I made
> > a change to get what we need from the sdp while holding the 
> lock, and it
> > fixed the problem for me.
> > 
> > 2. Scsi_device references are never released: after 
> sg_remove sets the
> > detached flag, sg_release won't call scsi_device_put. But 
> someone needs to
> > release the reference obtained in sg_open. A change also 
> needs to be made in
> > sg_remove_sfp to call scsi_device_put if freeing sdp--since 
> sg_release
> > wouldn't be able to do it in that case. I verified (by 
> sticking a printk in
> > scsi_device_dev_release) that the scsi_device wasn't freed 
> in this case,
> > without my change in place.
> > 
> > 
> > To provoke these, I used a few 'dd if=/dev/sg2 
> of=/dev/null' commands to
> > hold the device open. I then did 'echo "scsi 
> remove-single-device 1 0 0 0" >
> > /proc/scsi/scsi' to trigger the sg_remove.
> > 
> > This patch is against 2.6.12-rc2, though I actually 
> found/fixed the problems
> > in 2.6.5-7.139 (SLES9 SP1).
> > 
> > Do these look okay?
> 
> Nate,
> 
> Thanks for the analysis and the patch.
> 
> I'm starting to look at this patch and will do more
> testing shortly. The changes look sound. The patch
> had some line wrap problems so attached is my version
> of it (against lk 2.6.12-rc2). It also applies over
> my patch sent to this list on 28/3/2005 with a little
> harmless "fuzz".
> 
> Doug Gilbert
> 

[-- Attachment #2: sg2613rc2nd.diff --]
[-- Type: application/octet-stream, Size: 1815 bytes --]

--- sg.c.orig	2005-07-08 11:49:55.000000000 -0400
+++ sg.c	2005-07-08 14:53:50.000000000 -0400
@@ -319,9 +319,7 @@ sg_release(struct inode *inode, struct f
 	SCSI_LOG_TIMEOUT(3, printk("sg_release: %s\n", sdp->disk->disk_name));
 	sg_fasync(-1, filp, 0);	/* remove filp from async notification list */
 	if (0 == sg_remove_sfp(sdp, sfp)) {	/* Returns 1 when sdp gone */
-		if (!sdp->detached) {
-			scsi_device_put(sdp->device);
-		}
+		scsi_device_put(sdp->device);
 		sdp->exclude = 0;
 		wake_up_interruptible(&sdp->o_excl_wait);
 	}
@@ -1632,19 +1630,30 @@ sg_remove(struct class_device *cl_dev)
 		sg_nr_dev--;
 		break;
 	}
-	write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
 
 	if (sdp) {
-		sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
-		class_device_destroy(sg_sysfs_class, MKDEV(SCSI_GENERIC_MAJOR, k));
-		cdev_del(sdp->cdev);
+		struct cdev *tmp_cdev;
+		struct gendisk *tmp_disk;
+		int free_sdp = 0;
+
+		tmp_cdev = sdp->cdev;
 		sdp->cdev = NULL;
-		devfs_remove("%s/generic", scsidp->devfs_name);
-		put_disk(sdp->disk);
+		tmp_disk = sdp->disk;
 		sdp->disk = NULL;
 		if (NULL == sdp->headfp)
+			free_sdp = 1;
+		write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
+
+		sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
+		class_device_destroy(sg_sysfs_class,
+				     MKDEV(SCSI_GENERIC_MAJOR, k));
+		cdev_del(tmp_cdev);
+		devfs_remove("%s/generic", scsidp->devfs_name);
+		put_disk(tmp_disk);
+		if (free_sdp)
 			kfree((char *) sdp);
-	}
+	} else
+		write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
 
 	if (delay)
 		msleep(10);	/* dirty detach so delay device destruction */
@@ -2610,6 +2619,7 @@ sg_remove_sfp(Sg_device * sdp, Sg_fd * s
 			}
 			if (k < maxd)
 				sg_dev_arr[k] = NULL;
+			scsi_device_put(sdp->device);
 			kfree((char *) sdp);
 			res = 1;
 		}

^ permalink raw reply	[flat|nested] 4+ messages in thread
* [PATCH] drivers/scsi/sg.c: fix problems when sg_remove is called  before sg_release
@ 2005-09-23 18:03 Dailey, Nate
  0 siblings, 0 replies; 4+ messages in thread
From: Dailey, Nate @ 2005-09-23 18:03 UTC (permalink / raw)
  To: dgilbert; +Cc: linux-scsi

[-- Attachment #1: Type: text/plain, Size: 953 bytes --]

This is a re-submission of a patch I sent to the linux-scsi mailing list
several months back
(http://marc.theaimsgroup.com/?l=linux-scsi&m=111290555013699&w=2).


In summary, the patch fixes:

1. an oops in sg_remove, if it's called when the device has been opened,
which occurs because the sg_dev_arr_lock is dropped too soon and
sg_remove_sfp frees the sdp before sg_remove is done with it.

2. a case in sg_remove_sfp where a reference isn't dropped, but should
be.


I found a problem with my original patch... I was calling
scsi_device_put (which may eventually sleep, at least in the 2.6.9
kernel I'm using) from sg_remove_sfp while holding a spinlock. Hence the
revised patch.

The attached patch is against kernel 2.6.14rc2. I retested in a recent
kernel to verify that these problems still exist and are fixed by the
patch.


Nate Dailey
Stratus Technologies


Signed-off-by: Nate Dailey <nate.dailey@stratus.com>

[-- Attachment #2: sg.c.patch2614rc2nd --]
[-- Type: application/octet-stream, Size: 2126 bytes --]

--- sg.c.orig	2005-09-22 10:31:51.000000000 -0400
+++ sg.c	2005-09-22 10:41:28.000000000 -0400
@@ -319,9 +319,7 @@ sg_release(struct inode *inode, struct f
 	SCSI_LOG_TIMEOUT(3, printk("sg_release: %s\n", sdp->disk->disk_name));
 	sg_fasync(-1, filp, 0);	/* remove filp from async notification list */
 	if (0 == sg_remove_sfp(sdp, sfp)) {	/* Returns 1 when sdp gone */
-		if (!sdp->detached) {
-			scsi_device_put(sdp->device);
-		}
+		scsi_device_put(sdp->device);
 		sdp->exclude = 0;
 		wake_up_interruptible(&sdp->o_excl_wait);
 	}
@@ -1631,19 +1629,29 @@ sg_remove(struct class_device *cl_dev)
 		sg_nr_dev--;
 		break;
 	}
-	write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
 
 	if (sdp) {
-		sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
-		class_device_destroy(sg_sysfs_class, MKDEV(SCSI_GENERIC_MAJOR, k));
-		cdev_del(sdp->cdev);
+		struct cdev *tmp_cdev;
+		struct gendisk *tmp_disk;
+		int free_sdp = 0;
+
+		tmp_cdev = sdp->cdev;
 		sdp->cdev = NULL;
-		devfs_remove("%s/generic", scsidp->devfs_name);
-		put_disk(sdp->disk);
+		tmp_disk = sdp->disk;
 		sdp->disk = NULL;
 		if (NULL == sdp->headfp)
+			free_sdp = 1;
+		write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
+
+		sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
+		class_device_destroy(sg_sysfs_class, MKDEV(SCSI_GENERIC_MAJOR, k));
+		cdev_del(tmp_cdev);
+		devfs_remove("%s/generic", scsidp->devfs_name);
+		put_disk(tmp_disk);
+		if (free_sdp)
 			kfree((char *) sdp);
-	}
+	} else
+		write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
 
 	if (delay)
 		msleep(10);	/* dirty detach so delay device destruction */
@@ -2609,10 +2617,12 @@ sg_remove_sfp(Sg_device * sdp, Sg_fd * s
 			}
 			if (k < maxd)
 				sg_dev_arr[k] = NULL;
+			write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
+			scsi_device_put(sdp->device);
 			kfree((char *) sdp);
 			res = 1;
-		}
-		write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
+		} else
+			write_unlock_irqrestore(&sg_dev_arr_lock, iflags);
 	} else {
 		/* MOD_INC's to inhibit unloading sg and associated adapter driver */
 		/* only bump the access_count if we actually succeeded in

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2005-09-23 18:03 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-04-07 20:20 [PATCH] drivers/scsi/sg.c: fix problems when sg_remove is called before sg_release Dailey, Nate
2005-04-12 11:03 ` Douglas Gilbert
  -- strict thread matches above, loose matches on Subject: below --
2005-07-11 13:59 Dailey, Nate
2005-09-23 18:03 Dailey, Nate

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox