public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] sg: fix races during device removal (v2)
@ 2009-01-05 19:07 Tony Battersby
  2009-01-08 23:21 ` Douglas Gilbert
  2009-01-10 17:26 ` FUJITA Tomonori
  0 siblings, 2 replies; 42+ messages in thread
From: Tony Battersby @ 2009-01-05 19:07 UTC (permalink / raw)
  To: Douglas Gilbert, FUJITA Tomonori, James.Bottomley
  Cc: Christoph Hellwig, linux-scsi

This is version 2 of my patch, this time using a kref instead of int.
There are also a lot of other changes since the first version since I
found even more bugs both through testing and source code analysis.
I have split the patch up into two parts: the first patch fixes
races between open, close, device removal, and command completion.
The second patch fixes some races I spotted in ioctl(SG_IO).

Below are a list of test cases fixed by the patch.

----------

test #1

open /dev/sgX
send a command that takes a long time (e.g. any tape drive seek
command)
before command completes, echo 1 >
/sys/class/scsi_generic/sgX/device/delete

without patch:
oops

with patch:
test program gets ENODEV immediately
keventd sleeps until the cmd is complete
this is suboptimal since it starves other users of keventd while
waiting for the command to complete, but it is better than an oops

----------

test #2

open /dev/sgX
send a command that takes a long time (e.g. any tape drive seek
command) without waiting for it to complete
close fd
before command completes, echo 1 >
/sys/class/scsi_generic/sgX/device/delete

without patch:
oops when the command does complete (sg_rq_end_io() bad pointer deref)

with patch:
keventd sleeps until the cmd is complete
this is suboptimal since it starves other users of keventd while
waiting for the command to complete, but it is better than an oops

----------

test #3

open /dev/sgX
send a command that takes a long time (e.g. any tape drive seek
command) without waiting for it to complete
close fd
rmmod sg

without patch:
rmmod succeeds without waiting for the command to complete
oops when the command does complete (sg_rq_end_io() callback no
longer exists)

with patch:
sg module usage count does not drop to 0 until the command completes,
so cannot rmmod

----------

test #4

open /dev/sgX
loop: send commands, check results
unplug SAS cable
mptsas automatically removes the device and fails active commands
the test program detects the failed commands and closes its fds at
the same time
this results in the following sequence:
sg_remove() enter
sg_release() enter
sg_release() exit
sg_remove() exit

without patch:
oops

with patch:
ok

----------

test #5

open /dev/sgX
loop: send commands, check results
unplug SAS cable
mptsas automatically removes the device and fails active commands
the test program detects the failed commands, but sleeps for a second
before closing its fds
this results in the following sequence:
sg_remove() enter
sg_remove() exit
sg_release() enter
sg_release() exit

without patch:
oops

with patch:
ok



^ permalink raw reply	[flat|nested] 42+ messages in thread
* [PATCH 2/2] sg: fix races with ioctl(SG_IO)
@ 2009-01-05 19:11 Tony Battersby
  0 siblings, 0 replies; 42+ messages in thread
From: Tony Battersby @ 2009-01-05 19:11 UTC (permalink / raw)
  To: Douglas Gilbert, FUJITA Tomonori, James.Bottomley
  Cc: Christoph Hellwig, linux-scsi

sg_io_owned needs to be set before the command is sent to the midlevel;
otherwise, a quickly-completing command may cause a different CPU
to see "srp->done == 1 && !srp->sg_io_owned", which would lead to
incorrect behavior.

Check srp->done and set srp->orphan while holding rq_list_lock to
prevent races with sg_rq_end_io().

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
---
 sg.c |   21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)
--- linux-2.6.28/drivers/scsi/sg.c.orig	2009-01-05 11:08:09.000000000 -0500
+++ linux-2.6.28/drivers/scsi/sg.c	2009-01-05 11:10:46.000000000 -0500
@@ -197,7 +197,7 @@ static ssize_t sg_new_read(Sg_fd * sfp, 
 			   Sg_request * srp);
 static ssize_t sg_new_write(Sg_fd *sfp, struct file *file,
 			const char __user *buf, size_t count, int blocking,
-			int read_only, Sg_request **o_srp);
+			int read_only, int sg_io_owned, Sg_request **o_srp);
 static int sg_common_write(Sg_fd * sfp, Sg_request * srp,
 			   unsigned char *cmnd, int timeout, int blocking);
 static int sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer);
@@ -607,7 +607,8 @@ sg_write(struct file *filp, const char _
 		return -EFAULT;
 	blocking = !(filp->f_flags & O_NONBLOCK);
 	if (old_hdr.reply_len < 0)
-		return sg_new_write(sfp, filp, buf, count, blocking, 0, NULL);
+		return sg_new_write(sfp, filp, buf, count,
+				    blocking, 0, 0, NULL);
 	if (count < (SZ_SG_HEADER + 6))
 		return -EIO;	/* The minimum scsi command length is 6 bytes. */
 
@@ -688,7 +689,7 @@ sg_write(struct file *filp, const char _
 
 static ssize_t
 sg_new_write(Sg_fd *sfp, struct file *file, const char __user *buf,
-		 size_t count, int blocking, int read_only,
+		 size_t count, int blocking, int read_only, int sg_io_owned,
 		 Sg_request **o_srp)
 {
 	int k;
@@ -708,6 +709,7 @@ sg_new_write(Sg_fd *sfp, struct file *fi
 		SCSI_LOG_TIMEOUT(1, printk("sg_new_write: queue full\n"));
 		return -EDOM;
 	}
+	srp->sg_io_owned = sg_io_owned;
 	hp = &srp->header;
 	if (__copy_from_user(hp, buf, SZ_SG_IO_HDR)) {
 		sg_remove_request(sfp, srp);
@@ -854,10 +856,9 @@ sg_ioctl(struct inode *inode, struct fil
 				return -EFAULT;
 			result =
 			    sg_new_write(sfp, filp, p, SZ_SG_IO_HDR,
-					 blocking, read_only, &srp);
+					 blocking, read_only, 1, &srp);
 			if (result < 0)
 				return result;
-			srp->sg_io_owned = 1;
 			while (1) {
 				result = 0;	/* following macro to beat race condition */
 				__wait_event_interruptible(sfp->read_wait,
@@ -867,14 +868,16 @@ sg_ioctl(struct inode *inode, struct fil
 					return -ENODEV;
 				if (sfp->closed)
 					return 0;	/* request packet dropped already */
-				if (0 == result)
+				write_lock_irq(&sfp->rq_list_lock);
+				if (srp->done) {
+					srp->done = 2;
+					write_unlock_irq(&sfp->rq_list_lock);
 					break;
+				}
 				srp->orphan = 1;
+				write_unlock_irq(&sfp->rq_list_lock);
 				return result;	/* -ERESTARTSYS because signal hit process */
 			}
-			write_lock_irqsave(&sfp->rq_list_lock, iflags);
-			srp->done = 2;
-			write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
 			result = sg_new_read(sfp, p, SZ_SG_IO_HDR, srp);
 			return (result < 0) ? result : 0;
 		}



^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2009-01-28 15:06 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-05 19:07 [PATCH 0/2] sg: fix races during device removal (v2) Tony Battersby
2009-01-08 23:21 ` Douglas Gilbert
2009-01-10 17:26 ` FUJITA Tomonori
2009-01-12 21:09   ` Tony Battersby
2009-01-13 16:24     ` FUJITA Tomonori
2009-01-14 20:31   ` Tony Battersby
2009-01-14 21:39     ` Greg KH
2009-01-14 21:59       ` Tony Battersby
2009-01-14 22:33       ` Stefan Richter
2009-01-14 22:53         ` Tony Battersby
2009-01-14 23:47           ` Stefan Richter
2009-01-15 14:47             ` Tony Battersby
2009-01-15 16:22               ` Stefan Richter
2009-01-15 16:44                 ` Stefan Richter
2009-01-15 18:17                 ` Tony Battersby
2009-01-15 18:47                   ` Stefan Richter
2009-01-15 19:14                     ` Stefan Richter
2009-01-15 19:20                     ` Tony Battersby
2009-01-15 20:43                       ` Stefan Richter
2009-01-15 21:43                         ` Tony Battersby
2009-01-15 21:58                           ` Stefan Richter
2009-01-15 22:23                             ` Tony Battersby
2009-01-15 23:24                               ` Stefan Richter
2009-01-16 14:16                                 ` Tony Battersby
2009-01-16  0:53                           ` Stefan Richter
2009-01-16  8:09                 ` Stefan Richter
2009-01-19  6:57     ` FUJITA Tomonori
2009-01-19 15:02       ` Tony Battersby
2009-01-19 23:03       ` [PATCH 1/2] sg: fix races during device removal (v4) Tony Battersby
2009-01-20  1:06         ` FUJITA Tomonori
2009-01-20 21:58           ` [PATCH 1/2] sg: fix races during device removal (v5) Tony Battersby
2009-01-21 18:25             ` Stefan Richter
2009-01-21 19:23               ` Tony Battersby
2009-01-21 19:45             ` [PATCH 1/2] sg: fix races during device removal (v6) Tony Battersby
2009-01-25 12:46               ` FUJITA Tomonori
2009-01-26 13:57               ` Douglas Gilbert
2009-01-28  1:51                 ` FUJITA Tomonori
2009-01-28 15:06                   ` James Bottomley
2009-01-20 22:00           ` [PATCH 2/2] sg: fix races with ioctl(SG_IO) (v2) Tony Battersby
2009-01-25 12:46             ` FUJITA Tomonori
2009-01-19 23:06       ` [PATCH 2/2] sg: fix races with ioctl(SG_IO) Tony Battersby
  -- strict thread matches above, loose matches on Subject: below --
2009-01-05 19:11 Tony Battersby

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox