public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] scsi_allocate_request() reference
@ 2005-03-21 13:26 Jens Axboe
  2005-03-21 15:09 ` James Bottomley
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2005-03-21 13:26 UTC (permalink / raw)
  To: linux-scsi; +Cc: James Bottomley

Hi,

scsi_allocate_request() doesn't hold a reference to the device that it
points to, that is not good. This patch fixes that up.

Signed-off-by: Jens Axboe <axboe@suse.de>

===== drivers/scsi/scsi.c 1.157 vs edited =====
--- 1.157/drivers/scsi/scsi.c	2005-03-03 09:22:17 +01:00
+++ edited/drivers/scsi/scsi.c	2005-03-21 14:24:27 +01:00
@@ -132,7 +132,11 @@
 	const int offset = ALIGN(sizeof(struct scsi_request), 4);
 	const int size = offset + sizeof(struct request);
 	struct scsi_request *sreq;
-  
+
+
+	if (!get_device(&sdev->sdev_gendev))
+		return NULL;
+
 	sreq = kmalloc(size, gfp_mask);
 	if (likely(sreq != NULL)) {
 		memset(sreq, 0, size);
@@ -141,7 +145,8 @@
 		sreq->sr_host = sdev->host;
 		sreq->sr_magic = SCSI_REQ_MAGIC;
 		sreq->sr_data_direction = DMA_BIDIRECTIONAL;
-	}
+	} else
+		put_device(&sdev->sdev_gendev);
 
 	return sreq;
 }
@@ -184,6 +189,10 @@
 void scsi_release_request(struct scsi_request *sreq)
 {
 	__scsi_release_request(sreq);
+
+	if (sreq->sr_device)
+		put_device(&sreq->sr_device->sdev_gendev);
+
 	kfree(sreq);
 }
 EXPORT_SYMBOL(scsi_release_request);

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi_allocate_request() reference
  2005-03-21 13:26 [PATCH] scsi_allocate_request() reference Jens Axboe
@ 2005-03-21 15:09 ` James Bottomley
  2005-03-21 16:57   ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2005-03-21 15:09 UTC (permalink / raw)
  To: Jens Axboe; +Cc: SCSI Mailing List

On Mon, 2005-03-21 at 14:26 +0100, Jens Axboe wrote:
> scsi_allocate_request() doesn't hold a reference to the device that it
> points to, that is not good. This patch fixes that up.

Actually, I don't think this is correct.  The reference is taken when
the command is attached to a request in the scsi_request_fn function
after first checking that the entity is in a condition to have this
happen.

The problem is that the device could have been torn down (surprise
ejection etc) when some of the routines that call scsi_allocate_request
run.  What's the actual problem this is trying to solve?

James



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi_allocate_request() reference
  2005-03-21 15:09 ` James Bottomley
@ 2005-03-21 16:57   ` Jens Axboe
  2005-03-21 23:51     ` Tejun Heo
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2005-03-21 16:57 UTC (permalink / raw)
  To: James Bottomley; +Cc: SCSI Mailing List

On Mon, Mar 21 2005, James Bottomley wrote:
> On Mon, 2005-03-21 at 14:26 +0100, Jens Axboe wrote:
> > scsi_allocate_request() doesn't hold a reference to the device that it
> > points to, that is not good. This patch fixes that up.
> 
> Actually, I don't think this is correct.  The reference is taken when
> the command is attached to a request in the scsi_request_fn function
> after first checking that the entity is in a condition to have this
> happen.

Where does this happen, specifically? I can verify that this patch
solves at least one of the oopses I am seeing with hotplug. IMHO, the
reference should be taken when you reference the device (ie ->sr_device
is assigned in scsi_allocate_request()).

> The problem is that the device could have been torn down (surprise
> ejection etc) when some of the routines that call scsi_allocate_request
> run.  What's the actual problem this is trying to solve?

Same problem, SCSI blowing up with a hotplug device. Try with your
pendrive, you should have no problem oopsing the kernel. As mentioned
earlier in this thread, there are unsolvable reference counting problems
between sdev and the request_queue right now.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi_allocate_request() reference
  2005-03-21 16:57   ` Jens Axboe
@ 2005-03-21 23:51     ` Tejun Heo
  2005-03-22 11:17       ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2005-03-21 23:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: James Bottomley, SCSI Mailing List

 Hello, Jens.
 Hello, James.

On Mon, Mar 21, 2005 at 05:57:52PM +0100, Jens Axboe wrote:
> On Mon, Mar 21 2005, James Bottomley wrote:
> > On Mon, 2005-03-21 at 14:26 +0100, Jens Axboe wrote:
> > > scsi_allocate_request() doesn't hold a reference to the device that it
> > > points to, that is not good. This patch fixes that up.
> > 
> > Actually, I don't think this is correct.  The reference is taken when
> > the command is attached to a request in the scsi_request_fn function
> > after first checking that the entity is in a condition to have this
> > happen.
> 
> Where does this happen, specifically? I can verify that this patch
> solves at least one of the oopses I am seeing with hotplug. IMHO, the
> reference should be taken when you reference the device (ie ->sr_device
> is assigned in scsi_allocate_request()).

 I've been working on scsi hot plug/unplugging for a few days now but
I haven't seen any oops caused by reference count problems.  The only
oops I've seen looks like the following.  (using usb hdd)

Unable to handle kernel paging request at virtual address 20020070
 printing eip:
c02b925f
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP 
Modules linked in: nfs lockd sunrpc sd_mod sata_sil libata e1000 eepro100 e100 mii tsdev floppy
CPU:    0
EIP:    0060:[<c02b925f>]    Not tainted VLI
EFLAGS: 00010246   (2.6.11-scsi-work-i386) 
EIP is at bus_reset+0x5f/0xe0
eax: 2002006c   ebx: f72b8e00   ecx: 00000001   edx: 00000001
esi: fffffffb   edi: 00000000   ebp: f76737c0   esp: f29d5ee0
ds: 007b   es: 007b   ss: 0068
Process scsi_eh_13 (pid: 3775, threadinfo=f29d4000 task=f7479040)
Stack: c0340d04 00000066 f76737c0 00000286 00000000 f29d5f90 c027a308 f76737c0 
       00000286 f76737c0 00000000 00000000 c027a512 f76737c0 f763a400 00000000 
       f29d4000 f76737c0 f763a400 f29d5f88 f29d5f90 f763a400 c027aae4 f763a400 
Call Trace:
 [<c027a308>] scsi_try_bus_reset+0x58/0xf0
 [<c027a512>] scsi_eh_bus_reset+0x82/0x170
 [<c027aae4>] scsi_eh_ready_devs+0x64/0xa0
 [<c027ace1>] scsi_unjam_host+0xd1/0x200
 [<c0115e40>] default_wake_function+0x0/0x20
 [<c027af2f>] scsi_error_handler+0x11f/0x1d0
 [<c027ae10>] scsi_error_handler+0x0/0x1d0
 [<c0100f11>] kernel_thread_helper+0x5/0x14
Code: 24 04 0d 34 c0 e8 22 e9 e5 ff f0 ff 0b 0f 88 54 04 00 00 8b 43 28 be fb ff ff ff a9 00 00 20 00 75 14 8b 43 1c 8b 80 40 01 00 00 <80> 78 04 01 74 39 be f0 ff ff ff f0 ff 03 0f 8e 34 04 00 00 8b 

 This error is due to faulty cleanup/detach sequence in scsi layer.
More specifially, scsi_device_cancel() never gets called on hot-unplug
path.  If you have seen any oops with different reason, please share.

> 
> > The problem is that the device could have been torn down (surprise
> > ejection etc) when some of the routines that call scsi_allocate_request
> > run.  What's the actual problem this is trying to solve?
> 
> Same problem, SCSI blowing up with a hotplug device. Try with your
> pendrive, you should have no problem oopsing the kernel. As mentioned
> earlier in this thread, there are unsolvable reference counting problems
> between sdev and the request_queue right now.

 Currently, there are many issues even not including the reference
counting problem you're encountering.

 * scsi_device state is manipulated simultaneously without
   synchronization.
 * request cleanup doesn't synchronize with request processing
   and/or scsi_eh
 * when manual removal is requested via proc or sysfs no request
   cleanup is done.
 * things like domain validation need exclusive access, but
   currently, other special requests can enter during dv.
 * FS/PC requests become special requests after being requeued.

 I'm working on a new state machine with clean flushing.  I think it
will take a few more days.  The state machine will look like the
following.

 http://home-tj.org/scsi_state_machine.png

 Also, as drawing with dia was fun, I draw another diagram regarding
scsi midlayer.  This diagram helps thinking about how things are
referenced and where the roles are devided (scsi device/driver).

 http://home-tj.org/scsi_block_model.png

 Ignore todo notes below the diagram.

 So, please let me know about the reference counting problem you're
encountering.  I will incorporate it in the new state machine /
flushing patches.

 Thanks.

P.S. James, do you like the idea?

-- 
tejun


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi_allocate_request() reference
  2005-03-21 23:51     ` Tejun Heo
@ 2005-03-22 11:17       ` Jens Axboe
  2005-03-22 15:34         ` James Bottomley
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2005-03-22 11:17 UTC (permalink / raw)
  To: Tejun Heo; +Cc: James Bottomley, SCSI Mailing List

On Tue, Mar 22 2005, Tejun Heo wrote:
>  Hello, Jens.
>  Hello, James.
> 
> On Mon, Mar 21, 2005 at 05:57:52PM +0100, Jens Axboe wrote:
> > On Mon, Mar 21 2005, James Bottomley wrote:
> > > On Mon, 2005-03-21 at 14:26 +0100, Jens Axboe wrote:
> > > > scsi_allocate_request() doesn't hold a reference to the device that it
> > > > points to, that is not good. This patch fixes that up.
> > > 
> > > Actually, I don't think this is correct.  The reference is taken when
> > > the command is attached to a request in the scsi_request_fn function
> > > after first checking that the entity is in a condition to have this
> > > happen.
> > 
> > Where does this happen, specifically? I can verify that this patch
> > solves at least one of the oopses I am seeing with hotplug. IMHO, the
> > reference should be taken when you reference the device (ie ->sr_device
> > is assigned in scsi_allocate_request()).
> 
>  I've been working on scsi hot plug/unplugging for a few days now but
> I haven't seen any oops caused by reference count problems.  The only

You need to have io in progress. The one ref problem with
scsi_allocate_request() is easy to trigger, if you just open/close the
device repeatedly while inserting and removing it.

>  So, please let me know about the reference counting problem you're
> encountering.  I will incorporate it in the new state machine /
> flushing patches.

See my two recent postings on this list!

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi_allocate_request() reference
  2005-03-22 11:17       ` Jens Axboe
@ 2005-03-22 15:34         ` James Bottomley
  0 siblings, 0 replies; 6+ messages in thread
From: James Bottomley @ 2005-03-22 15:34 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, SCSI Mailing List

On Tue, 2005-03-22 at 12:17 +0100, Jens Axboe wrote:
> You need to have io in progress. The one ref problem with
> scsi_allocate_request() is easy to trigger, if you just open/close the
> device repeatedly while inserting and removing it.

OK, this is the python program I've been using:

>>> while 1:
...     try:
...             fd=open("/dev/sda", "rw")
...     except IOError:
...             pass
...     fd.close();


The kernel trace from repeated insertion/removal of my pendrive is
below.  This is a vanilla 2.6.12-rc1 without your changes.  As you can
see, it works for me (and, as you can see, I had I/O active during
ejection).  (As I expect, since I did quite a bit of work around hot
ejection of USB CD's to have this work correctly).  The only known issue
is the panic in the usb-storage reset routines if we're in error
recovery after ejection.

Can you at least provide a trace on 2.6.12-rc1 where you can trigger
this?

Thanks,

James



usb 4-2: new full speed USB device using uhci_hcd and address 2
usb 4-2: device descriptor read/64, error -71
SCSI subsystem initialized
Initializing USB Mass Storage driver...
scsi0 : SCSI emulation for USB Mass Storage devices
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning
  Vendor: SWISSBIT  Model: Victorinox        Rev: 1.89
  Type:   Direct-Access                      ANSI SCSI revision: 02
usb-storage: device scan complete
sda: Unit Not Ready, sense:
: Current: sense key: Unit Attention
    Additional sense: Not ready to ready change, medium may have changed
SCSI device sda: 126720 512-byte hdwr sectors (65 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: assuming drive cache: write through
SCSI device sda: 126720 512-byte hdwr sectors (65 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: assuming drive cache: write through
 sda: unknown partition table
Attached scsi removable disk sda at scsi0, channel 0, id 0, lun 0
FAT: utf8 is not a recommended IO charset for FAT filesystems,
filesystem will be case sensitive!
usb 4-2: USB disconnect, address 2
usb 4-2: new full speed USB device using uhci_hcd and address 3
usb 4-2: device descriptor read/64, error -71
scsi1 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 3
usb-storage: waiting for device to settle before scanning
  Vendor: SWISSBIT  Model: Victorinox        Rev: 1.89
  Type:   Direct-Access                      ANSI SCSI revision: 02
sda: Unit Not Ready, sense:
: Current: sense key: Unit Attention
    Additional sense: Not ready to ready change, medium may have changed
SCSI device sda: 126720 512-byte hdwr sectors (65 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: assuming drive cache: write through
SCSI device sda: 126720 512-byte hdwr sectors (65 MB)
sda: Write Protect is off
sda: Mode Sense: 03 00 00 00
sda: assuming drive cache: write through
 sda: unknown partition table
Attached scsi removable disk sda at scsi1, channel 0, id 0, lun 0
usb-storage: device scan complete
FAT: utf8 is not a recommended IO charset for FAT filesystems,
filesystem will be case sensitive!
VFS: busy inodes on changed media.
sda : READ CAPACITY failed.
sda : status=0, message=00, host=7, driver=00 
sda : sense not available. 
sda: Write Protect is off
sda: Mode Sense: 00 00 00 00
sda: assuming drive cache: write through
VFS: busy inodes on changed media.
sda : READ CAPACITY failed.
sda : status=0, message=00, host=7, driver=00 
sda : sense not available. 
sda: Write Protect is off
sda: Mode Sense: 00 00 00 00
sda: assuming drive cache: write through
usb 4-2: USB disconnect, address 3
VFS: busy inodes on changed media.
scsi1 (0:0): rejecting I/O to dead device
usb 4-2: new full speed USB device using uhci_hcd and address 4
usb 4-2: device descriptor read/64, error -71
scsi2 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 4
usb-storage: waiting for device to settle before scanning
usb 4-2: USB disconnect, address 4
usb 4-2: new full speed USB device using uhci_hcd and address 5
usb 4-2: device descriptor read/64, error -71
scsi3 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 5
usb-storage: waiting for device to settle before scanning
usb 4-2: USB disconnect, address 5
usb 4-2: new full speed USB device using uhci_hcd and address 6
usb 4-2: device descriptor read/64, error -71
scsi4 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 6
usb-storage: waiting for device to settle before scanning
usb 4-2: USB disconnect, address 6
usb 4-2: new full speed USB device using uhci_hcd and address 7
usb 4-2: device descriptor read/64, error -71
scsi5 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 7
usb-storage: waiting for device to settle before scanning
  Vendor: SWISSBIT  Model: Victorinox        Rev: 1.89
  Type:   Direct-Access                      ANSI SCSI revision: 02
sdb: Unit Not Ready, sense:
: Current: sense key: Unit Attention
    Additional sense: Not ready to ready change, medium may have changed
SCSI device sdb: 126720 512-byte hdwr sectors (65 MB)
sdb: Write Protect is off
sdb: Mode Sense: 03 00 00 00
sdb: assuming drive cache: write through
SCSI device sdb: 126720 512-byte hdwr sectors (65 MB)
sdb: Write Protect is off
sdb: Mode Sense: 03 00 00 00
sdb: assuming drive cache: write through
 sdb: unknown partition table
Attached scsi removable disk sdb at scsi5, channel 0, id 0, lun 0
usb-storage: device scan complete
usb 4-2: USB disconnect, address 7
usb 4-2: new full speed USB device using uhci_hcd and address 8
usb 4-2: device descriptor read/64, error -71
scsi6 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 8
usb-storage: waiting for device to settle before scanning
usb 4-2: USB disconnect, address 8
usb 4-1: new full speed USB device using uhci_hcd and address 9
usb 4-1: device descriptor read/64, error -71
scsi7 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 9
usb-storage: waiting for device to settle before scanning
  Vendor: SWISSBIT  Model: Victorinox        Rev: 1.89
  Type:   Direct-Access                      ANSI SCSI revision: 02
sdb: Unit Not Ready, sense:
: Current: sense key: Unit Attention
    Additional sense: Not ready to ready change, medium may have changed
SCSI device sdb: 126720 512-byte hdwr sectors (65 MB)
sdb: Write Protect is off
sdb: Mode Sense: 03 00 00 00
sdb: assuming drive cache: write through
SCSI device sdb: 126720 512-byte hdwr sectors (65 MB)
sdb: Write Protect is off
sdb: Mode Sense: 03 00 00 00
sdb: assuming drive cache: write through
 sdb: unknown partition table
Attached scsi removable disk sdb at scsi7, channel 0, id 0, lun 0
usb-storage: device scan complete
usb 4-1: USB disconnect, address 9



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2005-03-22 15:35 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-03-21 13:26 [PATCH] scsi_allocate_request() reference Jens Axboe
2005-03-21 15:09 ` James Bottomley
2005-03-21 16:57   ` Jens Axboe
2005-03-21 23:51     ` Tejun Heo
2005-03-22 11:17       ` Jens Axboe
2005-03-22 15:34         ` James Bottomley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox