* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 [not found] <20030516190414.GD11884@iucha.net> @ 2003-05-16 21:23 ` Alan Stern 2003-05-16 21:44 ` Florin Iucha 2003-05-16 22:29 ` Mike Anderson 0 siblings, 2 replies; 35+ messages in thread From: Alan Stern @ 2003-05-16 21:23 UTC (permalink / raw) To: Florin Iucha; +Cc: David Brownell, USB development list, linux-scsi On Fri, 16 May 2003, Florin Iucha wrote: > On Fri, May 16, 2003 at 02:54:00PM -0400, Alan Stern wrote: > > > > The logs you posted did not include the usb-storage debugging output. > > Either you didn't configure it in your kernel, or you had the console > > logging level set too low to capture debug-priority messages. > > > > Maybe David Brownell will be able to learn something from the ehci log > > messages. > > D'oh! I have attached the respective kern.log. Good. This clearly shows the problem. It's an error in the SCSI driver. It has shown up before and been discussed at length a few weeks ago. The problem is that a MODE-SENSE(6) command is requesting a transfer of 3 bytes, which is smaller than the minimum of 4 bytes for that command. The device doesn't like it and crashes. The SCSI people are supposed to be working on this. First, we should be using MODE-SENSE(10) by default rather than MODE-SENSE(6). Second, we should never request a transfer smaller than the minimum allowed size. When these changes are ready you shouldn't have any more trouble. The OOPS you got originally was caused by the lack of proper hot-unplugging cooperation between usb-storage and the SCSI driver. Alan Stern ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-16 21:23 ` [linux-usb-devel] USB Storage oops in 2.5.69-bk8 Alan Stern @ 2003-05-16 21:44 ` Florin Iucha 2003-05-16 22:29 ` Mike Anderson 1 sibling, 0 replies; 35+ messages in thread From: Florin Iucha @ 2003-05-16 21:44 UTC (permalink / raw) To: Alan Stern; +Cc: David Brownell, USB development list, linux-scsi [-- Attachment #1: Type: text/plain, Size: 1088 bytes --] On Fri, May 16, 2003 at 05:23:59PM -0400, Alan Stern wrote: > Good. This clearly shows the problem. It's an error in the SCSI driver. > It has shown up before and been discussed at length a few weeks ago. The > problem is that a MODE-SENSE(6) command is requesting a transfer of 3 > bytes, which is smaller than the minimum of 4 bytes for that command. > The device doesn't like it and crashes. > > The SCSI people are supposed to be working on this. First, we should be > using MODE-SENSE(10) by default rather than MODE-SENSE(6). Second, we > should never request a transfer smaller than the minimum allowed size. > When these changes are ready you shouldn't have any more trouble. > > The OOPS you got originally was caused by the lack of proper > hot-unplugging cooperation between usb-storage and the SCSI driver. At GregKH's suggestion I have opened a bug in 'zilla: http://bugme.osdl.org/show_bug.cgi?id=724 You might want to update it or close it. Thank you, florin -- "NT is to UNIX what a doughnut is to a particle accelerator." [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-16 21:23 ` [linux-usb-devel] USB Storage oops in 2.5.69-bk8 Alan Stern 2003-05-16 21:44 ` Florin Iucha @ 2003-05-16 22:29 ` Mike Anderson 2003-05-17 15:29 ` Alan Stern 2003-05-20 14:11 ` Patch: change the serial_number for error-handler commands Alan Stern 1 sibling, 2 replies; 35+ messages in thread From: Mike Anderson @ 2003-05-16 22:29 UTC (permalink / raw) To: Alan Stern; +Cc: Florin Iucha, David Brownell, USB development list, linux-scsi Alan Stern [stern@rowland.harvard.edu] wrote: > Good. This clearly shows the problem. It's an error in the SCSI driver. > It has shown up before and been discussed at length a few weeks ago. The > problem is that a MODE-SENSE(6) command is requesting a transfer of 3 > bytes, which is smaller than the minimum of 4 bytes for that command. > The device doesn't like it and crashes. > > The SCSI people are supposed to be working on this. First, we should be > using MODE-SENSE(10) by default rather than MODE-SENSE(6). Second, we > should never request a transfer smaller than the minimum allowed size. > When these changes are ready you shouldn't have any more trouble. > > The OOPS you got originally was caused by the lack of proper > hot-unplugging cooperation between usb-storage and the SCSI driver. I remember the previous conversation on the MODE-SENSE, but I think there might be more here. I do not believe the oops is related to the lack of proper hot-plugging support (unless this is another way it will manifest itself). The issue is more the interaction of the error handler and the usb storage driver. The sequence: "drivers/usb/core/message.c: usb_control/bulk_msg: timeout Error handler scsi_eh_1 waking up scsi_eh_prt_fail_stats: 1:0:0:0 cmds failed: 0, cancel: 1" would lead one two believe that a timeout was detected both in the usb core and in the mid layer. This would not be a good thing to have two timeout recovery actions happening. I am not familiar with the usb storage code so this may be incorrect. -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-16 22:29 ` Mike Anderson @ 2003-05-17 15:29 ` Alan Stern 2003-05-17 16:33 ` David Brownell 2003-05-17 18:38 ` [linux-usb-devel] USB Storage oops in 2.5.69-bk8 Patrick Mansfield 2003-05-20 14:11 ` Patch: change the serial_number for error-handler commands Alan Stern 1 sibling, 2 replies; 35+ messages in thread From: Alan Stern @ 2003-05-17 15:29 UTC (permalink / raw) To: Mike Anderson Cc: Florin Iucha, David Brownell, USB development list, linux-scsi On Fri, 16 May 2003, Mike Anderson wrote: > I remember the previous conversation on the MODE-SENSE, but I think > there might be more here. > > I do not believe the oops is related to the lack of proper hot-plugging > support (unless this is another way it will manifest itself). The issue > is more the interaction of the error handler and the usb storage driver. > > The sequence: > "drivers/usb/core/message.c: usb_control/bulk_msg: timeout > Error handler scsi_eh_1 waking up > scsi_eh_prt_fail_stats: 1:0:0:0 cmds failed: 0, cancel: 1" > > would lead one two believe that a timeout was detected both in the usb > core and in the mid layer. This would not be a good thing to have two > timeout recovery actions happening. I am not familiar with the usb > storage code so this may be incorrect. I think this is a red herring, and the sequence of log messages above is not especially pertinent. I don't have my copy of the original system log with me here, so this is relying on memory and might be wrong -- end disclaimer. What happens is that the device crashes because of the 3-byte MODE-SENSE(6) request. The scsi error handler tries various levels of fault recovery, all the way up to a bus reset. Usb-storage implements the bus reset as a USB port reset, which requires usbcore to re-enumerate the device. The line drivers/usb/core/message.c: usb_control/bulk_msg: timeout refers to a failure during the re-enumeration. Apparently even the port reset was not enough to revive the device. I don't know the reason for the line Error handler scsi_eh_1 waking up since the error handler is already running -- maybe it's related to the hot-unplug sequence that usb-storage goes through. That sequence is pretty much guaranteed to cause an OOPS if the device is busy at the time, since it hasn't yet been updated properly. (I think Matt Dharm is waiting to hear from someone -- like you, Mike -- that the SCSI core is ready for hot-unplugging.) When the USB port reset fails as it did here, it automatically causes a hot-unplug event. Usb-storage currently handles the SCSI part of that as follows: // Lock the host for each device on the host: sdev->online = 0; // Unlock the host scsi_remove_host(host); scsi_unregister_host(host); My guess is that this sequence provokes that error handler message and causes the crash. At any rate, the two timeout messages are referring to two different things. One is the failure of the MODE-SENSE and the other is the failure of the re-enumeration. They do not in themselves indicate a problem. Alan Stern ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-17 15:29 ` Alan Stern @ 2003-05-17 16:33 ` David Brownell 2003-05-18 18:31 ` Alan Stern 2003-05-17 18:38 ` [linux-usb-devel] USB Storage oops in 2.5.69-bk8 Patrick Mansfield 1 sibling, 1 reply; 35+ messages in thread From: David Brownell @ 2003-05-17 16:33 UTC (permalink / raw) To: Alan Stern; +Cc: Mike Anderson, Florin Iucha, USB development list, linux-scsi > What happens is that the device crashes because of the 3-byte > MODE-SENSE(6) request. The scsi error handler tries various levels of > fault recovery, all the way up to a bus reset. Usb-storage implements the > bus reset as a USB port reset, which requires usbcore to re-enumerate > the device. The line > > drivers/usb/core/message.c: usb_control/bulk_msg: timeout > > refers to a failure during the re-enumeration. Apparently even the port > reset was not enough to revive the device. That reset couldn't re-assign the address, so of course it failed. In hub.c, usb_physical_reset_device() is leaving dev->state as USB_STATE_CONFIGURED after resetting the port, and of course it's not legal to set the address of a device in that state. - Dave ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-17 16:33 ` David Brownell @ 2003-05-18 18:31 ` Alan Stern 2003-05-18 23:46 ` David Brownell 0 siblings, 1 reply; 35+ messages in thread From: Alan Stern @ 2003-05-18 18:31 UTC (permalink / raw) To: David Brownell Cc: Mike Anderson, Florin Iucha, USB development list, linux-scsi On Sat, 17 May 2003, David Brownell wrote: > That reset couldn't re-assign the address, so of course it failed. > > In hub.c, usb_physical_reset_device() is leaving dev->state as > USB_STATE_CONFIGURED after resetting the port, and of course it's > not legal to set the address of a device in that state. Do you mean to say that this is a bug in usb_reset_device()? Isn't it always going to be true that after a port reset a device will need to be assigned an address? Alan Stern ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-18 18:31 ` Alan Stern @ 2003-05-18 23:46 ` David Brownell 2003-05-21 15:19 ` Bug in hot-unplugging for SCSI CD-ROM Alan Stern 0 siblings, 1 reply; 35+ messages in thread From: David Brownell @ 2003-05-18 23:46 UTC (permalink / raw) To: Alan Stern; +Cc: Mike Anderson, Florin Iucha, USB development list, linux-scsi [-- Attachment #1: Type: text/plain, Size: 766 bytes --] Alan Stern wrote: > On Sat, 17 May 2003, David Brownell wrote: > > >>That reset couldn't re-assign the address, so of course it failed. >> >>In hub.c, usb_physical_reset_device() is leaving dev->state as >>USB_STATE_CONFIGURED after resetting the port, and of course it's >>not legal to set the address of a device in that state. > > > Do you mean to say that this is a bug in usb_reset_device()? Isn't it > always going to be true that after a port reset a device will need to be > assigned an address? No; yes. What I mean is that something like this patch would seem to be needed, so USB_STATE_DEFAULT ("needs an address") is set when the reset completes. Untested, but it compiles. The enumeration logic still relies on hidden side effects. - Dave [-- Attachment #2: Diff --] [-- Type: text/plain, Size: 691 bytes --] --- 1.99/drivers/usb/core/hub.c Mon Apr 21 10:47:47 2003 +++ edited/drivers/usb/core/hub.c Sun May 18 16:31:02 2003 @@ -737,6 +737,9 @@ if (status != -1) { usb_clear_port_feature(hub, port + 1, USB_PORT_FEAT_C_RESET); + dev->state = status + ? USB_STATE_NOTATTACHED + : USB_STATE_DEFAULT; return status; } --- 1.204/drivers/usb/core/usb.c Sun May 4 23:49:53 2003 +++ edited/drivers/usb/core/usb.c Sun May 18 16:32:19 2003 @@ -1018,7 +1018,6 @@ dev->dev.dma_mask = parent->dma_mask; /* it's not usable yet */ - dev->state = USB_STATE_DEFAULT; /* USB 2.0 section 5.5.3 talks about ep0 maxpacket ... * it's fixed size except for full speed devices. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Bug in hot-unplugging for SCSI CD-ROM 2003-05-18 23:46 ` David Brownell @ 2003-05-21 15:19 ` Alan Stern 0 siblings, 0 replies; 35+ messages in thread From: Alan Stern @ 2003-05-21 15:19 UTC (permalink / raw) To: Mike Anderson; +Cc: Matthew Dharm, linux-scsi Mike: I just modified the usb-storage driver to simulate a failed device and tried it with my USB CD-RW drive. The SCSI error handler kicked in, tried a device reset (which I had doctored to simulate a failure), tried a bus reset (which actually did fail because of an bug somewhere else in the USB stack), couldn't do a host reset (because usb-storage doesn't support it), and finally decided to take the device off-line. That much worked fine. But afterwards, when the process trying to read from the drive ended and the device file was closed, usb-storage received an ALLOW_MEDIUM_REMOVAL command. This should not have happened given that the device was off-line. Here are the relevant parts of the kernel log: May 21 10:09:29 localhost kernel: usb-storage: Command READ_10 (10 bytes) May 21 10:09:29 localhost kernel: usb-storage: 28 00 00 00 00 44 00 00 40 00 (delay to simulate failure) May 21 10:09:59 localhost kernel: usb-storage: usb_storage_command_abort called May 21 10:09:59 localhost kernel: usb-storage: usb_stor_stop_transport called May 21 10:09:59 localhost kernel: usb-storage: Bulk status result = 3 May 21 10:09:59 localhost kernel: usb-storage: -- command was aborted May 21 10:09:59 localhost kernel: usb-storage: Bulk reset requested (delay for automatically-generated device reset) May 21 10:10:05 localhost kernel: usb-storage: Soft reset done May 21 10:10:05 localhost kernel: usb-storage: scsi command aborted May 21 10:10:05 localhost kernel: usb-storage: *** thread sleeping. May 21 10:10:05 localhost kernel: usb-storage: queuecommand() called May 21 10:10:05 localhost kernel: usb-storage: *** thread awakened. May 21 10:10:05 localhost kernel: usb-storage: Command TEST_UNIT_READY (6 bytes) May 21 10:10:05 localhost kernel: usb-storage: 00 00 00 00 00 00 (delay to simulate failure) May 21 10:10:15 localhost kernel: usb-storage: usb_storage_command_abort called May 21 10:10:15 localhost kernel: usb-storage: usb_stor_stop_transport called May 21 10:10:15 localhost kernel: usb-storage: Bulk status result = 3 May 21 10:10:15 localhost kernel: usb-storage: -- command was aborted May 21 10:10:15 localhost kernel: usb-storage: Bulk reset requested (delay for automatically-generated device reset) May 21 10:10:21 localhost kernel: usb-storage: Soft reset done May 21 10:10:21 localhost kernel: usb-storage: scsi command aborted May 21 10:10:21 localhost kernel: usb-storage: *** thread sleeping. May 21 10:10:21 localhost kernel: usb-storage: usb_storage_device_reset called (simulate failure of device reset) May 21 10:10:21 localhost kernel: usb-storage: usb_storage_bus_reset called (next line is due to an error in the USB stack, causing failure) May 21 10:10:21 localhost kernel: drivers/usb/core/hub.c: USB device not accepting new address (error=-22) May 21 10:10:21 localhost kernel: usb-storage: usb_reset_device returns -22 May 21 10:10:21 localhost kernel: scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0 May 21 10:10:21 localhost kernel: SCSI cdrom error : <0 0 0 0> return code = 0x6050000 May 21 10:10:21 localhost kernel: end_request: I/O error, dev sr0, sector 272 May 21 10:10:21 localhost kernel: Buffer I/O error on device scsi/host0/bus0/target0/lun0/cd, logical block 68 May 21 10:10:21 localhost kernel: Buffer I/O error on device scsi/host0/bus0/target0/lun0/cd, logical block 69 (more I/O error messages, not shown) May 21 10:10:21 localhost kernel: usb-storage: queuecommand() called May 21 10:10:21 localhost kernel: usb-storage: *** thread awakened. May 21 10:10:21 localhost kernel: usb-storage: Command ALLOW_MEDIUM_REMOVAL (6 bytes) May 21 10:10:21 localhost kernel: usb-storage: 1e 00 00 00 00 00 I trust this won't be hard to fix. Alan Stern ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-17 15:29 ` Alan Stern 2003-05-17 16:33 ` David Brownell @ 2003-05-17 18:38 ` Patrick Mansfield 2003-05-31 14:35 ` Florin Iucha 1 sibling, 1 reply; 35+ messages in thread From: Patrick Mansfield @ 2003-05-17 18:38 UTC (permalink / raw) To: Alan Stern Cc: Mike Anderson, Florin Iucha, David Brownell, USB development list, linux-scsi On Sat, May 17, 2003 at 11:29:30AM -0400, Alan Stern wrote: > On Fri, 16 May 2003, Mike Anderson wrote: > What happens is that the device crashes because of the 3-byte > MODE-SENSE(6) request. The scsi error handler tries various levels of > fault recovery, all the way up to a bus reset. Usb-storage implements the > bus reset as a USB port reset, which requires usbcore to re-enumerate > the device. The line > > drivers/usb/core/message.c: usb_control/bulk_msg: timeout > > refers to a failure during the re-enumeration. Apparently even the port > reset was not enough to revive the device. I don't know the reason for > the line > > Error handler scsi_eh_1 waking up The above is the first thing output when the eh wakes up, so the re-enumeration could not have started yet (at least it could not have been initiated via scsi eh calling the bus reset or other abort/reset related functions), and the USB timeout message could not be part of the scsi eh, though we did not see the scsi_times_out log message (since it's wrapped in SCSI_LOG_TIMEOUT, not SCSI_LOG_ERROR_RECOVERY). Turning on timeout (or even all) logging might give a bit more data: echo "scsi log timeout 4" > /proc/scsi/scsi Or (warning - logs the logging messages if logging to a scsi disk): echo "scsi log all" > /proc/scsi/scsi -- Patrick Mansfield ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-17 18:38 ` [linux-usb-devel] USB Storage oops in 2.5.69-bk8 Patrick Mansfield @ 2003-05-31 14:35 ` Florin Iucha 0 siblings, 0 replies; 35+ messages in thread From: Florin Iucha @ 2003-05-31 14:35 UTC (permalink / raw) To: Patrick Mansfield Cc: Alan Stern, Mike Anderson, David Brownell, USB development list, linux-scsi [-- Attachment #1: Type: text/plain, Size: 222 bytes --] The problem is fixed in 2.5.70-bk5. I have closed the bug [1] in bugzilla. Thank you, florin 1. http://bugme.osdl.org/show_bug.cgi?id=724 -- "NT is to UNIX what a doughnut is to a particle accelerator." [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Patch: change the serial_number for error-handler commands 2003-05-16 22:29 ` Mike Anderson 2003-05-17 15:29 ` Alan Stern @ 2003-05-20 14:11 ` Alan Stern 2003-05-20 21:25 ` Luben Tuikov 1 sibling, 1 reply; 35+ messages in thread From: Alan Stern @ 2003-05-20 14:11 UTC (permalink / raw) To: Mike Anderson; +Cc: linux-scsi Mike: Do you, or does anybody else, object to the patch below? And if not, could you please ask Linus to apply it? It would improve error recovery for certain types of USB storage drives. Alan Stern ===== scsi_error.c 1.27 vs edited ===== --- 1.27/drivers/scsi/scsi_error.c Sun May 4 12:50:44 2003 +++ edited/drivers/scsi/scsi_error.c Mon May 19 14:17:04 2003 @@ -753,6 +753,8 @@ scmd->cmd_len = COMMAND_SIZE(scmd->cmnd[0]); scmd->underflow = 0; scmd->sc_data_direction = SCSI_DATA_NONE; + if ((scmd->serial_number -= 999999) == 0) + scmd->serial_number -= 999999; rtn = scsi_send_eh_cmnd(scmd, SENSE_TIMEOUT); ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-20 14:11 ` Patch: change the serial_number for error-handler commands Alan Stern @ 2003-05-20 21:25 ` Luben Tuikov 2003-05-21 1:19 ` Alan Stern 0 siblings, 1 reply; 35+ messages in thread From: Luben Tuikov @ 2003-05-20 21:25 UTC (permalink / raw) To: Alan Stern; +Cc: Mike Anderson, linux-scsi Alan Stern wrote: > ===== scsi_error.c 1.27 vs edited ===== > --- 1.27/drivers/scsi/scsi_error.c Sun May 4 12:50:44 2003 > +++ edited/drivers/scsi/scsi_error.c Mon May 19 14:17:04 2003 > @@ -753,6 +753,8 @@ > scmd->cmd_len = COMMAND_SIZE(scmd->cmnd[0]); > scmd->underflow = 0; > scmd->sc_data_direction = SCSI_DATA_NONE; > + if ((scmd->serial_number -= 999999) == 0) > + scmd->serial_number -= 999999; > > rtn = scsi_send_eh_cmnd(scmd, SENSE_TIMEOUT); Alan, could you please explain this patch? This looks like: if (sn == 999999) sn = -sn; for _that_ particular command, but scsi_core->sn stays unchanged... Q: Which value does USB Storage reserve for cmdsn, 0 or MAX? -- Luben ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-20 21:25 ` Luben Tuikov @ 2003-05-21 1:19 ` Alan Stern 2003-05-21 18:03 ` Mike Anderson 2003-05-21 19:24 ` Patch: change the serial_number for error-handler commands Luben Tuikov 0 siblings, 2 replies; 35+ messages in thread From: Alan Stern @ 2003-05-21 1:19 UTC (permalink / raw) To: Luben Tuikov; +Cc: Mike Anderson, linux-scsi On Tue, 20 May 2003, Luben Tuikov wrote: > Alan Stern wrote: > > ===== scsi_error.c 1.27 vs edited ===== > > --- 1.27/drivers/scsi/scsi_error.c Sun May 4 12:50:44 2003 > > +++ edited/drivers/scsi/scsi_error.c Mon May 19 14:17:04 2003 > > @@ -753,6 +753,8 @@ > > scmd->cmd_len = COMMAND_SIZE(scmd->cmnd[0]); > > scmd->underflow = 0; > > scmd->sc_data_direction = SCSI_DATA_NONE; > > + if ((scmd->serial_number -= 999999) == 0) > > + scmd->serial_number -= 999999; > > > > rtn = scsi_send_eh_cmnd(scmd, SENSE_TIMEOUT); > > Alan, could you please explain this patch? It's just to insure that when the error handler sends a TEST-UNIT-READY, the serial number is different from that of the immediately-prior failed command (or from that of any command issued recently or likely to be issued in the near future). > This looks like: > if (sn == 999999) > sn = -sn; > > for _that_ particular command, but scsi_core->sn stays unchanged... Well, the patch changes the serial number regardless of its initial value, but otherwise yes. It would be okay with me to change scsi_core->sn. But the global serial number is currently a static variable defined in scsi.c and my change was to scsi_error.c -- I didn't want to export the variable needlessly. > Q: Which value does USB Storage reserve for cmdsn, 0 or MAX? It doesn't reserve any values. (I avoided 0 only because I noticed that the scsi core uses 0 as a reserved value.) Usb-storage just requires that serial numbers not be repeated. Alan Stern ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 1:19 ` Alan Stern @ 2003-05-21 18:03 ` Mike Anderson 2003-05-21 18:50 ` Luben Tuikov ` (4 more replies) 2003-05-21 19:24 ` Patch: change the serial_number for error-handler commands Luben Tuikov 1 sibling, 5 replies; 35+ messages in thread From: Mike Anderson @ 2003-05-21 18:03 UTC (permalink / raw) To: Alan Stern; +Cc: Luben Tuikov, linux-scsi Sorry for the slow response I was under the weather and not doing much yesterday. Alan Stern [stern@rowland.harvard.edu] wrote: > Well, the patch changes the serial number regardless of its initial value, > but otherwise yes. It would be okay with me to change scsi_core->sn. > But the global serial number is currently a static variable defined in > scsi.c and my change was to scsi_error.c -- I didn't want to export the > variable needlessly. > > > Q: Which value does USB Storage reserve for cmdsn, 0 or MAX? > > It doesn't reserve any values. (I avoided 0 only because I noticed that > the scsi core uses 0 as a reserved value.) Usb-storage just requires that > serial numbers not be repeated. > I agree on the change in SCSI core instead of a change just for scsi error handling. If you are depending on command serial number to be unique as the comments in scsi_dispatch_cmd already indicate we have problems (in USB storage case since can_queue is 1 it would not run into a issue until scsi error handling). The patch below is a quick patch that moves the serial_number from a scsi mid level static to a per scsi host value. In looking at the usage of command serial_number it appears that moving to a per scsi host name space should be ok. I have compile and boot tested it on my system with the ips and aic7xxx drivers. The patch needs some more testing and some variable name updates, but should be ok to test USB and allow for discussion of this type of change. -andmike -- Michael Anderson andmike@us.ibm.com DESC This patch is against scsi-misc-2.5 but also applies against 2.5.69. Move scsi command serial number to per scsi host serial number. We also only increment the serial number under a lock now so the race on the value is removed. A new serial number is also acquired in the scsi_error handler on new commands. EDESC drivers/scsi/hosts.h | 7 +++++++ drivers/scsi/scsi.c | 7 ++----- drivers/scsi/scsi_error.c | 1 + 3 files changed, 10 insertions(+), 5 deletions(-) diff -puN drivers/scsi/hosts.h~scsi_serial_number drivers/scsi/hosts.h --- sysfs-scsi-misc-2.5/drivers/scsi/hosts.h~scsi_serial_number Wed May 21 08:41:49 2003 +++ sysfs-scsi-misc-2.5-andmike/drivers/scsi/hosts.h Wed May 21 08:41:49 2003 @@ -487,6 +487,8 @@ struct Scsi_Host struct device host_gendev; struct class_device class_dev; + unsigned long serial_number; + /* * We should ensure that this is aligned, both for better performance * and also because some compilers (m68k) don't automatically force @@ -532,6 +534,11 @@ static inline struct device *scsi_get_de return shost->host_gendev.parent; } +static inline unsigned long scsi_get_next_serial(struct Scsi_Host *shost) +{ + return (++shost->serial_number) ? shost->serial_number : 1; +} + struct Scsi_Device_Template { struct list_head list; diff -puN drivers/scsi/scsi.c~scsi_serial_number drivers/scsi/scsi.c --- sysfs-scsi-misc-2.5/drivers/scsi/scsi.c~scsi_serial_number Wed May 21 08:41:49 2003 +++ sysfs-scsi-misc-2.5-andmike/drivers/scsi/scsi.c Wed May 21 08:41:49 2003 @@ -83,7 +83,6 @@ */ unsigned long scsi_pid; struct scsi_cmnd *last_cmnd; -static unsigned long serial_number; /* * List of all highlevel drivers. @@ -398,11 +397,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd * unsigned long timeout; int rtn = 1; - /* Assign a unique nonzero serial_number. */ /* XXX(hch): this is racy */ - if (++serial_number == 0) - serial_number = 1; - cmd->serial_number = serial_number; cmd->pid = scsi_pid++; /* @@ -471,6 +466,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd * host->hostt->queuecommand)); spin_lock_irqsave(host->host_lock, flags); + cmd->serial_number = scsi_get_next_serial(host); rtn = host->hostt->queuecommand(cmd, scsi_done); spin_unlock_irqrestore(host->host_lock, flags); if (rtn) { @@ -485,6 +481,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd * host->hostt->command)); spin_lock_irqsave(host->host_lock, flags); + cmd->serial_number = scsi_get_next_serial(host); cmd->result = host->hostt->command(cmd); scsi_done(cmd); spin_unlock_irqrestore(host->host_lock, flags); diff -puN drivers/scsi/scsi_error.c~scsi_serial_number drivers/scsi/scsi_error.c --- sysfs-scsi-misc-2.5/drivers/scsi/scsi_error.c~scsi_serial_number Wed May 21 08:41:49 2003 +++ sysfs-scsi-misc-2.5-andmike/drivers/scsi/scsi_error.c Wed May 21 08:41:49 2003 @@ -456,6 +456,7 @@ static int scsi_send_eh_cmnd(struct scsi scmd->request->rq_status = RQ_SCSI_BUSY; spin_lock_irqsave(scmd->device->host->host_lock, flags); + scmd->serial_number = scsi_get_next_serial(host); host->hostt->queuecommand(scmd, scsi_eh_done); spin_unlock_irqrestore(scmd->device->host->host_lock, flags); _ ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 18:03 ` Mike Anderson @ 2003-05-21 18:50 ` Luben Tuikov 2003-05-21 19:18 ` Luben Tuikov ` (3 subsequent siblings) 4 siblings, 0 replies; 35+ messages in thread From: Luben Tuikov @ 2003-05-21 18:50 UTC (permalink / raw) To: Mike Anderson; +Cc: Alan Stern, linux-scsi Mike Anderson wrote: > Sorry for the slow response I was under the weather and not doing much > yesterday. > > Alan Stern [stern@rowland.harvard.edu] wrote: > >>Well, the patch changes the serial number regardless of its initial value, >>but otherwise yes. It would be okay with me to change scsi_core->sn. >>But the global serial number is currently a static variable defined in >>scsi.c and my change was to scsi_error.c -- I didn't want to export the >>variable needlessly. >> >> >>>Q: Which value does USB Storage reserve for cmdsn, 0 or MAX? >> >>It doesn't reserve any values. (I avoided 0 only because I noticed that >>the scsi core uses 0 as a reserved value.) Usb-storage just requires that >>serial numbers not be repeated. >> > > > I agree on the change in SCSI core instead of a change just for scsi > error handling. If you are depending on command serial number to be > unique as the comments in scsi_dispatch_cmd already indicate we have > problems (in USB storage case since can_queue is 1 it would not run into a > issue until scsi error handling). > > The patch below is a quick patch that moves the serial_number from a > scsi mid level static to a per scsi host value. In looking at the usage > of command serial_number it appears that moving to a per scsi host name > space should be ok. > > I have compile and boot tested it on my system with the ips and aic7xxx > drivers. > > The patch needs some more testing and some variable name updates, but > should be ok to test USB and allow for discussion of this type of > change. 1. cmdsn is per initiator, so I have no objections _in_principle_ to this patch. (haven't looked at the patch code, just read the idea) Actually cmdsn just insures delvery of the _command_ across the interconnect, after that it doesn't matter. Normally transports play with cmdsn, but it is no error for the application client to provide it. Transports may ignore it completely, when they don't support it or when they generate their own, and thus SCSI Core SHOULD NOT rely on this to identify the command. 2. SCSI Core->cmdsn should _always_ contain the _next_ cmdsn to be assigned. This provides that at boot time, it contains 0, and this is consistent with the fact that no commands have been issued yet. Assignment to a command should be atomic a la, cmd->cmdsn = scsi_core->cmdsn++; This also provides that for a reasonable value v [RFC1982], one can tell immediately the number of commands issued, a la, (scsi_core->cmdsn - v + MAX) % MAX is commands issued since v. This is also consistent with the well known practice that indexes, etc, always contain the _next_ index, so that that it serves as an index (to the next) and as a _size_ variable just by reading it. (#2 should probably be fixed first) 3. Zero is indeed a reserved value, as per SAM-3r07, 5.1. 4. The error handler is NO exception. If the error handler sends a SCSI Command (i.e. a _CDB_) then it is supposed to assign a cmdsn, as per the rules. This is for delivery. -- Luben ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 18:03 ` Mike Anderson 2003-05-21 18:50 ` Luben Tuikov @ 2003-05-21 19:18 ` Luben Tuikov 2003-05-21 20:28 ` Mike Anderson 2003-05-21 19:57 ` Alan Stern ` (2 subsequent siblings) 4 siblings, 1 reply; 35+ messages in thread From: Luben Tuikov @ 2003-05-21 19:18 UTC (permalink / raw) To: Mike Anderson; +Cc: Alan Stern, linux-scsi Mike Anderson wrote: > > DESC > This patch is against scsi-misc-2.5 but also applies against 2.5.69. > > Move scsi command serial number to per scsi host serial number. We also > only increment the serial number under a lock now so the race on the value > is removed. A new serial number is also acquired in the scsi_error handler > on new commands. > EDESC > > > drivers/scsi/hosts.h | 7 +++++++ > drivers/scsi/scsi.c | 7 ++----- > drivers/scsi/scsi_error.c | 1 + > 3 files changed, 10 insertions(+), 5 deletions(-) > > diff -puN drivers/scsi/hosts.h~scsi_serial_number drivers/scsi/hosts.h > --- sysfs-scsi-misc-2.5/drivers/scsi/hosts.h~scsi_serial_number Wed May 21 08:41:49 2003 > +++ sysfs-scsi-misc-2.5-andmike/drivers/scsi/hosts.h Wed May 21 08:41:49 2003 > @@ -487,6 +487,8 @@ struct Scsi_Host > struct device host_gendev; > struct class_device class_dev; > > + unsigned long serial_number; > + > /* > * We should ensure that this is aligned, both for better performance > * and also because some compilers (m68k) don't automatically force > @@ -532,6 +534,11 @@ static inline struct device *scsi_get_de > return shost->host_gendev.parent; > } > > +static inline unsigned long scsi_get_next_serial(struct Scsi_Host *shost) > +{ > + return (++shost->serial_number) ? shost->serial_number : 1; > +} > + scsi_get_next_serial()??? Do _you_ Mike like this name? scsi_get_cmdsn() will be a lot more appropriate. It doesn't reveal implementation (``next''), and tells that it's a serial number for a command, ``sn'' stands for serial number pretty universally in our society. How about something like this: static inline unsigned long scsi_get_cmdsn(struct Scsi_Host, *shost) { static const typeof(shost->serial_number) MAX_SN = ~((typeof(shost->serial_number) 0); return shost->serial_number++ == MAX_SN ? ++shost->serial_number : shost->serial_number; } BTW, are you relying on memset(..., 0, sizeof(struct Scsi_Host), to set the serial number to 0? -- Luben ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 19:18 ` Luben Tuikov @ 2003-05-21 20:28 ` Mike Anderson 2003-05-21 21:11 ` Luben Tuikov 0 siblings, 1 reply; 35+ messages in thread From: Mike Anderson @ 2003-05-21 20:28 UTC (permalink / raw) To: Luben Tuikov; +Cc: Alan Stern, linux-scsi Luben Tuikov [tluben@rogers.com] wrote: > >+static inline unsigned long scsi_get_next_serial(struct Scsi_Host *shost) > >+{ > >+ return (++shost->serial_number) ? shost->serial_number : 1; > >+} > >+ > There is a bug in the above line it should be return (++shost->serial_number) ? shost->serial_number : ++shost->serial_number; > scsi_get_next_serial()??? Do _you_ Mike like this name? > No. > scsi_get_cmdsn() will be a lot more appropriate. It doesn't > reveal implementation (``next''), and tells that it's a serial > number for a command, ``sn'' stands for serial number pretty > universally in our society. > This name is ok with me. > How about something like this: > > static inline unsigned long scsi_get_cmdsn(struct Scsi_Host, *shost) > { > static const typeof(shost->serial_number) MAX_SN = > ~((typeof(shost->serial_number) 0); > return shost->serial_number++ == MAX_SN ? > ++shost->serial_number : shost->serial_number; > } > Why compare with MAX_SN and not just let the value role over and check for false. > BTW, are you relying on memset(..., 0, sizeof(struct Scsi_Host), > to set the serial number to 0? Yes I was. -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 20:28 ` Mike Anderson @ 2003-05-21 21:11 ` Luben Tuikov 2003-05-21 23:15 ` Patrick Mansfield 0 siblings, 1 reply; 35+ messages in thread From: Luben Tuikov @ 2003-05-21 21:11 UTC (permalink / raw) To: Mike Anderson; +Cc: Alan Stern, linux-scsi Mike Anderson wrote: > Luben Tuikov [tluben@rogers.com] wrote: > >> Mike Anderson wrote: ^^^^^^^^^^^^^^ >>>+static inline unsigned long scsi_get_next_serial(struct Scsi_Host *shost) >>>+{ >>>+ return (++shost->serial_number) ? shost->serial_number : 1; >>>+} >>>+ >> > > There is a bug in the above line it should be > return (++shost->serial_number) ? shost->serial_number : ++shost->serial_number; I did NOT write the above mentioned function -- this was part of _your_ patch -- next time _please_ remove ``Luben wrote:'' in the reply, so that there's no confusion! I'd appreciate that very much. >>How about something like this: >> >>static inline unsigned long scsi_get_cmdsn(struct Scsi_Host, *shost) >>{ >> static const typeof(shost->serial_number) MAX_SN = >> ~((typeof(shost->serial_number) 0); >> return shost->serial_number++ == MAX_SN ? >> ++shost->serial_number : shost->serial_number; >>} >> > > > Why compare with MAX_SN and not just let the value role over and check > for false. First, zero is reserved, and I was trying to keep the _next_ sn in the ``session'' cmdsn variable, but there's a little _bug_ up there too, so here's a corrected version: static inline unsigned long scsi_get_cmdsn(struct Scsi_Host *shost) { static const typeof(shost->serial_number) MAX_SN = ~((typeof(shost->serial_number) 0); return shost->serial_number == MAX_SN ? (shost->serial_number += 2)++ : shost->serial_number++; } -- Luben ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 21:11 ` Luben Tuikov @ 2003-05-21 23:15 ` Patrick Mansfield 2003-05-22 5:47 ` Luben Tuikov 0 siblings, 1 reply; 35+ messages in thread From: Patrick Mansfield @ 2003-05-21 23:15 UTC (permalink / raw) To: Luben Tuikov; +Cc: Mike Anderson, linux-scsi On Wed, May 21, 2003 at 05:11:41PM -0400, Luben Tuikov wrote: > Mike Anderson wrote: > > > > Why compare with MAX_SN and not just let the value role over and check > > for false. > > First, zero is reserved, and I was trying to keep the _next_ sn in > the ``session'' cmdsn variable, but there's a little _bug_ up there too, > so here's a corrected version: > > static inline unsigned long scsi_get_cmdsn(struct Scsi_Host *shost) > { > static const typeof(shost->serial_number) MAX_SN = > ~((typeof(shost->serial_number) 0); > return shost->serial_number == MAX_SN ? > (shost->serial_number += 2)++ : > shost->serial_number++; My less than $.02: As far as the code goes, it does not matter whether we store the current or next value in shost->serial_number, so we ought to use the simpler code (Mike's version). And (x += 2)++ won't compile. Per naming: "get" is already overloaded and implies a put. We have code like scsi_get_cmd that allocates and returns a pointer/value, and then various scsi and other kernel get/put functions that take a pointer and increment or decrement ref counters. cmdsn is a rather cryptic abbreviation that my brain can't easily parse. I suggest scsi_serial_number(host) or scsi_cmdsn(host). -- Patrick Mansfield ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 23:15 ` Patrick Mansfield @ 2003-05-22 5:47 ` Luben Tuikov 0 siblings, 0 replies; 35+ messages in thread From: Luben Tuikov @ 2003-05-22 5:47 UTC (permalink / raw) To: Patrick Mansfield; +Cc: Mike Anderson, linux-scsi Patrick Mansfield wrote: > > My less than $.02: > > As far as the code goes, it does not matter whether we store the current > or next value in shost->serial_number, I was _just_ mentioning what is normally done in implementations of some transports... (but actually they need to store the next, but it seems we'll not use this ability in SCSI Core, so yes, it doesn't really matter...) > so we ought to use the simpler code > (Mike's version). You mean Eric's version -- this was in SCSI Core _before_ you two started paid work on SCSI Core. > And (x += 2)++ won't compile. Aaaah, I've mentioned this before several times, which is my usual disclaimer: the only C code which I send to linux scsi is in a patch. Anything else, doesn't matter how C-like it is, if it is in text, is just that: C-like. I'm just trying to convey an idea. Actually I _did_ set up a running verion of this as a patch today but I never sent it -- too much politics, I got discouraged and saw absolutely _no_ point. > Per naming: "get" is already overloaded and implies a put. We have code > like scsi_get_cmd that allocates and returns a pointer/value, and then > various scsi and other kernel get/put functions that take a pointer and > increment or decrement ref counters. cmdsn is a rather cryptic > abbreviation that my brain can't easily parse. > > I suggest scsi_serial_number(host) or scsi_cmdsn(host). I personally am getting tired of those scsi_this_is_a_long_name_for_a_trivial_function() function names in scsi so I'd go for the second. -- Luben ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 18:03 ` Mike Anderson 2003-05-21 18:50 ` Luben Tuikov 2003-05-21 19:18 ` Luben Tuikov @ 2003-05-21 19:57 ` Alan Stern 2003-05-21 20:42 ` Luben Tuikov 2003-05-21 21:19 ` James Bottomley 2003-06-11 17:41 ` PATCH: (as33) Remove /proc/scsi directory in scsi_remove_host() Alan Stern 4 siblings, 1 reply; 35+ messages in thread From: Alan Stern @ 2003-05-21 19:57 UTC (permalink / raw) To: Mike Anderson; +Cc: Luben Tuikov, linux-scsi On Wed, 21 May 2003, Mike Anderson wrote: > I agree on the change in SCSI core instead of a change just for scsi > error handling. If you are depending on command serial number to be > unique as the comments in scsi_dispatch_cmd already indicate we have > problems (in USB storage case since can_queue is 1 it would not run into a > issue until scsi error handling). > > The patch below is a quick patch that moves the serial_number from a > scsi mid level static to a per scsi host value. In looking at the usage > of command serial_number it appears that moving to a per scsi host name > space should be ok. > > I have compile and boot tested it on my system with the ips and aic7xxx > drivers. > > The patch needs some more testing and some variable name updates, but > should be ok to test USB and allow for discussion of this type of > change. Mike: I tried it out, and it did exactly what I wanted. You, Luben, and the others can fight over the details. Alan Stern ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 19:57 ` Alan Stern @ 2003-05-21 20:42 ` Luben Tuikov 2003-05-21 21:05 ` Alan Stern 0 siblings, 1 reply; 35+ messages in thread From: Luben Tuikov @ 2003-05-21 20:42 UTC (permalink / raw) To: Alan Stern; +Cc: Mike Anderson, linux-scsi Alan Stern wrote: > > Mike: > > I tried it out, and it did exactly what I wanted. You, Luben, and the > others can fight over the details. Fighting? Who's fighting? I was just describing what standards say about those kinds of things, so that SCSI Core is in line with those. It's unfortunate that ppl can post comments like this; it's immature and doesn't help nor improve SCSI Core. All we're trying to do here is _improve_ SCSI Core any which way we can and we all contribute what we can. Comments like this are discouraging at best... -- Luben ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 20:42 ` Luben Tuikov @ 2003-05-21 21:05 ` Alan Stern 0 siblings, 0 replies; 35+ messages in thread From: Alan Stern @ 2003-05-21 21:05 UTC (permalink / raw) To: Luben Tuikov; +Cc: Mike Anderson, linux-scsi On Wed, 21 May 2003, Luben Tuikov wrote: > Alan Stern wrote: > > > > Mike: > > > > I tried it out, and it did exactly what I wanted. You, Luben, and the > > others can fight over the details. > > Fighting? Who's fighting? I was just describing what standards say > about those kinds of things, so that SCSI Core is in line with those. > > It's unfortunate that ppl can post comments like this; it's immature > and doesn't help nor improve SCSI Core. > > All we're trying to do here is _improve_ SCSI Core any which way > we can and we all contribute what we can. > > Comments like this are discouraging at best... Sorry, that was meant as a joke. I guess I should have used a smiley... Alan Stern ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 18:03 ` Mike Anderson ` (2 preceding siblings ...) 2003-05-21 19:57 ` Alan Stern @ 2003-05-21 21:19 ` James Bottomley 2003-05-21 22:53 ` Mike Anderson 2003-06-11 17:41 ` PATCH: (as33) Remove /proc/scsi directory in scsi_remove_host() Alan Stern 4 siblings, 1 reply; 35+ messages in thread From: James Bottomley @ 2003-05-21 21:19 UTC (permalink / raw) To: Mike Anderson; +Cc: Alan Stern, Luben Tuikov, SCSI Mailing List On Wed, 2003-05-21 at 14:03, Mike Anderson wrote: > This patch is against scsi-misc-2.5 but also applies against 2.5.69. > > Move scsi command serial number to per scsi host serial number. We also > only increment the serial number under a lock now so the race on the value > is removed. A new serial number is also acquired in the scsi_error handler > on new commands. This patch looks OK to me, except that I don't see a good reason to go to a per host serial number. The SCSI commands are so short lived that the current global serial number seems adequate (and saves us 4 bytes per device). On locking grounds, though it does look better to move it into the host structure (probably make explicit by documenting that the host lock needs to be held accessing the method). This: +static inline unsigned long scsi_get_next_serial(struct Scsi_Host *shost) +{ + return (++shost->serial_number) ? shost->serial_number : 1; +} + Will hand out two 1 serial numbers when we start at zero or wrap. Shouldn't that be static inline unsigned long scsi_get_next_serial(struct Scsi_Host *shost) { if(++shost->serial_number == 0) ++shost->serial_number; return shost->serial_number; } ? James ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 21:19 ` James Bottomley @ 2003-05-21 22:53 ` Mike Anderson 0 siblings, 0 replies; 35+ messages in thread From: Mike Anderson @ 2003-05-21 22:53 UTC (permalink / raw) To: James Bottomley; +Cc: Alan Stern, Luben Tuikov, SCSI Mailing List James Bottomley [James.Bottomley@SteelEye.com] wrote: > This patch looks OK to me, except that I don't see a good reason to go > to a per host serial number. The SCSI commands are so short lived that > the current global serial number seems adequate (and saves us 4 bytes > per device). > > On locking grounds, though it does look better to move it into the host > structure (probably make explicit by documenting that the host lock > needs to be held accessing the method). > I made the change on locking simplification a global lock to protect this value in the IO path sounded wrong. The overhead should be the same on single adapter systems, but would increase by 4 bytes for every adapter (real and pseudo). > This: > > +static inline unsigned long scsi_get_next_serial(struct Scsi_Host > *shost) > +{ > + return (++shost->serial_number) ? shost->serial_number : 1; > +} > + > > Will hand out two 1 serial numbers when we start at zero or wrap. > > Shouldn't that be > > static inline unsigned long scsi_get_next_serial(struct Scsi_Host > *shost) > { > if(++shost->serial_number == 0) > ++shost->serial_number; > > return shost->serial_number; > } > > ? Yes in the original patch snippet above 1 was passed out twice. In a previous response to this thread I indicated it should be like below or functionally similar to the code you have above. return (++shost->serial_number) ? shost->serial_number : ++shost->serial_number; -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* PATCH: (as33) Remove /proc/scsi directory in scsi_remove_host() 2003-05-21 18:03 ` Mike Anderson ` (3 preceding siblings ...) 2003-05-21 21:19 ` James Bottomley @ 2003-06-11 17:41 ` Alan Stern 2003-06-11 18:23 ` Mike Anderson 4 siblings, 1 reply; 35+ messages in thread From: Alan Stern @ 2003-06-11 17:41 UTC (permalink / raw) To: Mike Anderson; +Cc: SCSI development list Mike: You may know this already; if so please forgive the redundant information. The reworking of the host registration code in 2.5.70 has a bug that prevents the host's directory in /proc/scsi from being removed when the host is unregistered. The cause of the problem is that scsi_proc_host_rm() is checking for shost->hostt->present == 0, but it is called before scsi_free_shost() which is where shost->hostt->present gets decremented. Since there doesn't really seem to be any need to check shost->hostt->present (hosts shouldn't be removed more than once), I removed that line. I also reset shost->hostt->proc_dir back to NULL, in case the same host template is used again later to register another host. (There's another problem that prevents scsi_free_shost() from being called at all, but that's a bug in the driver model core. I've submitted a separate report about that.) Alan Stern ===== scsi_proc.c 1.11 vs edited ===== --- 1.11/drivers/scsi/scsi_proc.c Mon May 26 14:01:11 2003 +++ edited/drivers/scsi/scsi_proc.c Wed Jun 11 13:22:43 2003 @@ -149,8 +149,8 @@ sprintf(name,"%d", shost->host_no); remove_proc_entry(name, shost->hostt->proc_dir); - if (!shost->hostt->present) - remove_proc_entry(shost->hostt->proc_name, proc_scsi); + remove_proc_entry(shost->hostt->proc_name, proc_scsi); + shost->hostt->proc_dir = NULL; } static void proc_print_scsidevice(struct scsi_device* sdev, char *buffer, ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: PATCH: (as33) Remove /proc/scsi directory in scsi_remove_host() 2003-06-11 17:41 ` PATCH: (as33) Remove /proc/scsi directory in scsi_remove_host() Alan Stern @ 2003-06-11 18:23 ` Mike Anderson 2003-06-12 6:04 ` Christoph Hellwig 0 siblings, 1 reply; 35+ messages in thread From: Mike Anderson @ 2003-06-11 18:23 UTC (permalink / raw) To: Alan Stern; +Cc: SCSI development list Alan, This is bug, but I do not think this fix will work for non legacy callers. LLDDs that just use scsi_add_host and scsi_remove_host would have the remove_proc_entry called possibly too early. I do not have another fix to offer right now as "present" has race issues and just moving it might not be a good idea. Alan Stern [stern@rowland.harvard.edu] wrote: > ===== scsi_proc.c 1.11 vs edited ===== > --- 1.11/drivers/scsi/scsi_proc.c Mon May 26 14:01:11 2003 > +++ edited/drivers/scsi/scsi_proc.c Wed Jun 11 13:22:43 2003 > @@ -149,8 +149,8 @@ > > sprintf(name,"%d", shost->host_no); > remove_proc_entry(name, shost->hostt->proc_dir); > - if (!shost->hostt->present) > - remove_proc_entry(shost->hostt->proc_name, proc_scsi); > + remove_proc_entry(shost->hostt->proc_name, proc_scsi); > + shost->hostt->proc_dir = NULL; > } > > static void proc_print_scsidevice(struct scsi_device* sdev, char *buffer, > > - > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: PATCH: (as33) Remove /proc/scsi directory in scsi_remove_host() 2003-06-11 18:23 ` Mike Anderson @ 2003-06-12 6:04 ` Christoph Hellwig 2003-06-12 6:51 ` Mike Anderson 0 siblings, 1 reply; 35+ messages in thread From: Christoph Hellwig @ 2003-06-12 6:04 UTC (permalink / raw) To: Alan Stern, SCSI development list On Wed, Jun 11, 2003 at 11:23:24AM -0700, Mike Anderson wrote: > Alan, > This is bug, but I do not think this fix will work for non > legacy callers. LLDDs that just use scsi_add_host and > scsi_remove_host would have the remove_proc_entry called > possibly too early. Why should we care when the proc entry is remove? The whole ->proc_info mess is marked obsolete in 2.5 so no one should rely on it. It's certainly better than leaking the proc_dir_entry :) ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: PATCH: (as33) Remove /proc/scsi directory in scsi_remove_host() 2003-06-12 6:04 ` Christoph Hellwig @ 2003-06-12 6:51 ` Mike Anderson 2003-06-12 21:00 ` PATCH: (as33b) " Alan Stern 0 siblings, 1 reply; 35+ messages in thread From: Mike Anderson @ 2003-06-12 6:51 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Alan Stern, SCSI development list Christoph Hellwig [hch@infradead.org] wrote: > On Wed, Jun 11, 2003 at 11:23:24AM -0700, Mike Anderson wrote: > > Alan, > > This is bug, but I do not think this fix will work for non > > legacy callers. LLDDs that just use scsi_add_host and > > scsi_remove_host would have the remove_proc_entry called > > possibly too early. > > Why should we care when the proc entry is remove? The whole ->proc_info > mess is marked obsolete in 2.5 so no one should rely on it. It's certainly > better than leaking the proc_dir_entry :) While it is marked obsolete, but it should still function shouldn't it? Even if we use the proc_info replacement you suggested in previous mail: http://marc.theaimsgroup.com/?l=linux-scsi&m=105112638920306&w=2 we should have a similar cleanup issue. The patch suggested gives the output shown below which leaves "n" number of file entries not accessible. Ex. # ls /proc/scsi/scsi_debug/ 1 2 3 # echo "-1" > add_host Synchronizing SCSI cache: # ls /proc/scsi device_info ips scsi sg This proc cleanup routine also races with the read / write routines that previously where "protected" by the procfs try_module_get. I was looking into closing this race, but if you feel it is not worthwhile I can go look at other issues we have. -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* PATCH: (as33b) Remove /proc/scsi directory in scsi_remove_host() 2003-06-12 6:51 ` Mike Anderson @ 2003-06-12 21:00 ` Alan Stern 2003-06-12 21:58 ` Mike Anderson 0 siblings, 1 reply; 35+ messages in thread From: Alan Stern @ 2003-06-12 21:00 UTC (permalink / raw) To: Mike Anderson; +Cc: Christoph Hellwig, SCSI development list Mike: Here is a more extensive attempt to fix the /proc/scsi problem. This patch should work with LLDDs that use either the legacy or the hotplug model. It should fix the problem you noted of trying to remove the directory while there are still hosts present. And it resolves at least one race that exists in the current code. The major difference between the current code and my patch is that the patch creates the /proc/scsi/hostname directory during scsi_register() and removes it during scsi_unregister(). The actual /proc/scsi/hostname/hostnum files are still created and removed during scsi_add_host() and scsi_remove_host(). I doubt that change should cause anybody any problem; it was the only way to reconcile the procfs stuff with host probing. I hope you like this attempt a little better. Alan Stern # This is a BitKeeper generated patch for the following project: # Project Name: greg k-h's linux 2.5 USB kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1649 -> 1.1650 # drivers/scsi/scsi_priv.h 1.1 -> 1.2 # drivers/scsi/hosts.c 1.26 -> 1.27 # drivers/scsi/scsi_proc.c 1.11 -> 1.12 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/06/12 stern@ida.rowland.org 1.1650 # Fix procfs handling during host unregistration # -------------------------------------------- # diff -Nru a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c --- a/drivers/scsi/hosts.c Thu Jun 12 16:52:28 2003 +++ b/drivers/scsi/hosts.c Thu Jun 12 16:52:28 2003 @@ -285,7 +285,7 @@ shost->eh_notify = NULL; } - shost->hostt->present--; + scsi_proc_hostdir_rm(shost->hostt); scsi_destroy_command_freelist(shost); kfree(shost); } @@ -417,7 +417,7 @@ */ wait_for_completion(&sem); shost->eh_notify = NULL; - shost->hostt->present++; + scsi_proc_hostdir_add(shost->hostt); return shost; fail: diff -Nru a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h --- a/drivers/scsi/scsi_priv.h Thu Jun 12 16:52:28 2003 +++ b/drivers/scsi/scsi_priv.h Thu Jun 12 16:52:28 2003 @@ -42,6 +42,7 @@ (((scmd)->sense_buffer[0] & 0x70) == 0x70) struct Scsi_Device_Template; +struct SHT; /* @@ -100,11 +101,15 @@ /* scsi_proc.c */ #ifdef CONFIG_PROC_FS +extern void scsi_proc_hostdir_add(struct SHT *); +extern void scsi_proc_hostdir_rm(struct SHT *); extern void scsi_proc_host_add(struct Scsi_Host *); extern void scsi_proc_host_rm(struct Scsi_Host *); extern int scsi_init_procfs(void); extern void scsi_exit_procfs(void); #else +# define scsi_proc_hostdir_add(sht) do { } while (0) +# define scsi_proc_hostdir_rm(sht) do { } while (0) # define scsi_proc_host_add(shost) do { } while (0) # define scsi_proc_host_rm(shost) do { } while (0) # define scsi_init_procfs() (0) diff -Nru a/drivers/scsi/scsi_proc.c b/drivers/scsi/scsi_proc.c --- a/drivers/scsi/scsi_proc.c Thu Jun 12 16:52:28 2003 +++ b/drivers/scsi/scsi_proc.c Thu Jun 12 16:52:28 2003 @@ -40,6 +40,8 @@ struct proc_dir_entry *proc_scsi; EXPORT_SYMBOL(proc_scsi); +/* Protect sht->present and sht->proc_dir */ +static DECLARE_MUTEX(global_host_template_sem); /* Used if the driver currently has no own support for /proc/scsi */ static int generic_proc_info(char *buffer, char **start, off_t offset, @@ -112,13 +114,10 @@ return ret; } -void scsi_proc_host_add(struct Scsi_Host *shost) +void scsi_proc_hostdir_add(struct SHT *sht) { - Scsi_Host_Template *sht = shost->hostt; - struct proc_dir_entry *p; - char name[10]; - - if (!sht->proc_dir) { + down(&global_host_template_sem); + if (!sht->present++) { sht->proc_dir = proc_mkdir(sht->proc_name, proc_scsi); if (!sht->proc_dir) { printk(KERN_ERR "%s: proc_mkdir failed for %s\n", @@ -127,30 +126,51 @@ } sht->proc_dir->owner = sht->module; } + up(&global_host_template_sem); +} + +void scsi_proc_hostdir_rm(struct SHT *sht) +{ + down(&global_host_template_sem); + if (!--sht->present && sht->proc_dir) { + remove_proc_entry(sht->proc_name, proc_scsi); + sht->proc_dir = NULL; + } + up(&global_host_template_sem); +} + +void scsi_proc_host_add(struct Scsi_Host *shost) +{ + Scsi_Host_Template *sht = shost->hostt; + struct proc_dir_entry *p; + char name[10]; + + if (!sht->proc_dir) + return; sprintf(name,"%d", shost->host_no); p = create_proc_read_entry(name, S_IFREG | S_IRUGO | S_IWUSR, - shost->hostt->proc_dir, proc_scsi_read, shost); + sht->proc_dir, proc_scsi_read, shost); if (!p) { printk(KERN_ERR "%s: Failed to register host %d in" "%s\n", __FUNCTION__, shost->host_no, - shost->hostt->proc_name); + sht->proc_name); return; } p->write_proc = proc_scsi_write; - p->owner = shost->hostt->module; - + p->owner = sht->module; } void scsi_proc_host_rm(struct Scsi_Host *shost) { char name[10]; + if (!shost->hostt->proc_dir) + return; + sprintf(name,"%d", shost->host_no); remove_proc_entry(name, shost->hostt->proc_dir); - if (!shost->hostt->present) - remove_proc_entry(shost->hostt->proc_name, proc_scsi); } static void proc_print_scsidevice(struct scsi_device* sdev, char *buffer, ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: PATCH: (as33b) Remove /proc/scsi directory in scsi_remove_host() 2003-06-12 21:00 ` PATCH: (as33b) " Alan Stern @ 2003-06-12 21:58 ` Mike Anderson 2003-06-13 14:38 ` Alan Stern 2003-06-15 13:01 ` Christoph Hellwig 0 siblings, 2 replies; 35+ messages in thread From: Mike Anderson @ 2003-06-12 21:58 UTC (permalink / raw) To: Alan Stern; +Cc: Christoph Hellwig, SCSI development list Alan, This looks good. In scsi-misc-2.5 SHT has been removed so there will need to be a little name sync up. I also had a patch that I was working on that removed the users of "->present" and just left the procfs users. This might be a good one to add on top of this patch. We still have the issue outside of what this patch was trying to address with the proc_read/proc_write functions racing with the remove_proc_entry, but in previous statements this will currently be left as is for this obsolete interface. -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: PATCH: (as33b) Remove /proc/scsi directory in scsi_remove_host() 2003-06-12 21:58 ` Mike Anderson @ 2003-06-13 14:38 ` Alan Stern 2003-06-15 13:01 ` Christoph Hellwig 1 sibling, 0 replies; 35+ messages in thread From: Alan Stern @ 2003-06-13 14:38 UTC (permalink / raw) To: Mike Anderson; +Cc: Christoph Hellwig, SCSI development list On Thu, 12 Jun 2003, Mike Anderson wrote: > Alan, > This looks good. In scsi-misc-2.5 SHT has been removed so there > will need to be a little name sync up. That shouldn't be difficult. > I also had a patch that I was working on that removed the users > of "->present" and just left the procfs users. This might be a > good one to add on top of this patch. I agree. > We still have the issue outside of what this patch was trying > to address with the proc_read/proc_write functions racing with > the remove_proc_entry, but in previous statements this will > currently be left as is for this obsolete interface. That sounds like the right thing to do. This is a generic problem with procfs (maybe sysfs too). I think the only way to solve it is to add "open" and "close" routines that would increment/decrement a reference count for the host structure. Delaying the kfree of the structure until the reference count goes to 0 would prevent anything bad from happening when scsi_unregister() is called while some process still has the file open. But since this is an obsolete interface anyway, it's probably not worth going to all that trouble. Alan Stern ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: PATCH: (as33b) Remove /proc/scsi directory in scsi_remove_host() 2003-06-12 21:58 ` Mike Anderson 2003-06-13 14:38 ` Alan Stern @ 2003-06-15 13:01 ` Christoph Hellwig 1 sibling, 0 replies; 35+ messages in thread From: Christoph Hellwig @ 2003-06-15 13:01 UTC (permalink / raw) To: Alan Stern, Christoph Hellwig, SCSI development list On Thu, Jun 12, 2003 at 02:58:58PM -0700, Mike Anderson wrote: > I also had a patch that I was working on that removed the users > of "->present" and just left the procfs users. This might be a > good one to add on top of this patch. Yes, certainly. > We still have the issue outside of what this patch was trying > to address with the proc_read/proc_write functions racing with > the remove_proc_entry, but in previous statements this will > currently be left as is for this obsolete interface. This is a generic procfs problem. We could maybe use real file operations for the procfs files and then set the owner to the LLDD module. But probably it's not worth the effort.. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Patch: change the serial_number for error-handler commands 2003-05-21 1:19 ` Alan Stern 2003-05-21 18:03 ` Mike Anderson @ 2003-05-21 19:24 ` Luben Tuikov 1 sibling, 0 replies; 35+ messages in thread From: Luben Tuikov @ 2003-05-21 19:24 UTC (permalink / raw) To: Alan Stern; +Cc: Mike Anderson, linux-scsi Alan Stern wrote: >>Alan, could you please explain this patch? > > > It's just to insure that when the error handler sends a TEST-UNIT-READY, > the serial number is different from that of the immediately-prior failed > command (or from that of any command issued recently or likely to be > issued in the near future). This is a _trivial_ bug in SCSI Core (error handler, ...) which I hope will be fixed in the near future (Mike or Patrick will send a patch). That is, as per my other email, two successive commands (CDBs) should never be sent with duplicate cmdsn's. What we're talking about here is a _consistent_ path of a command through SCSI Core, before it reaches a LLDD. This is currently NOT the case, thus the duplicate cmdsn... -- Luben ^ permalink raw reply [flat|nested] 35+ messages in thread
* Fwd: Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8
@ 2003-05-16 15:56 Florin Iucha
2003-05-16 16:22 ` Mike Anderson
0 siblings, 1 reply; 35+ messages in thread
From: Florin Iucha @ 2003-05-16 15:56 UTC (permalink / raw)
To: linux-scsi; +Cc: andmike
[-- Attachment #1: Type: text/plain, Size: 3114 bytes --]
----- Forwarded message from Alan Stern <stern@rowland.harvard.edu> -----
> Subject: Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8
> From: Alan Stern <stern@rowland.harvard.edu>
> To: Florin Iucha <florin@iucha.net>
> Cc: USB development list <linux-usb-devel@lists.sourceforge.net>
> Date: Fri, 16 May 2003 11:49:00 -0400 (EDT)
>
> On Thu, 15 May 2003, Florin Iucha wrote:
>
> > Hello,
> >
> > I get the following oops in 2.5.69-bk8 when I try to mount a Compact
> > Flash card in a SanDisk adapter:
> >
> > usb 2-1: USB disconnect, address 2
> > hub 2-0:0: debounce: port 1: delay 100ms stable 4 status 0x301
> > hub 2-0:0: new USB device on port 1, assigned address 3
> > input: USB HID v1.00 Mouse [Microsoft Microsoft IntelliMouse? Explorer] on usb-00:02.3-1
> > usb 2-1: USB disconnect, address 3
> > drivers/usb/input/hid-core.c: can't resubmit intr, 00:02.3-1/input0, status -19
> > hub 2-0:0: debounce: port 1: delay 100ms stable 4 status 0x301
> > hub 2-0:0: new USB device on port 1, assigned address 4
> > input: USB HID v1.00 Mouse [Microsoft Microsoft IntelliMouse? Explorer] on usb-00:02.3-1
> > SCSI device sda: 31232 512-byte hdwr sectors (16 MB)
> > sda: Write Protect is off
> > sda: Mode Sense: 02 00 00 00
> > drivers/usb/core/message.c: usb_control/bulk_msg: timeout
> > drivers/usb/core/message.c: usb_control/bulk_msg: timeout
> > drivers/usb/core/hub.c: USB device not accepting new address (error=-22)
> > scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 0 lun 0
> > usb 5-2: USB disconnect, address 2
> > Unable to handle kernel paging request at virtual address 544150a8
> > printing eip:
> > c02abc39
> > *pde = 00000000
> > Oops: 0000 [#1]
> > CPU: 0
> > EIP: 0060:[<c02abc39>] Not tainted
> > EFLAGS: 00010286
> > EIP is at scsi_eh_finish_cmd+0x19/0x60
> > eax: ef19d000 ebx: eef28080 ecx: eef22140 edx: 54415056
> > esi: eef21fa8 edi: eef21fb0 ebp: eef21fa8 esp: eef21f60
> > ds: 007b es: 007b ss: 0068
> > Process scsi_eh_1 (pid: 118, threadinfo=eef20000 task=eef23340)
> > Stack: eef28080 eef28080 eef21fb0 c02ac418 eef28080 eef21fa8 00000000 00000000
> > 00000000 ef62f428 eef20000 00000282 eef21fa8 c02ac8d6 eef21fb0 eef21fa8
> > eef21fa8 eef21fb0 eef21fa8 eef21fa8 eef28098 eef28098 ef62f400 eef21fd4
> > Call Trace:
> > [<c02ac418>] scsi_eh_offline_sdevs+0x68/0x80
> > [<c02ac8d6>] scsi_unjam_host+0xc6/0xd0
> > [<c02ac9b1>] scsi_error_handler+0xd1/0x120
> > [<c02ac8e0>] scsi_error_handler+0x0/0x120
> > [<c0108abd>] kernel_thread_helper+0x5/0x18
> >
> > Code: 0f b7 42 52 48 66 89 42 52 c7 43 24 00 00 00 00 66 c7 43 08
> >
> > I cannot tell if it is a regression since I just got the CF and the
> > adapter.
>
> This is a problem in the SCSI error handler. You should post your
> question to <linux-scsi@vger.kernel.org>. Maybe also cc: to Mike Anderson
> <andmike@us.ibm.com>.
>
> Alan Stern
----- End forwarded message -----
florin
--
"NT is to UNIX what a doughnut is to a particle accelerator."
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread* Re: Fwd: Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-16 15:56 Fwd: Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 Florin Iucha @ 2003-05-16 16:22 ` Mike Anderson 2003-05-16 17:25 ` Florin Iucha 0 siblings, 1 reply; 35+ messages in thread From: Mike Anderson @ 2003-05-16 16:22 UTC (permalink / raw) To: Florin Iucha; +Cc: linux-scsi This might be a case that the scsi_error handler never reclaimed or canceled the command (i.e., this is a lost command) and probably should not update the fields. Could you get me a little more data on what the error handler actions where by doing the following: 1.) Ensure SCSI logging is enabled in your config. CONFIG_SCSI_LOGGING=y. 2.) Prior to the mount of the Compact Flash card. echo "scsi log error 4" > /proc/scsi/scsi If this is getting set dmesg should give a line that scsi logging level set to ... 3.) Mount the Compact Flash card and send me the output when it fails. Thanks, Florin Iucha [florin@iucha.net] wrote: > ----- Forwarded message from Alan Stern <stern@rowland.harvard.edu> ----- > > > Subject: Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 > > From: Alan Stern <stern@rowland.harvard.edu> > > To: Florin Iucha <florin@iucha.net> > > Cc: USB development list <linux-usb-devel@lists.sourceforge.net> > > Date: Fri, 16 May 2003 11:49:00 -0400 (EDT) > > > > On Thu, 15 May 2003, Florin Iucha wrote: > > > > > Hello, > > > > > > I get the following oops in 2.5.69-bk8 when I try to mount a Compact > > > Flash card in a SanDisk adapter: > > > > > > usb 2-1: USB disconnect, address 2 > > > hub 2-0:0: debounce: port 1: delay 100ms stable 4 status 0x301 > > > hub 2-0:0: new USB device on port 1, assigned address 3 > > > input: USB HID v1.00 Mouse [Microsoft Microsoft IntelliMouse? Explorer] on usb-00:02.3-1 > > > usb 2-1: USB disconnect, address 3 > > > drivers/usb/input/hid-core.c: can't resubmit intr, 00:02.3-1/input0, status -19 > > > hub 2-0:0: debounce: port 1: delay 100ms stable 4 status 0x301 > > > hub 2-0:0: new USB device on port 1, assigned address 4 > > > input: USB HID v1.00 Mouse [Microsoft Microsoft IntelliMouse? Explorer] on usb-00:02.3-1 > > > SCSI device sda: 31232 512-byte hdwr sectors (16 MB) > > > sda: Write Protect is off > > > sda: Mode Sense: 02 00 00 00 > > > drivers/usb/core/message.c: usb_control/bulk_msg: timeout > > > drivers/usb/core/message.c: usb_control/bulk_msg: timeout > > > drivers/usb/core/hub.c: USB device not accepting new address (error=-22) > > > scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 0 lun 0 > > > usb 5-2: USB disconnect, address 2 > > > Unable to handle kernel paging request at virtual address 544150a8 > > > printing eip: > > > c02abc39 > > > *pde = 00000000 > > > Oops: 0000 [#1] > > > CPU: 0 > > > EIP: 0060:[<c02abc39>] Not tainted > > > EFLAGS: 00010286 > > > EIP is at scsi_eh_finish_cmd+0x19/0x60 > > > eax: ef19d000 ebx: eef28080 ecx: eef22140 edx: 54415056 > > > esi: eef21fa8 edi: eef21fb0 ebp: eef21fa8 esp: eef21f60 > > > ds: 007b es: 007b ss: 0068 > > > Process scsi_eh_1 (pid: 118, threadinfo=eef20000 task=eef23340) > > > Stack: eef28080 eef28080 eef21fb0 c02ac418 eef28080 eef21fa8 00000000 00000000 > > > 00000000 ef62f428 eef20000 00000282 eef21fa8 c02ac8d6 eef21fb0 eef21fa8 > > > eef21fa8 eef21fb0 eef21fa8 eef21fa8 eef28098 eef28098 ef62f400 eef21fd4 > > > Call Trace: > > > [<c02ac418>] scsi_eh_offline_sdevs+0x68/0x80 > > > [<c02ac8d6>] scsi_unjam_host+0xc6/0xd0 > > > [<c02ac9b1>] scsi_error_handler+0xd1/0x120 > > > [<c02ac8e0>] scsi_error_handler+0x0/0x120 > > > [<c0108abd>] kernel_thread_helper+0x5/0x18 > > > > > > Code: 0f b7 42 52 48 66 89 42 52 c7 43 24 00 00 00 00 66 c7 43 08 > > > > > > I cannot tell if it is a regression since I just got the CF and the > > > adapter. > > > > This is a problem in the SCSI error handler. You should post your > > question to <linux-scsi@vger.kernel.org>. Maybe also cc: to Mike Anderson > > <andmike@us.ibm.com>. > > > > Alan Stern > > ----- End forwarded message ----- > > florin > > -- > > "NT is to UNIX what a doughnut is to a particle accelerator." -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 2003-05-16 16:22 ` Mike Anderson @ 2003-05-16 17:25 ` Florin Iucha 0 siblings, 0 replies; 35+ messages in thread From: Florin Iucha @ 2003-05-16 17:25 UTC (permalink / raw) To: Mike Anderson; +Cc: linux-scsi [-- Attachment #1: Type: text/plain, Size: 3048 bytes --] On Fri, May 16, 2003 at 09:22:50AM -0700, Mike Anderson wrote: > This might be a case that the scsi_error handler never reclaimed or > canceled the command (i.e., this is a lost command) and probably should > not update the fields. > > Could you get me a little more data on what the error handler actions > where by doing the following: > 1.) Ensure SCSI logging is enabled in your config. > CONFIG_SCSI_LOGGING=y. > > 2.) Prior to the mount of the Compact Flash card. > echo "scsi log error 4" > /proc/scsi/scsi > > If this is getting set dmesg should give a line that scsi > logging level set to ... > > 3.) Mount the Compact Flash card and send me the output when it > fails. Here it is: scsi logging level set to 0x00000004 SCSI device sda: 31232 512-byte hdwr sectors (16 MB) sda: Write Protect is off sda: Mode Sense: 02 00 00 00 drivers/usb/core/message.c: usb_control/bulk_msg: timeout Error handler scsi_eh_1 waking up scsi_eh_prt_fail_stats: 1:0:0:0 cmds failed: 0, cancel: 1 Total of 1 commands on 1 devices require eh work scsi_eh_1: aborting cmd:0xedac3080 scsi_eh_times_out: scmd:edac3080 scsi_send_eh_cmnd: scmd: edac3080, rtn:2003 scsi_eh_tur: scmd edac3080 rtn 2003 scsi_eh_1: Sending BDR sdev: 0xee133c00 drivers/usb/core/message.c: usb_control/bulk_msg: timeout scsi_eh_1: BDR failed sdev:0xee133c00 scsi_eh_1: Sending BRST chan: 0 scsi_try_bus_reset: Snd Bus RST drivers/usb/core/hub.c: USB device not accepting new address (error=-22) scsi_eh_1: BRST failed chan: 0 scsi_eh_1: Sending HRST scsi_try_host_reset: Snd Host RST scsi_eh_1: HRST failed scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 0 lun 0 scsi_eh_1: flush finish cmd: edac3080 usb 5-2: USB disconnect, address 2 Unable to handle kernel paging request at virtual address 544150a6 printing eip: c02aa5e8 *pde = 00000000 Oops: 0000 [#1] CPU: 0 EIP: 0060:[<c02aa5e8>] Not tainted EFLAGS: 00010002 EIP is at scsi_host_busy_dec_and_test+0x18/0x8a eax: ed9f8000 ebx: 00000286 ecx: 54415056 edx: ed9f8000 esi: edac3080 edi: 54415056 ebp: ed9f8000 esp: ed9f9f24 ds: 007b es: 007b ss: 0068 Process scsi_eh_1 (pid: 604, threadinfo=ed9f8000 task=ee15d840) Stack: c036db3b ee133c00 c02a93fc 54415056 ee133c00 edac3080 edac3080 ed9f9fa8 ed9f9fa8 c02ad2bb edac3080 ee15db26 edac3080 ed9f8000 edac3098 ee8a0008 ee133c00 ed9f9fb0 c02ad49c ed9f9fa8 ed9f9fa8 ed9f9fa8 00000000 00000000 Call Trace: [<c02a93fc>] scsi_finish_command+0x1c/0x110 [<c02ad2bb>] scsi_eh_flush_done_q+0x5b/0xe0 [<c02ad49c>] scsi_unjam_host+0x15c/0x230 [<c02ad6af>] scsi_error_handler+0x13f/0x1b0 [<c02ad570>] scsi_error_handler+0x0/0x1b0 [<c0108abd>] kernel_thread_helper+0x5/0x18 Code: 0f b7 41 50 48 f6 81 92 00 00 00 01 66 89 41 50 74 16 0f b7 <6>note: scsi_eh_1[604] exited with preempt_count 1 Thank you, florin PS: this trace was obtained on 2.5.69-bk11. -- "NT is to UNIX what a doughnut is to a particle accelerator." [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2003-06-15 12:47 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20030516190414.GD11884@iucha.net>
2003-05-16 21:23 ` [linux-usb-devel] USB Storage oops in 2.5.69-bk8 Alan Stern
2003-05-16 21:44 ` Florin Iucha
2003-05-16 22:29 ` Mike Anderson
2003-05-17 15:29 ` Alan Stern
2003-05-17 16:33 ` David Brownell
2003-05-18 18:31 ` Alan Stern
2003-05-18 23:46 ` David Brownell
2003-05-21 15:19 ` Bug in hot-unplugging for SCSI CD-ROM Alan Stern
2003-05-17 18:38 ` [linux-usb-devel] USB Storage oops in 2.5.69-bk8 Patrick Mansfield
2003-05-31 14:35 ` Florin Iucha
2003-05-20 14:11 ` Patch: change the serial_number for error-handler commands Alan Stern
2003-05-20 21:25 ` Luben Tuikov
2003-05-21 1:19 ` Alan Stern
2003-05-21 18:03 ` Mike Anderson
2003-05-21 18:50 ` Luben Tuikov
2003-05-21 19:18 ` Luben Tuikov
2003-05-21 20:28 ` Mike Anderson
2003-05-21 21:11 ` Luben Tuikov
2003-05-21 23:15 ` Patrick Mansfield
2003-05-22 5:47 ` Luben Tuikov
2003-05-21 19:57 ` Alan Stern
2003-05-21 20:42 ` Luben Tuikov
2003-05-21 21:05 ` Alan Stern
2003-05-21 21:19 ` James Bottomley
2003-05-21 22:53 ` Mike Anderson
2003-06-11 17:41 ` PATCH: (as33) Remove /proc/scsi directory in scsi_remove_host() Alan Stern
2003-06-11 18:23 ` Mike Anderson
2003-06-12 6:04 ` Christoph Hellwig
2003-06-12 6:51 ` Mike Anderson
2003-06-12 21:00 ` PATCH: (as33b) " Alan Stern
2003-06-12 21:58 ` Mike Anderson
2003-06-13 14:38 ` Alan Stern
2003-06-15 13:01 ` Christoph Hellwig
2003-05-21 19:24 ` Patch: change the serial_number for error-handler commands Luben Tuikov
2003-05-16 15:56 Fwd: Re: [linux-usb-devel] USB Storage oops in 2.5.69-bk8 Florin Iucha
2003-05-16 16:22 ` Mike Anderson
2003-05-16 17:25 ` Florin Iucha
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox