linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* oops on usb storage device disconnect with 2.6.14-rc1
@ 2005-09-15 19:03 Greg KH
  2005-09-15 19:23 ` [linux-usb-devel] " Alan Stern
  0 siblings, 1 reply; 23+ messages in thread
From: Greg KH @ 2005-09-15 19:03 UTC (permalink / raw)
  To: linux-usb-devel, linux-scsi; +Cc: mdharm-usb

Here's a nice oops I get when removing a usb-storage device with
2.6.14-rc1.  I've also created a bug for this for those who like to
track bugzilla:
	http://bugzilla.kernel.org/show_bug.cgi?id=5265

Any ideas?

thanks,

greg k-h

-----------------------

hub 1-3:1.0: state 5 ports 4 chg 0000 evt 0010
hub 1-3:1.0: port 4, status 0100, change 0001, 12 Mb/s
usb 1-3.4: USB disconnect, address 6
usb 1-3.4: usb_disable_device nuking all URBs
usb 1-3.4: unregistering interface 1-3.4:1.0
bus usb: remove device 1-3.4:1.0
CLASS: Unregistering class device. ID = '2:0:0:0'
CLASS: Unregistering class device. ID = 'sg1'
kobject_hotplug
fill_kobj_path: path = '/class/scsi_generic/sg1'
class_hotplug - name = sg1
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb1/1-3/1-3.4/1-3.4:1.0/host2/target2:0:0/2:0:0:0'
kobject_hotplug: /sbin/udevsend scsi_generic seq=995 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=remove DEVPATH=/class/scsi_generic/sg1 SUBSYSTEM=scsi_generic
kobject sg1: cleaning up
device class 'sg1': release.
kobject <NULL>: cleaning up
kobject <NULL>: cleaning up
kobject_hotplug
fill_kobj_path: path = '/class/scsi_device/2:0:0:0'
class_hotplug - name = 2:0:0:0
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb1/1-3/1-3.4/1-3.4:1.0/host2/target2:0:0/2:0:0:0'
kobject_hotplug: /sbin/udevsend scsi_device seq=996 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=remove DEVPATH=/class/scsi_device/2:0:0:0 SUBSYSTEM=scsi_device
kobject 2:0:0:0: cleaning up
device class '2:0:0:0': release.
bus scsi: remove device 2:0:0:0
kobject sdb1: unregistering
kobject_hotplug
fill_kobj_path: path = '/block/sdb/sdb1'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb1/1-3/1-3.4/1-3.4:1.0/host2/target2:0:0/2:0:0:0'
kobject_hotplug: /sbin/udevsend block seq=997 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=remove DEVPATH=/block/sdb/sdb1 SUBSYSTEM=block
kobject sdb1: cleaning up
kobject iosched: unregistering
kobject iosched: cleaning up
kobject queue: unregistering
kobject queue: cleaning up
kobject_hotplug
fill_kobj_path: path = '/block/sdb'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb1/1-3/1-3.4/1-3.4:1.0/host2/target2:0:0/2:0:0:0'
kobject_hotplug: /sbin/udevsend block seq=998 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=remove DEVPATH=/block/sdb SUBSYSTEM=block
kobject sdb: cleaning up
PM: Removing info for scsi:2:0:0:0
kobject_hotplug
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb1/1-3/1-3.4/1-3.4:1.0/host2/target2:0:0/2:0:0:0'
kobject_hotplug: /sbin/udevsend scsi seq=999 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=remove DEVPATH=/devices/pci0000:00/0000:00:1d.7/usb1/1-3/1-3.4/1-3.4:1.0/host2/target2:0:0/2:0:0:0 SUBSYSTEM=scsi
kobject 2:0:0:0: cleaning up
Unable to handle kernel paging request at virtual address 6b6b6b9b
 printing eip:
c028f8cc
*pde = 00000000
Oops: 0002 [#1]
SMP 
Modules linked in: ub usb_storage usbhid uhci_hcd ehci_hcd usbcore
CPU:    1
EIP:    0060:[<c028f8cc>]    Not tainted VLI
EFLAGS: 00010286   (2.6.14-rc1-g1619cca2) 
EIP is at scsi_remove_device+0x39/0x4f
eax: 00000001   ebx: f711b2ec   ecx: 00000000   edx: 6b6b6b6b
esi: f7a28e60   edi: f7a28e58   ebp: f6f480bc   esp: f7bf7dbc
ds: 007b   es: 007b   ss: 0068
Process khubd (pid: 1292, threadinfo=f7bf6000 task=f7ac0030)
Stack: f711b4f8 00000066 f711b2ec f7a28e60 c028f963 f711b2ec f7a28e64 f7a28e60 
       f7a28e64 f7a28e68 c028e739 f6f480bc f7a28e60 f8aaab00 f8aaab20 f7a29380 
       c02869c2 f7a28e60 00000003 f7a290dc f8aaab00 f8a9bcc0 f7a28e60 f79ee2c8 
Call Trace:
 [<c028f963>] __scsi_remove_target+0x81/0xa9
 [<c028e739>] scsi_forget_host+0x3b/0x61
 [<c02869c2>] scsi_remove_host+0x49/0x93
 [<f8a9bcc0>] storage_disconnect+0x19/0x26 [usb_storage]
 [<f88351c3>] usb_unbind_interface+0x7e/0x80 [usbcore]
 [<f8835145>] usb_unbind_interface+0x0/0x80 [usbcore]
 [<c025e52d>] __device_release_driver+0x9a/0xbc
 [<c025e580>] device_release_driver+0x31/0x43
 [<c025dbbb>] bus_remove_device+0x96/0xb0
 [<c025ca16>] device_del+0x2c/0x6e
 [<f883f21d>] usb_disable_device+0x12b/0x1a0 [usbcore]
 [<f8838006>] usb_disconnect+0xcc/0x1c5 [usbcore]
 [<f883a394>] hub_port_connect_change+0x39c/0x53c [usbcore]
 [<f883a92c>] hub_events+0x3f8/0x622 [usbcore]
 [<f883ab6f>] hub_thread+0x19/0x101 [usbcore]
 [<c013630a>] autoremove_wake_function+0x0/0x57
 [<c013630a>] autoremove_wake_function+0x0/0x57
 [<f883ab56>] hub_thread+0x0/0x101 [usbcore]
 [<c0135e54>] kthread+0xba/0xf0
 [<c0135d9a>] kthread+0x0/0xf0
 [<c0101469>] kernel_thread_helper+0x5/0xb
Code: 14 8b 33 c7 44 24 04 66 00 00 00 c7 04 24 48 57 34 c0 e8 30 e3 e8 ff f0 ff 4e 30 0f 88 20 04 00 00 89 1c 24 e8 36 ff ff ff 8b 13 <f0> ff 42 30 0f 8e 19 04 00 00 8b 5c 24 08 8b 74 24 0c 83 c4 10 
 


-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 19:03 oops on usb storage device disconnect with 2.6.14-rc1 Greg KH
@ 2005-09-15 19:23 ` Alan Stern
  2005-09-15 19:29   ` Greg KH
  0 siblings, 1 reply; 23+ messages in thread
From: Alan Stern @ 2005-09-15 19:23 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-usb-devel, linux-scsi, mdharm-usb

On Thu, 15 Sep 2005, Greg KH wrote:

> Here's a nice oops I get when removing a usb-storage device with
> 2.6.14-rc1.  I've also created a bug for this for those who like to
> track bugzilla:
> 	http://bugzilla.kernel.org/show_bug.cgi?id=5265
> 
> Any ideas?

Patches for this bug plus two others attached to the bug report.

Alan Stern


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 19:23 ` [linux-usb-devel] " Alan Stern
@ 2005-09-15 19:29   ` Greg KH
  2005-09-15 19:57     ` James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: Greg KH @ 2005-09-15 19:29 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-usb-devel, linux-scsi, mdharm-usb

On Thu, Sep 15, 2005 at 03:23:11PM -0400, Alan Stern wrote:
> On Thu, 15 Sep 2005, Greg KH wrote:
> 
> > Here's a nice oops I get when removing a usb-storage device with
> > 2.6.14-rc1.  I've also created a bug for this for those who like to
> > track bugzilla:
> > 	http://bugzilla.kernel.org/show_bug.cgi?id=5265
> > 
> > Any ideas?
> 
> Patches for this bug plus two others attached to the bug report.

Ah, thanks for the pointer.  SCSI developers, any objections to these
patches being merged?

greg k-h

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 19:29   ` Greg KH
@ 2005-09-15 19:57     ` James Bottomley
  2005-09-15 21:08       ` [linux-usb-devel] " James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2005-09-15 19:57 UTC (permalink / raw)
  To: Greg KH; +Cc: Alan Stern, linux-usb-devel, SCSI Mailing List, mdharm-usb

On Thu, 2005-09-15 at 12:29 -0700, Greg KH wrote:
> Ah, thanks for the pointer.  SCSI developers, any objections to these
> patches being merged?

I haven't had time to review the eh changes, but I was going to reply to
the other one (basically there's a better way to try to close the device
add/host remove race using the host state model).

Let me complete the SCSI process and I'll take them through the scsi-rc-
fixes tree.

James




-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 19:57     ` James Bottomley
@ 2005-09-15 21:08       ` James Bottomley
  2005-09-15 22:19         ` Mike Anderson
  2005-09-15 23:46         ` [linux-usb-devel] " Greg KH
  0 siblings, 2 replies; 23+ messages in thread
From: James Bottomley @ 2005-09-15 21:08 UTC (permalink / raw)
  To: Greg KH; +Cc: Alan Stern, linux-usb-devel, SCSI Mailing List, mdharm-usb

On Thu, 2005-09-15 at 15:57 -0400, James Bottomley wrote:
> I haven't had time to review the eh changes, but I was going to reply to
> the other one (basically there's a better way to try to close the device
> add/host remove race using the host state model).
> 
> Let me complete the SCSI process and I'll take them through the scsi-rc-
> fixes tree.

Well, I think the symptoms are racing scsi_remove_host() calls and the
solution is to enforce the state model on removal (as in if the host is
already in the remove state, don't try to remove it again).

Could you try the patch here:

http://marc.theaimsgroup.com/?l=linux-scsi&m=112613077011571

And see if it will fix the problem?

James



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 21:08       ` [linux-usb-devel] " James Bottomley
@ 2005-09-15 22:19         ` Mike Anderson
  2005-09-15 22:38           ` [linux-usb-devel] " James Bottomley
  2005-09-15 23:46         ` [linux-usb-devel] " Greg KH
  1 sibling, 1 reply; 23+ messages in thread
From: Mike Anderson @ 2005-09-15 22:19 UTC (permalink / raw)
  To: James Bottomley
  Cc: Greg KH, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

James Bottomley <James.Bottomley@SteelEye.com> wrote:
> On Thu, 2005-09-15 at 15:57 -0400, James Bottomley wrote:
> > I haven't had time to review the eh changes, but I was going to reply to
> > the other one (basically there's a better way to try to close the device
> > add/host remove race using the host state model).
> > 
> > Let me complete the SCSI process and I'll take them through the scsi-rc-
> > fixes tree.
> 
> Well, I think the symptoms are racing scsi_remove_host() calls and the
> solution is to enforce the state model on removal (as in if the host is
> already in the remove state, don't try to remove it again).
> 
> Could you try the patch here:
> 
> http://marc.theaimsgroup.com/?l=linux-scsi&m=112613077011571
> 
> And see if it will fix the problem?
> 

A side effect of not applying Alan's previous patch that added
SHOST_RECOVERY to the SHOST_CANCEL: state is that we will not move to the
SHOST_CANCEL and subsequently not to SHOST_DEL state if the eh is active
during the start of scsi_remove_host. I sent mail on the 7th indicating to
include that state change hunk of the diff, but I guess that overlapped
with your newer state changes.
http://marc.theaimsgroup.com/?l=linux-scsi&m=112238726326927&w=2

In looking at the state model introduced by your patch I believe there may
still be a state model race issue if the recovery completes just after
the "if (!scsi_host_set_state(shost, SHOST_CANCEL))" call in
scsi_remove_host (maybe I am just looking to quickly at the state
updates).

I still do not understand as I asked in a previous comment why we are not
shutting down the eh_thread in scsi_remove_host and also why simpler state
model updates could not solve the problem.

I believe I also indicated that we could enhance scsi_error to shutdown
faster during this state which should only be a performance improvement.

-andmike
--
Michael Anderson
andmike@us.ibm.com


-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 22:19         ` Mike Anderson
@ 2005-09-15 22:38           ` James Bottomley
  2005-09-15 23:55             ` Mike Anderson
                               ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: James Bottomley @ 2005-09-15 22:38 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Greg KH, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Thu, 2005-09-15 at 15:19 -0700, Mike Anderson wrote:
> A side effect of not applying Alan's previous patch that added
> SHOST_RECOVERY to the SHOST_CANCEL: state is that we will not move to the
> SHOST_CANCEL and subsequently not to SHOST_DEL state if the eh is active
> during the start of scsi_remove_host. I sent mail on the 7th indicating to
> include that state change hunk of the diff, but I guess that overlapped
> with your newer state changes.
> http://marc.theaimsgroup.com/?l=linux-scsi&m=112238726326927&w=2

Yes, but that's not really legitimate since it introduces a bifurcation
in the state machine ... when the eh terminates it will come back to
running even if it went in from cancel.

> In looking at the state model introduced by your patch I believe there may
> still be a state model race issue if the recovery completes just after
> the "if (!scsi_host_set_state(shost, SHOST_CANCEL))" call in
> scsi_remove_host (maybe I am just looking to quickly at the state
> updates).

No, that's true; there's a tiny race that can be mediated by doing
locking around the state changes ... that was one of the feedback
comments from Alan.

> I still do not understand as I asked in a previous comment why we are not
> shutting down the eh_thread in scsi_remove_host and also why simpler state
> model updates could not solve the problem.

Well, it goes back to whether we wait for the thread or not.  To shut
the thread down, we also need to wait for it to complete.

As far as the state model goes, we either need to wait for the eh thread
before transitioning to cancel or introduce the extra states that
reflect the parallel eh transitions.

> I believe I also indicated that we could enhance scsi_error to shutdown
> faster during this state which should only be a performance improvement.

Yes, we could ... patches?

James



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 21:08       ` [linux-usb-devel] " James Bottomley
  2005-09-15 22:19         ` Mike Anderson
@ 2005-09-15 23:46         ` Greg KH
  2005-09-16  1:57           ` Alan Stern
  1 sibling, 1 reply; 23+ messages in thread
From: Greg KH @ 2005-09-15 23:46 UTC (permalink / raw)
  To: James Bottomley
  Cc: Alan Stern, linux-usb-devel, SCSI Mailing List, mdharm-usb

On Thu, Sep 15, 2005 at 05:08:03PM -0400, James Bottomley wrote:
> On Thu, 2005-09-15 at 15:57 -0400, James Bottomley wrote:
> > I haven't had time to review the eh changes, but I was going to reply to
> > the other one (basically there's a better way to try to close the device
> > add/host remove race using the host state model).
> > 
> > Let me complete the SCSI process and I'll take them through the scsi-rc-
> > fixes tree.
> 
> Well, I think the symptoms are racing scsi_remove_host() calls and the
> solution is to enforce the state model on removal (as in if the host is
> already in the remove state, don't try to remove it again).
> 
> Could you try the patch here:
> 
> http://marc.theaimsgroup.com/?l=linux-scsi&m=112613077011571
> 
> And see if it will fix the problem?

That helped in that there is no more kernel oops.  But the
/sys/block/sdb device (and partitions) do not get removed from the
kernel.  So the next time I plug the device in, it gets bumped to "sdc".
I can provide kernel logs of the unplug event if you wish.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 22:38           ` [linux-usb-devel] " James Bottomley
@ 2005-09-15 23:55             ` Mike Anderson
  2005-09-16  1:46               ` James Bottomley
  2005-09-16  1:52             ` [linux-usb-devel] " Alan Stern
  2005-09-18  0:35             ` James Bottomley
  2 siblings, 1 reply; 23+ messages in thread
From: Mike Anderson @ 2005-09-15 23:55 UTC (permalink / raw)
  To: James Bottomley
  Cc: Greg KH, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

James Bottomley <James.Bottomley@SteelEye.com> wrote:
> On Thu, 2005-09-15 at 15:19 -0700, Mike Anderson wrote:
> > A side effect of not applying Alan's previous patch that added
> > SHOST_RECOVERY to the SHOST_CANCEL: state is that we will not move to the
> > SHOST_CANCEL and subsequently not to SHOST_DEL state if the eh is active
> > during the start of scsi_remove_host. I sent mail on the 7th indicating to
> > include that state change hunk of the diff, but I guess that overlapped
> > with your newer state changes.
> > http://marc.theaimsgroup.com/?l=linux-scsi&m=112238726326927&w=2
> 
> Yes, but that's not really legitimate since it introduces a bifurcation
> in the state machine ... when the eh terminates it will come back to
> running even if it went in from cancel.

Clarification here as I do not see the split in the state machine or the
transition back to running from cancel. If the above patch is applied we
can transition to cancel from recovery if eh already has started. When eh
is complete scsi_restart_operations transition to running will fail as we
are in the cancel state. 

That said I like the idea below of waiting / terminating the eh thread
prior to transitioning to cancel. There is some introduction of asymmetry
here in scsi_remove_host as the eh thread is created in scsi_host_alloc,
but possibly later patches could move the eh creation to scsi_add_host
(unless I forgot the reason it needed to be earlier).

> 
> > In looking at the state model introduced by your patch I believe there may
> > still be a state model race issue if the recovery completes just after
> > the "if (!scsi_host_set_state(shost, SHOST_CANCEL))" call in
> > scsi_remove_host (maybe I am just looking to quickly at the state
> > updates).
> 
> No, that's true; there's a tiny race that can be mediated by doing
> locking around the state changes ... that was one of the feedback
> comments from Alan.

ok.

> 
> > I still do not understand as I asked in a previous comment why we are not
> > shutting down the eh_thread in scsi_remove_host and also why simpler state
> > model updates could not solve the problem.
> 
> Well, it goes back to whether we wait for the thread or not.  To shut
> the thread down, we also need to wait for it to complete.
> 
> As far as the state model goes, we either need to wait for the eh thread
> before transitioning to cancel or introduce the extra states that
> reflect the parallel eh transitions.

I like waiting for the eh thread to complete / shutdown prior to
transitioning to cancel.

> 
> > I believe I also indicated that we could enhance scsi_error to shutdown
> > faster during this state which should only be a performance improvement.
> 
> Yes, we could ... patches?


ok, I will try, but IBM non-coding activities has made me very
unproductive in the patch department lately :-).

-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 23:55             ` Mike Anderson
@ 2005-09-16  1:46               ` James Bottomley
  0 siblings, 0 replies; 23+ messages in thread
From: James Bottomley @ 2005-09-16  1:46 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Greg KH, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Thu, 2005-09-15 at 16:55 -0700, Mike Anderson wrote:
> James Bottomley <James.Bottomley@SteelEye.com> wrote:
> > On Thu, 2005-09-15 at 15:19 -0700, Mike Anderson wrote:
> > > A side effect of not applying Alan's previous patch that added
> > > SHOST_RECOVERY to the SHOST_CANCEL: state is that we will not move to the
> > > SHOST_CANCEL and subsequently not to SHOST_DEL state if the eh is active
> > > during the start of scsi_remove_host. I sent mail on the 7th indicating to
> > > include that state change hunk of the diff, but I guess that overlapped
> > > with your newer state changes.
> > > http://marc.theaimsgroup.com/?l=linux-scsi&m=112238726326927&w=2
> > 
> > Yes, but that's not really legitimate since it introduces a bifurcation
> > in the state machine ... when the eh terminates it will come back to
> > running even if it went in from cancel.
> 
> Clarification here as I do not see the split in the state machine or the
> transition back to running from cancel. If the above patch is applied we
> can transition to cancel from recovery if eh already has started. When eh
> is complete scsi_restart_operations transition to running will fail as we
> are in the cancel state. 

OK, the current model has 5 states, three of which are relevant:
RUNNING, RECOVER and CANCEL.  The problem is that if the error handler
is operating when someone calls scsi_remove_host() we're actually
simultaneously in the RECOVER and CANCEL states.  This is the
bifurcation.  Currently the model forbids the transition RECOVER-
>CANCEL, so we wrongly stay in the recover state (and even worse
transition back to RUNNING if the eh thread finishes).  The patch would
permit this transition so we'd now wrongly go into the CANCEL state
instead (in spite of the error handler still being active---meaning that
the not in recovery checks all wrongly pass).

The solution is either to update our model to reflect reality:  adding
the parallel CANCEL_RECOVER (and DEL_RECOVER) states that allow our
model to reflect simultaneous cancel and recovery.  Or we update reality
to reflect our model: make scsi_remove_host() wait for the recovery
thread to finish if it's active and never let it start once we
transition to CANCEL.

> That said I like the idea below of waiting / terminating the eh thread
> prior to transitioning to cancel. There is some introduction of asymmetry
> here in scsi_remove_host as the eh thread is created in scsi_host_alloc,
> but possibly later patches could move the eh creation to scsi_add_host
> (unless I forgot the reason it needed to be earlier).

That's the "update reality to reflect our model" option.

> > > I still do not understand as I asked in a previous comment why we are not
> > > shutting down the eh_thread in scsi_remove_host and also why simpler state
> > > model updates could not solve the problem.
> > 
> > Well, it goes back to whether we wait for the thread or not.  To shut
> > the thread down, we also need to wait for it to complete.
> > 
> > As far as the state model goes, we either need to wait for the eh thread
> > before transitioning to cancel or introduce the extra states that
> > reflect the parallel eh transitions.
> 
> I like waiting for the eh thread to complete / shutdown prior to
> transitioning to cancel.

Well ... I'm less keen on this.  Making one thing wait for another
introduces complex dependencies into the system.  Such complex
interactions always seem to come back to bite us one way or another.
Since we can model the system without introducing the dependency I'd
really rather do it that way.

James




-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 22:38           ` [linux-usb-devel] " James Bottomley
  2005-09-15 23:55             ` Mike Anderson
@ 2005-09-16  1:52             ` Alan Stern
  2005-09-16  2:27               ` James Bottomley
  2005-09-18  0:36               ` James Bottomley
  2005-09-18  0:35             ` James Bottomley
  2 siblings, 2 replies; 23+ messages in thread
From: Alan Stern @ 2005-09-16  1:52 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mike Anderson, Greg KH, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Thu, 15 Sep 2005, James Bottomley wrote:

> Well, I think the symptoms are racing scsi_remove_host() calls and the
> solution is to enforce the state model on removal (as in if the host is
> already in the remove state, don't try to remove it again).

You're forgetting something: Devices can be removed at any time, no matter
what state the host is in.  It's even possible for thread A to be removing
a device while thread B is removing the host.  The A thread will interfere
with the B thread, because the "list_for_each_entry_safe" loops in
scsi_forget_host and __scsi_remove_target _aren't_ safe against other
threads removing devices.  Also remember that the list pointers get
changed when a device is _released_, which can occur quite some time after
it is _removed_.

In short, those iterations must be carried out as in my patch.


> > I still do not understand as I asked in a previous comment why we are not
> > shutting down the eh_thread in scsi_remove_host and also why simpler state
> > model updates could not solve the problem.
> 
> Well, it goes back to whether we wait for the thread or not.  To shut
> the thread down, we also need to wait for it to complete.
> 
> As far as the state model goes, we either need to wait for the eh thread
> before transitioning to cancel or introduce the extra states that
> reflect the parallel eh transitions.

Agreed, one or the other of these is needed.  I think the parallel-states 
approach offers a little more flexibility.

> > I believe I also indicated that we could enhance scsi_error to shutdown
> > faster during this state which should only be a performance improvement.
> 
> Yes, we could ... patches?

During the DEL_RECOVERY state you could maybe make the error handler 
faster by removing the settle delays.  I'm not sure what else you can do 
-- all attempts at submitting commands will quickly fail.

During CANCEL_RECOVERY you may not want to speed up the error handler.  
In this state it needs to function normally, since there might be errors
during the cache-flush sequence or similar things (for non-forced
removal).


Finally, having said all that, it turns out there's still a bug in the 
2.6.14-rc1 code.  I feel pretty stupid about it -- you'll understand why 
when you read the patch below.

Alan Stern



Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

--- a/drivers/scsi/scsi_sysfs.c	Mon Sep 12 23:12:09 2005
+++ b/drivers/scsi/scsi_sysfs.c	Thu Sep 15 21:32:27 2005
@@ -707,9 +707,11 @@
  **/
 void scsi_remove_device(struct scsi_device *sdev)
 {
-	down(&sdev->host->scan_mutex);
+	struct Scsi_Host *shost = sdev->host;
+
+	down(&shost->scan_mutex);
 	__scsi_remove_device(sdev);
-	up(&sdev->host->scan_mutex);
+	up(&shost->scan_mutex);
 }
 EXPORT_SYMBOL(scsi_remove_device);
 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 23:46         ` [linux-usb-devel] " Greg KH
@ 2005-09-16  1:57           ` Alan Stern
  0 siblings, 0 replies; 23+ messages in thread
From: Alan Stern @ 2005-09-16  1:57 UTC (permalink / raw)
  To: Greg KH; +Cc: James Bottomley, linux-usb-devel, SCSI Mailing List, mdharm-usb

On Thu, 15 Sep 2005, Greg KH wrote:

> On Thu, Sep 15, 2005 at 05:08:03PM -0400, James Bottomley wrote:
> > On Thu, 2005-09-15 at 15:57 -0400, James Bottomley wrote:
> > > I haven't had time to review the eh changes, but I was going to reply to
> > > the other one (basically there's a better way to try to close the device
> > > add/host remove race using the host state model).
> > > 
> > > Let me complete the SCSI process and I'll take them through the scsi-rc-
> > > fixes tree.
> > 
> > Well, I think the symptoms are racing scsi_remove_host() calls and the
> > solution is to enforce the state model on removal (as in if the host is
> > already in the remove state, don't try to remove it again).
> > 
> > Could you try the patch here:
> > 
> > http://marc.theaimsgroup.com/?l=linux-scsi&m=112613077011571
> > 
> > And see if it will fix the problem?
> 
> That helped in that there is no more kernel oops.  But the
> /sys/block/sdb device (and partitions) do not get removed from the
> kernel.  So the next time I plug the device in, it gets bumped to "sdc".
> I can provide kernel logs of the unplug event if you wish.

You still need to use the second patch in the Bugzilla entry, the one
that James hasn't had time to check over yet.  Without it, the SCSI error
handler thread never exits and the device removal sequence hangs.

Alan Stern


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-16  1:52             ` [linux-usb-devel] " Alan Stern
@ 2005-09-16  2:27               ` James Bottomley
  2005-09-18  0:36               ` James Bottomley
  1 sibling, 0 replies; 23+ messages in thread
From: James Bottomley @ 2005-09-16  2:27 UTC (permalink / raw)
  To: Alan Stern
  Cc: Mike Anderson, Greg KH, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Thu, 2005-09-15 at 21:52 -0400, Alan Stern wrote:
> You're forgetting something: Devices can be removed at any time, no matter
> what state the host is in.  It's even possible for thread A to be removing
> a device while thread B is removing the host.  The A thread will interfere
> with the B thread, because the "list_for_each_entry_safe" loops in
> scsi_forget_host and __scsi_remove_target _aren't_ safe against other
> threads removing devices.  Also remember that the list pointers get
> changed when a device is _released_, which can occur quite some time after
> it is _removed_.
> 
> In short, those iterations must be carried out as in my patch.

OK, I concede the point ... I'll put it in.

> Finally, having said all that, it turns out there's still a bug in the 
> 2.6.14-rc1 code.  I feel pretty stupid about it -- you'll understand why 
> when you read the patch below.

Heh, yes, got that one in too.

James




-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-15 22:38           ` [linux-usb-devel] " James Bottomley
  2005-09-15 23:55             ` Mike Anderson
  2005-09-16  1:52             ` [linux-usb-devel] " Alan Stern
@ 2005-09-18  0:35             ` James Bottomley
  2005-09-18 20:05               ` James Bottomley
  2 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2005-09-18  0:35 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Greg KH, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Thu, 2005-09-15 at 18:38 -0400, James Bottomley wrote:
> On Thu, 2005-09-15 at 15:19 -0700, Mike Anderson wrote:
> > In looking at the state model introduced by your patch I believe there may
> > still be a state model race issue if the recovery completes just after
> > the "if (!scsi_host_set_state(shost, SHOST_CANCEL))" call in
> > scsi_remove_host (maybe I am just looking to quickly at the state
> > updates).
> 
> No, that's true; there's a tiny race that can be mediated by doing
> locking around the state changes ... that was one of the feedback
> comments from Alan.

The attached should be that patch with the race window closed.

James

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -98,6 +98,7 @@ int scsi_host_set_state(struct Scsi_Host
 		switch (oldstate) {
 		case SHOST_CREATED:
 		case SHOST_RUNNING:
+		case SHOST_CANCEL_RECOVERY:
 			break;
 		default:
 			goto illegal;
@@ -107,12 +108,31 @@ int scsi_host_set_state(struct Scsi_Host
 	case SHOST_DEL:
 		switch (oldstate) {
 		case SHOST_CANCEL:
+		case SHOST_DEL_RECOVERY:
 			break;
 		default:
 			goto illegal;
 		}
 		break;
 
+	case SHOST_CANCEL_RECOVERY:
+		switch (oldstate) {
+		case SHOST_CANCEL:
+		case SHOST_RECOVERY:
+			break;
+		default:
+			goto illegal;
+		}
+		break;
+
+	case SHOST_DEL_RECOVERY:
+		switch (oldstate) {
+		case SHOST_CANCEL_RECOVERY:
+			break;
+		default:
+			goto illegal;
+		}
+		break;
 	}
 	shost->shost_state = state;
 	return 0;
@@ -134,13 +154,24 @@ EXPORT_SYMBOL(scsi_host_set_state);
  **/
 void scsi_remove_host(struct Scsi_Host *shost)
 {
+	unsigned long flags;
 	down(&shost->scan_mutex);
-	scsi_host_set_state(shost, SHOST_CANCEL);
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (!scsi_host_set_state(shost, SHOST_CANCEL))
+		if (!scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY)) {
+			spin_unlock_irqrestore(shost->host_lock, flags);
+			up(&shost->scan_mutex);
+			return;
+		}
+	spin_unlock_irqrestore(shost->host_lock, flags);
 	up(&shost->scan_mutex);
 	scsi_forget_host(shost);
 	scsi_proc_host_rm(shost);
 
-	scsi_host_set_state(shost, SHOST_DEL);
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (!scsi_host_set_state(shost, SHOST_DEL))
+		BUG_ON(!scsi_host_set_state(shost, SHOST_DEL_RECOVERY));
+	spin_unlock_irqrestore(shost->host_lock, flags);
 
 	transport_unregister_device(&shost->shost_gendev);
 	class_device_unregister(&shost->shost_classdev);
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -68,19 +68,24 @@ int scsi_eh_scmd_add(struct scsi_cmnd *s
 {
 	struct Scsi_Host *shost = scmd->device->host;
 	unsigned long flags;
+	int ret = 0;
 
 	if (shost->eh_wait == NULL)
 		return 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
+	if (!scsi_host_set_state(shost, SHOST_RECOVERY))
+		if (!scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY))
+			goto out_unlock;
 
+	ret = 1;
 	scmd->eh_eflags |= eh_flag;
 	list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
-	scsi_host_set_state(shost, SHOST_RECOVERY);
 	shost->host_failed++;
 	scsi_eh_wakeup(shost);
+ out_unlock:
 	spin_unlock_irqrestore(shost->host_lock, flags);
-	return 1;
+	return ret;
 }
 
 /**
@@ -176,8 +181,8 @@ void scsi_times_out(struct scsi_cmnd *sc
 		}
 
 	if (unlikely(!scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD))) {
-		panic("Error handler thread not present at %p %p %s %d",
-		      scmd, scmd->device->host, __FILE__, __LINE__);
+		scmd->result |= DID_TIME_OUT << 16;
+		__scsi_done(scmd);
 	}
 }
 
@@ -196,8 +201,7 @@ int scsi_block_when_processing_errors(st
 {
 	int online;
 
-	wait_event(sdev->host->host_wait, (sdev->host->shost_state !=
-					   SHOST_RECOVERY));
+	wait_event(sdev->host->host_wait, !scsi_host_in_recovery(sdev->host));
 
 	online = scsi_device_online(sdev);
 
@@ -1441,6 +1445,7 @@ static void scsi_eh_lock_door(struct scs
 static void scsi_restart_operations(struct Scsi_Host *shost)
 {
 	struct scsi_device *sdev;
+	unsigned long flags;
 
 	/*
 	 * If the door was locked, we need to insert a door lock request
@@ -1460,7 +1465,11 @@ static void scsi_restart_operations(stru
 	SCSI_LOG_ERROR_RECOVERY(3, printk("%s: waking up host to restart\n",
 					  __FUNCTION__));
 
-	scsi_host_set_state(shost, SHOST_RUNNING);
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (!scsi_host_set_state(shost, SHOST_RUNNING))
+		if (!scsi_host_set_state(shost, SHOST_CANCEL))
+			BUG_ON(!scsi_host_set_state(shost, SHOST_DEL));
+	spin_unlock_irqrestore(shost->host_lock, flags);
 
 	wake_up(&shost->host_wait);
 
diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c
--- a/drivers/scsi/scsi_ioctl.c
+++ b/drivers/scsi/scsi_ioctl.c
@@ -458,7 +458,7 @@ int scsi_nonblockable_ioctl(struct scsi_
 	 * error processing, as long as the device was opened
 	 * non-blocking */
 	if (filp && filp->f_flags & O_NONBLOCK) {
-		if (sdev->host->shost_state == SHOST_RECOVERY)
+		if (scsi_host_in_recovery(sdev->host))
 			return -ENODEV;
 	} else if (!scsi_block_when_processing_errors(sdev))
 		return -ENODEV;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -447,7 +447,7 @@ void scsi_device_unbusy(struct scsi_devi
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	shost->host_busy--;
-	if (unlikely((shost->shost_state == SHOST_RECOVERY) &&
+	if (unlikely(scsi_host_in_recovery(shost) &&
 		     shost->host_failed))
 		scsi_eh_wakeup(shost);
 	spin_unlock(shost->host_lock);
@@ -1339,7 +1339,7 @@ static inline int scsi_host_queue_ready(
 				   struct Scsi_Host *shost,
 				   struct scsi_device *sdev)
 {
-	if (shost->shost_state == SHOST_RECOVERY)
+	if (scsi_host_in_recovery(shost))
 		return 0;
 	if (shost->host_busy == 0 && shost->host_blocked) {
 		/*
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -57,6 +57,8 @@ static struct {
 	{ SHOST_CANCEL, "cancel" },
 	{ SHOST_DEL, "deleted" },
 	{ SHOST_RECOVERY, "recovery" },
+	{ SHOST_CANCEL_RECOVERY, "cancel/recovery" },
+	{ SHOST_DEL_RECOVERY, "deleted/recovery", },
 };
 const char *scsi_host_state_name(enum scsi_host_state state)
 {
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1027,7 +1027,7 @@ sg_ioctl(struct inode *inode, struct fil
 		if (sdp->detached)
 			return -ENODEV;
 		if (filp->f_flags & O_NONBLOCK) {
-			if (sdp->device->host->shost_state == SHOST_RECOVERY)
+			if (scsi_host_in_recovery(sdp->device->host))
 				return -EBUSY;
 		} else if (!scsi_block_when_processing_errors(sdp->device))
 			return -EBUSY;
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -439,6 +439,8 @@ enum scsi_host_state {
 	SHOST_CANCEL,
 	SHOST_DEL,
 	SHOST_RECOVERY,
+	SHOST_CANCEL_RECOVERY,
+	SHOST_DEL_RECOVERY,
 };
 
 struct Scsi_Host {
@@ -621,6 +623,13 @@ static inline struct Scsi_Host *dev_to_s
 	return container_of(dev, struct Scsi_Host, shost_gendev);
 }
 
+static inline int scsi_host_in_recovery(struct Scsi_Host *shost)
+{
+	return shost->shost_state == SHOST_RECOVERY ||
+		shost->shost_state == SHOST_CANCEL_RECOVERY ||
+		shost->shost_state == SHOST_DEL_RECOVERY;
+}
+
 extern int scsi_queue_work(struct Scsi_Host *, struct work_struct *);
 extern void scsi_flush_work(struct Scsi_Host *);
 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-16  1:52             ` [linux-usb-devel] " Alan Stern
  2005-09-16  2:27               ` James Bottomley
@ 2005-09-18  0:36               ` James Bottomley
  2005-09-18  2:33                 ` [linux-usb-devel] " Alan Stern
  1 sibling, 1 reply; 23+ messages in thread
From: James Bottomley @ 2005-09-18  0:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: Mike Anderson, Greg KH, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Thu, 2005-09-15 at 21:52 -0400, Alan Stern wrote:
> In short, those iterations must be carried out as in my patch.

OK, I looked at making this work while reaping the target correctly, but
I couldn't (basic problem is that the target list keeps the target until
it has no more devices, a condition that could be made untrue by
something as simple as an open of the sysfs file).

So, rather than try that, I thought a better approach might be to make
the host state model work for us.  i.e. if we know the host is being
removed, there's no point allowing target or device removal because at
some point the host removal will do it for us.  This enforcement would
ensure we're the only legitimate removers of the target and device.

The alternative is to migrate to klists, I think ...

James

diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -7,6 +7,7 @@
 struct request_queue;
 struct scsi_cmnd;
 struct scsi_device;
+struct scsi_target;
 struct scsi_host_template;
 struct scsi_request;
 struct Scsi_Host;
@@ -125,6 +126,7 @@ extern void scsi_sysfs_device_initialize
 extern int scsi_sysfs_target_initialize(struct scsi_device *);
 extern struct scsi_transport_template blank_transport_template;
 extern void __scsi_remove_device(struct scsi_device *);
+extern void __scsi_remove_target(struct scsi_target *);
 
 extern struct bus_type scsi_bus_type;
 
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1481,7 +1481,7 @@ void scsi_forget_host(struct Scsi_Host *
 	spin_lock_irqsave(shost->host_lock, flags);
 	list_for_each_entry_safe(starget, tmp, &shost->__targets, siblings) {
 		spin_unlock_irqrestore(shost->host_lock, flags);
-		scsi_remove_target(&starget->dev);
+		__scsi_remove_target(starget);
 		spin_lock_irqsave(shost->host_lock, flags);
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -710,7 +710,8 @@ void scsi_remove_device(struct scsi_devi
 	struct Scsi_Host *shost = sdev->host;
 
 	down(&shost->scan_mutex);
-	__scsi_remove_device(sdev);
+	if (scsi_host_scan_allowed(shost))
+		__scsi_remove_device(sdev);
 	up(&shost->scan_mutex);
 }
 EXPORT_SYMBOL(scsi_remove_device);
@@ -728,7 +729,7 @@ void __scsi_remove_target(struct scsi_ta
 		    sdev->id != starget->id)
 			continue;
 		spin_unlock_irqrestore(shost->host_lock, flags);
-		scsi_remove_device(sdev);
+		__scsi_remove_device(sdev);
 		spin_lock_irqsave(shost->host_lock, flags);
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
@@ -755,7 +756,16 @@ void scsi_remove_target(struct device *d
 	struct device *rdev;
 
 	if (scsi_is_target_device(dev)) {
+		struct Scsi_Host *shost = dev_to_shost(dev);
+
+		down(&shost->scan_mutex);
+		/* We deny target removal here if the host is being
+		 * deleted, since it will remove the target itself */
+		if (!scsi_host_scan_allowed(shost))
+			goto out;
 		__scsi_remove_target(to_scsi_target(dev));
+	out:
+		up(&shost->scan_mutex);
 		return;
 	}
 
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -654,7 +654,8 @@ static inline struct device *scsi_get_de
  **/
 static inline int scsi_host_scan_allowed(struct Scsi_Host *shost)
 {
-	return shost->shost_state == SHOST_RUNNING;
+	return shost->shost_state == SHOST_RUNNING ||
+		shost->shost_state == SHOST_RECOVERY;
 }
 
 extern void scsi_unblock_requests(struct Scsi_Host *);





-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-18  0:36               ` James Bottomley
@ 2005-09-18  2:33                 ` Alan Stern
  2005-09-18 20:00                   ` James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: Alan Stern @ 2005-09-18  2:33 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mike Anderson, Greg KH, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Sat, 17 Sep 2005, James Bottomley wrote:

> On Thu, 2005-09-15 at 21:52 -0400, Alan Stern wrote:
> > In short, those iterations must be carried out as in my patch.
> 
> OK, I looked at making this work while reaping the target correctly, but
> I couldn't (basic problem is that the target list keeps the target until
> it has no more devices, a condition that could be made untrue by
> something as simple as an open of the sysfs file).

Sorry, I'm a bit slow tonight.  I thought you _wanted_ the target to hang
around until it had no more devices.  (Wasn't that the whole point of
scsi_target_reap?)  What's wrong with keeping the target until the sysfs
file is closed?

> So, rather than try that, I thought a better approach might be to make
> the host state model work for us.  i.e. if we know the host is being
> removed, there's no point allowing target or device removal because at
> some point the host removal will do it for us.  This enforcement would
> ensure we're the only legitimate removers of the target and device.

How does this make things any better?  You still face the problem that an 
open sysfs file will delay device removal and target reaping, if all the 
devices on that target are hot-unplugged and the host remains alive and 
active.

> The alternative is to migrate to klists, I think ...

Think very carefully before taking such a step...  :-)

Alan Stern


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-18  2:33                 ` [linux-usb-devel] " Alan Stern
@ 2005-09-18 20:00                   ` James Bottomley
  0 siblings, 0 replies; 23+ messages in thread
From: James Bottomley @ 2005-09-18 20:00 UTC (permalink / raw)
  To: Alan Stern
  Cc: Mike Anderson, Greg KH, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Sat, 2005-09-17 at 22:33 -0400, Alan Stern wrote:
> > OK, I looked at making this work while reaping the target correctly, but
> > I couldn't (basic problem is that the target list keeps the target until
> > it has no more devices, a condition that could be made untrue by
> > something as simple as an open of the sysfs file).
> 
> Sorry, I'm a bit slow tonight.  I thought you _wanted_ the target to hang
> around until it had no more devices.  (Wasn't that the whole point of
> scsi_target_reap?)  What's wrong with keeping the target until the sysfs
> file is closed?

nothing ... that happens regardless (with either patch).

> > So, rather than try that, I thought a better approach might be to make
> > the host state model work for us.  i.e. if we know the host is being
> > removed, there's no point allowing target or device removal because at
> > some point the host removal will do it for us.  This enforcement would
> > ensure we're the only legitimate removers of the target and device.
> 
> How does this make things any better?  You still face the problem that an 
> open sysfs file will delay device removal and target reaping, if all the 
> devices on that target are hot-unplugged and the host remains alive and 
> active.

Actually, since we don't need the target reaping in the host removal (it
would be reaped when the device is removed), I agree ... I'll put your
original patch in.

Thanks,

James


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-18  0:35             ` James Bottomley
@ 2005-09-18 20:05               ` James Bottomley
  2005-09-18 20:37                 ` Alan Stern
  2005-09-18 22:01                 ` [linux-usb-devel] " Greg KH
  0 siblings, 2 replies; 23+ messages in thread
From: James Bottomley @ 2005-09-18 20:05 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Greg KH, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Sat, 2005-09-17 at 19:35 -0500, James Bottomley wrote:
> The attached should be that patch with the race window closed.

There's a big oops in this one (and there was when greg tested it).  The
state checker is reversed (it's checking !scsi_host_set_state() for
indicating a problem ... of course, the return is 0 on success or
error).  I've corrected this; Greg, could you retest? 

Thanks,

James

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -98,6 +98,7 @@ int scsi_host_set_state(struct Scsi_Host
 		switch (oldstate) {
 		case SHOST_CREATED:
 		case SHOST_RUNNING:
+		case SHOST_CANCEL_RECOVERY:
 			break;
 		default:
 			goto illegal;
@@ -107,12 +108,31 @@ int scsi_host_set_state(struct Scsi_Host
 	case SHOST_DEL:
 		switch (oldstate) {
 		case SHOST_CANCEL:
+		case SHOST_DEL_RECOVERY:
 			break;
 		default:
 			goto illegal;
 		}
 		break;
 
+	case SHOST_CANCEL_RECOVERY:
+		switch (oldstate) {
+		case SHOST_CANCEL:
+		case SHOST_RECOVERY:
+			break;
+		default:
+			goto illegal;
+		}
+		break;
+
+	case SHOST_DEL_RECOVERY:
+		switch (oldstate) {
+		case SHOST_CANCEL_RECOVERY:
+			break;
+		default:
+			goto illegal;
+		}
+		break;
 	}
 	shost->shost_state = state;
 	return 0;
@@ -134,13 +154,24 @@ EXPORT_SYMBOL(scsi_host_set_state);
  **/
 void scsi_remove_host(struct Scsi_Host *shost)
 {
+	unsigned long flags;
 	down(&shost->scan_mutex);
-	scsi_host_set_state(shost, SHOST_CANCEL);
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_set_state(shost, SHOST_CANCEL))
+		if (scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY)) {
+			spin_unlock_irqrestore(shost->host_lock, flags);
+			up(&shost->scan_mutex);
+			return;
+		}
+	spin_unlock_irqrestore(shost->host_lock, flags);
 	up(&shost->scan_mutex);
 	scsi_forget_host(shost);
 	scsi_proc_host_rm(shost);
 
-	scsi_host_set_state(shost, SHOST_DEL);
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_set_state(shost, SHOST_DEL))
+		BUG_ON(scsi_host_set_state(shost, SHOST_DEL_RECOVERY));
+	spin_unlock_irqrestore(shost->host_lock, flags);
 
 	transport_unregister_device(&shost->shost_gendev);
 	class_device_unregister(&shost->shost_classdev);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -1265,9 +1265,8 @@ int scsi_device_cancel(struct scsi_devic
 		list_for_each_safe(lh, lh_sf, &active_list) {
 			scmd = list_entry(lh, struct scsi_cmnd, eh_entry);
 			list_del_init(lh);
-			if (recovery) {
-				scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD);
-			} else {
+			if (recovery &&
+			    !scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD)) {
 				scmd->result = (DID_ABORT << 16);
 				scsi_finish_command(scmd);
 			}
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -68,19 +68,24 @@ int scsi_eh_scmd_add(struct scsi_cmnd *s
 {
 	struct Scsi_Host *shost = scmd->device->host;
 	unsigned long flags;
+	int ret = 0;
 
 	if (shost->eh_wait == NULL)
 		return 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_set_state(shost, SHOST_RECOVERY))
+		if (scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY))
+			goto out_unlock;
 
+	ret = 1;
 	scmd->eh_eflags |= eh_flag;
 	list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
-	scsi_host_set_state(shost, SHOST_RECOVERY);
 	shost->host_failed++;
 	scsi_eh_wakeup(shost);
+ out_unlock:
 	spin_unlock_irqrestore(shost->host_lock, flags);
-	return 1;
+	return ret;
 }
 
 /**
@@ -176,8 +181,8 @@ void scsi_times_out(struct scsi_cmnd *sc
 		}
 
 	if (unlikely(!scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD))) {
-		panic("Error handler thread not present at %p %p %s %d",
-		      scmd, scmd->device->host, __FILE__, __LINE__);
+		scmd->result |= DID_TIME_OUT << 16;
+		__scsi_done(scmd);
 	}
 }
 
@@ -196,8 +201,7 @@ int scsi_block_when_processing_errors(st
 {
 	int online;
 
-	wait_event(sdev->host->host_wait, (sdev->host->shost_state !=
-					   SHOST_RECOVERY));
+	wait_event(sdev->host->host_wait, !scsi_host_in_recovery(sdev->host));
 
 	online = scsi_device_online(sdev);
 
@@ -1441,6 +1445,7 @@ static void scsi_eh_lock_door(struct scs
 static void scsi_restart_operations(struct Scsi_Host *shost)
 {
 	struct scsi_device *sdev;
+	unsigned long flags;
 
 	/*
 	 * If the door was locked, we need to insert a door lock request
@@ -1460,7 +1465,11 @@ static void scsi_restart_operations(stru
 	SCSI_LOG_ERROR_RECOVERY(3, printk("%s: waking up host to restart\n",
 					  __FUNCTION__));
 
-	scsi_host_set_state(shost, SHOST_RUNNING);
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_set_state(shost, SHOST_RUNNING))
+		if (scsi_host_set_state(shost, SHOST_CANCEL))
+			BUG_ON(scsi_host_set_state(shost, SHOST_DEL));
+	spin_unlock_irqrestore(shost->host_lock, flags);
 
 	wake_up(&shost->host_wait);
 
diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c
--- a/drivers/scsi/scsi_ioctl.c
+++ b/drivers/scsi/scsi_ioctl.c
@@ -458,7 +458,7 @@ int scsi_nonblockable_ioctl(struct scsi_
 	 * error processing, as long as the device was opened
 	 * non-blocking */
 	if (filp && filp->f_flags & O_NONBLOCK) {
-		if (sdev->host->shost_state == SHOST_RECOVERY)
+		if (scsi_host_in_recovery(sdev->host))
 			return -ENODEV;
 	} else if (!scsi_block_when_processing_errors(sdev))
 		return -ENODEV;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -447,7 +447,7 @@ void scsi_device_unbusy(struct scsi_devi
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	shost->host_busy--;
-	if (unlikely((shost->shost_state == SHOST_RECOVERY) &&
+	if (unlikely(scsi_host_in_recovery(shost) &&
 		     shost->host_failed))
 		scsi_eh_wakeup(shost);
 	spin_unlock(shost->host_lock);
@@ -1339,7 +1339,7 @@ static inline int scsi_host_queue_ready(
 				   struct Scsi_Host *shost,
 				   struct scsi_device *sdev)
 {
-	if (shost->shost_state == SHOST_RECOVERY)
+	if (scsi_host_in_recovery(shost))
 		return 0;
 	if (shost->host_busy == 0 && shost->host_blocked) {
 		/*
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -57,6 +57,8 @@ static struct {
 	{ SHOST_CANCEL, "cancel" },
 	{ SHOST_DEL, "deleted" },
 	{ SHOST_RECOVERY, "recovery" },
+	{ SHOST_CANCEL_RECOVERY, "cancel/recovery" },
+	{ SHOST_DEL_RECOVERY, "deleted/recovery", },
 };
 const char *scsi_host_state_name(enum scsi_host_state state)
 {
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1027,7 +1027,7 @@ sg_ioctl(struct inode *inode, struct fil
 		if (sdp->detached)
 			return -ENODEV;
 		if (filp->f_flags & O_NONBLOCK) {
-			if (sdp->device->host->shost_state == SHOST_RECOVERY)
+			if (scsi_host_in_recovery(sdp->device->host))
 				return -EBUSY;
 		} else if (!scsi_block_when_processing_errors(sdp->device))
 			return -EBUSY;
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -439,6 +439,8 @@ enum scsi_host_state {
 	SHOST_CANCEL,
 	SHOST_DEL,
 	SHOST_RECOVERY,
+	SHOST_CANCEL_RECOVERY,
+	SHOST_DEL_RECOVERY,
 };
 
 struct Scsi_Host {
@@ -621,6 +623,13 @@ static inline struct Scsi_Host *dev_to_s
 	return container_of(dev, struct Scsi_Host, shost_gendev);
 }
 
+static inline int scsi_host_in_recovery(struct Scsi_Host *shost)
+{
+	return shost->shost_state == SHOST_RECOVERY ||
+		shost->shost_state == SHOST_CANCEL_RECOVERY ||
+		shost->shost_state == SHOST_DEL_RECOVERY;
+}
+
 extern int scsi_queue_work(struct Scsi_Host *, struct work_struct *);
 extern void scsi_flush_work(struct Scsi_Host *);
 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-18 20:05               ` James Bottomley
@ 2005-09-18 20:37                 ` Alan Stern
  2005-09-18 22:01                 ` [linux-usb-devel] " Greg KH
  1 sibling, 0 replies; 23+ messages in thread
From: Alan Stern @ 2005-09-18 20:37 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mike Anderson, Greg KH, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Sun, 18 Sep 2005, James Bottomley wrote:

>  void scsi_remove_host(struct Scsi_Host *shost)
>  {
> +	unsigned long flags;
>  	down(&shost->scan_mutex);
> -	scsi_host_set_state(shost, SHOST_CANCEL);
> +	spin_lock_irqsave(shost->host_lock, flags);
> +	if (scsi_host_set_state(shost, SHOST_CANCEL))
> +		if (scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY)) {
> +			spin_unlock_irqrestore(shost->host_lock, flags);
> +			up(&shost->scan_mutex);
> +			return;
> +		}
> +	spin_unlock_irqrestore(shost->host_lock, flags);
> 	up(&shost->scan_mutex);
> 	scsi_forget_host(shost);

Personally I would prefer to see something like:

	if (scsi_host_set_state(shost,
			(shost->shost_state == SHOST_RECOVERY) ?
			SHOST_CANCEL_RECOVERY : SHOST_CANCEL)) {
		spin_unlock_irqrestore...

with similar changes in the other places.  It avoids a function call and
prevents misleading error messages from showing up in the system log.

Actually, I would probably do

	unsigned long flags;
	int rc;

	down(&shost->scan_mutex);
	spin_lock_irqsave(shost->host_lock, flags);
	rc = scsi_host_set_state(shost,
			(shost->shost_state == SHOST_RECOVERY) ?
			SHOST_CANCEL_RECOVERY : SHOST_CANCEL);
	spin_unlock_irqrestore(shost->host_lock, flags);
	up(&shost->scan_mutex);
	if (rc)
		return;
	scsi_forget_host(shost);
	...

just to avoid the extra spin_unlock call.  Those macros can expand to a 
surprising length.

Alan Stern



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-18 20:05               ` James Bottomley
  2005-09-18 20:37                 ` Alan Stern
@ 2005-09-18 22:01                 ` Greg KH
  2005-09-18 22:34                   ` Greg KH
  2005-09-19 15:19                   ` [linux-usb-devel] " James Bottomley
  1 sibling, 2 replies; 23+ messages in thread
From: Greg KH @ 2005-09-18 22:01 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mike Anderson, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Sun, Sep 18, 2005 at 03:05:20PM -0500, James Bottomley wrote:
> On Sat, 2005-09-17 at 19:35 -0500, James Bottomley wrote:
> > The attached should be that patch with the race window closed.
> 
> There's a big oops in this one (and there was when greg tested it).  The
> state checker is reversed (it's checking !scsi_host_set_state() for
> indicating a problem ... of course, the return is 0 on success or
> error).  I've corrected this; Greg, could you retest? 

Should I apply any additional patches too?  Or just this one will
suffice?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-18 22:01                 ` [linux-usb-devel] " Greg KH
@ 2005-09-18 22:34                   ` Greg KH
  2005-09-19 15:19                   ` [linux-usb-devel] " James Bottomley
  1 sibling, 0 replies; 23+ messages in thread
From: Greg KH @ 2005-09-18 22:34 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mike Anderson, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Sun, Sep 18, 2005 at 03:01:26PM -0700, Greg KH wrote:
> On Sun, Sep 18, 2005 at 03:05:20PM -0500, James Bottomley wrote:
> > On Sat, 2005-09-17 at 19:35 -0500, James Bottomley wrote:
> > > The attached should be that patch with the race window closed.
> > 
> > There's a big oops in this one (and there was when greg tested it).  The
> > state checker is reversed (it's checking !scsi_host_set_state() for
> > indicating a problem ... of course, the return is 0 on success or
> > error).  I've corrected this; Greg, could you retest? 
> 
> Should I apply any additional patches too?  Or just this one will
> suffice?

Seems to not oops on removal anymore, but I can't rmmod usb-storage
anymore.  SysRq-T output below of possibly relevant processes.

thanks,

greg k-h


[  390.334608] kblockd/0     S C03FA2F0     0    77      5           112     7 (L-TLB)
[  390.334664] d6ebbf40 d6fc5a70 c03d93e0 c03fa2f0 d19a4c64 c03fa260 00000000 00000096 
[  390.334700]        d6ca7e34 d6fc5070 000041f4 4f2930e3 0000004e d6fc5a70 d6fc5b98 00000286 
[  390.334746]        d6ca7da0 d6f37ea8 c0258490 c0129c09 00000000 d6ebbf6c 00000000 d6f37eb8 
[  390.334792] Call Trace:
[  390.334888]  [<c0258490>] blk_unplug_work+0x0/0x10
[  390.334918]  [<c0129c09>] worker_thread+0x229/0x260
[  390.334956]  [<c0116fc0>] default_wake_function+0x0/0x20
[  390.334987]  [<c0116fc0>] default_wake_function+0x0/0x20
[  390.335018]  [<c01299e0>] worker_thread+0x0/0x260
[  390.335301]  [<c012db19>] kthread+0xa9/0xf0
[  390.335331]  [<c012da70>] kthread+0x0/0xf0
[  390.335356]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.335407] pdflush       S D6FDA550     0   112      5           113    77 (L-TLB)
[  390.335463] d6efbf78 d6fc5570 c03d93e0 d6fda550 c03d9410 00000002 d6fda550 d6efbf68 
[  390.335501]        d6fc5570 d6fc5070 000006ff ab42be82 00000002 d6fc5570 d6fc5698 d6efbfa8 
[  390.335543]        d6efbf9c d6efa000 c0141620 c0141509 d6efa000 d6efa000 d6f9ff74 00000000 
[  390.335588] Call Trace:
[  390.335617]  [<c0141620>] pdflush+0x0/0x30
[  390.335643]  [<c0141509>] __pdflush+0x79/0x190
[  390.335669]  [<c0141646>] pdflush+0x26/0x30
[  390.335699]  [<c0141620>] pdflush+0x0/0x30
[  390.335720]  [<c012db19>] kthread+0xa9/0xf0
[  390.335747]  [<c012da70>] kthread+0x0/0xf0
[  390.335771]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.335816] pdflush       S 00000400     0   113      5           115   112 (L-TLB)
[  390.335870] d6efdf78 d6fc5070 c03d93e0 00000400 00000000 00000000 00000000 00000000 
[  390.335902]        00000000 00000005 00002e27 f3f1df78 00000059 d6fc5070 d6fc5198 d6efdfa8 
[  390.336243]        d6efdf9c d6efc000 c0141620 c0141509 00000000 d6efc000 d6f9ff74 00000000 
[  390.336289] Call Trace:
[  390.336320]  [<c0141620>] pdflush+0x0/0x30
[  390.336344]  [<c0141509>] __pdflush+0x79/0x190
[  390.336369]  [<c0141646>] pdflush+0x26/0x30
[  390.336398]  [<c0141620>] pdflush+0x0/0x30
[  390.336419]  [<c012db19>] kthread+0xa9/0xf0
[  390.336447]  [<c012da70>] kthread+0x0/0xf0
[  390.336471]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.336518] aio/0         S 00000002     0   115      5           706   113 (L-TLB)
[  390.336575] d6c01f40 d6fdaa50 c03d9410 00000002 00000000 ab76da12 00000002 00004fa7 
[  390.336609]        00000000 d6fdaa50 00001a15 ab7796b9 00000002 d6f44590 d6f446b8 d6c00000 
[  390.336647]        00000000 00010000 00000000 c0129c09 00000011 d6c01f6c 00000000 d6f37858 
[  390.336685] Call Trace:
[  390.336714]  [<c0129c09>] worker_thread+0x229/0x260
[  390.336751]  [<c0116fc0>] default_wake_function+0x0/0x20
[  390.336781]  [<c0116fc0>] default_wake_function+0x0/0x20
[  390.336812]  [<c01299e0>] worker_thread+0x0/0x260
[  390.336836]  [<c012db19>] kthread+0xa9/0xf0
[  390.336864]  [<c012da70>] kthread+0x0/0xf0
[  390.336888]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.336932] kswapd0       S C033A760     0   114      1           743     5 (L-TLB)
[  390.336987] d6efff80 d6f44a90 c03d93e0 c033a760 c01239c0 c033a760 d6efe000 c0334418 
[  390.337024]        d6fd42f8 c011bcc5 0000efca abcb3a68 00000002 d6f44a90 d6f44bb8 d6efe000 
[  390.337062]        00000000 c033b634 c033b260 c0146d45 c0307f9d 00000000 d6f44a90 00000000 
[  390.337101] Call Trace:
[  390.337120]  [<c01239c0>] switch_uid+0x40/0x60
[  390.337146]  [<c011bcc5>] daemonize+0x1d5/0x250
[  390.337180]  [<c0146d45>] kswapd+0x115/0x120
[  390.337210]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.337242]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.337271]  [<c0146c30>] kswapd+0x0/0x120
[  390.337296]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.337340] kseriod       S C034D048     0   706      5          1217   115 (L-TLB)
[  390.337396] d6c27f60 d6f44090 c03d93e0 c034d048 d6c27f8c c012df40 d6c26000 c0251aed 
[  390.337432]        c034d048 c0251a90 00004dd7 a9736cf6 00000003 d6f44090 d6f441b8 d6c26000 
[  390.337471]        d6c27f8c c012df40 d6c26000 c023801d d6f9fe18 00000000 d6f44090 c012df40 
[  390.337511] Call Trace:
[  390.337532]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.337559]  [<c0251aed>] driver_register+0x3d/0x50
[  390.337586]  [<c0251a90>] klist_devices_get+0x0/0x10
[  390.337616]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.337642]  [<c023801d>] serio_thread+0xad/0x120
[  390.337671]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.337702]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.337734]  [<c0237f70>] serio_thread+0x0/0x120
[  390.337757]  [<c012db19>] kthread+0xa9/0xf0
[  390.337783]  [<c012da70>] kthread+0x0/0xf0
[  390.337808]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.337852] kjournald     S 00000032     0   743      1           772   114 (L-TLB)
[  390.337906] d6ce3f58 d6c95ab0 c03d93e0 00000032 00000000 6bcc439f 00000000 00000296 
[  390.337939]        00000001 d6ce3f58 00002db8 2e223dd0 00000058 d6c95ab0 d6c95bd8 00000000 
[  390.337976]        d6cb7aa0 00000001 d6cb7ae4 c01a9bd9 00000000 00000005 d6fd2a30 d6c95ab0 
[  390.338014] Call Trace:
[  390.338044]  [<c01a9bd9>] kjournald+0x379/0x380
[  390.338082]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.338115]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.338144]  [<c0102f62>] ret_from_fork+0x6/0x14
[  390.338170]  [<c01a9850>] commit_timeout+0x0/0x10
[  390.338196]  [<c01a9860>] kjournald+0x0/0x380
[  390.338221]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.338266] udevd         S 00000000     0   772      1          1778   743 (NOTLB)
[  390.338322] d6479eac d640f0d0 c03d93e0 00000000 000000d0 00000001 00000000 00000000 
[  390.338353]        00000000 00000001 0000ebb3 7385c3a6 0000004f d640f0d0 d640f1f8 00000000 
[  390.338389]        7fffffff 00000007 00000000 c02efea1 d6cbe200 d6479f3c c02885b1 d6bff0a0 
[  390.338434] Call Trace:
[  390.338464]  [<c02efea1>] schedule_timeout+0xa1/0xb0
[  390.338495]  [<c02885b1>] datagram_poll+0xe1/0xf0
[  390.338526]  [<c0281520>] sock_poll+0x20/0x30
[  390.338559]  [<c016c220>] do_select+0x280/0x2c0
[  390.338604]  [<c016bdc0>] __pollwait+0x0/0xd0
[  390.338632]  [<c016c5b8>] sys_select+0x318/0x450
[  390.338670]  [<c011dc05>] sys_waitpid+0x25/0x29
[  390.338698]  [<c0103059>] syscall_call+0x7/0xb
[  390.339920] khubd         S 0000001A     0  1217      5          5557   706 (L-TLB)
[  390.340945] d65fdf5c d656a070 c03d9410 0000001a 00000100 dc19a852 0000001a 0002fee8 
[  390.340981]        00000000 d656a070 0000435a dc19a852 0000001a d6abd550 d6abd678 d65fc000 
[  390.341019]        d65fc000 d65fdf80 d65fdf8c d7841e3a 000057bf d6415e0c 00000000 d6abd550 
[  390.341059] Call Trace:
[  390.341099]  [<d7841e3a>] hub_thread+0xea/0x120 [usbcore]
[  390.343386]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.343426]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.343463]  [<d7841d50>] hub_thread+0x0/0x120 [usbcore]
[  390.343854]  [<c012db19>] kthread+0xa9/0xf0
[  390.343884]  [<c012da70>] kthread+0x0/0xf0
[  390.343909]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.344418] kjournald     S 00000001     0  1778      1          5124   772 (L-TLB)
[  390.345452] d5e59f58 d6b35070 c03d93e0 00000001 d6b5d704 c010513e c0103a26 00000000 
[  390.345491]        d5e59fa4 00000000 00005ec2 4beee643 00000055 d6b35070 d6b35198 00000000 
[  390.345529]        d6b5d6c0 00000001 d6b5d704 c01a9bd9 00000000 00000005 00000008 d6b35070 
[  390.345695] Call Trace:
[  390.346187]  [<c010513e>] do_IRQ+0x1e/0x30
[  390.346219]  [<c0103a26>] common_interrupt+0x1a/0x20
[  390.346253]  [<c01a9bd9>] kjournald+0x379/0x380
[  390.346288]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.346321]  [<c012df40>] autoremove_wake_function+0x0/0x60
[  390.346351]  [<c0102f62>] ret_from_fork+0x6/0x14
[  390.346377]  [<c01a9850>] commit_timeout+0x0/0x10
[  390.346403]  [<c01a9860>] kjournald+0x0/0x380
[  390.346429]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.351800] scsi_eh_0     S 00000010     0  5557      5                1217 (L-TLB)
[  390.352081] d5907f50 d6b69590 c03d93e0 00000010 80b05cad 0000593e a361faff 00000000 
[  390.352119]        4fa02c17 d6b35a70 00000126 d5dd9f69 0000001a d6b69590 d6b696b8 d5907fa8 
[  390.352158]        d5906000 00000296 d6b69590 c02f0601 d5907fb0 00000000 00000001 d6b69590 
[  390.352196] Call Trace:
[  390.352223]  [<c02f0601>] __down_interruptible+0x71/0xf2
[  390.352256]  [<c0116fc0>] default_wake_function+0x0/0x20
[  390.352292]  [<d814b1e0>] scsi_error_handler+0x0/0x120 [scsi_mod]
[  390.352934]  [<c02ef1b3>] __down_failed_interruptible+0x7/0xc
[  390.352970]  [<d814b826>] .text.lock.scsi_error+0x3b/0x45 [scsi_mod]
[  390.353085]  [<c012db19>] kthread+0xa9/0xf0
[  390.353117]  [<c012da70>] kthread+0x0/0xf0
[  390.353141]  [<c0101379>] kernel_thread_helper+0x5/0xc
[  390.353173] usb-storage   D 0000001A     0  5558      1          7015  5462 (L-TLB)
[  390.353214] d59d9ed8 d6b69590 c03d9410 0000001a d61726c0 d5dd86e7 0000001a 0000057e 
[  390.353249]        00000000 d6b69590 0000111e d5dd93e9 0000001a d6713ab0 d6713bd8 d59d8000 
[  390.353287]        c03ec58c d59d9ef4 d59d9f2c c02ef83e 00000000 d6713ab0 c0116fc0 00000000 
[  390.353326] Call Trace:
[  390.353352]  [<c02ef83e>] wait_for_completion+0x6e/0xa0
[  390.353380]  [<c0116fc0>] default_wake_function+0x0/0x20
[  390.353411]  [<c0116fc0>] default_wake_function+0x0/0x20
[  390.353439]  [<c011677d>] wake_up_process+0x1d/0x20
[  390.353465]  [<c012dd51>] kthread_stop+0x51/0x7a
[  390.353489]  [<c01170b7>] __wake_up_locked+0x27/0x30
[  390.353518]  [<d814781c>] scsi_host_dev_release+0x8c/0xa0 [scsi_mod]
[  390.353637]  [<c01c4b69>] kobject_cleanup+0xa9/0xb0
[  390.353671]  [<c01c4b70>] kobject_release+0x0/0x10
[  390.353698]  [<c01c55a5>] kref_put+0x45/0x90
[  390.353727]  [<c01c4b9f>] kobject_put+0x1f/0x30
[  390.353752]  [<c01c4b70>] kobject_release+0x0/0x10
[  390.353778]  [<d8160103>] usb_stor_control_thread+0x153/0x1b0 [usb_storage]
[  390.353891]  [<c0102f62>] ret_from_fork+0x6/0x14
[  390.353917]  [<d815ffb0>] usb_stor_control_thread+0x0/0x1b0 [usb_storage]
[  390.354013]  [<d815ffb0>] usb_stor_control_thread+0x0/0x1b0 [usb_storage]
[  390.354103]  [<c0101379>] kernel_thread_helper+0x5/0xc
  390.355561] rmmod         D CE0FBED8     0  9411   5435                     (NOTLB)
[  390.355585] ce0fbef4 d63440d0 c03d93e0 ce0fbed8 00000286 d816ffec 00000003 d816fff4 
[  390.355605]        c034d948 d61e7a90 00006352 7005c59d 0000004f d63440d0 d63441f8 ce0fa000 
[  390.355629]        d816e770 ce0fbf10 ce0fbf48 c02ef83e 00000000 d63440d0 c0116fc0 00000000 
[  390.355652] Call Trace:
[  390.355668]  [<c02ef83e>] wait_for_completion+0x6e/0xa0
[  390.355685]  [<c0116fc0>] default_wake_function+0x0/0x20
[  390.355701]  [<c01c4b9f>] kobject_put+0x1f/0x30
[  390.355719]  [<c0116fc0>] default_wake_function+0x0/0x20
[  390.355737]  [<d8168de0>] usb_stor_exit+0x0/0x2c [usb_storage]
[  390.355810]  [<d8168de0>] usb_stor_exit+0x0/0x2c [usb_storage]
[  390.355865]  [<d8168dfb>] usb_stor_exit+0x1b/0x2c [usb_storage]
[  390.355918]  [<c01330b6>] sys_delete_module+0x1a6/0x1e0
[  390.355950]  [<c014d510>] sys_munmap+0x50/0x80
[  390.355968]  [<c0103059>] syscall_call+0x7/0xb


-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-18 22:01                 ` [linux-usb-devel] " Greg KH
  2005-09-18 22:34                   ` Greg KH
@ 2005-09-19 15:19                   ` James Bottomley
  2005-09-20 14:48                     ` Greg KH
  1 sibling, 1 reply; 23+ messages in thread
From: James Bottomley @ 2005-09-19 15:19 UTC (permalink / raw)
  To: Greg KH
  Cc: Mike Anderson, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Sun, 2005-09-18 at 15:01 -0700, Greg KH wrote:
> Should I apply any additional patches too?  Or just this one will
> suffice?

Just to make assurances doubly sure, could you remove all the scsi
patches and just pull the scsi git tree at

master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git

into your build and verify that everything works (the module is
removable and the error thread stops).

Thanks,

James



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: oops on usb storage device disconnect with 2.6.14-rc1
  2005-09-19 15:19                   ` [linux-usb-devel] " James Bottomley
@ 2005-09-20 14:48                     ` Greg KH
  0 siblings, 0 replies; 23+ messages in thread
From: Greg KH @ 2005-09-20 14:48 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mike Anderson, Alan Stern, linux-usb-devel, SCSI Mailing List,
	mdharm-usb

On Mon, Sep 19, 2005 at 10:19:32AM -0500, James Bottomley wrote:
> On Sun, 2005-09-18 at 15:01 -0700, Greg KH wrote:
> > Should I apply any additional patches too?  Or just this one will
> > suffice?
> 
> Just to make assurances doubly sure, could you remove all the scsi
> patches and just pull the scsi git tree at
> 
> master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git
> 
> into your build and verify that everything works (the module is
> removable and the error thread stops).

Yes, with this tree, on top of Linus's latest (2.6.14-rc2), it works
just fine!

thanks,

greg k-h


-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2005-09-20 14:48 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-15 19:03 oops on usb storage device disconnect with 2.6.14-rc1 Greg KH
2005-09-15 19:23 ` [linux-usb-devel] " Alan Stern
2005-09-15 19:29   ` Greg KH
2005-09-15 19:57     ` James Bottomley
2005-09-15 21:08       ` [linux-usb-devel] " James Bottomley
2005-09-15 22:19         ` Mike Anderson
2005-09-15 22:38           ` [linux-usb-devel] " James Bottomley
2005-09-15 23:55             ` Mike Anderson
2005-09-16  1:46               ` James Bottomley
2005-09-16  1:52             ` [linux-usb-devel] " Alan Stern
2005-09-16  2:27               ` James Bottomley
2005-09-18  0:36               ` James Bottomley
2005-09-18  2:33                 ` [linux-usb-devel] " Alan Stern
2005-09-18 20:00                   ` James Bottomley
2005-09-18  0:35             ` James Bottomley
2005-09-18 20:05               ` James Bottomley
2005-09-18 20:37                 ` Alan Stern
2005-09-18 22:01                 ` [linux-usb-devel] " Greg KH
2005-09-18 22:34                   ` Greg KH
2005-09-19 15:19                   ` [linux-usb-devel] " James Bottomley
2005-09-20 14:48                     ` Greg KH
2005-09-15 23:46         ` [linux-usb-devel] " Greg KH
2005-09-16  1:57           ` Alan Stern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).