Synchronizing scsi_remove_host and the error handler

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Synchronizing scsi_remove_host and the error handler
@ 2005-08-07 14:59 Alan Stern
  2005-08-07 15:36 ` James Bottomley
  2005-08-08 18:04 ` Luben Tuikov
  0 siblings, 2 replies; 30+ messages in thread
From: Alan Stern @ 2005-08-07 14:59 UTC (permalink / raw)
  To: Mike Anderson; +Cc: SCSI development list

Mike:

What sort of synchronization is there between scsi_remove_host and the 
error-handler thread?  Offhand I can see two possible problems, depending 
on how the LLD is written:

     1.	The error handler asks the LLD to perform a reset just as the LLD
	calls scsi_remove_host as part of rmmod.  scsi_remove_host 
	returns, the LLD's remove method returns, and the LLD is unloaded
	from memory while its reset routine is executing.

     2. The error handler asks the LLD to perform a reset just before the
	LLD calls scsi_remove_host.  The LLD's remove method can't return
	until the reset completes (otherwise the reset would be executing 
	code that is about to be unloaded, as in (1)).  But the reset
	can't get started until it acquires the host adapter's device 
	semaphore, which is owned by the rmmod thread.

(1) causes an oops and (2) causes a deadlock.  Note that (1) can be 
avoided by making scis_remove_host wait until the error handler thread has 
exited, but that will make (2) much more likely to happen.

In general, there needs to be some way for the error handler to prevent
the LLD from being unloaded.  I don't know what the best answer is.  Has
anyone thought about this?

Alan Stern

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-07 14:59 Synchronizing scsi_remove_host and the error handler Alan Stern
@ 2005-08-07 15:36 ` James Bottomley
  2005-08-07 18:43   ` Alan Stern
  2005-08-08 18:07   ` Luben Tuikov
  2005-08-08 18:04 ` Luben Tuikov
  1 sibling, 2 replies; 30+ messages in thread
From: James Bottomley @ 2005-08-07 15:36 UTC (permalink / raw)
  To: Alan Stern; +Cc: Mike Anderson, SCSI development list

On Sun, 2005-08-07 at 10:59 -0400, Alan Stern wrote:
> What sort of synchronization is there between scsi_remove_host and the 
> error-handler thread?  Offhand I can see two possible problems, depending 
> on how the LLD is written:

There isn't any by design.

I think you're not thinking about how this works correctly.  What remove
host does is loop over the active devices removing them from visibility
and trying to do a final put on the generic devices before removing the
host from visibility and doing a final put on it.

However, any outstanding user will have a reference and will keep all
the bits of the hierarchy in place until that reference is relinquished.

> In general, there needs to be some way for the error handler to prevent
> the LLD from being unloaded.  I don't know what the best answer is.  Has
> anyone thought about this?

I take it the problem is that most LLDs go on to release resources (like
interrupts and memory regions) immediately after calling
scsi_remove_host()?

The solution is probably to trigger the resource cleanup from the host
device release method.

James

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-07 15:36 ` James Bottomley
@ 2005-08-07 18:43   ` Alan Stern
  2005-08-07 21:57     ` James Bottomley
  2005-08-07 22:15     ` Stefan Richter
  2005-08-08 18:07   ` Luben Tuikov
  1 sibling, 2 replies; 30+ messages in thread
From: Alan Stern @ 2005-08-07 18:43 UTC (permalink / raw)
  To: James Bottomley; +Cc: Mike Anderson, SCSI development list

On Sun, 7 Aug 2005, James Bottomley wrote:

> On Sun, 2005-08-07 at 10:59 -0400, Alan Stern wrote:
> > What sort of synchronization is there between scsi_remove_host and the 
> > error-handler thread?  Offhand I can see two possible problems, depending 
> > on how the LLD is written:
> 
> There isn't any by design.
> 
> I think you're not thinking about how this works correctly.  What remove
> host does is loop over the active devices removing them from visibility
> and trying to do a final put on the generic devices before removing the
> host from visibility and doing a final put on it.
> 
> However, any outstanding user will have a reference and will keep all
> the bits of the hierarchy in place until that reference is relinquished.

That's all as it should be.

> I take it the problem is that most LLDs go on to release resources (like
> interrupts and memory regions) immediately after calling
> scsi_remove_host()?
> 
> The solution is probably to trigger the resource cleanup from the host
> device release method.

What host device release method?  scsi_host_template->release is marked 
OBSOLETE and for use only with old-style drivers.  scsi_host_dev_release 
is the only release method for hosts, and it doesn't notify the LLD in any 
way.

Anyhow, I'm not so concerned about releasing resources associated with the
host.  The LLD can handle that easily enough by first flagging the HBA
as not accessible, then waiting for current I/O to terminate, then
calling scsi_remove_host.  The resources can be released any time after
the current I/O is finished; it doesn't matter whether the host is removed
before or after.

The only resource that matters for this discussion is associated with the
LLD itself, not with any of its hosts: the host template.  Once the SCSI
core has released all references to the template, it can't call the LLD
any more.  The problem is that the LLD has no way to know when all the
references have been dropped.  This suggests that the entire problem could
be solved by adding a kref to struct scsi_host_template.

Would you agree to a patch adding such a kref?

Alan Stern

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-07 18:43   ` Alan Stern
@ 2005-08-07 21:57     ` James Bottomley
  2005-08-08 18:24       ` Luben Tuikov
  2005-08-08 20:06       ` Mike Anderson
  2005-08-07 22:15     ` Stefan Richter
  1 sibling, 2 replies; 30+ messages in thread
From: James Bottomley @ 2005-08-07 21:57 UTC (permalink / raw)
  To: Alan Stern; +Cc: Mike Anderson, SCSI development list

On Sun, 2005-08-07 at 14:43 -0400, Alan Stern wrote:
> What host device release method?  scsi_host_template->release is marked 
> OBSOLETE and for use only with old-style drivers.  scsi_host_dev_release 
                                                     ^^^^^
This is the one I was thinking of adding to.

> is the only release method for hosts, and it doesn't notify the LLD in any 
> way.
> 
> Anyhow, I'm not so concerned about releasing resources associated with the
> host.  The LLD can handle that easily enough by first flagging the HBA
> as not accessible, then waiting for current I/O to terminate, then
> calling scsi_remove_host.  The resources can be released any time after
> the current I/O is finished; it doesn't matter whether the host is removed
> before or after.
> 
> The only resource that matters for this discussion is associated with the
> LLD itself, not with any of its hosts: the host template.  Once the SCSI
> core has released all references to the template, it can't call the LLD
> any more.  The problem is that the LLD has no way to know when all the
> references have been dropped.  This suggests that the entire problem could
> be solved by adding a kref to struct scsi_host_template.
> 
> Would you agree to a patch adding such a kref?

Well, not really.  The host template usually exists as a variable in the
module, so its lifetime is tied to the lifetime of the module.  Adding a
kref wouldn't help because it will still be freed when the module is
removed.  If there's a case where the template is being freed
prematurely because the module is being removed, then we have the module
refcounting wrong somewhere.  Have you run across such a case?

James



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-07 18:43   ` Alan Stern
  2005-08-07 21:57     ` James Bottomley
@ 2005-08-07 22:15     ` Stefan Richter
  2005-08-08 20:41       ` Alan Stern
  1 sibling, 1 reply; 30+ messages in thread
From: Stefan Richter @ 2005-08-07 22:15 UTC (permalink / raw)
  To: linux-scsi; +Cc: Alan Stern, James Bottomley, Mike Anderson

Alan Stern wrote:
> Anyhow, I'm not so concerned about releasing resources associated with the
> host.  The LLD can handle that easily enough by first flagging the HBA
> as not accessible, then waiting for current I/O to terminate, then
> calling scsi_remove_host.  The resources can be released any time after
> the current I/O is finished; it doesn't matter whether the host is removed
> before or after.

What do you mean by "flagging the HBA as not accessible"?

In case you refer to scsi_block_requests(), this is taboo before any
device is being removed from a host, or at least before high-level
drivers are being detached from any device. scsi_remove_device() or
--- if there are still devices present on a host and have high-level
drivers attached --- scsi_remove_host() itself will cause I/O. If
this I/O was blocked, scsi_remove_*() would be sitting in a deadlock.

> The only resource that matters for this discussion is associated with the
> LLD itself, not with any of its hosts: the host template.  Once the SCSI
> core has released all references to the template, it can't call the LLD
> any more.  The problem is that the LLD has no way to know when all the
> references have been dropped.  This suggests that the entire problem could
> be solved by adding a kref to struct scsi_host_template.

Why should the LLD care if the core might want to access it after the
LLD made the prescribed API calls for host removal? It is the core's
responsibility that it never enters the host again after these API
calls were invoked.
-- 
Stefan Richter
-=====-=-=-= =--- --===
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-07 14:59 Synchronizing scsi_remove_host and the error handler Alan Stern
  2005-08-07 15:36 ` James Bottomley
@ 2005-08-08 18:04 ` Luben Tuikov
  1 sibling, 0 replies; 30+ messages in thread
From: Luben Tuikov @ 2005-08-08 18:04 UTC (permalink / raw)
  To: Alan Stern; +Cc: Mike Anderson, SCSI development list

On 08/07/05 10:59, Alan Stern wrote:
> Mike:
> 
> What sort of synchronization is there between scsi_remove_host and the 
> error-handler thread?  Offhand I can see two possible problems, depending 
> on how the LLD is written:
> 
>      1.	The error handler asks the LLD to perform a reset just as the LLD
> 	calls scsi_remove_host as part of rmmod.  scsi_remove_host 
> 	returns, the LLD's remove method returns, and the LLD is unloaded
> 	from memory while its reset routine is executing.
> 
>      2. The error handler asks the LLD to perform a reset just before the
> 	LLD calls scsi_remove_host.  The LLD's remove method can't return
> 	until the reset completes (otherwise the reset would be executing 
> 	code that is about to be unloaded, as in (1)).  But the reset
> 	can't get started until it acquires the host adapter's device 
> 	semaphore, which is owned by the rmmod thread.
> 
> (1) causes an oops and (2) causes a deadlock.  Note that (1) can be 
> avoided by making scis_remove_host wait until the error handler thread has 
> exited, but that will make (2) much more likely to happen.

Hmm, this doesn't sound good.

I'm thinking, if the LLDD knows that it's being unloaded, then it can
return the EH reset with success de facto.

> In general, there needs to be some way for the error handler to prevent
> the LLD from being unloaded.  I don't know what the best answer is.  Has
> anyone thought about this?

Two ways:
	1) Let one entity be the active one and the other the passive one,
	   always.  That is, let the LLDD send an "event" and then let the
	   Managing layer decide what to do.
	2) Tack a kobject, or kset to each struct you have and set the
	   parent pointer accordingly to how the struct objects relate
	   to each other.  So in effect, the kobj relationship would
	   represent how those objects actually relate to each other
	   in the kernel.  Advantage you get is the automatic kref.

... or something like this. ;-)

	Luben

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-07 15:36 ` James Bottomley
  2005-08-07 18:43   ` Alan Stern
@ 2005-08-08 18:07   ` Luben Tuikov
  1 sibling, 0 replies; 30+ messages in thread
From: Luben Tuikov @ 2005-08-08 18:07 UTC (permalink / raw)
  To: James Bottomley; +Cc: Alan Stern, Mike Anderson, SCSI development list

On 08/07/05 11:36, James Bottomley wrote:
> On Sun, 2005-08-07 at 10:59 -0400, Alan Stern wrote:
> 
>>What sort of synchronization is there between scsi_remove_host and the 
>>error-handler thread?  Offhand I can see two possible problems, depending 
>>on how the LLD is written:
> 
> 
> There isn't any by design.
> 
> I think you're not thinking about how this works correctly.  What remove
> host does is loop over the active devices removing them from visibility
> and trying to do a final put on the generic devices before removing the
> host from visibility and doing a final put on it.
> 
> However, any outstanding user will have a reference and will keep all
> the bits of the hierarchy in place until that reference is relinquished.

Which automatically implies that any such entity (holding a ref)
trying to do any kind of action to what it is holding, should get
an error result, else it would be misled to believe that things are ok,
when in fact the whole thing is coming down...

	Luben

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-07 21:57     ` James Bottomley
@ 2005-08-08 18:24       ` Luben Tuikov
  2005-08-08 19:54         ` Mike Anderson
  2005-08-08 20:06       ` Mike Anderson
  1 sibling, 1 reply; 30+ messages in thread
From: Luben Tuikov @ 2005-08-08 18:24 UTC (permalink / raw)
  To: James Bottomley; +Cc: Alan Stern, Mike Anderson, SCSI development list

On 08/07/05 17:57, James Bottomley wrote:
> Alan Stern wrote:
>>The only resource that matters for this discussion is associated with the
>>LLD itself, not with any of its hosts: the host template.  Once the SCSI
>>core has released all references to the template, it can't call the LLD
>>any more.  The problem is that the LLD has no way to know when all the
>>references have been dropped.  This suggests that the entire problem could
>>be solved by adding a kref to struct scsi_host_template.
>>
>>Would you agree to a patch adding such a kref?
> 
> 
> Well, not really.  The host template usually exists as a variable in the
> module, so its lifetime is tied to the lifetime of the module.  Adding a
> kref wouldn't help because it will still be freed when the module is
> removed.  If there's a case where the template is being freed
> prematurely because the module is being removed, then we have the module
> refcounting wrong somewhere.  Have you run across such a case?

Hmm, I think Alan has a point.

>From object point of view, who is the parent and who is the child
when talking about LLDD/module and the host template?

The reason is that the host template could be simulated on behalf
of the underlying transport (as is the case for USB).

So, it could be the case that the host template _should_ be
removed but the entity which removed it should stay, and that
entity wants to know when the host template is to be removed.

(Actually it doesn't, but setting the release method in the
kobj_ktype would do wonders. ;-) )

In which case, it does make sense to include a kset to the
host template (since it has many children), and anyone
using/manipulating it does kset_get().  When that entity
is done with the host template it does a kset_put().

Such entities could be the managing layer above, error
handling, etc.

Another solution, just as good, is to use the template
_only_ for the registration call, as a _template_, and
as soon as the registration call has returned, the
caller (LLDD/module) can free the template.  After all,
it is only a template.

Then the managing layer allocates the actual host struct
and "gives" it to the LLDD.  The LLDD goes a kset_get()
on it while it lives, and when it is to die, it does a
kset_put().  And if it gets a method call, it would
error it out, after the put...

Either way, event relationship must be very well defined.

	Luben

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 18:24       ` Luben Tuikov
@ 2005-08-08 19:54         ` Mike Anderson
  2005-08-08 20:25           ` Luben Tuikov
  0 siblings, 1 reply; 30+ messages in thread
From: Mike Anderson @ 2005-08-08 19:54 UTC (permalink / raw)
  To: Luben Tuikov; +Cc: James Bottomley, Alan Stern, SCSI development list

Luben Tuikov <luben_tuikov@adaptec.com> wrote:
> On 08/07/05 17:57, James Bottomley wrote:
> > Alan Stern wrote:
> >>The only resource that matters for this discussion is associated with the
> >>LLD itself, not with any of its hosts: the host template.  Once the SCSI
> >>core has released all references to the template, it can't call the LLD
> >>any more.  The problem is that the LLD has no way to know when all the
> >>references have been dropped.  This suggests that the entire problem could
> >>be solved by adding a kref to struct scsi_host_template.
> >>
> >>Would you agree to a patch adding such a kref?
> > 
> > 
> > Well, not really.  The host template usually exists as a variable in the
> > module, so its lifetime is tied to the lifetime of the module.  Adding a
> > kref wouldn't help because it will still be freed when the module is
> > removed.  If there's a case where the template is being freed
> > prematurely because the module is being removed, then we have the module
> > refcounting wrong somewhere.  Have you run across such a case?
> 
> Hmm, I think Alan has a point.
> 
> >From object point of view, who is the parent and who is the child
> when talking about LLDD/module and the host template?
> 
> The reason is that the host template could be simulated on behalf
> of the underlying transport (as is the case for USB).
> 
> So, it could be the case that the host template _should_ be
> removed but the entity which removed it should stay, and that
> entity wants to know when the host template is to be removed.
> 
> (Actually it doesn't, but setting the release method in the
> kobj_ktype would do wonders. ;-) )
> 
> In which case, it does make sense to include a kset to the
> host template (since it has many children), and anyone
> using/manipulating it does kset_get().  When that entity
> is done with the host template it does a kset_put().
> 
> Such entities could be the managing layer above, error
> handling, etc.
> 
> Another solution, just as good, is to use the template
> _only_ for the registration call, as a _template_, and
> as soon as the registration call has returned, the
> caller (LLDD/module) can free the template.  After all,
> it is only a template.

While you could do this I believe this would not solve Alan's question as
if I read correctly it involved the functions called through the host
template (i.e., through shost->hostt).

> 
> Then the managing layer allocates the actual host struct
> and "gives" it to the LLDD.  The LLDD goes a kset_get()
> on it while it lives, and when it is to die, it does a
> kset_put().  And if it gets a method call, it would
> error it out, after the put...
> 

We already have kobject for the scsi_host object so it is unclear why
switch to a kset. In looking at the kernel tree I do not see any users of
ksets outside the driver model subsystems. I believe James suggestion of
adding to scsi_host_dev_release would allow the LLDD to get a better
indicator of release cleanup without changing the LLDD interface to use
ksets.



-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-07 21:57     ` James Bottomley
  2005-08-08 18:24       ` Luben Tuikov
@ 2005-08-08 20:06       ` Mike Anderson
  1 sibling, 0 replies; 30+ messages in thread
From: Mike Anderson @ 2005-08-08 20:06 UTC (permalink / raw)
  To: James Bottomley; +Cc: Alan Stern, SCSI development list

James Bottomley <James.Bottomley@SteelEye.com> wrote:
> On Sun, 2005-08-07 at 14:43 -0400, Alan Stern wrote:
> > What host device release method?  scsi_host_template->release is marked 
> > OBSOLETE and for use only with old-style drivers.  scsi_host_dev_release 
>                                                      ^^^^^
> This is the one I was thinking of adding to.

Is the thought here that if a LLDD provided some new scsi_host_template
function we would call this from scsi_host_dev_release?

-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 19:54         ` Mike Anderson
@ 2005-08-08 20:25           ` Luben Tuikov
  0 siblings, 0 replies; 30+ messages in thread
From: Luben Tuikov @ 2005-08-08 20:25 UTC (permalink / raw)
  To: Mike Anderson; +Cc: James Bottomley, Alan Stern, SCSI development list

On 08/08/05 15:54, Mike Anderson wrote:
> I believe James suggestion of
> adding to scsi_host_dev_release would allow the LLDD to get a better
> indicator of release cleanup without changing the LLDD interface to use
> ksets.

Ah, yes, I completely agree.

	Luben

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-07 22:15     ` Stefan Richter
@ 2005-08-08 20:41       ` Alan Stern
  2005-08-08 21:36         ` Mike Anderson
                           ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Alan Stern @ 2005-08-08 20:41 UTC (permalink / raw)
  To: Stefan Richter
  Cc: SCSI development list, James Bottomley, Luben Tuikov,
	Mike Anderson

On Mon, 8 Aug 2005, Stefan Richter wrote:

> Alan Stern wrote:
> > Anyhow, I'm not so concerned about releasing resources associated with the
> > host.  The LLD can handle that easily enough by first flagging the HBA
> > as not accessible, then waiting for current I/O to terminate, then
> > calling scsi_remove_host.  The resources can be released any time after
> > the current I/O is finished; it doesn't matter whether the host is removed
> > before or after.
> 
> What do you mean by "flagging the HBA as not accessible"?

I mean: setting a flag in the LLD's private data structure so that the LLD 
will know to immediately fail all future calls to queuecommand, 
device_reset, bus_reset, etc.

> Why should the LLD care if the core might want to access it after the
> LLD made the prescribed API calls for host removal? It is the core's
> responsibility that it never enters the host again after these API
> calls were invoked.

Here you are wrong.  In fact the core makes no such guarantees.  It _will_
try to enter the host (for things like telling disk drives to flush
their caches) for as long as it retains a reference to the host structure.

On Sun, 7 Aug 2005, James Bottomley wrote:

> > The only resource that matters for this discussion is associated with the
> > LLD itself, not with any of its hosts: the host template.  Once the SCSI 
> > core has released all references to the template, it can't call the LLD  
> > any more.  The problem is that the LLD has no way to know when all the   
> > references have been dropped.  This suggests that the entire problem could
> > be solved by adding a kref to struct scsi_host_template.
> > 
> > Would you agree to a patch adding such a kref?
> 
> Well, not really.  The host template usually exists as a variable in the
> module, so its lifetime is tied to the lifetime of the module.  Adding a
> kref wouldn't help because it will still be freed when the module is
> removed.  If there's a case where the template is being freed
> prematurely because the module is being removed, then we have the module
> refcounting wrong somewhere.  Have you run across such a case?

I tried to provoke such a case, but failed.  Evidently I've been barking
up the wrong tree.  Normally the SCSI core doesn't call the LLD unless
some process has opened a device file for something on that host.  This
will automatically do try_module_get on the LLD, making it impossible for
the LLD to be removed from memory.

I'm not certain this reasoning is 100% reliable, though -- does it cover
_every_ case where the core calls the LLD?  Maybe something like this 
little patch would be a good idea:

Index: usb-2.6/include/scsi/scsi_host.h
===================================================================
--- usb-2.6.orig/include/scsi/scsi_host.h
+++ usb-2.6/include/scsi/scsi_host.h
@@ -461,6 +461,7 @@ struct Scsi_Host {
 	spinlock_t		default_lock;
 	spinlock_t		*host_lock;

+	struct device_driver	*parent_driver;
 	struct semaphore	scan_mutex;/* serialize scanning activity */

 	struct list_head	eh_cmd_q;
Index: usb-2.6/drivers/scsi/hosts.c
===================================================================
--- usb-2.6.orig/drivers/scsi/hosts.c
+++ usb-2.6/drivers/scsi/hosts.c
@@ -205,6 +205,9 @@ int scsi_add_host(struct Scsi_Host *shos
 		goto out_destroy_host;

 	scsi_proc_host_add(shost);
+	shost->parent_driver = dev->driver;
+	if (shost->parent_driver)
+		get_driver(shost->parent_driver);
 	return error;

  out_destroy_host:
@@ -244,6 +247,8 @@ static void scsi_host_dev_release(struct

 	if (parent)
 		put_device(parent);
+	if (shost->parent_driver)
+		put_driver(shost->parent_driver);
 	kfree(shost);
 }

But if you think this isn't needed, then okay.

There is an exceptional case: device scanning.  During scanning, there is
nothing to directly prevent the LLD from being unloaded.  There is an
indirect protection, however, because the scanning thread will own
shost->scan_mutex, and scsi_remove_host acquires the mutex before
returning.

On Mon, 8 Aug 2005, Luben Tuikov wrote:

> Hmm, I think Alan has a point.
> 
> From object point of view, who is the parent and who is the child
> when talking about LLDD/module and the host template?
> 
> The reason is that the host template could be simulated on behalf
> of the underlying transport (as is the case for USB).
> 
> So, it could be the case that the host template _should_ be
> removed but the entity which removed it should stay, and that
> entity wants to know when the host template is to be removed.

I don't think things are quite this bad.  Host templates are static data
(normally) and they remain available as long as the LLD is present in
memory.  In fact, I've been using the template as kind of a surrogate,
standing for all of the LLD's module -- so long as one is present, so is
the other.

I still have one related question.  This is a little bit off to one 
side, but maybe you folks can suggest possible solutions.  The question 
concerns a deadlock I _was_ able to generate earlier today with a patched
usb-storage.

My USB mass-storage test device doesn't respond to TEST UNIT READY, so it
causes a timeout and kicks the error handler into action.  This happens 
during device scanning, just prior to reading the partition table.  The 
error handler goes through various stages of processing, leading up to a 
bus reset.  I disconnected the USB device just before the bus reset 
routine was called.

Now, usb-storage implements a "SCSI bus reset" by actually performing a
USB port reset.  The USB subsystem requires the caller to acquire a
device-specific semaphore before doing a port reset, and the subsystem
itself acquires this same semaphore when notifying drivers about a
disconnection.  (The idea is that we don't want drivers trying to handle
a disconnect and a reset on the same device at the same time.)

So here's how things end up.

	The scanning thread owns shost->scan_mutex and is waiting
	for the error handler to finish.

	The EH thread is executing usb-storage's bus_reset routine
	and is waiting to acquire the device semaphore.

	USB's khubd thread owns the device semaphore and has invoked
	the usb-storage disconnect routine.  Among other things, this
	routine calls scsi_remove_host, which tries to acquire the
	scan_mutex.

How should this deadlock be resolved?  The current code has an extremely 
inelegant solution, and I would like to find a better one.  Any ideas?

Alan Stern

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 20:41       ` Alan Stern
@ 2005-08-08 21:36         ` Mike Anderson
  2005-08-08 22:36           ` Alan Stern
  2005-08-08 21:49         ` Stefan Richter
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 30+ messages in thread
From: Mike Anderson @ 2005-08-08 21:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: Stefan Richter, SCSI development list, James Bottomley,
	Luben Tuikov

Alan Stern <stern@rowland.harvard.edu> wrote:
> Here you are wrong.  In fact the core makes no such guarantees.  It _will_
> try to enter the host (for things like telling disk drives to flush
> their caches) for as long as it retains a reference to the host structure.

Well post scsi_remove_host returning no commands can be sent through the
LLDD's queuecommand. There is still the issue you previously mentioned
about a possible race of the scsi error handler and a call to
scsi_remove_host.


> I tried to provoke such a case, but failed.  Evidently I've been barking
> up the wrong tree.  Normally the SCSI core doesn't call the LLD unless
> some process has opened a device file for something on that host.  This
> will automatically do try_module_get on the LLD, making it impossible for
> the LLD to be removed from memory.
> 
> I'm not certain this reasoning is 100% reliable, though -- does it cover
> _every_ case where the core calls the LLD?  Maybe something like this 
> little patch would be a good idea:

In USB is a driver controlling more than one host instance. As I believe
the suggestion that James indicated as a possible addition would be at per
host instance granularity and your solution below looks like a
notification once all instances controlled by a driver are gone.

> --- usb-2.6.orig/drivers/scsi/hosts.c
> +++ usb-2.6/drivers/scsi/hosts.c
> @@ -205,6 +205,9 @@ int scsi_add_host(struct Scsi_Host *shos
>  		goto out_destroy_host;
>  
>  	scsi_proc_host_add(shost);
> +	shost->parent_driver = dev->driver;
> +	if (shost->parent_driver)
> +		get_driver(shost->parent_driver);

Just a nit, for a general case solution dev can be null for drivers who live on a
bus that is not converted to the driver model and may not have a parent.

> But if you think this isn't needed, then okay.
> 
> There is an exceptional case: device scanning.  During scanning, there is
> nothing to directly prevent the LLD from being unloaded.  There is an
> indirect protection, however, because the scanning thread will own
> shost->scan_mutex, and scsi_remove_host acquires the mutex before
> returning.

I think in the past someone has indicated that we really should be getting
refs during scanning to hold things in place. I guess as you note above
and below scan_mutex is holding things in place.


> memory.  In fact, I've been using the template as kind of a surrogate,
> standing for all of the LLD's module -- so long as one is present, so is
> the other.
> 
> 
> I still have one related question.  This is a little bit off to one 
> side, but maybe you folks can suggest possible solutions.  The question 
> concerns a deadlock I _was_ able to generate earlier today with a patched
> usb-storage.
> 
> My USB mass-storage test device doesn't respond to TEST UNIT READY, so it
> causes a timeout and kicks the error handler into action.  This happens 
> during device scanning, just prior to reading the partition table.  The 
> error handler goes through various stages of processing, leading up to a 
> bus reset.  I disconnected the USB device just before the bus reset 
> routine was called.
> 
> Now, usb-storage implements a "SCSI bus reset" by actually performing a
> USB port reset.  The USB subsystem requires the caller to acquire a
> device-specific semaphore before doing a port reset, and the subsystem
> itself acquires this same semaphore when notifying drivers about a
> disconnection.  (The idea is that we don't want drivers trying to handle
> a disconnect and a reset on the same device at the same time.)
> 
> So here's how things end up.
> 
> 	The scanning thread owns shost->scan_mutex and is waiting
> 	for the error handler to finish.
> 
> 	The EH thread is executing usb-storage's bus_reset routine
> 	and is waiting to acquire the device semaphore.
> 
> 	USB's khubd thread owns the device semaphore and has invoked
> 	the usb-storage disconnect routine.  Among other things, this
> 	routine calls scsi_remove_host, which tries to acquire the
> 	scan_mutex.
> 
> How should this deadlock be resolved?  The current code has an extremely 
> inelegant solution, and I would like to find a better one.  Any ideas?
> 
> Alan Stern
> 

I do not mean to push everything down into the LLDD, but in this case it
is unclear what type of protection the scsi mid layer could add to protect
against unknown future events while scanning. Is the current inelegant
solution in the usb storage driver using some form of state model. It
would appear that if it is not that a state model that uses a spin lock or
something other than device semaphore to determine a disconnect was
happening during a reset or vis-versa would be a good idea.

-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 20:41       ` Alan Stern
  2005-08-08 21:36         ` Mike Anderson
@ 2005-08-08 21:49         ` Stefan Richter
  2005-08-08 22:20         ` Luben Tuikov
  2005-08-08 22:23         ` James Bottomley
  3 siblings, 0 replies; 30+ messages in thread
From: Stefan Richter @ 2005-08-08 21:49 UTC (permalink / raw)
  To: SCSI development list
  Cc: Alan Stern, James Bottomley, Luben Tuikov, Mike Anderson

Alan Stern wrote:
> On Mon, 8 Aug 2005, Stefan Richter wrote:
>>Why should the LLD care if the core might want to access it after the
>>LLD made the prescribed API calls for host removal? It is the core's
>>responsibility that it never enters the host again after these API
>>calls were invoked.
> 
> Here you are wrong.  In fact the core makes no such guarantees.  It _will_
> try to enter the host (for things like telling disk drives to flush
> their caches) for as long as it retains a reference to the host structure.

Sure. But after all high-level drivers were detached (and that should
have happend right before scsi_remove_host returns) I don't see why
the host's ref count might not be down to zero.

[...]
> My USB mass-storage test device doesn't respond to TEST UNIT READY, so it
> causes a timeout and kicks the error handler into action.  This happens 
> during device scanning, just prior to reading the partition table.  The 
> error handler goes through various stages of processing, leading up to a 
> bus reset.  I disconnected the USB device just before the bus reset 
> routine was called.
> 
> Now, usb-storage implements a "SCSI bus reset" by actually performing a
> USB port reset.  The USB subsystem requires the caller to acquire a
> device-specific semaphore before doing a port reset, and the subsystem
> itself acquires this same semaphore when notifying drivers about a
> disconnection.  (The idea is that we don't want drivers trying to handle
> a disconnect and a reset on the same device at the same time.)
> 
> So here's how things end up.
> 
> 	The scanning thread owns shost->scan_mutex and is waiting
> 	for the error handler to finish.
> 
> 	The EH thread is executing usb-storage's bus_reset routine
> 	and is waiting to acquire the device semaphore.
> 
> 	USB's khubd thread owns the device semaphore and has invoked
> 	the usb-storage disconnect routine.  Among other things, this
> 	routine calls scsi_remove_host, which tries to acquire the
> 	scan_mutex.
> 
> How should this deadlock be resolved?  The current code has an extremely 
> inelegant solution, and I would like to find a better one.  Any ideas?

Can't the eh_*_reset_handler use down_*_trylock? If the semaphore was
already down, there should be a means for the reset handler to figure
out the reason so that it can back out in an appropriate way. I hope
there is a limited set of reasons...
-- 
Stefan Richter
-=====-=-=-= =--- -=---
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 20:41       ` Alan Stern
  2005-08-08 21:36         ` Mike Anderson
  2005-08-08 21:49         ` Stefan Richter
@ 2005-08-08 22:20         ` Luben Tuikov
  2005-08-08 22:30           ` Alan Stern
  2005-08-08 22:23         ` James Bottomley
  3 siblings, 1 reply; 30+ messages in thread
From: Luben Tuikov @ 2005-08-08 22:20 UTC (permalink / raw)
  To: Alan Stern
  Cc: Stefan Richter, SCSI development list, James Bottomley,
	Mike Anderson

On 08/08/05 16:41, Alan Stern wrote:
> I still have one related question.  This is a little bit off to one 
> side, but maybe you folks can suggest possible solutions.  The question 
> concerns a deadlock I _was_ able to generate earlier today with a patched
> usb-storage.
> 
> My USB mass-storage test device doesn't respond to TEST UNIT READY, so it
> causes a timeout and kicks the error handler into action.  This happens 
> during device scanning, just prior to reading the partition table.  The 
> error handler goes through various stages of processing, leading up to a 
> bus reset.  I disconnected the USB device just before the bus reset 
> routine was called.

I think that "scanning" is a special process which should involve
the minimum of error handling, by either ignoring errors and trying
to connect to the device anyway, or on the first error, give up
the device.  Which policy would one follow depends on the transport.

If the latter, you need to blacklist the device as not supporting
TUR.  Then on any error, like you pulling the cable during scanning,
the scanning process will give up and all will be well.

	Luben

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 20:41       ` Alan Stern
                           ` (2 preceding siblings ...)
  2005-08-08 22:20         ` Luben Tuikov
@ 2005-08-08 22:23         ` James Bottomley
  2005-08-08 22:40           ` Alan Stern
  3 siblings, 1 reply; 30+ messages in thread
From: James Bottomley @ 2005-08-08 22:23 UTC (permalink / raw)
  To: Alan Stern
  Cc: Stefan Richter, SCSI development list, Luben Tuikov,
	Mike Anderson

On Mon, 2005-08-08 at 16:41 -0400, Alan Stern wrote:
> > > Would you agree to a patch adding such a kref?
> > 
> > Well, not really.  The host template usually exists as a variable in the
> > module, so its lifetime is tied to the lifetime of the module.  Adding a
> > kref wouldn't help because it will still be freed when the module is
> > removed.  If there's a case where the template is being freed
> > prematurely because the module is being removed, then we have the module
> > refcounting wrong somewhere.  Have you run across such a case?
> 
> I tried to provoke such a case, but failed.  Evidently I've been barking
> up the wrong tree.  Normally the SCSI core doesn't call the LLD unless
> some process has opened a device file for something on that host.  This
> will automatically do try_module_get on the LLD, making it impossible for
> the LLD to be removed from memory.
> 
> I'm not certain this reasoning is 100% reliable, though -- does it cover
> _every_ case where the core calls the LLD?  Maybe something like this 
> little patch would be a good idea:

Yes, I was considering something similar, since all the last put of a
driver does is send the completion that driver_unregister() should be
waiting for

> +	struct device_driver	*parent_driver;

I don't think we need this.  The underlying device has to be the parent
of shost_gendev, so you can get the parent driver as

shost_gendev.parent->driver

> +	if (shost->parent_driver)
> +		put_driver(shost->parent_driver);

And just before this would be the place to do the final release of all
the resources the HBA is holding.

James



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 22:20         ` Luben Tuikov
@ 2005-08-08 22:30           ` Alan Stern
  2005-08-09  0:49             ` Luben Tuikov
  0 siblings, 1 reply; 30+ messages in thread
From: Alan Stern @ 2005-08-08 22:30 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Stefan Richter, SCSI development list, James Bottomley,
	Mike Anderson

On Mon, 8 Aug 2005, Luben Tuikov wrote:

> On 08/08/05 16:41, Alan Stern wrote:
> > I still have one related question.  This is a little bit off to one 
> > side, but maybe you folks can suggest possible solutions.  The question 
> > concerns a deadlock I _was_ able to generate earlier today with a patched
> > usb-storage.
> > 
> > My USB mass-storage test device doesn't respond to TEST UNIT READY, so it
> > causes a timeout and kicks the error handler into action.  This happens 
> > during device scanning, just prior to reading the partition table.  The 
> > error handler goes through various stages of processing, leading up to a 
> > bus reset.  I disconnected the USB device just before the bus reset 
> > routine was called.
> 
> I think that "scanning" is a special process which should involve
> the minimum of error handling, by either ignoring errors and trying
> to connect to the device anyway, or on the first error, give up
> the device.  Which policy would one follow depends on the transport.

No, that's not feasible.  We can't just ignore errors, and we do have to 
cope with them.  Scanning is a particular vulnerable time, since it 
involves sending commands that don't occur most of the time during normal 
operation.

> If the latter, you need to blacklist the device as not supporting
> TUR.  Then on any error, like you pulling the cable during scanning,
> the scanning process will give up and all will be well.

You can't blacklist devices you don't know about.  The kernel should work 
regardless.

Alan Stern


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 21:36         ` Mike Anderson
@ 2005-08-08 22:36           ` Alan Stern
  2005-08-08 23:10             ` Stefan Richter
  2005-08-08 23:59             ` Stefan Richter
  0 siblings, 2 replies; 30+ messages in thread
From: Alan Stern @ 2005-08-08 22:36 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Stefan Richter, SCSI development list, James Bottomley,
	Luben Tuikov

On Mon, 8 Aug 2005, Stefan Richter wrote:

> > Here you are wrong.  In fact the core makes no such guarantees.  It _will_
> > try to enter the host (for things like telling disk drives to flush
> > their caches) for as long as it retains a reference to the host structure.
> 
> Sure. But after all high-level drivers were detached (and that should
> have happend right before scsi_remove_host returns) I don't see why  
> the host's ref count might not be down to zero.

There might still be outstanding references: processes holding files open, 
that sort of thing.  It happens all the time in refcounting systems.


On Mon, 8 Aug 2005, Mike Anderson wrote:

> Well post scsi_remove_host returning no commands can be sent through the
> LLDD's queuecommand. There is still the issue you previously mentioned
> about a possible race of the scsi error handler and a call to
> scsi_remove_host.

Yes, and there's other pathway too: the procfs interface.


> In USB is a driver controlling more than one host instance. As I believe
> the suggestion that James indicated as a possible addition would be at per
> host instance granularity and your solution below looks like a
> notification once all instances controlled by a driver are gone.

That's right.  They are not the same thing.

> > --- usb-2.6.orig/drivers/scsi/hosts.c
> > +++ usb-2.6/drivers/scsi/hosts.c
> > @@ -205,6 +205,9 @@ int scsi_add_host(struct Scsi_Host *shos
> >  		goto out_destroy_host;
> >  
> >  	scsi_proc_host_add(shost);
> > +	shost->parent_driver = dev->driver;
> > +	if (shost->parent_driver)
> > +		get_driver(shost->parent_driver);
> 
> Just a nit, for a general case solution dev can be null for drivers who live on a
> bus that is not converted to the driver model and may not have a parent.

Okay, so the patch needs to test for (dev != NULL) before making the 
assignment.  The idea is still sound.


> > So here's how things end up.
> > 
> > 	The scanning thread owns shost->scan_mutex and is waiting
> > 	for the error handler to finish.
> > 
> > 	The EH thread is executing usb-storage's bus_reset routine
> > 	and is waiting to acquire the device semaphore.
> > 
> > 	USB's khubd thread owns the device semaphore and has invoked
> > 	the usb-storage disconnect routine.  Among other things, this
> > 	routine calls scsi_remove_host, which tries to acquire the
> > 	scan_mutex.
> > 
> > How should this deadlock be resolved?  The current code has an extremely 
> > inelegant solution, and I would like to find a better one.  Any ideas?
> > 
> > Alan Stern
> > 
> 
> I do not mean to push everything down into the LLDD, but in this case it
> is unclear what type of protection the scsi mid layer could add to protect
> against unknown future events while scanning. Is the current inelegant
> solution in the usb storage driver using some form of state model. It
> would appear that if it is not that a state model that uses a spin lock or
> something other than device semaphore to determine a disconnect was
> happening during a reset or vis-versa would be a good idea.

On Mon, 8 Aug 2005, Stefan Richter wrote:

> Can't the eh_*_reset_handler use down_*_trylock? If the semaphore was
> already down, there should be a means for the reset handler to figure
> out the reason so that it can back out in an appropriate way. I hope 
> there is a limited set of reasons...

To answer both of you at once...  The current code does use
down_*_trylock, and it does check the device state for
disconnect-in-progress.  If things don't work out, it sleeps for a short
while and then tries again.  Like I said, it's inelegant.

The underlying problem is that as things stand, the natural order of
locking is device semaphore first, scan_mutex second -- that's what
happens during a disconnect.  (It's also what would happen during a probe,
if we performed device scanning from the probe routine.)  But when the
error handler tries to do a bus reset during scanning, the locking order
is reversed.  It's a hard problem.  The only reason I've kept the current
code is because USB port resets occur very infrequently.

Alan Stern


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 22:23         ` James Bottomley
@ 2005-08-08 22:40           ` Alan Stern
  0 siblings, 0 replies; 30+ messages in thread
From: Alan Stern @ 2005-08-08 22:40 UTC (permalink / raw)
  To: James Bottomley
  Cc: Stefan Richter, SCSI development list, Luben Tuikov,
	Mike Anderson

On Mon, 8 Aug 2005, James Bottomley wrote:

> Yes, I was considering something similar, since all the last put of a
> driver does is send the completion that driver_unregister() should be
> waiting for

That's exactly how I came to think of it.

> 
> > +	struct device_driver	*parent_driver;
> 
> I don't think we need this.  The underlying device has to be the parent
> of shost_gendev, so you can get the parent driver as
> 
> shost_gendev.parent->driver

The driver may have completely unbound from the parent device by the time
shost_gendev is released, so .parent->driver may already be NULL.

> > +	if (shost->parent_driver)
> > +		put_driver(shost->parent_driver);
> 
> And just before this would be the place to do the final release of all
> the resources the HBA is holding.

Right.

Alan Stern


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 22:36           ` Alan Stern
@ 2005-08-08 23:10             ` Stefan Richter
  2005-08-09 14:23               ` Alan Stern
  2005-08-08 23:59             ` Stefan Richter
  1 sibling, 1 reply; 30+ messages in thread
From: Stefan Richter @ 2005-08-08 23:10 UTC (permalink / raw)
  To: SCSI development list; +Cc: Alan Stern

Alan Stern wrote:
> On Mon, 8 Aug 2005, Stefan Richter wrote:
>>But after all high-level drivers were detached (and that should
>>have happend right before scsi_remove_host returns) I don't see why  
>>the host's ref count might not be down to zero.
> 
> There might still be outstanding references: processes holding files 
> open, that sort of thing.

But as long as files are open etc., scsi high-level drivers are
'in use', cannot be detached, and scsi_remove_host does not return.

Or that's what I assumed so far.
-- 
Stefan Richter
-=====-=-=-= =--- -=--=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 22:36           ` Alan Stern
  2005-08-08 23:10             ` Stefan Richter
@ 2005-08-08 23:59             ` Stefan Richter
  2005-08-09 14:40               ` Alan Stern
  1 sibling, 1 reply; 30+ messages in thread
From: Stefan Richter @ 2005-08-08 23:59 UTC (permalink / raw)
  To: SCSI development list; +Cc: Alan Stern, Mike Anderson

Alan Stern wrote:
> The current code does use
> down_*_trylock, and it does check the device state for
> disconnect-in-progress.  If things don't work out, it sleeps
> for a short while and then tries again.

I think the error handler should rather give SUCCESS back to
scsi if it knows the device was or is being disconnected.

After that, commands would be enqueued again, and these
commands would be immediately done() with an appropriate
result code.

IMO the error handler should return SUCCESS rather than
FAILED because, if the device is diconnected, there is
simply nothing left to do that could advance recovery
from error. Retries (in the error handler) will not help
with error recovery either. Or do you retry in hope that
the device might be reconnected and then a device reset
should happen?

A more general thought: It appears from the example you
explained that maybe the USB highlevel carries too much of a
burden to watch out for concurrency. Tasks like to mutually
exclude port resets and disconnects --- which seems to me to
be a task unspecific to highlevel protocols, correct me if
I'm wrong --- might be better moved into the USB core. Then
the highlevel would have to worry less about how exactly to
avoid races and deadlocks, only about _what_ to do if there
is a conflict. (The kind of conflict should be indicated by
the core by means of return values or status flags or the
like.)

But I am of course speaking without any knowledge about USB
and Linux' USB stack.
-- 
Stefan Richter
-=====-=-=-= =--- -=--=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 22:30           ` Alan Stern
@ 2005-08-09  0:49             ` Luben Tuikov
  2005-08-09 14:50               ` Alan Stern
  0 siblings, 1 reply; 30+ messages in thread
From: Luben Tuikov @ 2005-08-09  0:49 UTC (permalink / raw)
  To: Alan Stern, Luben Tuikov
  Cc: Stefan Richter, SCSI development list, James Bottomley,
	Mike Anderson

--- Alan Stern <stern@rowland.harvard.edu> wrote:
> > I think that "scanning" is a special process which should involve
> > the minimum of error handling, by either ignoring errors and trying
> > to connect to the device anyway, or on the first error, give up
> > the device.  Which policy would one follow depends on the transport.
> 
> No, that's not feasible.  We can't just ignore errors, and we do have to 
> cope with them.  Scanning is a particular vulnerable time, since it 
> involves sending commands that don't occur most of the time during normal 
> operation.

Your reasoning is correct, but your conclusion is not.

Exactly, scanning is a "particularly vulnerable time" and this is why
we do not want the full I_T nexus error recovery.  At scanning time
we're just "probing" here and there.
 
> > If the latter, you need to blacklist the device as not supporting
> > TUR.  Then on any error, like you pulling the cable during scanning,
> > the scanning process will give up and all will be well.
> 
> You can't blacklist devices you don't know about.  The kernel should work 
> regardless.

Hmm, I thought there was a black listing somewhere in scsi, maybe
/proc/scsi/device_info?

So then you do scheme #1 as I described: ignore as much as possible and try
to establish an I_T nexus and _then_ try to poke around with the device.

      Luben


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 23:10             ` Stefan Richter
@ 2005-08-09 14:23               ` Alan Stern
  2005-08-09 19:37                 ` Stefan Richter
  0 siblings, 1 reply; 30+ messages in thread
From: Alan Stern @ 2005-08-09 14:23 UTC (permalink / raw)
  To: Stefan Richter; +Cc: SCSI development list

On Tue, 9 Aug 2005, Stefan Richter wrote:

> Alan Stern wrote:
> > On Mon, 8 Aug 2005, Stefan Richter wrote:
> >>But after all high-level drivers were detached (and that should
> >>have happend right before scsi_remove_host returns) I don't see why  
> >>the host's ref count might not be down to zero.
> > 
> > There might still be outstanding references: processes holding files 
> > open, that sort of thing.
> 
> But as long as files are open etc., scsi high-level drivers are
> 'in use', cannot be detached, and scsi_remove_host does not return.
> 
> Or that's what I assumed so far.

No, that's not right.  scsi_remove_host _can_ return while the high-level
drivers are still in use -- just think about what happens when you unplug
a USB storage device without unmounting it first.  I'm not sure what you 
mean about "detach"ing high-level SCSI drivers.  The high-level drivers 
are notified about the device removal; they drop their references and 
return.  They can still submit commands if they want, but all such 
submissions will fail in the SCSI core.

What can't happen is that the LLD's module can't be unloaded from memory.  
That's prevented by the open files holding a reference to the module.

Alan Stern

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-08 23:59             ` Stefan Richter
@ 2005-08-09 14:40               ` Alan Stern
  2005-08-09 18:21                 ` Stefan Richter
  0 siblings, 1 reply; 30+ messages in thread
From: Alan Stern @ 2005-08-09 14:40 UTC (permalink / raw)
  To: Stefan Richter; +Cc: SCSI development list, Mike Anderson

On Tue, 9 Aug 2005, Stefan Richter wrote:

> Alan Stern wrote:
> > The current code does use
> > down_*_trylock, and it does check the device state for
> > disconnect-in-progress.  If things don't work out, it sleeps
> > for a short while and then tries again.
> 
> I think the error handler should rather give SUCCESS back to
> scsi if it knows the device was or is being disconnected.
> 
> After that, commands would be enqueued again, and these
> commands would be immediately done() with an appropriate
> result code.
> 
> IMO the error handler should return SUCCESS rather than
> FAILED because, if the device is diconnected, there is
> simply nothing left to do that could advance recovery
> from error.

Your premise is correct: Nothing can advance error recovery once the 
device is gone.  It doesn't follow that the error handler should return 
SUCCESS.  In fact it doesn't matter what the handler returns, since no 
further action will succeed.  When the handler returns FAILED, the device 
is set off-line and no further commands are issued at all.

>  Retries (in the error handler) will not help
> with error recovery either. Or do you retry in hope that
> the device might be reconnected and then a device reset
> should happen?

Are you referring to the "try again" part I wrote above?  That has nothing
to do with disconnects.  If the handler realizes a disconnect has
occurred, it immediately returns FAILED.  But if there hasn't been a
disconnect and the handler was still unable to acquire the semaphore, then
it waits a short while, checks for disconnect, and does the down_*_trylock 
again.

> A more general thought: It appears from the example you
> explained that maybe the USB highlevel carries too much of a
> burden to watch out for concurrency. Tasks like to mutually
> exclude port resets and disconnects --- which seems to me to
> be a task unspecific to highlevel protocols, correct me if
> I'm wrong --- might be better moved into the USB core. Then
> the highlevel would have to worry less about how exactly to
> avoid races and deadlocks, only about _what_ to do if there
> is a conflict. (The kind of conflict should be indicated by
> the core by means of return values or status flags or the
> like.)
> 
> But I am of course speaking without any knowledge about USB
> and Linux' USB stack.

I'm not sure I understand your comment.  Are you saying that the job I 
outlined above -- trying to acquire the device semaphore while checking 
for possible disconnects -- should be part of the USB core instead of 
the high-level USB device driver?

It already is in the core.  If you're interested, read 
drivers/usb/core/usb.c:usb_lock_device_for_reset().

Alan Stern

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-09  0:49             ` Luben Tuikov
@ 2005-08-09 14:50               ` Alan Stern
  0 siblings, 0 replies; 30+ messages in thread
From: Alan Stern @ 2005-08-09 14:50 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Stefan Richter, SCSI development list, James Bottomley,
	Mike Anderson

On Mon, 8 Aug 2005, Luben Tuikov wrote:

> --- Alan Stern <stern@rowland.harvard.edu> wrote:
> > > I think that "scanning" is a special process which should involve
> > > the minimum of error handling, by either ignoring errors and trying
> > > to connect to the device anyway, or on the first error, give up
> > > the device.  Which policy would one follow depends on the transport.
> > 
> > No, that's not feasible.  We can't just ignore errors, and we do have to 
> > cope with them.  Scanning is a particular vulnerable time, since it 
> > involves sending commands that don't occur most of the time during normal 
> > operation.
> 
> Your reasoning is correct, but your conclusion is not.
> 
> Exactly, scanning is a "particularly vulnerable time" and this is why
> we do not want the full I_T nexus error recovery.  At scanning time
> we're just "probing" here and there.

Scanning is a lot more involved than just "probing" here and there.  
Getting the INQUIRY data is just the first step.  The second step involves
registering the scsi_dev, which causes the high-level driver's probe
routine to be called.  That routine does an awful lot of work: determine
the device capacity, check for write-protect and cache settings, read the
partition table, maybe more.  These activities need robust error recovery.

> > > If the latter, you need to blacklist the device as not supporting
> > > TUR.  Then on any error, like you pulling the cable during scanning,
> > > the scanning process will give up and all will be well.
> > 
> > You can't blacklist devices you don't know about.  The kernel should work 
> > regardless.
> 
> Hmm, I thought there was a black listing somewhere in scsi, maybe
> /proc/scsi/device_info?

The point of a black list is to avoid using features or commands that a
device doesn't support.  Black lists are _not_ meant to prevent the kernel
from crashing or falling into deadlock.

> So then you do scheme #1 as I described: ignore as much as possible and try
> to establish an I_T nexus and _then_ try to poke around with the device.

Under the circumstances, I think it will be best to leave the usb-storage
bus-reset handler and device locking as they are.  There doesn't seem to 
be any simpler approach that is provably safe.

Alan Stern

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-09 14:40               ` Alan Stern
@ 2005-08-09 18:21                 ` Stefan Richter
  2005-08-09 18:49                   ` Alan Stern
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Richter @ 2005-08-09 18:21 UTC (permalink / raw)
  To: SCSI development list; +Cc: Alan Stern

Alan Stern wrote:
> On Tue, 9 Aug 2005, Stefan Richter wrote:
...
>> Retries (in the error handler) will not help
>>with error recovery either.
...
> Are you referring to the "try again" part I wrote above?
> That has nothing to do with disconnects.

OK, I confused that.

>>A more general thought: It appears from the example you
>>explained that maybe the USB highlevel carries too much of a
>>burden to watch out for concurrency.
...
> I'm not sure I understand your comment.  Are you saying that the job I 
> outlined above -- trying to acquire the device semaphore while checking 
> for possible disconnects -- should be part of the USB core instead of 
> the high-level USB device driver?

That's what I meant. (Two different things actually: Knowing that
one cannot reset a port after disconnection started, and having
to down a semaphore.) But again: I don't know If this is actually
desirable and feasible.

> It already is in the core.  If you're interested, read 
> drivers/usb/core/usb.c:usb_lock_device_for_reset().

I was actually browsing the usb/storage/scsiglue.c, but only
old code at lxr.linux.no, 2.6.11 in fact. Sorry, I see now that
this is way too old to look at. However the scsiglue.c's
device_reset and bus_reset in Linus' tree are still using down().
-- 
Stefan Richter
-=====-=-=-= =--- -=--=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-09 18:21                 ` Stefan Richter
@ 2005-08-09 18:49                   ` Alan Stern
  0 siblings, 0 replies; 30+ messages in thread
From: Alan Stern @ 2005-08-09 18:49 UTC (permalink / raw)
  To: Stefan Richter; +Cc: SCSI development list

On Tue, 9 Aug 2005, Stefan Richter wrote:

> > It already is in the core.  If you're interested, read 
> > drivers/usb/core/usb.c:usb_lock_device_for_reset().
> 
> I was actually browsing the usb/storage/scsiglue.c, but only
> old code at lxr.linux.no, 2.6.11 in fact. Sorry, I see now that
> this is way too old to look at. However the scsiglue.c's
> device_reset and bus_reset in Linus' tree are still using down().

2.6.11 isn't too old.  The code hasn't really changed since then.

However, you're looking at the wrong semaphore.  (Yes, the call to

	down(&(us->dev_semaphore));

in bus_reset() would also create a deadlock, but it's not really needed
and I'm going to take it out.)  You should instead be looking at the call
to usb_lock_device_for_reset() -- notice how it uses usb_trylock_device().

Alan Stern

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-09 14:23               ` Alan Stern
@ 2005-08-09 19:37                 ` Stefan Richter
  2005-08-09 20:07                   ` Alan Stern
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Richter @ 2005-08-09 19:37 UTC (permalink / raw)
  To: SCSI development list; +Cc: Alan Stern

Alan Stern wrote:
> On Tue, 9 Aug 2005, Stefan Richter wrote:
>>But as long as files are open etc., scsi high-level drivers are
>>'in use', cannot be detached, and scsi_remove_host does not return.
>>
>>Or that's what I assumed so far.
> 
> No, that's not right.  scsi_remove_host _can_ return while the high-level
> drivers are still in use -- just think about what happens when you unplug
> a USB storage device without unmounting it first.  I'm not sure what you 
> mean about "detach"ing high-level SCSI drivers.  The high-level drivers 
> are notified about the device removal; they drop their references and 
> return.  They can still submit commands if they want, but all such 
> submissions will fail in the SCSI core.

"detach" = shutdown, remove (from a device)

I don't know what happens if USB storage devices are unplugged with 
filesystems still mounted. But this is what happens with FireWire 
storage devices:
  - ieee1394 core triggers remove function of sbp2 via LKDM
    (Linux Kernel Device Model)
  - sbp2 enters scsi_remove_host (each SBP-2 device has its own host)
  - scsi_remove_host enters scsi_remove_target enters scsi_remove_device
    for each target and device
  - scsi_remove_device triggers all sorts of shutdown functions of scsi's
    highlevel drivers, via LKDM
  - highlevel runs all sorts of last traffic; in case of a mounted
    filesystem of a harddisk you see messages like 'rejecting I/O to
    device being removed', 'Buffer I/O error on device...', errors
    from the filesystem, and finally 'Synchronizing SCSI cache for...'
    which fails.
    Right after this, highlevel's (i.e. sd_mod's) shutdown callbacks
    exit.
  - scsi_remove_device does boring stuff, exits
  - scsi_remove_target does boring stuff, exits
  - scsi_remove_host does boring stuff, exits
  - sbp2 is back in control and removes its own stuff

Well, the parts in sbp2 itself changed a bit lately, but that is not 
relevant now. My point is, scsi_remove_device (and thereby scsi 
highlevel's shutdown callbacks) are executed and finished before 
scsi_remove_host is finished.

You are right though that there are still situations (and it seems 
detaching a device with mounted filesystem is among them) where all 
these remove functions are completed but the scsi_eh thread still stays 
behind, placidly sleeping.

And now we are right back on the subject. :-)
-- 
Stefan Richter
-=====-=-=-= =--- -=--=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-09 19:37                 ` Stefan Richter
@ 2005-08-09 20:07                   ` Alan Stern
  2005-08-09 20:45                     ` Stefan Richter
  0 siblings, 1 reply; 30+ messages in thread
From: Alan Stern @ 2005-08-09 20:07 UTC (permalink / raw)
  To: Stefan Richter; +Cc: SCSI development list

On Tue, 9 Aug 2005, Stefan Richter wrote:

> Alan Stern wrote:
> > On Tue, 9 Aug 2005, Stefan Richter wrote:
> >>But as long as files are open etc., scsi high-level drivers are
> >>'in use', cannot be detached, and scsi_remove_host does not return.
> >>
> >>Or that's what I assumed so far.
> > 
> > No, that's not right.  scsi_remove_host _can_ return while the high-level
> > drivers are still in use -- just think about what happens when you unplug
> > a USB storage device without unmounting it first.  I'm not sure what you 
> > mean about "detach"ing high-level SCSI drivers.  The high-level drivers 
> > are notified about the device removal; they drop their references and 
> > return.  They can still submit commands if they want, but all such 
> > submissions will fail in the SCSI core.
> 
> "detach" = shutdown, remove (from a device)
> 
> I don't know what happens if USB storage devices are unplugged with 
> filesystems still mounted. But this is what happens with FireWire 
> storage devices:
>   - ieee1394 core triggers remove function of sbp2 via LKDM
>     (Linux Kernel Device Model)
>   - sbp2 enters scsi_remove_host (each SBP-2 device has its own host)
>   - scsi_remove_host enters scsi_remove_target enters scsi_remove_device
>     for each target and device
>   - scsi_remove_device triggers all sorts of shutdown functions of scsi's
>     highlevel drivers, via LKDM
>   - highlevel runs all sorts of last traffic; in case of a mounted
>     filesystem of a harddisk you see messages like 'rejecting I/O to
>     device being removed', 'Buffer I/O error on device...', errors
>     from the filesystem, and finally 'Synchronizing SCSI cache for...'
>     which fails.
>     Right after this, highlevel's (i.e. sd_mod's) shutdown callbacks
>     exit.
>   - scsi_remove_device does boring stuff, exits
>   - scsi_remove_target does boring stuff, exits
>   - scsi_remove_host does boring stuff, exits
>   - sbp2 is back in control and removes its own stuff

The USB picture is pretty much the same.

> Well, the parts in sbp2 itself changed a bit lately, but that is not 
> relevant now. My point is, scsi_remove_device (and thereby scsi 
> highlevel's shutdown callbacks) are executed and finished before 
> scsi_remove_host is finished.

That's right.  The mounted filesystem is still open (mounted), but
contrary to what you wrote before, the high-level drivers have been
detached, are no longer "in-use", and scsi_remove_host has returned.

> You are right though that there are still situations (and it seems 
> detaching a device with mounted filesystem is among them) where all 
> these remove functions are completed but the scsi_eh thread still stays 
> behind, placidly sleeping.

Because the scsi_host still exists.  It won't be released until the open
file reference to it has disappeared.

> And now we are right back on the subject. :-)

Yes.  And it's conceivable that the core might still call the LLD,
because the scsi_host still contains a pointer to the template.  
Fortunately, the same open file reference that pins the host structure
also pins the LLD's module in memory.

Alan Stern


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Synchronizing scsi_remove_host and the error handler
  2005-08-09 20:07                   ` Alan Stern
@ 2005-08-09 20:45                     ` Stefan Richter
  0 siblings, 0 replies; 30+ messages in thread
From: Stefan Richter @ 2005-08-09 20:45 UTC (permalink / raw)
  To: SCSI development list; +Cc: Alan Stern

Alan Stern wrote:
> On Tue, 9 Aug 2005, Stefan Richter wrote:
>>these remove functions are completed but the scsi_eh thread still stays 
>>behind, placidly sleeping.
> 
> Because the scsi_host still exists.  It won't be released until the open
> file reference to it has disappeared.

Indeed. I am beginning to comprehend. Thanks for your patience. :-)
-- 
Stefan Richter
-=====-=-=-= =--- -=--=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2005-08-09 20:45 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-08-07 14:59 Synchronizing scsi_remove_host and the error handler Alan Stern
2005-08-07 15:36 ` James Bottomley
2005-08-07 18:43   ` Alan Stern
2005-08-07 21:57     ` James Bottomley
2005-08-08 18:24       ` Luben Tuikov
2005-08-08 19:54         ` Mike Anderson
2005-08-08 20:25           ` Luben Tuikov
2005-08-08 20:06       ` Mike Anderson
2005-08-07 22:15     ` Stefan Richter
2005-08-08 20:41       ` Alan Stern
2005-08-08 21:36         ` Mike Anderson
2005-08-08 22:36           ` Alan Stern
2005-08-08 23:10             ` Stefan Richter
2005-08-09 14:23               ` Alan Stern
2005-08-09 19:37                 ` Stefan Richter
2005-08-09 20:07                   ` Alan Stern
2005-08-09 20:45                     ` Stefan Richter
2005-08-08 23:59             ` Stefan Richter
2005-08-09 14:40               ` Alan Stern
2005-08-09 18:21                 ` Stefan Richter
2005-08-09 18:49                   ` Alan Stern
2005-08-08 21:49         ` Stefan Richter
2005-08-08 22:20         ` Luben Tuikov
2005-08-08 22:30           ` Alan Stern
2005-08-09  0:49             ` Luben Tuikov
2005-08-09 14:50               ` Alan Stern
2005-08-08 22:23         ` James Bottomley
2005-08-08 22:40           ` Alan Stern
2005-08-08 18:07   ` Luben Tuikov
2005-08-08 18:04 ` Luben Tuikov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).