* Discussion: soft unbinding
From: Alan Stern @ 2008-05-03 16:03 UTC
To: Matthew Dharm, Oliver Neukum, Stefan Richter
Cc: USB Storage list, SCSI development list

When talking about "soft" unbinding, the main question seems to be: How
soft?

It would be easy, for instance, to change usb-storage so that unbinding
would wait until the current command was finished.  But clearly one
wants to do more: Give the upper-level SCSI drivers a chance to shut
down cleanly and issue their FLUSH CACHE commands, wait for all pending
commands to complete, and so on.

It's the "wait for pending commands to complete" part that is hard.
Some commands have relatively long timeouts.  Error handler operations
have no timeouts.  Commands submitted through sg can have effectively
infinite timeouts.  So how long should we wait?

Should there be a scsi_soft_remove_host() routine that accepts a
timeout value?  It would remove the devices under the host and wait
until the timeout expires (if necessary) before aborting all pending
commands.  Unlike scsi_remove_host(), it would really abort these
commands as though they had timed out, instead of simply cancelling
them.  It would guarantee that when it returned, no commands were still
running on the host and no more commands would be submitted.

This would essentially be a standardized version of the special code
Stefan has put into the sbp2 and firewire-sbp2 drivers.

Alan Stern
* Re: Discussion: soft unbinding
From: Stefan Richter @ 2008-05-03 17:22 UTC
To: Alan Stern
Cc: Matthew Dharm, Oliver Neukum, USB Storage list, SCSI development list

Alan Stern wrote:
> When talking about "soft" unbinding, the main question seems to be: How
> soft?
>
> It would be easy, for instance, to change usb-storage so that unbinding
> would wait until the current command was finished.  But clearly one
> wants to do more: Give the upper-level SCSI drivers a chance to shut
> down cleanly and issue their FLUSH CACHE commands, wait for all
> pending commands to complete, and so on.

scsi_remove_host is potentially able to do this, and unless my memory
betrays me, did so in the past.

> It's the "wait for pending commands to complete" part that is hard.
> Some commands have relatively long timeouts.

Is there reason to be less patient during soft unbinding?

If so, the decision which commands can be aborted should IMO be made by
the application layer.

> Error handler operations have no timeouts.  Commands submitted through
> sg can have effectively infinite timeouts.

Hmm, I can't comment on these two.

> So how long should we wait?

I presume that if a user launches a "remove safely" command, he means
it.  Or if he doesn't mean it, he can still hot-unplug before completion
of the shutdown procedures.  The only exception is a locked drive door
or a similar ejection mechanism which forces the user to wait for the
software to finish.

> Should there be a scsi_soft_remove_host() routine that accepts a
> timeout value?  It would remove the devices under the host and wait
> until the timeout expires (if necessary) before aborting all pending
> commands.  Unlike scsi_remove_host(), it would really abort these
> commands as though they had timed out, instead of simply cancelling
> them.  It would guarantee that when it returned, no commands were still
> running on the host and no more commands would be submitted.

It would be an API with more guarantees and clearer semantics than
scsi_remove_host(), and also...

> This would essentially be a standardized version of the special code
> Stefan has put into the sbp2 and firewire-sbp2 drivers.

...with more guarantees and clearer semantics than the
scsi_remove_device() API which the SBP-2 drivers happen to use.  They
use it merely because it was found to work more satisfactorily at some
point, and they have no difficulty using this API (i.e. looking up the
logical units to feed to scsi_remove_device()).

Curiously, scsi_mid_low_api.txt says in the context of scsi_remove_host:

    When an HBA is being removed it could be as part of an orderly
    shutdown associated with the LLD module being unloaded (e.g. with
    the "rmmod" command) or in response to a "hot unplug" indicated by
    sysfs()'s remove() callback being invoked.  In either case, the
    sequence is the same [...]

while it says in the context of scsi_remove_device:

    In a similar fashion, an LLD may become aware that a SCSI device has
    been removed (unplugged) or the connection to it has been
    interrupted. [...] An LLD that detects the removal of a SCSI device
    can instigate its removal from upper layers with this sequence [...]

AFAIR scsi_remove_host once simply worked as if the LLD itself had
called scsi_remove_device() for each device on that host beforehand.
Eventually there was a change in the SCSI core's internal state model
which reduced what scsi_remove_device(), when called internally from
within scsi_remove_host(), was able to do.  This is contrary to the text
quoted above.  I haven't tested for some time how the SCSI core behaves
nowadays.

Back to scsi_soft_remove_host():

Does the SCSI core actually need separate APIs for soft unbinding
(a.k.a. orderly shutdown) and hot removal?  We surely have different
requirements in the two cases: give pending commands some time to finish
and send some finalizing commands (e.g. synchronize cache, unlock door)
in the shutdown case; fail all commands and stop any error retries in
the hot unplug case.

But isn't hot unplug just a special case of orderly shutdown ---
basically a case where the transport driver's responsibility is to fail
commands (pending ones and new ones) quickly?  In addition, fail them
with failure indicators which tell upper layers that it is no use to
retry them.

Actually, quick failure and suppression of retries in the hot unplug
case is IMO not even as critical as the proper execution of pending and
finalizing commands in the soft unbinding case.  The only critical
aspect of hot unplug is that I/O terminates eventually, i.e.
applications don't hang.

So, rather than adding a scsi_soft_remove_host API, wouldn't it be
appropriate and possible to make sure that

  - scsi_remove_host is able to initiate and perform soft unbinding,

  - LLDs return proper failure codes in the hot unplug case, and the
    SCSI core and upper layers properly interpret them, i.e. don't
    initiate futile retries?

-- 
Stefan Richter
-=====-==--- -=-= ---==
http://arcgraph.de/sr/
* Re: Discussion: soft unbinding
From: Alan Stern @ 2008-05-03 20:42 UTC
To: Stefan Richter
Cc: Matthew Dharm, Oliver Neukum, USB Storage list, SCSI development list

On Sat, 3 May 2008, Stefan Richter wrote:

> Alan Stern wrote:
> > When talking about "soft" unbinding, the main question seems to be: How
> > soft?
> >
> > It would be easy, for instance, to change usb-storage so that unbinding
> > would wait until the current command was finished.  But clearly one
> > wants to do more: Give the upper-level SCSI drivers a chance to shut
> > down cleanly and issue their FLUSH CACHE commands, wait for all
> > pending commands to complete, and so on.
>
> scsi_remove_host is potentially able to do this, and unless my memory
> betrays me, did so in the past.
>
> > It's the "wait for pending commands to complete" part that is hard.
> > Some commands have relatively long timeouts.
>
> Is there reason to be less patient during soft unbinding?
>
> If so, the decision which commands can be aborted should IMO be made by
> the application layer.
>
> > Error handler operations have no timeouts.  Commands submitted through
> > sg can have effectively infinite timeouts.
>
> Hmm, I can't comment on these two.
>
> > So how long should we wait?
>
> I presume that if a user launches a "remove safely" command, he means
> it.  Or if he doesn't mean it, he can still hot-unplug before completion
> of the shutdown procedures.  The only exception is a locked drive door
> or a similar ejection mechanism which forces the user to wait for the
> software to finish.

That's probably true.  With USB at least, a hot unplug causes all
outstanding I/O to fail immediately.  On the other hand, since unbinding
involves acquiring the host adapter's device semaphore, it will block
things like suspend.  So it would not be a bad idea to have a hard upper
limit on how long it can wait.

> > Should there be a scsi_soft_remove_host() routine that accepts a
> > timeout value?  It would remove the devices under the host and wait
> > until the timeout expires (if necessary) before aborting all pending
> > commands.  Unlike scsi_remove_host(), it would really abort these
> > commands as though they had timed out, instead of simply cancelling
> > them.  It would guarantee that when it returned, no commands were still
> > running on the host and no more commands would be submitted.
>
> It would be an API with more guarantees and clearer semantics than
> scsi_remove_host(), and also...
>
> > This would essentially be a standardized version of the special code
> > Stefan has put into the sbp2 and firewire-sbp2 drivers.
>
> ...with more guarantees and clearer semantics than the
> scsi_remove_device() API which the SBP-2 drivers happen to use.  They
> use it merely because it was found to work more satisfactorily at some
> point, and they have no difficulty using this API (i.e. looking up the
> logical units to feed to scsi_remove_device()).

Deciding on the timeout value to use is the hard part.  Or even whether
there should be a timeout at all.

> Curiously, scsi_mid_low_api.txt says in the context of scsi_remove_host:
>
>     When an HBA is being removed it could be as part of an orderly
>     shutdown associated with the LLD module being unloaded (e.g. with
>     the "rmmod" command) or in response to a "hot unplug" indicated by
>     sysfs()'s remove() callback being invoked.  In either case, the
>     sequence is the same [...]

Yeah, well, it also says in the description of scsi_remove_host:

    Returns value: 0 on success, 1 on failure (e.g. LLD busy ??)

So you can't rely on the documentation being up-to-date.

> while it says in the context of scsi_remove_device:
>
>     In a similar fashion, an LLD may become aware that a SCSI device has
>     been removed (unplugged) or the connection to it has been
>     interrupted. [...] An LLD that detects the removal of a SCSI device
>     can instigate its removal from upper layers with this sequence [...]
>
> AFAIR scsi_remove_host once simply worked as if the LLD itself had
> called scsi_remove_device() for each device on that host beforehand.
> Eventually there was a change in the SCSI core's internal state model
> which reduced what scsi_remove_device(), when called internally from
> within scsi_remove_host(), was able to do.  This is contrary to the text
> quoted above.  I haven't tested for some time how the SCSI core behaves
> nowadays.
>
> Back to scsi_soft_remove_host():
>
> Does the SCSI core actually need separate APIs for soft unbinding
> (a.k.a. orderly shutdown) and hot removal?  We surely have different
> requirements in the two cases: give pending commands some time to finish
> and send some finalizing commands (e.g. synchronize cache, unlock door)
> in the shutdown case; fail all commands and stop any error retries in
> the hot unplug case.
>
> But isn't hot unplug just a special case of orderly shutdown ---
> basically a case where the transport driver's responsibility is to fail
> commands (pending ones and new ones) quickly?  In addition, fail them
> with failure indicators which tell upper layers that it is no use to
> retry them.

That's right.

> Actually, quick failure and suppression of retries in the hot unplug
> case is IMO not even as critical as the proper execution of pending and
> finalizing commands in the soft unbinding case.  The only critical
> aspect of hot unplug is that I/O terminates eventually, i.e.
> applications don't hang.
>
> So, rather than adding a scsi_soft_remove_host API, wouldn't it be
> appropriate and possible to make sure that
>
>   - scsi_remove_host is able to initiate and perform soft unbinding,
>
>   - LLDs return proper failure codes in the hot unplug case, and the
>     SCSI core and upper layers properly interpret them, i.e. don't
>     initiate futile retries?

These ideas have not escaped me.  There's really no reason to have a
separate API for hot unplug at all; the soft unbind sequence would work
perfectly well (assuming that I/O fails immediately, as it does with
USB).

No, the reason I suggested a separate new API is because of the bizarre
way scsi_remove_host() handles -- or used to handle -- outstanding
commands.  The midlayer would cancel them all by itself, without telling
the LLD or doing anything else.  There's special code in usb-storage to
work around this, possibly in other LLDs as well.  I'm afraid that
changing the midlayer's behavior would cause some LLDs to malfunction in
this regard.

The last time I looked at this stuff was back in 2004.  This email
thread may be interesting:

    http://marc.info/?t=109644432800002&r=1&w=2

Of course the midlayer has changed since then (scsi_host_cancel no
longer exists), so it may not be relevant any more.

Of even more interest and relevance is this thread:

    http://marc.info/?t=109630920600005&r=1&w=2

In one of the messages in that thread, James Bottomley wrote:

------------------------------------------------------------------------
Right.  scsi_remove_host tells the mid-layer that it's OK to trash all
inflight commands because you removed all their users before calling
it.  It also tells us that you won't accept any future commands for this
host (because you'll error any attempt in queuecommand).
------------------------------------------------------------------------

Later on Mike Anderson asked:

------------------------------------------------------------------------
Clarification.  James, are you indicating that there needs to be a new
scsi mid api that performs a similar function to scsi_remove_host except
that it does not cancel commands?
------------------------------------------------------------------------

There was no real answer and things were left hanging.

So I guess part of what I'm asking is whether the situation is now
significantly different.

Alan Stern
* Re: Discussion: soft unbinding
From: James Bottomley @ 2008-05-03 22:32 UTC
To: Alan Stern
Cc: Stefan Richter, Matthew Dharm, Oliver Neukum, USB Storage list, SCSI development list

On Sat, 2008-05-03 at 16:42 -0400, Alan Stern wrote:
> Of even more interest and relevance is this thread:
>
>     http://marc.info/?t=109630920600005&r=1&w=2
>
> In one of the messages in that thread, James Bottomley wrote:
>
> ------------------------------------------------------------------------
> Right.  scsi_remove_host tells the mid-layer that it's OK to trash all
> inflight commands because you removed all their users before calling
> it.  It also tells us that you won't accept any future commands for this
> host (because you'll error any attempt in queuecommand).
> ------------------------------------------------------------------------
>
> Later on Mike Anderson asked:
>
> ------------------------------------------------------------------------
> Clarification.  James, are you indicating that there needs to be a new
> scsi mid api that performs a similar function to scsi_remove_host except
> that it does not cancel commands?
> ------------------------------------------------------------------------
>
> There was no real answer and things were left hanging.
>
> So I guess part of what I'm asking is whether the situation is now
> significantly different.

Not really ... there's never been cause to make it so.  At the beginning
of the hotplug debate it was thought there was value in a wait-for-unplug
event ... some PCI busses have a little button you push and then a light
lights up to tell you everything's OK and you can remove the card.

After a lot of back and forth, it was decided that the best thing for
the latter was for userland to quiesce and unmount the filesystem,
application or whatever, and then tell the kernel it was gone, so in
that scenario the two paths were identical.  I don't think anything's
really changed in that regard.

James
* Re: Discussion: soft unbinding
From: Alan Stern @ 2008-05-04 2:28 UTC
To: James Bottomley
Cc: Stefan Richter, Matthew Dharm, Oliver Neukum, USB Storage list, SCSI development list

On Sat, 3 May 2008, James Bottomley wrote:

> On Sat, 2008-05-03 at 16:42 -0400, Alan Stern wrote:
> > Of even more interest and relevance is this thread:
> >
> >     http://marc.info/?t=109630920600005&r=1&w=2
> >
> > In one of the messages in that thread, James Bottomley wrote:
> >
> > ------------------------------------------------------------------------
> > Right.  scsi_remove_host tells the mid-layer that it's OK to trash all
> > inflight commands because you removed all their users before calling
> > it.  It also tells us that you won't accept any future commands for this
> > host (because you'll error any attempt in queuecommand).
> > ------------------------------------------------------------------------
> >
> > Later on Mike Anderson asked:
> >
> > ------------------------------------------------------------------------
> > Clarification.  James, are you indicating that there needs to be a new
> > scsi mid api that performs a similar function to scsi_remove_host except
> > that it does not cancel commands?
> > ------------------------------------------------------------------------
> >
> > There was no real answer and things were left hanging.
> >
> > So I guess part of what I'm asking is whether the situation is now
> > significantly different.
>
> Not really ... there's never been cause to make it so.  At the beginning
> of the hotplug debate it was thought there was value in a wait-for-unplug
> event ... some PCI busses have a little button you push and then a light
> lights up to tell you everything's OK and you can remove the card.
>
> After a lot of back and forth, it was decided that the best thing for
> the latter was for userland to quiesce and unmount the filesystem,
> application or whatever, and then tell the kernel it was gone, so in
> that scenario the two paths were identical.  I don't think anything's
> really changed in that regard.

I still don't understand.  Let's say the user does unmount the
filesystem and tell the kernel it is gone.  So the LLD calls
scsi_unregister_host() and from that point on fails every call to
queuecommand.  Then how does sd transmit its final FLUSH CACHE command
to the device?  Are you saying that it doesn't need to, since unmounting
the filesystem will cause a FLUSH CACHE to be sent anyway?

Or let's put it the other way around.  Suppose the LLD doesn't start
failing calls to queuecommand until after scsi_unregister_host()
returns.  Then what about the commands that were in flight when
scsi_unregister_host() was called?  The LLD thinks it owns them, and the
midlayer thinks that _it_ owns them and can unilaterally cancel them.
They can't both be right.

Alan Stern
* Re: Discussion: soft unbinding
From: Stefan Richter @ 2008-05-04 10:53 UTC
To: Alan Stern
Cc: James Bottomley, Matthew Dharm, Oliver Neukum, USB Storage list, SCSI development list

Alan Stern wrote:
> On Sat, 3 May 2008, James Bottomley wrote:
[...]
> > At the beginning
> > of the hotplug debate it was thought there was value in a wait-for-unplug
> > event ... some PCI busses have a little button you push and then a light
> > lights up to tell you everything's OK and you can remove the card.
> >
> > After a lot of back and forth, it was decided that the best thing for
> > the latter was for userland to quiesce and unmount the filesystem,
> > application or whatever, and then tell the kernel it was gone, so in
> > that scenario the two paths were identical.  I don't think anything's
> > really changed in that regard.
>
> I still don't understand.  Let's say the user does unmount the
> filesystem and tell the kernel it is gone.  So the LLD calls
> scsi_unregister_host() and from that point on fails every call to
> queuecommand.  Then how does sd transmit its final FLUSH CACHE command
> to the device?  Are you saying that it doesn't need to, since unmounting
> the filesystem will cause a FLUSH CACHE to be sent anyway?

Before a device can be safely detached, there may be other things that
need to be done besides what umount implies.  But let's have a look at
the grander picture.  I see the following levels at which userspace can
initiate detachment:

1. Close block device files / character device files, e.g. umount
   filesystems.

   Since userspace is multiprocess/multithreaded, it has no way to
   prevent new open()s though.  IOW userspace is unable to say which
   particular close() is the final one.  Or am I missing something?

2. Unbind the command set driver (SCSI ULD) from the logical unit
   representation.

   How does 2 relate to 1?  Obviously, open() is guaranteed to be
   impossible after 2.  Note, nothing prevents step 2 from being
   performed before step 1.  IOW it is possible to unbind the ULD while
   the corresponding device file is still open, e.g. a filesystem still
   mounted.

   Furthermore, step 2 involves the execution of some requests for
   purposes like flush write cache, stop motor, unlock drive door.
   These requests are dependent on device type and should be
   configurable by userspace to some degree (e.g. whether to go into a
   low power state if in single-initiator mode).  The command set driver
   can ensure that these finalizing requests are executed in the desired
   order.  The sg driver sticks out here insofar as it has no knowledge
   of the device type, hence does not emit finalizing requests.

3. Unbind the transport layer driver from the target port
   representation.

   How does 3 relate to 2?  Step 3 will cause step 2 to be performed.
   But depending on which SCSI low-level API calls are used, the ULD may
   be unable to get the finalizing requests of step 2 through the SCSI
   core to the LLD, because a core-internal state variable may prevent
   it.  The API documentation is unclear about this, IOW the behavior is
   basically undefined.

4. Unbind the interconnect layer driver from what corresponded to the
   initiator port.

   Some drivers don't implement 3 and 4 separately.

For the discussion here it is obviously crucial how we want 2 to relate
to 1 and how we want 3 to relate to 2.  The relationship between 4 and 3
is an extension of the issue and interesting for hotpluggable PCI,
CardBus, ExpressCard and the like.  But unlike 3/2 and 2/1, LLD authors
have full control over this, since the SCSI core is not in the picture
here (if we treat the "transport attributes" programs as parts of the
LLDs, not part of the SCSI core).

Side note: There are various reference counters involved in the layers
and partially across the layers.  There is for example the module
reference count of the LLD, which is usually (among other things)
manipulated when the device files of ULDs are open()ed and close()d.  A
side effect is that module unloading, as a special case of unbinding, is
prevented by upper layers as long as the upper layers have business with
the device.  But for now this is only a side effect, while the actual
purpose of these reference counters is really only to prevent
dereferencing invalid pointers.

> Or let's put it the other way around.  Suppose the LLD doesn't start
> failing calls to queuecommand until after scsi_unregister_host()
> returns.  Then what about the commands that were in flight when
> scsi_unregister_host() was called?  The LLD thinks it owns them, and the
> midlayer thinks that _it_ owns them and can unilaterally cancel them.
> They can't both be right.

Is there an actual problem?  As soon as a scsi_cmnd reaches
.queuecommand(), it is the sole privilege and responsibility of the LLD
to tell when the scmd is complete from the transport's point of view.
The SCSI core can at this point ask the LLD to prematurely complete an
scmd, e.g. by means of .eh_abort_handler().

In my opinion, the LLD should simply process all scmds which it gets via
.queuecommand() independently of whether unbinding was initiated,
i.e. complete them successfully if possible, complete them with failure
if something went wrong at the transport protocol level, and complete
them as aborted when .eh_abort_handler() and friends requested it.

The SCSI core's low-level API should guarantee somewhere that
.queuecommand() will not be called anymore after certain
scsi_remove_XYZ() calls have returned.  Furthermore, I would like it if
the SCSI core would allow step 2 to be performed as gracefully as
possible (i.e. with successful execution of all finalizing requests
which the ULDs emit) --- either in the case of all scsi_remove_XYZ()s,
or only in the case of some possibly new scsi_remove_ABC()s if the
necessary change/clarification of semantics of the existing
scsi_remove_XYZ() is too problematic for some existing LLDs.

-- 
Stefan Richter
-=====-==--- -=-= --=--
http://arcgraph.de/sr/
* Re: Discussion: soft unbinding
From: James Bottomley @ 2008-05-04 14:15 UTC
To: Alan Stern
Cc: Stefan Richter, Matthew Dharm, Oliver Neukum, USB Storage list, SCSI development list

On Sat, 2008-05-03 at 22:28 -0400, Alan Stern wrote:
> > > So I guess part of what I'm asking is whether the situation is now
> > > significantly different.
> >
> > Not really ... there's never been cause to make it so.  At the beginning
> > of the hotplug debate it was thought there was value in a wait-for-unplug
> > event ... some PCI busses have a little button you push and then a light
> > lights up to tell you everything's OK and you can remove the card.
> >
> > After a lot of back and forth, it was decided that the best thing for
> > the latter was for userland to quiesce and unmount the filesystem,
> > application or whatever, and then tell the kernel it was gone, so in
> > that scenario the two paths were identical.  I don't think anything's
> > really changed in that regard.
>
> I still don't understand.  Let's say the user does unmount the
> filesystem and tell the kernel it is gone.  So the LLD calls
> scsi_unregister_host() and from that point on fails every call to
> queuecommand.  Then how does sd transmit its final FLUSH CACHE command
> to the device?  Are you saying that it doesn't need to, since unmounting
> the filesystem will cause a FLUSH CACHE to be sent anyway?

This is the sequence of events scsi_remove_host causes:

1. The host goes into the CANCEL state.  This has no real meaning to
   the mid-layer command processor: it only checks device state for
   commands.
2. It calls scsi_forget_host(), which loops over all the host's
   devices calling __scsi_remove_device().
3. __scsi_remove_device() puts the device into cancel mode (now only
   special commands get through).
4. It unbinds bsg and calls device_unregister(), triggering the
   ->remove method of the driver.
5. The ->remove method of sd sends the flush cache as a special
   command (which still gets through).
6. It removes the transport.
7. It calls device_del() and sets the device state to DEL; now no
   commands will be permitted.
8. Finally it calls transport destroy and slave destroy.
9. After this is done for every device, the host goes into DEL.

> Or let's put it the other way around.  Suppose the LLD doesn't start
> failing calls to queuecommand until after scsi_unregister_host()
> returns.  Then what about the commands that were in flight when
> scsi_unregister_host() was called?  The LLD thinks it owns them, and the
> midlayer thinks that _it_ owns them and can unilaterally cancel them.
> They can't both be right.

This is a misunderstanding: there's no active cancellation (although
there was a long discussion about that too).  All it does is start
saying "no" to commands as they come down.  In-flight commands are up to
the HBA driver to deal with (or the error handler will activate on
timeout if it doesn't).

James
* Re: Discussion: soft unbinding
From: Alan Stern @ 2008-05-04 21:14 UTC
To: James Bottomley
Cc: Stefan Richter, Matthew Dharm, Oliver Neukum, USB Storage list, SCSI development list

On Sun, 4 May 2008, James Bottomley wrote:

> This is the sequence of events scsi_remove_host causes:
>
> 1. The host goes into the CANCEL state.  This has no real meaning to
>    the mid-layer command processor: it only checks device state for
>    commands.
> 2. It calls scsi_forget_host(), which loops over all the host's
>    devices calling __scsi_remove_device().
> 3. __scsi_remove_device() puts the device into cancel mode (now only
>    special commands get through).
> 4. It unbinds bsg and calls device_unregister(), triggering the
>    ->remove method of the driver.
> 5. The ->remove method of sd sends the flush cache as a special
>    command (which still gets through).
> 6. It removes the transport.
> 7. It calls device_del() and sets the device state to DEL; now no
>    commands will be permitted.
> 8. Finally it calls transport destroy and slave destroy.
> 9. After this is done for every device, the host goes into DEL.

That all sounds appropriate for a "soft" unbind.

What about the error handler?  It's still possible for the
device-reset, bus-reset, and host-reset methods to be called after
scsi_remove_host returns, isn't it?

Speaking of which, it's also possible for the error handler to remain
running when scsi_remove_host returns, right?  This would mean that the
host is in DEL_RECOVERY, not DEL -- which in turn means that commands
are still permitted.  Shouldn't scsi_remove_host wait for the host to
reach DEL before returning?

> > Or let's put it the other way around.  Suppose the LLD doesn't start
> > failing calls to queuecommand until after scsi_unregister_host()
> > returns.  Then what about the commands that were in flight when
> > scsi_unregister_host() was called?  The LLD thinks it owns them, and
> > the midlayer thinks that _it_ owns them and can unilaterally cancel
> > them.  They can't both be right.
>
> This is a misunderstanding: there's no active cancellation (although
> there was a long discussion about that too).  All it does is start
> saying "no" to commands as they come down.  In-flight commands are up
> to the HBA driver to deal with (or the error handler will activate on
> timeout if it doesn't).

Okay, good.  Once upon a time (i.e., back in 2004) there _was_ active
cancellation.  It caused oopses; I'm glad to hear that it is gone.

Alan Stern
* Re: Discussion: soft unbinding
From: James Bottomley @ 2008-05-05 3:42 UTC
To: Alan Stern
Cc: Stefan Richter, Matthew Dharm, Oliver Neukum, USB Storage list, SCSI development list

On Sun, 2008-05-04 at 17:14 -0400, Alan Stern wrote:
> On Sun, 4 May 2008, James Bottomley wrote:
>
> > This is the sequence of events scsi_remove_host causes:
> >
> > 1. The host goes into the CANCEL state.  This has no real meaning to
> >    the mid-layer command processor: it only checks device state for
> >    commands.
> > 2. It calls scsi_forget_host(), which loops over all the host's
> >    devices calling __scsi_remove_device().
> > 3. __scsi_remove_device() puts the device into cancel mode (now only
> >    special commands get through).
> > 4. It unbinds bsg and calls device_unregister(), triggering the
> >    ->remove method of the driver.
> > 5. The ->remove method of sd sends the flush cache as a special
> >    command (which still gets through).
> > 6. It removes the transport.
> > 7. It calls device_del() and sets the device state to DEL; now no
> >    commands will be permitted.
> > 8. Finally it calls transport destroy and slave destroy.
> > 9. After this is done for every device, the host goes into DEL.
>
> That all sounds appropriate for a "soft" unbind.
>
> What about the error handler?  It's still possible for the
> device-reset, bus-reset, and host-reset methods to be called after
> scsi_remove_host returns, isn't it?

Yes ... that's one of the eh problems, although it can probably be
fixed just by extending the offline state checking.

> Speaking of which, it's also possible for the error handler to remain
> running when scsi_remove_host returns, right?  This would mean that the
> host is in DEL_RECOVERY, not DEL -- which in turn means that commands
> are still permitted.  Shouldn't scsi_remove_host wait for the host to
> reach DEL before returning?

No ... because the host state doesn't really matter for commands, only
the device state.

> > > Or let's put it the other way around.  Suppose the LLD doesn't start
> > > failing calls to queuecommand until after scsi_unregister_host()
> > > returns.  Then what about the commands that were in flight when
> > > scsi_unregister_host() was called?  The LLD thinks it owns them, and
> > > the midlayer thinks that _it_ owns them and can unilaterally cancel
> > > them.  They can't both be right.
> >
> > This is a misunderstanding: there's no active cancellation (although
> > there was a long discussion about that too).  All it does is start
> > saying "no" to commands as they come down.  In-flight commands are up
> > to the HBA driver to deal with (or the error handler will activate on
> > timeout if it doesn't).
>
> Okay, good.  Once upon a time (i.e., back in 2004) there _was_ active
> cancellation.  It caused oopses; I'm glad to hear that it is gone.

James