Re: [PATCH 1/5] SCSI scanning and removal fixes

public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed

From: James Bottomley <James.Bottomley@SteelEye.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: SCSI development list <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH 1/5] SCSI scanning and removal fixes
Date: Thu, 08 Sep 2005 15:15:41 -0500	[thread overview]
Message-ID: <1126210542.4845.66.camel@mulgrave> (raw)
In-Reply-To: <Pine.LNX.4.44L0.0509081441190.5768-100000@iolanthe.rowland.org>

On Thu, 2005-09-08 at 14:58 -0400, Alan Stern wrote:
> On Thu, 8 Sep 2005, James Bottomley wrote:
> > Actually, no, that's why we have the parallel EH states ... let me put
> > in the events that trigger state transitions so you can see what
> > happens:
> > 
> > 
> >         EH thread finishes
> > 	<---------------
> > 
> >         EH thread begins
> >         ---------------->
> > 
> > 
> >          <--------- 
> >   running ---------> recovery
> >     |                   |
> >     |                   | scsi_remove_host() called
> >     v   <----------     v
> >   cancel ----------> recover/cancel
> >     |                   |
> >     |                   | scsi_remove_host finishes visibility removal
> >     v                   v
> >    del <------------ recover/del
> > 
> > So the EH is allowed to activate in either running or cancel states, but
> > goes through its own state transition eventually coming back to del when
> > it finishes.  Once the EH gets into recover/cancel it can never
> > transition back to running.
> 
> Why allow cancel -> recover/cancel?  Once the device is in the cancel 
> state, there isn't anything useful the error handler can accomplish, is 
> there?  Failed or timed-out commands should simply return an error.
> 
> And if you accept that, then what point is there in distinguishing between 
> cancel and recover/cancel?  As far as I can see, the only significant 
> difference is that in recover/cancel the error handler is running (but not 
> accomplishing anything).  Is this related to 1) below?

Yes, a state machine like I've shown can only be in one state at any
given time.  The problem is what happens if the error handler is running
when you try to remove the device.  We can reach in and try to stop it,
but under a state model we'd have to wait until the error handler put
the device back to running unless we allow a set of state transitions
that the error handler can go through in parallel with the removal
(which is the model I've shown above).

The idea of allowing the error handler to start up in cancel is that not
every device removal is a forced removal, so you want to permit the
shutdown commands (like sync cache) and consequent error handling on
those commands.

> > It's just nasty on two counts:
> > 
> > 1) we have an incorrect bifurcation in the state model and
> > 2) we never actually enforce any of the state transitions.
> 
> Can you explain 1) more fully?  I don't really understand what you're 
> getting at.

Does the explanation above cover this?

> As for 2), what do you mean?  In 2.6.13, scsi_device_set_state does not 
> change the state if the transition is illegal.

Yes, but we never act on the return value.  if the state is illegal, we
shouldn't be taking actions believing we're in that state, so we should
always be checking the returns.

James

next prev parent reply	other threads:[~2005-09-08 20:15 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-07-26 14:12 [PATCH 1/5] SCSI scanning and removal fixes Alan Stern
2005-09-07 15:16 ` James Bottomley
2005-09-07 18:27   ` Alan Stern
2005-09-07 18:37     ` Luben Tuikov
2005-09-07 18:42     ` Luben Tuikov
2005-09-07 19:31       ` Alan Stern
2005-09-07 20:00         ` Mike Anderson
2005-09-07 20:43         ` Luben Tuikov
2005-09-07 21:34           ` Stefan Richter
2005-09-08 15:19           ` Alan Stern
2005-09-08 16:07             ` Luben Tuikov
2005-09-08 18:36               ` Alan Stern
2005-09-08 23:59                 ` Luben Tuikov
2005-09-09 14:44                   ` Alan Stern
2005-09-09 17:08                   ` Stefan Richter
2005-09-09 17:15                     ` Luben Tuikov
2005-09-07 19:58     ` James Bottomley
2005-09-07 22:05       ` James Bottomley
2005-09-08 15:59       ` Alan Stern
2005-09-08 16:15         ` James Bottomley
2005-09-08 18:58           ` Alan Stern
2005-09-08 20:15             ` James Bottomley [this message]
2005-09-09  0:18               ` Luben Tuikov
2005-09-09 14:16               ` Alan Stern
2005-09-09 14:44                 ` James Bottomley
2005-09-09 15:16                   ` Alan Stern
2005-09-09 15:37                     ` James Bottomley
2005-09-09 16:17                       ` Alan Stern
2005-09-09 16:47                         ` Mike Anderson
2005-09-08 16:08       ` Alan Stern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1126210542.4845.66.camel@mulgrave \
    --to=james.bottomley@steeleye.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox