public inbox for linux-scsi@vger.kernel.org
From: James Bottomley <James.Bottomley@SteelEye.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: SCSI development list <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH 1/5] SCSI scanning and removal fixes
Date: Wed, 07 Sep 2005 14:58:09 -0500
Message-ID: <1126123089.4823.48.camel@mulgrave>
In-Reply-To: <Pine.LNX.4.44L0.0509071413070.4988-100000@iolanthe.rowland.org>

On Wed, 2005-09-07 at 14:27 -0400, Alan Stern wrote:
> > The second (allow RECOVERY->CANCEL) isn't really an answer.  The correct
> > thing, I suppose, is to have scsi_remove_host() wait for the error
> > handler to finish if the state transition cannot be accommodated
> > (otherwise the error handler will try to transition ->RUNNING part way
> > through the removal).
> 
> I'm going to argue strongly about this.  scsi_remove_host should _not_
> wait for error recovery to complete -- to do so will invite deadlocks.  
> (Suppose the error handler is waiting for a bus reset, but the bus reset
> routine requires a semaphore held by the LLD during the call to
> scsi_remove_host?)  Furthermore, error recovery can potentially take quite
> a long time -- much longer than we want to wait during a removal event.  
> Instead, the error handler should not be allowed to make the transition to
> RUNNING once the removal has started.

I agree (about the deadlocks).  However, as things stand, RECOVERY is a
state in the model, and the model can only be in a single state at a
time.  If you permit the transition and recovery is going on in parallel
with removal, they'll race to set the final state (removal wants DEL and
the error handler thread will set it to RUNNING).

Either we go back to having an in_recovery flag (i.e. lift recovery out
of the state model) or we make the model more complex to cope with this.
Since really the only thing we test is in_recovery, we could do a more
complex model; something like:

created
   |
   v    <--------- 
 running ---------> recovery
   |                   |
   v   <----------     v
 cancel ----------> recover/cancel
   |                   |
   v   ----------->    v
  del <------------ recover/del

I'd also prefer not to allow del -> recover/del, but that can't be
avoided unless del actually means that all devices have completed
their I/O for deletion.
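
A minimal user-space sketch of the seven-state model drawn above, written
as a transition table.  The enum and function names are hypothetical and
are not taken from the kernel source; the arrows are read as: created ->
running, running <-> recovery, cancel <-> recover/cancel, del <->
recover/del, plus the downward removal steps on each side.  Locking is
left out; real code would presumably need to serialise state changes.

```c
#include <stdbool.h>

/* Hypothetical names for the states in the diagram; not kernel identifiers. */
enum host_state {
	HS_CREATED,
	HS_RUNNING,
	HS_CANCEL,
	HS_DEL,
	HS_RECOVERY,
	HS_RECOVER_CANCEL,
	HS_RECOVER_DEL,
	HS_NR_STATES
};

#define HS_BIT(s) (1u << (s))

/* allowed[from] is a bitmask of the states reachable from `from`. */
static const unsigned int allowed[HS_NR_STATES] = {
	[HS_CREATED]        = HS_BIT(HS_RUNNING),
	[HS_RUNNING]        = HS_BIT(HS_RECOVERY) | HS_BIT(HS_CANCEL),
	[HS_RECOVERY]       = HS_BIT(HS_RUNNING) | HS_BIT(HS_RECOVER_CANCEL),
	[HS_CANCEL]         = HS_BIT(HS_RECOVER_CANCEL) | HS_BIT(HS_DEL),
	[HS_RECOVER_CANCEL] = HS_BIT(HS_CANCEL) | HS_BIT(HS_RECOVER_DEL),
	[HS_DEL]            = HS_BIT(HS_RECOVER_DEL),
	[HS_RECOVER_DEL]    = HS_BIT(HS_DEL),
};

/* Move to `next` only if the diagram has an arrow for it. */
static bool host_set_state(enum host_state *cur, enum host_state next)
{
	if (!(allowed[*cur] & HS_BIT(next)))
		return false;
	*cur = next;
	return true;
}

/* The single test the old in_recovery flag provided. */
static bool host_in_recovery(enum host_state s)
{
	return s == HS_RECOVERY || s == HS_RECOVER_CANCEL ||
	       s == HS_RECOVER_DEL;
}

/* Error handler finished: drop the recovery half, keep the removal half. */
static bool host_recovery_done(enum host_state *cur)
{
	switch (*cur) {
	case HS_RECOVERY:
		return host_set_state(cur, HS_RUNNING);
	case HS_RECOVER_CANCEL:
		return host_set_state(cur, HS_CANCEL);
	case HS_RECOVER_DEL:
		return host_set_state(cur, HS_DEL);
	default:
		return false;
	}
}
```

With this table, a removal that starts during recovery moves recovery ->
recover/cancel, and the error handler can then only land in cancel, never
in running, so the two paths no longer race over the final state.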


> Changing the API is fine with me, but the existing code is still shaky
> because it calls scsi_alloc_target before checking scsi_host_scan_allowed.  
> Maybe that's not an out-and-out mistake, but better to avoid it.

Actually, alloc_target is properly guarded so it doesn't need the scan
mutex.  It might be nice to update the SDEV_ state model to include a
"scanning" state; that way we could properly guard the sdev_alloc as
well and dump the scan mutex ... that's probably more than a slight
change, though.

> Would you like me to submit an updated patch?

Yes, please.  It's been suggested that we should have a scsi_add_target
that returns zero on success or an error code on failure (with no ref to
the sdev) and keep the old behaviour of __scsi_add_target().
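
A rough user-space mock of that split, only to show the calling
convention.  Everything here is hypothetical: mock_sdev, its reference
count, the helper names and the choice of -ENODEV are stand-ins, and the
"old behaviour" is assumed to mean handing back an sdev with a reference
held (NULL on failure is likewise an assumption of the mock).

```c
#include <errno.h>
#include <stddef.h>

/* Mock device with a bare reference count; a stand-in for the real sdev. */
struct mock_sdev {
	int refcount;
};

/*
 * Stand-in for the __scsi_add_target() style: on success the caller
 * gets the device back with a reference held and must drop it later.
 */
static struct mock_sdev *mock_add_target_ref(struct mock_sdev *candidate)
{
	if (!candidate)
		return NULL;
	candidate->refcount++;
	return candidate;
}

/* Drop a reference taken by mock_add_target_ref(). */
static void mock_sdev_put(struct mock_sdev *sdev)
{
	sdev->refcount--;
}

/*
 * Stand-in for the suggested scsi_add_target() style: 0 on success,
 * a negative errno on failure, and no reference left with the caller.
 */
static int mock_add_target(struct mock_sdev *candidate)
{
	struct mock_sdev *sdev = mock_add_target_ref(candidate);

	if (!sdev)
		return -ENODEV;	/* error code chosen arbitrarily for the sketch */
	mock_sdev_put(sdev);
	return 0;
}
```

Callers that only care whether the add worked would use the int-returning
form; callers that need the device afterwards would keep using the
reference-returning one.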

James
