From: Hannes Reinecke <hare@suse.de>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: James.Smart@emulex.com, "Elliott,
	Robert (Server Storage)" <Elliott@hp.com>,
	"emilne@redhat.com" <emilne@redhat.com>,
	SCSI Mailing List <linux-scsi@vger.kernel.org>,
	Andrew Vasquez <andrew.vasquez@qlogic.com>,
	Chad Dupuis <chad.dupuis@qlogic.com>,
	James Bottomley <James.Bottomley@HansenPartnership.com>
Subject: Re: Error handling on FC devices
Date: Mon, 03 Dec 2012 08:15:10 +0100
Message-ID: <50BC517E.4090208@suse.de>
In-Reply-To: <50B8E4AC.8@cs.wisc.edu>

On 11/30/2012 05:54 PM, Mike Christie wrote:
> On 11/30/2012 05:44 AM, Hannes Reinecke wrote:
>> On 11/29/2012 05:02 PM, James Smart wrote:
>>> Always possible - but....   Our f/w works at the FCP level and
>>> below, which means it doesn't know/do SCSI commands - e.g what the
>>> cdb within the FCP CMD frame is; know anything about SCSI device
>>> classes and state; etc. And it shouldn't be required to do so.
>>> Anytime this has been there in the past, it's been problematic.
>>>
>>> if we want to do this - we should add it to the midlayer/transport.
>>>
>> D'accord. Transport layer looks like a good fit.
>>
>> What we should be doing is hooking up 'bus_reset' to be equivalent to
>> REMOVE I_T NEXUS (SAS is already doing this).
>
> Do you mean the scsi eh bus reset callout and if so does that work on
> multiple targets but REMOVE I_T NEXUS only will operate on one at a
> time? I think it would be cleaner to add a new callout that works like
> the target reset one where the scsi-ml loops over the targets for the
> drivers.
>
Well, looking at QLogic and Emulex, both emulate a bus reset by
looping over each target and invoking a target reset there.
I somewhat fail to see the rationale behind it, other than emulating
the bus reset behaviour on SPI.
Given that the original target reset already failed (otherwise we
wouldn't be doing a bus reset), I doubt a _second_ target reset
will lead to a different result.

So invoking REMOVE I_T NEXUS here can only improve matters :-)
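
To illustrate the point about the emulated bus reset, here is a rough
userspace sketch. It is not taken from the QLogic or Emulex drivers;
struct sim_tgt, sim_target_reset() and sim_emulated_bus_reset() are
made-up names for this example, and a target is simply modelled as
either answering a reset or not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a target; not a kernel structure. */
struct sim_tgt {
	int id;
	bool responsive;   /* does the target answer a target reset? */
	int resets_seen;   /* how many target resets were sent to it */
};

/* Simulated target reset: succeeds only if the target responds. */
static bool sim_target_reset(struct sim_tgt *t)
{
	t->resets_seen++;
	return t->responsive;
}

/*
 * Emulated bus reset: loop over all targets and issue a target
 * reset to each one. Returns the number of targets that failed.
 */
static size_t sim_emulated_bus_reset(struct sim_tgt *tgts, size_t n)
{
	size_t failed = 0;

	assert(tgts != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!sim_target_reset(&tgts[i]))
			failed++;
	}
	return failed;
}
```

In this model a target whose reset already failed during the target
reset stage simply receives the same operation a second time during
the emulated bus reset and fails again, so the escalation step does
not add anything for that target.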

I'm all for renaming bus_reset, though :-)

>>
>> In our case a REMOVE I_T NEXUS would be roughly equivalent to
>> scsi_remote_port_delete(); only we should be starting aborting
>> outstanding I/O directly and not waiting for fast_fail_tmo
>> to kick in.
>>
>
> To abort IO, will you be calling the drivers terminate_rport_io or
> dev_loss_tmo_callbk? If so I just wanted to warn you that I noticed that
> some drivers will only initiate the aborting/cleanup of IO in there. So
> if you call those callouts and expect that when finished scsi-ml can
> free the scsi command and pass the request back up, I think we could hit
> some races with memory issues.
>
Yeah, I know.
What I had in mind was to invoke terminate_rport_io() and then wait
for a certain time until either all outstanding commands have been
processed (i.e. starget->busy drops to zero) or the port state changed.
I'm not quite sure how long I should be waiting, but
dev_loss_tmo will be a good upper limit here.
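
The wait described above could look roughly like the following
userspace sketch. This is only a model of the idea, not the patch:
struct sim_target, sim_tick() and wait_for_io_drain() are
hypothetical names, the busy counter stands in for starget->busy,
and the timeout is counted in abstract poll ticks rather than in
jiffies.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel state; not the real structures. */
enum sim_port_state { SIM_PORT_ONLINE, SIM_PORT_BLOCKED };

struct sim_target {
	int busy;                  /* outstanding commands */
	enum sim_port_state state; /* remote port state */
	int drain_per_tick;        /* commands completed per poll tick */
	int state_change_tick;     /* tick of a port state change, -1 for none */
};

enum wait_result { WAIT_DRAINED, WAIT_PORT_CHANGED, WAIT_TIMED_OUT };

/* Simulate what happens during one poll interval. */
static void sim_tick(struct sim_target *t, int tick)
{
	t->busy -= t->drain_per_tick;
	if (t->busy < 0)
		t->busy = 0;
	if (tick == t->state_change_tick)
		t->state = SIM_PORT_BLOCKED;
}

/*
 * Called after I/O termination has been requested. Poll until all
 * commands are done, the port state changes, or the upper limit
 * (modelled on dev_loss_tmo) is reached.
 */
static enum wait_result wait_for_io_drain(struct sim_target *t,
					  int dev_loss_tmo_ticks)
{
	enum sim_port_state initial = t->state;

	assert(dev_loss_tmo_ticks >= 0);
	for (int tick = 0; ; tick++) {
		if (t->busy == 0)
			return WAIT_DRAINED;
		if (t->state != initial)
			return WAIT_PORT_CHANGED;
		if (tick >= dev_loss_tmo_ticks)
			return WAIT_TIMED_OUT;
		sim_tick(t, tick);
	}
}
```

The caller would only hand commands back to the upper layers for the
WAIT_DRAINED case; the other two results mean that commands may
still be owned by the driver.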

As said, I'll be posting a patch.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)