From: Hannes Reinecke <hare@suse.de>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: James.Smart@emulex.com,
"Elliott, Robert (Server Storage)" <Elliott@hp.com>,
"emilne@redhat.com" <emilne@redhat.com>,
SCSI Mailing List <linux-scsi@vger.kernel.org>,
Andrew Vasquez <andrew.vasquez@qlogic.com>,
Chad Dupuis <chad.dupuis@qlogic.com>,
James Bottomley <James.Bottomley@HansenPartnership.com>
Subject: Re: Error handling on FC devices
Date: Mon, 03 Dec 2012 08:15:10 +0100
Message-ID: <50BC517E.4090208@suse.de>
In-Reply-To: <50B8E4AC.8@cs.wisc.edu>
On 11/30/2012 05:54 PM, Mike Christie wrote:
> On 11/30/2012 05:44 AM, Hannes Reinecke wrote:
>> On 11/29/2012 05:02 PM, James Smart wrote:
>>> Always possible - but.... Our f/w works at the FCP level and
>>> below, which means it doesn't know/do SCSI commands - e.g what the
>>> cdb within the FCP CMD frame is; know anything about SCSI device
>>> classes and state; etc. And it shouldn't be required to do so.
>>> Anytime this has been there in the past, it's been problematic.
>>>
>>> if we want to do this - we should add it to the midlayer/transport.
>>>
>> D'accord. Transport layer looks like a good fit.
>>
>> What we should be doing is hooking up 'bus_reset' to be equivalent to
>> REMOVE I_T NEXUS (SAS is already doing this).
>
> Do you mean the scsi eh bus reset callout and if so does that work on
> multiple targets but REMOVE I_T NEXUS only will operate on one at a
> time? I think it would be cleaner to add a new callout that works like
> the target reset one where the scsi-ml loops over the targets for the
> drivers.
>
Well, looking at QLogic and Emulex, both emulate a bus reset by
looping over each target and invoking a target reset there.
I somewhat fail to see the rationale behind it, other than emulating
the bus reset behaviour on SPI.
Given that the original target reset already failed (otherwise we
wouldn't be doing a bus reset), I doubt a _second_ target reset
will lead to a different result.
So invoking REMOVE I_T NEXUS here can only improve matters :-)
I'm all for renaming bus_reset, though :-)
>>
>> In our case a REMOVE I_T NEXUS would be roughly equivalent to
>> scsi_remote_port_delete(); only we should be starting aborting
>> outstanding I/O directly and not waiting for fast_fail_tmo
>> to kick in.
>>
>
> To abort IO, will you be calling the drivers terminate_rport_io or
> dev_loss_tmo_callbk? If so I just wanted to warn you that I noticed that
> some drivers will only initiate the aborting/cleanup of IO in there. So
> if you call those callouts and expect that when finished scsi-ml can
> free the scsi command and pass the request back up, I think we could hit
> some races with memory issues.
>
Yeah, I know.
What I had in mind was to invoke terminate_rport_io() and then wait
for a certain time until either all outstanding commands have been
processed (i.e. starget->busy drops to zero) or the port state has changed.
I'm not quite sure how long I should be waiting, but
dev_loss_tmo will be a good upper limit here.
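The wait described above could be sketched roughly as follows. This is a
userspace model only, not kernel code and not the patch itself: struct
model_target, wait_for_drain() and the tick callbacks are hypothetical names
made up for illustration, 'busy' merely stands in for starget->busy, and
limit_ms plays the role of dev_loss_tmo. In a real implementation the call to
terminate_rport_io() would come first, and the tick would be a sleep.

```c
#include <assert.h>

/* Userspace model only; none of these types are the kernel's. */
enum port_state { PORT_ONLINE, PORT_BLOCKED, PORT_GONE };

struct model_target {
	int busy;              /* stands in for starget->busy */
	enum port_state state; /* stands in for the remote port state */
};

enum wait_result { WAIT_DRAINED, WAIT_STATE_CHANGED, WAIT_TIMED_OUT };

/* Hypothetical hook: advances the model by one poll interval. */
typedef void (*tick_fn)(struct model_target *t);

/*
 * Poll until all outstanding commands are gone, the port state differs
 * from what it was on entry, or limit_ms has elapsed.
 */
static enum wait_result
wait_for_drain(struct model_target *t, tick_fn tick,
	       unsigned int poll_ms, unsigned int limit_ms)
{
	enum port_state start = t->state;
	unsigned int waited = 0;

	assert(poll_ms > 0);
	for (;;) {
		if (t->busy == 0)
			return WAIT_DRAINED;
		if (t->state != start)
			return WAIT_STATE_CHANGED;
		if (waited >= limit_ms)
			return WAIT_TIMED_OUT;
		tick(t); /* real code would sleep for poll_ms here */
		waited += poll_ms;
	}
}

/* Example tick: one aborted command completes per interval. */
static void complete_one(struct model_target *t)
{
	if (t->busy > 0)
		t->busy--;
}

/* Example tick: nothing ever completes. */
static void stuck(struct model_target *t)
{
	(void)t;
}

/* Example tick: the port disappears while commands are outstanding. */
static void port_goes_away(struct model_target *t)
{
	t->state = PORT_GONE;
}
```

With three commands outstanding and complete_one() as the tick, the loop
returns WAIT_DRAINED; with stuck() it gives up at the limit and returns
WAIT_TIMED_OUT, which is the case where the caller must not free the commands.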
As said, I'll be posting a patch.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)