From: Hannes Reinecke <hare@suse.de>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: James.Smart@emulex.com,
 "Elliott, Robert (Server Storage)" <Elliott@hp.com>,
 "emilne@redhat.com" <emilne@redhat.com>,
 SCSI Mailing List <linux-scsi@vger.kernel.org>,
 Andrew Vasquez <andrew.vasquez@qlogic.com>,
 Chad Dupuis <chad.dupuis@qlogic.com>,
 James Bottomley <James.Bottomley@HansenPartnership.com>
Subject: Re: Error handling on FC devices
Date: Mon, 03 Dec 2012 08:15:10 +0100
Message-ID: <50BC517E.4090208@suse.de>
In-Reply-To: <50B8E4AC.8@cs.wisc.edu>
On 11/30/2012 05:54 PM, Mike Christie wrote:
> On 11/30/2012 05:44 AM, Hannes Reinecke wrote:
>> On 11/29/2012 05:02 PM, James Smart wrote:
>>> Always possible - but.... Our f/w works at the FCP level and
>>> below, which means it doesn't know/do SCSI commands - e.g what the
>>> cdb within the FCP CMD frame is; know anything about SCSI device
>>> classes and state; etc. And it shouldn't be required to do so.
>>> Anytime this has been there in the past, it's been problematic.
>>>
>>> if we want to do this - we should add it to the midlayer/transport.
>>>
>> D'accord. Transport layer looks like a good fit.
>>
>> What we should be doing is hooking up 'bus_reset' to be equivalent to
>> REMOVE I_T NEXUS (SAS is already doing this).
>
> Do you mean the scsi eh bus reset callout and if so does that work on
> multiple targets but REMOVE I_T NEXUS only will operate on one at a
> time? I think it would be cleaner to add a new callout that works like
> the target reset one where the scsi-ml loops over the targets for the
> drivers.
>
Well, looking at the drivers, QLogic and Emulex both emulate a bus reset
by looping over each target and invoking a target reset there.
I somewhat fail to see the rationale behind it, other than emulating
the bus reset behaviour on SPI.
Given that the original target reset already failed (otherwise we
wouldn't be doing a bus reset), I doubt a _second_ target reset
will lead to a different result.
So invoking REMOVE I_T NEXUS here can only improve matters :-)
I'm all for renaming bus_reset, though :-)
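
To illustrate the argument, here is a rough userspace model, not driver
code. All names (model_target, answers_tmf, model_bus_reset_emulated,
model_drop_nexus) are made up for this sketch. It assumes that a target
which ignored the first reset will ignore the second one as well, and
that dropping the nexus is done purely on the initiator side, much like
a remote port delete:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a target as seen from the initiator. */
struct model_target {
	bool answers_tmf;	/* assumption: a wedged target never answers a reset */
	int outstanding;	/* commands still owned by the driver */
	bool nexus_present;	/* I_T nexus still established */
};

/* Model of a target reset: only succeeds if the target cooperates. */
static bool model_target_reset(struct model_target *t)
{
	if (!t->answers_tmf)
		return false;
	t->outstanding = 0;
	return true;
}

/*
 * Emulated bus reset: loop over all targets and reset each one.
 * Returns the number of targets for which the reset failed.
 */
size_t model_bus_reset_emulated(struct model_target *targets, size_t n)
{
	size_t failed = 0;

	for (size_t i = 0; i < n; i++)
		if (!model_target_reset(&targets[i]))
			failed++;
	return failed;
}

/*
 * Model of dropping the I_T nexus: in this sketch it needs no answer
 * from the target, so it also cleans up after a wedged one.
 */
void model_drop_nexus(struct model_target *t)
{
	t->outstanding = 0;
	t->nexus_present = false;
}
```

In this model the emulated bus reset leaves the wedged target exactly
where the earlier target reset left it, while dropping the nexus at
least gets the outstanding commands back.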
>>
>> In our case a REMOVE I_T NEXUS would be roughly equivalent to
>> scsi_remote_port_delete(); only we should be starting aborting
>> outstanding I/O directly and not waiting for fast_fail_tmo
>> to kick in.
>>
>
> To abort IO, will you be calling the drivers terminate_rport_io or
> dev_loss_tmo_callbk? If so I just wanted to warn you that I noticed that
> some drivers will only initiate the aborting/cleanup of IO in there. So
> if you call those callouts and expect that when finished scsi-ml can
> free the scsi command and pass the request back up, I think we could hit
> some races with memory issues.
>
Yeah, I know.
What I had in mind was to invoke terminate_rport_io() and then wait
for a certain time until either all outstanding commands have been
processed (i.e. starget->busy drops to zero) or the port state has changed.
I'm not quite sure how long I should be waiting, but
dev_loss_tmo will be a good upper limit here.
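
Roughly, the wait could look like the userspace sketch below. This is
only a model of the idea and not the patch itself: model_rport,
model_tick() and the tick-based timing are invented for illustration,
the terminate_rport_io stand-in deliberately does nothing synchronously
(matching Mike's warning), and 'busy' merely stands in for
starget->busy:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
enum port_state { PORT_ONLINE, PORT_GONE };

struct model_rport {
	enum port_state state;
	int busy;		/* stands in for starget->busy */
	int dev_loss_tmo;	/* upper limit for the wait, in ticks */
	int drain_per_tick;	/* commands the simulated driver completes per tick */
	int ticks_until_gone;	/* <= 0: port state never changes */
};

enum wait_result { WAIT_DRAINED, WAIT_STATE_CHANGED, WAIT_TIMED_OUT };

/* Stand-in for the driver callout: it only *initiates* the cleanup. */
static void model_terminate_rport_io(struct model_rport *rport)
{
	(void)rport;	/* nothing is completed synchronously here */
}

/* Advance simulated time by one tick. */
static void model_tick(struct model_rport *rport)
{
	rport->busy -= rport->drain_per_tick;
	if (rport->busy < 0)
		rport->busy = 0;
	if (rport->ticks_until_gone > 0 && --rport->ticks_until_gone == 0)
		rport->state = PORT_GONE;
}

/*
 * Kick off the I/O termination, then wait until all outstanding
 * commands are gone, the port state changes, or dev_loss_tmo expires.
 */
enum wait_result model_remove_it_nexus(struct model_rport *rport)
{
	enum port_state start = rport->state;

	assert(rport->dev_loss_tmo >= 0);
	model_terminate_rport_io(rport);

	for (int waited = 0; waited < rport->dev_loss_tmo; waited++) {
		if (rport->busy == 0)
			return WAIT_DRAINED;
		if (rport->state != start)
			return WAIT_STATE_CHANGED;
		model_tick(rport);
	}
	return rport->busy == 0 ? WAIT_DRAINED : WAIT_TIMED_OUT;
}
```

Only in the WAIT_DRAINED case would it be safe to hand the commands
back up; the other two results mean the driver may still own them.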
As said, I'll be posting a patch.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)