From: James Smart <James.Smart@Emulex.Com>
To: Christof Schmitt <christof.schmitt@de.ibm.com>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: fc_remote_port_delete and returning SCSI commands from LLD
Date: Wed, 21 Oct 2009 12:24:31 -0400 [thread overview]
Message-ID: <4ADF35BF.8050100@emulex.com> (raw)
In-Reply-To: <20091020144027.GA17717@schmichrtp.de.ibm.com>
Christof Schmitt wrote:
> I am looking again at how and when a FC LLD should call
> fc_remote_port_delete. Some help would be welcome to cover all
> requirements and to plug the holes...
It's pretty simple, as far as the FC transport is concerned.
Call fc_remote_port_add() once connectivity is established.
Call fc_remote_port_delete() once connectivity is lost.
It is expected that there is a clear add->delete->add->delete->... sequence.
Timing is considered "immediate", but there's always a window of delay.
In general, the Transport ignores what happens to outstanding i/o, letting the
LLDD do something based on its policy, or let natural i/o timers or the fast
fail timer to fire. The transport, at the delete call, will start the fast
fail timer if it is set.
> One scenario i am looking at: The connection to the HBA has been
> temporarily lost and the LLD has to return all pending I/O requests to
> the upper layers, so they can be retried later. Now with the SCSI
> device being part of a multipath device, the first failed I/O request
> triggers path failover:
You are now asking a different question - how to make the upper layers play
nice with the different responses from queuecommand, the LLDD's interaction
with the transport and midlayer, etc.
>
> multipath_end_io
> do_end_io
> fail_path
> queue_work(kmultipathd, &pgpath->deactivate_path);
>
> which then marks the following returned requests as timed out:
>
> deactivate_path
> blk_abort_queue
> blk_abort_request
> blk_rq_timed_out
> scsi_times_out
> fc_timed_out
>
> If the remote_port status is not BLOCKED, this will trigger the SCSI
> midlayer error handling which cannot do much during the interruption
> to the hardware and will mark the SCSI devices 'offline'.
Well - this isn't absolute, but is pretty much true. We expect, when
connectivity is lost, for the block state to be temporarily entered. The
blocked state holds off further i/o and the eh handler as well, to postpone
the normal i/o failure cases which do lead to offline conditions in most
scenarios.
But - this process is a coordinated effort between the driver and the upper
layers, and where the driver doesn't get helped by the transport (the blocked
state) it had better mimic the return codes at the different points, and
perhaps more, so that bad things don't happen.
> In order to
> prevent this, the rule would be: First call fc_remote_port_delete to
> set the remote port (or in the case of an HBA interruption all remote
> ports) to BLOCKED, and only after this step call scsi_done to pass the
> SCSI commands back to the upper layers.
True, although as mentioned, i/o termination is considered independent from
the rport/transport. But, you're best off if the target is blocked due to the
rport delete as we've prepped the upper layers to behave best with this behavior.
There will always be a few i/o's that sneak in or complete (timeout ?) in
between when the LLDD detects connectivity loss and when the
fc_remote_port_delete has been called. It's up to the LLDD to handle this window.
Completions, including i/o timeouts, are typically not a big deal and should
just return via scsi_done as they normally would. The caveat is when those
i/o's are from the eh thread. Granted - if you are actively aborting/failing
i/o at the connectivity loss, and doing so before the block is in place,
you're causing more headaches for yourself in getting the upper layers to play
right with the LLDD - with the recommendation being "don't do that".
New i/o needs to be caught in queuecommand with the LLDD emulating the
transport status that would normally get returned. E.g. the call to
fc_remote_port_chkready() won't catch it as the fc_remote_port_delete() call
hasn't completed yet - so the LLDD needs a 2nd check against it's own
structures, and if it detects the state, it should fail the i/o with the same
codes that chkready would. In reality, if you wanted to accept the command,
but never issue it and just leave it outstanding - waiting for i/o timeout, or
fast fail i/o timout, or devloss_tmo, I guess you could.
> This means, if the HBA problem is detected in interrupt context,
> fc_remote_port_delete has to be called before calling scsi_done.
Well - execution context is somewhat unrelated, as it depends on how the LLDD
is implemented, and what else its doing when connectivity is lost.
>
> But the description for fc_remote_port_delete states:
>
> * Called from normal process context only - cannot be called from
> * interrupt.
> *
> * Notes:
> * This routine assumes no locks are held on entry.
> */
>
> Looking at the functions called from fc_remote_port_delete, i don't
> see a problem in calling fc_remote_port_delete from interrupt context
> or with locks held. Does this mean the description should be fixed or
> am i missing something?
That's probably true. underlying routines have changed a bit over time and it
may be better now. I'd still hesitate with fc_tgt_it_nexus_destroy() (although
it shouldn't be applicable to you), and scsi_target_block(). Creating
additional lock hierachies between LLDD locks and the locks in these paths
(which the LLDD rarely sees/knows about) isn't good. Thus, we've mostly
pushed LLDDs to use a pristine context when calling the transport (such as a
workq context) so that we can disassociate low-level LLDD design from midlayer
design.
> fc_remote_port_add on the other hand can wait during flushes and has
> to be called from process context. To summarize:
> - A LLD has to call fc_remote_port_delete before returning SCSI
> commands from a failed port or failed HBA.
not true, but best behavior.
> - fc_remote_port_delete can be called from interrupt context before
> calling scsi_done if necessary
part a (called from interrupt context) - do so at your own risk. These other
paths can change at any time and its not fair for those developers to know
your driver dependencies.
part b (before calling scsi_done) - recommended approach.
> - fc_remote_port_add has to be called from process context
True.
> - The LLD has to serialize the fc_remote_port_add and
> fc_remote_port_delete calls to guarantee the add->delete->...
> sequence.
True.
-- james s
>
> --
> Christof
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2009-10-21 16:24 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-10-20 14:40 fc_remote_port_delete and returning SCSI commands from LLD Christof Schmitt
2009-10-21 15:24 ` Christof Schmitt
2009-10-21 16:33 ` James Smart
2009-10-23 7:58 ` Christof Schmitt
2009-10-23 14:50 ` James Smart
2009-10-27 16:59 ` Christof Schmitt
2009-10-27 19:44 ` James Smart
2009-10-21 16:24 ` James Smart [this message]
2009-10-23 7:47 ` Christof Schmitt
2009-10-23 14:47 ` James Smart
2009-10-27 21:57 ` Mike Christie
2009-10-21 18:11 ` Mike Christie
2009-10-23 7:13 ` Christof Schmitt
2009-10-27 21:53 ` Mike Christie
2009-10-28 14:27 ` Christof Schmitt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ADF35BF.8050100@emulex.com \
--to=james.smart@emulex.com \
--cc=christof.schmitt@de.ibm.com \
--cc=linux-scsi@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox