From: Mike Christie <michaelc@cs.wisc.edu>
To: James.Smart@Emulex.Com
Cc: linux-scsi@vger.kernel.org
Subject: Re: [RFC] fc transport: extensions for fast fail and dev loss
Date: Tue, 25 Jul 2006 12:12:54 -0500
Message-ID: <44C65116.3030503@cs.wisc.edu>
In-Reply-To: <1150829123.16981.1.camel@localhost.localdomain>
James Smart wrote:
> Folks,
>
> The following addresses some long standing todo items I've had in the
> FC transport. They primarily arise when considering multipathing, or
> trying to marry driver internal state to transport state. It is intended
> that this same type of functionality would be usable in other transports
> as well.
>
I agree we need something like this. iSCSI is going to move to something
closer to FC in 2.6.19 to better integrate qla4xxx and give FC and iSCSI
a similar interface when it makes sense.
> Here's what is contained:
>
> - dev_loss_tmo LLDD callback :
> Currently, there is no notification to the LLDD of when the transport
> gives up on the device returning and starts to return DID_NO_CONNECT
> in the queuecommand helper function. This callback notifies the LLDD
> that the transport has now given up on the rport, thereby acknowledging
> the prior fc_remote_port_delete() call. The callback also expects the
> LLDD to initiate the termination of any outstanding i/o on the rport.
>
iSCSI does something like this at the lower level right now. For the
common lower level iscsi layer that software drivers share we have a
callback that allows drivers to do the same thing as your dev_loss_tmo
callback. When we move to the new model we will need something like this.
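To make the callback idea concrete, here is a rough sketch in plain C. It is not the real transport code: struct rport, struct transport_template, the dev_loss_cb member and both functions are names made up for illustration. It only shows the shape of the handoff, where the transport gives up on the port and then lets the driver clean up its outstanding i/o.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real FC or iSCSI transport types. */
enum port_state { PORT_ONLINE, PORT_BLOCKED, PORT_LOST };

struct rport {
	enum port_state state;
	int outstanding_io;	/* commands the driver still owns */
};

/* Hypothetical per-driver template; the callback is optional. */
struct transport_template {
	void (*dev_loss_cb)(struct rport *rport);
};

/* Example driver callback: terminate whatever is still outstanding. */
static void example_dev_loss(struct rport *rport)
{
	assert(rport != NULL);
	rport->outstanding_io = 0;
}

/* Transport side: runs when the dev_loss_tmo timer expires. */
static void transport_dev_loss_expired(const struct transport_template *t,
				       struct rport *rport)
{
	assert(t != NULL && rport != NULL);
	/* From here on new commands would get DID_NO_CONNECT. */
	rport->state = PORT_LOST;
	if (t->dev_loss_cb)
		t->dev_loss_cb(rport);
}
```

A driver that leaves dev_loss_cb NULL keeps today's behavior in this sketch: the port is marked lost, but nothing tells the driver to flush its commands.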
> - fast_io_fail_tmo and LLD callback:
> There are some cases where it may take a long while to truly determine
> device loss, but the system is in a multipathing configuration that if
> the i/o was failed quickly (faster than dev_loss_tmo), it could be
> redirected to a different path and completed sooner (assuming the
> multipath thing knew that the sdev was blocked).
>
> iSCSI is one of the transports that may vary dev_loss_tmo values
> per session, and you would like fast io failure.
>
Agreed. Currently we are sort of doing this in userspace, but since
qla4xxx does more in the kernel we would like to move it there so
qla4xxx and other HW iSCSI cards do not have to jump through so many
hoops to use the functionality.
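As I understand the proposal, the two timers give three phases after the port gets blocked. A small sketch of that ordering, assuming both timeouts are in seconds and that 0 means fast fail is disabled (the struct, the enum and the function name are invented for this example):

```c
#include <assert.h>
#include <stddef.h>

/* What should happen to i/o at a given time after the port was blocked. */
enum io_action {
	IO_HOLD,	/* keep it queued, the port may come back */
	IO_FAIL_FAST,	/* fail it so multipath can retry on another path */
	IO_NO_CONNECT	/* transport gave up: DID_NO_CONNECT */
};

/* Hypothetical container for the two timeouts, in seconds. */
struct rport_timers {
	unsigned int fast_io_fail_tmo;	/* assumption: 0 = disabled */
	unsigned int dev_loss_tmo;
};

static enum io_action io_action_after(const struct rport_timers *t,
				      unsigned int elapsed)
{
	assert(t != NULL);
	if (elapsed >= t->dev_loss_tmo)
		return IO_NO_CONNECT;
	if (t->fast_io_fail_tmo && elapsed >= t->fast_io_fail_tmo)
		return IO_FAIL_FAST;
	return IO_HOLD;
}
```

With illustrative values of fast_io_fail_tmo = 5 and dev_loss_tmo = 30, i/o is held for the first 5 seconds, failed fast from 5 up to 30, and gets DID_NO_CONNECT after that.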
> - fast_loss_time recommendation:
> In discussing how an admin should set dev_loss_tmo in a multipathing
> environment, it became apparent that we expected the admin to know
> a lot. They had to know the transport type, what the minimum setting
> can be that still survives normal link bouncing, and they may even
> have to know about device specifics. For iSCSI, the proper loss time
> may vary widely from session to session.
>
> This attribute is an exported "recommendation" by the LLDD and transport
> on what the lowest setting for dev_loss_tmo should be for a multipathing
> environment. Thus, the admin only needs to cat this attribute to obtain
> the value to echo into dev_loss_tmo.
>
> I have one criticism of these changes. The callbacks are calling into
> the LLDD with an rport post the driver's rport_delete call. What it means
> is that we are essentially extending the lifetime of an rport until the
> dev_loss_tmo call occurs.
>
So is the fast_io_fail_tmo callback the terminate_rport_io callback? If
so, are we supposed to unblock the rport/session/target from
fc_timeout_fail_rport_io and then call into the LLD, with the LLD setting
some bit (or maybe checking some rport/session/target/scsi_device bit) so
that incoming IO and IO sitting in the driver is failed with something
like DID_BUS_BUSY and goes to the upper layers? I think I only see the
unblock happen on success or fc_starget_delete, so IO in the driver
looks like it can get failed upwards, but IO sitting in the queue sits
there until fc_rport_final_delete or success.
If that is correct, what about a new device state? When the fast fail
tmo expires we can set the device to the new state and run the queue.
Incoming IO, or IO in the request_queue, that is marked with FAILFAST
can then be failed upwards by scsi-ml.
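Roughly what I have in mind, as a standalone sketch rather than real scsi-ml code. DEV_FASTFAIL is the hypothetical new state; the request struct, the result codes and both functions are invented here just to show which IO would be failed and which would stay queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device states; DEV_FASTFAIL is the proposed new one. */
enum dev_state { DEV_RUNNING, DEV_BLOCKED, DEV_FASTFAIL };

enum req_result { REQ_DISPATCHED, REQ_QUEUED, REQ_FAILED };

/* Hypothetical request: only the FAILFAST marking matters here. */
struct request {
	bool failfast;
	enum req_result result;
};

/* Decide what happens to one request in the given device state. */
static enum req_result classify(enum dev_state state,
				const struct request *rq)
{
	switch (state) {
	case DEV_RUNNING:
		return REQ_DISPATCHED;
	case DEV_FASTFAIL:
		/* Only FAILFAST IO is failed upwards, the rest waits. */
		return rq->failfast ? REQ_FAILED : REQ_QUEUED;
	case DEV_BLOCKED:
	default:
		return REQ_QUEUED;
	}
}

/* "Run the queue": classify every request, count the failed ones. */
static size_t run_queue(enum dev_state state, struct request *q, size_t n)
{
	size_t failed = 0;

	assert(q != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		q[i].result = classify(state, &q[i]);
		if (q[i].result == REQ_FAILED)
			failed++;
	}
	return failed;
}
```

In the plain blocked state nothing is failed, which matches what I think happens today for IO sitting in the queue. Only the move to the new state lets the FAILFAST IO go back up.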
I just woke up though :)