* [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout
@ 2008-12-05 17:18 James Smart
2008-12-05 22:29 ` Mike Christie
2008-12-08 22:31 ` Mike Christie
0 siblings, 2 replies; 4+ messages in thread
From: James Smart @ 2008-12-05 17:18 UTC (permalink / raw)
To: linux-scsi
Pre-emptively terminate i/o on the rport if dev_loss_tmo has fired.
The desire is to terminate everything, so that the i/o is cleaned up
prior to the sdev's being unblocked, thus any outstanding timeouts/aborts
are avoided.
Also, we do this early enough such that the rport's port_id field is
still valid. FCOE libFC code needs this info to find the i/o's to
terminate.
-- james s
Signed-off-by: James Smart <james.smart@emulex.com>
---
scsi_transport_fc.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff -upNr a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
--- a/drivers/scsi/scsi_transport_fc.c 2008-10-18 10:32:52.000000000 -0400
+++ b/drivers/scsi/scsi_transport_fc.c 2008-12-05 12:13:54.000000000 -0500
@@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_str
rport->port_state = FC_PORTSTATE_NOTPRESENT;
rport->flags &= ~FC_RPORT_FAST_FAIL_TIMEDOUT;
+ /*
+ * Pre-emptively kill I/O rather than waiting for the work queue
+ * item to teardown the starget. (FCOE libFC folks prefer this
+ * and to have the rport_port_id still set when it's done).
+ */
+ spin_unlock_irqrestore(shost->host_lock, flags);
+ fc_terminate_rport_io(rport);
+
+ BUG_ON(rport->port_state != FC_PORTSTATE_NOTPRESENT);
+
/* remove the identifiers that aren't used in the consisting binding */
switch (fc_host->tgtid_bind_type) {
case FC_TGTID_BIND_BY_WWPN:
@@ -3035,7 +3045,6 @@ fc_timeout_deleted_rport(struct work_str
* went away and didn't come back - we'll remove
* all attached scsi devices.
*/
- spin_unlock_irqrestore(shost->host_lock, flags);
scsi_target_unblock(&rport->dev);
fc_queue_work(shost, &rport->stgt_delete_work);
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout
2008-12-05 17:18 [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout James Smart
@ 2008-12-05 22:29 ` Mike Christie
2008-12-05 22:41 ` James Smart
2008-12-08 22:31 ` Mike Christie
1 sibling, 1 reply; 4+ messages in thread
From: Mike Christie @ 2008-12-05 22:29 UTC (permalink / raw)
To: James.Smart; +Cc: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 770 bytes --]
James Smart wrote:
>
> diff -upNr a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> --- a/drivers/scsi/scsi_transport_fc.c 2008-10-18 10:32:52.000000000 -0400
> +++ b/drivers/scsi/scsi_transport_fc.c 2008-12-05 12:13:54.000000000 -0500
> @@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_str
> @@ -3035,7 +3045,6 @@ fc_timeout_deleted_rport(struct work_str
> * went away and didn't come back - we'll remove
> * all attached scsi devices.
> */
> - spin_unlock_irqrestore(shost->host_lock, flags);
>
> scsi_target_unblock(&rport->dev);
> fc_queue_work(shost, &rport->stgt_delete_work);
>
We do not need this unblock call anymore. The added
fc_terminate_rport_io will do this. I attached a patch that removes it
for you.
[-- Attachment #2: james-pre-emptively-terminate-2.patch --]
[-- Type: text/plain, Size: 1718 bytes --]
Pre-emptively terminate i/o on the rport if dev_loss_tmo has fired.
The desire is to terminate everything, so that the i/o is cleaned up
prior to the sdev's being unblocked, thus any outstanding timeouts/aborts
are avoided.
Also, we do this early enough such that the rport's port_id field is
still valid. FCOE libFC code needs this info to find the i/o's to
terminate.
-- james s
Signed-off-by: James Smart <james.smart@emulex.com>
[michaelc@cs.wisc.edu.com: remove extra scsi_target_unblock call]
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu.com>
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 1e71abf..062304d 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_struct *work)
rport->port_state = FC_PORTSTATE_NOTPRESENT;
rport->flags &= ~FC_RPORT_FAST_FAIL_TIMEDOUT;
+ /*
+ * Pre-emptively kill I/O rather than waiting for the work queue
+ * item to teardown the starget. (FCOE libFC folks prefer this
+ * and to have the rport_port_id still set when it's done).
+ */
+ spin_unlock_irqrestore(shost->host_lock, flags);
+ fc_terminate_rport_io(rport);
+
+ BUG_ON(rport->port_state != FC_PORTSTATE_NOTPRESENT);
+
/* remove the identifiers that aren't used in the consisting binding */
switch (fc_host->tgtid_bind_type) {
case FC_TGTID_BIND_BY_WWPN:
@@ -3035,9 +3045,6 @@ fc_timeout_deleted_rport(struct work_struct *work)
* went away and didn't come back - we'll remove
* all attached scsi devices.
*/
- spin_unlock_irqrestore(shost->host_lock, flags);
-
- scsi_target_unblock(&rport->dev);
fc_queue_work(shost, &rport->stgt_delete_work);
}
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout
2008-12-05 22:29 ` Mike Christie
@ 2008-12-05 22:41 ` James Smart
0 siblings, 0 replies; 4+ messages in thread
From: James Smart @ 2008-12-05 22:41 UTC (permalink / raw)
To: Mike Christie; +Cc: linux-scsi@vger.kernel.org
Agreed - thanks Mike.
-- james s
Mike Christie wrote:
> James Smart wrote:
>> diff -upNr a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
>> --- a/drivers/scsi/scsi_transport_fc.c 2008-10-18 10:32:52.000000000 -0400
>> +++ b/drivers/scsi/scsi_transport_fc.c 2008-12-05 12:13:54.000000000 -0500
>> @@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_str
>> @@ -3035,7 +3045,6 @@ fc_timeout_deleted_rport(struct work_str
>> * went away and didn't come back - we'll remove
>> * all attached scsi devices.
>> */
>> - spin_unlock_irqrestore(shost->host_lock, flags);
>>
>> scsi_target_unblock(&rport->dev);
>> fc_queue_work(shost, &rport->stgt_delete_work);
>>
>
> We do not need this unblock call anymore. The added
> fc_terminate_rport_io will do this. I attached a patch that removes it
> for you.
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout
2008-12-05 17:18 [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout James Smart
2008-12-05 22:29 ` Mike Christie
@ 2008-12-08 22:31 ` Mike Christie
1 sibling, 0 replies; 4+ messages in thread
From: Mike Christie @ 2008-12-08 22:31 UTC (permalink / raw)
To: James.Smart; +Cc: linux-scsi
James Smart wrote:
> Pre-emptively terminate i/o on the rport if dev_loss_tmo has fired.
> The desire is to terminate everything, so that the i/o is cleaned up
> prior to the sdev's being unblocked, thus any outstanding timeouts/aborts
> are avoided.
>
> Also, we do this early enough such that the rport's port_id field is
> still valid. FCOE libFC code needs this info to find the i/o's to
> terminate.
>
> -- james s
>
> Signed-off-by: James Smart <james.smart@emulex.com>
>
> ---
>
> scsi_transport_fc.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
>
> diff -upNr a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> --- a/drivers/scsi/scsi_transport_fc.c 2008-10-18 10:32:52.000000000 -0400
> +++ b/drivers/scsi/scsi_transport_fc.c 2008-12-05 12:13:54.000000000 -0500
> @@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_str
> rport->port_state = FC_PORTSTATE_NOTPRESENT;
> rport->flags &= ~FC_RPORT_FAST_FAIL_TIMEDOUT;
>
> + /*
> + * Pre-emptively kill I/O rather than waiting for the work queue
> + * item to teardown the starget. (FCOE libFC folks prefer this
> + * and to have the rport_port_id still set when it's done).
> + */
> + spin_unlock_irqrestore(shost->host_lock, flags);
> + fc_terminate_rport_io(rport);
> +
> + BUG_ON(rport->port_state != FC_PORTSTATE_NOTPRESENT);
> +
I think we could this this bug_on or we might hit a problem below where
this thread is changing the port_id but some other thread is calling
fc_remote_port_add and changging the port id.
I think the problem is that fc_remote_port_add only calls fc_flush_work
which flushes the fc_host work_q:
struct fc_rport *
fc_remote_port_add(struct Scsi_Host *shost, int channel,
struct fc_rport_identifiers *ids)
{
struct fc_internal *fci = to_fc_internal(shost->transportt);
struct fc_host_attrs *fc_host = shost_to_fc_host(shost);
struct fc_rport *rport;
unsigned long flags;
int match = 0;
/* ensure any stgt delete functions are done */
fc_flush_work(shost);
But fc_timeout_deleted_rport dev_loss_work is running on the devloss_work_q.
So if fc_timeout_deleted_rport grabs the host lock first, it will set
the port state to FC_PORTSTATE_NOTPRESENT, then drop the lock.
Once fc_timeout_delete_rport drops the lock, fc_remote_port_add could
have already passed the fc_flush_work() call because there was nothing
on it. fc_remote_port_add would then drop down to the list search on the
bindings array and see the remote port. It would then start setting the
port state, id and port_name and node_name, but at the same time,
because the host lock no longer guards it, fc_timedout_deleted_rport
could be fiddling with the same fields and we could end up a mix of
values or it could be running over the BUG_ON.
Is that right?
If so do we just want to flush the devloss_work_queue in fc_remote_port_add?
> /* remove the identifiers that aren't used in the consisting binding */
> switch (fc_host->tgtid_bind_type) {
> case FC_TGTID_BIND_BY_WWPN:
> @@ -3035,7 +3045,6 @@ fc_timeout_deleted_rport(struct work_str
> * went away and didn't come back - we'll remove
> * all attached scsi devices.
> */
> - spin_unlock_irqrestore(shost->host_lock, flags);
>
> scsi_target_unblock(&rport->dev);
> fc_queue_work(shost, &rport->stgt_delete_work);
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2008-12-08 22:31 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-12-05 17:18 [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout James Smart
2008-12-05 22:29 ` Mike Christie
2008-12-05 22:41 ` James Smart
2008-12-08 22:31 ` Mike Christie
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).