linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout
@ 2008-12-05 17:18 James Smart
  2008-12-05 22:29 ` Mike Christie
  2008-12-08 22:31 ` Mike Christie
  0 siblings, 2 replies; 4+ messages in thread
From: James Smart @ 2008-12-05 17:18 UTC (permalink / raw)
  To: linux-scsi

Pre-emptively terminate i/o on the rport if dev_loss_tmo has fired.
The desire is to terminate everything, so that the i/o is cleaned up
prior to the sdev's being unblocked, thus any outstanding timeouts/aborts
are avoided.

Also, we do this early enough such that the rport's port_id field is
still valid. FCOE libFC code needs this info to find the i/o's to
terminate.

-- james s

 Signed-off-by: James Smart <james.smart@emulex.com>

 ---

 scsi_transport_fc.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)


diff -upNr a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
--- a/drivers/scsi/scsi_transport_fc.c	2008-10-18 10:32:52.000000000 -0400
+++ b/drivers/scsi/scsi_transport_fc.c	2008-12-05 12:13:54.000000000 -0500
@@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_str
 	rport->port_state = FC_PORTSTATE_NOTPRESENT;
 	rport->flags &= ~FC_RPORT_FAST_FAIL_TIMEDOUT;
 
+	/*
+	 * Pre-emptively kill I/O rather than waiting for the work queue
+	 * item to teardown the starget. (FCOE libFC folks prefer this
+	 * and to have the rport_port_id still set when it's done).
+	 */
+	spin_unlock_irqrestore(shost->host_lock, flags);
+	fc_terminate_rport_io(rport);
+
+	BUG_ON(rport->port_state != FC_PORTSTATE_NOTPRESENT);
+
 	/* remove the identifiers that aren't used in the consisting binding */
 	switch (fc_host->tgtid_bind_type) {
 	case FC_TGTID_BIND_BY_WWPN:
@@ -3035,7 +3045,6 @@ fc_timeout_deleted_rport(struct work_str
 	 * went away and didn't come back - we'll remove
 	 * all attached scsi devices.
 	 */
-	spin_unlock_irqrestore(shost->host_lock, flags);
 
 	scsi_target_unblock(&rport->dev);
 	fc_queue_work(shost, &rport->stgt_delete_work);



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout
  2008-12-05 17:18 [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout James Smart
@ 2008-12-05 22:29 ` Mike Christie
  2008-12-05 22:41   ` James Smart
  2008-12-08 22:31 ` Mike Christie
  1 sibling, 1 reply; 4+ messages in thread
From: Mike Christie @ 2008-12-05 22:29 UTC (permalink / raw)
  To: James.Smart; +Cc: linux-scsi

[-- Attachment #1: Type: text/plain, Size: 770 bytes --]

James Smart wrote:
> 
> diff -upNr a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> --- a/drivers/scsi/scsi_transport_fc.c	2008-10-18 10:32:52.000000000 -0400
> +++ b/drivers/scsi/scsi_transport_fc.c	2008-12-05 12:13:54.000000000 -0500
> @@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_str
> @@ -3035,7 +3045,6 @@ fc_timeout_deleted_rport(struct work_str
>  	 * went away and didn't come back - we'll remove
>  	 * all attached scsi devices.
>  	 */
> -	spin_unlock_irqrestore(shost->host_lock, flags);
>  
>  	scsi_target_unblock(&rport->dev);
>  	fc_queue_work(shost, &rport->stgt_delete_work);
> 

We do not need this unblock call anymore. The added 
fc_terminate_rport_io will do this. I attached a patch that removes it 
for you.

[-- Attachment #2: james-pre-emptively-terminate-2.patch --]
[-- Type: text/plain, Size: 1718 bytes --]

Pre-emptively terminate i/o on the rport if dev_loss_tmo has fired.
The desire is to terminate everything, so that the i/o is cleaned up
prior to the sdev's being unblocked, thus any outstanding timeouts/aborts
are avoided.
 
Also, we do this early enough such that the rport's port_id field is
still valid. FCOE libFC code needs this info to find the i/o's to
terminate.

 -- james s
 
Signed-off-by: James Smart <james.smart@emulex.com>
[michaelc@cs.wisc.edu.com: remove extra scsi_target_unblock call]
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu.com>

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 1e71abf..062304d 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_struct *work)
 	rport->port_state = FC_PORTSTATE_NOTPRESENT;
 	rport->flags &= ~FC_RPORT_FAST_FAIL_TIMEDOUT;
 
+	/*
+	 * Pre-emptively kill I/O rather than waiting for the work queue
+	 * item to teardown the starget. (FCOE libFC folks prefer this
+	 * and to have the rport_port_id still set when it's done).
+	 */
+	spin_unlock_irqrestore(shost->host_lock, flags);
+	fc_terminate_rport_io(rport);
+
+	BUG_ON(rport->port_state != FC_PORTSTATE_NOTPRESENT);
+
 	/* remove the identifiers that aren't used in the consisting binding */
 	switch (fc_host->tgtid_bind_type) {
 	case FC_TGTID_BIND_BY_WWPN:
@@ -3035,9 +3045,6 @@ fc_timeout_deleted_rport(struct work_struct *work)
 	 * went away and didn't come back - we'll remove
 	 * all attached scsi devices.
 	 */
-	spin_unlock_irqrestore(shost->host_lock, flags);
-
-	scsi_target_unblock(&rport->dev);
 	fc_queue_work(shost, &rport->stgt_delete_work);
 }
 

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout
  2008-12-05 22:29 ` Mike Christie
@ 2008-12-05 22:41   ` James Smart
  0 siblings, 0 replies; 4+ messages in thread
From: James Smart @ 2008-12-05 22:41 UTC (permalink / raw)
  To: Mike Christie; +Cc: linux-scsi@vger.kernel.org

Agreed - thanks Mike.

-- james s

Mike Christie wrote:
> James Smart wrote:
>> diff -upNr a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
>> --- a/drivers/scsi/scsi_transport_fc.c        2008-10-18 10:32:52.000000000 -0400
>> +++ b/drivers/scsi/scsi_transport_fc.c        2008-12-05 12:13:54.000000000 -0500
>> @@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_str
>> @@ -3035,7 +3045,6 @@ fc_timeout_deleted_rport(struct work_str
>>        * went away and didn't come back - we'll remove
>>        * all attached scsi devices.
>>        */
>> -     spin_unlock_irqrestore(shost->host_lock, flags);
>>
>>       scsi_target_unblock(&rport->dev);
>>       fc_queue_work(shost, &rport->stgt_delete_work);
>>
> 
> We do not need this unblock call anymore. The added
> fc_terminate_rport_io will do this. I attached a patch that removes it
> for you.
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout
  2008-12-05 17:18 [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout James Smart
  2008-12-05 22:29 ` Mike Christie
@ 2008-12-08 22:31 ` Mike Christie
  1 sibling, 0 replies; 4+ messages in thread
From: Mike Christie @ 2008-12-08 22:31 UTC (permalink / raw)
  To: James.Smart; +Cc: linux-scsi

James Smart wrote:
> Pre-emptively terminate i/o on the rport if dev_loss_tmo has fired.
> The desire is to terminate everything, so that the i/o is cleaned up
> prior to the sdev's being unblocked, thus any outstanding timeouts/aborts
> are avoided.
> 
> Also, we do this early enough such that the rport's port_id field is
> still valid. FCOE libFC code needs this info to find the i/o's to
> terminate.
> 
> -- james s
> 
>  Signed-off-by: James Smart <james.smart@emulex.com>
> 
>  ---
> 
>  scsi_transport_fc.c |   11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> 
> diff -upNr a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> --- a/drivers/scsi/scsi_transport_fc.c	2008-10-18 10:32:52.000000000 -0400
> +++ b/drivers/scsi/scsi_transport_fc.c	2008-12-05 12:13:54.000000000 -0500
> @@ -3012,6 +3012,16 @@ fc_timeout_deleted_rport(struct work_str
>  	rport->port_state = FC_PORTSTATE_NOTPRESENT;
>  	rport->flags &= ~FC_RPORT_FAST_FAIL_TIMEDOUT;
>  
> +	/*
> +	 * Pre-emptively kill I/O rather than waiting for the work queue
> +	 * item to teardown the starget. (FCOE libFC folks prefer this
> +	 * and to have the rport_port_id still set when it's done).
> +	 */
> +	spin_unlock_irqrestore(shost->host_lock, flags);
> +	fc_terminate_rport_io(rport);
> +
> +	BUG_ON(rport->port_state != FC_PORTSTATE_NOTPRESENT);
> +

I think we could this this bug_on or we might hit a problem below where 
this thread is changing the port_id but some other thread is calling 
fc_remote_port_add and changging the port id.

I think the problem is that fc_remote_port_add only calls fc_flush_work 
which flushes the fc_host work_q:


struct fc_rport *
fc_remote_port_add(struct Scsi_Host *shost, int channel,
         struct fc_rport_identifiers  *ids)
{
         struct fc_internal *fci = to_fc_internal(shost->transportt);
         struct fc_host_attrs *fc_host = shost_to_fc_host(shost);
         struct fc_rport *rport;
         unsigned long flags;
         int match = 0;

         /* ensure any stgt delete functions are done */
         fc_flush_work(shost);


But fc_timeout_deleted_rport dev_loss_work is running on the devloss_work_q.

So if fc_timeout_deleted_rport grabs the host lock first, it will set 
the port state to FC_PORTSTATE_NOTPRESENT, then drop the lock.

Once fc_timeout_delete_rport drops the lock, fc_remote_port_add could 
have already passed the fc_flush_work() call because there was nothing 
on it. fc_remote_port_add would then drop down to the list search on the 
bindings array and see the remote port. It would then start setting the 
port state, id and port_name and node_name, but at the same time, 
because the host lock no longer guards it, fc_timedout_deleted_rport 
could be fiddling with the same fields and we could end up a mix of 
values or it could be running over the BUG_ON.

Is that right?

If so do we just want to flush the devloss_work_queue in fc_remote_port_add?



>  	/* remove the identifiers that aren't used in the consisting binding */
>  	switch (fc_host->tgtid_bind_type) {
>  	case FC_TGTID_BIND_BY_WWPN:
> @@ -3035,7 +3045,6 @@ fc_timeout_deleted_rport(struct work_str
>  	 * went away and didn't come back - we'll remove
>  	 * all attached scsi devices.
>  	 */
> -	spin_unlock_irqrestore(shost->host_lock, flags);
>  
>  	scsi_target_unblock(&rport->dev);
>  	fc_queue_work(shost, &rport->stgt_delete_work);
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-12-08 22:31 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-12-05 17:18 [PATCH] fc transport: pre-emptively terminate i/o upon dev_loss_tmo timeout James Smart
2008-12-05 22:29 ` Mike Christie
2008-12-05 22:41   ` James Smart
2008-12-08 22:31 ` Mike Christie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).