From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mike Christie Subject: Re: [PATCH 03/28] libfc: IO errors on link down due to cable unplug Date: Wed, 28 Jul 2010 02:32:06 -0500 Message-ID: <4C4FDCF6.2070105@cs.wisc.edu> References: <20100720221904.17116.78553.stgit@localhost.localdomain> <20100720221920.17116.59505.stgit@localhost.localdomain> <4C466986.5040801@cs.wisc.edu> <1280266320.31061.2590.camel@vi2.jf.intel.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------030405050601080004020707" Return-path: Received: from sabe.cs.wisc.edu ([128.105.6.20]:34086 "EHLO sabe.cs.wisc.edu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752099Ab0G1Hcp (ORCPT ); Wed, 28 Jul 2010 03:32:45 -0400 In-Reply-To: <1280266320.31061.2590.camel@vi2.jf.intel.com> Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: Vasu Dev Cc: Robert Love , James.Bottomley@suse.de, linux-scsi@vger.kernel.org, devel@open-fcoe.org This is a multi-part message in MIME format. --------------030405050601080004020707 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 07/27/2010 04:32 PM, Vasu Dev wrote: >> This isn't one of those races where you are blocking the rport, but the >> IO keeps coming around so the retries are used really quickly right (the >> driver looks like it has the fc class and internal state checks to >> prevent this but I wanted to make sure). > I think I am hitting the above problem. > > This tiny time windows can be further reduced by use of wmb() on rport > state change to block and lport getting out of ready since their states > are checked w/o lock in queuecommand path. I think you need something like this for the lport queue ready check. It looks like the initial failure from __fc_linkdown->fc_fcp_cleanup burned a timeout. Then because it missed the lport ready check, it will use an extra retry just sitting there timing out. For the rport state check, I think you are supposed to call the fc chkready function under the host lock. The port state and flags are set under it. I did the attached patch for that. --------------030405050601080004020707 Content-Type: text/x-patch; name="libfc-hold-host-lock-when-checking-rport.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="libfc-hold-host-lock-when-checking-rport.patch" libfc: call fc_remote_port_chkready under the host lock. The rport port state and flags are set under the host lock, so this patch calls fc_remote_port_chkready with the host lock held like is also done in the other fc drivers. Signed-off-by: Mike Christie diff --git a/drivers/scsi/libfc/fc_fcp.c b/drivers/scsi/libfc/fc_fcp.c index eac4d09..a116846 100644 --- a/drivers/scsi/libfc/fc_fcp.c +++ b/drivers/scsi/libfc/fc_fcp.c @@ -1765,14 +1768,14 @@ int fc_queuecommand(struct scsi_cmnd *sc_cmd, void (*done)(struct scsi_cmnd *)) struct fcoe_dev_stats *stats; lport = shost_priv(sc_cmd->device->host); - spin_unlock_irq(lport->host->host_lock); rval = fc_remote_port_chkready(rport); if (rval) { sc_cmd->result = rval; done(sc_cmd); - goto out; + return 0; } + spin_unlock_irq(lport->host->host_lock); if (!*(struct fc_remote_port **)rport->dd_data) { /* --------------030405050601080004020707--