linux-scsi.vger.kernel.org archive mirror
* deadlock while cleaning up after transport timeout for target
@ 2005-09-21 20:47 goggin, edward
  2005-09-21 21:21 ` Mike Anderson
  0 siblings, 1 reply; 2+ messages in thread
From: goggin, edward @ 2005-09-21 20:47 UTC (permalink / raw)
  To: linux-scsi

I am running 2.6.13 with lpfc against an EMC Symmetrix target and am
getting into a deadlock when the target port times out while
fc_scsi_scan_rport is starting to scan the port.

The lpfc_worker thread is stuck in scsi_flush_work, waiting for the scsi
host's work_q to empty.  But this work queue won't empty, since the
inquiry to LUN 0 initiated by the fc_scsi_scan_rport port scan keeps
getting retried onto the scsi mid-level request queue, seemingly ad
infinitum, because lpfc_queuecommand returns SCSI_MLQUEUE_HOST_BUSY
whenever a target port's infrastructure is being dismantled.  This
prevents the fc_scsi_scan_rport call from finishing, which in turn
prevents the target's NODEV timeout from being completely serviced.
My boot up hangs as a result.
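
The circular wait described above can be modeled in user space.  What
follows is only a sketch under stated assumptions, not kernel or lpfc
code: every model_* name and the MODEL_* return values are hypothetical
stand-ins, and the retry count is bounded so that the model terminates,
whereas the real retry reportedly never does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model values; the real constants live in the kernel's SCSI headers. */
enum { MODEL_QUEUED = 0, MODEL_HOST_BUSY = 1 };

struct model_host {
    bool teardown_in_progress; /* target port infrastructure is being dismantled */
    int  pending_scan_work;    /* scan work items still on the host work queue */
};

/* Stand-in for the driver's queuecommand: refuses commands during teardown. */
static int model_queuecommand(const struct model_host *h)
{
    return h->teardown_in_progress ? MODEL_HOST_BUSY : MODEL_QUEUED;
}

/*
 * Stand-in for the scan's INQUIRY to LUN 0. A busy return puts the command
 * back on the request queue, so it is simply tried again.
 * Returns the number of attempts used, or -1 if max_attempts ran out.
 */
static int model_scan_inquiry(struct model_host *h, int max_attempts)
{
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (model_queuecommand(h) == MODEL_QUEUED) {
            h->pending_scan_work = 0; /* the scan work item completes */
            return attempt;
        }
    }
    return -1;
}

/*
 * Stand-in for the worker thread: it must flush the work queue before it
 * can finish the teardown. Returns true if the teardown completed.
 */
static bool model_worker_teardown(struct model_host *h, int max_attempts)
{
    h->teardown_in_progress = true;
    if (h->pending_scan_work > 0 && model_scan_inquiry(h, max_attempts) < 0)
        return false; /* flush cannot finish while teardown is pending */
    h->teardown_in_progress = false;
    return true;
}
```

With a scan work item pending, model_worker_teardown() fails no matter
how large max_attempts is, because the only thing that would clear
teardown_in_progress is the teardown that is itself waiting on the scan.
With no scan pending, the teardown completes at once.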

Not clear yet why the timeout is happening in the first place.  This
problem is occurring for about 70% of my reboots :((


* Re: deadlock while cleaning up after transport timeout for target
  2005-09-21 20:47 deadlock while cleaning up after transport timeout for target goggin, edward
@ 2005-09-21 21:21 ` Mike Anderson
  0 siblings, 0 replies; 2+ messages in thread
From: Mike Anderson @ 2005-09-21 21:21 UTC (permalink / raw)
  To: goggin, edward, James.Smart; +Cc: linux-scsi

goggin, edward <egoggin@emc.com> wrote:
> Running with 2.6.13, an EMC Symmetrix target with lpfc and getting
> into a deadlock when the target port times out WHILE starting to scan
> the port via fc_scsi_scan_rport.
> 
> The lpfc_worker thread is stuck waiting for the scsi host's work_q to
> empty in scsi_flush_work.  But this work queue won't empty since the
> inquiry to LUN 0 initiated by the fc_scsi_scan_rport port scan is
> getting retried onto the scsi mid-level request queue, seemingly ad
> infinitum, since lpfc_queuecommand is returning
> SCSI_MLQUEUE_HOST_BUSY whenever a target port infrastructure
> is being dismantled.  This prevents the fc_scsi_scan_rport call from
> finishing which prevents the target's NODEV timeout from being
> completely serviced.  My boot up hangs as a result.
> 

I believe the lpfc driver in 2.6.13 is missing the latest update from
Emulex which is in 2.6.14-rc* (i.e., 8.0.29 vs. 8.0.30). Have you tried a
test run on 2.6.14-rc?

James S can better comment on whether this fix alone will solve your
issue, but it fairly closely matches the signature I was previously
seeing with the older version of the lpfc driver.

In my testing of port bounce runs, I could not complete a run until I
started using the updated version of the driver.

-andmike
--
Michael Anderson
andmike@us.ibm.com
