From: "Bhanu Prakash Gollapudi" <bprakash@broadcom.com>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"devel@open-fcoe.org" <devel@open-fcoe.org>
Subject: Re: deadlock during fc_remove_host
Date: Wed, 20 Apr 2011 22:32:17 -0700
Message-ID: <4DAFC161.6040708@broadcom.com>
In-Reply-To: <4DAF9C15.1010808@cs.wisc.edu>

On 4/20/2011 7:53 PM, Mike Christie wrote:
> On 04/20/2011 07:24 PM, Bhanu Prakash Gollapudi wrote:
>> Hi,
>>
>> We are seeing a similar issue to what Joe has observed a while back -
>> http://www.mail-archive.com/devel@open-fcoe.org/msg02993.html.
>>
>> This happens in a corner-case scenario: creating and destroying an
>> fcoe interface in a tight loop (fcoeadm -c followed by fcoeadm -d). The
>> system had a simple configuration with a single local port and 2 remote ports.
>>
>> Reason for the deadlock:
>>
>> 1. destroy (fcoeadm -d) thread hangs in fc_remove_host().
>> 2. fc_remove_host() is trying to flush the shost->work_q, via
>> scsi_flush_work(), but the operation never completes.
>> 3. There are two works scheduled to be run in this work_q, one belonging
>> to rport A, and other rport B.
>> 4. The thread is currently executing rport_delete_work
>> (fc_rport_final_delete) for rport A. It calls fc_terminate_rport_io() that unblocks the
>> sdev->request_queue, so that __blk_run_queue() can be called. So, IO for
>> rport A is ready to run, but stuck at the async layer.
>> 5. Meanwhile, async layer is serializing all the IOs belonging to both
>> rport A and rport B. At this point, it is waiting for IO belonging to
>> rport B to complete.
>> 6. However, the request_queue for rport B is stopped and
>> fc_terminate_rport_io on rport B is not called yet to unblock the
>> device, which will only be called after rport A completes. rport A does
>
> Is the reason that rport b's terminate_rport_io has not been called,
> because that workqueue is queued behind rport a's workqueue and rport
> b's workqueue function is not called? If so, have you tested this with
> the current upstream kernel?
>
Yes, this has been tested with the upstream kernel.
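
The wait-for chain in the quoted steps can be sketched as a small graph
model. This is only an illustration in Python, not kernel code: the node
names (work_A, io_A, async_layer, and so on) are hypothetical labels for
the entities in steps 1-6, and the edge from work_A to io_A is an
assumption, since the quoted step 6 is cut off before it says why
rport A does not complete.

```python
# Wait-for graph: each key is blocked until every entry in its list finishes.
# Node names are illustrative labels, not kernel identifiers.
WAITS_FOR = {
    "destroy_thread": ["flush_work_q"],   # steps 1-2: fc_remove_host flushes work_q
    "flush_work_q": ["work_A", "work_B"], # step 3: two works queued on work_q
    "work_A": ["io_A"],                   # ASSUMPTION: the quoted step 6 is truncated here
    "io_A": ["async_layer"],              # step 4: rport A IO is stuck at the async layer
    "async_layer": ["io_B"],              # step 5: async layer waits for rport B IO
    "io_B": ["work_B"],                   # step 6: rport B queue unblocks only in work_B
    "work_B": ["work_A"],                 # step 6: work_B runs only after work_A completes
}


def find_cycle(graph, start):
    """Return the first wait-for cycle reachable from start, or None."""
    path = []        # current depth-first search path
    on_path = set()  # nodes on the current path
    done = set()     # nodes fully explored without finding a cycle

    def visit(node):
        if node in on_path:
            # Back edge: slice out the cycle and close it on the repeated node.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        done.add(node)
        return None

    return visit(start)


cycle = find_cycle(WAITS_FOR, "destroy_thread")
print(" -> ".join(cycle) if cycle else "no cycle")
```

Under these assumptions the model reports a cycle through work_A, io_A,
async_layer, io_B and work_B, which is why the flush never returns. If
the assumed work_A -> io_A edge is dropped, the model finds no cycle, so
that edge is the part to confirm against the actual stack traces.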

