* [Bug 42918] New: fcoe: Enabling VN2VN mode triggers a circular locking complaint
@ 2012-03-13 12:16 bugzilla-daemon
2012-03-14 1:22 ` [PATCH] fcoe: Drop the rtnl_mutex before calling fcoe_ctlr_link_up Robert Love
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: bugzilla-daemon @ 2012-03-13 12:16 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=42918
URL: http://comments.gmane.org/gmane.linux.scsi.open-fcoe.devel/11451?set_lines=100000
Summary: fcoe: Enabling VN2VN mode triggers a circular locking complaint
Product: IO/Storage
Version: 2.5
Kernel Version: 3.3.0-rc7
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: SCSI
AssignedTo: linux-scsi@vger.kernel.org
ReportedBy: bvanassche@acm.org
Regression: No
Kernel version: 3.3.0-rc7
How to reproduce:
# modprobe fcoe
# echo eth0 >/sys/module/libfcoe/parameters/create_vn2vn
Result:
# dmesg
device eth0 entered promiscuous mode
scsi3 : FCoE Driver
host3: libfc: Link up on port (000000)
======================================================
[ INFO: possible circular locking dependency detected ]
3.3.0-rc7-scst-debug+ #1 Not tainted
-------------------------------------------------------
kworker/2:0/14 is trying to acquire lock:
(rtnl_mutex){+.+.+.}, at: [<c13a10c4>] rtnl_lock+0x14/0x20
but task is already holding lock:
(&fip->ctlr_mutex){+.+...}, at: [<f89713e7>] fcoe_ctlr_timer_work+0x3e7/0xb60 [libfcoe]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&fip->ctlr_mutex){+.+...}:
[<c1091f70>] lock_acquire+0x80/0x1b0
[<c147655d>] mutex_lock_nested+0x6d/0x340
[<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe]
[<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe]
[<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe]
[<c10527e0>] param_attr_store+0x30/0x60
[<c1052696>] module_attr_store+0x26/0x40
[<c11a201e>] sysfs_write_file+0xae/0x100
[<c11449df>] vfs_write+0x8f/0x160
[<c1144cbd>] sys_write+0x3d/0x70
[<c147a0c4>] syscall_call+0x7/0xb
-> #0 (rtnl_mutex){+.+.+.}:
[<c109164b>] __lock_acquire+0x140b/0x1720
[<c1091f70>] lock_acquire+0x80/0x1b0
[<c147655d>] mutex_lock_nested+0x6d/0x340
[<c13a10c4>] rtnl_lock+0x14/0x20
[<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
[<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
[<c104fb69>] process_one_work+0x179/0x5d0
[<c10502f1>] worker_thread+0x121/0x2d0
[<c10550ed>] kthread+0x7d/0x90
[<c1481a82>] kernel_thread_helper+0x6/0x10
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&fip->ctlr_mutex);
                               lock(rtnl_mutex);
                               lock(&fip->ctlr_mutex);
  lock(rtnl_mutex);
*** DEADLOCK ***
3 locks held by kworker/2:0/14:
#0: (events){.+.+.+}, at: [<c104faf5>] process_one_work+0x105/0x5d0
#1: ((&fip->timer_work)){+.+...}, at: [<c104faf5>] process_one_work+0x105/0x5d0
#2: (&fip->ctlr_mutex){+.+...}, at: [<f89713e7>] fcoe_ctlr_timer_work+0x3e7/0xb60 [libfcoe]
stack backtrace:
Pid: 14, comm: kworker/2:0 Not tainted 3.3.0-rc7 #1
Call Trace:
[<c14714a6>] ? printk+0x1d/0x1f
[<c1471d4f>] print_circular_bug+0x1b4/0x1be
[<c109164b>] __lock_acquire+0x140b/0x1720
[<c1091f70>] lock_acquire+0x80/0x1b0
[<c13a10c4>] ? rtnl_lock+0x14/0x20
[<c147655d>] mutex_lock_nested+0x6d/0x340
[<c13a10c4>] ? rtnl_lock+0x14/0x20
[<c13a10c4>] ? rtnl_lock+0x14/0x20
[<f89713e7>] ? fcoe_ctlr_timer_work+0x3e7/0xb60 [libfcoe]
[<c13a10c4>] rtnl_lock+0x14/0x20
[<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
[<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
[<c106a2c5>] ? local_clock+0x65/0x70
[<c104faf5>] ? process_one_work+0x105/0x5d0
[<c10928e4>] ? trace_hardirqs_on_caller+0xf4/0x180
[<c104fb69>] process_one_work+0x179/0x5d0
[<c104faf5>] ? process_one_work+0x105/0x5d0
[<f8971000>] ? fcoe_ctlr_vn_send_claim+0x40/0x40 [libfcoe]
[<c10502f1>] worker_thread+0x121/0x2d0
[<c10501d0>] ? rescuer_thread+0x1d0/0x1d0
[<c10550ed>] kthread+0x7d/0x90
[<c1055070>] ? __init_kthread_worker+0x60/0x60
[<c1481a82>] kernel_thread_helper+0x6/0x10
host3: Assigned Port ID 0092b5
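The report above comes down to two code paths taking the same pair of locks in opposite orders. As a rough illustration only (this is not how lockdep is implemented, and every name below is hypothetical), a tiny order tracker can flag that kind of inversion. It only catches direct two-lock inversions and assumes locks are released in LIFO order.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { RTNL_MUTEX, CTLR_MUTEX, NUM_CLASSES };

struct order_tracker {
    /* after[a][b] is true once b has been acquired while a was held. */
    bool after[NUM_CLASSES][NUM_CLASSES];
};

struct task {
    enum lock_class held[NUM_CLASSES]; /* stack of currently held locks */
    int depth;
};

void tracker_init(struct order_tracker *t) { memset(t, 0, sizeof(*t)); }
void task_init(struct task *k) { k->depth = 0; }

/* Records the acquisition; returns false if it inverts a recorded order. */
bool track_acquire(struct order_tracker *t, struct task *k,
                   enum lock_class next)
{
    bool ok = true;

    assert(k->depth < NUM_CLASSES);
    for (int i = 0; i < k->depth; i++) {
        enum lock_class h = k->held[i];

        if (t->after[next][h])
            ok = false; /* another path took h while holding next */
        t->after[h][next] = true;
    }
    k->held[k->depth++] = next;
    return ok;
}

/* Releases the most recently acquired lock (LIFO order assumed). */
void track_release(struct task *k, enum lock_class c)
{
    assert(k->depth > 0 && k->held[k->depth - 1] == c);
    k->depth--;
}

/* Replays the two paths from the report; true if an inversion is flagged. */
bool replay_report_scenario(void)
{
    struct order_tracker t;
    struct task create, timer;
    bool inversion;

    tracker_init(&t);
    task_init(&create);
    task_init(&timer);

    /* sysfs write path: rtnl_mutex, then ctlr_mutex via link-up */
    track_acquire(&t, &create, RTNL_MUTEX);
    track_acquire(&t, &create, CTLR_MUTEX);
    track_release(&create, CTLR_MUTEX);
    track_release(&create, RTNL_MUTEX);

    /* timer work: ctlr_mutex, then rtnl_mutex for the MAC update */
    track_acquire(&t, &timer, CTLR_MUTEX);
    inversion = !track_acquire(&t, &timer, RTNL_MUTEX);
    return inversion;
}
```

Calling replay_report_scenario() from a test harness returns true: the second path asks for rtnl_mutex while holding ctlr_mutex, after the first path recorded the opposite order.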
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
* [PATCH] fcoe: Drop the rtnl_mutex before calling fcoe_ctlr_link_up
From: Robert Love @ 2012-03-14 1:22 UTC
To: bvanassche, linux-scsi; +Cc: devel, yi.zou
The rtnl_lock is primarily used to serialize networking
driver changes as well as to ensure that a networking driver
is not removed when making changes to it. fcoe also uses
the rtnl_lock to protect the fcoe hostlist.
fcoe_create holds the rtnl_lock over the entirety of the
routine, including the call to fcoe_ctlr_link_up.
This causes the below deadlock because fcoe_ctlr_link_up
acquires the fcoe_ctlr ctlr_mutex and this deadlocks with
a libfcoe thread that acquires the fcoe_ctlr ctlr_mutex and
then the rtnl_lock (to update a MAC address).
This patch drops the rtnl_lock before calling
fcoe_ctlr_link_up and therefore the deadlock is prevented.
https://bugzilla.kernel.org/show_bug.cgi?id=42918
the existing dependency chain (in reverse order) is:
-> #1 (&fip->ctlr_mutex){+.+...}:
[<c1091f70>] lock_acquire+0x80/0x1b0
[<c147655d>] mutex_lock_nested+0x6d/0x340
[<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe]
[<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe]
[<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe]
[<c10527e0>] param_attr_store+0x30/0x60
[<c1052696>] module_attr_store+0x26/0x40
[<c11a201e>] sysfs_write_file+0xae/0x100
[<c11449df>] vfs_write+0x8f/0x160
[<c1144cbd>] sys_write+0x3d/0x70
[<c147a0c4>] syscall_call+0x7/0xb
-> #0 (rtnl_mutex){+.+.+.}:
[<c109164b>] __lock_acquire+0x140b/0x1720
[<c1091f70>] lock_acquire+0x80/0x1b0
[<c147655d>] mutex_lock_nested+0x6d/0x340
[<c13a10c4>] rtnl_lock+0x14/0x20
[<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
[<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
[<c104fb69>] process_one_work+0x179/0x5d0
[<c10502f1>] worker_thread+0x121/0x2d0
[<c10550ed>] kthread+0x7d/0x90
[<c1481a82>] kernel_thread_helper+0x6/0x10
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&fip->ctlr_mutex);
                               lock(rtnl_mutex);
                               lock(&fip->ctlr_mutex);
  lock(rtnl_mutex);
*** DEADLOCK ***
Signed-off-by: Robert Love <robert.w.love@intel.com>
---
drivers/scsi/fcoe/fcoe.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index e959960..408ca0e 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -2133,8 +2133,12 @@ static int fcoe_create(struct net_device *netdev, enum fip_state fip_mode)
 	/* start FIP Discovery and FLOGI */
 	lport->boot_time = jiffies;
 	fc_fabric_login(lport);
-	if (!fcoe_link_ok(lport))
+	if (!fcoe_link_ok(lport)) {
+		rtnl_unlock();
 		fcoe_ctlr_link_up(&fcoe->ctlr);
+		mutex_unlock(&fcoe_config_mutex);
+		return rc;
+	}
 
 out_nodev:
 	rtnl_unlock();
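
For readers who want to see the shape of the patched exit paths in isolation, here is a small userspace sketch. It is a model under stated assumptions, not the driver code: the locks are plain flags, all names are hypothetical, and the lines after out_nodev that the diff does not show are assumed to release the config mutex as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins: plain flags instead of real kernel mutexes. */
struct model_locks {
    bool config_mutex; /* models fcoe_config_mutex */
    bool rtnl;         /* models rtnl_mutex */
    bool ctlr_mutex;   /* models fip->ctlr_mutex */
};

static void model_lock(bool *m)
{
    assert(!*m); /* not already held */
    *m = true;
}

static void model_unlock(bool *m)
{
    assert(*m); /* must be held */
    *m = false;
}

/* Models fcoe_ctlr_link_up(): it takes ctlr_mutex internally. */
static void model_link_up(struct model_locks *l)
{
    assert(!l->rtnl); /* ordering rule set by the patch: no rtnl here */
    model_lock(&l->ctlr_mutex);
    model_unlock(&l->ctlr_mutex);
}

/*
 * Models the tail of the patched fcoe_create(). `link_ready` stands in
 * for the `!fcoe_link_ok(lport)` condition in the diff being true.
 */
int model_create_tail(struct model_locks *l, bool link_ready)
{
    int rc = 0;

    model_lock(&l->config_mutex);
    model_lock(&l->rtnl);

    if (link_ready) {
        model_unlock(&l->rtnl); /* drop rtnl before link-up */
        model_link_up(l);
        model_unlock(&l->config_mutex);
        return rc;
    }

    /* out_nodev path; the config mutex release is an assumption */
    model_unlock(&l->rtnl);
    model_unlock(&l->config_mutex);
    return rc;
}
```

Both branches leave every flag cleared, and the assertion in model_link_up() would fire if the rtnl flag were still set at the call, which is the situation the pre-patch code was in.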
* [Bug 42918] fcoe: Enabling VN2VN mode triggers a circular locking complaint
From: bugzilla-daemon @ 2012-03-15 1:21 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=42918
--- Comment #1 from Robert Love <robert.w.love@intel.com> 2012-03-15 01:21:48 ---
Fix was posted to linux-scsi here:
http://www.spinics.net/lists/linux-scsi/msg58027.html
Note that I Nacked my own patch but that others commented that the patch is
correct. Please ignore the Nack; the patch is good.
Can the submitter please verify that this fixes their issue?
* Re: [Bug 42918] fcoe: Enabling VN2VN mode triggers a circular locking complaint
From: Love, Robert W @ 2012-03-15 1:28 UTC
To: bugzilla-daemon@bugzilla.kernel.org, bvanassche@acm.org
Cc: linux-scsi@vger.kernel.org
On 03/14/2012 06:21 PM, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=42918
>
> --- Comment #1 from Robert Love <robert.w.love@intel.com> 2012-03-15 01:21:48 ---
> Fix was posted to linux-scsi here:
> http://www.spinics.net/lists/linux-scsi/msg58027.html
>
> Note that I Nacked my own patch but that others commented that the patch is
> correct. Please ignore the Nack; the patch is good.
>
> Can the submitter please verify that this fixes their issue?
>
I was unable to change the owner in the BZ. Bart, does this fix your
problem?
I've never worked with the kernel.org BZ, so I'm not sure if I need to
follow up on this anymore...
I'm not convinced that this needs to be rushed in for 3.3 since we're at
rc7. It's a warning in fcoe in point-to-multipoint mode and as much as
I'd like everything in fcoe working in every release I don't think this
will bother too many users.
Thanks, //Rob
* [Bug 42918] fcoe: Enabling VN2VN mode triggers a circular locking complaint
From: bugzilla-daemon @ 2012-04-04 14:56 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=42918
Florian Mickler <florian@mickler.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |florian@mickler.org
--- Comment #2 from Florian Mickler <florian@mickler.org> 2012-04-04 14:56:51 ---
A patch referencing this bug report has been merged in Linux v3.4-rc1:
commit 2280512342ead9a2858b1490b21e5bcaf4f4cfc7
Author: Robert Love <robert.w.love@intel.com>
Date: Tue Mar 13 18:22:12 2012 -0700
[SCSI] fcoe: Drop the rtnl_mutex before calling fcoe_ctlr_link_up
* [Bug 42918] fcoe: Enabling VN2VN mode triggers a circular locking complaint
From: bugzilla-daemon @ 2012-04-04 15:21 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=42918
--- Comment #3 from Bart Van Assche <bvanassche@acm.org> 2012-04-04 15:21:49 ---
Please leave this bug report open since the patch referenced in comment #1
hasn't been merged yet.
* [Bug 42918] fcoe: Enabling VN2VN mode triggers a circular locking complaint
From: bugzilla-daemon @ 2012-06-13 15:17 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=42918
Alan <alan@lxorguk.ukuu.org.uk> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
CC| |alan@lxorguk.ukuu.org.uk
Resolution| |CODE_FIX
* [Bug 42918] fcoe: Enabling VN2VN mode triggers a circular locking complaint
From: bugzilla-daemon @ 2012-06-13 15:17 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=42918
Alan <alan@lxorguk.ukuu.org.uk> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |CLOSED
* [Bug 42918] fcoe: Enabling VN2VN mode triggers a circular locking complaint
From: bugzilla-daemon @ 2012-07-01 9:48 UTC
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=42918
--- Comment #4 from Florian Mickler <florian@mickler.org> 2012-07-01 09:48:29 ---
A patch referencing this bug report has been merged in Linux v3.5-rc1:
commit 949e71f17d9a5c59fa7b02cce3b548384bff1c92
Author: Robert Love <robert.w.love@intel.com>
Date: Fri Apr 20 12:16:43 2012 -0700
[SCSI] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 9+ messages in thread