* [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac
[not found] <bug-42918-11613@https.bugzilla.kernel.org>
@ 2012-03-13 22:52 ` Robert Love
2012-03-14 0:42 ` Love, Robert W
2012-03-15 8:40 ` Bart Van Assche
0 siblings, 2 replies; 5+ messages in thread
From: Robert Love @ 2012-03-13 22:52 UTC (permalink / raw)
To: bvanassche, linux-scsi; +Cc: devel, yi.zou
The rtnl_mutex was held to protect calls to dev_uc_add
and dev_uc_del. Holding rtnl is not required as those
functions make use of the netif_addr_lock* API to
protect the MAC changing.
This change fixes the following regression by removing
the rtnl usage when fcoe_update_src_mac is called.
https://bugzilla.kernel.org/show_bug.cgi?id=42918
the existing dependency chain (in reverse order) is:
-> #1 (&fip->ctlr_mutex){+.+...}:
[<c1091f70>] lock_acquire+0x80/0x1b0
[<c147655d>] mutex_lock_nested+0x6d/0x340
[<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe]
[<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe]
[<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe]
[<c10527e0>] param_attr_store+0x30/0x60
[<c1052696>] module_attr_store+0x26/0x40
[<c11a201e>] sysfs_write_file+0xae/0x100
[<c11449df>] vfs_write+0x8f/0x160
[<c1144cbd>] sys_write+0x3d/0x70
[<c147a0c4>] syscall_call+0x7/0xb
-> #0 (rtnl_mutex){+.+.+.}:
[<c109164b>] __lock_acquire+0x140b/0x1720
[<c1091f70>] lock_acquire+0x80/0x1b0
[<c147655d>] mutex_lock_nested+0x6d/0x340
[<c13a10c4>] rtnl_lock+0x14/0x20
[<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
[<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
[<c104fb69>] process_one_work+0x179/0x5d0
[<c10502f1>] worker_thread+0x121/0x2d0
[<c10550ed>] kthread+0x7d/0x90
[<c1481a82>] kernel_thread_helper+0x6/0x10
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&fip->ctlr_mutex);
lock(rtnl_mutex);
lock(&fip->ctlr_mutex);
lock(rtnl_mutex);
*** DEADLOCK ***
Signed-off-by: Robert Love <robert.w.love@intel.com>
---
drivers/scsi/fcoe/fcoe.c | 2 --
1 files changed, 0 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index e959960..85b8203 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport *lport, u8 *addr)
struct fcoe_port *port = lport_priv(lport);
struct fcoe_interface *fcoe = port->priv;
- rtnl_lock();
if (!is_zero_ether_addr(port->data_src_addr))
dev_uc_del(fcoe->netdev, port->data_src_addr);
if (!is_zero_ether_addr(addr))
dev_uc_add(fcoe->netdev, addr);
memcpy(port->data_src_addr, addr, ETH_ALEN);
- rtnl_unlock();
}
/**
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac
2012-03-13 22:52 ` [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac Robert Love
@ 2012-03-14 0:42 ` Love, Robert W
2012-03-14 2:14 ` [Open-FCoE] " John Fastabend
2012-03-15 8:40 ` Bart Van Assche
1 sibling, 1 reply; 5+ messages in thread
From: Love, Robert W @ 2012-03-14 0:42 UTC (permalink / raw)
To: bvanassche@acm.org, linux-scsi@vger.kernel.org
Cc: devel@open-fcoe.org, Zou, Yi
On 03/13/2012 03:52 PM, Robert Love wrote:
> The rtnl_mutex was held to protect calls to dev_uc_add
> and dev_uc_del. Holding rtnl is not required as those
> functions make use of the netif_addr_lock* API to
> protect the MAC changing.
>
> This change fixes the following regression by removing
> the rtnl usage when fcoe_update_src_mac is called.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=42918
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&fip->ctlr_mutex){+.+...}:
> [<c1091f70>] lock_acquire+0x80/0x1b0
> [<c147655d>] mutex_lock_nested+0x6d/0x340
> [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe]
> [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe]
> [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe]
> [<c10527e0>] param_attr_store+0x30/0x60
> [<c1052696>] module_attr_store+0x26/0x40
> [<c11a201e>] sysfs_write_file+0xae/0x100
> [<c11449df>] vfs_write+0x8f/0x160
> [<c1144cbd>] sys_write+0x3d/0x70
> [<c147a0c4>] syscall_call+0x7/0xb
>
> -> #0 (rtnl_mutex){+.+.+.}:
> [<c109164b>] __lock_acquire+0x140b/0x1720
> [<c1091f70>] lock_acquire+0x80/0x1b0
> [<c147655d>] mutex_lock_nested+0x6d/0x340
> [<c13a10c4>] rtnl_lock+0x14/0x20
> [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
> [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
> [<c104fb69>] process_one_work+0x179/0x5d0
> [<c10502f1>] worker_thread+0x121/0x2d0
> [<c10550ed>] kthread+0x7d/0x90
> [<c1481a82>] kernel_thread_helper+0x6/0x10
>
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&fip->ctlr_mutex);
> lock(rtnl_mutex);
> lock(&fip->ctlr_mutex);
> lock(rtnl_mutex);
>
> *** DEADLOCK ***
>
> Signed-off-by: Robert Love<robert.w.love@intel.com>
> ---
> drivers/scsi/fcoe/fcoe.c | 2 --
> 1 files changed, 0 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
> index e959960..85b8203 100644
> --- a/drivers/scsi/fcoe/fcoe.c
> +++ b/drivers/scsi/fcoe/fcoe.c
> @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport *lport, u8 *addr)
> struct fcoe_port *port = lport_priv(lport);
> struct fcoe_interface *fcoe = port->priv;
>
> - rtnl_lock();
> if (!is_zero_ether_addr(port->data_src_addr))
> dev_uc_del(fcoe->netdev, port->data_src_addr);
> if (!is_zero_ether_addr(addr))
> dev_uc_add(fcoe->netdev, addr);
> memcpy(port->data_src_addr, addr, ETH_ALEN);
> - rtnl_unlock();
> }
>
> /**
>
This isn't going to work. We do need rtnl_lock when calling
dev_uc_add/del to ensure the driver isn't removed while making the
change. I have an alternative patch that I'll post as soon as I clean it
up a bit.
Nacked-by: Robert Love <robert.w.love@intel.com>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Open-FCoE] [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac
2012-03-14 0:42 ` Love, Robert W
@ 2012-03-14 2:14 ` John Fastabend
[not found] ` <4F5FFF19.1060601-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: John Fastabend @ 2012-03-14 2:14 UTC (permalink / raw)
To: Love, Robert W
Cc: bvanassche@acm.org, linux-scsi@vger.kernel.org,
devel@open-fcoe.org
On 3/13/2012 5:42 PM, Love, Robert W wrote:
> On 03/13/2012 03:52 PM, Robert Love wrote:
>> The rtnl_mutex was held to protect calls to dev_uc_add
>> and dev_uc_del. Holding rtnl is not required as those
>> functions make use of the netif_addr_lock* API to
>> protect the MAC changing.
>>
>> This change fixes the following regression by removing
>> the rtnl usage when fcoe_update_src_mac is called.
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=42918
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (&fip->ctlr_mutex){+.+...}:
>> [<c1091f70>] lock_acquire+0x80/0x1b0
>> [<c147655d>] mutex_lock_nested+0x6d/0x340
>> [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe]
>> [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe]
>> [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe]
>> [<c10527e0>] param_attr_store+0x30/0x60
>> [<c1052696>] module_attr_store+0x26/0x40
>> [<c11a201e>] sysfs_write_file+0xae/0x100
>> [<c11449df>] vfs_write+0x8f/0x160
>> [<c1144cbd>] sys_write+0x3d/0x70
>> [<c147a0c4>] syscall_call+0x7/0xb
>>
>> -> #0 (rtnl_mutex){+.+.+.}:
>> [<c109164b>] __lock_acquire+0x140b/0x1720
>> [<c1091f70>] lock_acquire+0x80/0x1b0
>> [<c147655d>] mutex_lock_nested+0x6d/0x340
>> [<c13a10c4>] rtnl_lock+0x14/0x20
>> [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
>> [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
>> [<c104fb69>] process_one_work+0x179/0x5d0
>> [<c10502f1>] worker_thread+0x121/0x2d0
>> [<c10550ed>] kthread+0x7d/0x90
>> [<c1481a82>] kernel_thread_helper+0x6/0x10
>>
>> other info that might help us debug this:
>>
>> Possible unsafe locking scenario:
>>
>> CPU0 CPU1
>> ---- ----
>> lock(&fip->ctlr_mutex);
>> lock(rtnl_mutex);
>> lock(&fip->ctlr_mutex);
>> lock(rtnl_mutex);
>>
>> *** DEADLOCK ***
>>
>> Signed-off-by: Robert Love<robert.w.love@intel.com>
>> ---
>> drivers/scsi/fcoe/fcoe.c | 2 --
>> 1 files changed, 0 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
>> index e959960..85b8203 100644
>> --- a/drivers/scsi/fcoe/fcoe.c
>> +++ b/drivers/scsi/fcoe/fcoe.c
>> @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport *lport, u8 *addr)
>> struct fcoe_port *port = lport_priv(lport);
>> struct fcoe_interface *fcoe = port->priv;
>>
>> - rtnl_lock();
>> if (!is_zero_ether_addr(port->data_src_addr))
>> dev_uc_del(fcoe->netdev, port->data_src_addr);
>> if (!is_zero_ether_addr(addr))
>> dev_uc_add(fcoe->netdev, addr);
>> memcpy(port->data_src_addr, addr, ETH_ALEN);
>> - rtnl_unlock();
>> }
>>
>> /**
>>
> This isn't going to work. We do need rtnl_lock when calling
> dev_uc_add/del to ensure the driver isn't removed while making the
> change. I have an alternative patch that I'll post as soon as I clean it
> up a bit.
>
> Nacked-by: Robert Love <robert.w.love@intel.com>
So there is a case you don't have a ref cnt on the netdev here?
I guess my point is if your carrying around a ptr to the struct why
haven't you incremented the refcnt. I think the dev_hold() in the
create path would be enough to stop the above concern.
Thanks,
John
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac
[not found] ` <4F5FFF19.1060601-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2012-03-14 16:18 ` Zou, Yi
0 siblings, 0 replies; 5+ messages in thread
From: Zou, Yi @ 2012-03-14 16:18 UTC (permalink / raw)
To: Fastabend, John R, Love, Robert W
Cc: bvanassche-HInyCGIudOg@public.gmane.org,
linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
devel-s9riP+hp16TNLxjTenLetw@public.gmane.org
>
> On 3/13/2012 5:42 PM, Love, Robert W wrote:
> > On 03/13/2012 03:52 PM, Robert Love wrote:
> >> The rtnl_mutex was held to protect calls to dev_uc_add
> >> and dev_uc_del. Holding rtnl is not required as those
> >> functions make use of the netif_addr_lock* API to
> >> protect the MAC changing.
> >>
> >> This change fixes the following regression by removing
> >> the rtnl usage when fcoe_update_src_mac is called.
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=42918
> >>
> >> the existing dependency chain (in reverse order) is:
> >>
> >> -> #1 (&fip->ctlr_mutex){+.+...}:
> >> [<c1091f70>] lock_acquire+0x80/0x1b0
> >> [<c147655d>] mutex_lock_nested+0x6d/0x340
> >> [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe]
> >> [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe]
> >> [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe]
> >> [<c10527e0>] param_attr_store+0x30/0x60
> >> [<c1052696>] module_attr_store+0x26/0x40
> >> [<c11a201e>] sysfs_write_file+0xae/0x100
> >> [<c11449df>] vfs_write+0x8f/0x160
> >> [<c1144cbd>] sys_write+0x3d/0x70
> >> [<c147a0c4>] syscall_call+0x7/0xb
> >>
> >> -> #0 (rtnl_mutex){+.+.+.}:
> >> [<c109164b>] __lock_acquire+0x140b/0x1720
> >> [<c1091f70>] lock_acquire+0x80/0x1b0
> >> [<c147655d>] mutex_lock_nested+0x6d/0x340
> >> [<c13a10c4>] rtnl_lock+0x14/0x20
> >> [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
> >> [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
> >> [<c104fb69>] process_one_work+0x179/0x5d0
> >> [<c10502f1>] worker_thread+0x121/0x2d0
> >> [<c10550ed>] kthread+0x7d/0x90
> >> [<c1481a82>] kernel_thread_helper+0x6/0x10
> >>
> >> other info that might help us debug this:
> >>
> >> Possible unsafe locking scenario:
> >>
> >> CPU0 CPU1
> >> ---- ----
> >> lock(&fip->ctlr_mutex);
> >> lock(rtnl_mutex);
> >> lock(&fip->ctlr_mutex);
> >> lock(rtnl_mutex);
> >>
> >> *** DEADLOCK ***
> >>
> >> Signed-off-by: Robert Love<robert.w.love-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >> ---
> >> drivers/scsi/fcoe/fcoe.c | 2 --
> >> 1 files changed, 0 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
> >> index e959960..85b8203 100644
> >> --- a/drivers/scsi/fcoe/fcoe.c
> >> +++ b/drivers/scsi/fcoe/fcoe.c
> >> @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport
> *lport, u8 *addr)
> >> struct fcoe_port *port = lport_priv(lport);
> >> struct fcoe_interface *fcoe = port->priv;
> >>
> >> - rtnl_lock();
> >> if (!is_zero_ether_addr(port->data_src_addr))
> >> dev_uc_del(fcoe->netdev, port->data_src_addr);
> >> if (!is_zero_ether_addr(addr))
> >> dev_uc_add(fcoe->netdev, addr);
> >> memcpy(port->data_src_addr, addr, ETH_ALEN);
> >> - rtnl_unlock();
> >> }
> >>
> >> /**
> >>
> > This isn't going to work. We do need rtnl_lock when calling
> > dev_uc_add/del to ensure the driver isn't removed while making the
> > change. I have an alternative patch that I'll post as soon as I clean
> it
> > up a bit.
> >
> > Nacked-by: Robert Love <robert.w.love-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> So there is a case you don't have a ref cnt on the netdev here?
>
> I guess my point is if your carrying around a ptr to the struct why
> haven't you incremented the refcnt. I think the dev_hold() in the
> create path would be enough to stop the above concern.
>
> Thanks,
> John
I agree, dev_put() is only called until after fcoe ctlr is destroyed,
which does delete the fip timer as well as cancel_work_sync the fip
timeout work already, so this patch should be good.
yi
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac
2012-03-13 22:52 ` [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac Robert Love
2012-03-14 0:42 ` Love, Robert W
@ 2012-03-15 8:40 ` Bart Van Assche
1 sibling, 0 replies; 5+ messages in thread
From: Bart Van Assche @ 2012-03-15 8:40 UTC (permalink / raw)
To: Robert Love; +Cc: linux-scsi, devel, yi.zou
On 03/13/12 22:52, Robert Love wrote:
> The rtnl_mutex was held to protect calls to dev_uc_add
> and dev_uc_del. Holding rtnl is not required as those
> functions make use of the netif_addr_lock* API to
> protect the MAC changing.
>
> This change fixes the following regression by removing
> the rtnl usage when fcoe_update_src_mac is called.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=42918
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&fip->ctlr_mutex){+.+...}:
> [<c1091f70>] lock_acquire+0x80/0x1b0
> [<c147655d>] mutex_lock_nested+0x6d/0x340
> [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe]
> [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe]
> [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe]
> [<c10527e0>] param_attr_store+0x30/0x60
> [<c1052696>] module_attr_store+0x26/0x40
> [<c11a201e>] sysfs_write_file+0xae/0x100
> [<c11449df>] vfs_write+0x8f/0x160
> [<c1144cbd>] sys_write+0x3d/0x70
> [<c147a0c4>] syscall_call+0x7/0xb
>
> -> #0 (rtnl_mutex){+.+.+.}:
> [<c109164b>] __lock_acquire+0x140b/0x1720
> [<c1091f70>] lock_acquire+0x80/0x1b0
> [<c147655d>] mutex_lock_nested+0x6d/0x340
> [<c13a10c4>] rtnl_lock+0x14/0x20
> [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
> [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
> [<c104fb69>] process_one_work+0x179/0x5d0
> [<c10502f1>] worker_thread+0x121/0x2d0
> [<c10550ed>] kthread+0x7d/0x90
> [<c1481a82>] kernel_thread_helper+0x6/0x10
>
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&fip->ctlr_mutex);
> lock(rtnl_mutex);
> lock(&fip->ctlr_mutex);
> lock(rtnl_mutex);
>
> *** DEADLOCK ***
>
> Signed-off-by: Robert Love <robert.w.love@intel.com>
> ---
> drivers/scsi/fcoe/fcoe.c | 2 --
> 1 files changed, 0 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
> index e959960..85b8203 100644
> --- a/drivers/scsi/fcoe/fcoe.c
> +++ b/drivers/scsi/fcoe/fcoe.c
> @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport *lport, u8 *addr)
> struct fcoe_port *port = lport_priv(lport);
> struct fcoe_interface *fcoe = port->priv;
>
> - rtnl_lock();
> if (!is_zero_ether_addr(port->data_src_addr))
> dev_uc_del(fcoe->netdev, port->data_src_addr);
> if (!is_zero_ether_addr(addr))
> dev_uc_add(fcoe->netdev, addr);
> memcpy(port->data_src_addr, addr, ETH_ALEN);
> - rtnl_unlock();
> }
>
> /**
Seems to be sufficient here to avoid the lockdep complaint.
Tested-by: Bart Van Assche <bvanassche@acm.org>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2012-03-15 8:40 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <bug-42918-11613@https.bugzilla.kernel.org>
2012-03-13 22:52 ` [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac Robert Love
2012-03-14 0:42 ` Love, Robert W
2012-03-14 2:14 ` [Open-FCoE] " John Fastabend
[not found] ` <4F5FFF19.1060601-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2012-03-14 16:18 ` Zou, Yi
2012-03-15 8:40 ` Bart Van Assche
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox