* [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac [not found] <bug-42918-11613@https.bugzilla.kernel.org> @ 2012-03-13 22:52 ` Robert Love 2012-03-14 0:42 ` Love, Robert W 2012-03-15 8:40 ` Bart Van Assche 0 siblings, 2 replies; 5+ messages in thread From: Robert Love @ 2012-03-13 22:52 UTC (permalink / raw) To: bvanassche, linux-scsi; +Cc: devel, yi.zou The rtnl_mutex was held to protect calls to dev_uc_add and dev_uc_del. Holding rtnl is not required as those functions make use of the netif_addr_lock* API to protect the MAC changing. This change fixes the following regression by removing the rtnl usage when fcoe_update_src_mac is called. https://bugzilla.kernel.org/show_bug.cgi?id=42918 the existing dependency chain (in reverse order) is: -> #1 (&fip->ctlr_mutex){+.+...}: [<c1091f70>] lock_acquire+0x80/0x1b0 [<c147655d>] mutex_lock_nested+0x6d/0x340 [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe] [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe] [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe] [<c10527e0>] param_attr_store+0x30/0x60 [<c1052696>] module_attr_store+0x26/0x40 [<c11a201e>] sysfs_write_file+0xae/0x100 [<c11449df>] vfs_write+0x8f/0x160 [<c1144cbd>] sys_write+0x3d/0x70 [<c147a0c4>] syscall_call+0x7/0xb -> #0 (rtnl_mutex){+.+.+.}: [<c109164b>] __lock_acquire+0x140b/0x1720 [<c1091f70>] lock_acquire+0x80/0x1b0 [<c147655d>] mutex_lock_nested+0x6d/0x340 [<c13a10c4>] rtnl_lock+0x14/0x20 [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe] [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe] [<c104fb69>] process_one_work+0x179/0x5d0 [<c10502f1>] worker_thread+0x121/0x2d0 [<c10550ed>] kthread+0x7d/0x90 [<c1481a82>] kernel_thread_helper+0x6/0x10 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&fip->ctlr_mutex); lock(rtnl_mutex); lock(&fip->ctlr_mutex); lock(rtnl_mutex); *** DEADLOCK *** Signed-off-by: Robert Love <robert.w.love@intel.com> --- drivers/scsi/fcoe/fcoe.c | 2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c index e959960..85b8203 100644 --- a/drivers/scsi/fcoe/fcoe.c +++ b/drivers/scsi/fcoe/fcoe.c @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport *lport, u8 *addr) struct fcoe_port *port = lport_priv(lport); struct fcoe_interface *fcoe = port->priv; - rtnl_lock(); if (!is_zero_ether_addr(port->data_src_addr)) dev_uc_del(fcoe->netdev, port->data_src_addr); if (!is_zero_ether_addr(addr)) dev_uc_add(fcoe->netdev, addr); memcpy(port->data_src_addr, addr, ETH_ALEN); - rtnl_unlock(); } /** ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac 2012-03-13 22:52 ` [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac Robert Love @ 2012-03-14 0:42 ` Love, Robert W 2012-03-14 2:14 ` [Open-FCoE] " John Fastabend 2012-03-15 8:40 ` Bart Van Assche 1 sibling, 1 reply; 5+ messages in thread From: Love, Robert W @ 2012-03-14 0:42 UTC (permalink / raw) To: bvanassche@acm.org, linux-scsi@vger.kernel.org Cc: devel@open-fcoe.org, Zou, Yi On 03/13/2012 03:52 PM, Robert Love wrote: > The rtnl_mutex was held to protect calls to dev_uc_add > and dev_uc_del. Holding rtnl is not required as those > functions make use of the netif_addr_lock* API to > protect the MAC changing. > > This change fixes the following regression by removing > the rtnl usage when fcoe_update_src_mac is called. > > https://bugzilla.kernel.org/show_bug.cgi?id=42918 > > the existing dependency chain (in reverse order) is: > > -> #1 (&fip->ctlr_mutex){+.+...}: > [<c1091f70>] lock_acquire+0x80/0x1b0 > [<c147655d>] mutex_lock_nested+0x6d/0x340 > [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe] > [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe] > [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe] > [<c10527e0>] param_attr_store+0x30/0x60 > [<c1052696>] module_attr_store+0x26/0x40 > [<c11a201e>] sysfs_write_file+0xae/0x100 > [<c11449df>] vfs_write+0x8f/0x160 > [<c1144cbd>] sys_write+0x3d/0x70 > [<c147a0c4>] syscall_call+0x7/0xb > > -> #0 (rtnl_mutex){+.+.+.}: > [<c109164b>] __lock_acquire+0x140b/0x1720 > [<c1091f70>] lock_acquire+0x80/0x1b0 > [<c147655d>] mutex_lock_nested+0x6d/0x340 > [<c13a10c4>] rtnl_lock+0x14/0x20 > [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe] > [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe] > [<c104fb69>] process_one_work+0x179/0x5d0 > [<c10502f1>] worker_thread+0x121/0x2d0 > [<c10550ed>] kthread+0x7d/0x90 > [<c1481a82>] kernel_thread_helper+0x6/0x10 > > other info that might help us debug this: > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(&fip->ctlr_mutex); > lock(rtnl_mutex); > lock(&fip->ctlr_mutex); > lock(rtnl_mutex); > > *** DEADLOCK *** > > Signed-off-by: Robert Love<robert.w.love@intel.com> > --- > drivers/scsi/fcoe/fcoe.c | 2 -- > 1 files changed, 0 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c > index e959960..85b8203 100644 > --- a/drivers/scsi/fcoe/fcoe.c > +++ b/drivers/scsi/fcoe/fcoe.c > @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport *lport, u8 *addr) > struct fcoe_port *port = lport_priv(lport); > struct fcoe_interface *fcoe = port->priv; > > - rtnl_lock(); > if (!is_zero_ether_addr(port->data_src_addr)) > dev_uc_del(fcoe->netdev, port->data_src_addr); > if (!is_zero_ether_addr(addr)) > dev_uc_add(fcoe->netdev, addr); > memcpy(port->data_src_addr, addr, ETH_ALEN); > - rtnl_unlock(); > } > > /** > This isn't going to work. We do need rtnl_lock when calling dev_uc_add/del to ensure the driver isn't removed while making the change. I have an alternative patch that I'll post as soon as I clean it up a bit. Nacked-by: Robert Love <robert.w.love@intel.com> ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Open-FCoE] [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac 2012-03-14 0:42 ` Love, Robert W @ 2012-03-14 2:14 ` John Fastabend [not found] ` <4F5FFF19.1060601-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 5+ messages in thread From: John Fastabend @ 2012-03-14 2:14 UTC (permalink / raw) To: Love, Robert W Cc: bvanassche@acm.org, linux-scsi@vger.kernel.org, devel@open-fcoe.org On 3/13/2012 5:42 PM, Love, Robert W wrote: > On 03/13/2012 03:52 PM, Robert Love wrote: >> The rtnl_mutex was held to protect calls to dev_uc_add >> and dev_uc_del. Holding rtnl is not required as those >> functions make use of the netif_addr_lock* API to >> protect the MAC changing. >> >> This change fixes the following regression by removing >> the rtnl usage when fcoe_update_src_mac is called. >> >> https://bugzilla.kernel.org/show_bug.cgi?id=42918 >> >> the existing dependency chain (in reverse order) is: >> >> -> #1 (&fip->ctlr_mutex){+.+...}: >> [<c1091f70>] lock_acquire+0x80/0x1b0 >> [<c147655d>] mutex_lock_nested+0x6d/0x340 >> [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe] >> [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe] >> [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe] >> [<c10527e0>] param_attr_store+0x30/0x60 >> [<c1052696>] module_attr_store+0x26/0x40 >> [<c11a201e>] sysfs_write_file+0xae/0x100 >> [<c11449df>] vfs_write+0x8f/0x160 >> [<c1144cbd>] sys_write+0x3d/0x70 >> [<c147a0c4>] syscall_call+0x7/0xb >> >> -> #0 (rtnl_mutex){+.+.+.}: >> [<c109164b>] __lock_acquire+0x140b/0x1720 >> [<c1091f70>] lock_acquire+0x80/0x1b0 >> [<c147655d>] mutex_lock_nested+0x6d/0x340 >> [<c13a10c4>] rtnl_lock+0x14/0x20 >> [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe] >> [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe] >> [<c104fb69>] process_one_work+0x179/0x5d0 >> [<c10502f1>] worker_thread+0x121/0x2d0 >> [<c10550ed>] kthread+0x7d/0x90 >> [<c1481a82>] kernel_thread_helper+0x6/0x10 >> >> other info that might help us debug this: >> >> Possible unsafe locking scenario: >> >> CPU0 CPU1 >> ---- ---- >> lock(&fip->ctlr_mutex); >> lock(rtnl_mutex); >> lock(&fip->ctlr_mutex); >> lock(rtnl_mutex); >> >> *** DEADLOCK *** >> >> Signed-off-by: Robert Love<robert.w.love@intel.com> >> --- >> drivers/scsi/fcoe/fcoe.c | 2 -- >> 1 files changed, 0 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c >> index e959960..85b8203 100644 >> --- a/drivers/scsi/fcoe/fcoe.c >> +++ b/drivers/scsi/fcoe/fcoe.c >> @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport *lport, u8 *addr) >> struct fcoe_port *port = lport_priv(lport); >> struct fcoe_interface *fcoe = port->priv; >> >> - rtnl_lock(); >> if (!is_zero_ether_addr(port->data_src_addr)) >> dev_uc_del(fcoe->netdev, port->data_src_addr); >> if (!is_zero_ether_addr(addr)) >> dev_uc_add(fcoe->netdev, addr); >> memcpy(port->data_src_addr, addr, ETH_ALEN); >> - rtnl_unlock(); >> } >> >> /** >> > This isn't going to work. We do need rtnl_lock when calling > dev_uc_add/del to ensure the driver isn't removed while making the > change. I have an alternative patch that I'll post as soon as I clean it > up a bit. > > Nacked-by: Robert Love <robert.w.love@intel.com> So there is a case you don't have a ref cnt on the netdev here? I guess my point is if your carrying around a ptr to the struct why haven't you incremented the refcnt. I think the dev_hold() in the create path would be enough to stop the above concern. Thanks, John ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <4F5FFF19.1060601-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac [not found] ` <4F5FFF19.1060601-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> @ 2012-03-14 16:18 ` Zou, Yi 0 siblings, 0 replies; 5+ messages in thread From: Zou, Yi @ 2012-03-14 16:18 UTC (permalink / raw) To: Fastabend, John R, Love, Robert W Cc: bvanassche-HInyCGIudOg@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, devel-s9riP+hp16TNLxjTenLetw@public.gmane.org > > On 3/13/2012 5:42 PM, Love, Robert W wrote: > > On 03/13/2012 03:52 PM, Robert Love wrote: > >> The rtnl_mutex was held to protect calls to dev_uc_add > >> and dev_uc_del. Holding rtnl is not required as those > >> functions make use of the netif_addr_lock* API to > >> protect the MAC changing. > >> > >> This change fixes the following regression by removing > >> the rtnl usage when fcoe_update_src_mac is called. > >> > >> https://bugzilla.kernel.org/show_bug.cgi?id=42918 > >> > >> the existing dependency chain (in reverse order) is: > >> > >> -> #1 (&fip->ctlr_mutex){+.+...}: > >> [<c1091f70>] lock_acquire+0x80/0x1b0 > >> [<c147655d>] mutex_lock_nested+0x6d/0x340 > >> [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe] > >> [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe] > >> [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe] > >> [<c10527e0>] param_attr_store+0x30/0x60 > >> [<c1052696>] module_attr_store+0x26/0x40 > >> [<c11a201e>] sysfs_write_file+0xae/0x100 > >> [<c11449df>] vfs_write+0x8f/0x160 > >> [<c1144cbd>] sys_write+0x3d/0x70 > >> [<c147a0c4>] syscall_call+0x7/0xb > >> > >> -> #0 (rtnl_mutex){+.+.+.}: > >> [<c109164b>] __lock_acquire+0x140b/0x1720 > >> [<c1091f70>] lock_acquire+0x80/0x1b0 > >> [<c147655d>] mutex_lock_nested+0x6d/0x340 > >> [<c13a10c4>] rtnl_lock+0x14/0x20 > >> [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe] > >> [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe] > >> [<c104fb69>] process_one_work+0x179/0x5d0 > >> [<c10502f1>] worker_thread+0x121/0x2d0 > >> [<c10550ed>] kthread+0x7d/0x90 > >> [<c1481a82>] kernel_thread_helper+0x6/0x10 > >> > >> other info that might help us debug this: > >> > >> Possible unsafe locking scenario: > >> > >> CPU0 CPU1 > >> ---- ---- > >> lock(&fip->ctlr_mutex); > >> lock(rtnl_mutex); > >> lock(&fip->ctlr_mutex); > >> lock(rtnl_mutex); > >> > >> *** DEADLOCK *** > >> > >> Signed-off-by: Robert Love<robert.w.love-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > >> --- > >> drivers/scsi/fcoe/fcoe.c | 2 -- > >> 1 files changed, 0 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c > >> index e959960..85b8203 100644 > >> --- a/drivers/scsi/fcoe/fcoe.c > >> +++ b/drivers/scsi/fcoe/fcoe.c > >> @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport > *lport, u8 *addr) > >> struct fcoe_port *port = lport_priv(lport); > >> struct fcoe_interface *fcoe = port->priv; > >> > >> - rtnl_lock(); > >> if (!is_zero_ether_addr(port->data_src_addr)) > >> dev_uc_del(fcoe->netdev, port->data_src_addr); > >> if (!is_zero_ether_addr(addr)) > >> dev_uc_add(fcoe->netdev, addr); > >> memcpy(port->data_src_addr, addr, ETH_ALEN); > >> - rtnl_unlock(); > >> } > >> > >> /** > >> > > This isn't going to work. We do need rtnl_lock when calling > > dev_uc_add/del to ensure the driver isn't removed while making the > > change. I have an alternative patch that I'll post as soon as I clean > it > > up a bit. > > > > Nacked-by: Robert Love <robert.w.love-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> > > So there is a case you don't have a ref cnt on the netdev here? > > I guess my point is if your carrying around a ptr to the struct why > haven't you incremented the refcnt. I think the dev_hold() in the > create path would be enough to stop the above concern. > > Thanks, > John I agree, dev_put() is only called until after fcoe ctlr is destroyed, which does delete the fip timer as well as cancel_work_sync the fip timeout work already, so this patch should be good. yi > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac 2012-03-13 22:52 ` [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac Robert Love 2012-03-14 0:42 ` Love, Robert W @ 2012-03-15 8:40 ` Bart Van Assche 1 sibling, 0 replies; 5+ messages in thread From: Bart Van Assche @ 2012-03-15 8:40 UTC (permalink / raw) To: Robert Love; +Cc: linux-scsi, devel, yi.zou On 03/13/12 22:52, Robert Love wrote: > The rtnl_mutex was held to protect calls to dev_uc_add > and dev_uc_del. Holding rtnl is not required as those > functions make use of the netif_addr_lock* API to > protect the MAC changing. > > This change fixes the following regression by removing > the rtnl usage when fcoe_update_src_mac is called. > > https://bugzilla.kernel.org/show_bug.cgi?id=42918 > > the existing dependency chain (in reverse order) is: > > -> #1 (&fip->ctlr_mutex){+.+...}: > [<c1091f70>] lock_acquire+0x80/0x1b0 > [<c147655d>] mutex_lock_nested+0x6d/0x340 > [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe] > [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe] > [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe] > [<c10527e0>] param_attr_store+0x30/0x60 > [<c1052696>] module_attr_store+0x26/0x40 > [<c11a201e>] sysfs_write_file+0xae/0x100 > [<c11449df>] vfs_write+0x8f/0x160 > [<c1144cbd>] sys_write+0x3d/0x70 > [<c147a0c4>] syscall_call+0x7/0xb > > -> #0 (rtnl_mutex){+.+.+.}: > [<c109164b>] __lock_acquire+0x140b/0x1720 > [<c1091f70>] lock_acquire+0x80/0x1b0 > [<c147655d>] mutex_lock_nested+0x6d/0x340 > [<c13a10c4>] rtnl_lock+0x14/0x20 > [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe] > [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe] > [<c104fb69>] process_one_work+0x179/0x5d0 > [<c10502f1>] worker_thread+0x121/0x2d0 > [<c10550ed>] kthread+0x7d/0x90 > [<c1481a82>] kernel_thread_helper+0x6/0x10 > > other info that might help us debug this: > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(&fip->ctlr_mutex); > lock(rtnl_mutex); > lock(&fip->ctlr_mutex); > lock(rtnl_mutex); > > *** DEADLOCK *** > > Signed-off-by: Robert Love <robert.w.love@intel.com> > --- > drivers/scsi/fcoe/fcoe.c | 2 -- > 1 files changed, 0 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c > index e959960..85b8203 100644 > --- a/drivers/scsi/fcoe/fcoe.c > +++ b/drivers/scsi/fcoe/fcoe.c > @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport *lport, u8 *addr) > struct fcoe_port *port = lport_priv(lport); > struct fcoe_interface *fcoe = port->priv; > > - rtnl_lock(); > if (!is_zero_ether_addr(port->data_src_addr)) > dev_uc_del(fcoe->netdev, port->data_src_addr); > if (!is_zero_ether_addr(addr)) > dev_uc_add(fcoe->netdev, addr); > memcpy(port->data_src_addr, addr, ETH_ALEN); > - rtnl_unlock(); > } > > /** Seems to be sufficient here to avoid the lockdep complaint. Tested-by: Bart Van Assche <bvanassche@acm.org> ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2012-03-15 8:40 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <bug-42918-11613@https.bugzilla.kernel.org>
2012-03-13 22:52 ` [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac Robert Love
2012-03-14 0:42 ` Love, Robert W
2012-03-14 2:14 ` [Open-FCoE] " John Fastabend
[not found] ` <4F5FFF19.1060601-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2012-03-14 16:18 ` Zou, Yi
2012-03-15 8:40 ` Bart Van Assche
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox