From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Fastabend
Subject: Re: [Open-FCoE] [PATCH] fcoe: Don't hold rtnl_mutex in fcoe_update_src_mac
Date: Tue, 13 Mar 2012 19:14:49 -0700
Message-ID: <4F5FFF19.1060601@intel.com>
References: <20120313225254.5473.92174.stgit@localhost6.localdomain6> <4F5FE95D.8040408@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mga11.intel.com ([192.55.52.93]:65009 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1760671Ab2CNCOu (ORCPT ); Tue, 13 Mar 2012 22:14:50 -0400
In-Reply-To: <4F5FE95D.8040408@intel.com>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: "Love, Robert W"
Cc: "bvanassche@acm.org" , "linux-scsi@vger.kernel.org" , "devel@open-fcoe.org"

On 3/13/2012 5:42 PM, Love, Robert W wrote:
> On 03/13/2012 03:52 PM, Robert Love wrote:
>> The rtnl_mutex was held to protect calls to dev_uc_add
>> and dev_uc_del. Holding rtnl is not required as those
>> functions make use of the netif_addr_lock* API to
>> protect the MAC changing.
>>
>> This change fixes the following regression by removing
>> the rtnl usage when fcoe_update_src_mac is called.
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=42918
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (&fip->ctlr_mutex){+.+...}:
>>        [] lock_acquire+0x80/0x1b0
>>        [] mutex_lock_nested+0x6d/0x340
>>        [] fcoe_ctlr_link_up+0x22/0x180 [libfcoe]
>>        [] fcoe_create+0x47e/0x6e0 [fcoe]
>>        [] fcoe_transport_create+0x143/0x250 [libfcoe]
>>        [] param_attr_store+0x30/0x60
>>        [] module_attr_store+0x26/0x40
>>        [] sysfs_write_file+0xae/0x100
>>        [] vfs_write+0x8f/0x160
>>        [] sys_write+0x3d/0x70
>>        [] syscall_call+0x7/0xb
>>
>> -> #0 (rtnl_mutex){+.+.+.}:
>>        [] __lock_acquire+0x140b/0x1720
>>        [] lock_acquire+0x80/0x1b0
>>        [] mutex_lock_nested+0x6d/0x340
>>        [] rtnl_lock+0x14/0x20
>>        [] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
>>        [] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
>>        [] process_one_work+0x179/0x5d0
>>        [] worker_thread+0x121/0x2d0
>>        [] kthread+0x7d/0x90
>>        [] kernel_thread_helper+0x6/0x10
>>
>> other info that might help us debug this:
>>
>>  Possible unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(&fip->ctlr_mutex);
>>                                lock(rtnl_mutex);
>>                                lock(&fip->ctlr_mutex);
>>   lock(rtnl_mutex);
>>
>>  *** DEADLOCK ***
>>
>> Signed-off-by: Robert Love
>> ---
>>  drivers/scsi/fcoe/fcoe.c |    2 --
>>  1 files changed, 0 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
>> index e959960..85b8203 100644
>> --- a/drivers/scsi/fcoe/fcoe.c
>> +++ b/drivers/scsi/fcoe/fcoe.c
>> @@ -539,13 +539,11 @@ static void fcoe_update_src_mac(struct fc_lport *lport, u8 *addr)
>>          struct fcoe_port *port = lport_priv(lport);
>>          struct fcoe_interface *fcoe = port->priv;
>>
>> -        rtnl_lock();
>>          if (!is_zero_ether_addr(port->data_src_addr))
>>                  dev_uc_del(fcoe->netdev, port->data_src_addr);
>>          if (!is_zero_ether_addr(addr))
>>                  dev_uc_add(fcoe->netdev, addr);
>>          memcpy(port->data_src_addr, addr, ETH_ALEN);
>> -        rtnl_unlock();
>>  }
>>
>>  /**
>>
> This isn't going to work. We do need rtnl_lock when calling
> dev_uc_add/del to ensure the driver isn't removed while making the
> change. I have an alternative patch that I'll post as soon as I clean it
> up a bit.
>
> Nacked-by: Robert Love

So there is a case where you don't have a refcnt on the netdev here?
I guess my point is, if you're carrying around a ptr to the struct, why
haven't you incremented the refcnt? I think a dev_hold() in the create
path would be enough to address the above concern.

Thanks,
John