From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.0 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EDD5C10F0B for ; Thu, 18 Apr 2019 07:05:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 445DE217D7 for ; Thu, 18 Apr 2019 07:05:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387570AbfDRHFO convert rfc822-to-8bit (ORCPT ); Thu, 18 Apr 2019 03:05:14 -0400 Received: from smtp.nue.novell.com ([195.135.221.5]:41162 "EHLO smtp.nue.novell.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726162AbfDRHFO (ORCPT ); Thu, 18 Apr 2019 03:05:14 -0400 Received: from [10.67.17.168] ([45.122.156.254]) by smtp.nue.novell.com with ESMTP (TLS encrypted); Thu, 18 Apr 2019 09:05:08 +0200 Subject: Re: [PATCH 1/1] be2net: Detach interface for avoiding a system crash To: Saeed Mahameed , "sathya.perla@broadcom.com" , "sriharsha.basavapatna@broadcom.com" , "somnath.kotur@broadcom.com" , "ajit.khaparde@broadcom.com" , "davem@davemloft.net" Cc: "firogm@gmail.com" , "netdev@vger.kernel.org" References: <20190401122421.30116-1-fyang@suse.com> <786ef1d29facea3163b91ce160dd146f5759d160.camel@mellanox.com> From: Firo Message-ID: <22d06c53-2cfc-6000-ba0d-044dd9457a10@suse.com> Date: Thu, 18 Apr 2019 15:05:02 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: <786ef1d29facea3163b91ce160dd146f5759d160.camel@mellanox.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8BIT Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On 4/2/19 12:25 AM, Saeed Mahameed wrote: > On Mon, 2019-04-01 at 20:24 +0800, Firo Yang wrote: >> This crash is triggered by a user-after-free since lake of >> the synchronization of a race condition between >> be_update_queues() modifying multi-purpose channels of >> network device and be_tx_timeout(). >> >> BUG: unable to handle kernel NULL pointer dereference at (null) >> Call Trace: >> be_tx_timeout+0xa5/0x360 [be2net] >> dev_watchdog+0x1d8/0x210 >> call_timer_fn+0x32/0x140 >> >> To fix it, detach the interface before modifying >> multi-purpose channels of network device. >> >> Signed-off-by: Firo Yang >> --- >> drivers/net/ethernet/emulex/benet/be_main.c | 12 ++++++++---- >> 1 file changed, 8 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/net/ethernet/emulex/benet/be_main.c >> b/drivers/net/ethernet/emulex/benet/be_main.c >> index d5026909dec5..25d0128bf684 100644 >> --- a/drivers/net/ethernet/emulex/benet/be_main.c >> +++ b/drivers/net/ethernet/emulex/benet/be_main.c >> @@ -4705,6 +4705,8 @@ int be_update_queues(struct be_adapter >> *adapter) >> struct net_device *netdev = adapter->netdev; >> int status; >> >> + netif_device_detach(netdev); >> + > > This will reduce the probability, but will not do the trick. > since this will not guarantee that the dev_watchdog is disabled. Hi Saeed, What about using dev_watchdog_down/up() to temporarily disable the dev_watchdog? +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -4697,6 +4697,8 @@ int be_update_queues(struct be_adapter *adapter) struct net_device *netdev = adapter->netdev; int status; + dev_watchdog_down(); + if (netif_running(netdev)) be_close(netdev); @@ -4711,21 +4713,21 @@ int be_update_queues(struct be_adapter *adapter) be_clear_queues(adapter); status = be_cmd_if_destroy(adapter, adapter->if_handle, 0); if (status) - return status; + goto out; if (!msix_enabled(adapter)) { status = be_msix_enable(adapter); if (status) - return status; + goto out; } status = be_if_create(adapter); if (status) - return status; + goto out; status = be_setup_queues(adapter); if (status) - return status; + goto out; be_schedule_worker(adapter); @@ -4741,6 +4743,9 @@ int be_update_queues(struct be_adapter *adapter) if (netif_running(netdev)) status = be_open(netdev); +out: + dev_watchdog_up(); Thanks, Firo > > what you need is proper locking mechanism and/or scheduling the > tx_timeout handling out of atomic context if a mutex will be required. > > netif_device_detach is a too heavy hammer for such synchronizations > tasks. > >> if (netif_running(netdev)) >> be_close(netdev); >> >> @@ -4719,21 +4721,21 @@ int be_update_queues(struct be_adapter >> *adapter) >> be_clear_queues(adapter); >> status = be_cmd_if_destroy(adapter, adapter->if_handle, 0); >> if (status) >> - return status; >> + goto out; >> >> if (!msix_enabled(adapter)) { >> status = be_msix_enable(adapter); >> if (status) >> - return status; >> + goto out; >> } >> >> status = be_if_create(adapter); >> if (status) >> - return status; >> + goto out; >> >> status = be_setup_queues(adapter); >> if (status) >> - return status; >> + goto out; >> >> be_schedule_worker(adapter); >> >> @@ -4748,6 +4750,8 @@ int be_update_queues(struct be_adapter >> *adapter) >> if (netif_running(netdev)) >> status = be_open(netdev); >> >> +out: >> + netif_device_attach(netdev); >> return status; >> } >>