From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78810C433DF for ; Thu, 30 Jul 2020 17:09:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 629A72082E for ; Thu, 30 Jul 2020 17:09:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730085AbgG3RJq (ORCPT ); Thu, 30 Jul 2020 13:09:46 -0400 Received: from mga04.intel.com ([192.55.52.120]:53573 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727072AbgG3RJp (ORCPT ); Thu, 30 Jul 2020 13:09:45 -0400 IronPort-SDR: kZjhnbVnSsKlFbOX/Sv4KOZIP5N/ZwEIjE7uxlDR0+ILR6EUuVVZAhMCilEuggXXqwlBce4d5l GM80S86204yw== X-IronPort-AV: E=McAfee;i="6000,8403,9698"; a="149111765" X-IronPort-AV: E=Sophos;i="5.75,415,1589266800"; d="scan'208";a="149111765" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jul 2020 10:09:44 -0700 IronPort-SDR: y0nW39Z1i8VKzqXnB6B/8ud7Db11d4wnn/oQAZLWbGbCyWuz1goOt+l/XCBe4nspf7Y/VYUN4A YOUs3yaRvHFA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,415,1589266800"; d="scan'208";a="272979243" Received: from jtkirshe-desk1.jf.intel.com ([134.134.177.86]) by fmsmga007.fm.intel.com with ESMTP; 30 Jul 2020 10:09:44 -0700 From: Tony Nguyen To: davem@davemloft.net Cc: Francesco Ruggeri , netdev@vger.kernel.org, nhorman@redhat.com, sassmann@redhat.com, jeffrey.t.kirsher@intel.com, anthony.l.nguyen@intel.com, Aaron Brown Subject: [net 2/2] igb: reinit_locked() should be called with rtnl_lock Date: Thu, 30 Jul 2020 10:09:38 -0700 Message-Id: <20200730170938.3766899-3-anthony.l.nguyen@intel.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200730170938.3766899-1-anthony.l.nguyen@intel.com> References: <20200730170938.3766899-1-anthony.l.nguyen@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Francesco Ruggeri We observed two panics involving races with igb_reset_task. The first panic is caused by this race condition: kworker reboot -f igb_reset_task igb_reinit_locked igb_down napi_synchronize __igb_shutdown igb_clear_interrupt_scheme igb_free_q_vectors igb_free_q_vector adapter->q_vector[v_idx] = NULL; napi_disable Panics trying to access adapter->q_vector[v_idx].napi_state The second panic (a divide error) is caused by this race: kworker reboot -f tx packet igb_reset_task __igb_shutdown rtnl_lock() ... igb_clear_interrupt_scheme igb_free_q_vectors adapter->num_tx_queues = 0 ... rtnl_unlock() rtnl_lock() igb_reinit_locked igb_down igb_up netif_tx_start_all_queues dev_hard_start_xmit igb_xmit_frame igb_tx_queue_mapping Panics on r_idx % adapter->num_tx_queues This commit applies to igb_reset_task the same changes that were applied to ixgbe in commit 2f90b8657ec9 ("ixgbe: this patch adds support for DCB to the kernel and ixgbe driver"), commit 8f4c5c9fb87a ("ixgbe: reinit_locked() should be called with rtnl_lock") and commit 88adce4ea8f9 ("ixgbe: fix possible race in reset subtask"). Signed-off-by: Francesco Ruggeri Tested-by: Aaron Brown Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/igb/igb_main.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 8bb3db2cbd41..6e5861bfb0fa 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -6224,9 +6224,18 @@ static void igb_reset_task(struct work_struct *work) struct igb_adapter *adapter; adapter = container_of(work, struct igb_adapter, reset_task); + rtnl_lock(); + /* If we're already down or resetting, just bail */ + if (test_bit(__IGB_DOWN, &adapter->state) || + test_bit(__IGB_RESETTING, &adapter->state)) { + rtnl_unlock(); + return; + } + igb_dump(adapter); netdev_err(adapter->netdev, "Reset adapter\n"); igb_reinit_locked(adapter); + rtnl_unlock(); } /** -- 2.26.2