From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Firo Yang <firo.yang@suse.com>,
	"David S . Miller" <davem@davemloft.net>,
	Sasha Levin <sashal@kernel.org>,
	netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 4.14 18/21] enic: prevent waking up stopped tx queues over watchdog reset
Date: Sat, 22 Feb 2020 21:24:08 -0500	[thread overview]
Message-ID: <20200223022411.2159-18-sashal@kernel.org> (raw)
In-Reply-To: <20200223022411.2159-1-sashal@kernel.org>

From: Firo Yang <firo.yang@suse.com>

[ Upstream commit 0f90522591fd09dd201065c53ebefdfe3c6b55cb ]

In recent months, our customer reported several kernel crashes, all
preceded by the following message:
NETDEV WATCHDOG: eth2 (enic): transmit queue 0 timed out
Error message of one of those crashes:
BUG: unable to handle kernel paging request at ffffffffa007e090

After analyzing several vmcores, I found that most of the crashes are
caused by memory corruption, and all the corrupted memory areas
are overwritten by network packet data. Moreover, I also found
that the tx queues were enabled over watchdog reset.

After going through the source code, I found that in enic_stop(),
the tx queues stopped by netif_tx_disable() could be woken up over
a small time window between netif_tx_disable() and the
napi_disable() by the following code path:
napi_poll->
  enic_poll_msix_wq->
     vnic_cq_service->
        enic_wq_service->
           netif_wake_subqueue(enic->netdev, q_number)->
              test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state)
In turn, the upper network stack could queue skbs to the ENIC NIC through
enic_hard_start_xmit(), which might introduce a race condition.
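
The ordering problem described above can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not driver
code: every model_* name is hypothetical, the wake is made unconditional
for simplicity, and the model ignores that the real napi_disable() waits
for an in-flight poll to finish. It only shows why disabling the tx queues
after the wq NAPI contexts closes the window.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the driver. */
struct model_queue {
	bool xoff;         /* stands in for __QUEUE_STATE_DRV_XOFF */
	bool napi_enabled; /* whether the wq completion poll may still run */
};

/* Models netif_tx_disable(): mark the tx queue as stopped. */
void model_tx_disable(struct model_queue *q)
{
	q->xoff = true;
}

/* Models napi_disable(): no further polls run after this point. */
void model_napi_disable(struct model_queue *q)
{
	q->napi_enabled = false;
}

/*
 * Models the wq completion poll ending in netif_wake_subqueue():
 * if polling is still enabled, the stopped bit is cleared.
 */
void model_wq_poll(struct model_queue *q)
{
	if (q->napi_enabled)
		q->xoff = false;
}

/* Old order: tx disable first, so a poll in the window wakes the queue. */
bool model_stop_old_order(void)
{
	struct model_queue q = { .xoff = false, .napi_enabled = true };

	model_tx_disable(&q);
	model_wq_poll(&q);      /* poll lands inside the window */
	model_napi_disable(&q);
	return q.xoff;          /* true means the queue stayed stopped */
}

/* New order: NAPI is disabled first, so a late poll cannot wake the queue. */
bool model_stop_new_order(void)
{
	struct model_queue q = { .xoff = false, .napi_enabled = true };

	model_napi_disable(&q);
	model_wq_poll(&q);      /* no effect: polling is already disabled */
	model_tx_disable(&q);
	return q.xoff;          /* true means the queue stayed stopped */
}
```

In this model, model_stop_old_order() returns false (the queue was woken
again) and model_stop_new_order() returns true (the queue stays stopped).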

Our customer confirmed that this kind of kernel crash has not occurred for
over 90 days since they applied this patch.

Signed-off-by: Firo Yang <firo.yang@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/cisco/enic/enic_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 19f374b180fc1..52a3b32390a9c 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -1972,10 +1972,10 @@ static int enic_stop(struct net_device *netdev)
 		napi_disable(&enic->napi[i]);
 
 	netif_carrier_off(netdev);
-	netif_tx_disable(netdev);
 	if (vnic_dev_get_intr_mode(enic->vdev) == VNIC_DEV_INTR_MODE_MSIX)
 		for (i = 0; i < enic->wq_count; i++)
 			napi_disable(&enic->napi[enic_cq_wq(enic, i)]);
+	netif_tx_disable(netdev);
 
 	if (!enic_is_dynamic(enic) && !enic_is_sriov_vf(enic))
 		enic_dev_del_station_addr(enic);
-- 
2.20.1

