Netdev List


* Re: [PATCH net v3 0/3] net: airoha: Fix airoha_qdma_cleanup_tx_queue() processing
From: Lorenzo Bianconi @ 2026-04-17  6:26 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman
  Cc: linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260416-airoha_qdma_cleanup_tx_queue-fix-net-v3-0-2b69f5788580@kernel.org>


> Add missing bits in airoha_qdma_cleanup_tx_queue routine.
> Fix airoha_qdma_cleanup_tx_queue processing errors introduced in commit
> '3f47e67dff1f7 ("net: airoha: Add the capability to consume out-of-order
> DMA tx descriptors")'.
> 
> ---
> Changes in v3:
> - Move ndesc initialization fix in a dedicated patch.
> - Add patch 2/3 to move entries to queue head in case of DMA mapping
>   failure in airoha_dev_xmit().
> - Cosmetics.
> - Link to v2: https://lore.kernel.org/r/20260414-airoha_qdma_cleanup_tx_queue-fix-net-v2-1-875de57cc022@kernel.org
> 
> Changes in v2:
> - Move q->ndesc initialization at end of airoha_qdma_init_tx routine in
>   order to avoid any possible NULL pointer dereference in
>   airoha_qdma_cleanup_tx_queue()
> - Check if q->tx_list is empty in airoha_qdma_cleanup_tx_queue()
> - Link to v1: https://lore.kernel.org/r/20260410-airoha_qdma_cleanup_tx_queue-fix-net-v1-1-b7171c8f1e78@kernel.org
> 
> ---
> Lorenzo Bianconi (3):
>       net: airoha: Move ndesc initialization at end of airoha_qdma_init_tx()
>       net: airoha: Move entries to queue head in case of DMA mapping failure in airoha_dev_xmit()
>       net: airoha: Add missing bits in airoha_qdma_cleanup_tx_queue()

Please drop this version, I will send a new one dropping patch 2/3.

Regards,
Lorenzo

> 
>  drivers/net/ethernet/airoha/airoha_eth.c | 42 ++++++++++++++++++++++++++------
>  1 file changed, 35 insertions(+), 7 deletions(-)
> ---
> base-commit: 3f20012a3964f487ae1e9ff942e2f35d4e9595bf
> change-id: 20260410-airoha_qdma_cleanup_tx_queue-fix-net-93375f5ee80f
> 
> Best regards,
> -- 
> Lorenzo Bianconi <lorenzo@kernel.org>
> 



* [PATCH iwl-net 0/4] ice: fixes for pause reporting, autoneg, RDMA and EIPE
From: Aleksandr Loktionov @ 2026-04-17  6:29 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov; +Cc: netdev

This is v2 of the ice fixes patchset for iwl-net.

v2 changes:
 - Dropped patch "ice: fix 'adjust' timer programming for E830 devices"
   as it has already been applied to the iwl-net tree.

This series fixes four issues in the Intel ice driver:

- Asymmetric Pause capability was missing from the ethtool-reported
  supported link modes, causing ethtool to always show Pause as
  unsupported even when the hardware supports asymmetric flow control.

- Autoneg disable was only attempted when AN had already completed,
  ignoring the case where the link partner does not advertise AN ability
  at all (AN37).  Both conditions should allow the user to disable
  autoneg.

- RDMA was incorrectly disabled on E830 devices with 4 or more ports
  because the generic port-limited-capabilities path capped maxtc=4 and
  then cleared the RDMA capability bit.  E830 does not have that
  limitation and must be skipped.

- On E830, Ethernet Inline IPsec Engine (EIPE) decryption errors trigger
  a checksum-error path that returned early without reporting the error
  to the OS.  The packet must be forwarded to the stack with the
  checksum error flag set so the OS can handle it correctly.

Jan Glaza (1):
  ice: report EIPE checksum errors to the OS on E830

Konrad Knitter (1):
  ice: fix autoneg disable when link partner doesn't support AN

Lukasz Czapnik (1):
  ice: support RDMA on 4+-port E830 devices

Tomasz Lichwala (1):
  ice: fix asymmetric pause negotiation reporting in ethtool

 drivers/net/ethernet/intel/ice/ice_common.c   |  2 +-
 drivers/net/ethernet/intel/ice/ice_ethtool.c  | 30 ++++++++++++++++++++++++--
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c |  2 ++
 3 files changed, 31 insertions(+), 3 deletions(-)

-- 
2.52.0



* [PATCH iwl-net 1/4] ice: fix asymmetric pause negotiation reporting in ethtool
From: Aleksandr Loktionov @ 2026-04-17  6:29 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov
  Cc: netdev, Tomasz Lichwala
In-Reply-To: <20260417062954.1241900-1-aleksandr.loktionov@intel.com>

From: Tomasz Lichwala <tomasz.lichwala@intel.com>

Add Asym_Pause to the supported link modes so that asymmetric pause
negotiation is properly reported via ethtool. Without Asym_Pause in
the supported modes, 'ethtool -a' incorrectly shows 'RX/TX negotiated: off'
for asymmetric pause configurations, even when pause is properly
negotiated and functional at the hardware level.

Fixes: 5a056cd7ead2 ("ice: add lp_advertising flow control support")
Signed-off-by: Tomasz Lichwala <tomasz.lichwala@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_ethtool.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index e6a20af..30d2550 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -2373,8 +2373,9 @@ ice_get_link_ksettings(struct net_device *netdev,
 		break;
 	}

-	/* flow control is symmetric and always supported */
+	/* flow control is symmetric or asymmetric and always supported */
 	ethtool_link_ksettings_add_link_mode(ks, supported, Pause);
+	ethtool_link_ksettings_add_link_mode(ks, supported, Asym_Pause);

 	caps = kzalloc_obj(*caps);
 	if (!caps)
--
2.52.0



* [PATCH iwl-net 2/4] ice: fix autoneg disable when link partner doesn't support AN
From: Aleksandr Loktionov @ 2026-04-17  6:29 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov
  Cc: netdev, Konrad Knitter
In-Reply-To: <20260417062954.1241900-1-aleksandr.loktionov@intel.com>

From: Konrad Knitter <konrad.knitter@intel.com>

Disabling autonegotiation was silently ignored when autoneg had not yet
completed (ICE_AQ_AN_COMPLETED was not set), leaving the configuration
unchanged with no error. This could prevent link from forming if the
link partner requires non-autoneg mode.

Extend the condition to also allow disabling autoneg when the link
partner reports no AN ability (ICE_AQ_LP_AN_ABILITY clear). Gate the
ICE_AQ_LP_AN_ABILITY check on the link being up so that stale or
zeroed an_info when link is down does not produce a false positive.
Introduce the helper ice_autoneg_disable_allowed() to make the check
explicit.

Fixes: f1a4a66d2310 ("ice: fix set pause param autoneg check")
Signed-off-by: Konrad Knitter <konrad.knitter@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_ethtool.c | 26 ++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 30d2550..e41fc8d 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -2509,6 +2509,28 @@ ice_ksettings_find_adv_link_speed(const struct ethtool_link_ksettings *ks)
 	return adv_link_speed;
 }

+/**
+ * ice_autoneg_disable_allowed - check if autoneg can be disabled
+ * @p: port info
+ *
+ * Check if autonegotiation can be disabled based on link state.
+ * ICE_AQ_LP_AN_ABILITY is only valid when the link is up; gate that
+ * check accordingly to avoid false positives from stale link data.
+ *
+ * Return: true if autoneg has completed, or if the link is up and the
+ *         link partner does not advertise autonegotiation capability.
+ */
+static bool ice_autoneg_disable_allowed(struct ice_port_info *p)
+{
+	u8 an_info = p->phy.link_info.an_info;
+
+	if (an_info & ICE_AQ_AN_COMPLETED)
+		return true;
+	/* ICE_AQ_LP_AN_ABILITY is only valid when link is up */
+	return (p->phy.link_info.link_info & ICE_AQ_LINK_UP) &&
+	       !(an_info & ICE_AQ_LP_AN_ABILITY);
+}
+
 /**
  * ice_setup_autoneg
  * @p: port info
@@ -2547,8 +2569,8 @@ ice_setup_autoneg(struct ice_port_info *p, struct ethtool_link_ksettings *ks,
 			}
 		}
 	} else {
-		/* If autoneg is currently enabled */
-		if (p->phy.link_info.an_info & ICE_AQ_AN_COMPLETED) {
+		/* If autoneg completed or link partner does not support AN */
+		if (ice_autoneg_disable_allowed(p)) {
 			/* If autoneg is supported 10GBASE_T is the only PHY
 			 * that can disable it, so otherwise return error
 			 */
--
2.52.0



* [PATCH iwl-net 3/4] ice: support RDMA on 4+-port E830 devices
From: Aleksandr Loktionov @ 2026-04-17  6:29 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov
  Cc: netdev, Lukasz Czapnik
In-Reply-To: <20260417062954.1241900-1-aleksandr.loktionov@intel.com>

From: Lukasz Czapnik <lukasz.czapnik@intel.com>

E810 and E82X devices do not support RDMA on configurations with more
than 4 ports. This limitation does not apply to E830 devices, which
have a different hardware design and support RDMA regardless of the
port count.

Narrow the RDMA capability disable condition to skip E830 devices.

Fixes: ba1124f58afd ("ice: Add E830 device IDs, MAC type and registers")
Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index ce11fea..0e40011 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -2509,7 +2509,7 @@ ice_recalc_port_limited_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps)
 		caps->maxtc = 4;
 		ice_debug(hw, ICE_DBG_INIT, "reducing maxtc to %d (based on #ports)\n",
 			  caps->maxtc);
-		if (caps->rdma) {
+		if (caps->rdma && hw->mac_type != ICE_MAC_E830) {
 			ice_debug(hw, ICE_DBG_INIT, "forcing RDMA off\n");
 			caps->rdma = 0;
 		}
--
2.52.0



* [PATCH iwl-net 4/4] ice: report EIPE checksum errors to the OS on E830
From: Aleksandr Loktionov @ 2026-04-17  6:29 UTC (permalink / raw)
  To: intel-wired-lan, anthony.l.nguyen, aleksandr.loktionov; +Cc: netdev, Jan Glaza
In-Reply-To: <20260417062954.1241900-1-aleksandr.loktionov@intel.com>

From: Jan Glaza <jan.glaza@intel.com>

For E830 adapters the hardware-reported EIPE (Ethernet Inline IPsec
Engine) error is a reliable indication that a received packet failed
decryption and has a bad checksum. Route EIPE errors through the
generic checksum error path on E830 so the error is visible via
standard ethtool statistics (rx_csum_bad).

On previous devices (E810, E82X) the EIPE flag can be spuriously set
on encapsulated packets with inner L2 padding, so those adapters only
increment the driver-private hw_rx_eipe_error counter without routing
through the checksum error path.

Fixes: 0ca6755f3cc2 ("ice: Add a new counter for Rx EIPE errors")
Signed-off-by: Jan Glaza <jan.glaza@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index e695a66..82d9d2c4 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -140,6 +140,8 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,

 	if (ipv4 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S)))) {
 		ring->vsi->back->hw_rx_eipe_error++;
+		if (ring->vsi->back->hw.mac_type == ICE_MAC_E830)
+			goto checksum_fail;
 		return;
 	}

--
2.52.0


* Re: Path forward for NFC in the kernel
From: Michael Walle @ 2026-04-17  6:35 UTC (permalink / raw)
  To: Jakub Kicinski, Michael Thalmeier, Raymond Hackley, Bongsu Jeon,
	Krzysztof Kozlowski, Mark Greer
  Cc: netdev
In-Reply-To: <20260416101041.4c533306@kernel.org>


Hi,

On Thu Apr 16, 2026 at 7:10 PM CEST, Jakub Kicinski wrote:
> Hi folks!
>
> We are struggling to keep up with the number of security reports and AI
> generated patches in the kernel. NFC is infamous for being a huge CVE
> magnet. We need someone to step up as a maintainer, create an NFC tree
> and handle all the incoming submissions. Send us (or Linus if you
> prefer) periodic PRs, like WiFi, Bluetooth etc. do. If that does not
> happen I'm afraid we'll have to move the NFC code out of the tree, 
> put it up on GH or some such, and let it accumulate CVEs there..
>
> I'm planning to send a PR to Linus to shed the unmaintained code early
> next week. We need to have a maintainer established by then.

Thanks for asking, but I'm busy renovating my house, sorry. I
couldn't put much work into that. The former is already stressful
enough :)

-michael



* [PATCH net v4 0/2] net: airoha: Fix airoha_qdma_cleanup_tx_queue() processing
From: Lorenzo Bianconi @ 2026-04-17  6:36 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Lorenzo Bianconi, Simon Horman
  Cc: linux-arm-kernel, linux-mediatek, netdev

Add missing bits in airoha_qdma_cleanup_tx_queue routine.
Fix airoha_qdma_cleanup_tx_queue processing errors introduced in commit
'3f47e67dff1f7 ("net: airoha: Add the capability to consume out-of-order
DMA tx descriptors")'.

---
Changes in v4:
- Drop patch 2/3 to move entries to queue head in case of DMA mapping
  failure in airoha_dev_xmit().
- Link to v3: https://lore.kernel.org/r/20260416-airoha_qdma_cleanup_tx_queue-fix-net-v3-0-2b69f5788580@kernel.org

Changes in v3:
- Move ndesc initialization fix in a dedicated patch.
- Add patch 2/3 to move entries to queue head in case of DMA mapping
  failure in airoha_dev_xmit().
- Cosmetics.
- Link to v2: https://lore.kernel.org/r/20260414-airoha_qdma_cleanup_tx_queue-fix-net-v2-1-875de57cc022@kernel.org

Changes in v2:
- Move q->ndesc initialization at end of airoha_qdma_init_tx routine in
  order to avoid any possible NULL pointer dereference in
  airoha_qdma_cleanup_tx_queue()
- Check if q->tx_list is empty in airoha_qdma_cleanup_tx_queue()
- Link to v1: https://lore.kernel.org/r/20260410-airoha_qdma_cleanup_tx_queue-fix-net-v1-1-b7171c8f1e78@kernel.org

---
Lorenzo Bianconi (2):
      net: airoha: Move ndesc initialization at end of airoha_qdma_init_tx()
      net: airoha: Add missing bits in airoha_qdma_cleanup_tx_queue()

 drivers/net/ethernet/airoha/airoha_eth.c | 40 +++++++++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 6 deletions(-)
---
base-commit: 82c21069028c5db3463f851ae8ac9cc2e38a3827
change-id: 20260410-airoha_qdma_cleanup_tx_queue-fix-net-93375f5ee80f

Best regards,
-- 
Lorenzo Bianconi <lorenzo@kernel.org>



* [PATCH net v4 1/2] net: airoha: Move ndesc initialization at end of airoha_qdma_init_tx()
From: Lorenzo Bianconi @ 2026-04-17  6:36 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Lorenzo Bianconi, Simon Horman
  Cc: linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260417-airoha_qdma_cleanup_tx_queue-fix-net-v4-0-e04bcc2c9642@kernel.org>

If queue entry list allocation fails in airoha_qdma_init_tx_queue(),
airoha_qdma_cleanup_tx_queue() will trigger a NULL pointer dereference
accessing the queue entry array. The issue is due to the early ndesc
initialization in airoha_qdma_init_tx_queue(). Fix the issue by moving
the ndesc initialization to the end of airoha_qdma_init_tx_queue().
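
The ordering problem can be illustrated outside the driver. The following
is only a minimal user-space sketch, assuming the queue structure starts
out zeroed; struct queue, queue_init() and queue_cleanup() are hypothetical
stand-ins and not the airoha code:

```c
#include <assert.h>
#include <stdlib.h>

struct entry {
	void *buf;
};

struct queue {
	struct entry *entry;	/* NULL until allocation succeeds */
	int ndesc;		/* number of valid entries, 0 until init completes */
};

/* Hypothetical init: fail_alloc simulates an allocation failure. */
int queue_init(struct queue *q, int size, int fail_alloc)
{
	assert(q && size > 0);

	q->entry = fail_alloc ? NULL : calloc((size_t)size, sizeof(*q->entry));
	if (!q->entry)
		return -1;	/* ndesc is still 0 here */

	q->ndesc = size;	/* publish the count only after success */
	return 0;
}

/* Hypothetical cleanup: returns how many entries were visited. */
int queue_cleanup(struct queue *q)
{
	int i, visited = 0;

	for (i = 0; i < q->ndesc; i++) {
		q->entry[i].buf = NULL;
		visited++;
	}
	free(q->entry);
	q->entry = NULL;
	q->ndesc = 0;
	return visited;
}
```

With the count published last, a cleanup call after a failed init iterates
zero times instead of indexing a NULL array.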

Fixes: 3f47e67dff1f7 ("net: airoha: Add the capability to consume out-of-order DMA tx descriptors")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/airoha/airoha_eth.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index e1ab15f1ee7d..690bfaf8d7d9 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -954,27 +954,27 @@ static int airoha_qdma_init_tx_queue(struct airoha_queue *q,
 	dma_addr_t dma_addr;
 
 	spin_lock_init(&q->lock);
-	q->ndesc = size;
 	q->qdma = qdma;
 	q->free_thr = 1 + MAX_SKB_FRAGS;
 	INIT_LIST_HEAD(&q->tx_list);
 
-	q->entry = devm_kzalloc(eth->dev, q->ndesc * sizeof(*q->entry),
+	q->entry = devm_kzalloc(eth->dev, size * sizeof(*q->entry),
 				GFP_KERNEL);
 	if (!q->entry)
 		return -ENOMEM;
 
-	q->desc = dmam_alloc_coherent(eth->dev, q->ndesc * sizeof(*q->desc),
+	q->desc = dmam_alloc_coherent(eth->dev, size * sizeof(*q->desc),
 				      &dma_addr, GFP_KERNEL);
 	if (!q->desc)
 		return -ENOMEM;
 
-	for (i = 0; i < q->ndesc; i++) {
+	for (i = 0; i < size; i++) {
 		u32 val = FIELD_PREP(QDMA_DESC_DONE_MASK, 1);
 
 		list_add_tail(&q->entry[i].list, &q->tx_list);
 		WRITE_ONCE(q->desc[i].ctrl, cpu_to_le32(val));
 	}
+	q->ndesc = size;
 
 	/* xmit ring drop default setting */
 	airoha_qdma_set(qdma, REG_TX_RING_BLOCKING(qid),

-- 
2.53.0



* [PATCH net v4 2/2] net: airoha: Add missing bits in airoha_qdma_cleanup_tx_queue()
From: Lorenzo Bianconi @ 2026-04-17  6:36 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Lorenzo Bianconi, Simon Horman
  Cc: linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260417-airoha_qdma_cleanup_tx_queue-fix-net-v4-0-e04bcc2c9642@kernel.org>

Similar to airoha_qdma_cleanup_rx_queue(), reset DMA TX descriptors in
airoha_qdma_cleanup_tx_queue routine. Moreover, reset TX_DMA_IDX to
TX_CPU_IDX to notify the NIC the QDMA TX ring is empty.

Fixes: 23020f0493270 ("net: airoha: Introduce ethernet support for EN7581 SoC")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/airoha/airoha_eth.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index 690bfaf8d7d9..6d9f82c677a0 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -1039,12 +1039,15 @@ static int airoha_qdma_init_tx(struct airoha_qdma *qdma)
 
 static void airoha_qdma_cleanup_tx_queue(struct airoha_queue *q)
 {
-	struct airoha_eth *eth = q->qdma->eth;
-	int i;
+	struct airoha_qdma *qdma = q->qdma;
+	struct airoha_eth *eth = qdma->eth;
+	int i, qid = q - &qdma->q_tx[0];
+	u16 index = 0;
 
 	spin_lock_bh(&q->lock);
 	for (i = 0; i < q->ndesc; i++) {
 		struct airoha_queue_entry *e = &q->entry[i];
+		struct airoha_qdma_desc *desc = &q->desc[i];
 
 		if (!e->dma_addr)
 			continue;
@@ -1055,8 +1058,33 @@ static void airoha_qdma_cleanup_tx_queue(struct airoha_queue *q)
 		e->dma_addr = 0;
 		e->skb = NULL;
 		list_add_tail(&e->list, &q->tx_list);
+
+		/* Reset DMA descriptor */
+		WRITE_ONCE(desc->ctrl, 0);
+		WRITE_ONCE(desc->addr, 0);
+		WRITE_ONCE(desc->data, 0);
+		WRITE_ONCE(desc->msg0, 0);
+		WRITE_ONCE(desc->msg1, 0);
+		WRITE_ONCE(desc->msg2, 0);
+
 		q->queued--;
 	}
+
+	if (!list_empty(&q->tx_list)) {
+		struct airoha_queue_entry *e;
+
+		e = list_first_entry(&q->tx_list, struct airoha_queue_entry,
+				     list);
+		index = e - q->entry;
+	}
+	/* Set TX_DMA_IDX to TX_CPU_IDX to notify the hw the QDMA TX ring is
+	 * empty.
+	 */
+	airoha_qdma_rmw(qdma, REG_TX_CPU_IDX(qid), TX_RING_CPU_IDX_MASK,
+			FIELD_PREP(TX_RING_CPU_IDX_MASK, index));
+	airoha_qdma_rmw(qdma, REG_TX_DMA_IDX(qid), TX_RING_DMA_IDX_MASK,
+			FIELD_PREP(TX_RING_DMA_IDX_MASK, index));
+
 	spin_unlock_bh(&q->lock);
 }
 

-- 
2.53.0



* [PATCH v4 net] ax25: fix OOB read after address header strip in ax25_rcv()
From: Ashutosh Desai @ 2026-04-17  6:54 UTC (permalink / raw)
  To: netdev
  Cc: linux-hams, jreuter, davem, edumazet, kuba, pabeni, horms, stable,
	linux-kernel, david.laight.linux, Ashutosh Desai

A crafted AX.25 frame with a valid address header but no control or PID
bytes causes skb->len to drop to zero after skb_pull() strips the
address header. The subsequent reads of skb->data[0] and skb->data[1]
are then out of bounds.

Linearize the skb at entry to ax25_rcv() so all subsequent accesses to
skb->data are safe. Then check skb->len before reading the control and
PID bytes, discarding frames that are too short.
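
The length check can be illustrated on a flat buffer. This is only a
user-space sketch under the assumption that the frame is already
contiguous in memory (which is what linearizing guarantees for the skb);
read_ctl_pid() and its parameters are hypothetical and not part of
ax25_rcv():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: skip addr_len bytes of address header, then read
 * the control and PID bytes only if at least two bytes remain.
 * Returns 0 on success, -1 if the frame is too short.
 */
int read_ctl_pid(const unsigned char *frame, size_t len, size_t addr_len,
		 unsigned char *ctl, unsigned char *pid)
{
	assert(frame && ctl && pid);

	if (len < addr_len)
		return -1;	/* address header itself is truncated */

	frame += addr_len;	/* equivalent of stripping the header */
	len -= addr_len;

	if (len < 2)
		return -1;	/* no room for control + PID: discard */

	*ctl = frame[0];
	*pid = frame[1];
	return 0;
}
```

A frame that consists of the address header alone, or of the header plus
a single byte, is rejected before frame[0] or frame[1] is touched.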

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
---
v4: linearize skb at entry to ax25_rcv(); replace pskb_may_pull() with
    skb->len check
v3: remove incorrect Suggested-by; add symptom, Fixes, Cc stable
v2: use pskb_may_pull(skb, 2) instead of skb->len < 2

Link to v3: https://lore.kernel.org/netdev/20260415063654.3831353-1-ashutoshdesai993@gmail.com/
Link to v2: https://lore.kernel.org/netdev/20260409152400.2219716-1-ashutoshdesai993@gmail.com/
Link to v1: https://lore.kernel.org/netdev/20260409012235.2049389-1-ashutoshdesai993@gmail.com/

 net/ax25/ax25_in.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/ax25/ax25_in.c b/net/ax25/ax25_in.c
index d75b3e9ed93d..d14ccebf9cdd 100644
--- a/net/ax25/ax25_in.c
+++ b/net/ax25/ax25_in.c
@@ -190,6 +190,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
 	ax25_cb *ax25;
 	ax25_dev *ax25_dev;
 
+	if (skb_linearize(skb))
+		goto free;
+
 	/*
 	 *	Process the AX.25/LAPB frame.
 	 */
@@ -217,6 +220,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
 	 */
 	skb_pull(skb, ax25_addr_size(&dp));
 
+	if (skb->len < 2)
+		goto free;
+
 	/* For our port addresses ? */
 	if (ax25cmp(&dest, dev_addr) == 0 && dp.lastrepeat + 1 == dp.ndigi)
 		mine = 1;
-- 
2.34.1



* Re: TCP default settings (bugzilla)
From: plantegg ren @ 2026-04-17  7:01 UTC (permalink / raw)
  To: stephen; +Cc: netdev

Hi,

One more real-world data point that just happened two weeks ago,
directly related to tcp_keepalive_time.

AWS recently rolled out Nitro V6 (8th-gen EC2 instances) which reduced
the ENI connection tracking timeout from 432000 seconds (5 days) to
just 350 seconds:

  https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/security-group-connection-tracking.html

Our MySQL/HikariCP connection pools started seeing intermittent timeout
errors every 20-30 minutes after migrating to 8th-gen instances. We
captured packets on both client and server simultaneously. Here is what
we found on a single connection (idle for 818 seconds, well past the
350-second ENI timeout):

Server side -- MySQL receives the request and sends responses normally:

  #270  71.51s  10.23.99.71 -> 172.20.64.240  [ACK]     last activity
                  ~~~ connection idle for 818 seconds ~~~
  #271 889.94s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  client request arrives
  #272 889.94s  172.20.64.240 -> 10.23.99.71  [PSH,ACK] len=11 server responds OK
  #275 890.15s  172.20.64.240 -> 10.23.99.71  [PSH,ACK] len=11 server retransmits
  #278 890.59s  172.20.64.240 -> 10.23.99.71  [PSH,ACK] len=11 server retransmits
  #281 891.02s  172.20.64.240 -> 10.23.99.71  [PSH,ACK] len=11 server retransmits
    ... (server keeps retransmitting, client never ACKs)

Client side -- sends request, but NEVER receives any server response:

  #267  71.51s  10.23.99.71 -> 172.20.64.240  [ACK]     last activity
                  ~~~ connection idle for 818 seconds ~~~
  #268 889.94s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  sends request
  #269 890.15s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 1
  #270 890.37s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 2
  #271 890.79s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 3
  #272 891.65s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 4
  #273 893.38s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 5
  #274 894.94s  10.23.99.71 -> 172.20.64.240  [FIN,ACK]         gives up

  Zero packets from 172.20.64.240 after the idle gap. Zero RSTs.

The ENI silently drops all inbound packets (server -> client) because
the connection tracking entry expired after 350 seconds. Outbound
packets (client -> server) still pass through, so the server receives
the request and responds -- but its responses are black-holed by the
ENI. No RST is sent, so both sides are completely unaware.

If tcp_keepalive_time were lower than 350 seconds, the keepalive probes
would have kept the ENI tracking entry alive, and none of this would
have happened.
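
For illustration only (this is not taken from the capture above): an
application that cannot wait for a different system-wide default can
lower the keepalive idle time per socket. A minimal sketch, assuming
Linux and the socket options documented in tcp(7); the helper name
enable_keepalive() and the 300/30/5 values used below are hypothetical,
picked only so that the first probe goes out before a 350-second
tracking timeout:

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable keepalive on one TCP socket and override
 * the system-wide tcp_keepalive_* defaults for that socket only.
 * Returns 0 on success, -1 on error (errno is set by setsockopt()).
 */
int enable_keepalive(int fd, int idle_s, int intvl_s, int probes)
{
	int on = 1;

	assert(idle_s > 0 && intvl_s > 0 && probes > 0);

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	/* Idle seconds before the first probe (per-socket tcp_keepalive_time) */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
		return -1;
	/* Seconds between unanswered probes (per-socket tcp_keepalive_intvl) */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
		return -1;
	/* Unanswered probes before the connection is dropped */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
		return -1;
	return 0;
}
```

Called as enable_keepalive(fd, 300, 30, 5) right after socket() or
accept(), the first probe would go out after 300 idle seconds instead of
7200.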

The trend is clear -- middlebox idle timeouts are getting shorter (AWS
went from 432000s to 350s overnight), while tcp_keepalive_time has
stayed at 7200 seconds for decades. The gap is widening.

Xijun


* Re: [PATCH net-next 5/6] net: stmmac: move PHY handling out of __stmmac_open()/release()
From: Maxime Chevallier @ 2026-04-17  7:11 UTC (permalink / raw)
  To: Russell King (Oracle), Alexander Stein
  Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, linux-arm-kernel,
	linux-stm32, Maxime Coquelin, netdev, Paolo Abeni
In-Reply-To: <aeDSTIS9-TDSihbX@shell.armlinux.org.uk>

Hi,

On 16/04/2026 14:13, Russell King (Oracle) wrote:
> On Thu, Apr 16, 2026 at 02:02:53PM +0200, Alexander Stein wrote:
>> Hi Russell,
>>
>> Am Donnerstag, 16. April 2026, 12:49:25 CEST schrieb Russell King (Oracle):
>>> On Thu, Apr 16, 2026 at 08:20:13AM +0200, Alexander Stein wrote:
>>>> Am Mittwoch, 15. April 2026, 14:59:32 CEST schrieb Russell King (Oracle):
>>>>> On Wed, Apr 15, 2026 at 08:08:40AM +0200, Alexander Stein wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Am Dienstag, 23. September 2025, 13:26:19 CEST schrieb Russell King (Oracle):
>>>>>>> Move the PHY attachment/detachment from the network driver out of
>>>>>>> __stmmac_open() and __stmmac_release() into stmmac_open() and
>>>>>>> stmmac_release() where these actions will only happen when the
>>>>>>> interface is administratively brought up or down. It does not make
>>>>>>> sense to detach and re-attach the PHY during a change of MTU.
>>>>>>
>>>>>> Sorry for coming up now. But I recently noticed this commit breaks changing
>>>>>> the MTU on i.MX8MP. Once I simply change the MTU I run into some DMA error:
>>>>>> $ ip link set dev end1 mtu 1400
>>>>>> imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-0
>>>>>> imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-1
>>>>>> imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-2
>>>>>> imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-3
>>>>>> imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-4
>>>>>> imx-dwmac 30bf0000.ethernet end1: Link is Down
>>>>>> imx-dwmac 30bf0000.ethernet end1: Failed to reset the dma
>>>>>> imx-dwmac 30bf0000.ethernet end1: stmmac_hw_setup: DMA engine initialization failed
>>>>>
>>>>> This basically means that a clock is missing. Please provide more
>>>>> information:
>>>>>
>>>>> - what kernel version are you using?
>>>>
>>>> Currently I am using v6.18.22.
>>>> $ ethtool -i end1
>>>> driver: st_gmac
>>>> version: 6.18.22
>>>> firmware-version: 
>>>> expansion-rom-version: 
>>>> bus-info: 30bf0000.ethernet
>>>> supports-statistics: yes
>>>> supports-test: no
>>>> supports-eeprom-access: no
>>>> supports-register-dump: yes
>>>> supports-priv-flags: no
>>>>
>>>>> - has EEE been negotiated?
>>>>
>>>> No. It is marked as not supported
>>>>
>>>> $ ethtool --show-eee end1
>>>> EEE settings for end1:
>>>>         EEE status: not supported
>>>>
>>>>> - does the problem persist when EEE is disabled?
>>>>
>>>> As EEE is not supported the problem occurs even with EEE disabled.
>>>>
>>>>> - which PHY is attached to stmmac?
>>>>
>>>> It is a TI DP83867.
>>>>
>>>> imx-dwmac 30bf0000.ethernet eth1: PHY [stmmac-1:03] driver [TI DP83867] (irq=136)
>>>>
>>>>> - which PHY interface mode is being used to connect the PHY to stmmac?
>>>>
>>>> For this interface
>>>>> phy-mode = "rgmii-id";
>>>> is set.
>>>>
>>>> In case it is helpful. My platform is arch/arm64/boot/dts/freescale/imx8mp-tqma8mpql-mba8mpxl.dts
>>>> Thanks for assisting. If there are further questions, don't hesitate to ask.
>>>
>>> Thanks.
>>>
>>> So, as best I can determine at the moment, we end up with the following
>>> sequence:
>>>
>>> stmmac_change_mtu()
>>>  __stmmac_release()
>>>   phylink_stop()
>>>    phy_stop()
>>>     phy->state = PHY_HALTED
>>>     _phy_state_machine() returns PHY_STATE_WORK_SUSPEND
>>>     _phy_state_machine_post_work()
>>>      phy_suspend()
>>>       genphy_suspend()
>>>        phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN)
>>>
>>> With the DP83867, this causes most of the PHY to be powered down, thus
>>> stopping the clocks, and this causes the stmmac reset to time out.
>>>
>>> Prior to this commit, we would have called phylink_disconnect_phy()
>>> immediately after phylink_stop(), but I can see nothing that would
>>> be affected by this change there (since that also calls
>>> phy_suspend(), but as the PHY is already suspended, this becomes a
>>> no-op.)
>>>
>>> However, __stmmac_open() would have called stmmac_init_phy(), which
>>> would reattach the PHY. This would have called phy_init_hw(), 
>>> resetting the PHY, and phy_resume() which would ensure that the
>>> PDOWN bit is clear - thus clocks would be running.
>>>
>>> As a hack, please can you try calling phylink_prepare_resume()
>>> between the __stmmac_release() and __stmmac_open() in
>>> stmmac_change_mtu(). This should resume the PHY, thus restoring the
>>> clocks necessary for stmmac to reset.
>>
>> I tried the following patch. This works as you suspected.
> 
> Brilliant, thanks for proving the theory why it broke.
> 
> I'll have a think about the best way to solve this, because
> phylink_prepare_resume() is supposed to be paired with phylink_resume()
> and that isn't the case here.
> 
> Please bear with me as my availability for looking at the kernel is
> very unpredictable at present (family health issues.)

FWIW I am able to reproduce this with imx8mp + ksz9131

I can give this a try as Russell isn't available.

Maxime

> 



* Re: Path forward for NFC in the kernel
From: Krzysztof Kozlowski @ 2026-04-17  7:18 UTC (permalink / raw)
  To: Jakub Kicinski, Michael Thalmeier, Raymond Hackley, Michael Walle,
	Bongsu Jeon, Mark Greer, David Heidelberg
  Cc: netdev
In-Reply-To: <20260416101041.4c533306@kernel.org>

On 16/04/2026 19:10, Jakub Kicinski wrote:
> Hi folks!
> 
> We are struggling to keep up with the number of security reports and AI
> generated patches in the kernel. NFC is infamous for being a huge CVE
> magnet. We need someone to step up as a maintainer, create an NFC tree
> and handle all the incoming submissions. Send us (or Linus if you
> prefer) periodic PRs, like WiFi, Bluetooth etc. do. If that does not
> happen I'm afraid we'll have to move the NFC code out of the tree, 
> put it up on GH or some such, and let it accumulate CVEs there..
> 
> I'm planning to send a PR to Linus to shed the unmaintained code early
> next week. We need to have a maintainer established by then.

+Cc David Heidelberg, who was recently trying to use the Linux NFC stack.

Just "collecting" patches is not a big deal, I could do this, but
actually reviewing the patches with necessary due diligence is the
effort I could not provide in a reasonable time frame. And picking up
patches without proper review feels risky...

NFC has a long history of issues, first mostly pointed out by syzbot but
now apparently by AI tools. The code base is quite old, with no major
improvements or testing happening, and not in the sense of "oh, it's
stable and working like the 'cp' command" but rather "no one knows how
many bugs are stacked on top of each other or whether it actually still
works".

Bugs reported by syzbot and AI tools encourage random drive-by fixes by
people who do not test the code, so a particular bug report might get
fixed while, for example, NFC stops working and no one notices.

Does anyone know if the NFC stack/drivers actually work fine? Did
anyone test actual devices?

If not, then moving to GitHub would be even more reasonable.

Another point is that, AFAIU, most real-world devices, like
Android-based phones, don't use the Linux NFC stack but their own custom
HAL/user-space libraries and drivers. Some other non-Android
projects use the libnfc userspace library, which seems to receive only
bug fixes (https://github.com/nfc-tools/libnfc/commits/master/).

Best regards,
Krzysztof


* [PATCH net v2] ixgbe: only access vfinfo and mv_list under RCU lock
From: Corinna Vinschen @ 2026-04-17  7:28 UTC (permalink / raw)
  To: intel-wired-lan, netdev; +Cc: Corinna Vinschen
In-Reply-To: <20260416084227.3787828-1-vinschen@redhat.com>

Commit 1e53834ce541d ("ixgbe: Add locking to prevent panic when setting
sriov_numvfs to zero") added a spinlock to the adapter info.  The reason
at the time was an observed crash when ixgbe_disable_sriov() freed the
adapter->vfinfo array while the interrupt driven function ixgbe_msg_task()
was handling VF messages.

Recent stability testing turned up another crash, which is very easily
reproducible:

  while true
  do
    for numvfs in 5 0
    do
      echo $numvfs > /sys/class/net/eth0/device/sriov_numvfs
    done
  done

This crashed almost always within the first two hundred runs with
a NULL pointer deref while running the ixgbe_service_task() workqueue:

[ 5052.036491] BUG: kernel NULL pointer dereference, address: 0000000000000258
[ 5052.043454] #PF: supervisor read access in kernel mode
[ 5052.048594] #PF: error_code(0x0000) - not-present page
[ 5052.053734] PGD 0 P4D 0
[ 5052.056272] Oops: Oops: 0000 #1 SMP NOPTI
[ 5052.060459] CPU: 2 UID: 0 PID: 132253 Comm: kworker/u96:0 Kdump: loaded Not tainted 6.12.0-180.el10.x86_64 #1 PREEMPT(voluntary)
[ 5052.072100] Hardware name: Dell Inc. PowerEdge R740/0DY2X0, BIOS 2.12.2 07/09/2021
[ 5052.079664] Workqueue: ixgbe ixgbe_service_task [ixgbe]
[ 5052.084907] RIP: 0010:ixgbe_update_stats+0x8b1/0xb40 [ixgbe]
[ 5052.090585] Code: 21 56 50 49 8b b6 18 26 00 00 4c 01 fe 48 09 46 50 42 8d 34 a5 00 83 00 00 e8 cb 7a ff ff 49 8b b6 18 26 00 00 89 c0 4c 01 fe <48> 3b 86 88 00 00 00 73 18 48 b9 00 00 00 00 01 00 00 00 48 01 4e
[ 5052.109331] RSP: 0018:ffffd5f1e8a6bd88 EFLAGS: 00010202
[ 5052.114558] RAX: 0000000000000000 RBX: ffff8f49b22b14a0 RCX: 000000000000023c
[ 5052.121689] RDX: ffffffff00000000 RSI: 00000000000001d0 RDI: ffff8f49b22b14a0
[ 5052.128823] RBP: 000000000000109c R08: 0000000000000000 R09: 0000000000000000
[ 5052.135955] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
[ 5052.143086] R13: 0000000000008410 R14: ffff8f49b22b01a0 R15: 00000000000001d0
[ 5052.150221] FS:  0000000000000000(0000) GS:ffff8f58bfc80000(0000) knlGS:0000000000000000
[ 5052.158307] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5052.164054] CR2: 0000000000000258 CR3: 0000000bf2624006 CR4: 00000000007726f0
[ 5052.171187] PKRU: 55555554
[ 5052.173898] Call Trace:
[ 5052.176351]  <TASK>
[ 5052.178457]  ? show_trace_log_lvl+0x1b0/0x2f0
[ 5052.182816]  ? show_trace_log_lvl+0x1b0/0x2f0
[ 5052.187177]  ? ixgbe_watchdog_subtask+0x1a1/0x230 [ixgbe]
[ 5052.192591]  ? __die_body.cold+0x8/0x12
[ 5052.196433]  ? page_fault_oops+0x148/0x160
[ 5052.200532]  ? exc_page_fault+0x7f/0x150
[ 5052.204458]  ? asm_exc_page_fault+0x26/0x30
[ 5052.208643]  ? ixgbe_update_stats+0x8b1/0xb40 [ixgbe]
[ 5052.213714]  ? ixgbe_update_stats+0x8a5/0xb40 [ixgbe]
[ 5052.218784]  ixgbe_watchdog_subtask+0x1a1/0x230 [ixgbe]
[ 5052.224026]  ixgbe_service_task+0x15a/0x3f0 [ixgbe]
[ 5052.228916]  process_one_work+0x177/0x330
[ 5052.232928]  worker_thread+0x256/0x3a0
[ 5052.236681]  ? __pfx_worker_thread+0x10/0x10
[ 5052.240952]  kthread+0xfa/0x240
[ 5052.244099]  ? __pfx_kthread+0x10/0x10
[ 5052.247852]  ret_from_fork+0x34/0x50
[ 5052.251429]  ? __pfx_kthread+0x10/0x10
[ 5052.255185]  ret_from_fork_asm+0x1a/0x30
[ 5052.259112]  </TASK>

The first simple patch, just adding spinlocking to ixgbe_update_stats()
while reading from adapter->vfinfo, did not fix the problem, it just
moved it elsewhere: I could now reproduce the same kind of crash in
ixgbe_restore_vf_multicasts().

But adding more spinlocking doesn't really cut it.  One reason is that
ixgbe_restore_vf_multicasts() is called from within ixgbe_msg_task()
with the spinlock held, as well as from outside without locking.

Additionally, given that ixgbe_disable_sriov() is the only call changing
adapter->vfinfo, and given ixgbe_disable_sriov() is called very
seldom compared to other actions in the driver, just adding more
spinlocks would unnecessarily occupy the driver with spinning when
multiple functions accessing adapter->vfinfo are running in parallel.

So this patch drops the spinlock in favor of RCU and uses it throughout
the driver.

While changing this, it seems prudent to do the same for the
adapter->mv_list array, which is allocated and freed at the same time as
adapter->vfinfo, although no crash was observed there.

Fixes: 1e53834ce541d ("ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero")
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
---
v2: always return 0 from ixgbe_ndo_get_vf_stats so as not to break
    'ip link show dev'

Interdiff against v1:
  diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
  index 6ee8c2a140c2..e0a986f1c96a 100644
  --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
  +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
  @@ -9797,7 +9797,7 @@ static int ixgbe_ndo_get_vf_stats(struct net_device *netdev, int vf,
   	}
   	rcu_read_unlock();
   
  -	return vfinfo ? 0 : -EINVAL;
  +	return 0;
   }
   
   #ifdef CONFIG_IXGBE_DCB

 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |   7 +-
 .../net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c   |  36 +-
 .../net/ethernet/intel/ixgbe/ixgbe_ethtool.c  |  44 +-
 .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c    |  17 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 227 +++++---
 .../net/ethernet/intel/ixgbe/ixgbe_sriov.c    | 547 ++++++++++++------
 6 files changed, 592 insertions(+), 286 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 9b8217523fd2..8849b9f42bf6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -210,6 +210,7 @@ struct vf_stats {
 };
 
 struct vf_data_storage {
+	struct rcu_head rcu_head;
 	struct pci_dev *vfdev;
 	unsigned char vf_mac_addresses[ETH_ALEN];
 	u16 vf_mc_hashes[IXGBE_MAX_VF_MC_ENTRIES];
@@ -240,6 +241,7 @@ enum ixgbevf_xcast_modes {
 };
 
 struct vf_macvlans {
+	struct rcu_head rcu_head;
 	struct list_head l;
 	int vf;
 	bool free;
@@ -808,10 +810,10 @@ struct ixgbe_adapter {
 	/* SR-IOV */
 	DECLARE_BITMAP(active_vfs, IXGBE_MAX_VF_FUNCTIONS);
 	unsigned int num_vfs;
-	struct vf_data_storage *vfinfo;
+	struct vf_data_storage __rcu *vfinfo;
 	int vf_rate_link_speed;
 	struct vf_macvlans vf_mvs;
-	struct vf_macvlans *mv_list;
+	struct vf_macvlans __rcu *mv_list;
 
 	u32 timer_event_accumulator;
 	u32 vferr_refcount;
@@ -844,7 +846,6 @@ struct ixgbe_adapter {
 #ifdef CONFIG_IXGBE_IPSEC
 	struct ixgbe_ipsec *ipsec;
 #endif /* CONFIG_IXGBE_IPSEC */
-	spinlock_t vfs_lock;
 };
 
 struct ixgbe_netdevice_priv {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c
index 382d097e4b11..9a84cfc09120 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c
@@ -640,17 +640,21 @@ static int ixgbe_dcbnl_ieee_setapp(struct net_device *dev,
 	/* VF devices should use default UP when available */
 	if (app->selector == IEEE_8021QAZ_APP_SEL_ETHERTYPE &&
 	    app->protocol == 0) {
+		struct vf_data_storage *vfinfo;
 		int vf;
 
 		adapter->default_up = app->priority;
 
-		for (vf = 0; vf < adapter->num_vfs; vf++) {
-			struct vf_data_storage *vfinfo = &adapter->vfinfo[vf];
-
-			if (!vfinfo->pf_qos)
-				ixgbe_set_vmvir(adapter, vfinfo->pf_vlan,
-						app->priority, vf);
-		}
+		rcu_read_lock();
+		vfinfo = rcu_dereference(adapter->vfinfo);
+		if (vfinfo)
+			for (vf = 0; vf < adapter->num_vfs; vf++) {
+				if (!vfinfo[vf].pf_qos)
+					ixgbe_set_vmvir(adapter,
+							vfinfo[vf].pf_vlan,
+							app->priority, vf);
+			}
+		rcu_read_unlock();
 	}
 
 	return 0;
@@ -683,19 +687,23 @@ static int ixgbe_dcbnl_ieee_delapp(struct net_device *dev,
 	/* IF default priority is being removed clear VF default UP */
 	if (app->selector == IEEE_8021QAZ_APP_SEL_ETHERTYPE &&
 	    app->protocol == 0 && adapter->default_up == app->priority) {
+		struct vf_data_storage *vfinfo;
 		int vf;
 		long unsigned int app_mask = dcb_ieee_getapp_mask(dev, app);
 		int qos = app_mask ? find_first_bit(&app_mask, 8) : 0;
 
 		adapter->default_up = qos;
 
-		for (vf = 0; vf < adapter->num_vfs; vf++) {
-			struct vf_data_storage *vfinfo = &adapter->vfinfo[vf];
-
-			if (!vfinfo->pf_qos)
-				ixgbe_set_vmvir(adapter, vfinfo->pf_vlan,
-						qos, vf);
-		}
+		rcu_read_lock();
+		vfinfo = rcu_dereference(adapter->vfinfo);
+		if (vfinfo)
+			for (vf = 0; vf < adapter->num_vfs; vf++) {
+				if (!vfinfo[vf].pf_qos)
+					ixgbe_set_vmvir(adapter,
+							vfinfo[vf].pf_vlan,
+							qos, vf);
+			}
+		rcu_read_unlock();
 	}
 
 	return err;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index ba049b3a9609..b77317476af4 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -2265,21 +2265,28 @@ static void ixgbe_diag_test(struct net_device *netdev,
 		struct ixgbe_hw *hw = &adapter->hw;
 
 		if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
+			struct vf_data_storage *vfinfo;
 			int i;
-			for (i = 0; i < adapter->num_vfs; i++) {
-				if (adapter->vfinfo[i].clear_to_send) {
-					netdev_warn(netdev, "offline diagnostic is not supported when VFs are present\n");
-					data[0] = 1;
-					data[1] = 1;
-					data[2] = 1;
-					data[3] = 1;
-					data[4] = 1;
-					eth_test->flags |= ETH_TEST_FL_FAILED;
-					clear_bit(__IXGBE_TESTING,
-						  &adapter->state);
-					return;
+
+			rcu_read_lock();
+			vfinfo = rcu_dereference(adapter->vfinfo);
+			if (vfinfo)
+				for (i = 0; i < adapter->num_vfs; i++) {
+					if (vfinfo[i].clear_to_send) {
+						netdev_warn(netdev, "offline diagnostic is not supported when VFs are present\n");
+						data[0] = 1;
+						data[1] = 1;
+						data[2] = 1;
+						data[3] = 1;
+						data[4] = 1;
+						eth_test->flags |= ETH_TEST_FL_FAILED;
+						clear_bit(__IXGBE_TESTING,
+							  &adapter->state);
+						rcu_read_unlock();
+						return;
+					}
 				}
-			}
+			rcu_read_unlock();
 		}
 
 		/* Offline tests */
@@ -3700,9 +3707,14 @@ static int ixgbe_set_priv_flags(struct net_device *netdev, u32 priv_flags)
 	if (priv_flags & IXGBE_PRIV_FLAGS_AUTO_DISABLE_VF) {
 		if (adapter->hw.mac.type == ixgbe_mac_82599EB) {
 			/* Reset primary abort counter */
-			for (i = 0; i < adapter->num_vfs; i++)
-				adapter->vfinfo[i].primary_abort_count = 0;
-
+			struct vf_data_storage *vfinfo;
+
+			rcu_read_lock();
+			vfinfo = rcu_dereference(adapter->vfinfo);
+			if (vfinfo)
+				for (i = 0; i < adapter->num_vfs; i++)
+					vfinfo[i].primary_abort_count = 0;
+			rcu_read_unlock();
 			flags2 |= IXGBE_FLAG2_AUTO_DISABLE_VF;
 		} else {
 			e_info(probe,
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index bd397b3d7dea..b524a3a61eb6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -874,6 +874,7 @@ void ixgbe_ipsec_vf_clear(struct ixgbe_adapter *adapter, u32 vf)
 int ixgbe_ipsec_vf_add_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 {
 	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct vf_data_storage *vfinfo;
 	struct xfrm_algo_desc *algo;
 	struct sa_mbx_msg *sam;
 	struct xfrm_state *xs;
@@ -883,7 +884,13 @@ int ixgbe_ipsec_vf_add_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 	int err;
 
 	sam = (struct sa_mbx_msg *)(&msgbuf[1]);
-	if (!adapter->vfinfo[vf].trusted ||
+
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
+	if (!vfinfo[vf].trusted ||
 	    !(adapter->flags2 & IXGBE_FLAG2_VF_IPSEC_ENABLED)) {
 		e_warn(drv, "VF %d attempted to add an IPsec SA\n", vf);
 		err = -EACCES;
@@ -984,11 +991,17 @@ int ixgbe_ipsec_vf_add_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 int ixgbe_ipsec_vf_del_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 {
 	struct ixgbe_ipsec *ipsec = adapter->ipsec;
+	struct vf_data_storage *vfinfo;
 	struct xfrm_state *xs;
 	u32 pfsa = msgbuf[1];
 	u16 sa_idx;
 
-	if (!adapter->vfinfo[vf].trusted) {
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
+	if (!vfinfo[vf].trusted) {
 		e_err(drv, "vf %d attempted to delete an SA\n", vf);
 		return -EPERM;
 	}
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 2646ee6f295f..e0a986f1c96a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1240,20 +1240,26 @@ static void ixgbe_pf_handle_tx_hang(struct ixgbe_ring *tx_ring,
 static void ixgbe_vf_handle_tx_hang(struct ixgbe_adapter *adapter, u16 vf)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 
 	if (adapter->hw.mac.type != ixgbe_mac_e610)
 		return;
 
-	e_warn(drv,
-	       "Malicious Driver Detection tx hang detected on PF %d VF %d MAC: %pM",
-	       hw->bus.func, vf, adapter->vfinfo[vf].vf_mac_addresses);
-
-	adapter->tx_hang_count[vf]++;
-	if (adapter->tx_hang_count[vf] == IXGBE_MAX_TX_VF_HANGS) {
-		ixgbe_set_vf_link_state(adapter, vf,
-					IFLA_VF_LINK_STATE_DISABLE);
-		adapter->tx_hang_count[vf] = 0;
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo) {
+		e_warn(drv,
+		       "Malicious Driver Detection tx hang detected on PF %d VF %d MAC: %pM",
+		       hw->bus.func, vf, vfinfo[vf].vf_mac_addresses);
+
+		adapter->tx_hang_count[vf]++;
+		if (adapter->tx_hang_count[vf] == IXGBE_MAX_TX_VF_HANGS) {
+			ixgbe_set_vf_link_state(adapter, vf,
+						IFLA_VF_LINK_STATE_DISABLE);
+			adapter->tx_hang_count[vf] = 0;
+		}
 	}
+	rcu_read_unlock();
 }
 
 static u32 ixgbe_poll_tx_icache(struct ixgbe_hw *hw, u16 queue, u16 idx)
@@ -4625,6 +4631,7 @@ static void ixgbe_configure_virtualization(struct ixgbe_adapter *adapter)
 	struct ixgbe_hw *hw = &adapter->hw;
 	u16 pool = adapter->num_rx_pools;
 	u32 reg_offset, vf_shift, vmolr;
+	struct vf_data_storage *vfinfo;
 	u32 gcr_ext, vmdctl;
 	int i;
 
@@ -4680,15 +4687,19 @@ static void ixgbe_configure_virtualization(struct ixgbe_adapter *adapter)
 
 	IXGBE_WRITE_REG(hw, IXGBE_GCR_EXT, gcr_ext);
 
-	for (i = 0; i < adapter->num_vfs; i++) {
-		/* configure spoof checking */
-		ixgbe_ndo_set_vf_spoofchk(adapter->netdev, i,
-					  adapter->vfinfo[i].spoofchk_enabled);
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo)
+		for (i = 0; i < adapter->num_vfs; i++) {
+			/* configure spoof checking */
+			ixgbe_ndo_set_vf_spoofchk(adapter->netdev, i,
+						  vfinfo[i].spoofchk_enabled);
 
-		/* Enable/Disable RSS query feature  */
-		ixgbe_ndo_set_vf_rss_query_en(adapter->netdev, i,
-					  adapter->vfinfo[i].rss_query_enabled);
-	}
+			/* Enable/Disable RSS query feature  */
+			ixgbe_ndo_set_vf_rss_query_en(adapter->netdev, i,
+						  vfinfo[i].rss_query_enabled);
+		}
+	rcu_read_unlock();
 }
 
 static void ixgbe_set_rx_buffer_len(struct ixgbe_adapter *adapter)
@@ -6093,35 +6104,40 @@ static void ixgbe_check_media_subtask(struct ixgbe_adapter *adapter)
 static void ixgbe_clear_vf_stats_counters(struct ixgbe_adapter *adapter)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 	int i;
 
-	for (i = 0; i < adapter->num_vfs; i++) {
-		adapter->vfinfo[i].last_vfstats.gprc =
-			IXGBE_READ_REG(hw, IXGBE_PVFGPRC(i));
-		adapter->vfinfo[i].saved_rst_vfstats.gprc +=
-			adapter->vfinfo[i].vfstats.gprc;
-		adapter->vfinfo[i].vfstats.gprc = 0;
-		adapter->vfinfo[i].last_vfstats.gptc =
-			IXGBE_READ_REG(hw, IXGBE_PVFGPTC(i));
-		adapter->vfinfo[i].saved_rst_vfstats.gptc +=
-			adapter->vfinfo[i].vfstats.gptc;
-		adapter->vfinfo[i].vfstats.gptc = 0;
-		adapter->vfinfo[i].last_vfstats.gorc =
-			IXGBE_READ_REG(hw, IXGBE_PVFGORC_LSB(i));
-		adapter->vfinfo[i].saved_rst_vfstats.gorc +=
-			adapter->vfinfo[i].vfstats.gorc;
-		adapter->vfinfo[i].vfstats.gorc = 0;
-		adapter->vfinfo[i].last_vfstats.gotc =
-			IXGBE_READ_REG(hw, IXGBE_PVFGOTC_LSB(i));
-		adapter->vfinfo[i].saved_rst_vfstats.gotc +=
-			adapter->vfinfo[i].vfstats.gotc;
-		adapter->vfinfo[i].vfstats.gotc = 0;
-		adapter->vfinfo[i].last_vfstats.mprc =
-			IXGBE_READ_REG(hw, IXGBE_PVFMPRC(i));
-		adapter->vfinfo[i].saved_rst_vfstats.mprc +=
-			adapter->vfinfo[i].vfstats.mprc;
-		adapter->vfinfo[i].vfstats.mprc = 0;
-	}
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo)
+		for (i = 0; i < adapter->num_vfs; i++) {
+			vfinfo[i].last_vfstats.gprc =
+				IXGBE_READ_REG(hw, IXGBE_PVFGPRC(i));
+			vfinfo[i].saved_rst_vfstats.gprc +=
+				vfinfo[i].vfstats.gprc;
+			vfinfo[i].vfstats.gprc = 0;
+			vfinfo[i].last_vfstats.gptc =
+				IXGBE_READ_REG(hw, IXGBE_PVFGPTC(i));
+			vfinfo[i].saved_rst_vfstats.gptc +=
+				vfinfo[i].vfstats.gptc;
+			vfinfo[i].vfstats.gptc = 0;
+			vfinfo[i].last_vfstats.gorc =
+				IXGBE_READ_REG(hw, IXGBE_PVFGORC_LSB(i));
+			vfinfo[i].saved_rst_vfstats.gorc +=
+				vfinfo[i].vfstats.gorc;
+			vfinfo[i].vfstats.gorc = 0;
+			vfinfo[i].last_vfstats.gotc =
+				IXGBE_READ_REG(hw, IXGBE_PVFGOTC_LSB(i));
+			vfinfo[i].saved_rst_vfstats.gotc +=
+				vfinfo[i].vfstats.gotc;
+			vfinfo[i].vfstats.gotc = 0;
+			vfinfo[i].last_vfstats.mprc =
+				IXGBE_READ_REG(hw, IXGBE_PVFMPRC(i));
+			vfinfo[i].saved_rst_vfstats.mprc +=
+				vfinfo[i].vfstats.mprc;
+			vfinfo[i].vfstats.mprc = 0;
+		}
+	rcu_read_unlock();
 }
 
 static void ixgbe_setup_gpie(struct ixgbe_adapter *adapter)
@@ -6729,15 +6745,22 @@ void ixgbe_down(struct ixgbe_adapter *adapter)
 	timer_delete_sync(&adapter->service_timer);
 
 	if (adapter->num_vfs) {
+		struct vf_data_storage *vfinfo;
+
 		/* Clear EITR Select mapping */
 		IXGBE_WRITE_REG(&adapter->hw, IXGBE_EITRSEL, 0);
 
+		rcu_read_lock();
+		vfinfo = rcu_dereference(adapter->vfinfo);
 		/* Mark all the VFs as inactive */
-		for (i = 0 ; i < adapter->num_vfs; i++)
-			adapter->vfinfo[i].clear_to_send = false;
+		if (vfinfo) {
+			for (i = 0 ; i < adapter->num_vfs; i++)
+				vfinfo[i].clear_to_send = false;
 
-		/* update setting rx tx for all active vfs */
-		ixgbe_set_all_vfs(adapter);
+			/* update setting rx tx for all active vfs */
+			ixgbe_set_all_vfs(adapter);
+		}
+		rcu_read_unlock();
 	}
 
 	/* disable transmits in the hardware now that interrupts are off */
@@ -7001,9 +7024,6 @@ static int ixgbe_sw_init(struct ixgbe_adapter *adapter,
 	/* n-tuple support exists, always init our spinlock */
 	spin_lock_init(&adapter->fdir_perfect_lock);
 
-	/* init spinlock to avoid concurrency of VF resources */
-	spin_lock_init(&adapter->vfs_lock);
-
 #ifdef CONFIG_IXGBE_DCB
 	ixgbe_init_dcb(adapter);
 #endif
@@ -7905,25 +7925,31 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 	 * crazy values.
 	 */
 	if (!test_bit(__IXGBE_RESETTING, &adapter->state)) {
-		for (i = 0; i < adapter->num_vfs; i++) {
-			UPDATE_VF_COUNTER_32bit(IXGBE_PVFGPRC(i),
-						adapter->vfinfo[i].last_vfstats.gprc,
-						adapter->vfinfo[i].vfstats.gprc);
-			UPDATE_VF_COUNTER_32bit(IXGBE_PVFGPTC(i),
-						adapter->vfinfo[i].last_vfstats.gptc,
-						adapter->vfinfo[i].vfstats.gptc);
-			UPDATE_VF_COUNTER_36bit(IXGBE_PVFGORC_LSB(i),
-						IXGBE_PVFGORC_MSB(i),
-						adapter->vfinfo[i].last_vfstats.gorc,
-						adapter->vfinfo[i].vfstats.gorc);
-			UPDATE_VF_COUNTER_36bit(IXGBE_PVFGOTC_LSB(i),
-						IXGBE_PVFGOTC_MSB(i),
-						adapter->vfinfo[i].last_vfstats.gotc,
-						adapter->vfinfo[i].vfstats.gotc);
-			UPDATE_VF_COUNTER_32bit(IXGBE_PVFMPRC(i),
-						adapter->vfinfo[i].last_vfstats.mprc,
-						adapter->vfinfo[i].vfstats.mprc);
-		}
+		struct vf_data_storage *vfinfo;
+
+		rcu_read_lock();
+		vfinfo = rcu_dereference(adapter->vfinfo);
+		if (vfinfo)
+			for (i = 0; i < adapter->num_vfs; i++) {
+				UPDATE_VF_COUNTER_32bit(IXGBE_PVFGPRC(i),
+							vfinfo[i].last_vfstats.gprc,
+							vfinfo[i].vfstats.gprc);
+				UPDATE_VF_COUNTER_32bit(IXGBE_PVFGPTC(i),
+							vfinfo[i].last_vfstats.gptc,
+							vfinfo[i].vfstats.gptc);
+				UPDATE_VF_COUNTER_36bit(IXGBE_PVFGORC_LSB(i),
+							IXGBE_PVFGORC_MSB(i),
+							vfinfo[i].last_vfstats.gorc,
+							vfinfo[i].vfstats.gorc);
+				UPDATE_VF_COUNTER_36bit(IXGBE_PVFGOTC_LSB(i),
+							IXGBE_PVFGOTC_MSB(i),
+							vfinfo[i].last_vfstats.gotc,
+							vfinfo[i].vfstats.gotc);
+				UPDATE_VF_COUNTER_32bit(IXGBE_PVFMPRC(i),
+							vfinfo[i].last_vfstats.mprc,
+							vfinfo[i].vfstats.mprc);
+			}
+		rcu_read_unlock();
 	}
 }
 
@@ -8267,22 +8293,27 @@ static void ixgbe_watchdog_flush_tx(struct ixgbe_adapter *adapter)
 static void ixgbe_bad_vf_abort(struct ixgbe_adapter *adapter, u32 vf)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 
-	if (adapter->hw.mac.type == ixgbe_mac_82599EB &&
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo &&
+	    adapter->hw.mac.type == ixgbe_mac_82599EB &&
 	    adapter->flags2 & IXGBE_FLAG2_AUTO_DISABLE_VF) {
-		adapter->vfinfo[vf].primary_abort_count++;
-		if (adapter->vfinfo[vf].primary_abort_count ==
+		vfinfo[vf].primary_abort_count++;
+		if (vfinfo[vf].primary_abort_count ==
 		    IXGBE_PRIMARY_ABORT_LIMIT) {
 			ixgbe_set_vf_link_state(adapter, vf,
 						IFLA_VF_LINK_STATE_DISABLE);
-			adapter->vfinfo[vf].primary_abort_count = 0;
+			vfinfo[vf].primary_abort_count = 0;
 
 			e_info(drv,
 			       "Malicious Driver Detection event detected on PF %d VF %d MAC: %pM mdd-disable-vf=on",
 			       hw->bus.func, vf,
-			       adapter->vfinfo[vf].vf_mac_addresses);
+			       vfinfo[vf].vf_mac_addresses);
 		}
 	}
+	rcu_read_unlock();
 }
 
 static void ixgbe_check_for_bad_vf(struct ixgbe_adapter *adapter)
@@ -8309,9 +8340,15 @@ static void ixgbe_check_for_bad_vf(struct ixgbe_adapter *adapter)
 
 	/* check status reg for all VFs owned by this PF */
 	for (vf = 0; vf < adapter->num_vfs; ++vf) {
-		struct pci_dev *vfdev = adapter->vfinfo[vf].vfdev;
+		struct vf_data_storage *vfinfo;
+		struct pci_dev *vfdev = NULL;
 		u16 status_reg;
 
+		rcu_read_lock();
+		vfinfo = rcu_dereference(adapter->vfinfo);
+		if (vfinfo)
+			vfdev = vfinfo[vf].vfdev;
+		rcu_read_unlock();
 		if (!vfdev)
 			continue;
 		pci_read_config_word(vfdev, PCI_STATUS, &status_reg);
@@ -9744,15 +9781,21 @@ static int ixgbe_ndo_get_vf_stats(struct net_device *netdev, int vf,
 				  struct ifla_vf_stats *vf_stats)
 {
 	struct ixgbe_adapter *adapter = ixgbe_from_netdev(netdev);
+	struct vf_data_storage *vfinfo;
 
 	if (vf < 0 || vf >= adapter->num_vfs)
 		return -EINVAL;
 
-	vf_stats->rx_packets = adapter->vfinfo[vf].vfstats.gprc;
-	vf_stats->rx_bytes   = adapter->vfinfo[vf].vfstats.gorc;
-	vf_stats->tx_packets = adapter->vfinfo[vf].vfstats.gptc;
-	vf_stats->tx_bytes   = adapter->vfinfo[vf].vfstats.gotc;
-	vf_stats->multicast  = adapter->vfinfo[vf].vfstats.mprc;
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo) {
+		vf_stats->rx_packets = vfinfo[vf].vfstats.gprc;
+		vf_stats->rx_bytes   = vfinfo[vf].vfstats.gorc;
+		vf_stats->tx_packets = vfinfo[vf].vfstats.gptc;
+		vf_stats->tx_bytes   = vfinfo[vf].vfstats.gotc;
+		vf_stats->multicast  = vfinfo[vf].vfstats.mprc;
+	}
+	rcu_read_unlock();
 
 	return 0;
 }
@@ -10071,20 +10114,26 @@ static int handle_redirect_action(struct ixgbe_adapter *adapter, int ifindex,
 {
 	struct ixgbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
 	unsigned int num_vfs = adapter->num_vfs, vf;
+	struct vf_data_storage *vfinfo;
 	struct netdev_nested_priv priv;
 	struct upper_walk_data data;
 	struct net_device *upper;
 
 	/* redirect to a SRIOV VF */
-	for (vf = 0; vf < num_vfs; ++vf) {
-		upper = pci_get_drvdata(adapter->vfinfo[vf].vfdev);
-		if (upper->ifindex == ifindex) {
-			*queue = vf * __ALIGN_MASK(1, ~vmdq->mask);
-			*action = vf + 1;
-			*action <<= ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF;
-			return 0;
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo)
+		for (vf = 0; vf < num_vfs; ++vf) {
+			upper = pci_get_drvdata(vfinfo[vf].vfdev);
+			if (upper->ifindex == ifindex) {
+				*queue = vf * __ALIGN_MASK(1, ~vmdq->mask);
+				*action = vf + 1;
+				*action <<= ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF;
+				rcu_read_unlock();
+				return 0;
+			}
 		}
-	}
+	rcu_read_unlock();
 
 	/* redirect to a offloaded macvlan netdev */
 	data.adapter = adapter;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 431d77da15a5..80f22a8e7af4 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -44,7 +44,7 @@ static inline void ixgbe_alloc_vf_macvlans(struct ixgbe_adapter *adapter,
 			mv_list[i].free = true;
 			list_add(&mv_list[i].l, &adapter->vf_mvs.l);
 		}
-		adapter->mv_list = mv_list;
+		rcu_assign_pointer(adapter->mv_list, mv_list);
 	}
 }
 
@@ -52,6 +52,7 @@ static int __ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
 				unsigned int num_vfs)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 	int i;
 
 	if (adapter->xdp_prog) {
@@ -64,14 +65,11 @@ static int __ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
 			  IXGBE_FLAG_VMDQ_ENABLED;
 
 	/* Allocate memory for per VF control structures */
-	adapter->vfinfo = kzalloc_objs(struct vf_data_storage, num_vfs);
-	if (!adapter->vfinfo)
+	vfinfo = kzalloc_objs(struct vf_data_storage, num_vfs);
+	if (!vfinfo)
 		return -ENOMEM;
 
-	adapter->num_vfs = num_vfs;
-
 	ixgbe_alloc_vf_macvlans(adapter, num_vfs);
-	adapter->ring_feature[RING_F_VMDQ].offset = num_vfs;
 
 	/* Initialize default switching mode VEB */
 	IXGBE_WRITE_REG(hw, IXGBE_PFDTXGSWC, IXGBE_PFDTXGSWC_VT_LBEN);
@@ -95,23 +93,27 @@ static int __ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
 
 	for (i = 0; i < num_vfs; i++) {
 		/* enable spoof checking for all VFs */
-		adapter->vfinfo[i].spoofchk_enabled = true;
-		adapter->vfinfo[i].link_enable = true;
+		vfinfo[i].spoofchk_enabled = true;
+		vfinfo[i].link_enable = true;
 
 		/* We support VF RSS querying only for 82599 and x540
 		 * devices at the moment. These devices share RSS
 		 * indirection table and RSS hash key with PF therefore
 		 * we want to disable the querying by default.
 		 */
-		adapter->vfinfo[i].rss_query_enabled = false;
+		vfinfo[i].rss_query_enabled = false;
 
 		/* Untrust all VFs */
-		adapter->vfinfo[i].trusted = false;
+		vfinfo[i].trusted = false;
 
 		/* set the default xcast mode */
-		adapter->vfinfo[i].xcast_mode = IXGBEVF_XCAST_MODE_NONE;
+		vfinfo[i].xcast_mode = IXGBEVF_XCAST_MODE_NONE;
 	}
 
+	rcu_assign_pointer(adapter->vfinfo, vfinfo);
+	adapter->num_vfs = num_vfs;
+	adapter->ring_feature[RING_F_VMDQ].offset = num_vfs;
+
 	e_info(probe, "SR-IOV enabled with %d VFs\n", num_vfs);
 	return 0;
 }
@@ -123,6 +125,7 @@ static int __ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
 static void ixgbe_get_vfs(struct ixgbe_adapter *adapter)
 {
 	struct pci_dev *pdev = adapter->pdev;
+	struct vf_data_storage *vfinfo;
 	u16 vendor = pdev->vendor;
 	struct pci_dev *vfdev;
 	int vf = 0;
@@ -134,18 +137,23 @@ static void ixgbe_get_vfs(struct ixgbe_adapter *adapter)
 		return;
 	pci_read_config_word(pdev, pos + PCI_SRIOV_VF_DID, &vf_id);
 
-	vfdev = pci_get_device(vendor, vf_id, NULL);
-	for (; vfdev; vfdev = pci_get_device(vendor, vf_id, vfdev)) {
-		if (!vfdev->is_virtfn)
-			continue;
-		if (vfdev->physfn != pdev)
-			continue;
-		if (vf >= adapter->num_vfs)
-			continue;
-		pci_dev_get(vfdev);
-		adapter->vfinfo[vf].vfdev = vfdev;
-		++vf;
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo) {
+		vfdev = pci_get_device(vendor, vf_id, NULL);
+		for (; vfdev; vfdev = pci_get_device(vendor, vf_id, vfdev)) {
+			if (!vfdev->is_virtfn)
+				continue;
+			if (vfdev->physfn != pdev)
+				continue;
+			if (vf >= adapter->num_vfs)
+				continue;
+			pci_dev_get(vfdev);
+			vfinfo[vf].vfdev = vfdev;
+			++vf;
+		}
 	}
+	rcu_read_unlock();
 }
 
 /* Note this function is called when the user wants to enable SR-IOV
@@ -206,31 +214,28 @@ int ixgbe_disable_sriov(struct ixgbe_adapter *adapter)
 {
 	unsigned int num_vfs = adapter->num_vfs, vf;
 	struct ixgbe_hw *hw = &adapter->hw;
-	unsigned long flags;
+	struct vf_data_storage *vfinfo;
+	struct vf_macvlans *mv_list;
 	int rss;
 
-	spin_lock_irqsave(&adapter->vfs_lock, flags);
-	/* set num VFs to 0 to prevent access to vfinfo */
+	/* set num VFs to 0 so readers bail out early */
 	adapter->num_vfs = 0;
-	spin_unlock_irqrestore(&adapter->vfs_lock, flags);
+
+	vfinfo = rcu_replace_pointer(adapter->vfinfo, NULL, 1);
+	mv_list = rcu_replace_pointer(adapter->mv_list, NULL, 1);
 
 	/* put the reference to all of the vf devices */
 	for (vf = 0; vf < num_vfs; ++vf) {
-		struct pci_dev *vfdev = adapter->vfinfo[vf].vfdev;
+		struct pci_dev *vfdev = vfinfo[vf].vfdev;
 
 		if (!vfdev)
 			continue;
-		adapter->vfinfo[vf].vfdev = NULL;
+		vfinfo[vf].vfdev = NULL;
 		pci_dev_put(vfdev);
 	}
 
-	/* free VF control structures */
-	kfree(adapter->vfinfo);
-	adapter->vfinfo = NULL;
-
-	/* free macvlan list */
-	kfree(adapter->mv_list);
-	adapter->mv_list = NULL;
+	kfree_rcu(vfinfo, rcu_head);
+	kfree_rcu(mv_list, rcu_head);
 
 	/* if SR-IOV is already disabled then there is nothing to do */
 	if (!(adapter->flags & IXGBE_FLAG_SRIOV_ENABLED))
@@ -368,8 +373,8 @@ static int ixgbe_set_vf_multicasts(struct ixgbe_adapter *adapter,
 {
 	int entries = FIELD_GET(IXGBE_VT_MSGINFO_MASK, msgbuf[0]);
 	u16 *hash_list = (u16 *)&msgbuf[1];
-	struct vf_data_storage *vfinfo = &adapter->vfinfo[vf];
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 	int i;
 	u32 vector_bit;
 	u32 vector_reg;
@@ -379,28 +384,34 @@ static int ixgbe_set_vf_multicasts(struct ixgbe_adapter *adapter,
 	/* only so many hash values supported */
 	entries = min(entries, IXGBE_MAX_VF_MC_ENTRIES);
 
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
 	/*
 	 * salt away the number of multi cast addresses assigned
 	 * to this VF for later use to restore when the PF multi cast
 	 * list changes
 	 */
-	vfinfo->num_vf_mc_hashes = entries;
+	vfinfo[vf].num_vf_mc_hashes = entries;
 
 	/*
 	 * VFs are limited to using the MTA hash table for their multicast
 	 * addresses
 	 */
 	for (i = 0; i < entries; i++) {
-		vfinfo->vf_mc_hashes[i] = hash_list[i];
+		vfinfo[vf].vf_mc_hashes[i] = hash_list[i];
 	}
 
-	for (i = 0; i < vfinfo->num_vf_mc_hashes; i++) {
-		vector_reg = (vfinfo->vf_mc_hashes[i] >> 5) & 0x7F;
-		vector_bit = vfinfo->vf_mc_hashes[i] & 0x1F;
+	for (i = 0; i < vfinfo[vf].num_vf_mc_hashes; i++) {
+		vector_reg = (vfinfo[vf].vf_mc_hashes[i] >> 5) & 0x7F;
+		vector_bit = vfinfo[vf].vf_mc_hashes[i] & 0x1F;
 		mta_reg = IXGBE_READ_REG(hw, IXGBE_MTA(vector_reg));
 		mta_reg |= BIT(vector_bit);
 		IXGBE_WRITE_REG(hw, IXGBE_MTA(vector_reg), mta_reg);
 	}
+
 	vmolr |= IXGBE_VMOLR_ROMPE;
 	IXGBE_WRITE_REG(hw, IXGBE_VMOLR(vf), vmolr);
 
@@ -410,32 +421,39 @@ static int ixgbe_set_vf_multicasts(struct ixgbe_adapter *adapter,
 #ifdef CONFIG_PCI_IOV
 void ixgbe_restore_vf_multicasts(struct ixgbe_adapter *adapter)
 {
-	struct ixgbe_hw *hw = &adapter->hw;
 	struct vf_data_storage *vfinfo;
+	struct ixgbe_hw *hw = &adapter->hw;
 	int i, j;
 	u32 vector_bit;
 	u32 vector_reg;
 	u32 mta_reg;
 
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		goto no_vfs;
+
 	for (i = 0; i < adapter->num_vfs; i++) {
 		u32 vmolr = IXGBE_READ_REG(hw, IXGBE_VMOLR(i));
-		vfinfo = &adapter->vfinfo[i];
-		for (j = 0; j < vfinfo->num_vf_mc_hashes; j++) {
+		for (j = 0; j < vfinfo[i].num_vf_mc_hashes; j++) {
 			hw->addr_ctrl.mta_in_use++;
-			vector_reg = (vfinfo->vf_mc_hashes[j] >> 5) & 0x7F;
-			vector_bit = vfinfo->vf_mc_hashes[j] & 0x1F;
+			vector_reg = (vfinfo[i].vf_mc_hashes[j] >> 5) & 0x7F;
+			vector_bit = vfinfo[i].vf_mc_hashes[j] & 0x1F;
 			mta_reg = IXGBE_READ_REG(hw, IXGBE_MTA(vector_reg));
 			mta_reg |= BIT(vector_bit);
 			IXGBE_WRITE_REG(hw, IXGBE_MTA(vector_reg), mta_reg);
 		}
 
-		if (vfinfo->num_vf_mc_hashes)
+		if (vfinfo[i].num_vf_mc_hashes)
 			vmolr |= IXGBE_VMOLR_ROMPE;
 		else
 			vmolr &= ~IXGBE_VMOLR_ROMPE;
 		IXGBE_WRITE_REG(hw, IXGBE_VMOLR(i), vmolr);
 	}
 
+no_vfs:
+	rcu_read_unlock();
+
 	/* Restore any VF macvlans */
 	ixgbe_full_sync_mac_table(adapter);
 }
@@ -493,7 +511,9 @@ static int ixgbe_set_vf_lpe(struct ixgbe_adapter *adapter, u32 max_frame, u32 vf
 	 */
 	if (adapter->hw.mac.type == ixgbe_mac_82599EB) {
 		struct net_device *dev = adapter->netdev;
+		unsigned int vf_api = ixgbe_mbox_api_10;
 		int pf_max_frame = dev->mtu + ETH_HLEN;
+		struct vf_data_storage *vfinfo;
 		u32 reg_offset, vf_shift, vfre;
 		int err = 0;
 
@@ -503,7 +523,12 @@ static int ixgbe_set_vf_lpe(struct ixgbe_adapter *adapter, u32 max_frame, u32 vf
 					     IXGBE_FCOE_JUMBO_FRAME_SIZE);
 
 #endif /* CONFIG_FCOE */
-		switch (adapter->vfinfo[vf].vf_api) {
+		lockdep_assert_in_rcu_read_lock();
+		vfinfo = rcu_dereference(adapter->vfinfo);
+		if (vfinfo)
+			vf_api = vfinfo[vf].vf_api;
+
+		switch (vf_api) {
 		case ixgbe_mbox_api_11:
 		case ixgbe_mbox_api_12:
 		case ixgbe_mbox_api_13:
@@ -643,10 +668,16 @@ static void ixgbe_clear_vf_vlans(struct ixgbe_adapter *adapter, u32 vf)
 static int ixgbe_set_vf_macvlan(struct ixgbe_adapter *adapter,
 				int vf, int index, unsigned char *mac_addr)
 {
-	struct vf_macvlans *entry;
+	struct vf_macvlans *mv_list, *entry;
 	bool found = false;
 	int retval = 0;
 
+	lockdep_assert_in_rcu_read_lock();
+	/* vf_mvs entries point into the mv_list array */
+	mv_list = rcu_dereference(adapter->mv_list);
+	if (!mv_list)
+		return 0;
+
 	if (index <= 1) {
 		list_for_each_entry(entry, &adapter->vf_mvs.l, l) {
 			if (entry->vf == vf) {
@@ -700,7 +731,7 @@ static inline void ixgbe_vf_reset_event(struct ixgbe_adapter *adapter, u32 vf)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
 	struct ixgbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
-	struct vf_data_storage *vfinfo = &adapter->vfinfo[vf];
+	struct vf_data_storage *vfinfo;
 	u32 q_per_pool = __ALIGN_MASK(1, ~vmdq->mask);
 	u8 num_tcs = adapter->hw_tcs;
 	u32 reg_val;
@@ -709,31 +740,36 @@ static inline void ixgbe_vf_reset_event(struct ixgbe_adapter *adapter, u32 vf)
 	/* remove VLAN filters belonging to this VF */
 	ixgbe_clear_vf_vlans(adapter, vf);
 
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return;
+
 	/* add back PF assigned VLAN or VLAN 0 */
-	ixgbe_set_vf_vlan(adapter, true, vfinfo->pf_vlan, vf);
+	ixgbe_set_vf_vlan(adapter, true, vfinfo[vf].pf_vlan, vf);
 
 	/* reset offloads to defaults */
-	ixgbe_set_vmolr(hw, vf, !vfinfo->pf_vlan);
+	ixgbe_set_vmolr(hw, vf, !vfinfo[vf].pf_vlan);
 
 	/* set outgoing tags for VFs */
-	if (!vfinfo->pf_vlan && !vfinfo->pf_qos && !num_tcs) {
+	if (!vfinfo[vf].pf_vlan && !vfinfo[vf].pf_qos && !num_tcs) {
 		ixgbe_clear_vmvir(adapter, vf);
 	} else {
-		if (vfinfo->pf_qos || !num_tcs)
-			ixgbe_set_vmvir(adapter, vfinfo->pf_vlan,
-					vfinfo->pf_qos, vf);
+		if (vfinfo[vf].pf_qos || !num_tcs)
+			ixgbe_set_vmvir(adapter, vfinfo[vf].pf_vlan,
+					vfinfo[vf].pf_qos, vf);
 		else
-			ixgbe_set_vmvir(adapter, vfinfo->pf_vlan,
+			ixgbe_set_vmvir(adapter, vfinfo[vf].pf_vlan,
 					adapter->default_up, vf);
 
-		if (vfinfo->spoofchk_enabled) {
+		if (vfinfo[vf].spoofchk_enabled) {
 			hw->mac.ops.set_vlan_anti_spoofing(hw, true, vf);
 			hw->mac.ops.set_mac_anti_spoofing(hw, true, vf);
 		}
 	}
 
 	/* reset multicast table array for vf */
-	adapter->vfinfo[vf].num_vf_mc_hashes = 0;
+	vfinfo[vf].num_vf_mc_hashes = 0;
 
 	/* clear any ipsec table info */
 	ixgbe_ipsec_vf_clear(adapter, vf);
@@ -741,11 +777,11 @@ static inline void ixgbe_vf_reset_event(struct ixgbe_adapter *adapter, u32 vf)
 	/* Flush and reset the mta with the new values */
 	ixgbe_set_rx_mode(adapter->netdev);
 
-	ixgbe_del_mac_filter(adapter, adapter->vfinfo[vf].vf_mac_addresses, vf);
+	ixgbe_del_mac_filter(adapter, vfinfo[vf].vf_mac_addresses, vf);
 	ixgbe_set_vf_macvlan(adapter, vf, 0, NULL);
 
 	/* reset VF api back to unknown */
-	adapter->vfinfo[vf].vf_api = ixgbe_mbox_api_10;
+	vfinfo[vf].vf_api = ixgbe_mbox_api_10;
 
 	/* Restart each queue for given VF */
 	for (queue = 0; queue < q_per_pool; queue++) {
@@ -780,16 +816,25 @@ static void ixgbe_vf_clear_mbx(struct ixgbe_adapter *adapter, u32 vf)
 static int ixgbe_set_vf_mac(struct ixgbe_adapter *adapter,
 			    int vf, unsigned char *mac_addr)
 {
+	struct vf_data_storage *vfinfo;
 	int retval;
 
-	ixgbe_del_mac_filter(adapter, adapter->vfinfo[vf].vf_mac_addresses, vf);
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo) {
+		rcu_read_unlock();
+		return -EINVAL;
+	}
+
+	ixgbe_del_mac_filter(adapter, vfinfo[vf].vf_mac_addresses, vf);
 	retval = ixgbe_add_mac_filter(adapter, mac_addr, vf);
 	if (retval >= 0)
-		memcpy(adapter->vfinfo[vf].vf_mac_addresses, mac_addr,
+		memcpy(vfinfo[vf].vf_mac_addresses, mac_addr,
 		       ETH_ALEN);
 	else
-		eth_zero_addr(adapter->vfinfo[vf].vf_mac_addresses);
+		eth_zero_addr(vfinfo[vf].vf_mac_addresses);
 
+	rcu_read_unlock();
 	return retval;
 }
 
@@ -797,12 +842,17 @@ int ixgbe_vf_configuration(struct pci_dev *pdev, unsigned int event_mask)
 {
 	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
 	unsigned int vfn = (event_mask & 0x3f);
+	struct vf_data_storage *vfinfo;
 
 	bool enable = ((event_mask & 0x10000000U) != 0);
 
-	if (enable)
-		eth_zero_addr(adapter->vfinfo[vfn].vf_mac_addresses);
-
+	if (enable) {
+		rcu_read_lock();
+		vfinfo = rcu_dereference(adapter->vfinfo);
+		if (vfinfo)
+			eth_zero_addr(vfinfo[vfn].vf_mac_addresses);
+		rcu_read_unlock();
+	}
 	return 0;
 }
 
@@ -838,6 +888,7 @@ static void ixgbe_set_vf_rx_tx(struct ixgbe_adapter *adapter, int vf)
 {
 	u32 reg_cur_tx, reg_cur_rx, reg_req_tx, reg_req_rx;
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 	u32 reg_offset, vf_shift;
 
 	vf_shift = vf % 32;
@@ -846,7 +897,9 @@ static void ixgbe_set_vf_rx_tx(struct ixgbe_adapter *adapter, int vf)
 	reg_cur_tx = IXGBE_READ_REG(hw, IXGBE_VFTE(reg_offset));
 	reg_cur_rx = IXGBE_READ_REG(hw, IXGBE_VFRE(reg_offset));
 
-	if (adapter->vfinfo[vf].link_enable) {
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo && vfinfo[vf].link_enable) {
 		reg_req_tx = reg_cur_tx | 1 << vf_shift;
 		reg_req_rx = reg_cur_rx | 1 << vf_shift;
 	} else {
@@ -882,11 +935,12 @@ static int ixgbe_vf_reset_msg(struct ixgbe_adapter *adapter, u32 vf)
 {
 	struct ixgbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
 	struct ixgbe_hw *hw = &adapter->hw;
-	unsigned char *vf_mac = adapter->vfinfo[vf].vf_mac_addresses;
+	struct vf_data_storage *vfinfo;
 	u32 reg, reg_offset, vf_shift;
 	u32 msgbuf[4] = {0, 0, 0, 0};
 	u8 *addr = (u8 *)(&msgbuf[1]);
 	u32 q_per_pool = __ALIGN_MASK(1, ~vmdq->mask);
+	unsigned char *vf_mac;
 	int i;
 
 	e_info(probe, "VF Reset msg received from vf %d\n", vf);
@@ -896,6 +950,13 @@ static int ixgbe_vf_reset_msg(struct ixgbe_adapter *adapter, u32 vf)
 
 	ixgbe_vf_clear_mbx(adapter, vf);
 
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
+	vf_mac = vfinfo[vf].vf_mac_addresses;
+
 	/* set vf mac address */
 	if (!is_zero_ether_addr(vf_mac))
 		ixgbe_set_vf_mac(adapter, vf, vf_mac);
@@ -905,7 +966,7 @@ static int ixgbe_vf_reset_msg(struct ixgbe_adapter *adapter, u32 vf)
 
 	/* force drop enable for all VF Rx queues */
 	reg = IXGBE_QDE_ENABLE;
-	if (adapter->vfinfo[vf].pf_vlan)
+	if (vfinfo[vf].pf_vlan)
 		reg |= IXGBE_QDE_HIDE_VLAN;
 
 	ixgbe_write_qde(adapter, vf, reg);
@@ -913,7 +974,7 @@ static int ixgbe_vf_reset_msg(struct ixgbe_adapter *adapter, u32 vf)
 	ixgbe_set_vf_rx_tx(adapter, vf);
 
 	/* enable VF mailbox for further messages */
-	adapter->vfinfo[vf].clear_to_send = true;
+	vfinfo[vf].clear_to_send = true;
 
 	/* Enable counting of spoofed packets in the SSVPC register */
 	reg = IXGBE_READ_REG(hw, IXGBE_VMECM(reg_offset));
@@ -931,7 +992,7 @@ static int ixgbe_vf_reset_msg(struct ixgbe_adapter *adapter, u32 vf)
 
 	/* reply to reset with ack and vf mac address */
 	msgbuf[0] = IXGBE_VF_RESET;
-	if (!is_zero_ether_addr(vf_mac) && adapter->vfinfo[vf].pf_set_mac) {
+	if (!is_zero_ether_addr(vf_mac) && vfinfo[vf].pf_set_mac) {
 		msgbuf[0] |= IXGBE_VT_MSGTYPE_ACK;
 		memcpy(addr, vf_mac, ETH_ALEN);
 	} else {
@@ -952,14 +1013,20 @@ static int ixgbe_set_vf_mac_addr(struct ixgbe_adapter *adapter,
 				 u32 *msgbuf, u32 vf)
 {
 	u8 *new_mac = ((u8 *)(&msgbuf[1]));
+	struct vf_data_storage *vfinfo;
 
 	if (!is_valid_ether_addr(new_mac)) {
 		e_warn(drv, "VF %d attempted to set invalid mac\n", vf);
 		return -1;
 	}
 
-	if (adapter->vfinfo[vf].pf_set_mac && !adapter->vfinfo[vf].trusted &&
-	    !ether_addr_equal(adapter->vfinfo[vf].vf_mac_addresses, new_mac)) {
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
+	if (vfinfo[vf].pf_set_mac && !vfinfo[vf].trusted &&
+	    !ether_addr_equal(vfinfo[vf].vf_mac_addresses, new_mac)) {
 		e_warn(drv,
 		       "VF %d attempted to override administratively set MAC address\n"
 		       "Reload the VF driver to resume operations\n",
@@ -975,9 +1042,15 @@ static int ixgbe_set_vf_vlan_msg(struct ixgbe_adapter *adapter,
 {
 	u32 add = FIELD_GET(IXGBE_VT_MSGINFO_MASK, msgbuf[0]);
 	u32 vid = (msgbuf[1] & IXGBE_VLVF_VLANID_MASK);
+	struct vf_data_storage *vfinfo;
 	u8 tcs = adapter->hw_tcs;
 
-	if (adapter->vfinfo[vf].pf_vlan || tcs) {
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
+	if (vfinfo[vf].pf_vlan || tcs) {
 		e_warn(drv,
 		       "VF %d attempted to override administratively set VLAN configuration\n"
 		       "Reload the VF driver to resume operations\n",
@@ -997,9 +1070,15 @@ static int ixgbe_set_vf_macvlan_msg(struct ixgbe_adapter *adapter,
 {
 	u8 *new_mac = ((u8 *)(&msgbuf[1]));
 	int index = FIELD_GET(IXGBE_VT_MSGINFO_MASK, msgbuf[0]);
+	struct vf_data_storage *vfinfo;
 	int err;
 
-	if (adapter->vfinfo[vf].pf_set_mac && !adapter->vfinfo[vf].trusted &&
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
+	if (vfinfo[vf].pf_set_mac && !vfinfo[vf].trusted &&
 	    index > 0) {
 		e_warn(drv,
 		       "VF %d requested MACVLAN filter but is administratively denied\n",
@@ -1018,7 +1097,7 @@ static int ixgbe_set_vf_macvlan_msg(struct ixgbe_adapter *adapter,
 		 * If the VF is allowed to set MAC filters then turn off
 		 * anti-spoofing to avoid false positives.
 		 */
-		if (adapter->vfinfo[vf].spoofchk_enabled) {
+		if (vfinfo[vf].spoofchk_enabled) {
 			struct ixgbe_hw *hw = &adapter->hw;
 
 			hw->mac.ops.set_mac_anti_spoofing(hw, false, vf);
@@ -1038,6 +1117,7 @@ static int ixgbe_set_vf_macvlan_msg(struct ixgbe_adapter *adapter,
 static int ixgbe_negotiate_vf_api(struct ixgbe_adapter *adapter,
 				  u32 *msgbuf, u32 vf)
 {
+	struct vf_data_storage *vfinfo;
 	int api = msgbuf[1];
 
 	switch (api) {
@@ -1048,7 +1128,10 @@ static int ixgbe_negotiate_vf_api(struct ixgbe_adapter *adapter,
 	case ixgbe_mbox_api_14:
 	case ixgbe_mbox_api_16:
 	case ixgbe_mbox_api_17:
-		adapter->vfinfo[vf].vf_api = api;
+		lockdep_assert_in_rcu_read_lock();
+		vfinfo = rcu_dereference(adapter->vfinfo);
+		if (vfinfo)
+			vfinfo[vf].vf_api = api;
 		return 0;
 	default:
 		break;
@@ -1064,11 +1147,17 @@ static int ixgbe_get_vf_queues(struct ixgbe_adapter *adapter,
 {
 	struct net_device *dev = adapter->netdev;
 	struct ixgbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
+	struct vf_data_storage *vfinfo;
 	unsigned int default_tc = 0;
 	u8 num_tcs = adapter->hw_tcs;
 
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
 	/* verify the PF is supporting the correct APIs */
-	switch (adapter->vfinfo[vf].vf_api) {
+	switch (vfinfo[vf].vf_api) {
 	case ixgbe_mbox_api_20:
 	case ixgbe_mbox_api_11:
 	case ixgbe_mbox_api_12:
@@ -1092,7 +1181,7 @@ static int ixgbe_get_vf_queues(struct ixgbe_adapter *adapter,
 	/* notify VF of need for VLAN tag stripping, and correct queue */
 	if (num_tcs)
 		msgbuf[IXGBE_VF_TRANS_VLAN] = num_tcs;
-	else if (adapter->vfinfo[vf].pf_vlan || adapter->vfinfo[vf].pf_qos)
+	else if (vfinfo[vf].pf_vlan || vfinfo[vf].pf_qos)
 		msgbuf[IXGBE_VF_TRANS_VLAN] = 1;
 	else
 		msgbuf[IXGBE_VF_TRANS_VLAN] = 0;
@@ -1105,17 +1194,23 @@ static int ixgbe_get_vf_queues(struct ixgbe_adapter *adapter,
 
 static int ixgbe_get_vf_reta(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 {
-	u32 i, j;
-	u32 *out_buf = &msgbuf[1];
-	const u8 *reta = adapter->rss_indir_tbl;
 	u32 reta_size = ixgbe_rss_indir_tbl_entries(adapter);
+	const u8 *reta = adapter->rss_indir_tbl;
+	struct vf_data_storage *vfinfo;
+	u32 *out_buf = &msgbuf[1];
+	u32 i, j;
+
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
 
 	/* Check if operation is permitted */
-	if (!adapter->vfinfo[vf].rss_query_enabled)
+	if (!vfinfo[vf].rss_query_enabled)
 		return -EPERM;
 
 	/* verify the PF is supporting the correct API */
-	switch (adapter->vfinfo[vf].vf_api) {
+	switch (vfinfo[vf].vf_api) {
 	case ixgbe_mbox_api_17:
 	case ixgbe_mbox_api_16:
 	case ixgbe_mbox_api_14:
@@ -1143,14 +1238,20 @@ static int ixgbe_get_vf_reta(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 static int ixgbe_get_vf_rss_key(struct ixgbe_adapter *adapter,
 				u32 *msgbuf, u32 vf)
 {
+	struct vf_data_storage *vfinfo;
 	u32 *rss_key = &msgbuf[1];
 
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
 	/* Check if the operation is permitted */
-	if (!adapter->vfinfo[vf].rss_query_enabled)
+	if (!vfinfo[vf].rss_query_enabled)
 		return -EPERM;
 
 	/* verify the PF is supporting the correct API */
-	switch (adapter->vfinfo[vf].vf_api) {
+	switch (vfinfo[vf].vf_api) {
 	case ixgbe_mbox_api_17:
 	case ixgbe_mbox_api_16:
 	case ixgbe_mbox_api_14:
@@ -1170,11 +1271,17 @@ static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
 				      u32 *msgbuf, u32 vf)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 	int xcast_mode = msgbuf[1];
 	u32 vmolr, fctrl, disable, enable;
 
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
 	/* verify the PF is supporting the correct APIs */
-	switch (adapter->vfinfo[vf].vf_api) {
+	switch (vfinfo[vf].vf_api) {
 	case ixgbe_mbox_api_12:
 		/* promisc introduced in 1.3 version */
 		if (xcast_mode == IXGBEVF_XCAST_MODE_PROMISC)
@@ -1190,11 +1297,11 @@ static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
 	}
 
 	if (xcast_mode > IXGBEVF_XCAST_MODE_MULTI &&
-	    !adapter->vfinfo[vf].trusted) {
+	    !vfinfo[vf].trusted) {
 		xcast_mode = IXGBEVF_XCAST_MODE_MULTI;
 	}
 
-	if (adapter->vfinfo[vf].xcast_mode == xcast_mode)
+	if (vfinfo[vf].xcast_mode == xcast_mode)
 		goto out;
 
 	switch (xcast_mode) {
@@ -1236,7 +1343,7 @@ static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
 	vmolr |= enable;
 	IXGBE_WRITE_REG(hw, IXGBE_VMOLR(vf), vmolr);
 
-	adapter->vfinfo[vf].xcast_mode = xcast_mode;
+	vfinfo[vf].xcast_mode = xcast_mode;
 
 out:
 	msgbuf[1] = xcast_mode;
@@ -1247,10 +1354,16 @@ static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
 static int ixgbe_get_vf_link_state(struct ixgbe_adapter *adapter,
 				   u32 *msgbuf, u32 vf)
 {
+	struct vf_data_storage *vfinfo;
 	u32 *link_state = &msgbuf[1];
 
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
 	/* verify the PF is supporting the correct API */
-	switch (adapter->vfinfo[vf].vf_api) {
+	switch (vfinfo[vf].vf_api) {
 	case ixgbe_mbox_api_12:
 	case ixgbe_mbox_api_13:
 	case ixgbe_mbox_api_14:
@@ -1261,7 +1374,7 @@ static int ixgbe_get_vf_link_state(struct ixgbe_adapter *adapter,
 		return -EOPNOTSUPP;
 	}
 
-	*link_state = adapter->vfinfo[vf].link_enable;
+	*link_state = vfinfo[vf].link_enable;
 
 	return 0;
 }
@@ -1280,8 +1393,14 @@ static int ixgbe_send_vf_link_status(struct ixgbe_adapter *adapter,
 				     u32 *msgbuf, u32 vf)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
+
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
 
-	switch (adapter->vfinfo[vf].vf_api) {
+	switch (vfinfo[vf].vf_api) {
 	case ixgbe_mbox_api_16:
 	case ixgbe_mbox_api_17:
 		if (hw->mac.type != ixgbe_mac_e610)
@@ -1310,9 +1429,15 @@ static int ixgbe_send_vf_link_status(struct ixgbe_adapter *adapter,
 static int ixgbe_negotiate_vf_features(struct ixgbe_adapter *adapter,
 				       u32 *msgbuf, u32 vf)
 {
+	struct vf_data_storage *vfinfo;
 	u32 features = msgbuf[1];
 
-	switch (adapter->vfinfo[vf].vf_api) {
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
+	switch (vfinfo[vf].vf_api) {
 	case ixgbe_mbox_api_17:
 		break;
 	default:
@@ -1330,6 +1455,7 @@ static int ixgbe_rcv_msg_from_vf(struct ixgbe_adapter *adapter, u32 vf)
 	u32 mbx_size = IXGBE_VFMAILBOX_SIZE;
 	u32 msgbuf[IXGBE_VFMAILBOX_SIZE];
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 	int retval;
 
 	retval = ixgbe_read_mbx(hw, msgbuf, mbx_size, vf);
@@ -1349,11 +1475,16 @@ static int ixgbe_rcv_msg_from_vf(struct ixgbe_adapter *adapter, u32 vf)
 	if (msgbuf[0] == IXGBE_VF_RESET)
 		return ixgbe_vf_reset_msg(adapter, vf);
 
+	lockdep_assert_in_rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		return 0;
+
 	/*
 	 * until the vf completes a virtual function reset it should not be
 	 * allowed to start any configuration.
 	 */
-	if (!adapter->vfinfo[vf].clear_to_send) {
+	if (!vfinfo[vf].clear_to_send) {
 		msgbuf[0] |= IXGBE_VT_MSGTYPE_NACK;
 		ixgbe_write_mbx(hw, msgbuf, 1, vf);
 		return 0;
@@ -1426,11 +1557,12 @@ static int ixgbe_rcv_msg_from_vf(struct ixgbe_adapter *adapter, u32 vf)
 
 static void ixgbe_rcv_ack_from_vf(struct ixgbe_adapter *adapter, u32 vf)
 {
+	struct vf_data_storage *vfinfo = rcu_dereference(adapter->vfinfo);
 	struct ixgbe_hw *hw = &adapter->hw;
 	u32 msg = IXGBE_VT_MSGTYPE_NACK;
 
 	/* if device isn't clear to send it shouldn't be reading either */
-	if (!adapter->vfinfo[vf].clear_to_send)
+	if (vfinfo && !vfinfo[vf].clear_to_send)
 		ixgbe_write_mbx(hw, &msg, 1, vf);
 }
 
@@ -1462,15 +1594,21 @@ bool ixgbe_check_mdd_event(struct ixgbe_adapter *adapter)
 			 IXGBE_READ_REG(hw, IXGBE_LVMMC_RX));
 
 		if (hw->mac.ops.restore_mdd_vf) {
+			struct vf_data_storage *vfinfo;
 			u32 ping;
 
 			hw->mac.ops.restore_mdd_vf(hw, i);
 
 			/* get the VF to rebuild its queues */
-			adapter->vfinfo[i].clear_to_send = 0;
-			ping = IXGBE_PF_CONTROL_MSG |
-			       IXGBE_VT_MSGTYPE_CTS;
-			ixgbe_write_mbx(hw, &ping, 1, i);
+			rcu_read_lock();
+			vfinfo = rcu_dereference(adapter->vfinfo);
+			if (vfinfo) {
+				vfinfo[i].clear_to_send = false;
+				ping = IXGBE_PF_CONTROL_MSG |
+				       IXGBE_VT_MSGTYPE_CTS;
+				ixgbe_write_mbx(hw, &ping, 1, i);
+			}
+			rcu_read_unlock();
 		}
 
 		ret = true;
@@ -1482,12 +1620,11 @@ bool ixgbe_check_mdd_event(struct ixgbe_adapter *adapter)
 void ixgbe_msg_task(struct ixgbe_adapter *adapter)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
-	unsigned long flags;
 	u32 vf;
 
 	ixgbe_check_mdd_event(adapter);
 
-	spin_lock_irqsave(&adapter->vfs_lock, flags);
+	rcu_read_lock();
 	for (vf = 0; vf < adapter->num_vfs; vf++) {
 		/* process any reset requests */
 		if (!ixgbe_check_for_rst(hw, vf))
@@ -1501,7 +1638,7 @@ void ixgbe_msg_task(struct ixgbe_adapter *adapter)
 		if (!ixgbe_check_for_ack(hw, vf))
 			ixgbe_rcv_ack_from_vf(adapter, vf);
 	}
-	spin_unlock_irqrestore(&adapter->vfs_lock, flags);
+	rcu_read_unlock();
 }
 
 static inline void ixgbe_ping_vf(struct ixgbe_adapter *adapter, int vf)
@@ -1510,23 +1647,26 @@ static inline void ixgbe_ping_vf(struct ixgbe_adapter *adapter, int vf)
 	u32 ping;
 
 	ping = IXGBE_PF_CONTROL_MSG;
-	if (adapter->vfinfo[vf].clear_to_send)
-		ping |= IXGBE_VT_MSGTYPE_CTS;
 	ixgbe_write_mbx(hw, &ping, 1, vf);
 }
 
 void ixgbe_ping_all_vfs(struct ixgbe_adapter *adapter)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 	u32 ping;
 	int i;
 
-	for (i = 0 ; i < adapter->num_vfs; i++) {
-		ping = IXGBE_PF_CONTROL_MSG;
-		if (adapter->vfinfo[i].clear_to_send)
-			ping |= IXGBE_VT_MSGTYPE_CTS;
-		ixgbe_write_mbx(hw, &ping, 1, i);
-	}
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo)
+		for (i = 0; i < adapter->num_vfs; i++) {
+			ping = IXGBE_PF_CONTROL_MSG;
+			if (vfinfo[i].clear_to_send)
+				ping |= IXGBE_VT_MSGTYPE_CTS;
+			ixgbe_write_mbx(hw, &ping, 1, i);
+		}
+	rcu_read_unlock();
 }
 
 /**
@@ -1537,21 +1677,34 @@ void ixgbe_ping_all_vfs(struct ixgbe_adapter *adapter)
  **/
 void ixgbe_set_all_vfs(struct ixgbe_adapter *adapter)
 {
+	struct vf_data_storage *vfinfo;
 	int i;
 
-	for (i = 0 ; i < adapter->num_vfs; i++)
-		ixgbe_set_vf_link_state(adapter, i,
-					adapter->vfinfo[i].link_state);
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo)
+		for (i = 0; i < adapter->num_vfs; i++)
+			ixgbe_set_vf_link_state(adapter, i,
+						vfinfo[i].link_state);
+	rcu_read_unlock();
 }
 
 int ixgbe_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 {
 	struct ixgbe_adapter *adapter = ixgbe_from_netdev(netdev);
+	struct vf_data_storage *vfinfo;
 	int retval;
 
 	if (vf >= adapter->num_vfs)
 		return -EINVAL;
 
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo) {
+		rcu_read_unlock();
+		return 0;
+	}
+
 	if (is_valid_ether_addr(mac)) {
 		dev_info(&adapter->pdev->dev, "setting MAC %pM on VF %d\n",
 			 mac, vf);
@@ -1559,7 +1712,7 @@ int ixgbe_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 
 		retval = ixgbe_set_vf_mac(adapter, vf, mac);
 		if (retval >= 0) {
-			adapter->vfinfo[vf].pf_set_mac = true;
+			vfinfo[vf].pf_set_mac = true;
 
 			if (test_bit(__IXGBE_DOWN, &adapter->state)) {
 				dev_warn(&adapter->pdev->dev, "The VF MAC address has been set, but the PF device is not up.\n");
@@ -1569,18 +1722,19 @@ int ixgbe_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 			dev_warn(&adapter->pdev->dev, "The VF MAC address was NOT set due to invalid or duplicate MAC address.\n");
 		}
 	} else if (is_zero_ether_addr(mac)) {
-		unsigned char *vf_mac_addr =
-					   adapter->vfinfo[vf].vf_mac_addresses;
+		unsigned char *vf_mac_addr = vfinfo[vf].vf_mac_addresses;
 
 		/* nothing to do */
-		if (is_zero_ether_addr(vf_mac_addr))
+		if (is_zero_ether_addr(vf_mac_addr)) {
+			rcu_read_unlock();
 			return 0;
+		}
 
 		dev_info(&adapter->pdev->dev, "removing MAC on VF %d\n", vf);
 
 		retval = ixgbe_del_mac_filter(adapter, vf_mac_addr, vf);
 		if (retval >= 0) {
-			adapter->vfinfo[vf].pf_set_mac = false;
+			vfinfo[vf].pf_set_mac = false;
 			memcpy(vf_mac_addr, mac, ETH_ALEN);
 		} else {
 			dev_warn(&adapter->pdev->dev, "Could NOT remove the VF MAC address.\n");
@@ -1589,10 +1743,12 @@ int ixgbe_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 		retval = -EINVAL;
 	}
 
+	rcu_read_unlock();
 	return retval;
 }
 
 static int ixgbe_enable_port_vlan(struct ixgbe_adapter *adapter, int vf,
+				  struct vf_data_storage *vfinfo,
 				  u16 vlan, u8 qos)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
@@ -1613,8 +1769,8 @@ static int ixgbe_enable_port_vlan(struct ixgbe_adapter *adapter, int vf,
 		ixgbe_write_qde(adapter, vf, IXGBE_QDE_ENABLE |
 				IXGBE_QDE_HIDE_VLAN);
 
-	adapter->vfinfo[vf].pf_vlan = vlan;
-	adapter->vfinfo[vf].pf_qos = qos;
+	vfinfo[vf].pf_vlan = vlan;
+	vfinfo[vf].pf_qos = qos;
 	dev_info(&adapter->pdev->dev,
 		 "Setting VLAN %d, QOS 0x%x on VF %d\n", vlan, qos, vf);
 	if (test_bit(__IXGBE_DOWN, &adapter->state)) {
@@ -1628,13 +1784,14 @@ static int ixgbe_enable_port_vlan(struct ixgbe_adapter *adapter, int vf,
 	return err;
 }
 
-static int ixgbe_disable_port_vlan(struct ixgbe_adapter *adapter, int vf)
+static int ixgbe_disable_port_vlan(struct ixgbe_adapter *adapter, int vf,
+				   struct vf_data_storage *vfinfo)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
 	int err;
 
 	err = ixgbe_set_vf_vlan(adapter, false,
-				adapter->vfinfo[vf].pf_vlan, vf);
+				vfinfo[vf].pf_vlan, vf);
 	/* Restore tagless access via VLAN 0 */
 	ixgbe_set_vf_vlan(adapter, true, 0, vf);
 	ixgbe_clear_vmvir(adapter, vf);
@@ -1644,8 +1801,8 @@ static int ixgbe_disable_port_vlan(struct ixgbe_adapter *adapter, int vf)
 	if (hw->mac.type >= ixgbe_mac_X550)
 		ixgbe_write_qde(adapter, vf, IXGBE_QDE_ENABLE);
 
-	adapter->vfinfo[vf].pf_vlan = 0;
-	adapter->vfinfo[vf].pf_qos = 0;
+	vfinfo[vf].pf_vlan = 0;
+	vfinfo[vf].pf_qos = 0;
 
 	return err;
 }
@@ -1653,13 +1810,20 @@ static int ixgbe_disable_port_vlan(struct ixgbe_adapter *adapter, int vf)
 int ixgbe_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
 			  u8 qos, __be16 vlan_proto)
 {
-	int err = 0;
 	struct ixgbe_adapter *adapter = ixgbe_from_netdev(netdev);
+	struct vf_data_storage *vfinfo;
+	int err = 0;
 
 	if ((vf >= adapter->num_vfs) || (vlan > 4095) || (qos > 7))
 		return -EINVAL;
 	if (vlan_proto != htons(ETH_P_8021Q))
 		return -EPROTONOSUPPORT;
+
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo)
+		goto out;
+
 	if (vlan || qos) {
 		/* Check if there is already a port VLAN set, if so
 		 * we have to delete the old one first before we
@@ -1668,16 +1832,17 @@ int ixgbe_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
 		 * old port VLAN before setting a new one but this
 		 * is not necessarily the case.
 		 */
-		if (adapter->vfinfo[vf].pf_vlan)
-			err = ixgbe_disable_port_vlan(adapter, vf);
+		if (vfinfo[vf].pf_vlan)
+			err = ixgbe_disable_port_vlan(adapter, vf, vfinfo);
 		if (err)
 			goto out;
-		err = ixgbe_enable_port_vlan(adapter, vf, vlan, qos);
+		err = ixgbe_enable_port_vlan(adapter, vf, vfinfo, vlan, qos);
 	} else {
-		err = ixgbe_disable_port_vlan(adapter, vf);
+		err = ixgbe_disable_port_vlan(adapter, vf, vfinfo);
 	}
 
 out:
+	rcu_read_unlock();
 	return err;
 }
 
@@ -1695,13 +1860,13 @@ int ixgbe_link_mbps(struct ixgbe_adapter *adapter)
 	}
 }
 
-static void ixgbe_set_vf_rate_limit(struct ixgbe_adapter *adapter, int vf)
+static void ixgbe_set_vf_rate_limit(struct ixgbe_adapter *adapter, int vf,
+				    u16 tx_rate)
 {
 	struct ixgbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
 	struct ixgbe_hw *hw = &adapter->hw;
 	u32 bcnrc_val = 0;
 	u16 queue, queues_per_pool;
-	u16 tx_rate = adapter->vfinfo[vf].tx_rate;
 
 	if (tx_rate) {
 		/* start with base link speed value */
@@ -1749,6 +1914,7 @@ static void ixgbe_set_vf_rate_limit(struct ixgbe_adapter *adapter, int vf)
 
 void ixgbe_check_vf_rate_limit(struct ixgbe_adapter *adapter)
 {
+	struct vf_data_storage *vfinfo;
 	int i;
 
 	/* VF Tx rate limit was not set */
@@ -1761,18 +1927,23 @@ void ixgbe_check_vf_rate_limit(struct ixgbe_adapter *adapter)
 			 "Link speed has been changed. VF Transmit rate is disabled\n");
 	}
 
-	for (i = 0; i < adapter->num_vfs; i++) {
-		if (!adapter->vf_rate_link_speed)
-			adapter->vfinfo[i].tx_rate = 0;
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo)
+		for (i = 0; i < adapter->num_vfs; i++) {
+			if (!adapter->vf_rate_link_speed)
+				vfinfo[i].tx_rate = 0;
 
-		ixgbe_set_vf_rate_limit(adapter, i);
-	}
+			ixgbe_set_vf_rate_limit(adapter, i, vfinfo[i].tx_rate);
+		}
+	rcu_read_unlock();
 }
 
 int ixgbe_ndo_set_vf_bw(struct net_device *netdev, int vf, int min_tx_rate,
 			int max_tx_rate)
 {
 	struct ixgbe_adapter *adapter = ixgbe_from_netdev(netdev);
+	struct vf_data_storage *vfinfo;
 	int link_speed;
 
 	/* verify VF is active */
@@ -1795,12 +1966,17 @@ int ixgbe_ndo_set_vf_bw(struct net_device *netdev, int vf, int min_tx_rate,
 	if (max_tx_rate && ((max_tx_rate <= 10) || (max_tx_rate > link_speed)))
 		return -EINVAL;
 
-	/* store values */
-	adapter->vf_rate_link_speed = link_speed;
-	adapter->vfinfo[vf].tx_rate = max_tx_rate;
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo) {
+		/* store values */
+		adapter->vf_rate_link_speed = link_speed;
+		vfinfo[vf].tx_rate = max_tx_rate;
 
-	/* update hardware configuration */
-	ixgbe_set_vf_rate_limit(adapter, vf);
+		/* update hardware configuration */
+		ixgbe_set_vf_rate_limit(adapter, vf, vfinfo[vf].tx_rate);
+	}
+	rcu_read_unlock();
 
 	return 0;
 }
@@ -1809,11 +1985,18 @@ int ixgbe_ndo_set_vf_spoofchk(struct net_device *netdev, int vf, bool setting)
 {
 	struct ixgbe_adapter *adapter = ixgbe_from_netdev(netdev);
 	struct ixgbe_hw *hw = &adapter->hw;
+	struct vf_data_storage *vfinfo;
 
 	if (vf >= adapter->num_vfs)
 		return -EINVAL;
 
-	adapter->vfinfo[vf].spoofchk_enabled = setting;
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo)
+		vfinfo[vf].spoofchk_enabled = setting;
+	rcu_read_unlock();
+	if (!vfinfo)
+		return 0;
 
 	/* configure MAC spoofing */
 	hw->mac.ops.set_mac_anti_spoofing(hw, setting, vf);
@@ -1851,28 +2034,37 @@ int ixgbe_ndo_set_vf_spoofchk(struct net_device *netdev, int vf, bool setting)
  **/
 void ixgbe_set_vf_link_state(struct ixgbe_adapter *adapter, int vf, int state)
 {
-	adapter->vfinfo[vf].link_state = state;
+	struct vf_data_storage *vfinfo;
+
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo) {
+		rcu_read_unlock();
+		return;
+	}
+	vfinfo[vf].link_state = state;
 
 	switch (state) {
 	case IFLA_VF_LINK_STATE_AUTO:
 		if (test_bit(__IXGBE_DOWN, &adapter->state))
-			adapter->vfinfo[vf].link_enable = false;
+			vfinfo[vf].link_enable = false;
 		else
-			adapter->vfinfo[vf].link_enable = true;
+			vfinfo[vf].link_enable = true;
 		break;
 	case IFLA_VF_LINK_STATE_ENABLE:
-		adapter->vfinfo[vf].link_enable = true;
+		vfinfo[vf].link_enable = true;
 		break;
 	case IFLA_VF_LINK_STATE_DISABLE:
-		adapter->vfinfo[vf].link_enable = false;
+		vfinfo[vf].link_enable = false;
 		break;
 	}
 
 	ixgbe_set_vf_rx_tx(adapter, vf);
 
 	/* restart the VF */
-	adapter->vfinfo[vf].clear_to_send = false;
+	vfinfo[vf].clear_to_send = false;
 	ixgbe_ping_vf(adapter, vf);
+	rcu_read_unlock();
 }
 
 /**
@@ -1923,6 +2115,7 @@ int ixgbe_ndo_set_vf_rss_query_en(struct net_device *netdev, int vf,
 				  bool setting)
 {
 	struct ixgbe_adapter *adapter = ixgbe_from_netdev(netdev);
+	struct vf_data_storage *vfinfo;
 
 	/* This operation is currently supported only for 82599 and x540
 	 * devices.
@@ -1934,7 +2127,11 @@ int ixgbe_ndo_set_vf_rss_query_en(struct net_device *netdev, int vf,
 	if (vf >= adapter->num_vfs)
 		return -EINVAL;
 
-	adapter->vfinfo[vf].rss_query_enabled = setting;
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (vfinfo)
+		vfinfo[vf].rss_query_enabled = setting;
+	rcu_read_unlock();
 
 	return 0;
 }
@@ -1942,18 +2139,31 @@ int ixgbe_ndo_set_vf_rss_query_en(struct net_device *netdev, int vf,
 int ixgbe_ndo_set_vf_trust(struct net_device *netdev, int vf, bool setting)
 {
 	struct ixgbe_adapter *adapter = ixgbe_from_netdev(netdev);
+	struct vf_data_storage *vfinfo;
 
 	if (vf >= adapter->num_vfs)
 		return -EINVAL;
 
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo) {
+		rcu_read_unlock();
+		return 0;
+	}
+
 	/* nothing to do */
-	if (adapter->vfinfo[vf].trusted == setting)
+	if (vfinfo[vf].trusted == setting) {
+		rcu_read_unlock();
 		return 0;
+	}
 
-	adapter->vfinfo[vf].trusted = setting;
+	vfinfo[vf].trusted = setting;
 
 	/* reset VF to reconfigure features */
-	adapter->vfinfo[vf].clear_to_send = false;
+	vfinfo[vf].clear_to_send = false;
+
+	rcu_read_unlock();
+
 	ixgbe_ping_vf(adapter, vf);
 
 	e_info(drv, "VF %u is %strusted\n", vf, setting ? "" : "not ");
@@ -1965,17 +2175,30 @@ int ixgbe_ndo_get_vf_config(struct net_device *netdev,
 			    int vf, struct ifla_vf_info *ivi)
 {
 	struct ixgbe_adapter *adapter = ixgbe_from_netdev(netdev);
+	struct vf_data_storage *vfinfo;
+
 	if (vf >= adapter->num_vfs)
 		return -EINVAL;
 	ivi->vf = vf;
-	memcpy(&ivi->mac, adapter->vfinfo[vf].vf_mac_addresses, ETH_ALEN);
-	ivi->max_tx_rate = adapter->vfinfo[vf].tx_rate;
+
+	rcu_read_lock();
+	vfinfo = rcu_dereference(adapter->vfinfo);
+	if (!vfinfo) {
+		rcu_read_unlock();
+		return -EINVAL;
+	}
+
+	memcpy(&ivi->mac, vfinfo[vf].vf_mac_addresses, ETH_ALEN);
+	ivi->max_tx_rate = vfinfo[vf].tx_rate;
 	ivi->min_tx_rate = 0;
-	ivi->vlan = adapter->vfinfo[vf].pf_vlan;
-	ivi->qos = adapter->vfinfo[vf].pf_qos;
-	ivi->spoofchk = adapter->vfinfo[vf].spoofchk_enabled;
-	ivi->rss_query_en = adapter->vfinfo[vf].rss_query_enabled;
-	ivi->trusted = adapter->vfinfo[vf].trusted;
-	ivi->linkstate = adapter->vfinfo[vf].link_state;
+	ivi->vlan = vfinfo[vf].pf_vlan;
+	ivi->qos = vfinfo[vf].pf_qos;
+	ivi->spoofchk = vfinfo[vf].spoofchk_enabled;
+	ivi->rss_query_en = vfinfo[vf].rss_query_enabled;
+	ivi->trusted = vfinfo[vf].trusted;
+	ivi->linkstate = vfinfo[vf].link_state;
+
+	rcu_read_unlock();
+
 	return 0;
 }

-- 
2.53.0


^ permalink raw reply related

* Re: TCP default settings (bugzilla)
From: Willy Tarreau @ 2026-04-17  7:33 UTC (permalink / raw)
  To: plantegg ren; +Cc: stephen, netdev
In-Reply-To: <CAMrUUotdnC+Gv3oud75Ns3BMiOCEo+_fNYX4_L=r=YhtyzZ0Qw@mail.gmail.com>

On Fri, Apr 17, 2026 at 03:01:08PM +0800, plantegg ren wrote:
> Hi,
> 
> One more real-world data point that just happened two weeks ago,
> directly related to tcp_keepalive_time.
> 
> AWS recently rolled out Nitro V6 (8th-gen EC2 instances) which reduced
> the ENI connection tracking timeout from 432000 seconds (5 days) to
> just 350 seconds:
> 
>   https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/security-group-connection-tracking.html
> 
> Our MySQL/HikariCP connection pools started seeing intermittent timeout
> errors every 20-30 minutes after migrating to 8th-gen instances. We
> captured packets on both client and server simultaneously. Here is what
> we found on a single connection (idle for 818 seconds, well past the
> 350-second ENI timeout):
> 
> Server side -- MySQL receives the request and sends responses normally:
> 
>   #270  71.51s  10.23.99.71 -> 172.20.64.240  [ACK]     last activity
>                   ~~~ connection idle for 818 seconds ~~~
>   #271 889.94s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  client request arrives
>   #272 889.94s  172.20.64.240 -> 10.23.99.71  [PSH,ACK] len=11 server responds OK
>   #275 890.15s  172.20.64.240 -> 10.23.99.71  [PSH,ACK] len=11 server retransmits
>   #278 890.59s  172.20.64.240 -> 10.23.99.71  [PSH,ACK] len=11 server retransmits
>   #281 891.02s  172.20.64.240 -> 10.23.99.71  [PSH,ACK] len=11 server retransmits
>     ... (server keeps retransmitting, client never ACKs)
> 
> Client side -- sends request, but NEVER receives any server response:
> 
>   #267  71.51s  10.23.99.71 -> 172.20.64.240  [ACK]     last activity
>                   ~~~ connection idle for 818 seconds ~~~
>   #268 889.94s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  sends request
>   #269 890.15s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 1
>   #270 890.37s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 2
>   #271 890.79s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 3
>   #272 891.65s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 4
>   #273 893.38s  10.23.99.71 -> 172.20.64.240  [PSH,ACK] len=5  retransmit 5
>   #274 894.94s  10.23.99.71 -> 172.20.64.240  [FIN,ACK]         gives up
> 
>   Zero packets from 172.20.64.240 after the idle gap. Zero RSTs.
> 
> The ENI silently drops all inbound packets (server -> client) because
> the connection tracking entry expired after 350 seconds. Outbound
> packets (client -> server) still pass through, so the server receives
> the request and responds -- but its responses are black-holed by the
> ENI. No RST is sent, so both sides are completely unaware.
> 
> If tcp_keepalive_time were lower than 350 seconds, the keepalive probes
> would have kept the ENI tracking entry alive, and none of this would
> have happened.
> 
> The trend is clear -- middlebox idle timeouts are getting shorter (AWS
> went from 432000s to 350s overnight), while tcp_keepalive_time has
> stayed at 7200 seconds for decades. The gap is widening.

It's up to the application to configure keepalive if it relies on
long-lived connections. That's done with TCP_KEEPIDLE (idle time before
the first probe) and TCP_KEEPINTVL (interval between probes). If you're
dealing with an application that doesn't expose these settings, you
still have access to the system-wide setting above.
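
For illustration, a minimal sketch of the per-socket approach on Linux.
The helper name enable_keepalive() and the 300/30/5 values used with it
are assumptions chosen for the example, not something taken from this
thread; TCP_KEEPIDLE is the per-socket counterpart of the system-wide
tcp_keepalive_time:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Enable TCP keepalive on one socket (Linux-specific options).
 *   idle_s:  idle seconds before the first probe is sent
 *   intvl_s: seconds between unanswered probes
 *   cnt:     unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure (errno is set by setsockopt()).
 */
static int enable_keepalive(int fd, int idle_s, int intvl_s, int cnt)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
		return -1;
	return 0;
}
```

Calling enable_keepalive(fd, 300, 30, 5) would send the first probe
after 300 idle seconds, which is below the 350-second tracking timeout
quoted above.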

It's been well-known for at least two decades that no middle box could
sanely keep idle connections forever with the amount of traffic they're
seeing. 25 years ago I was already tuning the conntrack timeouts for a
bank firewall that was dealing with only 6k connections per second so
as to stay within reasonable memory sizes while keeping a good quality
of service. There's nothing new here.

Willy

^ permalink raw reply

* Re: [PATCH net] net: dsa: mt7530: fix .get_stats64 sleeping in atomic context
From: Chester A. Unal @ 2026-04-17  7:35 UTC (permalink / raw)
  To: Daniel Golle, Andrew Lunn, Vladimir Oltean, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Matthias Brugger,
	AngeloGioacchino Del Regno, Russell King, Christian Marangi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek
  Cc: Frank Wunderlich, John Crispin
In-Reply-To: <79dc0ec5b6be698b14cb66339d6f63033ca2934a.1776397542.git.daniel@makrotopia.org>

On 17 April 2026 04:55:57 WEST, Daniel Golle <daniel@makrotopia.org> wrote:
>The .get_stats64 callback runs in atomic context, but on
>MDIO-connected switches every register read acquires the MDIO bus
>mutex, which can sleep:
>[   12.645973] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:609
>[   12.654442] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 759, name: grep
>[   12.663377] preempt_count: 0, expected: 0
>[   12.667410] RCU nest depth: 1, expected: 0
>[   12.671511] INFO: lockdep is turned off.
>[   12.675441] CPU: 0 UID: 0 PID: 759 Comm: grep Tainted: G S      W           7.0.0+ #0 PREEMPT
>[   12.675453] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
>[   12.675456] Hardware name: Bananapi BPI-R64 (DT)
>[   12.675459] Call trace:
>[   12.675462]  show_stack+0x14/0x1c (C)
>[   12.675477]  dump_stack_lvl+0x68/0x8c
>[   12.675487]  dump_stack+0x14/0x1c
>[   12.675495]  __might_resched+0x14c/0x220
>[   12.675504]  __might_sleep+0x44/0x80
>[   12.675511]  __mutex_lock+0x50/0xb10
>[   12.675523]  mutex_lock_nested+0x20/0x30
>[   12.675532]  mt7530_get_stats64+0x40/0x2ac
>[   12.675542]  dsa_user_get_stats64+0x2c/0x40
>[   12.675553]  dev_get_stats+0x44/0x1e0
>[   12.675564]  dev_seq_printf_stats+0x24/0xe0
>[   12.675575]  dev_seq_show+0x14/0x3c
>[   12.675583]  seq_read_iter+0x37c/0x480
>[   12.675595]  seq_read+0xd0/0xec
>[   12.675605]  proc_reg_read+0x94/0xe4
>[   12.675615]  vfs_read+0x98/0x29c
>[   12.675625]  ksys_read+0x54/0xdc
>[   12.675633]  __arm64_sys_read+0x18/0x20
>[   12.675642]  invoke_syscall.constprop.0+0x54/0xec
>[   12.675653]  do_el0_svc+0x3c/0xb4
>[   12.675662]  el0_svc+0x38/0x200
>[   12.675670]  el0t_64_sync_handler+0x98/0xdc
>[   12.675679]  el0t_64_sync+0x158/0x15c
>
>For MDIO-connected switches, poll MIB counters asynchronously using a
>delayed workqueue every second and let .get_stats64 return the cached
>values under a per-port spinlock. A mod_delayed_work() call on each
>read triggers an immediate refresh so counters stay responsive when
>queried more frequently.
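
The cached-counter scheme described above can be sketched in userspace
C roughly as follows. This is only a model under stated assumptions:
the struct and function names are hypothetical, a C11 atomic_flag
stands in for the kernel spinlock, and the slow bus read is assumed to
have happened before cache_publish() is called.

```c
#include <stdatomic.h>
#include <stdint.h>

struct stats {
	uint64_t rx_packets;
	uint64_t tx_packets;
};

struct port_cache {
	atomic_flag lock;	/* toy spinlock, stand-in for spinlock_t */
	struct stats cached;	/* last snapshot published by the worker */
};

static void cache_lock(struct port_cache *pc)
{
	while (atomic_flag_test_and_set_explicit(&pc->lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void cache_unlock(struct port_cache *pc)
{
	atomic_flag_clear_explicit(&pc->lock, memory_order_release);
}

/* Worker side: 'fresh' was read over the slow, sleeping bus beforehand. */
static void cache_publish(struct port_cache *pc, const struct stats *fresh)
{
	cache_lock(pc);
	pc->cached = *fresh;
	cache_unlock(pc);
}

/* Reader side: never touches the bus, only copies the cached snapshot. */
static void cache_read(struct port_cache *pc, struct stats *out)
{
	cache_lock(pc);
	*out = pc->cached;
	cache_unlock(pc);
}
```

The point of the split is that the lock is held only for a struct copy,
so the reader never waits on bus I/O.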
>
>MMIO-connected switches (MT7988, EN7581, AN7583) are not affected
>because their regmap does not sleep, so they continue to read MIB
>counters directly in .get_stats64.
>
>Fixes: 88c810f35ed5 ("net: dsa: mt7530: implement .get_stats64")
>Signed-off-by: Daniel Golle <daniel@makrotopia.org>
>---
>This bug highlights a bigger problem and the actual cause:
>Locking in the mt7530 driver deserves a cleanup, and refactoring
>towards cleanly and directly using the regmap API.
>I've prepared this already and am going to submit a series doing
>most of that using Coccinelle semantic patches once net-next opens
>again.

Acked-by: Chester A. Unal <chester.a.unal@arinc9.com>

Chester A.

^ permalink raw reply

* Re: [PATCHv3] selftests: Use ktap helpers for runner.sh
From: Hangbin Liu @ 2026-04-17  7:43 UTC (permalink / raw)
  To: Qingfang Deng
  Cc: Brendan Jackman, Mark Brown, Shuah Khan, linux-kselftest, netdev
In-Reply-To: <861b722f-6d5e-4f47-9d17-00a98fceb8cc@linux.dev>

On Thu, Apr 16, 2026 at 04:07:08PM +0800, Qingfang Deng wrote:
> Hi, Hangbin
> 
> This patch broke selftests run with `make -C tools/testing/selftests` as
> make uses /bin/sh by default:
> 
> /bin/sh: 5: /home/qf/linux-next/tools/testing/selftests/kselftest/runner.sh:
> Bad substitution
> 
> Add `SHELL := /bin/bash` to the start of lib.mk to fix this.

Hi Qingfang,

Thanks for the report. I saw Mark helped fix this issue.
Sorry for the inconvenience.

hangbin

^ permalink raw reply

* Re: [PATCH for-7.1-fixes 1/2] rhashtable: add no_sync_grow option
From: Herbert Xu @ 2026-04-17  7:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Thomas Graf, David Vernet, Andrea Righi, Changwoo Min,
	Emil Tsalapatis, linux-crypto, sched-ext, linux-kernel,
	Florian Westphal, netdev
In-Reply-To: <aeHjjGEhlikSsxCX@slm.duckdns.org>

On Thu, Apr 16, 2026 at 09:38:52PM -1000, Tejun Heo wrote:
>
> Also, taking a step back, if rhashtable allows usage under raw spin locks,
> isn't this broken regardless of how easy or difficult it may be to reproduce
> the problem? Practically speaking, the scx_sched_hash one is unlikely to
> trigger in real world; however, it is still theoretically possible and I'm
> pretty positive that one would be able to create a repro case with the right
> interference workload. It'd be contrived for sure but should be possible.

rhashtable originated in networking where it tries very hard to
stop the hash table from ever degenerating into a linked list.

If your use-case is not as adversarial as that, and you're happy
for the hash table to degenerate into a linked-list in the worst
case, then yes it's absolutely fine to not grow the table (or
try to grow it and fail with kmalloc_nolock).

It's just that we haven't had any users like this until now and
the feature that you want got removed because of that.

I'm more than happy to bring it back (commit 5f8ddeab10ce).

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH net,v2 00/11] Netfilter/IPVS fixes for net
From: Pablo Neira Ayuso @ 2026-04-17  7:51 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netfilter-devel, davem, netdev, kuba, pabeni, edumazet, horms
In-Reply-To: <aeFRt__YQqJ84ZaN@strlen.de>

On Thu, Apr 16, 2026 at 11:16:39PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
[...]
> If you don't want to take this v2 because of above issues, please
> consider at least applying
> 
>   ↳ [2026-04-16] Pablo Neira Ayuso <pablo@netfilter.org>: [PATCH net 08/11] ipvs: fix MTU check for GSO packets in tunnel mode
>   ↳ [2026-04-16] Pablo Neira Ayuso <pablo@netfilter.org>: [PATCH net 09/11] netfilter: nf_tables: use list_del_rcu for netlink hooks
>   ↳ [2026-04-16] Pablo Neira Ayuso <pablo@netfilter.org>: [PATCH net 10/11] rculist: add list_splice_rcu() for private lists
>   ↳ [2026-04-16] Pablo Neira Ayuso <pablo@netfilter.org>: [PATCH net 05/11] netfilter: conntrack: remove sprintf usage
>   ↳ [2026-04-16] Pablo Neira Ayuso <pablo@netfilter.org>: [PATCH net 06/11] netfilter: xtables: restrict several matches to inet family
> 
> manually.  nf:main always tracks net:main, applying them manually
> doesn't cause issues.

Florian, I am going to prepare a v3.

^ permalink raw reply

* [PATCH] tipc: crypto: require a NUL-terminated AEAD algorithm name
From: Pengpeng Hou @ 2026-04-17  7:53 UTC (permalink / raw)
  To: Jon Maloy, David S. Miller
  Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev,
	tipc-discussion, linux-kernel, Pengpeng Hou, stable

struct tipc_aead_key carries alg_name in a fixed 32-byte field, but both
the generic netlink validation path and the MSG_CRYPTO receive path pass
that field straight to crypto_has_alg(), strcmp(), and
crypto_alloc_aead() without first proving that it contains a terminating
NUL.

Reject locally supplied and received keys whose algorithm name fills the
entire fixed-width field without a terminator.
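
As a standalone illustration of the check (a userspace sketch, not the
kernel code): ALG_NAME_LEN is a stand-in for TIPC_AEAD_ALG_NAME and the
function name is hypothetical. It uses memchr(), which gives the same
answer as comparing strnlen() against the field width:

```c
#include <stdbool.h>
#include <string.h>

#define ALG_NAME_LEN 32	/* stand-in for the fixed 32-byte field */

/* True only if a NUL byte occurs within the first ALG_NAME_LEN bytes. */
static bool alg_name_terminated(const char name[ALG_NAME_LEN])
{
	return memchr(name, '\0', ALG_NAME_LEN) != NULL;
}
```

A name that fills all 32 bytes fails the check, so later string
functions are never handed an unterminated buffer.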

Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Cc: stable@vger.kernel.org

Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
---
 net/tipc/crypto.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
index 6d3b6b89b1d1..60110ea0fe7c 100644
--- a/net/tipc/crypto.c
+++ b/net/tipc/crypto.c
@@ -307,6 +307,11 @@ static void tipc_crypto_work_tx(struct work_struct *work);
 static void tipc_crypto_work_rx(struct work_struct *work);
 static int tipc_aead_key_generate(struct tipc_aead_key *skey);
 
+static bool tipc_aead_alg_name_valid(const char *alg_name)
+{
+	return strnlen(alg_name, TIPC_AEAD_ALG_NAME) < TIPC_AEAD_ALG_NAME;
+}
+
 #define is_tx(crypto) (!(crypto)->node)
 #define is_rx(crypto) (!is_tx(crypto))
 
@@ -335,6 +340,11 @@ int tipc_aead_key_validate(struct tipc_aead_key *ukey, struct genl_info *info)
 {
 	int keylen;
 
+	if (unlikely(!tipc_aead_alg_name_valid(ukey->alg_name))) {
+		GENL_SET_ERR_MSG(info, "algorithm name is not NUL-terminated");
+		return -EINVAL;
+	}
+
 	/* Check if algorithm exists */
 	if (unlikely(!crypto_has_alg(ukey->alg_name, 0, 0))) {
 		GENL_SET_ERR_MSG(info, "unable to load the algorithm (module existed?)");
@@ -2298,6 +2308,10 @@ static bool tipc_crypto_key_rcv(struct tipc_crypto *rx, struct tipc_msg *hdr)
 		pr_debug("%s: invalid MSG_CRYPTO key size\n", rx->name);
 		goto exit;
 	}
+	if (unlikely(!tipc_aead_alg_name_valid(data))) {
+		pr_debug("%s: invalid MSG_CRYPTO algorithm name\n", rx->name);
+		goto exit;
+	}
 
 	spin_lock(&rx->lock);
 	if (unlikely(rx->skey || (key_gen == rx->key_gen && rx->key.keys))) {
-- 
2.50.1 (Apple Git-155)


^ permalink raw reply related

* Re: [PATCH net v2] hv_sock: Report EOF instead of -EIO for FIN
From: Stefano Garzarella @ 2026-04-17  8:11 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: kys, haiyangz, wei.liu, longli, davem, edumazet, kuba, pabeni,
	horms, niuxuewei.nxw, linux-hyperv, virtualization, netdev,
	linux-kernel, stable, Ben Hillis, Mitchell Levy
In-Reply-To: <20260416191433.840637-1-decui@microsoft.com>

On Thu, Apr 16, 2026 at 12:14:33PM -0700, Dexuan Cui wrote:
>Commit f0c5827d07cb unluckily causes a regression for the FIN packet,
>and the final read syscall gets an error rather than 0.
>
>Ideally, we would want to fix hvs_channel_readable_payload() so that it
>could return 0 in the FIN scenario, but it's not good for the hv_sock
>driver to use the VMBus ringbuffer's cached priv_read_index, which is
>internal data in the VMBus driver.
>
>Fix the regression in hv_sock by returning 0 rather than -EIO.
>
>Fixes: f0c5827d07cb ("hv_sock: Return the readable bytes in hvs_stream_has_data()")
>Cc: stable@vger.kernel.org
>Reported-by: Ben Hillis <Ben.Hillis@microsoft.com>
>Reported-by: Mitchell Levy <levymitchell0@gmail.com>
>Signed-off-by: Dexuan Cui <decui@microsoft.com>
>---
>
>Changes since v1:
>    Removed the local variable 'need_refill' to make the code more
>    readable. Stefano, thanks!

Thanks for the fix!

>
>    No other change.
>
> net/vmw_vsock/hyperv_transport.c | 20 ++++++++++++++++----
> 1 file changed, 16 insertions(+), 4 deletions(-)

Acked-by: Stefano Garzarella <sgarzare@redhat.com>


^ permalink raw reply

* Re: [RFC PATCH] mm: net: disable kswapd for high-order network buffer allocation
From: wang lian @ 2026-04-17  8:11 UTC (permalink / raw)
  To: willy
  Cc: 21cnbao, corbet, davem, edumazet, hannes, horms, jackmanb, kuba,
	kuniyu, linux-doc, linux-kernel, linux-mm, linyunsheng, mhocko,
	netdev, pabeni, surenb, v-songbaohua, vbabka, willemb, zhouhuacai,
	ziy, wang lian
In-Reply-To: <aO11jqD6jgNs5h8K@casper.infradead.org>


Hi Matthew, Barry,

> So, we try to do an order-3 allocation. kswapd runs and ...
> succeeds in creating order-3 pages? Or fails to?
From our reproducer runs, both happen. We observe intermittent order-3
successes, but also frequent high-order failures followed by order-0
fallback.

> If it fails, that's something we need to sort out.
Agreed. In this workload, the bottleneck appears to be contiguity, not
raw reclaimable memory shortage. Order-0 memory remains available while
suitable order-3 blocks are often unavailable.

> If it succeeds, now we have several order-3 pages, great. But where do
> they all go that we need to run kswapd again?
In our runs, order-3 pockets do show up, but they do not last long.
They get consumed quickly by ongoing skb demand, and the pressure returns.

To investigate this, we built a reproducer that keeps creating memory fragments 
while the network stack continuously requests order-3 allocations.[1][2]

Raw sample output (trimmed):
---------------------------------------------------------------------------------------------------
TIME       | BUDDYINFO (Normal Zone)        | MEMINFO                   | KSWAPD CPU & VMSTAT      
---------------------------------------------------------------------------------------------------
11:08:11   | ord0:11622 ord3:0              | Free:96MB Avail:1309MB    | CPU: 10.0%  scan:83107932
[*] PHASE 3: Triggering Order-3 Pressure (UDP Storm).
11:08:15   | ord0:52079 ord3:0              | Free:273MB Avail:1300MB   | CPU: 90.9%  scan:85328881
11:08:16   | ord0:102895 ord3:0             | Free:477MB Avail:1309MB   | CPU: 60.0%  scan:85873777
11:08:17   | ord0:115459 ord3:5             | Free:517MB Avail:1284MB   | CPU: 54.5%  scan:86584389
11:08:18   | ord0:115164 ord3:0             | Free:509MB Avail:1107MB   | CPU: 36.4%  scan:87083561
---------------------------------------------------------------------------------------------------

The current phenomenon we observed is: Free memory is plentiful, Order-0 
pages are abundant, and the network allocation has already successfully 
entered the fallback-to-order-0 path. Everything seems normal on the 
surface, yet kswapd remains trapped in a futile loop.

It appears that kswapd is stuck in the following logic: 
wakeup_kswapd -> pgdat_balance -> __zone_watermark_ok. 

Specifically, in __zone_watermark_ok():

        /* For a high-order request, check at least one suitable page is free */
        for (o = order; o < NR_PAGE_ORDERS; o++) {
                struct free_area *area = &z->free_area[o];
                int mt;

                if (!area->nr_free)
                        continue;

                for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
                        if (!free_area_empty(area, mt))
                                return true;
                }
        }

Because our reproducer keeps creating fragmentation while the network
stack requests order-3, this check keeps failing for the high-order
requirement, even though the system is functionally fine with order-0.
To be clear, we are not creating "artificial" fragments for their own
sake. We designed this reproducer to stress-test and expose the existing
feedback gap in the reclaim/compaction logic, and to help pinpoint why
kswapd keeps burning CPU cycles to satisfy a watermark that the
allocator has already abandoned in favor of order-0 fallback.
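
To make the failure mode concrete, here is a deliberately simplified
userspace model of the high-order part of that check. It ignores the
watermark arithmetic and the per-migratetype lists; NR_ORDERS and
has_free_block() are hypothetical names, and the only input is a
buddyinfo-style array of free-block counts per order:

```c
#include <stdbool.h>

#define NR_ORDERS 11	/* assumed: orders 0..10 */

/*
 * Simplified model: a request of 'order' is satisfiable only if some
 * free block of that order or larger exists, no matter how many
 * smaller blocks are free.
 */
static bool has_free_block(const unsigned long nr_free[NR_ORDERS],
			   unsigned int order)
{
	for (unsigned int o = order; o < NR_ORDERS; o++) {
		if (nr_free[o])
			return true;
	}
	return false;
}
```

With counts resembling the 11:08:18 sample (about 115k order-0 blocks,
and assuming nothing free at order 3 or above), the model passes for
order 0 and fails for order 3.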

A related discussion in [3] helps reduce vmpressure noise in this area.
Useful, but it does not close the contiguity gap by itself: high-order
wake/reclaim can still repeat when contiguous blocks cannot be formed.

It seems the current situation directs us to take a much closer look at 
how kswapd behaves in these scenarios. After carefully reviewing 
everyone's input, we believe it is time to do some targeted work on 
handling these high-order page issues. 

We already have some rough ideas and plan to conduct further experiments 
in this area. We would appreciate a broader discussion to help address 
this potential oversight that we might have collectively missed.

Links:
[1] https://github.com/hack-kernel-just-for-fun/kswap/blob/main/kswapd_spin_repro.c
[2] https://github.com/hack-kernel-just-for-fun/kswap/blob/main/kswapd.sh
[3] https://lore.kernel.org/all/20260406195014.112521-1-jp.kobryn@linux.dev/#r

This was reproduced and cross-checked independently by our team
(Wang Lian <lianux.mm@gmail.com> and Kunwu Chan <kunwu.chan@gmail.com>).

--
Best Regards,
wang lian

^ permalink raw reply

* Re: Path forward for NFC in the kernel
From: David Heidelberg @ 2026-04-17  8:12 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Jakub Kicinski, Michael Thalmeier,
	Raymond Hackley, Michael Walle, Bongsu Jeon, Mark Greer
  Cc: netdev
In-Reply-To: <9c4a4acf-b4f1-4e84-93bf-cdf080cb9970@kernel.org>

On 17/04/2026 09:18, Krzysztof Kozlowski wrote:
> On 16/04/2026 19:10, Jakub Kicinski wrote:
>> Hi folks!
>>
>> We are struggling to keep up with the number of security reports and AI
>> generated patches in the kernel. NFC is infamous for being a huge CVE
>> magnet. We need someone to step up as a maintainer, create an NFC tree
>> and handle all the incoming submissions. Send us (or Linus if you
>> prefer) periodic PRs, like WiFi, Bluetooth etc. do. If that does not
>> happen I'm afraid we'll have to move the NFC code out of the tree,
>> put it up on GH or some such, and let it accumulate CVEs there..
>>
>> I'm planning to send a PR to Linus to shed the unmaintained code early
>> next week. We need to have a maintainer established by then.
> 
> +Cc David Heidelberg recently trying to use Linux NFC stack,
> 
> Just "collecting" patches is not a big deal, I could do this, but
> actually reviewing the patches with necessary due diligence is the
> effort I could not provide in a reasonable time frame. And picking up
> patches without proper review feels risky...

Hello Krzysztof, Jakub,

thanks for putting me in the loop.

I can do limited reviews and basic maintenance. My knowledge of NFC is
somewhat limited for now, but I'm willing to invest my limited time in
learning more.

"I & LLM" wrote a very basic userspace reader for GNOME [1], and I'm
planning tighter integration into GNOME, so it would make sense to keep
the kernel stack alive.

[1] https://gitlab.gnome.org/dh/gnfc/

> 
> NFC has a long history of issues, first mostly pointed out by syzbot but
> now apparently by AI tools. The code base is quite old, with no major
> improvements or testings happening but not in a way "oh, it's stable and
> working like 'cp' command" but rather "no one knows how many bugs are on
> top of each other and if it actually still works".
> 
> Syzbot and AI reported bugs encourage random drive-by fixes by people
> not testing the code, thus particular bug report might be fixed, but for
> example NFC stops working and no one knows that.

I think I could filter out nonsense, possibly with the help of Sashiko [2].

[2] https://sashiko.dev/

> 
> Does anyone know if the NFC stack/drivers actually work fine? Did
> anyone test actual devices?

Yes, nxp,pn553, nxp,pn557.

Other people have also tested it on some phones with different tags (I
currently have only one tag, with a vCard loaded on it).

David

> 
> If not, then moving to Github would be even more reasonable.
> 
> Another point is that AFAIU, most of real world devices, like
> Android-based phones, don't use the Linux NFC stack but their custom
> HAL/user-space based libraries and drivers. Some other non-Android
> projects use libnfc userspace, which seems to be maintained only as
> bugfix (https://github.com/nfc-tools/libnfc/commits/master/).
> 
> Best regards,
> Krzysztof

^ permalink raw reply

* Re: [PATCH] net/packet: fix TOCTOU race on mmap'd vnet_hdr in tpacket_snd()
From: Willem de Bruijn @ 2026-04-17  8:15 UTC (permalink / raw)
  To: Zero Mark, Willem de Bruijn
  Cc: security, David S . Miller, Jakub Kicinski, Eric Dumazet, netdev,
	Zero Mark
In-Reply-To: <20260417060714.35488-1-patzilla007@gmail.com>

Zero Mark wrote:
> In tpacket_snd(), when PACKET_VNET_HDR is enabled, vnet_hdr points
> directly into the mmap'd TX ring buffer shared with userspace. The
> kernel validates the header via __packet_snd_vnet_parse() but then
> re-reads all fields later in virtio_net_hdr_to_skb(). A concurrent
> userspace thread can modify the vnet_hdr fields (gso_type, gso_size,
> flags, csum_start, csum_offset) between validation and use, bypassing
> all safety checks.
> 
> This can lead to:
>  - Out-of-bounds checksum writes via crafted csum_start/csum_offset
>  - Malicious GSO segmentation parameters
>  - Kernel memory corruption and potential local privilege escalation
> 
> The non-TPACKET path (packet_snd()) already correctly copies vnet_hdr
> to a stack-local variable. All other vnet_hdr consumers in the kernel
> (tun.c, tap.c, virtio_net.c) also use stack copies. The TPACKET TX
> path is the only caller of virtio_net_hdr_to_skb() that reads directly
> from user-controlled shared memory.
> 
> Fix this by copying vnet_hdr from the mmap'd ring buffer to a
> stack-local variable before validation and use, consistent with the
> approach used in packet_snd() and all other callers.
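
A minimal userspace sketch of the copy-then-validate pattern described
above. Everything here is illustrative: struct demo_hdr is a
hypothetical stand-in for struct virtio_net_hdr, and the bounds check
is a made-up example rather than the kernel's actual validation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, standing in for struct virtio_net_hdr. */
struct demo_hdr {
	uint16_t csum_start;
	uint16_t csum_offset;
};

/*
 * Fetch the header from shared memory exactly once, then validate and
 * use only the private copy. Returns 0 on success, -1 if the frame is
 * too short or the (illustrative) bounds check fails.
 */
static int snapshot_hdr(const void *shared, size_t len, size_t pkt_len,
			struct demo_hdr *out)
{
	if (len < sizeof(*out))
		return -1;			/* length check before the copy */

	memcpy(out, shared, sizeof(*out));	/* the single fetch */

	/* Validate the copy: a 2-byte checksum must fit in the packet. */
	if ((size_t)out->csum_start + out->csum_offset + 2 > pkt_len)
		return -1;
	return 0;
}
```

Because later code reads only the copy, a concurrent writer changing
the shared buffer after validation has no effect on what gets used.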
> 
> Exploitation requires CAP_NET_RAW, which can be obtained without
> special privileges via user namespaces.
> 
> Confirmed with a PoC on Linux 6.8.0 (Ubuntu): kprobe tracing on
> skb_partial_csum_set captured 77 race wins in 500,000 iterations.

No need to add such details on exploitability of bugs.

> Affects all kernels since PACKET_VNET_HDR support was added to the
> TPACKET TX path (~v3.14).
> 
> Fixes: 9ed988e5 ("packet: add vnet_hdr support for tpacket_snd")

This commit does not exist. Also, use a 12-character SHA-1.

I think this should be

Fixes: 1d036d25e560 ("packet: tpacket_snd gso and checksum offload")

> Signed-off-by: Zero Mark <patzilla007@gmail.com>

Thanks for the fix!

However, it does not apply cleanly. Please mark fixes as [PATCH net] and
ensure that they apply to the current netdev-net/main.

https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html

> ---
>  net/packet/af_packet.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index abcdef012345..fedcba654321 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2725,7 +2725,8 @@ static int tpacket_parse_header(struct packet_sock *po, void *frame,
>  static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
>  {
>  	struct sk_buff *skb = NULL;
>  	struct net_device *dev;
> -	struct virtio_net_hdr *vnet_hdr = NULL;
> +	struct virtio_net_hdr vnet_hdr;
> +	bool has_vnet_hdr = false;
>  	struct sockcm_cookie sockc;
>  	__be16 proto;
>  	int err, reserve = 0;
> @@ -2828,16 +2829,17 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
>  		if (po->has_vnet_hdr) {
> -			vnet_hdr = data;
> -			data += sizeof(*vnet_hdr);
> -			tp_len -= sizeof(*vnet_hdr);
> +			memcpy(&vnet_hdr, data, sizeof(vnet_hdr));

Move the tp_len < 0 check before memcpy

> +			data += sizeof(vnet_hdr);
> +			tp_len -= sizeof(vnet_hdr);
>  			if (tp_len < 0 ||
> -			    __packet_snd_vnet_parse(vnet_hdr, tp_len)) {
> +			    __packet_snd_vnet_parse(&vnet_hdr, tp_len)) {
>  				tp_len = -EINVAL;
>  				goto tpacket_error;
>  			}
>  			copylen = __virtio16_to_cpu(vio_le(),
> -						    vnet_hdr->hdr_len);
> +						    vnet_hdr.hdr_len);
> +			has_vnet_hdr = true;
>  		}
>  		copylen = max_t(int, copylen, dev->hard_header_len);
>  		skb = sock_alloc_send_skb(&po->sk,
> @@ -2875,11 +2877,11 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
>  		}
> 
> -		if (po->has_vnet_hdr) {
> -			if (virtio_net_hdr_to_skb(skb, vnet_hdr, vio_le())) {
> +		if (has_vnet_hdr) {
> +			if (virtio_net_hdr_to_skb(skb, &vnet_hdr, vio_le())) {
>  				tp_len = -EINVAL;
>  				goto tpacket_error;
>  			}
> -			virtio_net_hdr_set_proto(skb, vnet_hdr);
> +			virtio_net_hdr_set_proto(skb, &vnet_hdr);
>  		}
> 
>  		skb->destructor = tpacket_destruct_skb;
> --
> 2.43.0
> 



^ permalink raw reply
