Netdev List
 help / color / mirror / Atom feed
* [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf)
@ 2026-05-05  5:14 Jacob Keller
  2026-05-05  5:14 ` [PATCH net 01/13] i40e: Cleanup PTP registration on probe failure Jacob Keller
                   ` (13 more replies)
  0 siblings, 14 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Matt Vollrath, Sunitha Mekala,
	Kohei Enju, Paul Menzel, Simon Horman, Emil Tantilov,
	Samuel Salin, Patryk Holda, Greg Kroah-Hartman, Tony Nguyen,
	stable, Marcin Szycik, Bart Van Assche, intel-wired-lan,
	Arpana Arland, Rinitha S

Matt Volrath fixes two issues with the i40e driver probe routine, ensuring
that PTP is properly cleaned up if the probe fails.

Maciej fixes the i40e driver logic to keep the q_vectors array in sync with
changes to the channel count via ethtool.

Emil corrects the initialization of the read_dev_clk_lock spinlock in
idpf_ptp_init, ensuring it is initialized prior to when the
ptp_schedule_worker() is called.

Josh fixes the idpf driver to prevent enabling XDP if the queue based
scheduling is not supported by the firmware.

Josh fixes the idpf skb data path for handling queue based scheduling.

Josh fixes an XDP crash in the soft reset error path, restoring the
original configuration if idpf_xdp_setup_prog() fails.

Greg KH fixes a double free and use-after free in the idpf auxiliary device
error paths.

Marcin fixes ice_set_rss_hfunc() to use the correct q_opt_flags field,
correcting the assignment and preventing submission of invalid data to the
firmware.

Bart corrects the locking in ice_dcb_rebuild(), ensuring that the tc_mutex
is held over the entire operation.

Grzegorz fixes the ordering of ice_ptp_link_change() in ice_up_complete()
ensuring that the PTP timestamps will not be enabled before the PTP timer
is actually re-initialized.

Ivan fixes the rclk pin state get for E810 devices, ensuring the index is
properly offset by the base_rclk_idx value. This ensures that the correct
pin index is used to look up recovered clock state. He additionally adds
bounds checking to prevent attempting to access pins outside of the pin
state array.

Ivan also moves the CGU register macros to the top of ice_dpll.h, inside
the header guard to avoid duplicate macro definitions should the ice_dpll.h
header is included multiple times.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
Bart Van Assche (1):
      ice: fix locking in ice_dcb_rebuild()

Emil Tantilov (2):
      idpf: fix read_dev_clk_lock spinlock init in idpf_ptp_init()
      idpf: fix xdp crash in soft reset error path

Greg Kroah-Hartman (1):
      idpf: fix double free and use-after-free in aux device error paths

Grzegorz Nitka (1):
      ice: fix PTP hang for E825C devices

Ivan Vecera (2):
      ice: dpll: fix rclk pin state get for E810
      ice: dpll: fix misplaced header macros

Joshua Hay (2):
      idpf: do not enable XDP if queue based scheduling is not supported
      idpf: fix skb datapath queue based scheduling crashes and timeouts

Maciej Fijalkowski (1):
      i40e: keep q_vectors array in sync with channel count changes

Marcin Szycik (1):
      ice: fix setting RSS VSI hash for E830

Matt Vollrath (2):
      i40e: Cleanup PTP registration on probe failure
      i40e: Cleanup PTP pins on probe failure

 drivers/net/ethernet/intel/i40e/i40e.h          |  1 +
 drivers/net/ethernet/intel/ice/ice_dpll.h       | 32 ++++++-------
 drivers/net/ethernet/intel/idpf/idpf_txrx.h     | 12 +++--
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.h |  4 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c     | 36 ++++++++++++---
 drivers/net/ethernet/intel/i40e/i40e_ptp.c      |  3 +-
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c    |  4 +-
 drivers/net/ethernet/intel/ice/ice_dpll.c       |  5 ++
 drivers/net/ethernet/intel/ice/ice_main.c       |  6 +--
 drivers/net/ethernet/intel/idpf/idpf_idc.c      |  6 +++
 drivers/net/ethernet/intel/idpf/idpf_lib.c      |  4 +-
 drivers/net/ethernet/intel/idpf/idpf_ptp.c      |  4 +-
 drivers/net/ethernet/intel/idpf/idpf_txrx.c     | 61 ++++++++++++++-----------
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 19 ++------
 drivers/net/ethernet/intel/idpf/xdp.c           | 15 ++++--
 drivers/net/ethernet/intel/idpf/xsk.c           |  4 +-
 16 files changed, 132 insertions(+), 84 deletions(-)
---
base-commit: bd3a4795d5744f59a1f485379f1303e5e606f377
change-id: 20260504-jk-iwl-net-2026-05-04-f9526823577f

Best regards,
--  
Jacob Keller <jacob.e.keller@intel.com>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH net 01/13] i40e: Cleanup PTP registration on probe failure
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-06 20:24   ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 02/13] i40e: Cleanup PTP pins " Jacob Keller
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Matt Vollrath, Sunitha Mekala

From: Matt Vollrath <tactii@gmail.com>

Fix two conditions which would leak PTP registration on probe failure:

1. i40e_setup_pf_switch can encounter an error in
   i40e_setup_pf_filter_control, call i40e_ptp_init, then return
   non-zero, sending i40e_probe to err_vsis.

2. i40e_setup_misc_vector can return non-zero, sending i40e_probe to
   err_vsis.

Both of these conditions have been present since PTP was introduced in
this driver.

Found with coccinelle.

Fixes: beb0dff1251db ("i40e: enable PTP")
Signed-off-by: Matt Vollrath <tactii@gmail.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 028bd500603a..f06fcef644e5 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -16108,6 +16108,7 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* Unwind what we've done if something failed in the setup */
 err_vsis:
 	set_bit(__I40E_DOWN, pf->state);
+	i40e_ptp_stop(pf);
 	i40e_clear_interrupt_scheme(pf);
 	kfree(pf->vsi);
 err_switch_setup:

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 02/13] i40e: Cleanup PTP pins on probe failure
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
  2026-05-05  5:14 ` [PATCH net 01/13] i40e: Cleanup PTP registration on probe failure Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-06 20:28   ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 03/13] i40e: keep q_vectors array in sync with channel count changes Jacob Keller
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Matt Vollrath, Kohei Enju,
	Paul Menzel, Sunitha Mekala

From: Matt Vollrath <tactii@gmail.com>

PTP pin structs are allocated early in probe, but never cleaned up.

Fix this by calling i40e_ptp_free_pins in the error path.

To support this, i40e_ptp_free_pins is added to the header and
pin_config is correctly nullified after being freed.

This has been an issue since i40e_ptp_alloc_pins was introduced.

Fixes: 1050713026a08 ("i40e: add support for PTP external synchronization clock")
Reported-by: Kohei Enju <kohei@enjuk.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Matt Vollrath <tactii@gmail.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Kohei Enju <kohei@enjuk.jp>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h      | 1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c | 1 +
 drivers/net/ethernet/intel/i40e/i40e_ptp.c  | 3 ++-
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index dcb50c2e1aa2..83e780919ac9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -1318,6 +1318,7 @@ void i40e_ptp_restore_hw_time(struct i40e_pf *pf);
 void i40e_ptp_init(struct i40e_pf *pf);
 void i40e_ptp_stop(struct i40e_pf *pf);
 int i40e_ptp_alloc_pins(struct i40e_pf *pf);
+void i40e_ptp_free_pins(struct i40e_pf *pf);
 int i40e_update_adq_vsi_queues(struct i40e_vsi *vsi, int vsi_offset);
 int i40e_is_vsi_uplink_mode_veb(struct i40e_vsi *vsi);
 int i40e_get_partition_bw_setting(struct i40e_pf *pf);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index f06fcef644e5..6d4f9218dc68 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -16112,6 +16112,7 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	i40e_clear_interrupt_scheme(pf);
 	kfree(pf->vsi);
 err_switch_setup:
+	i40e_ptp_free_pins(pf);
 	i40e_reset_interrupt_capability(pf);
 	timer_shutdown_sync(&pf->service_timer);
 err_mac_addr:
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ptp.c b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
index 404a716db8da..7d07c389bb23 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ptp.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
@@ -940,12 +940,13 @@ int i40e_ptp_hwtstamp_get(struct net_device *netdev,
  *
  * Release memory allocated for PTP pins.
  **/
-static void i40e_ptp_free_pins(struct i40e_pf *pf)
+void i40e_ptp_free_pins(struct i40e_pf *pf)
 {
 	if (i40e_is_ptp_pin_dev(&pf->hw)) {
 		kfree(pf->ptp_pins);
 		kfree(pf->ptp_caps.pin_config);
 		pf->ptp_pins = NULL;
+		pf->ptp_caps.pin_config = NULL;
 	}
 }
 

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 03/13] i40e: keep q_vectors array in sync with channel count changes
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
  2026-05-05  5:14 ` [PATCH net 01/13] i40e: Cleanup PTP registration on probe failure Jacob Keller
  2026-05-05  5:14 ` [PATCH net 02/13] i40e: Cleanup PTP pins " Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-06 20:53   ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 04/13] idpf: fix read_dev_clk_lock spinlock init in idpf_ptp_init() Jacob Keller
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Simon Horman, Sunitha Mekala

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

For the main VSI, i40e_set_num_rings_in_vsi() always derives
num_q_vectors from pf->num_lan_msix. At the same time, ethtool -L stores
the user requested channel count in vsi->req_queue_pairs and the queue
setup path uses that value for the effective number of queue pairs.

This leaves queue and vector counts out of sync after shrinking channel
count via ethtool -L. The active queue configuration is reduced, but the
VSI still keeps the full PF-sized q_vector topology.

That mismatch breaks reconfiguration flows which rely on vector/NAPI
state matching the effective channel configuration. In particular,
toggling /sys/class/net/<dev>/threaded after reducing the channel count
can hang, and later channel-count changes can fail because VSI reinit
does not rebuild q_vectors to match the new vector count.

Fix this by making the main VSI num_q_vectors follow the effective
requested channel count, capped by the available MSI-X vectors. Update
i40e_vsi_reinit_setup() to rebuild q_vectors during VSI reinit so the
vector topology is refreshed together with the ring arrays when channel
count changes.

Keep alloc_queue_pairs unchanged and based on pf->num_lan_qps so the VSI
retains its full queue capacity.

Selftest napi_threaded.py was originally used when Jakub reported hang
on /sys/class/net/<dev>/threaded toggle. In order to make it pass on
i40e, use persistent NAPI configuration for q_vector NAPIs so NAPI
identity and threaded settings survive q_vector reallocation across
channel-count changes. This is achieved by using netif_napi_add_config()
when configuring q_vectors.

$ export NETIF=ens259f1np1
$ sudo -E env PATH="$PATH" ./tools/testing/selftests/drivers/net/napi_threaded.py
TAP version 13
1..3
ok 1 napi_threaded.napi_init
ok 2 napi_threaded.change_num_queues
ok 3 napi_threaded.enable_dev_threaded_disable_napi_threaded
Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0

[Jake: use min() and clamp() as suggested by Simon on Intel Wired LAN]

Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/intel-wired-lan/20260316133100.6054a11f@kernel.org/
Fixes: d2a69fefd756 ("i40e: Fix changing previously set num_queue_pairs for PFs")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 34 ++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 6d4f9218dc68..23156015ed86 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11403,10 +11403,14 @@ static void i40e_service_timer(struct timer_list *t)
 static int i40e_set_num_rings_in_vsi(struct i40e_vsi *vsi)
 {
 	struct i40e_pf *pf = vsi->back;
+	u16 qps;
 
 	switch (vsi->type) {
 	case I40E_VSI_MAIN:
 		vsi->alloc_queue_pairs = pf->num_lan_qps;
+		qps = vsi->req_queue_pairs ?
+		      min(vsi->req_queue_pairs, pf->num_lan_qps) :
+		      pf->num_lan_qps;
 		if (!vsi->num_tx_desc)
 			vsi->num_tx_desc = ALIGN(I40E_DEFAULT_NUM_DESCRIPTORS,
 						 I40E_REQ_DESCRIPTOR_MULTIPLE);
@@ -11414,7 +11418,7 @@ static int i40e_set_num_rings_in_vsi(struct i40e_vsi *vsi)
 			vsi->num_rx_desc = ALIGN(I40E_DEFAULT_NUM_DESCRIPTORS,
 						 I40E_REQ_DESCRIPTOR_MULTIPLE);
 		if (test_bit(I40E_FLAG_MSIX_ENA, pf->flags))
-			vsi->num_q_vectors = pf->num_lan_msix;
+			vsi->num_q_vectors = clamp(qps, 1, pf->num_lan_msix);
 		else
 			vsi->num_q_vectors = 1;
 
@@ -11503,6 +11507,7 @@ static int i40e_vsi_alloc_arrays(struct i40e_vsi *vsi, bool alloc_qvectors)
 
 err_vectors:
 	kfree(vsi->tx_rings);
+	vsi->tx_rings = NULL;
 	return ret;
 }
 
@@ -12043,7 +12048,8 @@ static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx)
 	cpumask_copy(&q_vector->affinity_mask, cpu_possible_mask);
 
 	if (vsi->netdev)
-		netif_napi_add(vsi->netdev, &q_vector->napi, i40e_napi_poll);
+		netif_napi_add_config(vsi->netdev, &q_vector->napi,
+				      i40e_napi_poll, v_idx);
 
 	/* tie q_vector and vsi together */
 	vsi->q_vectors[v_idx] = q_vector;
@@ -14264,12 +14270,27 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct i40e_vsi *vsi)
 
 	pf = vsi->back;
 
+	if (test_bit(I40E_FLAG_MSIX_ENA, pf->flags)) {
+		i40e_put_lump(pf->irq_pile, vsi->base_vector, vsi->idx);
+		vsi->base_vector = 0;
+	}
+
 	i40e_put_lump(pf->qp_pile, vsi->base_queue, vsi->idx);
 	i40e_vsi_clear_rings(vsi);
 
-	i40e_vsi_free_arrays(vsi, false);
+	i40e_vsi_free_q_vectors(vsi);
+	i40e_vsi_free_arrays(vsi, true);
 	i40e_set_num_rings_in_vsi(vsi);
-	ret = i40e_vsi_alloc_arrays(vsi, false);
+
+	ret = i40e_vsi_alloc_arrays(vsi, true);
+	if (ret)
+		goto err_vsi;
+
+	/* Rebuild q_vectors during VSI reinit because the effective channel
+	 * count may change num_q_vectors. Keep vector topology aligned with the
+	 * queue configuration after ethtool's .set_channels() callback.
+	 */
+	ret = i40e_vsi_setup_vectors(vsi);
 	if (ret)
 		goto err_vsi;
 
@@ -14281,7 +14302,7 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct i40e_vsi *vsi)
 		dev_info(&pf->pdev->dev,
 			 "failed to get tracking for %d queues for VSI %d err %d\n",
 			 alloc_queue_pairs, vsi->seid, ret);
-		goto err_vsi;
+		goto err_lump;
 	}
 	vsi->base_queue = ret;
 
@@ -14305,7 +14326,6 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct i40e_vsi *vsi)
 	return vsi;
 
 err_rings:
-	i40e_vsi_free_q_vectors(vsi);
 	if (vsi->netdev_registered) {
 		vsi->netdev_registered = false;
 		unregister_netdev(vsi->netdev);
@@ -14315,6 +14335,8 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct i40e_vsi *vsi)
 	if (vsi->type == I40E_VSI_MAIN)
 		i40e_devlink_destroy_port(pf);
 	i40e_aq_delete_element(&pf->hw, vsi->seid, NULL);
+err_lump:
+	i40e_vsi_free_q_vectors(vsi);
 err_vsi:
 	i40e_vsi_clear(vsi);
 	return NULL;

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 04/13] idpf: fix read_dev_clk_lock spinlock init in idpf_ptp_init()
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (2 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 03/13] i40e: keep q_vectors array in sync with channel count changes Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 05/13] idpf: do not enable XDP if queue based scheduling is not supported Jacob Keller
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Emil Tantilov, Simon Horman,
	Samuel Salin

From: Emil Tantilov <emil.s.tantilov@intel.com>

In idpf_ptp_init(), read_dev_clk_lock is initialized after
ptp_schedule_worker() had already been called (and after
idpf_ptp_settime64() could reach the lock). The PTP aux worker
fires immediately upon scheduling and can call into
idpf_ptp_read_src_clk_reg_direct(), which takes
spin_lock(&ptp->read_dev_clk_lock) on an uninitialized lock, triggering
the lockdep "non-static key" warning:

[12973.796587] idpf 0000:83:00.0: Device HW Reset initiated
[12974.094507] INFO: trying to register non-static key.
...
[12974.097208] Call Trace:
[12974.097213]  <TASK>
[12974.097218]  dump_stack_lvl+0x93/0xe0
[12974.097234]  register_lock_class+0x4c4/0x4e0
[12974.097249]  ? __lock_acquire+0x427/0x2290
[12974.097259]  __lock_acquire+0x98/0x2290
[12974.097272]  lock_acquire+0xc6/0x310
[12974.097281]  ? idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf]
[12974.097311]  ? lockdep_hardirqs_on_prepare+0xde/0x190
[12974.097318]  ? finish_task_switch.isra.0+0xd2/0x350
[12974.097330]  ? __pfx_ptp_aux_kworker+0x10/0x10 [ptp]
[12974.097343]  _raw_spin_lock+0x30/0x40
[12974.097353]  ? idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf]
[12974.097373]  idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf]
[12974.097391]  ? kthread_worker_fn+0x88/0x3d0
[12974.097404]  ? kthread_worker_fn+0x4e/0x3d0
[12974.097411]  idpf_ptp_update_cached_phctime+0x26/0x120 [idpf]
[12974.097428]  ? _raw_spin_unlock_irq+0x28/0x50
[12974.097436]  idpf_ptp_do_aux_work+0x15/0x20 [idpf]
[12974.097454]  ptp_aux_kworker+0x20/0x40 [ptp]
[12974.097464]  kthread_worker_fn+0xd5/0x3d0
[12974.097474]  ? __pfx_kthread_worker_fn+0x10/0x10
[12974.097482]  kthread+0xf4/0x130
[12974.097489]  ? __pfx_kthread+0x10/0x10
[12974.097498]  ret_from_fork+0x32c/0x410
[12974.097512]  ? __pfx_kthread+0x10/0x10
[12974.097519]  ret_from_fork_asm+0x1a/0x30
[12974.097540]  </TASK>

Move the call to spin_lock_init() up a bit to make sure read_dev_clk_lock
is not touched before it's been initialized.

Fixes: 5cb8805d2366 ("idpf: negotiate PTP capabilities and get PTP clock")
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/idpf/idpf_ptp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_ptp.c b/drivers/net/ethernet/intel/idpf/idpf_ptp.c
index eec91c4f0a75..4a51d2727547 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_ptp.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_ptp.c
@@ -952,6 +952,8 @@ int idpf_ptp_init(struct idpf_adapter *adapter)
 		goto free_ptp;
 	}
 
+	spin_lock_init(&adapter->ptp->read_dev_clk_lock);
+
 	err = idpf_ptp_create_clock(adapter);
 	if (err)
 		goto free_ptp;
@@ -977,8 +979,6 @@ int idpf_ptp_init(struct idpf_adapter *adapter)
 			goto remove_clock;
 	}
 
-	spin_lock_init(&adapter->ptp->read_dev_clk_lock);
-
 	pci_dbg(adapter->pdev, "PTP init successful\n");
 
 	return 0;

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 05/13] idpf: do not enable XDP if queue based scheduling is not supported
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (3 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 04/13] idpf: fix read_dev_clk_lock spinlock init in idpf_ptp_init() Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-06 20:59   ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 06/13] idpf: fix skb datapath queue based scheduling crashes and timeouts Jacob Keller
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Patryk Holda

From: Joshua Hay <joshua.a.hay@intel.com>

The current XDP implementation uses queue based scheduling for its TxQs.
If the FW does not advertise support for queue based scheduling, do not
enable XDP. Add the missing capability check at the start of the XDP
configuration. This will temporarily break XDP while a flow based
implementation is worked on, as well as while FWs with queue based by
default are rolled out.

Fixes: 705457e7211f ("idpf: implement XDP_SETUP_PROG in ndo_bpf for splitq")
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Tested-by: Patryk Holda <patryk.holda@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/idpf/xdp.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/idpf/xdp.c b/drivers/net/ethernet/intel/idpf/xdp.c
index cbccd4546768..dcd867517a5f 100644
--- a/drivers/net/ethernet/intel/idpf/xdp.c
+++ b/drivers/net/ethernet/intel/idpf/xdp.c
@@ -510,6 +510,13 @@ int idpf_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 	if (!idpf_is_queue_model_split(vport->dflt_qv_rsrc.txq_model))
 		goto notsupp;
 
+	if (!idpf_is_cap_ena(vport->adapter, IDPF_OTHER_CAPS,
+			     VIRTCHNL2_CAP_SPLITQ_QSCHED)) {
+		NL_SET_ERR_MSG_MOD(xdp->extack,
+				   "Device does not support requested XDP Tx scheduling mode");
+		goto notsupp;
+	}
+
 	switch (xdp->command) {
 	case XDP_SETUP_PROG:
 		ret = idpf_xdp_setup_prog(vport, xdp);

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 06/13] idpf: fix skb datapath queue based scheduling crashes and timeouts
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (4 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 05/13] idpf: do not enable XDP if queue based scheduling is not supported Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 07/13] idpf: fix xdp crash in soft reset error path Jacob Keller
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Samuel Salin

From: Joshua Hay <joshua.a.hay@intel.com>

The splitq Tx resource checks were assuming that the queues were using
flow based scheduling and checking the refillqs for free buffers.
However, the Tx refillqs are not allocated when using queue based
scheduling resulting in a NULL ptr dereference. Adjust the Tx resource
checks to only check available descriptor resources when using queue
based scheduling. Because queue based scheduling does not have any
notion of descriptor only completions, there cannot be any packets in
flight, meaning there is no need to check for pending completions.

The driver also only supported 8 byte completion descriptors in the skb
datapath previously. However, currently the FW only supports 4 byte
completion descriptors when using queue based scheduling. This meant we
were skipping over completions, resulting in Tx timeouts.  Add support
to process both 4 and 8 byte completion descriptors, depending on the
scheduling mode. Cache the next_to_clean completion descriptor in the
completion queue struct, and fetch this descriptor before the start of
each cleaning loop. Access the next descriptor in the loop by
calculating the index based on raw byte count.

Fixes: 0c3f135e840d ("idpf: stop Tx if there are insufficient buffer resources")
Fixes: 1c325aac10a8 ("idpf: configure resources for TX queues")
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/idpf/idpf_txrx.h |  6 +++-
 drivers/net/ethernet/intel/idpf/idpf_txrx.c | 49 ++++++++++++++++++-----------
 2 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index 4be5b3b6d3ed..b6836e38f449 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -807,11 +807,13 @@ libeth_cacheline_set_assert(struct idpf_buf_queue, 64, 24, 32);
  * @txq_grp: See struct idpf_txq_group
  * @flags: See enum idpf_queue_flags_t
  * @desc_count: Number of descriptors
+ * @desc_sz: Descriptor size in bytes
  * @clean_budget: queue cleaning budget
  * @netdev: &net_device corresponding to this queue
  * @next_to_use: Next descriptor to use. Relevant in both split & single txq
  *		 and bufq.
  * @next_to_clean: Next descriptor to clean
+ * @ntc_desc: Pointer to next_to_clean descriptor for next NAPI poll
  * @num_completions: Only relevant for TX completion queue. It tracks the
  *		     number of completions received to compare against the
  *		     number of completions pending, as accumulated by the
@@ -833,6 +835,7 @@ struct idpf_compl_queue {
 
 	DECLARE_BITMAP(flags, __IDPF_Q_FLAGS_NBITS);
 	u32 desc_count;
+	u32 desc_sz;
 
 	u32 clean_budget;
 	struct net_device *netdev;
@@ -841,6 +844,7 @@ struct idpf_compl_queue {
 	__cacheline_group_begin_aligned(read_write);
 	u32 next_to_use;
 	u32 next_to_clean;
+	struct idpf_splitq_tx_compl_desc *ntc_desc;
 
 	aligned_u64 num_completions;
 	__cacheline_group_end_aligned(read_write);
@@ -853,7 +857,7 @@ struct idpf_compl_queue {
 	struct idpf_q_vector *q_vector;
 	__cacheline_group_end_aligned(cold);
 };
-libeth_cacheline_set_assert(struct idpf_compl_queue, 40, 16, 24);
+libeth_cacheline_set_assert(struct idpf_compl_queue, 48, 24, 24);
 
 /**
  * struct idpf_sw_queue
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index f6b3b15364ff..4fc0bb14c5b1 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -270,11 +270,9 @@ static int idpf_tx_desc_alloc(const struct idpf_vport *vport,
 static int idpf_compl_desc_alloc(const struct idpf_vport *vport,
 				 struct idpf_compl_queue *complq)
 {
-	u32 desc_size;
-
-	desc_size = idpf_queue_has(FLOW_SCH_EN, complq) ?
-		    sizeof(*complq->comp) : sizeof(*complq->comp_4b);
-	complq->size = array_size(complq->desc_count, desc_size);
+	complq->desc_sz = idpf_queue_has(FLOW_SCH_EN, complq) ?
+			  sizeof(*complq->comp) : sizeof(*complq->comp_4b);
+	complq->size = array_size(complq->desc_count, complq->desc_sz);
 
 	complq->desc_ring = dma_alloc_coherent(complq->netdev->dev.parent,
 					       complq->size, &complq->dma,
@@ -284,6 +282,7 @@ static int idpf_compl_desc_alloc(const struct idpf_vport *vport,
 
 	complq->next_to_use = 0;
 	complq->next_to_clean = 0;
+	complq->ntc_desc = complq->comp;
 	idpf_queue_set(GEN_CHK, complq);
 
 	idpf_xsk_setup_queue(vport, complq,
@@ -2193,7 +2192,7 @@ static void idpf_tx_handle_rs_completion(struct idpf_tx_queue *txq,
 static bool idpf_tx_clean_complq(struct idpf_compl_queue *complq, int budget,
 				 int *cleaned)
 {
-	struct idpf_splitq_tx_compl_desc *tx_desc;
+	struct idpf_splitq_tx_compl_desc *tx_desc = complq->ntc_desc;
 	s16 ntc = complq->next_to_clean;
 	struct idpf_netdev_priv *np;
 	unsigned int complq_budget;
@@ -2201,7 +2200,6 @@ static bool idpf_tx_clean_complq(struct idpf_compl_queue *complq, int budget,
 	int i;
 
 	complq_budget = complq->clean_budget;
-	tx_desc = &complq->comp[ntc];
 	ntc -= complq->desc_count;
 
 	do {
@@ -2257,11 +2255,12 @@ static bool idpf_tx_clean_complq(struct idpf_compl_queue *complq, int budget,
 		u64_stats_update_end(&tx_q->stats_sync);
 
 fetch_next_desc:
-		tx_desc++;
+		tx_desc = (struct idpf_splitq_tx_compl_desc *)
+				((u8 *)tx_desc + complq->desc_sz);
 		ntc++;
 		if (unlikely(!ntc)) {
 			ntc -= complq->desc_count;
-			tx_desc = &complq->comp[0];
+			tx_desc = complq->comp;
 			idpf_queue_change(GEN_CHK, complq);
 		}
 
@@ -2271,6 +2270,8 @@ static bool idpf_tx_clean_complq(struct idpf_compl_queue *complq, int budget,
 		complq_budget--;
 	} while (likely(complq_budget));
 
+	complq->ntc_desc = tx_desc;
+
 	/* Store the state of the complq to be used later in deciding if a
 	 * TXQ can be started again
 	 */
@@ -2437,21 +2438,32 @@ static int idpf_txq_has_room(struct idpf_tx_queue *tx_q, u32 descs_needed,
  * @tx_q: the queue to be checked
  * @descs_needed: number of descriptors required for this packet
  * @bufs_needed: number of buffers needed for this packet
+ * @flow: true if queue uses flow based scheduling, false if queue based scheduling
  *
  * Return: 0 if stop is not needed
  */
 static int idpf_tx_maybe_stop_splitq(struct idpf_tx_queue *tx_q,
-				     u32 descs_needed,
-				     u32 bufs_needed)
+				     u32 descs_needed, u32 bufs_needed,
+				     bool flow)
 {
-	/* Since we have multiple resources to check for splitq, our
+	/* Since we have multiple resources to check for flow based splitq, our
 	 * start,stop_thrs becomes a boolean check instead of a count
 	 * threshold.
 	 */
-	if (netif_subqueue_maybe_stop(tx_q->netdev, tx_q->idx,
-				      idpf_txq_has_room(tx_q, descs_needed,
-							bufs_needed),
-				      1, 1))
+	if (flow && netif_subqueue_maybe_stop(tx_q->netdev, tx_q->idx,
+					      idpf_txq_has_room(tx_q,
+								descs_needed,
+								bufs_needed),
+					      1, 1))
+		return 0;
+
+	/* For queue based splitq, there is no need to check the number of
+	 * pending completions since we cannot reuse descriptors until we get
+	 * completions, so we only need to check for descriptor resources.
+	 */
+	if (!flow && netif_subqueue_maybe_stop(tx_q->netdev, tx_q->idx,
+					       IDPF_DESC_UNUSED(tx_q),
+					       descs_needed, descs_needed))
 		return 0;
 
 	u64_stats_update_begin(&tx_q->stats_sync);
@@ -3021,6 +3033,7 @@ static bool idpf_tx_splitq_need_re(struct idpf_tx_queue *tx_q)
 static netdev_tx_t idpf_tx_splitq_frame(struct sk_buff *skb,
 					struct idpf_tx_queue *tx_q)
 {
+	bool flow = idpf_queue_has(FLOW_SCH_EN, tx_q);
 	struct idpf_tx_splitq_params tx_params = {
 		.prev_ntu = tx_q->next_to_use,
 	};
@@ -3040,7 +3053,7 @@ static netdev_tx_t idpf_tx_splitq_frame(struct sk_buff *skb,
 
 	/* Check for splitq specific TX resources */
 	count += (IDPF_TX_DESCS_PER_CACHE_LINE + tso);
-	if (idpf_tx_maybe_stop_splitq(tx_q, count, buf_count)) {
+	if (idpf_tx_maybe_stop_splitq(tx_q, count, buf_count, flow)) {
 		idpf_tx_buf_hw_update(tx_q, tx_q->next_to_use, false);
 
 		return NETDEV_TX_BUSY;
@@ -3072,7 +3085,7 @@ static netdev_tx_t idpf_tx_splitq_frame(struct sk_buff *skb,
 		idpf_tx_set_tstamp_desc(ctx_desc, idx);
 	}
 
-	if (idpf_queue_has(FLOW_SCH_EN, tx_q)) {
+	if (flow) {
 		struct idpf_sw_queue *refillq = tx_q->refillq;
 
 		/* Save refillq state in case of a packet rollback.  Otherwise,

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 07/13] idpf: fix xdp crash in soft reset error path
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (5 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 06/13] idpf: fix skb datapath queue based scheduling crashes and timeouts Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 08/13] idpf: fix double free and use-after-free in aux device error paths Jacob Keller
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Emil Tantilov

From: Emil Tantilov <emil.s.tantilov@intel.com>

NULL pointer dereference is reported in cases where idpf_vport_open()
fails during soft reset:

./xdpsock -i <inf> -q -r -N

[ 3179.186687] idpf 0000:83:00.0: Failed to initialize queue ids for vport 0: -12
[ 3179.276739] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 3179.277636] #PF: supervisor read access in kernel mode
[ 3179.278470] #PF: error_code(0x0000) - not-present page
[ 3179.279285] PGD 0
[ 3179.280083] Oops: Oops: 0000 [#1] SMP NOPTI
...
[ 3179.283997] Workqueue: events xp_release_deferred
[ 3179.284770] RIP: 0010:idpf_find_rxq_vec+0x17/0x30 [idpf]
...
[ 3179.291937] Call Trace:
[ 3179.292392]  <TASK>
[ 3179.292843]  idpf_qp_switch+0x25/0x820 [idpf]
[ 3179.293325]  idpf_xsk_pool_setup+0x7c/0x520 [idpf]
[ 3179.293803]  idpf_xdp+0x59/0x240 [idpf]
[ 3179.294275]  xp_disable_drv_zc+0x62/0xb0
[ 3179.294743]  xp_clear_dev+0x40/0xb0
[ 3179.295198]  xp_release_deferred+0x1f/0xa0
[ 3179.295648]  process_one_work+0x226/0x730
[ 3179.296106]  worker_thread+0x19e/0x340
[ 3179.296557]  ? __pfx_worker_thread+0x10/0x10
[ 3179.297009]  kthread+0xf4/0x130
[ 3179.297459]  ? __pfx_kthread+0x10/0x10
[ 3179.297910]  ret_from_fork+0x32c/0x410
[ 3179.298361]  ? __pfx_kthread+0x10/0x10
[ 3179.298702]  ret_from_fork_asm+0x1a/0x30

Fix the error handling of the soft reset in idpf_xdp_setup_prog() by
restoring the vport->xdp_prog to the old value. This avoids referencing
the orphaned prog that was copied to vport->xdp_prog in the soft reset
and prevents subsequent false positive by idpf_xdp_enabled(). Roll back
the number of queues as well. Also only call put on the program if the
soft reset was successful. Returning an error will trigger the core XDP
stack to handle the put otherwise.

Update the restart check in idpf_xsk_pool_setup() to use IDPF_VPORT_UP bit
instead of netif_running(). The idpf_vport_stop/start() calls will not
update the __LINK_STATE_START bit, making this test a false positive
should the soft reset fail.

Fixes: 3d57b2c00f09 ("idpf: add XSk pool initialization")
Cc: stable@vger.kernel.org
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/idpf/idpf_txrx.h     |  6 +++---
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.h |  4 ++--
 drivers/net/ethernet/intel/idpf/idpf_lib.c      |  4 +---
 drivers/net/ethernet/intel/idpf/idpf_txrx.c     | 12 ++++--------
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 19 ++++---------------
 drivers/net/ethernet/intel/idpf/xdp.c           |  8 +++++---
 drivers/net/ethernet/intel/idpf/xsk.c           |  4 +++-
 7 files changed, 22 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index b6836e38f449..22c647d6dd5c 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -1084,9 +1084,9 @@ void idpf_vport_init_num_qs(struct idpf_vport *vport,
 			    struct idpf_q_vec_rsrc *rsrc);
 void idpf_vport_calc_num_q_desc(struct idpf_vport *vport,
 				struct idpf_q_vec_rsrc *rsrc);
-int idpf_vport_calc_total_qs(struct idpf_adapter *adapter, u16 vport_index,
-			     struct virtchnl2_create_vport *vport_msg,
-			     struct idpf_vport_max_q *max_q);
+void idpf_vport_calc_total_qs(struct idpf_adapter *adapter, u16 vport_index,
+			      struct virtchnl2_create_vport *vport_msg,
+			      struct idpf_vport_max_q *max_q);
 void idpf_vport_calc_num_q_groups(struct idpf_q_vec_rsrc *rsrc);
 int idpf_vport_queues_alloc(struct idpf_vport *vport,
 			    struct idpf_q_vec_rsrc *rsrc);
diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.h b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.h
index 6876e3ed9d1b..76d238fc660c 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.h
@@ -169,8 +169,8 @@ int idpf_send_destroy_vport_msg(struct idpf_adapter *adapter, u32 vport_id);
 int idpf_send_enable_vport_msg(struct idpf_adapter *adapter, u32 vport_id);
 int idpf_send_disable_vport_msg(struct idpf_adapter *adapter, u32 vport_id);
 
-int idpf_vport_adjust_qs(struct idpf_vport *vport,
-			 struct idpf_q_vec_rsrc *rsrc);
+void idpf_vport_adjust_qs(struct idpf_vport *vport,
+			  struct idpf_q_vec_rsrc *rsrc);
 int idpf_vport_alloc_max_qs(struct idpf_adapter *adapter,
 			    struct idpf_vport_max_q *max_q);
 void idpf_vport_dealloc_max_qs(struct idpf_adapter *adapter,
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index cf966fe6c759..56198b417c97 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -2042,9 +2042,7 @@ int idpf_initiate_soft_reset(struct idpf_vport *vport,
 	/* Adjust resource parameters prior to reallocating resources */
 	switch (reset_cause) {
 	case IDPF_SR_Q_CHANGE:
-		err = idpf_vport_adjust_qs(new_vport, new_rsrc);
-		if (err)
-			goto free_vport;
+		idpf_vport_adjust_qs(new_vport, new_rsrc);
 		break;
 	case IDPF_SR_Q_DESC_CHANGE:
 		/* Update queue parameters before allocating resources */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index 4fc0bb14c5b1..4e0d31023123 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -1568,12 +1568,10 @@ void idpf_vport_calc_num_q_desc(struct idpf_vport *vport,
  * @vport_idx: vport idx to retrieve vport pointer
  * @vport_msg: message to fill with data
  * @max_q: vport max queue info
- *
- * Return: 0 on success, error value on failure.
  */
-int idpf_vport_calc_total_qs(struct idpf_adapter *adapter, u16 vport_idx,
-			     struct virtchnl2_create_vport *vport_msg,
-			     struct idpf_vport_max_q *max_q)
+void idpf_vport_calc_total_qs(struct idpf_adapter *adapter, u16 vport_idx,
+			      struct virtchnl2_create_vport *vport_msg,
+			      struct idpf_vport_max_q *max_q)
 {
 	int dflt_splitq_txq_grps = 0, dflt_singleq_txqs = 0;
 	int dflt_splitq_rxq_grps = 0, dflt_singleq_rxqs = 0;
@@ -1624,7 +1622,7 @@ int idpf_vport_calc_total_qs(struct idpf_adapter *adapter, u16 vport_idx,
 	}
 
 	if (!vport_config)
-		return 0;
+		return;
 
 	user = &vport_config->user_config;
 	user->num_req_rx_qs = le16_to_cpu(vport_msg->num_rx_q);
@@ -1640,8 +1638,6 @@ int idpf_vport_calc_total_qs(struct idpf_adapter *adapter, u16 vport_idx,
 	vport_msg->num_tx_q = cpu_to_le16(user->num_req_tx_qs + num_xdpsq);
 	if (idpf_is_queue_model_split(le16_to_cpu(vport_msg->txq_model)))
 		vport_msg->num_tx_complq = vport_msg->num_tx_q;
-
-	return 0;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index be66f9b2e101..91af4f298475 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -1578,12 +1578,7 @@ int idpf_send_create_vport_msg(struct idpf_adapter *adapter,
 	else
 		vport_msg->rxq_model = cpu_to_le16(VIRTCHNL2_QUEUE_MODEL_SINGLE);
 
-	err = idpf_vport_calc_total_qs(adapter, idx, vport_msg, max_q);
-	if (err) {
-		dev_err(&adapter->pdev->dev, "Enough queues are not available");
-
-		return err;
-	}
+	idpf_vport_calc_total_qs(adapter, idx, vport_msg, max_q);
 
 	if (!adapter->vport_params_recvd[idx]) {
 		adapter->vport_params_recvd[idx] = kzalloc(IDPF_CTLQ_MAX_BUF_LEN,
@@ -4065,24 +4060,18 @@ int idpf_vport_queue_ids_init(struct idpf_vport *vport,
  * @vport: virtual port data struct
  * @rsrc: pointer to queue and vector resources
  *
- * Renegotiate queues.  Returns 0 on success, negative on failure.
+ * Renegotiate queues.
  */
-int idpf_vport_adjust_qs(struct idpf_vport *vport, struct idpf_q_vec_rsrc *rsrc)
+void idpf_vport_adjust_qs(struct idpf_vport *vport, struct idpf_q_vec_rsrc *rsrc)
 {
 	struct virtchnl2_create_vport vport_msg;
-	int err;
 
 	vport_msg.txq_model = cpu_to_le16(rsrc->txq_model);
 	vport_msg.rxq_model = cpu_to_le16(rsrc->rxq_model);
-	err = idpf_vport_calc_total_qs(vport->adapter, vport->idx, &vport_msg,
-				       NULL);
-	if (err)
-		return err;
+	idpf_vport_calc_total_qs(vport->adapter, vport->idx, &vport_msg, NULL);
 
 	idpf_vport_init_num_qs(vport, &vport_msg, rsrc);
 	idpf_vport_calc_num_q_groups(rsrc);
-
-	return 0;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/idpf/xdp.c b/drivers/net/ethernet/intel/idpf/xdp.c
index dcd867517a5f..f6e6b72169fd 100644
--- a/drivers/net/ethernet/intel/idpf/xdp.c
+++ b/drivers/net/ethernet/intel/idpf/xdp.c
@@ -488,11 +488,13 @@ static int idpf_xdp_setup_prog(struct idpf_vport *vport,
 				   "Could not reopen the vport after XDP setup");
 
 		cfg->user_config.xdp_prog = old;
-		old = prog;
-	}
+		vport->xdp_prog = old;
 
-	if (old)
+		/* Restore previous queue config */
+		idpf_vport_adjust_qs(vport, &vport->dflt_qv_rsrc);
+	} else if (old) {
 		bpf_prog_put(old);
+	}
 
 	libeth_xdp_set_redirect(vport->netdev, vport->xdp_prog);
 
diff --git a/drivers/net/ethernet/intel/idpf/xsk.c b/drivers/net/ethernet/intel/idpf/xsk.c
index d95d3efdfd36..3d8c430efd2b 100644
--- a/drivers/net/ethernet/intel/idpf/xsk.c
+++ b/drivers/net/ethernet/intel/idpf/xsk.c
@@ -553,6 +553,7 @@ int idpf_xskrq_poll(struct idpf_rx_queue *rxq, u32 budget)
 
 int idpf_xsk_pool_setup(struct idpf_vport *vport, struct netdev_bpf *bpf)
 {
+	const struct idpf_netdev_priv *np = netdev_priv(vport->netdev);
 	struct xsk_buff_pool *pool = bpf->xsk.pool;
 	u32 qid = bpf->xsk.queue_id;
 	bool restart;
@@ -568,7 +569,8 @@ int idpf_xsk_pool_setup(struct idpf_vport *vport, struct netdev_bpf *bpf)
 		return -EINVAL;
 	}
 
-	restart = idpf_xdp_enabled(vport) && netif_running(vport->netdev);
+	restart = idpf_xdp_enabled(vport) &&
+		  test_bit(IDPF_VPORT_UP, np->state);
 	if (!restart)
 		goto pool;
 

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 08/13] idpf: fix double free and use-after-free in aux device error paths
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (6 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 07/13] idpf: fix xdp crash in soft reset error path Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-06 21:04   ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 09/13] ice: fix setting RSS VSI hash for E830 Jacob Keller
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Greg Kroah-Hartman, Tony Nguyen,
	stable, Paul Menzel

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

When auxiliary_device_add() fails in idpf_plug_vport_aux_dev() or
idpf_plug_core_aux_dev(), the err_aux_dev_add label calls
auxiliary_device_uninit() and falls through to err_aux_dev_init.  The
uninit call will trigger put_device(), which invokes the release
callback (idpf_vport_adev_release / idpf_core_adev_release) that frees
iadev.  The fall-through then reads adev->id from the freed iadev for
ida_free() and double-frees iadev with kfree().

Free the IDA slot and clear the back-pointer before uninit, while adev
is still valid, then return immediately.

Commit 65637c3a1811 ("idpf: fix UAF in RDMA core aux dev deinitialization")
fixed the same use-after-free in the matching unplug path in this file but
missed both probe error paths.

Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: stable@kernel.org
Fixes: be91128c579c ("idpf: implement RDMA vport auxiliary dev create, init, and destroy")
Fixes: f4312e6bfa2a ("idpf: implement core RDMA auxiliary dev create, init, and destroy")
Assisted-by: gregkh_clanker_t1000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/idpf/idpf_idc.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c
index 7e4f4ac92653..b7d6b08fc89e 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_idc.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c
@@ -90,7 +90,10 @@ static int idpf_plug_vport_aux_dev(struct iidc_rdma_core_dev_info *cdev_info,
 	return 0;
 
 err_aux_dev_add:
+	ida_free(&idpf_idc_ida, adev->id);
+	vdev_info->adev = NULL;
 	auxiliary_device_uninit(adev);
+	return ret;
 err_aux_dev_init:
 	ida_free(&idpf_idc_ida, adev->id);
 err_ida_alloc:
@@ -228,7 +231,10 @@ static int idpf_plug_core_aux_dev(struct iidc_rdma_core_dev_info *cdev_info)
 	return 0;
 
 err_aux_dev_add:
+	ida_free(&idpf_idc_ida, adev->id);
+	cdev_info->adev = NULL;
 	auxiliary_device_uninit(adev);
+	return ret;
 err_aux_dev_init:
 	ida_free(&idpf_idc_ida, adev->id);
 err_ida_alloc:

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 09/13] ice: fix setting RSS VSI hash for E830
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (7 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 08/13] idpf: fix double free and use-after-free in aux device error paths Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-06 21:06   ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 10/13] ice: fix locking in ice_dcb_rebuild() Jacob Keller
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Marcin Szycik

From: Marcin Szycik <marcin.szycik@linux.intel.com>

ice_set_rss_hfunc() performs a VSI update, in which it sets hashing
function, leaving other VSI options unchanged. However, ::q_opt_flags is
mistakenly set to the value of another field, instead of its original
value, probably due to a typo. What happens next is hardware-dependent:

On E810, only the first bit is meaningful (see
ICE_AQ_VSI_Q_OPT_PE_FLTR_EN) and can potentially end up in a different
state than before VSI update.

On E830, some of the remaining bits are not reserved. Setting them
to some unrelated values can cause the firmware to reject the update
because of invalid settings, or worse - succeed.

Reproducer:
  sudo ethtool -X $PF1 equal 8

Output in dmesg:
  Failed to configure RSS hash for VSI 6, error -5

Fixes: 352e9bf23813 ("ice: enable symmetric-xor RSS for Toeplitz hash function")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 1d1947a7fe11..c52c465280f7 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -8046,7 +8046,7 @@ int ice_set_rss_hfunc(struct ice_vsi *vsi, u8 hfunc)
 	ctx->info.q_opt_rss |=
 		FIELD_PREP(ICE_AQ_VSI_Q_OPT_RSS_HASH_M, hfunc);
 	ctx->info.q_opt_tc = vsi->info.q_opt_tc;
-	ctx->info.q_opt_flags = vsi->info.q_opt_rss;
+	ctx->info.q_opt_flags = vsi->info.q_opt_flags;
 
 	err = ice_update_vsi(hw, vsi->idx, ctx, NULL);
 	if (err) {

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 10/13] ice: fix locking in ice_dcb_rebuild()
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (8 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 09/13] ice: fix setting RSS VSI hash for E830 Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-06 21:13   ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 11/13] ice: fix PTP hang for E825C devices Jacob Keller
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Bart Van Assche, intel-wired-lan,
	Arpana Arland

From: Bart Van Assche <bvanassche@acm.org>

Move the mutex_lock() call up to prevent that DCB settings change after
the first ice_query_port_ets() call. The second ice_query_port_ets()
call in ice_dcb_rebuild() is already protected by pf->tc_mutex.

This also fixes a bug in an error path, as before taking the first
"goto dcb_error" in the function jumped over mutex_lock() to
mutex_unlock().

This bug has been detected by the clang thread-safety analyzer.

Cc: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Fixes: 242b5e068b25 ("ice: Fix DCB rebuild after reset")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
index 16aa25535152..0bc6dd375687 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
@@ -537,14 +537,14 @@ void ice_dcb_rebuild(struct ice_pf *pf)
 	struct ice_dcbx_cfg *err_cfg;
 	int ret;
 
+	mutex_lock(&pf->tc_mutex);
+
 	ret = ice_query_port_ets(pf->hw.port_info, &buf, sizeof(buf), NULL);
 	if (ret) {
 		dev_err(dev, "Query Port ETS failed\n");
 		goto dcb_error;
 	}
 
-	mutex_lock(&pf->tc_mutex);
-
 	if (!pf->hw.port_info->qos_cfg.is_sw_lldp)
 		ice_cfg_etsrec_defaults(pf->hw.port_info);
 

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 11/13] ice: fix PTP hang for E825C devices
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (9 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 10/13] ice: fix locking in ice_dcb_rebuild() Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-06 21:16   ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 12/13] ice: dpll: fix rclk pin state get for E810 Jacob Keller
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller, Simon Horman, Rinitha S

From: Grzegorz Nitka <grzegorz.nitka@intel.com>

Change the order of PTP reconfiguration when port goes down or up
(ice_down and ice_up calls) to be more graceful and consistent from
timestamp interrupts processing perspective.

For both calls (ice_up and ice_down), accompanying ice_ptp_link_change
is called which starts/stops PTP timer. This patch changes the order:
- while link goes down: disable net device Tx first (netif_carrier_off,
  netif_tx_disable), then call ice_ptp_link_change
- while link goes up: ice_ptp_link_change called first, then re-enable
  net device Tx (netif_tx_start_all_queues)

Otherwise, there is a narrow window in which PTP timestamp request has
been triggered and timestamp processing occurs when PTP timer is not
enabled yet (up case) or already disabled (down case). This may lead to
undefined behavior and receiving invalid timestamps. This case was
observed on E825C devices only.

Fixes: 6b1ff5d39228 ("ice: always call ice_ptp_link_change and make it void")
Cc: stable@vger.kernel.org
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c52c465280f7..8cc9d0521988 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -6732,10 +6732,10 @@ static int ice_up_complete(struct ice_vsi *vsi)
 	    (vsi->port_info->phy.link_info.link_info & ICE_AQ_LINK_UP) &&
 	    ((vsi->netdev && (vsi->type == ICE_VSI_PF ||
 			      vsi->type == ICE_VSI_SF)))) {
+		ice_ptp_link_change(pf, true);
 		ice_print_link_msg(vsi, true);
 		netif_tx_start_all_queues(vsi->netdev);
 		netif_carrier_on(vsi->netdev);
-		ice_ptp_link_change(pf, true);
 	}
 
 	/* Perform an initial read of the statistics registers now to
@@ -7263,9 +7263,9 @@ int ice_down(struct ice_vsi *vsi)
 
 	if (vsi->netdev) {
 		vlan_err = ice_vsi_del_vlan_zero(vsi);
-		ice_ptp_link_change(vsi->back, false);
 		netif_carrier_off(vsi->netdev);
 		netif_tx_disable(vsi->netdev);
+		ice_ptp_link_change(vsi->back, false);
 	}
 
 	ice_vsi_dis_irq(vsi);

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 12/13] ice: dpll: fix rclk pin state get for E810
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (10 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 11/13] ice: fix PTP hang for E825C devices Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-05  5:14 ` [PATCH net 13/13] ice: dpll: fix misplaced header macros Jacob Keller
  2026-05-06 21:21 ` [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
  13 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller

From: Ivan Vecera <ivecera@redhat.com>

The refactoring of ice_dpll_rclk_state_on_pin_get() to use
ice_dpll_pin_get_parent_idx() omitted the base_rclk_idx adjustment that was
correctly added in the ice_dpll_rclk_state_on_pin_set() path. This breaks
E810 devices where base_rclk_idx is non-zero, causing the wrong hardware
index to be used for pin state lookup and incorrect recovered clock state
to be reported via the DPLL subsystem. E825C is unaffected as its
base_rclk_idx is 0.

While at it, add bounds check against ICE_DPLL_RCLK_NUM_MAX on hw_idx after
the base_rclk_idx subtraction in both ice_dpll_rclk_state_on_pin_{get,set}()
to prevent out-of-bounds access on the pin state array.

Fixes: ad1df4f2d591 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_dpll.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.c b/drivers/net/ethernet/intel/ice/ice_dpll.c
index 27b460926bac..892bc7c2e28b 100644
--- a/drivers/net/ethernet/intel/ice/ice_dpll.c
+++ b/drivers/net/ethernet/intel/ice/ice_dpll.c
@@ -2523,6 +2523,8 @@ ice_dpll_rclk_state_on_pin_set(const struct dpll_pin *pin, void *pin_priv,
 	if (hw_idx < 0)
 		goto unlock;
 	hw_idx -= pf->dplls.base_rclk_idx;
+	if (hw_idx >= ICE_DPLL_RCLK_NUM_MAX)
+		goto unlock;
 
 	if ((enable && p->state[hw_idx] == DPLL_PIN_STATE_CONNECTED) ||
 	    (!enable && p->state[hw_idx] == DPLL_PIN_STATE_DISCONNECTED)) {
@@ -2586,6 +2588,9 @@ ice_dpll_rclk_state_on_pin_get(const struct dpll_pin *pin, void *pin_priv,
 	hw_idx = ice_dpll_pin_get_parent_idx(p, parent_pin);
 	if (hw_idx < 0)
 		goto unlock;
+	hw_idx -= pf->dplls.base_rclk_idx;
+	if (hw_idx >= ICE_DPLL_RCLK_NUM_MAX)
+		goto unlock;
 
 	ret = ice_dpll_pin_state_update(pf, p, ICE_DPLL_PIN_TYPE_RCLK_INPUT,
 					extack);

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net 13/13] ice: dpll: fix misplaced header macros
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (11 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 12/13] ice: dpll: fix rclk pin state get for E810 Jacob Keller
@ 2026-05-05  5:14 ` Jacob Keller
  2026-05-06 21:21 ` [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
  13 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-05  5:14 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Jacob Keller

From: Ivan Vecera <ivecera@redhat.com>

The CGU register definitions (ICE_CGU_R10, ICE_CGU_R11 and related field
masks) were placed after the #endif of the _ICE_DPLL_H_ include guard,
leaving them unprotected. Move them inside the guard.

Fixes: ad1df4f2d591 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_dpll.h | 32 +++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.h b/drivers/net/ethernet/intel/ice/ice_dpll.h
index ae42cdea0ee1..8678575359b9 100644
--- a/drivers/net/ethernet/intel/ice/ice_dpll.h
+++ b/drivers/net/ethernet/intel/ice/ice_dpll.h
@@ -8,6 +8,22 @@
 
 #define ICE_DPLL_RCLK_NUM_MAX	4
 
+#define ICE_CGU_R10			0x28
+#define ICE_CGU_R10_SYNCE_CLKO_SEL	GENMASK(8, 5)
+#define ICE_CGU_R10_SYNCE_CLKODIV_M1	GENMASK(13, 9)
+#define ICE_CGU_R10_SYNCE_CLKODIV_LOAD	BIT(14)
+#define ICE_CGU_R10_SYNCE_DCK_RST	BIT(15)
+#define ICE_CGU_R10_SYNCE_ETHCLKO_SEL	GENMASK(18, 16)
+#define ICE_CGU_R10_SYNCE_ETHDIV_M1	GENMASK(23, 19)
+#define ICE_CGU_R10_SYNCE_ETHDIV_LOAD	BIT(24)
+#define ICE_CGU_R10_SYNCE_DCK2_RST	BIT(25)
+#define ICE_CGU_R10_SYNCE_S_REF_CLK	GENMASK(31, 27)
+
+#define ICE_CGU_R11			0x2C
+#define ICE_CGU_R11_SYNCE_S_BYP_CLK	GENMASK(6, 1)
+
+#define ICE_CGU_BYPASS_MUX_OFFSET_E825C	3
+
 /**
  * enum ice_dpll_pin_sw - enumerate ice software pin indices:
  * @ICE_DPLL_PIN_SW_1_IDX: index of first SW pin
@@ -157,19 +173,3 @@ static inline void ice_dpll_deinit(struct ice_pf *pf) { }
 #endif
 
 #endif
-
-#define ICE_CGU_R10				0x28
-#define ICE_CGU_R10_SYNCE_CLKO_SEL		GENMASK(8, 5)
-#define ICE_CGU_R10_SYNCE_CLKODIV_M1		GENMASK(13, 9)
-#define ICE_CGU_R10_SYNCE_CLKODIV_LOAD		BIT(14)
-#define ICE_CGU_R10_SYNCE_DCK_RST		BIT(15)
-#define ICE_CGU_R10_SYNCE_ETHCLKO_SEL		GENMASK(18, 16)
-#define ICE_CGU_R10_SYNCE_ETHDIV_M1		GENMASK(23, 19)
-#define ICE_CGU_R10_SYNCE_ETHDIV_LOAD		BIT(24)
-#define ICE_CGU_R10_SYNCE_DCK2_RST		BIT(25)
-#define ICE_CGU_R10_SYNCE_S_REF_CLK		GENMASK(31, 27)
-
-#define ICE_CGU_R11				0x2C
-#define ICE_CGU_R11_SYNCE_S_BYP_CLK		GENMASK(6, 1)
-
-#define ICE_CGU_BYPASS_MUX_OFFSET_E825C		3

-- 
2.54.0.rc2.531.gaf818d63126a


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH net 01/13] i40e: Cleanup PTP registration on probe failure
  2026-05-05  5:14 ` [PATCH net 01/13] i40e: Cleanup PTP registration on probe failure Jacob Keller
@ 2026-05-06 20:24   ` Jacob Keller
  0 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-06 20:24 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Matt Vollrath, Sunitha Mekala

On 5/4/2026 10:14 PM, Jacob Keller wrote:
> From: Matt Vollrath <tactii@gmail.com>
> 
> Fix two conditions which would leak PTP registration on probe failure:
> 
> 1. i40e_setup_pf_switch can encounter an error in
>    i40e_setup_pf_filter_control, call i40e_ptp_init, then return
>    non-zero, sending i40e_probe to err_vsis.
> 
> 2. i40e_setup_misc_vector can return non-zero, sending i40e_probe to
>    err_vsis.
> 
> Both of these conditions have been present since PTP was introduced in
> this driver.
> 
> Found with coccinelle.
> 
> Fixes: beb0dff1251db ("i40e: enable PTP")
> Signed-off-by: Matt Vollrath <tactii@gmail.com>
> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 028bd500603a..f06fcef644e5 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -16108,6 +16108,7 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	/* Unwind what we've done if something failed in the setup */
>  err_vsis:
>  	set_bit(__I40E_DOWN, pf->state);
> +	i40e_ptp_stop(pf);
>  	i40e_clear_interrupt_scheme(pf);
>  	kfree(pf->vsi);
>  err_switch_setup:
> 

Sashiko points out two possible issues, both of which I think are better
served by follow-up fixes:

> This isn't a bug introduced by this patch, but is it safe to use
> i40e_ptp_stop() in this error path without canceling its workqueue task?
> The PTP initialization logic initializes a background work item
> pf->ptp_extts0_work, which is scheduled asynchronously by the hardware
> interrupt handler. The i40e_ptp_stop() function disables hardware
> interrupts and unregisters the clock without calling
> cancel_work_sync(&pf->ptp_extts0_work). 
> Could this leave the work item pending if an event fires right before
> interrupts are disabled, leading to a use-after-free when the module
> teardown frees the pf structure?


I think we need to double check that extts0_work is cancelled properly
when stopping PTP. Or confirm that it won't cause a problem as-is.

Sashiko also says:

> This isn't a bug introduced by this patch, but does this error path leak VSI
> structures and leave netdevs registered?
> If i40e_setup_pf_switch() fails late, the main VSI has already been
> allocated and its netdev registered via register_netdev(). The err_vsis
> cleanup block only executes kfree(pf->vsi), which frees the array of
> pointers but fails to invoke i40e_vsi_release(vsi) or unregister the netdev.
> Since the driver then proceeds to i40e_free_pf(pf) and frees the underlying
> driver structure, could subsequent userspace operations on the
> still-registered netdev execute driver callbacks that dereference the freed
> pf structure?

The only way that the function can fail late appears to be if
i40e_validate_filter_settings() fails.. but that has a comment around
i40e_setup_pf_filter_control() which claims that we shouldn't the probe
just because that failed.. so I think there is maybe an issue but I'm
not 100% sure how to resolve it properly.. It does seem like there may
be a pre-existing issue in the i40e probe error cleanup.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 02/13] i40e: Cleanup PTP pins on probe failure
  2026-05-05  5:14 ` [PATCH net 02/13] i40e: Cleanup PTP pins " Jacob Keller
@ 2026-05-06 20:28   ` Jacob Keller
  0 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-06 20:28 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Matt Vollrath, Kohei Enju, Paul Menzel,
	Sunitha Mekala

On 5/4/2026 10:14 PM, Jacob Keller wrote:
> From: Matt Vollrath <tactii@gmail.com>
> 
> PTP pin structs are allocated early in probe, but never cleaned up.
> 
> Fix this by calling i40e_ptp_free_pins in the error path.
> 
> To support this, i40e_ptp_free_pins is added to the header and
> pin_config is correctly nullified after being freed.
> 
> This has been an issue since i40e_ptp_alloc_pins was introduced.
> 
> Fixes: 1050713026a08 ("i40e: add support for PTP external synchronization clock")
> Reported-by: Kohei Enju <kohei@enjuk.jp>
> Cc: stable@vger.kernel.org
> Signed-off-by: Matt Vollrath <tactii@gmail.com>
> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Reviewed-by: Kohei Enju <kohei@enjuk.jp>
> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e.h      | 1 +
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 1 +
>  drivers/net/ethernet/intel/i40e/i40e_ptp.c  | 3 ++-
>  3 files changed, 4 insertions(+), 1 deletion(-)
> 
Sashiko points out some issues which are related but pre-existing this fix:

> This isn't a bug in this patch, but looking at i40e_probe(), the return
> value of i40e_ptp_alloc_pins() is ignored earlier in the probe sequence.
> If the memory allocation fails, pf->ptp_pins remains NULL, but the probe
> routine continues successfully.
> Later, if a user attempts to configure the PTP pins via the PTP clock ioctl
> (such as PTP_EXTTS_REQUEST), i40e_ptp_enable_pin() unconditionally dereferences
> the pointer:
> drivers/net/ethernet/intel/i40e/i40e_ptp.c:i40e_ptp_enable_pin() {
>     ...
> 	pins.sdp3_2 = pf->ptp_pins->sdp3_2;
>     ...
> }
> Could this lead to a NULL pointer dereference panic triggered by userspace?


Probably we just need to add an error handling of the pin allocation
since failure to allocate memory is likely to hit a lot of other places
as well. This is also a pre-existing issue.

> This isn't a bug in this patch, but while looking at PTP cleanup, it appears
> the pf->ptp_extts0_work work item is never canceled during device removal.
> In i40e_remove(), i40e_ptp_stop() masks the PTP event interrupts, but it
> doesn't call cancel_work_sync(&pf->ptp_extts0_work) to flush any already-queued
> work.
> i40e_remove() later cancels service_task but completely omits ptp_extts0_work.
> If an interrupt triggers and schedules the work immediately before the teardown
> path masks the interrupt, will the work item execute after the pf structure is
> completely freed, resulting in a use-after-free?

This is also a pre-existing issue in the PTP teardown that was reported
on patch 1/13 as well: we need to cancel the extts0_work item.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 03/13] i40e: keep q_vectors array in sync with channel count changes
  2026-05-05  5:14 ` [PATCH net 03/13] i40e: keep q_vectors array in sync with channel count changes Jacob Keller
@ 2026-05-06 20:53   ` Jacob Keller
  0 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-06 20:53 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Joshua Hay, Madhu Chittim, Willem de Bruijn, Dave Ertman,
	Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Simon Horman, Sunitha Mekala

On 5/4/2026 10:14 PM, Jacob Keller wrote:
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> 
> For the main VSI, i40e_set_num_rings_in_vsi() always derives
> num_q_vectors from pf->num_lan_msix. At the same time, ethtool -L stores
> the user requested channel count in vsi->req_queue_pairs and the queue
> setup path uses that value for the effective number of queue pairs.
> 
> This leaves queue and vector counts out of sync after shrinking channel
> count via ethtool -L. The active queue configuration is reduced, but the
> VSI still keeps the full PF-sized q_vector topology.
> 
> That mismatch breaks reconfiguration flows which rely on vector/NAPI
> state matching the effective channel configuration. In particular,
> toggling /sys/class/net/<dev>/threaded after reducing the channel count
> can hang, and later channel-count changes can fail because VSI reinit
> does not rebuild q_vectors to match the new vector count.
> 
> Fix this by making the main VSI num_q_vectors follow the effective
> requested channel count, capped by the available MSI-X vectors. Update
> i40e_vsi_reinit_setup() to rebuild q_vectors during VSI reinit so the
> vector topology is refreshed together with the ring arrays when channel
> count changes.
> 
> Keep alloc_queue_pairs unchanged and based on pf->num_lan_qps so the VSI
> retains its full queue capacity.
> 
> Selftest napi_threaded.py was originally used when Jakub reported hang
> on /sys/class/net/<dev>/threaded toggle. In order to make it pass on
> i40e, use persistent NAPI configuration for q_vector NAPIs so NAPI
> identity and threaded settings survive q_vector reallocation across
> channel-count changes. This is achieved by using netif_napi_add_config()
> when configuring q_vectors.
> 
> $ export NETIF=ens259f1np1
> $ sudo -E env PATH="$PATH" ./tools/testing/selftests/drivers/net/napi_threaded.py
> TAP version 13
> 1..3
> ok 1 napi_threaded.napi_init
> ok 2 napi_threaded.change_num_queues
> ok 3 napi_threaded.enable_dev_threaded_disable_napi_threaded
> Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
> 
> [Jake: use min() and clamp() as suggested by Simon on Intel Wired LAN]
> 
> Reported-by: Jakub Kicinski <kuba@kernel.org>
> Closes: https://lore.kernel.org/intel-wired-lan/20260316133100.6054a11f@kernel.org/
> Fixes: d2a69fefd756 ("i40e: Fix changing previously set num_queue_pairs for PFs")
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Reviewed-by: Simon Horman <horms@kernel.org>
> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---

Sashiko points out some possible issues that might need to be addressed:

>  drivers/net/ethernet/intel/i40e/i40e_main.c | 34 ++++++++++++++++++++++++-----
>  1 file changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 6d4f9218dc68..23156015ed86 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -11403,10 +11403,14 @@ static void i40e_service_timer(struct timer_list *t)
>  static int i40e_set_num_rings_in_vsi(struct i40e_vsi *vsi)
>  {
>  	struct i40e_pf *pf = vsi->back;
> +	u16 qps;
>  
>  	switch (vsi->type) {
>  	case I40E_VSI_MAIN:
>  		vsi->alloc_queue_pairs = pf->num_lan_qps;
> +		qps = vsi->req_queue_pairs ?
> +		      min(vsi->req_queue_pairs, pf->num_lan_qps) :
> +		      pf->num_lan_qps;
>  		if (!vsi->num_tx_desc)
>  			vsi->num_tx_desc = ALIGN(I40E_DEFAULT_NUM_DESCRIPTORS,
>  						 I40E_REQ_DESCRIPTOR_MULTIPLE);
> @@ -11414,7 +11418,7 @@ static int i40e_set_num_rings_in_vsi(struct i40e_vsi *vsi)
>  			vsi->num_rx_desc = ALIGN(I40E_DEFAULT_NUM_DESCRIPTORS,
>  						 I40E_REQ_DESCRIPTOR_MULTIPLE);
>  		if (test_bit(I40E_FLAG_MSIX_ENA, pf->flags))
> -			vsi->num_q_vectors = pf->num_lan_msix;
> +			vsi->num_q_vectors = clamp(qps, 1, pf->num_lan_msix);
>  		else
>  			vsi->num_q_vectors = 1;
>  
> @@ -11503,6 +11507,7 @@ static int i40e_vsi_alloc_arrays(struct i40e_vsi *vsi, bool alloc_qvectors)
>  
>  err_vectors:
>  	kfree(vsi->tx_rings);
> +	vsi->tx_rings = NULL;
>  	return ret;
>  }
>  
> @@ -12043,7 +12048,8 @@ static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx)
>  	cpumask_copy(&q_vector->affinity_mask, cpu_possible_mask);
>  
>  	if (vsi->netdev)
> -		netif_napi_add(vsi->netdev, &q_vector->napi, i40e_napi_poll);
> +		netif_napi_add_config(vsi->netdev, &q_vector->napi,
> +				      i40e_napi_poll, v_idx);
>  
>  	/* tie q_vector and vsi together */
>  	vsi->q_vectors[v_idx] = q_vector;
> @@ -14264,12 +14270,27 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct i40e_vsi *vsi)
>  
>  	pf = vsi->back;
>  
> +	if (test_bit(I40E_FLAG_MSIX_ENA, pf->flags)) {
> +		i40e_put_lump(pf->irq_pile, vsi->base_vector, vsi->idx);
> +		vsi->base_vector = 0;
> +	}
> +


It says:
> 
> Does vsi->base_vector need to be cleared unconditionally?
> If the hardware falls back to legacy/MSI mode, the I40E_FLAG_MSIX_ENA flag is
> false, which leaves vsi->base_vector with a stale non-zero value.
> Subsequent calls to i40e_vsi_setup_vectors() check if (vsi->base_vector) and
> unconditionally return -EEXIST, which causes the VSI reinitialization to
> fail and the VSI to be destroyed.

I'm not sure this is an issue. base_vector should only be non-zero when
MSIX is enabled, so this will clear it to 0 in that case. In MSI or
Legacy mode it will already be zero so it doesn't need to be cleared.
When switching modes we have to go through this function to
reinitialize, so we should clear it.

It might nomimally make more sense to always clear the value to ensure
its zero regardless if it ever got a stale value, but I don't think its
actually required.


>  	i40e_put_lump(pf->qp_pile, vsi->base_queue, vsi->idx);
>  	i40e_vsi_clear_rings(vsi);
>  
> -	i40e_vsi_free_arrays(vsi, false);
> +	i40e_vsi_free_q_vectors(vsi);
> +	i40e_vsi_free_arrays(vsi, true);

Sashiko says:

> Can this result in a use-after-free on the ring structures?
> The i40e_vsi_clear_rings() function passes the ring structures to kfree_rcu().
> Immediately after, i40e_vsi_free_q_vectors() iterates over the q_vector->tx
> list, which still points to these RCU-freed rings, and attempts to modify
> them.

I think this is correct, and we'll need to free this in order of
i40e_free_q_vectors -> i40e_vsi_clear_rings -> i40e_vsi_free_arrays.

It does seem like the q_vectors maintain their pointers to the rings and
the i40e_for_each_ring would access the values. The i40e_vsi_clear_rings
function does assign the pointers to NULL, but those don't appear to be
the same pointers checked by in q_vector.

I will note that there is at least one other teardown flow that also
appears incorrect and calls i40e_vsi_clear_rings first before calling
i40e_vsi_free_q_vectors: the teardown flow of i40e_vsi_setup().

This needs to be addressed, so this patch should be dropped from the series.

>  	i40e_set_num_rings_in_vsi(vsi);
> -	ret = i40e_vsi_alloc_arrays(vsi, false);
> +
> +	ret = i40e_vsi_alloc_arrays(vsi, true);
> +	if (ret)
> +		goto err_vsi;
> +
> +	/* Rebuild q_vectors during VSI reinit because the effective channel
> +	 * count may change num_q_vectors. Keep vector topology aligned with the
> +	 * queue configuration after ethtool's .set_channels() callback.
> +	 */
> +	ret = i40e_vsi_setup_vectors(vsi);
>  	if (ret)
>  		goto err_vsi;
>  


Finally sashiko says this:

> Is it possible for the dynamic reallocation of vsi->num_q_vectors to cause
> irq_pile fragmentation?
> Because i40e_get_lump() requires a contiguous block of vectors, reducing
> the channel count and later increasing it might fail if another VSI
> (such as a Macvlan interface) allocated vectors from the freed hole in
> the meantime.
> If this contiguous allocation fails, i40e_vsi_reinit_setup() aborts and
> destroys the VSI, which could render the interface unusable until a driver
> reload.
> Also, if i40e_vsi_setup_vectors() fails here and jumps to err_vsi, will the
> driver skip unregister_netdev()?
> The err_vsi label is placed after the err_rings block, which skips cleaning up
> the netdev. The code then falls through to i40e_vsi_clear() which frees the
> VSI memory.
> Could this leave the active net_device registered in the kernel with a
> dangling pointer to the freed VSI memory?


I'm not sure exactly what its trying to say here since it seems like a
few jumbled issues. However, I suspect even if the fragmentation issue
is real it shouldn't be solved by this fix.

The issue with the err_vsi also I am not certain is real. The function
returns NULL when it fails to reinit, and that will stop probe or
rebuild. It is likely that this entire chunk of code is flawed and
should be redone, but I think that is also out-of-scope for this fix.

Still, we need a new verison that fixes the use-after-free from
q_vectors function.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 05/13] idpf: do not enable XDP if queue based scheduling is not supported
  2026-05-05  5:14 ` [PATCH net 05/13] idpf: do not enable XDP if queue based scheduling is not supported Jacob Keller
@ 2026-05-06 20:59   ` Jacob Keller
  0 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-06 20:59 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Patryk Holda

On 5/4/2026 10:14 PM, Jacob Keller wrote:
> From: Joshua Hay <joshua.a.hay@intel.com>
> 
> The current XDP implementation uses queue based scheduling for its TxQs.
> If the FW does not advertise support for queue based scheduling, do not
> enable XDP. Add the missing capability check at the start of the XDP
> configuration. This will temporarily break XDP while a flow based
> implementation is worked on, as well as while FWs with queue based by
> default are rolled out.
> 
> Fixes: 705457e7211f ("idpf: implement XDP_SETUP_PROG in ndo_bpf for splitq")
> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
> Tested-by: Patryk Holda <patryk.holda@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
>  drivers/net/ethernet/intel/idpf/xdp.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/idpf/xdp.c b/drivers/net/ethernet/intel/idpf/xdp.c
> index cbccd4546768..dcd867517a5f 100644
> --- a/drivers/net/ethernet/intel/idpf/xdp.c
> +++ b/drivers/net/ethernet/intel/idpf/xdp.c
> @@ -510,6 +510,13 @@ int idpf_xdp(struct net_device *dev, struct netdev_bpf *xdp)
>  	if (!idpf_is_queue_model_split(vport->dflt_qv_rsrc.txq_model))
>  		goto notsupp;
>  
> +	if (!idpf_is_cap_ena(vport->adapter, IDPF_OTHER_CAPS,
> +			     VIRTCHNL2_CAP_SPLITQ_QSCHED)) {
> +		NL_SET_ERR_MSG_MOD(xdp->extack,
> +				   "Device does not support requested XDP Tx scheduling mode");
> +		goto notsupp;
> +	}
> +

Sashiko points out that this is only valid for XDP_SETUP_PROG:

> Could accessing xdp->extack here cause an uninitialized memory dereference?
> idpf_xdp() handles multiple commands like XDP_SETUP_PROG and
> XDP_SETUP_XSK_POOL. In struct netdev_bpf, extack is part of a union and
> is only valid when xdp->command is XDP_SETUP_PROG.
> Since this capability check happens before the switch (xdp->command) block,
> if the command is XDP_SETUP_XSK_POOL, xdp->extack overlaps with the xsk
> sub-struct and might contain uninitialized stack data.
> If that memory is non-NULL, NL_SET_ERR_MSG_MOD could write the error message
> pointer to an arbitrary memory address.
> Should this capability check be moved inside the XDP_SETUP_PROG case,
> or should the command type be verified before accessing xdp->extack?

I checked and the netdev_bpf structure indeed only has extack for the
XDP_SETUP_PROG case.

Because extack is only valid for that command, you can't use extack here.

I don't know if this check either belongs inside idpf_xdp_setup_prog()
or if you should just call NL_SET_ERR_MSG_MOD if it is an XDP_SETUP_PROG.

Josh, could you figure out which solution is better and prepare an
updated version.

>  	switch (xdp->command) {
>  	case XDP_SETUP_PROG:
>  		ret = idpf_xdp_setup_prog(vport, xdp);
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 08/13] idpf: fix double free and use-after-free in aux device error paths
  2026-05-05  5:14 ` [PATCH net 08/13] idpf: fix double free and use-after-free in aux device error paths Jacob Keller
@ 2026-05-06 21:04   ` Jacob Keller
  0 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-06 21:04 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Greg Kroah-Hartman, Tony Nguyen, stable,
	Paul Menzel

On 5/4/2026 10:14 PM, Jacob Keller wrote:
> From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> When auxiliary_device_add() fails in idpf_plug_vport_aux_dev() or
> idpf_plug_core_aux_dev(), the err_aux_dev_add label calls
> auxiliary_device_uninit() and falls through to err_aux_dev_init.  The
> uninit call will trigger put_device(), which invokes the release
> callback (idpf_vport_adev_release / idpf_core_adev_release) that frees
> iadev.  The fall-through then reads adev->id from the freed iadev for
> ida_free() and double-frees iadev with kfree().
> 
> Free the IDA slot and clear the back-pointer before uninit, while adev
> is still valid, then return immediately.
> 
> Commit 65637c3a1811 ("idpf: fix UAF in RDMA core aux dev deinitialization")
> fixed the same use-after-free in the matching unplug path in this file but
> missed both probe error paths.
> 
> Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Cc: Andrew Lunn <andrew+netdev@lunn.ch>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: stable@kernel.org
> Fixes: be91128c579c ("idpf: implement RDMA vport auxiliary dev create, init, and destroy")
> Fixes: f4312e6bfa2a ("idpf: implement core RDMA auxiliary dev create, init, and destroy")
> Assisted-by: gregkh_clanker_t1000
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
>  drivers/net/ethernet/intel/idpf/idpf_idc.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c
> index 7e4f4ac92653..b7d6b08fc89e 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_idc.c
> +++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c
> @@ -90,7 +90,10 @@ static int idpf_plug_vport_aux_dev(struct iidc_rdma_core_dev_info *cdev_info,
>  	return 0;
>  

Sashiko points out some possible flaws in the idpf_plug_vport_aux_dev
function that are unrelated to this change:

> This isn't an issue introduced by this patch, but does earlier code prematurely
> publish vdev_info->adev before initialization is complete?
> vdev_info->adev is set near the beginning of idpf_plug_vport_aux_dev():
> 	adev = &iadev->adev;
> 	vdev_info->adev = &iadev->adev;
> 	iadev->vdev_info = vdev_info;
> 	...
> 	ret = auxiliary_device_init(adev);
> If a concurrent MTU event triggers idpf_idc_vdev_mtu_event(), it will read
> the pointer locklessly:
> void idpf_idc_vdev_mtu_event(...)
> {
> 	...
> 	device_lock(&vdev_info->adev->dev);
> 	adev = vdev_info->adev;
> 	if (!adev || !adev->dev.driver)
> 		goto unlock;
> If device_initialize() hasn't executed yet, the device's mutex is uninitialized
> and attempting to lock it could result in a panic.
> Furthermore, if auxiliary_device_add() fails and the error path sets the
> pointer to NULL and frees the device, could a concurrent reader that fetched
> the pointer just before it was cleared suffer a use-after-free when attempting
> to lock the freed device?
> Should the assignment to vdev_info->adev be deferred until after the device has
> been successfully initialized and added?
> Also, in idpf_idc_vdev_mtu_event(), is it possible for vdev_info->adev to be
> NULL when device_lock() is called?
> 	device_lock(&vdev_info->adev->dev);
> 	adev = vdev_info->adev;
> 	if (!adev || !adev->dev.driver)
> 		goto unlock;
> If vdev_info->adev is NULL, evaluating &vdev_info->adev->dev evaluates to
> NULL, which is then passed to mutex_lock() inside device_lock(). The subsequent
> check for (!adev) indicates this NULL state is anticipated. Should the
> NULL check happen before attempting to acquire the lock?

I do not believe these should block this fix, and we'll need to have
someone from the idpf team review this code and determine if this is
valid and find a fix for it.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 09/13] ice: fix setting RSS VSI hash for E830
  2026-05-05  5:14 ` [PATCH net 09/13] ice: fix setting RSS VSI hash for E830 Jacob Keller
@ 2026-05-06 21:06   ` Jacob Keller
  2026-05-07 11:47     ` Marcin Szycik
  0 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2026-05-06 21:06 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Joshua Hay, Madhu Chittim, Willem de Bruijn, Dave Ertman,
	Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Marcin Szycik

On 5/4/2026 10:14 PM, Jacob Keller wrote:
> From: Marcin Szycik <marcin.szycik@linux.intel.com>
> 
> ice_set_rss_hfunc() performs a VSI update, in which it sets hashing
> function, leaving other VSI options unchanged. However, ::q_opt_flags is
> mistakenly set to the value of another field, instead of its original
> value, probably due to a typo. What happens next is hardware-dependent:
> 
> On E810, only the first bit is meaningful (see
> ICE_AQ_VSI_Q_OPT_PE_FLTR_EN) and can potentially end up in a different
> state than before VSI update.
> 
> On E830, some of the remaining bits are not reserved. Setting them
> to some unrelated values can cause the firmware to reject the update
> because of invalid settings, or worse - succeed.
> 
> Reproducer:
>   sudo ethtool -X $PF1 equal 8
> 
> Output in dmesg:
>   Failed to configure RSS hash for VSI 6, error -5
> 
> Fixes: 352e9bf23813 ("ice: enable symmetric-xor RSS for Toeplitz hash function")
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 1d1947a7fe11..c52c465280f7 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -8046,7 +8046,7 @@ int ice_set_rss_hfunc(struct ice_vsi *vsi, u8 hfunc)
>  	ctx->info.q_opt_rss |=
>  		FIELD_PREP(ICE_AQ_VSI_Q_OPT_RSS_HASH_M, hfunc);
>  	ctx->info.q_opt_tc = vsi->info.q_opt_tc;
> -	ctx->info.q_opt_flags = vsi->info.q_opt_rss;
> +	ctx->info.q_opt_flags = vsi->info.q_opt_flags;
>  
>  	err = ice_update_vsi(hw, vsi->idx, ctx, NULL);
>  	if (err) {
> 

Sashiko complains about ice_set_rss_hfunc() but it is unrelated to this fix:

> While looking at this function, I noticed a pre-existing issue regarding the
> hardware cache. Does calling ice_update_vsi() with a local context leave the
> global hw->vsi_ctx[vsi->idx] out of sync?
> If ice_update_vsi() succeeds, vsi->info.q_opt_rss is updated, but
> hw->vsi_ctx[vsi->idx]->info.q_opt_rss is not.
> When an unrelated feature such as RDMA filtering is subsequently toggled via
> ice_cfg_rdma_fltr(), could it retrieve this stale cached context via
> ice_get_vsi_ctx() and copy the stale q_opt_rss value back into its command
> buffer?
> Could this cause the firmware to silently revert the RSS hash function to its
> previous configuration during the next ice_update_vsi() call?
> There also appears to be a pre-existing issue with state desynchronization
> when handling failures. After ice_update_vsi() succeeds, vsi->rss_hfunc
> is updated, but ice_set_rss_hfunc() then calls ice_set_rss_cfg_symm() at the
> end of the function:
> ice_set_rss_hfunc() {
>         ...
>         err = ice_update_vsi(hw, vsi->idx, ctx, NULL);
>         if (err) {
>                 ...
>         } else {
>                 vsi->info.q_opt_rss = ctx->info.q_opt_rss;
>                 vsi->rss_hfunc = hfunc;
>                 ...
>         }
>         ...
>         return ice_set_rss_cfg_symm(hw, vsi, symm);
> }
> If ice_set_rss_cfg_symm() fails, the error is returned but vsi->rss_hfunc
> remains updated. If the user retries the command, the early check:
> if (hfunc == vsi->rss_hfunc)
>         return 0;
> evaluates to true and returns success immediately.
> Could this skip the ice_set_rss_cfg_symm() retry entirely, leaving the
> hardware's flow director rules in a non-symmetric state permanently while
> falsely reporting success?

Someone from the ice team should look into this and determine whether or
not its valid.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 10/13] ice: fix locking in ice_dcb_rebuild()
  2026-05-05  5:14 ` [PATCH net 10/13] ice: fix locking in ice_dcb_rebuild() Jacob Keller
@ 2026-05-06 21:13   ` Jacob Keller
  0 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-06 21:13 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Bart Van Assche, intel-wired-lan, Arpana Arland

On 5/4/2026 10:14 PM, Jacob Keller wrote:
> From: Bart Van Assche <bvanassche@acm.org>
> 
> Move the mutex_lock() call up to prevent that DCB settings change after
> the first ice_query_port_ets() call. The second ice_query_port_ets()
> call in ice_dcb_rebuild() is already protected by pf->tc_mutex.
> 
> This also fixes a bug in an error path, as before taking the first
> "goto dcb_error" in the function jumped over mutex_lock() to
> mutex_unlock().
> 
> This bug has been detected by the clang thread-safety analyzer.
> 
> Cc: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org
> Fixes: 242b5e068b25 ("ice: Fix DCB rebuild after reset")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Tested-by: Arpana Arland <arpanax.arland@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> index 16aa25535152..0bc6dd375687 100644
> --- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> @@ -537,14 +537,14 @@ void ice_dcb_rebuild(struct ice_pf *pf)
>  	struct ice_dcbx_cfg *err_cfg;
>  	int ret;
>  
> +	mutex_lock(&pf->tc_mutex);
> +
>  	ret = ice_query_port_ets(pf->hw.port_info, &buf, sizeof(buf), NULL);
>  	if (ret) {
>  		dev_err(dev, "Query Port ETS failed\n");
>  		goto dcb_error;
>  	}
>  
> -	mutex_lock(&pf->tc_mutex);
> -
>  	if (!pf->hw.port_info->qos_cfg.is_sw_lldp)
>  		ice_cfg_etsrec_defaults(pf->hw.port_info);
>  
> 

Sashiko thinks there is a problem here:

> Does moving mutex_lock(&pf->tc_mutex) to the top of ice_dcb_rebuild()
> introduce an AB-BA deadlock with rtnl_lock?
> In standard DCB netlink operations, the netlink layer calls into the driver
> with rtnl_lock already held, and the driver subsequently acquires
> pf->tc_mutex. This establishes a required lock ordering of rtnl_lock
> followed by pf->tc_mutex.
> With this change, if the first ice_query_port_ets() fails, we take the goto
> dcb_error path while already holding pf->tc_mutex.
> Looking at the error cleanup path lower down in ice_dcb_rebuild():
> dcb_error:
>     dev_err(dev, "Disabling DCB until new settings occur\n");
>     [ ... ]
>     ice_pf_dcb_cfg(pf, err_cfg, false);
> Because the locked parameter is passed as false, ice_pf_dcb_cfg() will
> unconditionally acquire rtnl_lock().
> Does this create a lock inversion (pf->tc_mutex followed by rtnl_lock)
> against concurrent DCB netlink operations?
This seems like a fully pre-existing error. We already jump to dcb_error
else where in the function.

I don't know if this locking order really is an ABBA violation (I did
not review any of the other flows that take tc_mutex to confirm), but I
don't think it should hold this fix.

Someone from the ice team will need to investigate and see what the best
solution is. I suspect we'll have to take RTNL lock then the tc_mutex
and pass true to the ice_pf_dcb_cfg function. Or, better yet, see if
this converts to the netdev per-instance lock and we could drop the
tc_mutex entirely, relying on netdev_lock?

Thanks,
Jake

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 11/13] ice: fix PTP hang for E825C devices
  2026-05-05  5:14 ` [PATCH net 11/13] ice: fix PTP hang for E825C devices Jacob Keller
@ 2026-05-06 21:16   ` Jacob Keller
  0 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-06 21:16 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Michal Kubiak, Joshua Hay, Madhu Chittim, Willem de Bruijn,
	Dave Ertman, Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Simon Horman, Rinitha S

On 5/4/2026 10:14 PM, Jacob Keller wrote:
> From: Grzegorz Nitka <grzegorz.nitka@intel.com>
> 
> Change the order of PTP reconfiguration when port goes down or up
> (ice_down and ice_up calls) to be more graceful and consistent from
> timestamp interrupts processing perspective.
> 
> For both calls (ice_up and ice_down), accompanying ice_ptp_link_change
> is called which starts/stops PTP timer. This patch changes the order:
> - while link goes down: disable net device Tx first (netif_carrier_off,
>   netif_tx_disable), then call ice_ptp_link_change
> - while link goes up: ice_ptp_link_change called first, then re-enable
>   net device Tx (netif_tx_start_all_queues)
> 
> Otherwise, there is a narrow window in which PTP timestamp request has
> been triggered and timestamp processing occurs when PTP timer is not
> enabled yet (up case) or already disabled (down case). This may lead to
> undefined behavior and receiving invalid timestamps. This case was
> observed on E825C devices only.
> 

Sashiko has some concerns:

> Does this sequence still leave a window for early packets to trigger PTP
> processing before the timer is ready?
> Looking earlier in ice_up_complete(), the hardware Rx rings are started
> before we reach this point:
> 	err = ice_vsi_start_all_rx_rings(vsi);
> 	if (err)
> 		return err;
> Since ice_vsi_start_all_rx_rings() is called before ice_ptp_link_change(),
> could packets be received and timestamped before the PTP timer is actually
> enabled?

This doesn't matter. The issue is for Tx timestamps and the Rx path is
unrelated here.

Sashiko also says:

> Is the race condition fully closed by moving ice_ptp_link_change() here?
> While netif_tx_disable() stops the OS networking stack from enqueuing new
> packets, it does not flush or wait for the hardware DMA rings to drain. The
> hardware Tx rings are actually stopped later in the teardown process:
> 	ice_vsi_dis_irq(vsi);
> 	tx_err = ice_vsi_stop_lan_tx_rings(vsi, ICE_NO_RESET, 0);
> If an in-flight packet from the hardware Tx ring triggers a PTP timestamp
> request after ice_ptp_link_change() has set ptp_port->link_up = false,
> wouldn't the software worker ignore the timestamp?
> Leaving a valid timestamp unread in the hardware FIFO is noted as the cause
> of the PHY hang. Should ice_ptp_link_change() be called after the hardware
> rings are fully stopped to completely avoid this regression?
> Additionally, is there a missing VSI type check here in ice_down()?
> Unlike ice_up_complete(), which checks if the VSI is a PF or SF, this block
> executes for any VSI that has a netdev.
> Because the PTP timer is shared at the physical port level, could bringing
> down a secondary interface (like a sub-function) invoke ice_down() and
> unconditionally disable the PTP timer for vsi->back?
> This seems like it would break PTP functionality for the main PF interface
> and any other active sub-functions on that port.

This I am not sure. Gregorz, do you have any thoughts? I thought we had
spoken about this and were maybe investigating separating the PTP link
state from the admin up/down state entirely.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf)
  2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
                   ` (12 preceding siblings ...)
  2026-05-05  5:14 ` [PATCH net 13/13] ice: dpll: fix misplaced header macros Jacob Keller
@ 2026-05-06 21:21 ` Jacob Keller
  13 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-06 21:21 UTC (permalink / raw)
  To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Joshua Hay, Madhu Chittim, Willem de Bruijn, Dave Ertman,
	Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable, Matt Vollrath, Sunitha Mekala, Kohei Enju,
	Paul Menzel, Simon Horman, Emil Tantilov, Samuel Salin,
	Patryk Holda, Greg Kroah-Hartman, Tony Nguyen, stable,
	Marcin Szycik, Bart Van Assche, intel-wired-lan, Arpana Arland,
	Rinitha S

On 5/4/2026 10:14 PM, Jacob Keller wrote:
> Matt Volrath fixes two issues with the i40e driver probe routine, ensuring
> that PTP is properly cleaned up if the probe fails.
> 
> Maciej fixes the i40e driver logic to keep the q_vectors array in sync with
> changes to the channel count via ethtool.
> 
> Emil corrects the initialization of the read_dev_clk_lock spinlock in
> idpf_ptp_init, ensuring it is initialized prior to when the
> ptp_schedule_worker() is called.
> 
> Josh fixes the idpf driver to prevent enabling XDP if the queue based
> scheduling is not supported by the firmware.
> 
> Josh fixes the idpf skb data path for handling queue based scheduling.
> 
> Josh fixes an XDP crash in the soft reset error path, restoring the
> original configuration if idpf_xdp_setup_prog() fails.
> 
> Greg KH fixes a double free and use-after free in the idpf auxiliary device
> error paths.
> 
> Marcin fixes ice_set_rss_hfunc() to use the correct q_opt_flags field,
> correcting the assignment and preventing submission of invalid data to the
> firmware.
> 
> Bart corrects the locking in ice_dcb_rebuild(), ensuring that the tc_mutex
> is held over the entire operation.
> 
> Grzegorz fixes the ordering of ice_ptp_link_change() in ice_up_complete()
> ensuring that the PTP timestamps will not be enabled before the PTP timer
> is actually re-initialized.
> 
> Ivan fixes the rclk pin state get for E810 devices, ensuring the index is
> properly offset by the base_rclk_idx value. This ensures that the correct
> pin index is used to look up recovered clock state. He additionally adds
> bounds checking to prevent attempting to access pins outside of the pin
> state array.
> 
> Ivan also moves the CGU register macros to the top of ice_dpll.h, inside
> the header guard to avoid duplicate macro definitions should the ice_dpll.h
> header is included multiple times.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---

Sashiko pointed out a few issues with some of the patches. I replied to
all the patches with possible issues, and I think some really do need
more work.

In particular, patch 3 needs to resolve a definite use-after-free issue,
patch 5 needs to address an issue with the extack pointer use, patch 6
and 7 need some investigation from the author to confirm, and patch 11
needs some confirmation from Grzegorz on whether there is still any gap.

Sashiko did have some concerns on patch 1, 2, 8, and 10. I replied to
the patches, and I think those are issues which need separate follow up
work and shouldn't block these fixes.

I'm going to submit a v2 which drops the patches that need rework.

Thanks,
Jake
> Bart Van Assche (1):
>       ice: fix locking in ice_dcb_rebuild()
> 
> Emil Tantilov (2):
>       idpf: fix read_dev_clk_lock spinlock init in idpf_ptp_init()
>       idpf: fix xdp crash in soft reset error path
> 
> Greg Kroah-Hartman (1):
>       idpf: fix double free and use-after-free in aux device error paths
> 
> Grzegorz Nitka (1):
>       ice: fix PTP hang for E825C devices
> 
> Ivan Vecera (2):
>       ice: dpll: fix rclk pin state get for E810
>       ice: dpll: fix misplaced header macros
> 
> Joshua Hay (2):
>       idpf: do not enable XDP if queue based scheduling is not supported
>       idpf: fix skb datapath queue based scheduling crashes and timeouts
> 
> Maciej Fijalkowski (1):
>       i40e: keep q_vectors array in sync with channel count changes
> 
> Marcin Szycik (1):
>       ice: fix setting RSS VSI hash for E830
> 
> Matt Vollrath (2):
>       i40e: Cleanup PTP registration on probe failure
>       i40e: Cleanup PTP pins on probe failure
> 
>  drivers/net/ethernet/intel/i40e/i40e.h          |  1 +
>  drivers/net/ethernet/intel/ice/ice_dpll.h       | 32 ++++++-------
>  drivers/net/ethernet/intel/idpf/idpf_txrx.h     | 12 +++--
>  drivers/net/ethernet/intel/idpf/idpf_virtchnl.h |  4 +-
>  drivers/net/ethernet/intel/i40e/i40e_main.c     | 36 ++++++++++++---
>  drivers/net/ethernet/intel/i40e/i40e_ptp.c      |  3 +-
>  drivers/net/ethernet/intel/ice/ice_dcb_lib.c    |  4 +-
>  drivers/net/ethernet/intel/ice/ice_dpll.c       |  5 ++
>  drivers/net/ethernet/intel/ice/ice_main.c       |  6 +--
>  drivers/net/ethernet/intel/idpf/idpf_idc.c      |  6 +++
>  drivers/net/ethernet/intel/idpf/idpf_lib.c      |  4 +-
>  drivers/net/ethernet/intel/idpf/idpf_ptp.c      |  4 +-
>  drivers/net/ethernet/intel/idpf/idpf_txrx.c     | 61 ++++++++++++++-----------
>  drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 19 ++------
>  drivers/net/ethernet/intel/idpf/xdp.c           | 15 ++++--
>  drivers/net/ethernet/intel/idpf/xsk.c           |  4 +-
>  16 files changed, 132 insertions(+), 84 deletions(-)
> ---
> base-commit: bd3a4795d5744f59a1f485379f1303e5e606f377
> change-id: 20260504-jk-iwl-net-2026-05-04-f9526823577f
> 
> Best regards,
> --  
> Jacob Keller <jacob.e.keller@intel.com>
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 09/13] ice: fix setting RSS VSI hash for E830
  2026-05-06 21:06   ` Jacob Keller
@ 2026-05-07 11:47     ` Marcin Szycik
  2026-05-07 16:59       ` Marcin Szycik
  0 siblings, 1 reply; 26+ messages in thread
From: Marcin Szycik @ 2026-05-07 11:47 UTC (permalink / raw)
  To: Jacob Keller, Przemek Kitszel, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Joshua Hay, Madhu Chittim, Willem de Bruijn, Dave Ertman,
	Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable



On 06.05.2026 23:06, Jacob Keller wrote:
> On 5/4/2026 10:14 PM, Jacob Keller wrote:
>> From: Marcin Szycik <marcin.szycik@linux.intel.com>
>>
>> ice_set_rss_hfunc() performs a VSI update, in which it sets hashing
>> function, leaving other VSI options unchanged. However, ::q_opt_flags is
>> mistakenly set to the value of another field, instead of its original
>> value, probably due to a typo. What happens next is hardware-dependent:
>>
>> On E810, only the first bit is meaningful (see
>> ICE_AQ_VSI_Q_OPT_PE_FLTR_EN) and can potentially end up in a different
>> state than before VSI update.
>>
>> On E830, some of the remaining bits are not reserved. Setting them
>> to some unrelated values can cause the firmware to reject the update
>> because of invalid settings, or worse - succeed.
>>
>> Reproducer:
>>   sudo ethtool -X $PF1 equal 8
>>
>> Output in dmesg:
>>   Failed to configure RSS hash for VSI 6, error -5
>>
>> Fixes: 352e9bf23813 ("ice: enable symmetric-xor RSS for Toeplitz hash function")
>> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
>> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
>> ---
>>  drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>> index 1d1947a7fe11..c52c465280f7 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_main.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> @@ -8046,7 +8046,7 @@ int ice_set_rss_hfunc(struct ice_vsi *vsi, u8 hfunc)
>>  	ctx->info.q_opt_rss |=
>>  		FIELD_PREP(ICE_AQ_VSI_Q_OPT_RSS_HASH_M, hfunc);
>>  	ctx->info.q_opt_tc = vsi->info.q_opt_tc;
>> -	ctx->info.q_opt_flags = vsi->info.q_opt_rss;
>> +	ctx->info.q_opt_flags = vsi->info.q_opt_flags;
>>  
>>  	err = ice_update_vsi(hw, vsi->idx, ctx, NULL);
>>  	if (err) {
>>
> 
> Sashiko complains about ice_set_rss_hfunc() but it is unrelated to this fix:
> 
>> While looking at this function, I noticed a pre-existing issue regarding the
>> hardware cache. Does calling ice_update_vsi() with a local context leave the
>> global hw->vsi_ctx[vsi->idx] out of sync?
>> If ice_update_vsi() succeeds, vsi->info.q_opt_rss is updated, but
>> hw->vsi_ctx[vsi->idx]->info.q_opt_rss is not.
>> When an unrelated feature such as RDMA filtering is subsequently toggled via
>> ice_cfg_rdma_fltr(), could it retrieve this stale cached context via
>> ice_get_vsi_ctx() and copy the stale q_opt_rss value back into its command
>> buffer?

Yes.

>> Could this cause the firmware to silently revert the RSS hash function to its
>> previous configuration during the next ice_update_vsi() call?

No, because the context object passed to ice_update_vsi() only sets
ctx->info.valid_sections for the sections it wants to update, so unrelated
values are not updated in HW.

Looking at other ice_update_vsi() calls, most of the times the context object is
being allocated, not taken from cache. It's not immediately clear to me what
purpose does hw->vsi_ctx[] serve - it only appears to be used in
ice_cfg_rdma_fltr() (correct me if I'm wrong), where options from the cached
context are being read to fill the unchanged fields in the updated section.
This seems to be the equivalent of keeping track of context values in vsi->info,
which is what almost all ice_update_vsi() callers do.
If I had to guess, I'd say hw->vsi_ctx[] could probably be removed and vsi->info
used instead, but maybe I'm missing something.

TLDR I think this is just old, inconsistent code that could be improved, but it
needs some investigation. Until we don't have a clear signal that there's a bug,
I wouldn't touch it.

>> There also appears to be a pre-existing issue with state desynchronization
>> when handling failures. After ice_update_vsi() succeeds, vsi->rss_hfunc
>> is updated, but ice_set_rss_hfunc() then calls ice_set_rss_cfg_symm() at the
>> end of the function:
>> ice_set_rss_hfunc() {
>>         ...
>>         err = ice_update_vsi(hw, vsi->idx, ctx, NULL);
>>         if (err) {
>>                 ...
>>         } else {
>>                 vsi->info.q_opt_rss = ctx->info.q_opt_rss;
>>                 vsi->rss_hfunc = hfunc;
>>                 ...
>>         }
>>         ...
>>         return ice_set_rss_cfg_symm(hw, vsi, symm);
>> }
>> If ice_set_rss_cfg_symm() fails, the error is returned but vsi->rss_hfunc
>> remains updated. If the user retries the command, the early check:
>> if (hfunc == vsi->rss_hfunc)
>>         return 0;
>> evaluates to true and returns success immediately.
>> Could this skip the ice_set_rss_cfg_symm() retry entirely, leaving the
>> hardware's flow director rules in a non-symmetric state permanently while
>> falsely reporting success?

This looks valid.

Thanks,
Marcin

> Someone from the ice team should look into this and determine whether or
> not its valid.


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 09/13] ice: fix setting RSS VSI hash for E830
  2026-05-07 11:47     ` Marcin Szycik
@ 2026-05-07 16:59       ` Marcin Szycik
  2026-05-07 21:13         ` Jacob Keller
  0 siblings, 1 reply; 26+ messages in thread
From: Marcin Szycik @ 2026-05-07 16:59 UTC (permalink / raw)
  To: Jacob Keller, Przemek Kitszel, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Joshua Hay, Madhu Chittim, Willem de Bruijn, Dave Ertman,
	Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable



On 07.05.2026 13:47, Marcin Szycik wrote:
> 
> 
> On 06.05.2026 23:06, Jacob Keller wrote:
>> On 5/4/2026 10:14 PM, Jacob Keller wrote:
>>> From: Marcin Szycik <marcin.szycik@linux.intel.com>
>>>
>>> ice_set_rss_hfunc() performs a VSI update, in which it sets hashing
>>> function, leaving other VSI options unchanged. However, ::q_opt_flags is
>>> mistakenly set to the value of another field, instead of its original
>>> value, probably due to a typo. What happens next is hardware-dependent:
>>>
>>> On E810, only the first bit is meaningful (see
>>> ICE_AQ_VSI_Q_OPT_PE_FLTR_EN) and can potentially end up in a different
>>> state than before VSI update.
>>>
>>> On E830, some of the remaining bits are not reserved. Setting them
>>> to some unrelated values can cause the firmware to reject the update
>>> because of invalid settings, or worse - succeed.
>>>
>>> Reproducer:
>>>   sudo ethtool -X $PF1 equal 8
>>>
>>> Output in dmesg:
>>>   Failed to configure RSS hash for VSI 6, error -5
>>>
>>> Fixes: 352e9bf23813 ("ice: enable symmetric-xor RSS for Toeplitz hash function")
>>> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
>>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>>> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
>>> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
>>> ---
>>>  drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>>> index 1d1947a7fe11..c52c465280f7 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_main.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
>>> @@ -8046,7 +8046,7 @@ int ice_set_rss_hfunc(struct ice_vsi *vsi, u8 hfunc)
>>>  	ctx->info.q_opt_rss |=
>>>  		FIELD_PREP(ICE_AQ_VSI_Q_OPT_RSS_HASH_M, hfunc);
>>>  	ctx->info.q_opt_tc = vsi->info.q_opt_tc;
>>> -	ctx->info.q_opt_flags = vsi->info.q_opt_rss;
>>> +	ctx->info.q_opt_flags = vsi->info.q_opt_flags;
>>>  
>>>  	err = ice_update_vsi(hw, vsi->idx, ctx, NULL);
>>>  	if (err) {
>>>
>>
>> Sashiko complains about ice_set_rss_hfunc() but it is unrelated to this fix:
>>
>>> While looking at this function, I noticed a pre-existing issue regarding the
>>> hardware cache. Does calling ice_update_vsi() with a local context leave the
>>> global hw->vsi_ctx[vsi->idx] out of sync?
>>> If ice_update_vsi() succeeds, vsi->info.q_opt_rss is updated, but
>>> hw->vsi_ctx[vsi->idx]->info.q_opt_rss is not.
>>> When an unrelated feature such as RDMA filtering is subsequently toggled via
>>> ice_cfg_rdma_fltr(), could it retrieve this stale cached context via
>>> ice_get_vsi_ctx() and copy the stale q_opt_rss value back into its command
>>> buffer?
> 
> Yes.
> 
>>> Could this cause the firmware to silently revert the RSS hash function to its
>>> previous configuration during the next ice_update_vsi() call?
> 
> No, because the context object passed to ice_update_vsi() only sets
> ctx->info.valid_sections for the sections it wants to update, so unrelated
> values are not updated in HW.
> 
> Looking at other ice_update_vsi() calls, most of the times the context object is
> being allocated, not taken from cache. It's not immediately clear to me what
> purpose does hw->vsi_ctx[] serve - it only appears to be used in
> ice_cfg_rdma_fltr() (correct me if I'm wrong), where options from the cached
> context are being read to fill the unchanged fields in the updated section.
> This seems to be the equivalent of keeping track of context values in vsi->info,
> which is what almost all ice_update_vsi() callers do.
> If I had to guess, I'd say hw->vsi_ctx[] could probably be removed and vsi->info
> used instead, but maybe I'm missing something.
> 
> TLDR I think this is just old, inconsistent code that could be improved, but it
> needs some investigation. Until we don't have a clear signal that there's a bug,
> I wouldn't touch it.
> 
>>> There also appears to be a pre-existing issue with state desynchronization
>>> when handling failures. After ice_update_vsi() succeeds, vsi->rss_hfunc
>>> is updated, but ice_set_rss_hfunc() then calls ice_set_rss_cfg_symm() at the
>>> end of the function:
>>> ice_set_rss_hfunc() {
>>>         ...
>>>         err = ice_update_vsi(hw, vsi->idx, ctx, NULL);
>>>         if (err) {
>>>                 ...
>>>         } else {
>>>                 vsi->info.q_opt_rss = ctx->info.q_opt_rss;
>>>                 vsi->rss_hfunc = hfunc;
>>>                 ...
>>>         }
>>>         ...
>>>         return ice_set_rss_cfg_symm(hw, vsi, symm);
>>> }
>>> If ice_set_rss_cfg_symm() fails, the error is returned but vsi->rss_hfunc
>>> remains updated. If the user retries the command, the early check:
>>> if (hfunc == vsi->rss_hfunc)
>>>         return 0;
>>> evaluates to true and returns success immediately.
>>> Could this skip the ice_set_rss_cfg_symm() retry entirely, leaving the
>>> hardware's flow director rules in a non-symmetric state permanently while
>>> falsely reporting success?
> 
> This looks valid.

On second thought, if we decide to rollback changes to VSI on ice_set_rss_cfg_symm()
fail, we must call ice_update_vsi(), which then can also fail, still leaving us with
hfunc programmed and symmetry not set. I'm not sure if it's worth adding rollback
that can fail and still leave us with the original problem. User would just see 2 errors
instead of 1.
> Thanks,
> Marcin
> 
>> Someone from the ice team should look into this and determine whether or
>> not its valid.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net 09/13] ice: fix setting RSS VSI hash for E830
  2026-05-07 16:59       ` Marcin Szycik
@ 2026-05-07 21:13         ` Jacob Keller
  0 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2026-05-07 21:13 UTC (permalink / raw)
  To: Marcin Szycik, Przemek Kitszel, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Piotr Kwapulinski,
	Aleksandr Loktionov, Arkadiusz Kubalewski, Maciej Fijalkowski,
	Joshua Hay, Madhu Chittim, Willem de Bruijn, Dave Ertman,
	Ivan Vecera, Grzegorz Nitka
  Cc: netdev, stable

On 5/7/2026 9:59 AM, Marcin Szycik wrote:
> 
> 
> On 07.05.2026 13:47, Marcin Szycik wrote:
>>
>>
>> On 06.05.2026 23:06, Jacob Keller wrote:
>>> On 5/4/2026 10:14 PM, Jacob Keller wrote:
>>>> From: Marcin Szycik <marcin.szycik@linux.intel.com>
>>>>
>>>> ice_set_rss_hfunc() performs a VSI update, in which it sets hashing
>>>> function, leaving other VSI options unchanged. However, ::q_opt_flags is
>>>> mistakenly set to the value of another field, instead of its original
>>>> value, probably due to a typo. What happens next is hardware-dependent:
>>>>
>>>> On E810, only the first bit is meaningful (see
>>>> ICE_AQ_VSI_Q_OPT_PE_FLTR_EN) and can potentially end up in a different
>>>> state than before VSI update.
>>>>
>>>> On E830, some of the remaining bits are not reserved. Setting them
>>>> to some unrelated values can cause the firmware to reject the update
>>>> because of invalid settings, or worse - succeed.
>>>>
>>>> Reproducer:
>>>>   sudo ethtool -X $PF1 equal 8
>>>>
>>>> Output in dmesg:
>>>>   Failed to configure RSS hash for VSI 6, error -5
>>>>
>>>> Fixes: 352e9bf23813 ("ice: enable symmetric-xor RSS for Toeplitz hash function")
>>>> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
>>>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>>>> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
>>>> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
>>>> ---
>>>>  drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
>>>> index 1d1947a7fe11..c52c465280f7 100644
>>>> --- a/drivers/net/ethernet/intel/ice/ice_main.c
>>>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
>>>> @@ -8046,7 +8046,7 @@ int ice_set_rss_hfunc(struct ice_vsi *vsi, u8 hfunc)
>>>>  	ctx->info.q_opt_rss |=
>>>>  		FIELD_PREP(ICE_AQ_VSI_Q_OPT_RSS_HASH_M, hfunc);
>>>>  	ctx->info.q_opt_tc = vsi->info.q_opt_tc;
>>>> -	ctx->info.q_opt_flags = vsi->info.q_opt_rss;
>>>> +	ctx->info.q_opt_flags = vsi->info.q_opt_flags;
>>>>  
>>>>  	err = ice_update_vsi(hw, vsi->idx, ctx, NULL);
>>>>  	if (err) {
>>>>
>>>
>>> Sashiko complains about ice_set_rss_hfunc() but it is unrelated to this fix:
>>>
>>>> While looking at this function, I noticed a pre-existing issue regarding the
>>>> hardware cache. Does calling ice_update_vsi() with a local context leave the
>>>> global hw->vsi_ctx[vsi->idx] out of sync?
>>>> If ice_update_vsi() succeeds, vsi->info.q_opt_rss is updated, but
>>>> hw->vsi_ctx[vsi->idx]->info.q_opt_rss is not.
>>>> When an unrelated feature such as RDMA filtering is subsequently toggled via
>>>> ice_cfg_rdma_fltr(), could it retrieve this stale cached context via
>>>> ice_get_vsi_ctx() and copy the stale q_opt_rss value back into its command
>>>> buffer?
>>
>> Yes.
>>
>>>> Could this cause the firmware to silently revert the RSS hash function to its
>>>> previous configuration during the next ice_update_vsi() call?
>>
>> No, because the context object passed to ice_update_vsi() only sets
>> ctx->info.valid_sections for the sections it wants to update, so unrelated
>> values are not updated in HW.
>>

Ok.

>> Looking at other ice_update_vsi() calls, most of the times the context object is
>> being allocated, not taken from cache. It's not immediately clear to me what
>> purpose does hw->vsi_ctx[] serve - it only appears to be used in
>> ice_cfg_rdma_fltr() (correct me if I'm wrong), where options from the cached
>> context are being read to fill the unchanged fields in the updated section.
>> This seems to be the equivalent of keeping track of context values in vsi->info,
>> which is what almost all ice_update_vsi() callers do.
>> If I had to guess, I'd say hw->vsi_ctx[] could probably be removed and vsi->info
>> used instead, but maybe I'm missing something.
>>
>> TLDR I think this is just old, inconsistent code that could be improved, but it
>> needs some investigation. Until we don't have a clear signal that there's a bug,
>> I wouldn't touch it.
>>

Makes sense. Its probably better to leave it as-is for now. Either way,
I think these shouldn't block the obvious fix.

>>>> There also appears to be a pre-existing issue with state desynchronization
>>>> when handling failures. After ice_update_vsi() succeeds, vsi->rss_hfunc
>>>> is updated, but ice_set_rss_hfunc() then calls ice_set_rss_cfg_symm() at the
>>>> end of the function:
>>>> ice_set_rss_hfunc() {
>>>>         ...
>>>>         err = ice_update_vsi(hw, vsi->idx, ctx, NULL);
>>>>         if (err) {
>>>>                 ...
>>>>         } else {
>>>>                 vsi->info.q_opt_rss = ctx->info.q_opt_rss;
>>>>                 vsi->rss_hfunc = hfunc;
>>>>                 ...
>>>>         }
>>>>         ...
>>>>         return ice_set_rss_cfg_symm(hw, vsi, symm);
>>>> }
>>>> If ice_set_rss_cfg_symm() fails, the error is returned but vsi->rss_hfunc
>>>> remains updated. If the user retries the command, the early check:
>>>> if (hfunc == vsi->rss_hfunc)
>>>>         return 0;
>>>> evaluates to true and returns success immediately.
>>>> Could this skip the ice_set_rss_cfg_symm() retry entirely, leaving the
>>>> hardware's flow director rules in a non-symmetric state permanently while
>>>> falsely reporting success?
>>
>> This looks valid.
> 
> On second thought, if we decide to rollback changes to VSI on ice_set_rss_cfg_symm()
> fail, we must call ice_update_vsi(), which then can also fail, still leaving us with
> hfunc programmed and symmetry not set. I'm not sure if it's worth adding rollback
> that can fail and still leave us with the original problem. User would just see 2 errors
> instead of 1.
>> Thanks,
>> Marcin
>>


Either way, its a pre-existing issue that could warrant an investigation
and a patch, not a reason to modify or delay this bug fix.

Thanks,
Jake


>>> Someone from the ice team should look into this and determine whether or
>>> not its valid.
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2026-05-07 21:14 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-05  5:14 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller
2026-05-05  5:14 ` [PATCH net 01/13] i40e: Cleanup PTP registration on probe failure Jacob Keller
2026-05-06 20:24   ` Jacob Keller
2026-05-05  5:14 ` [PATCH net 02/13] i40e: Cleanup PTP pins " Jacob Keller
2026-05-06 20:28   ` Jacob Keller
2026-05-05  5:14 ` [PATCH net 03/13] i40e: keep q_vectors array in sync with channel count changes Jacob Keller
2026-05-06 20:53   ` Jacob Keller
2026-05-05  5:14 ` [PATCH net 04/13] idpf: fix read_dev_clk_lock spinlock init in idpf_ptp_init() Jacob Keller
2026-05-05  5:14 ` [PATCH net 05/13] idpf: do not enable XDP if queue based scheduling is not supported Jacob Keller
2026-05-06 20:59   ` Jacob Keller
2026-05-05  5:14 ` [PATCH net 06/13] idpf: fix skb datapath queue based scheduling crashes and timeouts Jacob Keller
2026-05-05  5:14 ` [PATCH net 07/13] idpf: fix xdp crash in soft reset error path Jacob Keller
2026-05-05  5:14 ` [PATCH net 08/13] idpf: fix double free and use-after-free in aux device error paths Jacob Keller
2026-05-06 21:04   ` Jacob Keller
2026-05-05  5:14 ` [PATCH net 09/13] ice: fix setting RSS VSI hash for E830 Jacob Keller
2026-05-06 21:06   ` Jacob Keller
2026-05-07 11:47     ` Marcin Szycik
2026-05-07 16:59       ` Marcin Szycik
2026-05-07 21:13         ` Jacob Keller
2026-05-05  5:14 ` [PATCH net 10/13] ice: fix locking in ice_dcb_rebuild() Jacob Keller
2026-05-06 21:13   ` Jacob Keller
2026-05-05  5:14 ` [PATCH net 11/13] ice: fix PTP hang for E825C devices Jacob Keller
2026-05-06 21:16   ` Jacob Keller
2026-05-05  5:14 ` [PATCH net 12/13] ice: dpll: fix rclk pin state get for E810 Jacob Keller
2026-05-05  5:14 ` [PATCH net 13/13] ice: dpll: fix misplaced header macros Jacob Keller
2026-05-06 21:21 ` [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf) Jacob Keller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox