* [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-04-14 (ice, i40e, iavf, idpf, e1000e)
@ 2026-04-15 5:47 Jacob Keller
2026-04-15 5:47 ` [PATCH net 01/13] ice: fix 'adjust' timer programming for E830 devices Jacob Keller
` (12 more replies)
0 siblings, 13 replies; 17+ messages in thread
From: Jacob Keller @ 2026-04-15 5:47 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Grzegorz Nitka, Aleksandr Loktionov,
Simon Horman, Rinitha S, Zoltan Fodor, Sunitha Mekala,
Guangshuo Li, stable, Michal Schmidt, Paul Greenwalt,
Przemek Kitszel, Keita Morisaki, Kohei Enju, Petr Oros,
Paul Menzel, Rafal Romanowski, Emil Tantilov, Patryk Holda,
Matt Vollrath, Avigail Dahan
Grzegorz updates the logic for adjusting the PTP hardware clock on E830,
fixing a bug that prevented adjustments within the S32_MIN..S32_MAX
nanosecond range from taking effect.
Grzegorz and Zoli update the PCS latency settings for E825 devices at 10GbE
and 25GbE, improving the accuracy of timestamps based on data from
production hardware.
Michal Schmidt fixes a double-free that could happen if a particular error
path is taken in ice_xmit_frame_ring().
Guangshuo fixes a double-free that could happen during error paths in the
ice_sf_eth_activate() function.
Paul Greenwalt fixes the PHY link configuration when the link-down-on-close
driver parameter is enabled and new media is inserted.
Paul Greenwalt fixes the ICE_AQ_LINK_SPEED_M macro for 200G, enabling 200G
link speed advertisement.
Keita Morisaki fixes a race condition in the ice Tx timestamp ring cleanup,
preventing a possible NULL pointer dereference.
Kohei Enju fixes a potential NULL pointer dereference in ice_set_ringparam().
Kohei Enju fixes i40e to stop advertising IFF_SUPP_NOFCS, since the driver
does not actually support the feature.
Aleksandr fixes i40e napi_enable/disable for q_vectors that no longer have
rings.
Petr fixes the VLAN L2TAG2 mask when the iAVF VF and a PF negotiate use of
the legacy Rx descriptor format.
Emil fixes a NULL pointer dereference that can happen in the idpf soft
reset if a particular error path is taken.
Matt fixes the unrolling logic for PTP when the e1000e probe fails after
the PTP clock has been registered.
**A note on stable backports**
The patches [7/13] ("ice: fix race condition in TX timestamp ring
cleanup") and [8/13] ("ice: fix potential NULL pointer deref in error
path of ice_set_ringparam()") must be backported together. Otherwise the
fix in patch 8 will not work properly.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
Aleksandr Loktionov (1):
i40e: fix napi_enable/disable skipping ringless q_vectors
Emil Tantilov (1):
idpf: fix xdp crash in soft reset error path
Grzegorz Nitka (2):
ice: fix 'adjust' timer programming for E830 devices
ice: update PCS latency settings for E825 10G/25Gb modes
Guangshuo Li (1):
ice: fix double free in ice_sf_eth_activate() error path
Keita Morisaki (1):
ice: fix race condition in TX timestamp ring cleanup
Kohei Enju (2):
ice: fix potential NULL pointer deref in error path of ice_set_ringparam()
i40e: don't advertise IFF_SUPP_NOFCS
Matt Vollrath (1):
e1000e: Unroll PTP in probe error handling
Michal Schmidt (1):
ice: fix double-free of tx_buf skb
Paul Greenwalt (2):
ice: fix PHY config on media change with link-down-on-close
ice: fix ICE_AQ_LINK_SPEED_M for 200G
Petr Oros (1):
iavf: fix wrong VLAN mask for legacy Rx descriptors L2TAG2
drivers/net/ethernet/intel/iavf/iavf_type.h | 2 +-
drivers/net/ethernet/intel/ice/ice.h | 4 +-
drivers/net/ethernet/intel/ice/ice_adminq_cmd.h | 2 +-
drivers/net/ethernet/intel/ice/ice_ptp_consts.h | 12 +--
drivers/net/ethernet/intel/ice/ice_txrx.h | 16 ++--
drivers/net/ethernet/intel/e1000e/netdev.c | 1 +
drivers/net/ethernet/intel/i40e/i40e_main.c | 29 +++---
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 10 ++
drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 2 +-
drivers/net/ethernet/intel/ice/ice_ethtool.c | 1 +
drivers/net/ethernet/intel/ice/ice_lib.c | 4 +-
drivers/net/ethernet/intel/ice/ice_main.c | 121 ++++++------------------
drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 6 +-
drivers/net/ethernet/intel/ice/ice_sf_eth.c | 2 +
drivers/net/ethernet/intel/ice/ice_txrx.c | 29 ++++--
drivers/net/ethernet/intel/idpf/xdp.c | 1 +
drivers/net/ethernet/intel/idpf/xsk.c | 4 +-
17 files changed, 107 insertions(+), 139 deletions(-)
---
base-commit: b9d8b856689d2b968495d79fe653d87fcb8ad98c
change-id: 20260414-iwl-net-submission-2026-04-14-6203e1860df3
Best regards,
--
Jacob Keller <jacob.e.keller@intel.com>
* [PATCH net 01/13] ice: fix 'adjust' timer programming for E830 devices
From: Jacob Keller @ 2026-04-15 5:47 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Grzegorz Nitka, Aleksandr Loktionov,
Simon Horman, Rinitha S
From: Grzegorz Nitka <grzegorz.nitka@intel.com>
Fix the incorrect 'adjust the timer' programming sequence for E830
series devices. The current implementation programs only the
GLTSYN_SHADJ shadow registers. According to the specification [1], a
write to the GLTSYN_CMD command register, with the CMD field set to the
"Adjust the Time" value, is also required for the timer adjustment to
take effect.
The flow was broken for adjustments within the S32_MIN..S32_MAX range
(around +/- 2 seconds). Larger adjustments use the non-atomic
programming flow, which involves set timer programming and is
implemented correctly.
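As a rough illustration of the split described above, the following
user-space C sketch shows a range check that would decide between the
two flows. The helper name is hypothetical and not taken from the
driver; it only assumes that adjustments are expressed in nanoseconds
and that the atomic flow takes a signed 32-bit value, as the s32
parameter of ice_ptp_adj_clock() suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a driver function: returns true when a
 * nanosecond delta fits in a signed 32-bit value, i.e. roughly within
 * +/- 2.147 seconds, and so could use the atomic "adjust" flow.
 */
static bool adj_fits_s32(int64_t delta_ns)
{
	return delta_ns >= INT32_MIN && delta_ns <= INT32_MAX;
}
```

Under this check a 2 second adjustment (2,000,000,000 ns) fits and takes
the atomic flow that this patch fixes, while a 3 second adjustment does
not fit and would go through the non-atomic flow.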
Testing hints:
Run command:
phc_ctl /dev/ptpX get adj 2 get
Expected result:
Returned timestamps differ by at least 2 seconds
[1] Intel® Ethernet Controller E830 Datasheet rev 1.3, chapter 9.7.5.4
https://cdrdv2.intel.com/v1/dl/getContent/787353?explicitVersion=true
Fixes: f00307522786 ("ice: Implement PTP support for E830 devices")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index 61c0a0d93ea8..5a5c511ccbb6 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -5381,8 +5381,8 @@ int ice_ptp_write_incval_locked(struct ice_hw *hw, u64 incval)
*/
int ice_ptp_adj_clock(struct ice_hw *hw, s32 adj)
{
+ int err = 0;
u8 tmr_idx;
- int err;
tmr_idx = hw->func_caps.ts_func_info.tmr_index_owned;
@@ -5399,8 +5399,8 @@ int ice_ptp_adj_clock(struct ice_hw *hw, s32 adj)
err = ice_ptp_prep_phy_adj_e810(hw, adj);
break;
case ICE_MAC_E830:
- /* E830 sync PHYs automatically after setting GLTSYN_SHADJ */
- return 0;
+ /* E830 sync PHYs automatically after setting cmd register */
+ break;
case ICE_MAC_GENERIC:
err = ice_ptp_prep_phy_adj_e82x(hw, adj);
break;
--
2.53.0.1066.g1eceb487f285
* [PATCH net 02/13] ice: update PCS latency settings for E825 10G/25Gb modes
From: Jacob Keller @ 2026-04-15 5:47 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Grzegorz Nitka, Zoltan Fodor,
Aleksandr Loktionov, Sunitha Mekala
From: Grzegorz Nitka <grzegorz.nitka@intel.com>
Update the MAC Rx/Tx offset register settings (PHY_MAC_[RX|TX]_OFFSET
registers) with data obtained from the latest research. The change
applies to the PCS latency settings for the following speeds/modes:
* 10Gb NO-FEC
- TX latency changed from 71.25 ns to 73 ns
- RX latency changed from -25.6 ns to -28 ns
* 25Gb NO-FEC
- TX latency changed from 28.17 ns to 33 ns
- RX latency changed from -12.45 ns to -12 ns
* 25Gb RS-FEC
- TX latency changed from 64.5 ns to 69 ns
- RX latency changed from -3.6 ns to -3 ns
The original data came from simulation and pre-production hardware.
The new data comes from measuring the actual delays and as such is more
accurate.
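For readers who want to relate the hexadecimal register values in the
diff to nanoseconds, here is a small decoding sketch. It assumes the
offsets are 32-bit two's complement fixed-point numbers with 9
fractional bits. That is an inference from the previous values (for
example 0x8e80 / 512 = 71.25), not something the commit message states,
and the names below are made up for illustration.

```c
#include <stdint.h>

#define SKETCH_FRAC_BITS 9	/* assumed number of fractional bits */

/* Convert a raw 32-bit offset value to nanoseconds (assumed encoding). */
static double sketch_offset_to_ns(uint32_t raw)
{
	double scale = (double)(1u << SKETCH_FRAC_BITS);

	if (raw & 0x80000000u)	/* negative value in two's complement */
		return -(double)(~raw + 1u) / scale;
	return (double)raw / scale;
}
```

Under that assumption, the old 10Gb NO-FEC Tx value 0x8e80 decodes to
71.25 ns and the old 25Gb RS-FEC Tx value 0x8100 decodes to 64.5 ns,
which agrees with the comments next to those constants.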
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products")
Co-developed-by: Zoltan Fodor <zoltan.fodor@intel.com>
Signed-off-by: Zoltan Fodor <zoltan.fodor@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/ice/ice_ptp_consts.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_consts.h b/drivers/net/ethernet/intel/ice/ice_ptp_consts.h
index 19dddd9b53dd..4d298c27bfb2 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_consts.h
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_consts.h
@@ -78,14 +78,14 @@ struct ice_eth56g_mac_reg_cfg eth56g_mac_cfg[NUM_ICE_ETH56G_LNK_SPD] = {
.blktime = 0x666, /* 3.2 */
.tx_offset = {
.serdes = 0x234c, /* 17.6484848 */
- .no_fec = 0x8e80, /* 71.25 */
+ .no_fec = 0x93d9, /* 73 */
.fc = 0xb4a4, /* 90.32 */
.sfd = 0x4a4, /* 2.32 */
.onestep = 0x4ccd /* 38.4 */
},
.rx_offset = {
.serdes = 0xffffeb27, /* -10.42424 */
- .no_fec = 0xffffcccd, /* -25.6 */
+ .no_fec = 0xffffc7b6, /* -28 */
.fc = 0xfffc557b, /* -469.26 */
.sfd = 0x4a4, /* 2.32 */
.bs_ds = 0x32 /* 0.0969697 */
@@ -118,17 +118,17 @@ struct ice_eth56g_mac_reg_cfg eth56g_mac_cfg[NUM_ICE_ETH56G_LNK_SPD] = {
.mktime = 0x147b, /* 10.24, only if RS-FEC enabled */
.tx_offset = {
.serdes = 0xe1e, /* 7.0593939 */
- .no_fec = 0x3857, /* 28.17 */
+ .no_fec = 0x4266, /* 33 */
.fc = 0x48c3, /* 36.38 */
- .rs = 0x8100, /* 64.5 */
+ .rs = 0x8a00, /* 69 */
.sfd = 0x1dc, /* 0.93 */
.onestep = 0x1eb8 /* 15.36 */
},
.rx_offset = {
.serdes = 0xfffff7a9, /* -4.1697 */
- .no_fec = 0xffffe71a, /* -12.45 */
+ .no_fec = 0xffffe700, /* -12 */
.fc = 0xfffe894d, /* -187.35 */
- .rs = 0xfffff8cd, /* -3.6 */
+ .rs = 0xfffff8cc, /* -3 */
.sfd = 0x1dc, /* 0.93 */
.bs_ds = 0x14 /* 0.0387879, RS-FEC 0 */
}
--
2.53.0.1066.g1eceb487f285
* [PATCH net 03/13] ice: fix double free in ice_sf_eth_activate() error path
From: Jacob Keller @ 2026-04-15 5:47 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Guangshuo Li, stable, Aleksandr Loktionov,
Simon Horman
From: Guangshuo Li <lgs201920130244@gmail.com>
When auxiliary_device_add() fails, ice_sf_eth_activate() jumps to
aux_dev_uninit and calls auxiliary_device_uninit(&sf_dev->adev).
The device release callback ice_sf_dev_release() frees sf_dev, but
the current error path falls through to sf_dev_free and calls
kfree(sf_dev) again, causing a double free.
Keep kfree(sf_dev) for the auxiliary_device_init() failure path, but
avoid falling through to sf_dev_free after auxiliary_device_uninit().
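The ownership rule behind this fix can be sketched in plain user-space
C. Everything below is a stand-in: the fake_* functions only imitate
the init/add/uninit life cycle of an auxiliary device and are not the
kernel API. The point is that once init has succeeded, the release
callback owns the memory, so the error path has to return before it
reaches the label that frees the object directly.

```c
#include <stdlib.h>

struct fake_dev {
	void (*release)(struct fake_dev *dev);
};

struct fake_sf_dev {
	struct fake_dev adev;	/* first member, so the cast below is valid */
};

static int free_count;	/* number of times an object was freed */

static void fake_sf_release(struct fake_dev *dev)
{
	free((struct fake_sf_dev *)dev);
	free_count++;
}

static int fake_dev_init(struct fake_dev *dev, int fail)
{
	if (fail)
		return -1;
	dev->release = fake_sf_release;
	return 0;
}

static int fake_dev_add(struct fake_dev *dev, int fail)
{
	(void)dev;
	return fail ? -1 : 0;
}

/* After a successful init, uninit hands the object to release(). */
static void fake_dev_uninit(struct fake_dev *dev)
{
	dev->release(dev);
}

static int fake_activate(int fail_init, int fail_add)
{
	struct fake_sf_dev *sf_dev = calloc(1, sizeof(*sf_dev));
	int err;

	if (!sf_dev)
		return -1;

	err = fake_dev_init(&sf_dev->adev, fail_init);
	if (err)
		goto sf_dev_free;

	err = fake_dev_add(&sf_dev->adev, fail_add);
	if (err)
		goto aux_dev_uninit;

	return 0;	/* the sketch leaks on success to stay short */

aux_dev_uninit:
	fake_dev_uninit(&sf_dev->adev);
	return err;	/* release() already freed sf_dev */

sf_dev_free:
	free(sf_dev);
	free_count++;
	return err;
}
```

Without the early return after fake_dev_uninit(), the add-failure path
would fall through to sf_dev_free and free the same object twice, which
is the pattern the patch removes.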
Fixes: 13acc5c4cdbe ("ice: subfunction activation and base devlink ops")
Cc: stable@vger.kernel.org
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/ice/ice_sf_eth.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/ice/ice_sf_eth.c b/drivers/net/ethernet/intel/ice/ice_sf_eth.c
index 2cf04bc6edce..a730aa368c92 100644
--- a/drivers/net/ethernet/intel/ice/ice_sf_eth.c
+++ b/drivers/net/ethernet/intel/ice/ice_sf_eth.c
@@ -305,6 +305,8 @@ ice_sf_eth_activate(struct ice_dynamic_port *dyn_port,
aux_dev_uninit:
auxiliary_device_uninit(&sf_dev->adev);
+ return err;
+
sf_dev_free:
kfree(sf_dev);
xa_erase:
--
2.53.0.1066.g1eceb487f285
* [PATCH net 04/13] ice: fix double-free of tx_buf skb
From: Jacob Keller @ 2026-04-15 5:47 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Michal Schmidt
From: Michal Schmidt <mschmidt@redhat.com>
If ice_tso() or ice_tx_csum() fails, the error path in
ice_xmit_frame_ring() frees the skb, but the 'first' tx_buf still points
to it and is marked as valid (ICE_TX_BUF_SKB).
'next_to_use' remains unchanged, so the potential problem will
likely fix itself when the next packet is transmitted and the tx_buf
gets overwritten. But if there is no next packet and the interface is
brought down instead, ice_clean_tx_ring() -> ice_unmap_and_free_tx_buf()
will find the tx_buf and free the skb for the second time.
The fix is to reset the tx_buf type to ICE_TX_BUF_EMPTY in the error
path, so that ice_unmap_and_free_tx_buf() does not free the skb again.
Move the initialization of 'first' up, to ensure it's already valid in
case we hit the linearization error path.
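The mechanism can be modelled in a few lines of user-space C. This is
only a sketch with invented names (sketch_*); it assumes, as the
description above implies, that ring cleanup frees a buffer only when
the slot is still marked as holding an skb.

```c
#include <stddef.h>

enum sketch_buf_type { SKETCH_BUF_EMPTY, SKETCH_BUF_SKB };

struct sketch_pkt {
	int freed;	/* how many times this packet was freed */
};

struct sketch_tx_buf {
	enum sketch_buf_type type;
	struct sketch_pkt *skb;
};

static void sketch_pkt_free(struct sketch_pkt *pkt)
{
	pkt->freed++;
}

/* Ring cleanup: frees the packet only if the slot is marked as SKB. */
static void sketch_clean_slot(struct sketch_tx_buf *buf)
{
	if (buf->type == SKETCH_BUF_SKB)
		sketch_pkt_free(buf->skb);
	buf->type = SKETCH_BUF_EMPTY;
	buf->skb = NULL;
}

/* Transmit attempt whose offload step fails after the slot is filled. */
static void sketch_xmit_fail(struct sketch_tx_buf *first,
			     struct sketch_pkt *pkt, int reset_type)
{
	first->skb = pkt;
	first->type = SKETCH_BUF_SKB;

	/* error path: drop the packet */
	sketch_pkt_free(pkt);
	if (reset_type)
		first->type = SKETCH_BUF_EMPTY;	/* the fix */
}
```

With reset_type set to 0 a later sketch_clean_slot() frees the packet a
second time; with reset_type set to 1 the slot is skipped and the packet
is freed exactly once.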
The bug was spotted by AI while I had it looking for something else.
It also proposed an initial version of the patch.
I reproduced the bug and tested the fix by adding code to inject
failures, on a build with KASAN.
I looked for similar bugs in related Intel drivers and did not find any.
Fixes: d76a60ba7afb ("ice: Add support for VLANs and offloads")
Assisted-by: Claude:claude-4.6-opus-high Cursor
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/ice/ice_txrx.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index a2cd4cf37734..7be9c062949b 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -2158,6 +2158,9 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_tx_ring *tx_ring)
ice_trace(xmit_frame_ring, tx_ring, skb);
+ /* record the location of the first descriptor for this packet */
+ first = &tx_ring->tx_buf[tx_ring->next_to_use];
+
count = ice_xmit_desc_count(skb);
if (ice_chk_linearize(skb, count)) {
if (__skb_linearize(skb))
@@ -2183,8 +2186,6 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_tx_ring *tx_ring)
offload.tx_ring = tx_ring;
- /* record the location of the first descriptor for this packet */
- first = &tx_ring->tx_buf[tx_ring->next_to_use];
first->skb = skb;
first->type = ICE_TX_BUF_SKB;
first->bytecount = max_t(unsigned int, skb->len, ETH_ZLEN);
@@ -2249,6 +2250,7 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_tx_ring *tx_ring)
out_drop:
ice_trace(xmit_frame_ring_drop, tx_ring, skb);
dev_kfree_skb_any(skb);
+ first->type = ICE_TX_BUF_EMPTY;
return NETDEV_TX_OK;
}
--
2.53.0.1066.g1eceb487f285
* [PATCH net 05/13] ice: fix PHY config on media change with link-down-on-close
From: Jacob Keller @ 2026-04-15 5:48 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Paul Greenwalt, Przemek Kitszel,
Aleksandr Loktionov, Sunitha Mekala
From: Paul Greenwalt <paul.greenwalt@intel.com>
Commit 1a3571b5938c ("ice: restore PHY settings on media insertion")
introduced separate flows for setting PHY configuration on media
present: ice_configure_phy() when link-down-on-close is disabled, and
ice_force_phys_link_state() when enabled. The latter incorrectly uses
the previous configuration even after a module change, causing link
issues such as wrong speed or no link.
Unify PHY configuration into a single ice_phy_cfg() function with a
link_en parameter, ensuring PHY capabilities are always fetched fresh
from hardware.
Fixes: 1a3571b5938c ("ice: restore PHY settings on media insertion")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/ice/ice_main.c | 121 +++++++-----------------------
1 file changed, 27 insertions(+), 94 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 3c36e3641b9e..ce3a0afe302d 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -1922,82 +1922,6 @@ static void ice_handle_mdd_event(struct ice_pf *pf)
ice_print_vfs_mdd_events(pf);
}
-/**
- * ice_force_phys_link_state - Force the physical link state
- * @vsi: VSI to force the physical link state to up/down
- * @link_up: true/false indicates to set the physical link to up/down
- *
- * Force the physical link state by getting the current PHY capabilities from
- * hardware and setting the PHY config based on the determined capabilities. If
- * link changes a link event will be triggered because both the Enable Automatic
- * Link Update and LESM Enable bits are set when setting the PHY capabilities.
- *
- * Returns 0 on success, negative on failure
- */
-static int ice_force_phys_link_state(struct ice_vsi *vsi, bool link_up)
-{
- struct ice_aqc_get_phy_caps_data *pcaps;
- struct ice_aqc_set_phy_cfg_data *cfg;
- struct ice_port_info *pi;
- struct device *dev;
- int retcode;
-
- if (!vsi || !vsi->port_info || !vsi->back)
- return -EINVAL;
- if (vsi->type != ICE_VSI_PF)
- return 0;
-
- dev = ice_pf_to_dev(vsi->back);
-
- pi = vsi->port_info;
-
- pcaps = kzalloc_obj(*pcaps);
- if (!pcaps)
- return -ENOMEM;
-
- retcode = ice_aq_get_phy_caps(pi, false, ICE_AQC_REPORT_ACTIVE_CFG, pcaps,
- NULL);
- if (retcode) {
- dev_err(dev, "Failed to get phy capabilities, VSI %d error %d\n",
- vsi->vsi_num, retcode);
- retcode = -EIO;
- goto out;
- }
-
- /* No change in link */
- if (link_up == !!(pcaps->caps & ICE_AQC_PHY_EN_LINK) &&
- link_up == !!(pi->phy.link_info.link_info & ICE_AQ_LINK_UP))
- goto out;
-
- /* Use the current user PHY configuration. The current user PHY
- * configuration is initialized during probe from PHY capabilities
- * software mode, and updated on set PHY configuration.
- */
- cfg = kmemdup(&pi->phy.curr_user_phy_cfg, sizeof(*cfg), GFP_KERNEL);
- if (!cfg) {
- retcode = -ENOMEM;
- goto out;
- }
-
- cfg->caps |= ICE_AQ_PHY_ENA_AUTO_LINK_UPDT;
- if (link_up)
- cfg->caps |= ICE_AQ_PHY_ENA_LINK;
- else
- cfg->caps &= ~ICE_AQ_PHY_ENA_LINK;
-
- retcode = ice_aq_set_phy_cfg(&vsi->back->hw, pi, cfg, NULL);
- if (retcode) {
- dev_err(dev, "Failed to set phy config, VSI %d error %d\n",
- vsi->vsi_num, retcode);
- retcode = -EIO;
- }
-
- kfree(cfg);
-out:
- kfree(pcaps);
- return retcode;
-}
-
/**
* ice_init_nvm_phy_type - Initialize the NVM PHY type
* @pi: port info structure
@@ -2066,7 +1990,7 @@ static void ice_init_link_dflt_override(struct ice_port_info *pi)
* first time media is available. The ICE_LINK_DEFAULT_OVERRIDE_PENDING state
* is used to indicate that the user PHY cfg default override is initialized
* and the PHY has not been configured with the default override settings. The
- * state is set here, and cleared in ice_configure_phy the first time the PHY is
+ * state is set here, and cleared in ice_phy_cfg the first time the PHY is
* configured.
*
* This function should be called only if the FW doesn't support default
@@ -2172,14 +2096,18 @@ static int ice_init_phy_user_cfg(struct ice_port_info *pi)
}
/**
- * ice_configure_phy - configure PHY
+ * ice_phy_cfg - configure PHY
* @vsi: VSI of PHY
+ * @link_en: true/false indicates to set link to enable/disable
*
* Set the PHY configuration. If the current PHY configuration is the same as
- * the curr_user_phy_cfg, then do nothing to avoid link flap. Otherwise
- * configure the based get PHY capabilities for topology with media.
+ * the curr_user_phy_cfg and link_en hasn't changed, then do nothing to avoid
+ * link flap. Otherwise configure the PHY based get PHY capabilities for
+ * topology with media and link_en.
+ *
+ * Return: 0 on success, negative on failure
*/
-static int ice_configure_phy(struct ice_vsi *vsi)
+static int ice_phy_cfg(struct ice_vsi *vsi, bool link_en)
{
struct device *dev = ice_pf_to_dev(vsi->back);
struct ice_port_info *pi = vsi->port_info;
@@ -2199,9 +2127,6 @@ static int ice_configure_phy(struct ice_vsi *vsi)
phy->link_info.topo_media_conflict == ICE_AQ_LINK_TOPO_UNSUPP_MEDIA)
return -EPERM;
- if (test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, pf->flags))
- return ice_force_phys_link_state(vsi, true);
-
pcaps = kzalloc_obj(*pcaps);
if (!pcaps)
return -ENOMEM;
@@ -2215,10 +2140,8 @@ static int ice_configure_phy(struct ice_vsi *vsi)
goto done;
}
- /* If PHY enable link is configured and configuration has not changed,
- * there's nothing to do
- */
- if (pcaps->caps & ICE_AQC_PHY_EN_LINK &&
+ /* Configuration has not changed. There's nothing to do. */
+ if (link_en == !!(pcaps->caps & ICE_AQC_PHY_EN_LINK) &&
ice_phy_caps_equals_cfg(pcaps, &phy->curr_user_phy_cfg))
goto done;
@@ -2282,8 +2205,12 @@ static int ice_configure_phy(struct ice_vsi *vsi)
*/
ice_cfg_phy_fc(pi, cfg, phy->curr_user_fc_req);
- /* Enable link and link update */
- cfg->caps |= ICE_AQ_PHY_ENA_AUTO_LINK_UPDT | ICE_AQ_PHY_ENA_LINK;
+ /* Enable/Disable link and link update */
+ cfg->caps |= ICE_AQ_PHY_ENA_AUTO_LINK_UPDT;
+ if (link_en)
+ cfg->caps |= ICE_AQ_PHY_ENA_LINK;
+ else
+ cfg->caps &= ~ICE_AQ_PHY_ENA_LINK;
err = ice_aq_set_phy_cfg(&pf->hw, pi, cfg, NULL);
if (err)
@@ -2336,7 +2263,7 @@ static void ice_check_media_subtask(struct ice_pf *pf)
test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, vsi->back->flags))
return;
- err = ice_configure_phy(vsi);
+ err = ice_phy_cfg(vsi, true);
if (!err)
clear_bit(ICE_FLAG_NO_MEDIA, pf->flags);
@@ -4892,9 +4819,15 @@ static int ice_init_link(struct ice_pf *pf)
if (!test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, pf->flags)) {
struct ice_vsi *vsi = ice_get_main_vsi(pf);
+ struct ice_link_default_override_tlv *ldo;
+ bool link_en;
+
+ ldo = &pf->link_dflt_override;
+ link_en = !(ldo->options &
+ ICE_LINK_OVERRIDE_AUTO_LINK_DIS);
if (vsi)
- ice_configure_phy(vsi);
+ ice_phy_cfg(vsi, link_en);
}
} else {
set_bit(ICE_FLAG_NO_MEDIA, pf->flags);
@@ -9707,7 +9640,7 @@ int ice_open_internal(struct net_device *netdev)
}
}
- err = ice_configure_phy(vsi);
+ err = ice_phy_cfg(vsi, true);
if (err) {
netdev_err(netdev, "Failed to set physical link up, error %d\n",
err);
@@ -9748,7 +9681,7 @@ int ice_stop(struct net_device *netdev)
}
if (test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, vsi->back->flags)) {
- int link_err = ice_force_phys_link_state(vsi, false);
+ int link_err = ice_phy_cfg(vsi, false);
if (link_err) {
if (link_err == -ENOMEDIUM)
--
2.53.0.1066.g1eceb487f285
* [PATCH net 06/13] ice: fix ICE_AQ_LINK_SPEED_M for 200G
From: Jacob Keller @ 2026-04-15 5:48 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Paul Greenwalt, Aleksandr Loktionov,
Simon Horman, Sunitha Mekala
From: Paul Greenwalt <paul.greenwalt@intel.com>
When setting PHY configuration during driver initialization, 200G link
speed is not being advertised even when the PHY is capable. This is
because the get PHY capabilities link speed response is being masked by
ICE_AQ_LINK_SPEED_M, which does not include the 200G link speed bit.
ICE_AQ_LINK_SPEED_200GB is defined as BIT(11), but the mask 0x7FF only
covers bits 0-10. Fix ICE_AQ_LINK_SPEED_M to use GENMASK(11, 0) so
that it covers all defined link speed bits including 200G.
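The arithmetic is easy to check outside the kernel. The sketch below
uses local stand-ins for the kernel's BIT() and GENMASK() macros (the
SKETCH_* names are invented here and assume a 32-bit unsigned int) to
show that the old mask drops bit 11 while the new one keeps it.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's BIT() and GENMASK() macros. */
#define SKETCH_BIT(n)		(1u << (n))
#define SKETCH_GENMASK(h, l)	((~0u << (l)) & (~0u >> (31 - (h))))

#define SKETCH_SPEED_200GB	SKETCH_BIT(11)
#define SKETCH_OLD_MASK		0x7FFu			/* bits 0-10 */
#define SKETCH_NEW_MASK		SKETCH_GENMASK(11, 0)	/* bits 0-11 */

/* Apply a speed mask to a reported link speed capability word. */
static uint16_t sketch_mask_speeds(uint16_t reported, uint16_t mask)
{
	return reported & mask;
}
```

Masking a 200G-only capability word with SKETCH_OLD_MASK yields 0, so
the speed is never advertised; masking with SKETCH_NEW_MASK preserves
the bit.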
Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use")
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/ice/ice_adminq_cmd.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 859e9c66f3e7..3cbb1b0582e3 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -1252,7 +1252,7 @@ struct ice_aqc_get_link_status_data {
#define ICE_AQ_LINK_PWR_QSFP_CLASS_3 2
#define ICE_AQ_LINK_PWR_QSFP_CLASS_4 3
__le16 link_speed;
-#define ICE_AQ_LINK_SPEED_M 0x7FF
+#define ICE_AQ_LINK_SPEED_M GENMASK(11, 0)
#define ICE_AQ_LINK_SPEED_10MB BIT(0)
#define ICE_AQ_LINK_SPEED_100MB BIT(1)
#define ICE_AQ_LINK_SPEED_1000MB BIT(2)
--
2.53.0.1066.g1eceb487f285
* [PATCH net 07/13] ice: fix race condition in TX timestamp ring cleanup
From: Jacob Keller @ 2026-04-15 5:48 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Keita Morisaki, Aleksandr Loktionov,
Rinitha S
From: Keita Morisaki <kmta1236@gmail.com>
Fix a race condition between ice_free_tx_tstamp_ring() and ice_tx_map()
that can cause a NULL pointer dereference.
ice_free_tx_tstamp_ring currently clears the ICE_TX_FLAGS_TXTIME flag
after NULLing the tstamp_ring. This could allow a concurrent ice_tx_map
call on another CPU to dereference the tstamp_ring, which could lead to
a NULL pointer dereference.
CPU A:ice_free_tx_tstamp_ring() | CPU B:ice_tx_map()
--------------------------------|---------------------------------
tx_ring->tstamp_ring = NULL |
| ice_is_txtime_cfg() -> true
| tstamp_ring = tx_ring->tstamp_ring
| tstamp_ring->count // NULL deref!
flags &= ~ICE_TX_FLAGS_TXTIME |
Fix by:
1. Reordering ice_free_tx_tstamp_ring() to clear the flag before
NULLing the pointer, with smp_wmb() to ensure proper ordering.
2. Adding smp_rmb() in ice_tx_map() after the flag check to order the
flag read before the pointer read, using READ_ONCE() for the
pointer, and adding a NULL check as a safety net.
3. Converting tx_ring->flags from u8 to DECLARE_BITMAP() and using
atomic bitops (set_bit(), clear_bit(), test_bit()) for all flag
operations throughout the driver:
- ICE_TX_RING_FLAGS_XDP
- ICE_TX_RING_FLAGS_VLAN_L2TAG1
- ICE_TX_RING_FLAGS_VLAN_L2TAG2
- ICE_TX_RING_FLAGS_TXTIME
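The ordering in steps 1 and 2 can be approximated in user space with
C11 atomics. This is an analogue under stated assumptions, not the
driver code: the sketch_* names are invented, the release and acquire
fences stand in for smp_wmb() and smp_rmb(), and freeing of the ring
(kfree_rcu() in the driver) is not modelled at all.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_tstamp_ring {
	unsigned int count;
};

struct sketch_tx_ring {
	atomic_bool txtime;	/* stands in for the TXTIME flag bit */
	_Atomic(struct sketch_tstamp_ring *) tstamp_ring;
};

/* Teardown side: clear the flag first, then NULL the pointer. */
static void sketch_disable_txtime(struct sketch_tx_ring *ring)
{
	atomic_store_explicit(&ring->txtime, false, memory_order_relaxed);
	atomic_thread_fence(memory_order_release); /* like smp_wmb() */
	atomic_store_explicit(&ring->tstamp_ring, NULL,
			      memory_order_relaxed);
}

/* Transmit side: returns the ring size, or 0 if no ring is usable. */
static unsigned int sketch_tstamp_count(struct sketch_tx_ring *ring)
{
	struct sketch_tstamp_ring *ts;

	if (!atomic_load_explicit(&ring->txtime, memory_order_relaxed))
		return 0;

	atomic_thread_fence(memory_order_acquire); /* like smp_rmb() */
	ts = atomic_load_explicit(&ring->tstamp_ring,
				  memory_order_relaxed);
	if (!ts)	/* safety net: flag seen as set, pointer already gone */
		return 0;

	return ts->count;
}
```

In this model the NULL check is what keeps the reader safe in the
interleaving shown in the table above, where the flag is still observed
as set after the pointer has been cleared.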
Fixes: ccde82e909467 ("ice: add E830 Earliest TxTime First Offload support")
Signed-off-by: Keita Morisaki <kmta1236@gmail.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/ice/ice.h | 4 ++--
drivers/net/ethernet/intel/ice/ice_txrx.h | 16 ++++++++++------
drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 2 +-
drivers/net/ethernet/intel/ice/ice_lib.c | 4 ++--
drivers/net/ethernet/intel/ice/ice_txrx.c | 23 ++++++++++++++++-------
5 files changed, 31 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index eb3a48330cc1..725b130dd3a2 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -753,7 +753,7 @@ static inline bool ice_is_xdp_ena_vsi(struct ice_vsi *vsi)
static inline void ice_set_ring_xdp(struct ice_tx_ring *ring)
{
- ring->flags |= ICE_TX_FLAGS_RING_XDP;
+ set_bit(ICE_TX_RING_FLAGS_XDP, ring->flags);
}
/**
@@ -778,7 +778,7 @@ static inline bool ice_is_txtime_ena(const struct ice_tx_ring *ring)
*/
static inline bool ice_is_txtime_cfg(const struct ice_tx_ring *ring)
{
- return !!(ring->flags & ICE_TX_FLAGS_TXTIME);
+ return test_bit(ICE_TX_RING_FLAGS_TXTIME, ring->flags);
}
/**
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index b6547e1b7c42..5e517f219379 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -212,6 +212,14 @@ enum ice_rx_dtype {
ICE_RX_DTYPE_SPLIT_ALWAYS = 2,
};
+enum ice_tx_ring_flags {
+ ICE_TX_RING_FLAGS_XDP,
+ ICE_TX_RING_FLAGS_VLAN_L2TAG1,
+ ICE_TX_RING_FLAGS_VLAN_L2TAG2,
+ ICE_TX_RING_FLAGS_TXTIME,
+ ICE_TX_RING_FLAGS_NBITS,
+};
+
struct ice_pkt_ctx {
u64 cached_phctime;
__be16 vlan_proto;
@@ -352,11 +360,7 @@ struct ice_tx_ring {
u16 count; /* Number of descriptors */
u16 q_index; /* Queue number of ring */
- u8 flags;
-#define ICE_TX_FLAGS_RING_XDP BIT(0)
-#define ICE_TX_FLAGS_RING_VLAN_L2TAG1 BIT(1)
-#define ICE_TX_FLAGS_RING_VLAN_L2TAG2 BIT(2)
-#define ICE_TX_FLAGS_TXTIME BIT(3)
+ DECLARE_BITMAP(flags, ICE_TX_RING_FLAGS_NBITS);
struct xsk_buff_pool *xsk_pool;
@@ -398,7 +402,7 @@ static inline bool ice_ring_ch_enabled(struct ice_tx_ring *ring)
static inline bool ice_ring_is_xdp(struct ice_tx_ring *ring)
{
- return !!(ring->flags & ICE_TX_FLAGS_RING_XDP);
+ return test_bit(ICE_TX_RING_FLAGS_XDP, ring->flags);
}
enum ice_container_type {
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
index bd77f1c001ee..16aa25535152 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
@@ -943,7 +943,7 @@ ice_tx_prepare_vlan_flags_dcb(struct ice_tx_ring *tx_ring,
/* if this is not already set it means a VLAN 0 + priority needs
* to be offloaded
*/
- if (tx_ring->flags & ICE_TX_FLAGS_RING_VLAN_L2TAG2)
+ if (test_bit(ICE_TX_RING_FLAGS_VLAN_L2TAG2, tx_ring->flags))
first->tx_flags |= ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN;
else
first->tx_flags |= ICE_TX_FLAGS_HW_VLAN;
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 689c6025ea82..837b71b7b2b7 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -1412,9 +1412,9 @@ static int ice_vsi_alloc_rings(struct ice_vsi *vsi)
ring->count = vsi->num_tx_desc;
ring->txq_teid = ICE_INVAL_TEID;
if (dvm_ena)
- ring->flags |= ICE_TX_FLAGS_RING_VLAN_L2TAG2;
+ set_bit(ICE_TX_RING_FLAGS_VLAN_L2TAG2, ring->flags);
else
- ring->flags |= ICE_TX_FLAGS_RING_VLAN_L2TAG1;
+ set_bit(ICE_TX_RING_FLAGS_VLAN_L2TAG1, ring->flags);
WRITE_ONCE(vsi->tx_rings[i], ring);
}
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 7be9c062949b..4ca1a0602307 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -190,9 +190,10 @@ void ice_free_tstamp_ring(struct ice_tx_ring *tx_ring)
void ice_free_tx_tstamp_ring(struct ice_tx_ring *tx_ring)
{
ice_free_tstamp_ring(tx_ring);
+ clear_bit(ICE_TX_RING_FLAGS_TXTIME, tx_ring->flags);
+ smp_wmb(); /* order flag clear before pointer NULL */
kfree_rcu(tx_ring->tstamp_ring, rcu);
- tx_ring->tstamp_ring = NULL;
- tx_ring->flags &= ~ICE_TX_FLAGS_TXTIME;
+ WRITE_ONCE(tx_ring->tstamp_ring, NULL);
}
/**
@@ -405,7 +406,7 @@ static int ice_alloc_tstamp_ring(struct ice_tx_ring *tx_ring)
tx_ring->tstamp_ring = tstamp_ring;
tstamp_ring->desc = NULL;
tstamp_ring->count = ice_calc_ts_ring_count(tx_ring);
- tx_ring->flags |= ICE_TX_FLAGS_TXTIME;
+ set_bit(ICE_TX_RING_FLAGS_TXTIME, tx_ring->flags);
return 0;
}
@@ -1521,13 +1522,20 @@ ice_tx_map(struct ice_tx_ring *tx_ring, struct ice_tx_buf *first,
return;
if (ice_is_txtime_cfg(tx_ring)) {
- struct ice_tstamp_ring *tstamp_ring = tx_ring->tstamp_ring;
- u32 tstamp_count = tstamp_ring->count;
- u32 j = tstamp_ring->next_to_use;
+ struct ice_tstamp_ring *tstamp_ring;
+ u32 tstamp_count, j;
struct ice_ts_desc *ts_desc;
struct timespec64 ts;
u32 tstamp;
+ smp_rmb(); /* order flag read before pointer read */
+ tstamp_ring = READ_ONCE(tx_ring->tstamp_ring);
+ if (unlikely(!tstamp_ring))
+ goto ring_kick;
+
+ tstamp_count = tstamp_ring->count;
+ j = tstamp_ring->next_to_use;
+
ts = ktime_to_timespec64(first->skb->tstamp);
tstamp = ts.tv_nsec >> ICE_TXTIME_CTX_RESOLUTION_128NS;
@@ -1555,6 +1563,7 @@ ice_tx_map(struct ice_tx_ring *tx_ring, struct ice_tx_buf *first,
tstamp_ring->next_to_use = j;
writel_relaxed(j, tstamp_ring->tail);
} else {
+ring_kick:
writel_relaxed(i, tx_ring->tail);
}
return;
@@ -1814,7 +1823,7 @@ ice_tx_prepare_vlan_flags(struct ice_tx_ring *tx_ring, struct ice_tx_buf *first)
*/
if (skb_vlan_tag_present(skb)) {
first->vid = skb_vlan_tag_get(skb);
- if (tx_ring->flags & ICE_TX_FLAGS_RING_VLAN_L2TAG2)
+ if (test_bit(ICE_TX_RING_FLAGS_VLAN_L2TAG2, tx_ring->flags))
first->tx_flags |= ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN;
else
first->tx_flags |= ICE_TX_FLAGS_HW_VLAN;
--
2.53.0.1066.g1eceb487f285
* [PATCH net 08/13] ice: fix potential NULL pointer deref in error path of ice_set_ringparam()
2026-04-15 5:47 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-04-14 (ice, i40e, iavf, idpf, e1000e) Jacob Keller
` (6 preceding siblings ...)
2026-04-15 5:48 ` [PATCH net 07/13] ice: fix race condition in TX timestamp ring cleanup Jacob Keller
@ 2026-04-15 5:48 ` Jacob Keller
2026-04-15 5:48 ` [PATCH net 09/13] i40e: don't advertise IFF_SUPP_NOFCS Jacob Keller
` (4 subsequent siblings)
12 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2026-04-15 5:48 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Kohei Enju, Paul Greenwalt, Rinitha S
From: Kohei Enju <kohei@enjuk.jp>
ice_set_ringparam() sets tstamp_ring to NULL in the temporary tx_rings
without clearing the ICE_TX_RING_FLAGS_TXTIME bit.
When ICE_TX_RING_FLAGS_TXTIME is set and the subsequent
ice_setup_tx_ring() call fails, a NULL pointer dereference could happen
in the unwinding sequence:
ice_clean_tx_ring()
-> ice_is_txtime_cfg() == true (ICE_TX_RING_FLAGS_TXTIME is set)
-> ice_free_tx_tstamp_ring()
-> ice_free_tstamp_ring()
-> tstamp_ring->desc (NULL deref)
Clear the ICE_TX_RING_FLAGS_TXTIME bit to avoid the potential issue.
Note that this potential issue was found by manual code review.
Compile-tested only, since unfortunately I don't have E830 devices.
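The invariant behind this fix can be illustrated with a small userspace model. This is only a sketch: the struct layouts and the names reset_copied_ring(), cleanup_would_deref_null() and demo_ringparam() are hypothetical stand-ins, not driver code, and a plain bool replaces the ICE_TX_RING_FLAGS_TXTIME bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct tstamp_ring {
	void *desc;
};

struct tx_ring {
	struct tstamp_ring *tstamp_ring;
	bool txtime_flag;	/* models ICE_TX_RING_FLAGS_TXTIME */
};

/* Models how the temporary ring copy is prepared before setup.
 * clear_flag == false is the old behavior, true is the fixed one.
 */
static void reset_copied_ring(struct tx_ring *ring, bool clear_flag)
{
	ring->tstamp_ring = NULL;
	if (clear_flag)
		ring->txtime_flag = false;
}

/* The unwind path trusts the flag: if it is set, tstamp_ring->desc is
 * read. Report whether that read would hit a NULL pointer.
 */
static bool cleanup_would_deref_null(const struct tx_ring *ring)
{
	return ring->txtime_flag && ring->tstamp_ring == NULL;
}

static void demo_ringparam(void)
{
	struct tstamp_ring ts = { NULL };
	struct tx_ring buggy = { &ts, true };
	struct tx_ring fixed = { &ts, true };

	reset_copied_ring(&buggy, false);
	reset_copied_ring(&fixed, true);

	assert(cleanup_would_deref_null(&buggy));
	assert(!cleanup_would_deref_null(&fixed));
}
```

In the model, the flag and the pointer only stay consistent when they are cleared together, which is what the one-line change does.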
Fixes: ccde82e90946 ("ice: add E830 Earliest TxTime First Offload support")
Signed-off-by: Kohei Enju <kohei@enjuk.jp>
Reviewed-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/ice/ice_ethtool.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index e6a20af6f63d..f28416a707d7 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -3290,6 +3290,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
tx_rings[i].desc = NULL;
tx_rings[i].tx_buf = NULL;
tx_rings[i].tstamp_ring = NULL;
+ clear_bit(ICE_TX_RING_FLAGS_TXTIME, tx_rings[i].flags);
tx_rings[i].tx_tstamps = &pf->ptp.port.tx;
err = ice_setup_tx_ring(&tx_rings[i]);
if (err) {
--
2.53.0.1066.g1eceb487f285
* [PATCH net 09/13] i40e: don't advertise IFF_SUPP_NOFCS
2026-04-15 5:47 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-04-14 (ice, i40e, iavf, idpf, e1000e) Jacob Keller
` (7 preceding siblings ...)
2026-04-15 5:48 ` [PATCH net 08/13] ice: fix potential NULL pointer deref in error path of ice_set_ringparam() Jacob Keller
@ 2026-04-15 5:48 ` Jacob Keller
2026-04-15 5:48 ` [PATCH net 10/13] i40e: fix napi_enable/disable skipping ringless q_vectors Jacob Keller
` (3 subsequent siblings)
12 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2026-04-15 5:48 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Kohei Enju, Aleksandr Loktionov,
Sunitha Mekala
From: Kohei Enju <kohei@enjuk.jp>
i40e advertises IFF_SUPP_NOFCS, allowing users to use the SO_NOFCS
socket option. However, this option is silently ignored, as the driver
does not check skb->no_fcs, and always enables FCS insertion offload.
Fix this by removing the advertisement of IFF_SUPP_NOFCS.
This behavior can be reproduced with a simple AF_PACKET socket:
import socket
s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
s.setsockopt(socket.SOL_SOCKET, 43, 1) # SO_NOFCS
s.bind(("eth0", 0))
s.send(b'\xff' * 64)
Previously, send() succeeded but the driver ignored SO_NOFCS.
With this change, send() fails with -EPROTONOSUPPORT, as expected.
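The behavior change can be modelled in a few lines of userspace C. This is a sketch under an assumption: that the send path rejects a socket with SO_NOFCS set when the device does not advertise the capability, which is what the -EPROTONOSUPPORT result above implies. SUPP_NOFCS, try_send() and demo_nofcs() are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the IFF_SUPP_NOFCS private flag. */
#define SUPP_NOFCS (1u << 0)

/* Models the capability check on the send path: a socket that asked
 * for SO_NOFCS may only transmit on a device advertising support.
 */
static int try_send(unsigned int dev_priv_flags, bool so_nofcs)
{
	if (so_nofcs && !(dev_priv_flags & SUPP_NOFCS))
		return -EPROTONOSUPPORT;
	return 0;
}

static void demo_nofcs(void)
{
	/* Before: the flag is advertised, so send() succeeds even though
	 * the driver never honors the request.
	 */
	assert(try_send(SUPP_NOFCS, true) == 0);

	/* After: the flag is gone, so the request is refused up front. */
	assert(try_send(0, true) == -EPROTONOSUPPORT);

	/* Sockets that never set SO_NOFCS are unaffected. */
	assert(try_send(0, false) == 0);
}
```

The point of the patch is the first case: advertising a capability the driver does not implement turns an error into silent misbehavior.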
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Kohei Enju <kohei@enjuk.jp>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 926d001b2150..028bd500603a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -13783,7 +13783,6 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
netdev->neigh_priv_len = sizeof(u32) * 4;
netdev->priv_flags |= IFF_UNICAST_FLT;
- netdev->priv_flags |= IFF_SUPP_NOFCS;
/* Setup netdev TC information */
i40e_vsi_config_netdev_tc(vsi, vsi->tc_config.enabled_tc);
--
2.53.0.1066.g1eceb487f285
* [PATCH net 10/13] i40e: fix napi_enable/disable skipping ringless q_vectors
2026-04-15 5:47 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-04-14 (ice, i40e, iavf, idpf, e1000e) Jacob Keller
` (8 preceding siblings ...)
2026-04-15 5:48 ` [PATCH net 09/13] i40e: don't advertise IFF_SUPP_NOFCS Jacob Keller
@ 2026-04-15 5:48 ` Jacob Keller
2026-04-16 4:20 ` Przemek Kitszel
2026-04-15 5:48 ` [PATCH net 11/13] iavf: fix wrong VLAN mask for legacy Rx descriptors L2TAG2 Jacob Keller
` (2 subsequent siblings)
12 siblings, 1 reply; 17+ messages in thread
From: Jacob Keller @ 2026-04-15 5:48 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Aleksandr Loktionov, stable, Sunitha Mekala
From: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
After ethtool -L reduces the queue count, i40e_napi_disable_all() sets
NAPI_STATE_SCHED on all q_vectors, then i40e_vsi_map_rings_to_vectors()
clears ring pointers on the excess ones. i40e_napi_enable_all() skips
those with:
if (q_vector->rx.ring || q_vector->tx.ring)
napi_enable(&q_vector->napi);
leaving them on dev->napi_list with NAPI_STATE_SCHED permanently set.
Writing to /sys/class/net/<iface>/threaded calls napi_stop_kthread()
on every entry in dev->napi_list. The function loops on msleep(20)
waiting for NAPI_STATE_SCHED to clear -- which never happens for the
stale q_vectors. The task hangs in D state forever; a concurrent write
deadlocks on dev->lock held by the first.
Commit 13a8cd191a2b ("i40e: Do not enable NAPI on q_vectors that have no
rings") added the guard to prevent a divide-by-zero in i40e_napi_poll()
when epoll busy-poll iterated all device NAPIs (4.x era). Since
7adc3d57fe2b ("net: Introduce preferred busy-polling"), from v5.11,
napi_busy_loop() polls by napi_id keyed to the socket, so ringless
q_vectors are never selected. i40e_msix_clean_rings() also independently
avoids scheduling NAPI for them. The guard is safe to remove.
Add an early return in i40e_napi_poll() for num_ringpairs == 0 so the
function is self-defending against a NULL tx.ring dereference at the
WB_ON_ITR check, should the NAPI ever fire through an unexpected path.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/intel-wired-lan/20260316133100.6054a11f@kernel.org/
Fixes: 13a8cd191a2b ("i40e: Do not enable NAPI on q_vectors that have no rings")
Cc: stable@vger.kernel.org
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 28 ++++++++++++++++------------
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 10 ++++++++++
2 files changed, 26 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 028bd500603a..b4ca8485f4b5 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5182,6 +5182,14 @@ static void i40e_clear_interrupt_scheme(struct i40e_pf *pf)
/**
* i40e_napi_enable_all - Enable NAPI for all q_vectors in the VSI
* @vsi: the VSI being configured
+ *
+ * Enable NAPI on every q_vector that is registered with the netdev,
+ * regardless of whether it currently has rings assigned. After a queue-
+ * count reduction (e.g. ethtool -L combined 1) the excess q_vectors lose
+ * their ring pointers inside i40e_vsi_map_rings_to_vectors but remain on
+ * dev->napi_list. Leaving them in the napi_disable()-ed state
+ * (NAPI_STATE_SCHED set) causes napi_set_threaded() to spin forever on
+ * msleep(20) waiting for that bit to clear.
**/
static void i40e_napi_enable_all(struct i40e_vsi *vsi)
{
@@ -5190,17 +5198,17 @@ static void i40e_napi_enable_all(struct i40e_vsi *vsi)
if (!vsi->netdev)
return;
- for (q_idx = 0; q_idx < vsi->num_q_vectors; q_idx++) {
- struct i40e_q_vector *q_vector = vsi->q_vectors[q_idx];
-
- if (q_vector->rx.ring || q_vector->tx.ring)
- napi_enable(&q_vector->napi);
- }
+ for (q_idx = 0; q_idx < vsi->num_q_vectors; q_idx++)
+ napi_enable(&vsi->q_vectors[q_idx]->napi);
}
/**
* i40e_napi_disable_all - Disable NAPI for all q_vectors in the VSI
* @vsi: the VSI being configured
+ *
+ * Mirror of i40e_napi_enable_all: operate on every registered q_vector so
+ * enable/disable calls are always balanced, even when some q_vectors carry
+ * no rings (as happens after a queue-count reduction).
**/
static void i40e_napi_disable_all(struct i40e_vsi *vsi)
{
@@ -5209,12 +5217,8 @@ static void i40e_napi_disable_all(struct i40e_vsi *vsi)
if (!vsi->netdev)
return;
- for (q_idx = 0; q_idx < vsi->num_q_vectors; q_idx++) {
- struct i40e_q_vector *q_vector = vsi->q_vectors[q_idx];
-
- if (q_vector->rx.ring || q_vector->tx.ring)
- napi_disable(&q_vector->napi);
- }
+ for (q_idx = 0; q_idx < vsi->num_q_vectors; q_idx++)
+ napi_disable(&vsi->q_vectors[q_idx]->napi);
}
/**
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 894f2d06d39d..3123459208d3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2760,6 +2760,16 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
return 0;
}
+ /* A q_vector can have its ring pointers cleared after a queue-count
+ * reduction (ethtool -L combined N) while napi_enable() was already
+ * called on it. Complete immediately so the poll loop exits cleanly
+ * and we never dereference the NULL ring pointer below.
+ */
+ if (unlikely(!q_vector->num_ringpairs)) {
+ napi_complete_done(napi, 0);
+ return 0;
+ }
+
/* Since the actual Tx work is minimal, we can give the Tx a larger
* budget and be more aggressive about cleaning up the Tx descriptors.
*/
--
2.53.0.1066.g1eceb487f285
* [PATCH net 11/13] iavf: fix wrong VLAN mask for legacy Rx descriptors L2TAG2
2026-04-15 5:47 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-04-14 (ice, i40e, iavf, idpf, e1000e) Jacob Keller
` (9 preceding siblings ...)
2026-04-15 5:48 ` [PATCH net 10/13] i40e: fix napi_enable/disable skipping ringless q_vectors Jacob Keller
@ 2026-04-15 5:48 ` Jacob Keller
2026-04-15 5:48 ` [PATCH net 12/13] idpf: fix xdp crash in soft reset error path Jacob Keller
2026-04-15 5:48 ` [PATCH net 13/13] e1000e: Unroll PTP in probe error handling Jacob Keller
12 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2026-04-15 5:48 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Petr Oros, Aleksandr Loktionov, Paul Menzel,
Rafal Romanowski
From: Petr Oros <poros@redhat.com>
The IAVF_RXD_LEGACY_L2TAG2_M mask was incorrectly defined as
GENMASK_ULL(63, 32), extracting 32 bits from qw2 instead of the
16-bit VLAN tag. In the legacy Rx descriptor layout, the 2nd L2TAG2
(VLAN tag) occupies bits 63:48 of qw2, not 63:32.
The oversized mask causes FIELD_GET to return a 32-bit value where the
actual VLAN tag sits in bits 31:16. When this value is passed to
iavf_receive_skb() as a u16 parameter, it gets truncated to the lower
16 bits (which contain the 1st L2TAG2, typically zero). As a result,
__vlan_hwaccel_put_tag() is never called and software VLAN interfaces
on VFs receive no traffic.
This affects VFs behind ice PF (VIRTCHNL VLAN v2) when the PF
advertises VLAN stripping into L2TAG2_2 and legacy descriptors are
used.
The flex descriptor path already uses the correct mask
(IAVF_RXD_FLEX_L2TAG2_2_M = GENMASK_ULL(63, 48)).
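The truncation described above can be reproduced in userspace. The sketch below re-implements GENMASK_ULL() and FIELD_GET() in simplified form (the FIELD_GET() stand-in assumes a GCC or Clang compiler for __builtin_ctzll()). OLD_L2TAG2_M, NEW_L2TAG2_M, vlan_tag_from_qw2() and demo_l2tag2() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros. */
#define GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
#define FIELD_GET(mask, val) \
	(((val) & (mask)) >> __builtin_ctzll(mask))

#define OLD_L2TAG2_M GENMASK_ULL(63, 32)	/* buggy definition */
#define NEW_L2TAG2_M GENMASK_ULL(63, 48)	/* fixed definition */

/* Models handing the FIELD_GET() result to a u16 parameter, as happens
 * when the tag is passed on to the receive path.
 */
static uint16_t vlan_tag_from_qw2(uint64_t qw2, uint64_t mask)
{
	return (uint16_t)FIELD_GET(mask, qw2);
}

static void demo_l2tag2(void)
{
	/* VLAN ID 198 in bits 63:48, bits 47:32 left at zero. */
	uint64_t qw2 = (uint64_t)198 << 48;

	/* Old mask: the tag lands in bits 31:16 of the extracted value
	 * and is lost when the value is narrowed to 16 bits.
	 */
	assert(FIELD_GET(OLD_L2TAG2_M, qw2) == ((uint64_t)198 << 16));
	assert(vlan_tag_from_qw2(qw2, OLD_L2TAG2_M) == 0);

	/* New mask: the 16-bit tag survives. */
	assert(vlan_tag_from_qw2(qw2, NEW_L2TAG2_M) == 198);
}
```

With the old mask the narrowed value is whatever sits in bits 47:32 of qw2, so a zero there makes the stripped tag look absent.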
Reproducer:
1. Create 2 VFs on ice PF (echo 2 > sriov_numvfs)
2. Disable spoofchk on both VFs
3. Move each VF into a separate network namespace
4. On each VF: create VLAN interface (e.g. vlan 198), assign IP,
bring up
5. Set rx-vlan-offload OFF on both VFs
6. Ping between VLAN interfaces -> expect PASS
(VLAN tag stays in packet data, kernel matches in-band)
7. Set rx-vlan-offload ON on both VFs
8. Ping between VLAN interfaces -> expect FAIL if bug present
(HW strips VLAN tag into descriptor L2TAG2 field, wrong mask
extracts bits 47:32 instead of 63:48, truncated to u16 -> zero,
__vlan_hwaccel_put_tag() never called, packet delivered to parent
interface, not VLAN interface)
The reproducer requires legacy Rx descriptors. On modern ice + iavf
with full PTP support, flex descriptors are always negotiated and the
buggy legacy path is never reached. Flex descriptors require all of:
- CONFIG_PTP_1588_CLOCK enabled
- VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC granted by PF
- PTP capabilities negotiated (VIRTCHNL_VF_CAP_PTP)
- VIRTCHNL_1588_PTP_CAP_RX_TSTAMP supported
- VIRTCHNL_RXDID_2_FLEX_SQ_NIC present in DDP profile
If any condition is not met, iavf_select_rx_desc_format() falls back
to legacy descriptors (RXDID=1) and the wrong L2TAG2 mask is hit.
Fixes: 2dc8e7c36d80 ("iavf: refactor iavf_clean_rx_irq to support legacy and flex descriptors")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/iavf/iavf_type.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_type.h b/drivers/net/ethernet/intel/iavf/iavf_type.h
index 1d8cf29cb65a..5bb1de1cfd33 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_type.h
+++ b/drivers/net/ethernet/intel/iavf/iavf_type.h
@@ -277,7 +277,7 @@ struct iavf_rx_desc {
/* L2 Tag 2 Presence */
#define IAVF_RXD_LEGACY_L2TAG2P_M BIT(0)
/* Stripped S-TAG VLAN from the receive packet */
-#define IAVF_RXD_LEGACY_L2TAG2_M GENMASK_ULL(63, 32)
+#define IAVF_RXD_LEGACY_L2TAG2_M GENMASK_ULL(63, 48)
/* Stripped S-TAG VLAN from the receive packet */
#define IAVF_RXD_FLEX_L2TAG2_2_M GENMASK_ULL(63, 48)
/* The packet is a UDP tunneled packet */
--
2.53.0.1066.g1eceb487f285
* [PATCH net 12/13] idpf: fix xdp crash in soft reset error path
2026-04-15 5:47 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-04-14 (ice, i40e, iavf, idpf, e1000e) Jacob Keller
` (10 preceding siblings ...)
2026-04-15 5:48 ` [PATCH net 11/13] iavf: fix wrong VLAN mask for legacy Rx descriptors L2TAG2 Jacob Keller
@ 2026-04-15 5:48 ` Jacob Keller
2026-04-15 5:48 ` [PATCH net 13/13] e1000e: Unroll PTP in probe error handling Jacob Keller
12 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2026-04-15 5:48 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Emil Tantilov, stable, Aleksandr Loktionov,
Patryk Holda
From: Emil Tantilov <emil.s.tantilov@intel.com>
A NULL pointer dereference is reported in cases where idpf_vport_open()
fails during soft reset:
./xdpsock -i <inf> -q -r -N
[ 3179.186687] idpf 0000:83:00.0: Failed to initialize queue ids for vport 0: -12
[ 3179.276739] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 3179.277636] #PF: supervisor read access in kernel mode
[ 3179.278470] #PF: error_code(0x0000) - not-present page
[ 3179.279285] PGD 0
[ 3179.280083] Oops: Oops: 0000 [#1] SMP NOPTI
...
[ 3179.283997] Workqueue: events xp_release_deferred
[ 3179.284770] RIP: 0010:idpf_find_rxq_vec+0x17/0x30 [idpf]
...
[ 3179.291937] Call Trace:
[ 3179.292392] <TASK>
[ 3179.292843] idpf_qp_switch+0x25/0x820 [idpf]
[ 3179.293325] idpf_xsk_pool_setup+0x7c/0x520 [idpf]
[ 3179.293803] idpf_xdp+0x59/0x240 [idpf]
[ 3179.294275] xp_disable_drv_zc+0x62/0xb0
[ 3179.294743] xp_clear_dev+0x40/0xb0
[ 3179.295198] xp_release_deferred+0x1f/0xa0
[ 3179.295648] process_one_work+0x226/0x730
[ 3179.296106] worker_thread+0x19e/0x340
[ 3179.296557] ? __pfx_worker_thread+0x10/0x10
[ 3179.297009] kthread+0xf4/0x130
[ 3179.297459] ? __pfx_kthread+0x10/0x10
[ 3179.297910] ret_from_fork+0x32c/0x410
[ 3179.298361] ? __pfx_kthread+0x10/0x10
[ 3179.298702] ret_from_fork_asm+0x1a/0x30
Fix the error handling of the soft reset in idpf_xdp_setup_prog() by
restoring vport->xdp_prog to the old value. This avoids referencing
the orphaned prog that was copied to vport->xdp_prog in the soft reset
and prevents a subsequent false positive from idpf_xdp_enabled().
Update the restart check in idpf_xsk_pool_setup() to use the IDPF_VPORT_UP
bit instead of netif_running(). The idpf_vport_stop/start() calls do not
update the __LINK_STATE_START bit, making the netif_running() test a false
positive should the soft reset fail.
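Both halves of the fix can be seen in a small userspace model. This is a sketch only: the structs are heavily simplified, and setup_prog(), xdp_enabled(), needs_restart() and demo_soft_reset() are hypothetical names that mirror, but do not reproduce, the driver functions mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct prog {
	int id;
};

struct vport {
	const struct prog *xdp_prog;	/* what the "XDP enabled" test reads */
	bool up;			/* models the IDPF_VPORT_UP bit */
};

struct config {
	const struct prog *xdp_prog;	/* models the user config copy */
};

static bool xdp_enabled(const struct vport *vport)
{
	return vport->xdp_prog != NULL;
}

/* Models the program swap plus soft reset. On failure both copies of
 * the program pointer are rolled back, not just the config one.
 */
static int setup_prog(struct vport *vport, struct config *cfg,
		      const struct prog *prog, bool reopen_fails)
{
	const struct prog *old = cfg->xdp_prog;

	cfg->xdp_prog = prog;
	vport->xdp_prog = prog;		/* the soft reset copies it over */
	vport->up = !reopen_fails;

	if (reopen_fails) {
		cfg->xdp_prog = old;
		vport->xdp_prog = old;	/* the rollback added by the fix */
		return -ENOMEM;
	}
	return 0;
}

/* Models the restart decision: keyed on the vport state, not on a
 * netdev-level "running" bit that the soft reset never touches.
 */
static bool needs_restart(const struct vport *vport)
{
	return xdp_enabled(vport) && vport->up;
}

static void demo_soft_reset(void)
{
	struct prog p = { 1 };
	struct vport v = { NULL, true };
	struct config c = { NULL };

	assert(setup_prog(&v, &c, &p, true) == -ENOMEM);
	assert(!xdp_enabled(&v));
	assert(!needs_restart(&v));
}
```

After a failed reset the model reports XDP as disabled and no restart is attempted, so the later pool teardown never walks the missing queue vectors.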
Fixes: 3d57b2c00f09 ("idpf: add XSk pool initialization")
Cc: stable@vger.kernel.org
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Patryk Holda <patryk.holda@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/idpf/xdp.c | 1 +
drivers/net/ethernet/intel/idpf/xsk.c | 4 +++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/idpf/xdp.c b/drivers/net/ethernet/intel/idpf/xdp.c
index cbccd4546768..18a6e7062863 100644
--- a/drivers/net/ethernet/intel/idpf/xdp.c
+++ b/drivers/net/ethernet/intel/idpf/xdp.c
@@ -488,6 +488,7 @@ static int idpf_xdp_setup_prog(struct idpf_vport *vport,
"Could not reopen the vport after XDP setup");
cfg->user_config.xdp_prog = old;
+ vport->xdp_prog = old;
old = prog;
}
diff --git a/drivers/net/ethernet/intel/idpf/xsk.c b/drivers/net/ethernet/intel/idpf/xsk.c
index d95d3efdfd36..3d8c430efd2b 100644
--- a/drivers/net/ethernet/intel/idpf/xsk.c
+++ b/drivers/net/ethernet/intel/idpf/xsk.c
@@ -553,6 +553,7 @@ int idpf_xskrq_poll(struct idpf_rx_queue *rxq, u32 budget)
int idpf_xsk_pool_setup(struct idpf_vport *vport, struct netdev_bpf *bpf)
{
+ const struct idpf_netdev_priv *np = netdev_priv(vport->netdev);
struct xsk_buff_pool *pool = bpf->xsk.pool;
u32 qid = bpf->xsk.queue_id;
bool restart;
@@ -568,7 +569,8 @@ int idpf_xsk_pool_setup(struct idpf_vport *vport, struct netdev_bpf *bpf)
return -EINVAL;
}
- restart = idpf_xdp_enabled(vport) && netif_running(vport->netdev);
+ restart = idpf_xdp_enabled(vport) &&
+ test_bit(IDPF_VPORT_UP, np->state);
if (!restart)
goto pool;
--
2.53.0.1066.g1eceb487f285
* [PATCH net 13/13] e1000e: Unroll PTP in probe error handling
2026-04-15 5:47 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-04-14 (ice, i40e, iavf, idpf, e1000e) Jacob Keller
` (11 preceding siblings ...)
2026-04-15 5:48 ` [PATCH net 12/13] idpf: fix xdp crash in soft reset error path Jacob Keller
@ 2026-04-15 5:48 ` Jacob Keller
12 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2026-04-15 5:48 UTC (permalink / raw)
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, Jacob Keller, Matt Vollrath, Avigail Dahan
From: Matt Vollrath <tactii@gmail.com>
If probe fails after registering the PTP clock and its delayed work,
these resources must be released.
This was not an issue until a 2016 fix moved the e1000e_ptp_init() call
before the jump to err_register.
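The unwind rule at play here is that every resource set up before a failing step must be released on the error label that step jumps to. A minimal userspace sketch follows; struct adapter, probe() and demo_probe() are hypothetical stand-ins, with booleans tracking what is still held.

```c
#include <assert.h>
#include <stdbool.h>

struct adapter {
	bool hw_control;	/* hardware control taken by the driver */
	bool ptp_registered;	/* PTP clock and delayed work set up */
};

/* Models the tail of a probe routine. unroll_ptp == false is the old
 * error path, true is the path with the added PTP teardown.
 */
static int probe(struct adapter *a, bool register_fails, bool unroll_ptp)
{
	a->hw_control = true;
	a->ptp_registered = true;	/* runs before the failing step */

	if (register_fails)
		goto err_register;

	return 0;

err_register:
	a->hw_control = false;
	if (unroll_ptp)
		a->ptp_registered = false;
	return -1;
}

static void demo_probe(void)
{
	struct adapter old_path = { false, false };
	struct adapter new_path = { false, false };

	assert(probe(&old_path, true, false) == -1);
	assert(old_path.ptp_registered);	/* leaked on failure */

	assert(probe(&new_path, true, true) == -1);
	assert(!new_path.ptp_registered);	/* released on failure */
}
```

In the old path the PTP state outlives the failed probe, which is the leak this patch closes.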
Fixes: aa524b66c5ef ("e1000e: don't modify SYSTIM registers during SIOCSHWTSTAMP ioctl")
Signed-off-by: Matt Vollrath <tactii@gmail.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
drivers/net/ethernet/intel/e1000e/netdev.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 9befdacd6730..7ce0cc8ab8f4 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -7706,6 +7706,7 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
err_register:
if (!(adapter->flags & FLAG_HAS_AMT))
e1000e_release_hw_control(adapter);
+ e1000e_ptp_remove(adapter);
err_eeprom:
if (hw->phy.ops.check_reset_block && !hw->phy.ops.check_reset_block(hw))
e1000_phy_hw_reset(&adapter->hw);
--
2.53.0.1066.g1eceb487f285
* Re: [PATCH net 10/13] i40e: fix napi_enable/disable skipping ringless q_vectors
2026-04-15 5:48 ` [PATCH net 10/13] i40e: fix napi_enable/disable skipping ringless q_vectors Jacob Keller
@ 2026-04-16 4:20 ` Przemek Kitszel
2026-04-16 20:46 ` Jacob Keller
0 siblings, 1 reply; 17+ messages in thread
From: Przemek Kitszel @ 2026-04-16 4:20 UTC (permalink / raw)
To: Jacob Keller, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
Cc: netdev, Aleksandr Loktionov, stable, Sunitha Mekala,
Maciej Fijalkowski
On 4/15/26 07:48, Jacob Keller wrote:
> From: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
>
> After ethtool -L reduces the queue count, i40e_napi_disable_all() sets
> NAPI_STATE_SCHED on all q_vectors, then i40e_vsi_map_rings_to_vectors()
> clears ring pointers on the excess ones. i40e_napi_enable_all() skips
> those with:
>
> if (q_vector->rx.ring || q_vector->tx.ring)
> napi_enable(&q_vector->napi);
>
> leaving them on dev->napi_list with NAPI_STATE_SCHED permanently set.
>
> Writing to /sys/class/net/<iface>/threaded calls napi_stop_kthread()
> on every entry in dev->napi_list. The function loops on msleep(20)
> waiting for NAPI_STATE_SCHED to clear -- which never happens for the
> stale q_vectors. The task hangs in D state forever; a concurrent write
> deadlocks on dev->lock held by the first.
>
> Commit 13a8cd191a2b ("i40e: Do not enable NAPI on q_vectors that have no
> rings") added the guard to prevent a divide-by-zero in i40e_napi_poll()
> when epoll busy-poll iterated all device NAPIs (4.x era). Since
> 7adc3d57fe2b ("net: Introduce preferred busy-polling"), from v5.11,
> napi_busy_loop() polls by napi_id keyed to the socket, so ringless
> q_vectors are never selected. i40e_msix_clean_rings() also independently
> avoids scheduling NAPI for them. The guard is safe to remove.
>
> Add an early return in i40e_napi_poll() for num_ringpairs == 0 so the
> function is self-defending against a NULL tx.ring dereference at the
> WB_ON_ITR check, should the NAPI ever fire through an unexpected path.
>
> Reported-by: Jakub Kicinski <kuba@kernel.org>
> Closes: https://lore.kernel.org/intel-wired-lan/20260316133100.6054a11f@kernel.org/
Maciej developed a better fix for the problem, and he explicitly asked
to not include this patch. Please drop it from this series.
Maciej's fix:
https://lore.kernel.org/intel-wired-lan/20260414121405.631092-1-maciej.fijalkowski@intel.com/T/#u
ask for reject:
https://lore.kernel.org/intel-wired-lan/PH0PR11MB75223C8A00C3183C5082A096A0252@PH0PR11MB7522.namprd11.prod.outlook.com/T/#mbac55f7219d7855a2e5d1527904b2da43ad080cb
> Fixes: 13a8cd191a2b ("i40e: Do not enable NAPI on q_vectors that have no rings")
> Cc: stable@vger.kernel.org
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> drivers/net/ethernet/intel/i40e/i40e_main.c | 28 ++++++++++++++++------------
> drivers/net/ethernet/intel/i40e/i40e_txrx.c | 10 ++++++++++
> 2 files changed, 26 insertions(+), 12 deletions(-)
>
* Re: [PATCH net 10/13] i40e: fix napi_enable/disable skipping ringless q_vectors
2026-04-16 4:20 ` Przemek Kitszel
@ 2026-04-16 20:46 ` Jacob Keller
2026-04-16 20:50 ` Jacob Keller
0 siblings, 1 reply; 17+ messages in thread
From: Jacob Keller @ 2026-04-16 20:46 UTC (permalink / raw)
To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
Cc: netdev, Aleksandr Loktionov, stable, Sunitha Mekala,
Maciej Fijalkowski
On 4/15/2026 9:20 PM, Przemek Kitszel wrote:
> On 4/15/26 07:48, Jacob Keller wrote:
>> From: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
>>
>> After ethtool -L reduces the queue count, i40e_napi_disable_all() sets
>> NAPI_STATE_SCHED on all q_vectors, then i40e_vsi_map_rings_to_vectors()
>> clears ring pointers on the excess ones. i40e_napi_enable_all() skips
>> those with:
>>
>> if (q_vector->rx.ring || q_vector->tx.ring)
>> napi_enable(&q_vector->napi);
>>
>> leaving them on dev->napi_list with NAPI_STATE_SCHED permanently set.
>>
>> Writing to /sys/class/net/<iface>/threaded calls napi_stop_kthread()
>> on every entry in dev->napi_list. The function loops on msleep(20)
>> waiting for NAPI_STATE_SCHED to clear -- which never happens for the
>> stale q_vectors. The task hangs in D state forever; a concurrent write
>> deadlocks on dev->lock held by the first.
>>
>> Commit 13a8cd191a2b ("i40e: Do not enable NAPI on q_vectors that have no
>> rings") added the guard to prevent a divide-by-zero in i40e_napi_poll()
>> when epoll busy-poll iterated all device NAPIs (4.x era). Since
>> 7adc3d57fe2b ("net: Introduce preferred busy-polling"), from v5.11,
>> napi_busy_loop() polls by napi_id keyed to the socket, so ringless
>> q_vectors are never selected. i40e_msix_clean_rings() also independently
>> avoids scheduling NAPI for them. The guard is safe to remove.
>>
>> Add an early return in i40e_napi_poll() for num_ringpairs == 0 so the
>> function is self-defending against a NULL tx.ring dereference at the
>> WB_ON_ITR check, should the NAPI ever fire through an unexpected path.
>>
>> Reported-by: Jakub Kicinski <kuba@kernel.org>
>> Closes: https://lore.kernel.org/intel-wired-lan/20260316133100.6054a11f@kernel.org/
>
> Maciej developed a better fix for the problem, and he explicitly asked
> to not include this patch. Please drop it from this series.
>
> Maciej's fix:
> https://lore.kernel.org/intel-wired-lan/20260414121405.631092-1-maciej.fijalkowski@intel.com/T/#u
>
> ask for reject:
> https://lore.kernel.org/intel-wired-lan/PH0PR11MB75223C8A00C3183C5082A096A0252@PH0PR11MB7522.namprd11.prod.outlook.com/T/#mbac55f7219d7855a2e5d1527904b2da43ad080cb
>
Ugh, sorry for failing to notice this when batching this series up :(
Thanks,
Jake
* Re: [PATCH net 10/13] i40e: fix napi_enable/disable skipping ringless q_vectors
2026-04-16 20:46 ` Jacob Keller
@ 2026-04-16 20:50 ` Jacob Keller
0 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2026-04-16 20:50 UTC (permalink / raw)
To: Przemek Kitszel, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
Cc: netdev, Aleksandr Loktionov, stable, Sunitha Mekala,
Maciej Fijalkowski
On 4/16/2026 1:46 PM, Jacob Keller wrote:
> On 4/15/2026 9:20 PM, Przemek Kitszel wrote:
>> On 4/15/26 07:48, Jacob Keller wrote:
>>> From: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
>>>
>>> After ethtool -L reduces the queue count, i40e_napi_disable_all() sets
>>> NAPI_STATE_SCHED on all q_vectors, then i40e_vsi_map_rings_to_vectors()
>>> clears ring pointers on the excess ones. i40e_napi_enable_all() skips
>>> those with:
>>>
>>> if (q_vector->rx.ring || q_vector->tx.ring)
>>> napi_enable(&q_vector->napi);
>>>
>>> leaving them on dev->napi_list with NAPI_STATE_SCHED permanently set.
>>>
>>> Writing to /sys/class/net/<iface>/threaded calls napi_stop_kthread()
>>> on every entry in dev->napi_list. The function loops on msleep(20)
>>> waiting for NAPI_STATE_SCHED to clear -- which never happens for the
>>> stale q_vectors. The task hangs in D state forever; a concurrent write
>>> deadlocks on dev->lock held by the first.
>>>
>>> Commit 13a8cd191a2b ("i40e: Do not enable NAPI on q_vectors that have no
>>> rings") added the guard to prevent a divide-by-zero in i40e_napi_poll()
>>> when epoll busy-poll iterated all device NAPIs (4.x era). Since
>>> 7adc3d57fe2b ("net: Introduce preferred busy-polling"), from v5.11,
>>> napi_busy_loop() polls by napi_id keyed to the socket, so ringless
>>> q_vectors are never selected. i40e_msix_clean_rings() also independently
>>> avoids scheduling NAPI for them. The guard is safe to remove.
>>>
>>> Add an early return in i40e_napi_poll() for num_ringpairs == 0 so the
>>> function is self-defending against a NULL tx.ring dereference at the
>>> WB_ON_ITR check, should the NAPI ever fire through an unexpected path.
>>>
>>> Reported-by: Jakub Kicinski <kuba@kernel.org>
>>> Closes: https://lore.kernel.org/intel-wired-lan/20260316133100.6054a11f@kernel.org/
>>
>> Maciej developed a better fix for the problem, and he explicitly asked
>> to not include this patch. Please drop it from this series.
>>
>> Maciej's fix:
>> https://lore.kernel.org/intel-wired-lan/20260414121405.631092-1-maciej.fijalkowski@intel.com/T/#u
>>
>> ask for reject:
>> https://lore.kernel.org/intel-wired-lan/PH0PR11MB75223C8A00C3183C5082A096A0252@PH0PR11MB7522.namprd11.prod.outlook.com/T/#mbac55f7219d7855a2e5d1527904b2da43ad080cb
>>
>
> Ugh, sorry for failing to notice this when batching this series up :(
>
> Thanks,
> Jake
>
Jakub,
Can you discard this patch out of the series when applying? Or should I
go ahead and send a v2?
Thanks,
Jake
Thread overview: 17+ messages
2026-04-15 5:47 [PATCH net 00/13] Intel Wired LAN Driver Updates 2026-04-14 (ice, i40e, iavf, idpf, e1000e) Jacob Keller
2026-04-15 5:47 ` [PATCH net 01/13] ice: fix 'adjust' timer programming for E830 devices Jacob Keller
2026-04-15 5:47 ` [PATCH net 02/13] ice: update PCS latency settings for E825 10G/25Gb modes Jacob Keller
2026-04-15 5:47 ` [PATCH net 03/13] ice: fix double free in ice_sf_eth_activate() error path Jacob Keller
2026-04-15 5:47 ` [PATCH net 04/13] ice: fix double-free of tx_buf skb Jacob Keller
2026-04-15 5:48 ` [PATCH net 05/13] ice: fix PHY config on media change with link-down-on-close Jacob Keller
2026-04-15 5:48 ` [PATCH net 06/13] ice: fix ICE_AQ_LINK_SPEED_M for 200G Jacob Keller
2026-04-15 5:48 ` [PATCH net 07/13] ice: fix race condition in TX timestamp ring cleanup Jacob Keller
2026-04-15 5:48 ` [PATCH net 08/13] ice: fix potential NULL pointer deref in error path of ice_set_ringparam() Jacob Keller
2026-04-15 5:48 ` [PATCH net 09/13] i40e: don't advertise IFF_SUPP_NOFCS Jacob Keller
2026-04-15 5:48 ` [PATCH net 10/13] i40e: fix napi_enable/disable skipping ringless q_vectors Jacob Keller
2026-04-16 4:20   ` Przemek Kitszel
2026-04-16 20:46     ` Jacob Keller
2026-04-16 20:50       ` Jacob Keller
2026-04-15 5:48 ` [PATCH net 11/13] iavf: fix wrong VLAN mask for legacy Rx descriptors L2TAG2 Jacob Keller
2026-04-15 5:48 ` [PATCH net 12/13] idpf: fix xdp crash in soft reset error path Jacob Keller
2026-04-15 5:48 ` [PATCH net 13/13] e1000e: Unroll PTP in probe error handling Jacob Keller