* [PATCH iwl-net] ice: fix NULL pointer dereference in ice_reset_all_vfs()
@ 2026-04-01 11:09 Petr Oros
2026-04-13 15:53 ` [Intel-wired-lan] " Romanowski, Rafal
0 siblings, 1 reply; 2+ messages in thread
From: Petr Oros @ 2026-04-01 11:09 UTC
To: netdev
Cc: Petr Oros, Tony Nguyen, Przemek Kitszel, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Brett Creeley, intel-wired-lan, linux-kernel
ice_reset_all_vfs() ignores the return value of ice_vf_rebuild_vsi().
When the VSI rebuild fails (e.g. during NVM firmware update via
nvmupdate64e), ice_vsi_rebuild() tears down the VSI on its error path,
leaving txq_map and rxq_map as NULL. The subsequent unconditional call
to ice_vf_post_vsi_rebuild() leads to a NULL pointer dereference in
ice_ena_vf_q_mappings() when it accesses vsi->txq_map[0].

The single-VF reset path in ice_reset_vf() already handles this
correctly by checking the return value of ice_vf_reconfig_vsi() and
skipping ice_vf_post_vsi_rebuild() on failure.

Apply the same pattern to ice_reset_all_vfs(): check the return value
of ice_vf_rebuild_vsi() and skip ice_vf_post_vsi_rebuild() and
ice_eswitch_attach_vf() on failure. The VF is left safely disabled
(ICE_VF_STATE_INIT not set, VFGEN_RSTAT not set to VFACTIVE) and can
be recovered via a VFLR triggered by a PCI reset of the VF
(sysfs reset or driver rebind).

Note that this patch does not prevent the VF VSI rebuild from failing
during NVM update. The underlying cause is firmware being in a
transitional state while the EMP reset is processed, which can cause
Admin Queue commands (ice_add_vsi, ice_cfg_vsi_lan) to fail. This
patch only prevents the subsequent NULL pointer dereference that
crashes the kernel when the rebuild does fail.
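
For readers who want to see the control flow in isolation, the following is a minimal userspace sketch of the pattern described above, not driver code. Every mock_* name, the rebuild_should_fail knob, and the cfg_locked flag are hypothetical stand-ins introduced for illustration; only the shape (check the rebuild result, release the per-VF lock, skip the post-rebuild step) mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for VF/VSI state; not the real ice structs. */
struct mock_vsi {
	int *txq_map;			/* left NULL by a failed rebuild */
	int num_txq;
};

struct mock_vf {
	unsigned int vf_id;
	int rebuild_should_fail;	/* test knob, not a driver field */
	int active;			/* models ICE_VF_STATE_INIT being set */
	int cfg_locked;			/* models vf->cfg_lock being held */
	int first_txq;
	struct mock_vsi vsi;
};

static int backing_map[4] = { 0, 1, 2, 3 };

/* Models ice_vf_rebuild_vsi(): on failure the queue map stays NULL. */
static int mock_rebuild_vsi(struct mock_vf *vf)
{
	if (vf->rebuild_should_fail) {
		vf->vsi.txq_map = NULL;
		return -5;		/* -EIO */
	}
	vf->vsi.txq_map = backing_map;
	vf->vsi.num_txq = 4;
	return 0;
}

/* Models ice_vf_post_vsi_rebuild(): it reads txq_map[0], so the map
 * must be valid before this is called.
 */
static void mock_post_vsi_rebuild(struct mock_vf *vf)
{
	assert(vf->vsi.txq_map != NULL);
	vf->first_txq = vf->vsi.txq_map[0];
	vf->active = 1;
}

/* Returns how many VFs were brought back up. */
static int mock_reset_all_vfs(struct mock_vf *vfs, size_t count)
{
	int recovered = 0;

	for (size_t i = 0; i < count; i++) {
		struct mock_vf *vf = &vfs[i];

		vf->cfg_locked = 1;
		vf->active = 0;

		if (mock_rebuild_vsi(vf)) {
			fprintf(stderr, "VF %u VSI rebuild failed, leaving VF disabled\n",
				vf->vf_id);
			vf->cfg_locked = 0;	/* unlock before skipping */
			continue;
		}
		mock_post_vsi_rebuild(vf);
		recovered++;

		vf->cfg_locked = 0;
	}
	return recovered;
}
```

With three mock VFs where only the second is set to fail, mock_reset_all_vfs() returns 2, the failed VF stays inactive with a NULL map, and no lock flag is left set. The unlock on the early-exit path is the detail that is easy to miss when adding a continue inside a locked loop body.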
crash> bt
PID: 50795 TASK: ff34c9ee708dc680 CPU: 1 COMMAND: "kworker/u512:5"
#0 [ff72159bcfe5bb50] machine_kexec at ffffffffaa8850ee
#1 [ff72159bcfe5bba8] __crash_kexec at ffffffffaaa15fba
#2 [ff72159bcfe5bc68] crash_kexec at ffffffffaaa16540
#3 [ff72159bcfe5bc70] oops_end at ffffffffaa837eda
#4 [ff72159bcfe5bc90] page_fault_oops at ffffffffaa893997
#5 [ff72159bcfe5bce8] exc_page_fault at ffffffffab528595
#6 [ff72159bcfe5bd10] asm_exc_page_fault at ffffffffab600bb2
[exception RIP: ice_ena_vf_q_mappings+0x79]
RIP: ffffffffc0a85b29 RSP: ff72159bcfe5bdc8 RFLAGS: 00010206
RAX: 00000000000f0000 RBX: ff34c9efc9c00000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000010 RDI: ff34c9efc9c00000
RBP: ff34c9efc27d4828 R8: 0000000000000093 R9: 0000000000000040
R10: ff34c9efc27d4828 R11: 0000000000000040 R12: 0000000000100000
R13: 0000000000000010 R14: R15:
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ff72159bcfe5bdf8] ice_sriov_post_vsi_rebuild at ffffffffc0a85e2e [ice]
#8 [ff72159bcfe5be08] ice_reset_all_vfs at ffffffffc0a920b4 [ice]
#9 [ff72159bcfe5be48] ice_service_task at ffffffffc0a31519 [ice]
#10 [ff72159bcfe5be88] process_one_work at ffffffffaa93dca4
#11 [ff72159bcfe5bec8] worker_thread at ffffffffaa93e9de
#12 [ff72159bcfe5bf18] kthread at ffffffffaa946663
#13 [ff72159bcfe5bf50] ret_from_fork at ffffffffaa8086b9
The panic occurs on an attempt to dereference the NULL pointer in RDX at
ice_sriov.c:294, which loads vsi->txq_map (offset 0x4b8 in ice_vsi).

The faulting VSI is an allocated slab object that was not fully
initialized after a failed ice_vsi_rebuild():
crash> struct ice_vsi 0xff34c9efc27d4828
netdev = 0x0,
rx_rings = 0x0,
tx_rings = 0x0,
q_vectors = 0x0,
txq_map = 0x0,
rxq_map = 0x0,
alloc_txq = 0x10,
num_txq = 0x10,
alloc_rxq = 0x10,
num_rxq = 0x10,
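
The dump shows why the state is dangerous: the queue counts are still 0x10 while both maps are NULL, so any loop bounded by num_txq or num_rxq will index a NULL map. The sketch below only illustrates that inconsistency in userspace. The struct vsi_snapshot type and the vsi_queue_maps_consistent() helper are hypothetical, the field types are assumed, and this check is not what the patch adds (the patch skips the post-rebuild step instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the fields printed in the dump above; this is
 * not the real struct ice_vsi layout and the field types are assumed.
 */
struct vsi_snapshot {
	uint16_t *txq_map;
	uint16_t *rxq_map;
	uint16_t alloc_txq;
	uint16_t num_txq;
	uint16_t alloc_rxq;
	uint16_t num_rxq;
};

/* True when every nonzero queue count is backed by a non-NULL map. */
static bool vsi_queue_maps_consistent(const struct vsi_snapshot *vsi)
{
	assert(vsi != NULL);

	if (vsi->num_txq > 0 && vsi->txq_map == NULL)
		return false;
	if (vsi->num_rxq > 0 && vsi->rxq_map == NULL)
		return false;
	return true;
}
```

Filling a snapshot with the values from the dump (counts of 0x10, both maps NULL) makes the helper return false; pointing both maps at a real array makes it return true.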
The nvmupdate64e process was performing NVM firmware update:
crash> bt 0xff34c9edd1a30000
PID: 49858 TASK: ff34c9edd1a30000 CPU: 1 COMMAND: "nvmupdate64e"
#0 [ff72159bcd617618] __schedule at ffffffffab5333f8
#4 [ff72159bcd617750] ice_sq_send_cmd at ffffffffc0a35347 [ice]
#5 [ff72159bcd6177a8] ice_sq_send_cmd_retry at ffffffffc0a35b47 [ice]
#6 [ff72159bcd617810] ice_aq_send_cmd at ffffffffc0a38018 [ice]
#7 [ff72159bcd617848] ice_aq_read_nvm at ffffffffc0a40254 [ice]
#8 [ff72159bcd6178b8] ice_read_flat_nvm at ffffffffc0a4034c [ice]
#9 [ff72159bcd617918] ice_devlink_nvm_snapshot at ffffffffc0a6ffa5 [ice]
dmesg:
ice 0000:13:00.0: firmware recommends not updating fw.mgmt, as it
may result in a downgrade. continuing anyways
ice 0000:13:00.1: ice_init_nvm failed -5
ice 0000:13:00.1: Rebuild failed, unload and reload driver
Fixes: 12bb018c538c ("ice: Refactor VF reset")
Signed-off-by: Petr Oros <poros@redhat.com>
---
drivers/net/ethernet/intel/ice/ice_vf_lib.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
index c8bc952f05cdb5..51259a4fdda4b9 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
@@ -804,7 +804,12 @@ void ice_reset_all_vfs(struct ice_pf *pf)
 			ice_vf_ctrl_invalidate_vsi(vf);
 
 		ice_vf_pre_vsi_rebuild(vf);
-		ice_vf_rebuild_vsi(vf);
+		if (ice_vf_rebuild_vsi(vf)) {
+			dev_err(dev, "VF %u VSI rebuild failed, leaving VF disabled\n",
+				vf->vf_id);
+			mutex_unlock(&vf->cfg_lock);
+			continue;
+		}
 		ice_vf_post_vsi_rebuild(vf);
 
 		ice_eswitch_attach_vf(pf, vf);
--
2.52.0
* RE: [Intel-wired-lan] [PATCH iwl-net] ice: fix NULL pointer dereference in ice_reset_all_vfs()
2026-04-01 11:09 [PATCH iwl-net] ice: fix NULL pointer dereference in ice_reset_all_vfs() Petr Oros
@ 2026-04-13 15:53 ` Romanowski, Rafal
0 siblings, 0 replies; 2+ messages in thread
From: Romanowski, Rafal @ 2026-04-13 15:53 UTC
To: Oros, Petr, netdev@vger.kernel.org
Cc: Kitszel, Przemyslaw, Brett Creeley, Eric Dumazet,
linux-kernel@vger.kernel.org, Andrew Lunn, Nguyen, Anthony L,
intel-wired-lan@lists.osuosl.org, Jakub Kicinski, Paolo Abeni,
David S. Miller
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>