[PATCH iwl-net] ice: fix NULL pointer dereference in ice_reset_all_vfs()

public inbox for netdev@vger.kernel.org

* [PATCH iwl-net] ice: fix NULL pointer dereference in ice_reset_all_vfs()
@ 2026-04-01 11:09 Petr Oros
  2026-04-13 15:53 ` [Intel-wired-lan] " Romanowski, Rafal
From: Petr Oros @ 2026-04-01 11:09 UTC
  To: netdev
  Cc: Petr Oros, Tony Nguyen, Przemek Kitszel, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Brett Creeley, intel-wired-lan, linux-kernel

ice_reset_all_vfs() ignores the return value of ice_vf_rebuild_vsi().
When the VSI rebuild fails (e.g. during NVM firmware update via
nvmupdate64e), ice_vsi_rebuild() tears down the VSI on its error path,
leaving txq_map and rxq_map as NULL. The subsequent unconditional call
to ice_vf_post_vsi_rebuild() leads to a NULL pointer dereference in
ice_ena_vf_q_mappings() when it accesses vsi->txq_map[0].

The single-VF reset path in ice_reset_vf() already handles this
correctly by checking the return value of ice_vf_reconfig_vsi() and
skipping ice_vf_post_vsi_rebuild() on failure.

Apply the same pattern to ice_reset_all_vfs(): check the return value
of ice_vf_rebuild_vsi() and skip ice_vf_post_vsi_rebuild() and
ice_eswitch_attach_vf() on failure. The VF is left safely disabled
(ICE_VF_STATE_INIT not set, VFGEN_RSTAT not set to VFACTIVE) and can
be recovered via a VFLR triggered by a PCI reset of the VF
(sysfs reset or driver rebind).
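
The control flow described above can be sketched in standalone C. This is a
simplified userspace model under stated assumptions, not the driver code:
struct mock_vf, mock_rebuild_vsi(), mock_post_vsi_rebuild() and
mock_reset_all_vfs() are hypothetical stand-ins for the ice structures and
helpers, the failure is injected through a flag, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VF and the queue map of its VSI. */
struct mock_vf {
	unsigned int vf_id;
	bool fail_rebuild;	/* injected failure, for illustration only */
	int *txq_map;		/* NULL after a failed rebuild */
	bool active;		/* models the VF being marked usable again */
};

/* Models a rebuild whose error path leaves the queue map NULL. */
static int mock_rebuild_vsi(struct mock_vf *vf)
{
	free(vf->txq_map);
	vf->txq_map = NULL;
	if (vf->fail_rebuild)
		return -5;	/* same value as -EIO */
	vf->txq_map = calloc(16, sizeof(*vf->txq_map));
	return vf->txq_map ? 0 : -12;	/* same value as -ENOMEM */
}

/* Models post-rebuild work that reads and writes txq_map[0]. */
static void mock_post_vsi_rebuild(struct mock_vf *vf)
{
	assert(vf->txq_map != NULL);	/* precondition: rebuild succeeded */
	vf->txq_map[0] = 0;
	vf->active = true;
}

/* Returns the number of VFs left disabled because their rebuild failed. */
static int mock_reset_all_vfs(struct mock_vf *vfs, size_t n)
{
	int skipped = 0;

	for (size_t i = 0; i < n; i++) {
		struct mock_vf *vf = &vfs[i];

		vf->active = false;
		if (mock_rebuild_vsi(vf)) {
			fprintf(stderr,
				"VF %u VSI rebuild failed, leaving VF disabled\n",
				vf->vf_id);
			skipped++;
			continue;	/* skip the post-rebuild step */
		}
		mock_post_vsi_rebuild(vf);
	}
	return skipped;
}
```

In this model a VF whose rebuild fails keeps active == false and a NULL
txq_map, while the loop carries on with the remaining VFs.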

Note that this patch does not prevent the VF VSI rebuild from failing
during an NVM update. The underlying cause is firmware being in a
transitional state while the EMP reset is processed, which can cause
Admin Queue commands (ice_add_vsi, ice_cfg_vsi_lan) to fail. This
patch only prevents the subsequent NULL pointer dereference that
crashes the kernel when the rebuild does fail.

 crash> bt
     PID: 50795    TASK: ff34c9ee708dc680  CPU: 1    COMMAND: "kworker/u512:5"
      #0 [ff72159bcfe5bb50] machine_kexec at ffffffffaa8850ee
      #1 [ff72159bcfe5bba8] __crash_kexec at ffffffffaaa15fba
      #2 [ff72159bcfe5bc68] crash_kexec at ffffffffaaa16540
      #3 [ff72159bcfe5bc70] oops_end at ffffffffaa837eda
      #4 [ff72159bcfe5bc90] page_fault_oops at ffffffffaa893997
      #5 [ff72159bcfe5bce8] exc_page_fault at ffffffffab528595
      #6 [ff72159bcfe5bd10] asm_exc_page_fault at ffffffffab600bb2
         [exception RIP: ice_ena_vf_q_mappings+0x79]
         RIP: ffffffffc0a85b29  RSP: ff72159bcfe5bdc8  RFLAGS: 00010206
         RAX: 00000000000f0000  RBX: ff34c9efc9c00000  RCX: 0000000000000000
         RDX: 0000000000000000  RSI: 0000000000000010  RDI: ff34c9efc9c00000
         RBP: ff34c9efc27d4828   R8: 0000000000000093   R9: 0000000000000040
         R10: ff34c9efc27d4828  R11: 0000000000000040  R12: 0000000000100000
         R13: 0000000000000010  R14:   R15:
         ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
      #7 [ff72159bcfe5bdf8] ice_sriov_post_vsi_rebuild at ffffffffc0a85e2e [ice]
      #8 [ff72159bcfe5be08] ice_reset_all_vfs at ffffffffc0a920b4 [ice]
      #9 [ff72159bcfe5be48] ice_service_task at ffffffffc0a31519 [ice]
     #10 [ff72159bcfe5be88] process_one_work at ffffffffaa93dca4
     #11 [ff72159bcfe5bec8] worker_thread at ffffffffaa93e9de
     #12 [ff72159bcfe5bf18] kthread at ffffffffaa946663
     #13 [ff72159bcfe5bf50] ret_from_fork at ffffffffaa8086b9

 The panic occurs when attempting to dereference the NULL pointer in RDX at
 ice_sriov.c:294, which loads vsi->txq_map (offset 0x4b8 in ice_vsi).

 The faulting VSI is an allocated slab object but is not fully initialized
 after a failed ice_vsi_rebuild():

  crash> struct ice_vsi 0xff34c9efc27d4828
    netdev = 0x0,
    rx_rings = 0x0,
    tx_rings = 0x0,
    q_vectors = 0x0,
    txq_map = 0x0,
    rxq_map = 0x0,
    alloc_txq = 0x10,
    num_txq = 0x10,
    alloc_rxq = 0x10,
    num_rxq = 0x10,
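
 The dump shows queue counts still set (num_txq = 0x10) while the map
 pointers are NULL, so a nonzero count alone does not prove the maps are
 usable. A minimal sketch of that distinction follows. It assumes a
 hypothetical struct mock_vsi that mirrors only the fields shown above (it
 is not the driver's struct ice_vsi, and the field types are guesses), and
 mock_vsi_maps_valid() is an illustrative helper, not an existing ice
 function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the fields printed in the crash dump. */
struct mock_vsi {
	uint16_t *txq_map;
	uint16_t *rxq_map;
	uint16_t alloc_txq;
	uint16_t num_txq;
	uint16_t alloc_rxq;
	uint16_t num_rxq;
};

/* True only if every nonzero queue count is backed by a non-NULL map. */
static bool mock_vsi_maps_valid(const struct mock_vsi *vsi)
{
	if (vsi->num_txq && !vsi->txq_map)
		return false;
	if (vsi->num_rxq && !vsi->rxq_map)
		return false;
	return true;
}
```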

 The nvmupdate64e process was performing an NVM firmware update:

  crash> bt 0xff34c9edd1a30000
  PID: 49858    TASK: ff34c9edd1a30000  CPU: 1    COMMAND: "nvmupdate64e"
   #0 [ff72159bcd617618] __schedule at ffffffffab5333f8
   #4 [ff72159bcd617750] ice_sq_send_cmd at ffffffffc0a35347 [ice]
   #5 [ff72159bcd6177a8] ice_sq_send_cmd_retry at ffffffffc0a35b47 [ice]
   #6 [ff72159bcd617810] ice_aq_send_cmd at ffffffffc0a38018 [ice]
   #7 [ff72159bcd617848] ice_aq_read_nvm at ffffffffc0a40254 [ice]
   #8 [ff72159bcd6178b8] ice_read_flat_nvm at ffffffffc0a4034c [ice]
   #9 [ff72159bcd617918] ice_devlink_nvm_snapshot at ffffffffc0a6ffa5 [ice]

 dmesg:
  ice 0000:13:00.0: firmware recommends not updating fw.mgmt, as it
    may result in a downgrade. continuing anyways
  ice 0000:13:00.1: ice_init_nvm failed -5
  ice 0000:13:00.1: Rebuild failed, unload and reload driver

Fixes: 12bb018c538c ("ice: Refactor VF reset")
Signed-off-by: Petr Oros <poros@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_vf_lib.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
index c8bc952f05cdb5..51259a4fdda4b9 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
@@ -804,7 +804,12 @@ void ice_reset_all_vfs(struct ice_pf *pf)
 			ice_vf_ctrl_invalidate_vsi(vf);
 
 		ice_vf_pre_vsi_rebuild(vf);
-		ice_vf_rebuild_vsi(vf);
+		if (ice_vf_rebuild_vsi(vf)) {
+			dev_err(dev, "VF %u VSI rebuild failed, leaving VF disabled\n",
+				vf->vf_id);
+			mutex_unlock(&vf->cfg_lock);
+			continue;
+		}
 		ice_vf_post_vsi_rebuild(vf);
 
 		ice_eswitch_attach_vf(pf, vf);
-- 
2.52.0



* RE: [Intel-wired-lan] [PATCH iwl-net] ice: fix NULL pointer dereference in ice_reset_all_vfs()
  2026-04-01 11:09 [PATCH iwl-net] ice: fix NULL pointer dereference in ice_reset_all_vfs() Petr Oros
@ 2026-04-13 15:53 ` Romanowski, Rafal
From: Romanowski, Rafal @ 2026-04-13 15:53 UTC
  To: Oros, Petr, netdev@vger.kernel.org
  Cc: Kitszel, Przemyslaw, Brett Creeley, Eric Dumazet,
	linux-kernel@vger.kernel.org, Andrew Lunn, Nguyen, Anthony L,
	intel-wired-lan@lists.osuosl.org, Jakub Kicinski, Paolo Abeni,
	David S. Miller

Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>

