Netdev List
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Larysa Zaremba <larysa.zaremba@intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>, <anthony.l.nguyen@intel.com>,
	<davem@davemloft.net>, <pabeni@redhat.com>, <edumazet@google.com>,
	<andrew+netdev@lunn.ch>, <netdev@vger.kernel.org>,
	<vinschen@redhat.com>, <alexander.nowlin@intel.com>
Subject: Re: [PATCH net 1/8] ice: fix UAF/NULL deref when VSI rebuild and XDP attach race
Date: Mon, 1 Jun 2026 14:21:34 +0200	[thread overview]
Message-ID: <ah15Ti7O2m3Hj2T+@boxer> (raw)
In-Reply-To: <ahlIwmIHiPsB50Zn@soc-5CG4396X81.clients.intel.com>

On Fri, May 29, 2026 at 10:05:22AM +0200, Larysa Zaremba wrote:
> On Thu, May 28, 2026 at 11:14:15AM +0200, Maciej Fijalkowski wrote:
> > On Fri, May 22, 2026 at 05:16:16PM -0700, Jakub Kicinski wrote:
> > > This is an AI-generated review of your patch. The human sending this
> > > email has considered the AI review valid, or at least plausible.
> > > ---
> > > ice: fix UAF/NULL deref when VSI rebuild and XDP attach race
> > > 
> > > This commit addresses a race condition between a VSI rebuild and an XDP
> > > attach operation. It prevents rx_ring->xdp_prog from being populated before
> > > rx_ring->xdp_ring is valid by rejecting the XDP attach when a VSI rebuild
> > > is pending. XDP detachment is still permitted during this window to clear
> > > the existing program.
> > > 
> > > Fixes: 2504b8405768 ("ice: protect XDP configuration with a mutex")
> > > 
> > > > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > > > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > > > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > > [ ... ]
> > > > @@ -2912,12 +2912,21 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog,
> > > >  	}
> > > >  
> > > >  	/* hot swap progs and avoid toggling link */
> > > > -	if (ice_is_xdp_ena_vsi(vsi) == !!prog ||
> > > > -	    test_bit(ICE_VSI_REBUILD_PENDING, vsi->state)) {
> > > > +	if (ice_is_xdp_ena_vsi(vsi) == !!prog) {
> > > >  		ice_vsi_assign_bpf_prog(vsi, prog);
> > > >  		return 0;
> > > >  	}
> > > >  
> > > > +	if (test_bit(ICE_VSI_REBUILD_PENDING, vsi->state)) {
> > > 
> > > [Severity: High]
> > > This is a pre-existing issue, but is the check for ICE_VSI_REBUILD_PENDING
> > > prone to a TOCTOU race because it relies on ad-hoc flag synchronization
> > > instead of proper locking with the device reset path?
> > > 
> > > The asynchronous reset path (ice_prepare_for_reset()) sets
> > > ICE_VSI_REBUILD_PENDING without holding vsi->xdp_state_lock or rtnl_lock().
> > > 
> > > If ice_xdp_setup_prog() (which holds both locks) checks the bit before it
> > > is set, it proceeds to configure XDP and sets ICE_VSI_DOWN. If the reset
> > > path then sets the bit and calls ice_pf_dis_all_vsi(), it will skip this
> > > VSI because ICE_VSI_DOWN is already set.
> > >
> 
> Here is ice_dis_vsi()
> 
> void ice_dis_vsi(struct ice_vsi *vsi, bool locked)
> {
> 	bool already_down = test_bit(ICE_VSI_DOWN, vsi->state);
> 
> 	set_bit(ICE_VSI_NEEDS_RESTART, vsi->state);
> 
> 	if (vsi->netdev && (vsi->type == ICE_VSI_PF ||
> 			    vsi->type == ICE_VSI_SF)) {
> 		if (netif_running(vsi->netdev)) {
> 			if (!locked)
> 				rtnl_lock();
> 			already_down = test_bit(ICE_VSI_DOWN, vsi->state);
> 			if (!already_down)
> 				ice_vsi_close(vsi);
> 
> 			if (!locked)
> 				rtnl_unlock();
> 		} else if (!already_down) {
> 			ice_vsi_close(vsi);
> 		}
> 	} else if (vsi->type == ICE_VSI_CTRL && !already_down) {
> 		ice_vsi_close(vsi);
> 	}
> }
> 
> For __LINK_STATE_START netdevs, DOWN state is rechecked under rtnl_lock() 
> before closing VSI, so we do not have to worry about XDP and other configuration 
> functions tempering with it.
> 
> For !netif_running() netdevs, the only situation, when there could be such 
> problem is if the whole .ndo_open() finishes and another rebuilding 
> (rtnl-locked) callback starts and sets ICE_VSI_DOWN in between
> 
> `if (netif_running(vsi->netdev))` and `ice_vsi_close(vsi)`
> 
> Which is already unlikely.
> But I think this is not possible at all since ndo_open() will not succeed while 
> we are in a reset path, so no rtnl-locked callback will see !ICE_VSI_DOWN state.
> 
> It would be more robust to check for netif_running(vsi->netdev) under rtnl_lock 
> too, but the code should work as-is.

I think issue is not about repeated ice_down() scope but rather the
allocations that could overwrite existing objects as we skipped these
calls which ice_vsi_close() has in it:

	ice_vsi_clear_napi_queues(vsi);
	ice_vsi_free_irq(vsi);
	ice_vsi_free_tx_rings(vsi);
	ice_vsi_free_rx_rings(vsi);

?

> 
> > > When ice_xdp_setup_prog() finishes, it calls ice_up() which clears
> > > ICE_VSI_DOWN and starts enabling hardware queues while the device is
> > > actively undergoing a hardware reset (ice_do_reset()).
> > > 
> > > Can writing to hardware registers during an active reset cause hardware
> > > hangs or PCI master aborts?
> > > 
> > > > +		if (prog) {
> > > > +			NL_SET_ERR_MSG_MOD(extack, "VSI rebuild is pending");
> > > > +			return -EAGAIN;
> > > > +		}
> > > > +
> > > > +		ice_vsi_assign_bpf_prog(vsi, NULL);
> > > 
> > > [Severity: High]
> > > This is a pre-existing issue, but does detaching an XDP program while a VSI
> > > rebuild has failed lead to a NULL pointer dereference here?
> > > 
> > > If a VSI rebuild fails, ice_vsi_decfg() frees vsi->rx_rings and sets the
> > > pointer to NULL via ice_vsi_free_arrays(), but leaves vsi->num_rxq
> > > non-zero. The ICE_VSI_REBUILD_PENDING bit remains set because it is only
> > > cleared on the success path in ice_vsi_rebuild().
> > > 
> > > If a user subsequently detaches XDP, ice_xdp_setup_prog() intercepts it
> > > because ICE_VSI_REBUILD_PENDING is set, and it calls
> > > ice_vsi_assign_bpf_prog(vsi, NULL).
> > > 
> > > This function loops up to vsi->num_rxq:
> > > 
> > > ice_vsi_assign_bpf_prog() {
> > >     ...
> > >     ice_for_each_rxq(vsi, i)
> > >         WRITE_ONCE(vsi->rx_rings[i]->xdp_prog, vsi->xdp_prog);
> > >     ...
> > > }
> > > 
> > > Since vsi->rx_rings is NULL, does this cause a guaranteed NULL pointer
> > > dereference panic?
> > 
> > CC: Larysa
> > 
> > Seems we should check if vsi->rx_rings != NULL plus
> > ICE_VSI_REBUILD_PENDING should be cleared on error path in
> > ice_vsi_rebuild().
> 
> I think checking (vsi->rx_rings != NULL) should be sufficient. Not sure about 
> clearing ICE_VSI_REBUILD_PENDING, if a rebuild has failed and the device is 
> disfunctional, rebuild is still to be expected at some point, and XPD changes 
> should not have any effect before then.
> 
> > 
> > Do you have any comment on TOCTOU issue mentioned above?
> > 
> > > 
> > > > +		return 0;
> > > > +	}
> > > > +
> > > >  	if_running = netif_running(vsi->netdev) &&
> > > >  		     !test_and_set_bit(ICE_VSI_DOWN, vsi->state);
> > > -- 
> > > pw-bot: cr
> > > 

  reply	other threads:[~2026-06-01 12:21 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-20 18:34 [PATCH net 0/8][pull request] Intel Wired LAN Driver Updates 2026-05-20 (ice, iavf, i40e, ixgbe) Tony Nguyen
2026-05-20 18:34 ` [PATCH net 1/8] ice: fix UAF/NULL deref when VSI rebuild and XDP attach race Tony Nguyen
2026-05-21 15:37   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski
2026-05-28  9:14     ` Maciej Fijalkowski
2026-05-29  8:05       ` Larysa Zaremba
2026-06-01 12:21         ` Maciej Fijalkowski [this message]
2026-06-01 13:52           ` Larysa Zaremba
2026-05-20 18:34 ` [PATCH net 2/8] ice: fix stats array overflow when VF requests more queues Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 3/8] iavf: return EBUSY if reset in progress or not ready during MAC change Tony Nguyen
2026-05-20 18:34 ` [PATCH net 4/8] i40e: skip unnecessary VF reset when setting trust Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 5/8] iavf: send MAC change request synchronously Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 6/8] ice: skip unnecessary VF reset when setting trust Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 7/8] i40e: set supported_extts_flags for rising edge Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 8/8] ixgbe: only access vfinfo and mv_list under RCU lock Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ah15Ti7O2m3Hj2T+@boxer \
    --to=maciej.fijalkowski@intel.com \
    --cc=alexander.nowlin@intel.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=anthony.l.nguyen@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=larysa.zaremba@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=vinschen@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox