All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Larysa Zaremba <larysa.zaremba@intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>, <anthony.l.nguyen@intel.com>,
	<davem@davemloft.net>, <pabeni@redhat.com>, <edumazet@google.com>,
	<andrew+netdev@lunn.ch>, <netdev@vger.kernel.org>,
	<vinschen@redhat.com>, <alexander.nowlin@intel.com>
Subject: Re: [PATCH net 1/8] ice: fix UAF/NULL deref when VSI rebuild and XDP attach race
Date: Mon, 1 Jun 2026 14:21:34 +0200	[thread overview]
Message-ID: <ah15Ti7O2m3Hj2T+@boxer> (raw)
In-Reply-To: <ahlIwmIHiPsB50Zn@soc-5CG4396X81.clients.intel.com>

On Fri, May 29, 2026 at 10:05:22AM +0200, Larysa Zaremba wrote:
> On Thu, May 28, 2026 at 11:14:15AM +0200, Maciej Fijalkowski wrote:
> > On Fri, May 22, 2026 at 05:16:16PM -0700, Jakub Kicinski wrote:
> > > This is an AI-generated review of your patch. The human sending this
> > > email has considered the AI review valid, or at least plausible.
> > > ---
> > > ice: fix UAF/NULL deref when VSI rebuild and XDP attach race
> > > 
> > > This commit addresses a race condition between a VSI rebuild and an XDP
> > > attach operation. It prevents rx_ring->xdp_prog from being populated before
> > > rx_ring->xdp_ring is valid by rejecting the XDP attach when a VSI rebuild
> > > is pending. XDP detachment is still permitted during this window to clear
> > > the existing program.
> > > 
> > > Fixes: 2504b8405768 ("ice: protect XDP configuration with a mutex")
> > > 
> > > > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > > > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > > > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > > [ ... ]
> > > > @@ -2912,12 +2912,21 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog,
> > > >  	}
> > > >  
> > > >  	/* hot swap progs and avoid toggling link */
> > > > -	if (ice_is_xdp_ena_vsi(vsi) == !!prog ||
> > > > -	    test_bit(ICE_VSI_REBUILD_PENDING, vsi->state)) {
> > > > +	if (ice_is_xdp_ena_vsi(vsi) == !!prog) {
> > > >  		ice_vsi_assign_bpf_prog(vsi, prog);
> > > >  		return 0;
> > > >  	}
> > > >  
> > > > +	if (test_bit(ICE_VSI_REBUILD_PENDING, vsi->state)) {
> > > 
> > > [Severity: High]
> > > This is a pre-existing issue, but is the check for ICE_VSI_REBUILD_PENDING
> > > prone to a TOCTOU race because it relies on ad-hoc flag synchronization
> > > instead of proper locking with the device reset path?
> > > 
> > > The asynchronous reset path (ice_prepare_for_reset()) sets
> > > ICE_VSI_REBUILD_PENDING without holding vsi->xdp_state_lock or rtnl_lock().
> > > 
> > > If ice_xdp_setup_prog() (which holds both locks) checks the bit before it
> > > is set, it proceeds to configure XDP and sets ICE_VSI_DOWN. If the reset
> > > path then sets the bit and calls ice_pf_dis_all_vsi(), it will skip this
> > > VSI because ICE_VSI_DOWN is already set.
> > >
> 
> Here is ice_dis_vsi()
> 
> void ice_dis_vsi(struct ice_vsi *vsi, bool locked)
> {
> 	bool already_down = test_bit(ICE_VSI_DOWN, vsi->state);
> 
> 	set_bit(ICE_VSI_NEEDS_RESTART, vsi->state);
> 
> 	if (vsi->netdev && (vsi->type == ICE_VSI_PF ||
> 			    vsi->type == ICE_VSI_SF)) {
> 		if (netif_running(vsi->netdev)) {
> 			if (!locked)
> 				rtnl_lock();
> 			already_down = test_bit(ICE_VSI_DOWN, vsi->state);
> 			if (!already_down)
> 				ice_vsi_close(vsi);
> 
> 			if (!locked)
> 				rtnl_unlock();
> 		} else if (!already_down) {
> 			ice_vsi_close(vsi);
> 		}
> 	} else if (vsi->type == ICE_VSI_CTRL && !already_down) {
> 		ice_vsi_close(vsi);
> 	}
> }
> 
> For __LINK_STATE_START netdevs, DOWN state is rechecked under rtnl_lock() 
> before closing VSI, so we do not have to worry about XDP and other configuration 
> functions tempering with it.
> 
> For !netif_running() netdevs, the only situation, when there could be such 
> problem is if the whole .ndo_open() finishes and another rebuilding 
> (rtnl-locked) callback starts and sets ICE_VSI_DOWN in between
> 
> `if (netif_running(vsi->netdev))` and `ice_vsi_close(vsi)`
> 
> Which is already unlikely.
> But I think this is not possible at all since ndo_open() will not succeed while 
> we are in a reset path, so no rtnl-locked callback will see !ICE_VSI_DOWN state.
> 
> It would be more robust to check for netif_running(vsi->netdev) under rtnl_lock 
> too, but the code should work as-is.

I think issue is not about repeated ice_down() scope but rather the
allocations that could overwrite existing objects as we skipped these
calls which ice_vsi_close() has in it:

	ice_vsi_clear_napi_queues(vsi);
	ice_vsi_free_irq(vsi);
	ice_vsi_free_tx_rings(vsi);
	ice_vsi_free_rx_rings(vsi);

?

> 
> > > When ice_xdp_setup_prog() finishes, it calls ice_up() which clears
> > > ICE_VSI_DOWN and starts enabling hardware queues while the device is
> > > actively undergoing a hardware reset (ice_do_reset()).
> > > 
> > > Can writing to hardware registers during an active reset cause hardware
> > > hangs or PCI master aborts?
> > > 
> > > > +		if (prog) {
> > > > +			NL_SET_ERR_MSG_MOD(extack, "VSI rebuild is pending");
> > > > +			return -EAGAIN;
> > > > +		}
> > > > +
> > > > +		ice_vsi_assign_bpf_prog(vsi, NULL);
> > > 
> > > [Severity: High]
> > > This is a pre-existing issue, but does detaching an XDP program while a VSI
> > > rebuild has failed lead to a NULL pointer dereference here?
> > > 
> > > If a VSI rebuild fails, ice_vsi_decfg() frees vsi->rx_rings and sets the
> > > pointer to NULL via ice_vsi_free_arrays(), but leaves vsi->num_rxq
> > > non-zero. The ICE_VSI_REBUILD_PENDING bit remains set because it is only
> > > cleared on the success path in ice_vsi_rebuild().
> > > 
> > > If a user subsequently detaches XDP, ice_xdp_setup_prog() intercepts it
> > > because ICE_VSI_REBUILD_PENDING is set, and it calls
> > > ice_vsi_assign_bpf_prog(vsi, NULL).
> > > 
> > > This function loops up to vsi->num_rxq:
> > > 
> > > ice_vsi_assign_bpf_prog() {
> > >     ...
> > >     ice_for_each_rxq(vsi, i)
> > >         WRITE_ONCE(vsi->rx_rings[i]->xdp_prog, vsi->xdp_prog);
> > >     ...
> > > }
> > > 
> > > Since vsi->rx_rings is NULL, does this cause a guaranteed NULL pointer
> > > dereference panic?
> > 
> > CC: Larysa
> > 
> > Seems we should check if vsi->rx_rings != NULL plus
> > ICE_VSI_REBUILD_PENDING should be cleared on error path in
> > ice_vsi_rebuild().
> 
> I think checking (vsi->rx_rings != NULL) should be sufficient. Not sure about 
> clearing ICE_VSI_REBUILD_PENDING, if a rebuild has failed and the device is 
> disfunctional, rebuild is still to be expected at some point, and XPD changes 
> should not have any effect before then.
> 
> > 
> > Do you have any comment on TOCTOU issue mentioned above?
> > 
> > > 
> > > > +		return 0;
> > > > +	}
> > > > +
> > > >  	if_running = netif_running(vsi->netdev) &&
> > > >  		     !test_and_set_bit(ICE_VSI_DOWN, vsi->state);
> > > -- 
> > > pw-bot: cr
> > > 

  reply	other threads:[~2026-06-01 12:21 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-20 18:34 [PATCH net 0/8][pull request] Intel Wired LAN Driver Updates 2026-05-20 (ice, iavf, i40e, ixgbe) Tony Nguyen
2026-05-20 18:34 ` [PATCH net 1/8] ice: fix UAF/NULL deref when VSI rebuild and XDP attach race Tony Nguyen
2026-05-21 15:37   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski
2026-05-28  9:14     ` Maciej Fijalkowski
2026-05-29  8:05       ` Larysa Zaremba
2026-06-01 12:21         ` Maciej Fijalkowski [this message]
2026-06-01 13:52           ` Larysa Zaremba
2026-05-20 18:34 ` [PATCH net 2/8] ice: fix stats array overflow when VF requests more queues Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 3/8] iavf: return EBUSY if reset in progress or not ready during MAC change Tony Nguyen
2026-05-20 18:34 ` [PATCH net 4/8] i40e: skip unnecessary VF reset when setting trust Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 5/8] iavf: send MAC change request synchronously Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 6/8] ice: skip unnecessary VF reset when setting trust Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 7/8] i40e: set supported_extts_flags for rising edge Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-20 18:34 ` [PATCH net 8/8] ixgbe: only access vfinfo and mv_list under RCU lock Tony Nguyen
2026-05-23  0:16   ` Jakub Kicinski
2026-05-23  0:16   ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ah15Ti7O2m3Hj2T+@boxer \
    --to=maciej.fijalkowski@intel.com \
    --cc=alexander.nowlin@intel.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=anthony.l.nguyen@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=larysa.zaremba@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=vinschen@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.