From: Simon Horman <horms@kernel.org>
To: aleksander.lobakin@intel.com
Cc: 'Simon Horman' <horms@kernel.org>,
intel-wired-lan@lists.osuosl.org, anthony.l.nguyen@intel.com,
przemyslaw.kitszel@intel.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, kohei@enjuk.jp, jacob.e.keller@intel.com,
aleksandr.loktionov@intel.com,
nxne.cnse.osdt.itp.upstreaming@intel.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH iwl-next v5 4/5] ice: implement Rx queue management ops
Date: Fri, 8 May 2026 15:25:06 +0100 [thread overview]
Message-ID: <20260508142505.1210977-2-horms@kernel.org> (raw)
In-Reply-To: <20260505152923.1040589-5-aleksander.lobakin@intel.com>
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ice: implement Rx queue management ops
Now ice is ready to get queue_mgmt_ops support. It already has API
to disable/reconfig/enable one particular queue (for XSk). Reuse as
much of its code as possible to implement Rx queue management
callbacks and vice versa -- ice_queue_mem_{alloc,free}() can be
reused during ifup/ifdown to elide code duplication.
With this, ice passes the io_uring zcrx selftests, meaning the Rx
part of netmem/MP support is done.
> diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
> index 1add82d894bbe..4e0b8895c303a 100644
> --- a/drivers/net/ethernet/intel/ice/ice_base.c
> +++ b/drivers/net/ethernet/intel/ice/ice_base.c
> @@ -653,6 +653,43 @@ static int ice_rxq_pp_create(struct ice_rx_ring *rq)
> return err;
> }
>
> +static int ice_queue_mem_alloc(struct net_device *dev,
> + struct netdev_queue_config *qcfg,
> + void *per_queue_mem, int idx)
> +{
> + const struct ice_netdev_priv *priv = netdev_priv(dev);
> + const struct ice_rx_ring *real = priv->vsi->rx_rings[idx];
> + struct ice_rx_ring *new = per_queue_mem;
> + int ret;
> +
> + new->count = real->count;
> + new->netdev = real->netdev;
> + new->q_index = real->q_index;
> + new->q_vector = real->q_vector;
> + new->vsi = real->vsi;
Is it safe to assume that netdev_priv(dev)->vsi is the correct VSI here? A
loopback VSI (ICE_VSI_LB) shares the PF's netdev, so priv->vsi resolves to
the PF VSI.
Could this copy the PF VSI pointers to the loopback ring and cause hardware
context corruption later?
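If these queue management ops are only meant to operate on the PF VSI's
own rings, one defensive option (untested sketch) would be to reject
other VSI types up front in ice_queue_mem_alloc():

        if (priv->vsi->type != ICE_VSI_PF)
                return -EOPNOTSUPP;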
[ ... ]
> -/**
> - * ice_qp_dis - Disables a queue pair
> - * @vsi: VSI of interest
> - * @q_idx: ring index in array
> - *
> - * Returns 0 on success, negative on failure.
> - */
> -int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
> +static int __ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
> {
Could this cause TX data corruption? This function appears to call
synchronize_net() before netif_tx_stop_queue(), meaning the RCU barrier
completes before the queue actually stops.
Concurrent transmissions might then race with ring modifications.
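If that is indeed the ordering, something like the following (untested;
"txq" stands for the netdev TX queue already looked up for q_idx) would
close the window by stopping the queue before the grace period:

        netif_tx_stop_queue(txq);
        /* wait for in-flight xmit calls that still saw the queue as
         * running before the rings are modified
         */
        synchronize_net();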
[ ... ]
> +static int ice_queue_start(struct net_device *dev,
> + struct netdev_queue_config *qcfg,
> + void *per_queue_mem, int idx)
> +{
> + const struct ice_netdev_priv *priv = netdev_priv(dev);
> + struct ice_rx_ring *real = priv->vsi->rx_rings[idx];
> + struct ice_rx_ring *new = per_queue_mem;
> + struct napi_struct *napi;
> + int ret;
[ ... ]
> + ret = ice_alloc_rx_bufs(real, ICE_DESC_UNUSED(real));
> +
> + return __ice_qp_ena(priv->vsi, idx, ret);
> +}
Does this cause a use-after-free via hardware DMA?
ice_alloc_rx_bufs() returns a boolean true (1) on success. That value is
passed to __ice_qp_ena() as the fail argument and propagated as the return
value of ice_queue_start(). Because that return value is non-zero, the core
networking stack assumes failure and frees the queue memory, while the
hardware queue remains active and DMAs into the freed memory.
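If the boolean semantics are as described above, converting the value to
an explicit errno before handing it on would keep the callback contract
intact (untested sketch):

        ret = ice_alloc_rx_bufs(real, ICE_DESC_UNUSED(real)) ? 0 : -ENOMEM;

        return __ice_qp_ena(priv->vsi, idx, ret);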
> +
> +static int ice_queue_stop(struct net_device *dev, void *per_queue_mem,
> + int idx)
> +{
> + const struct ice_netdev_priv *priv = netdev_priv(dev);
> + struct ice_rx_ring *real = priv->vsi->rx_rings[idx];
> + int ret;
> +
> + ret = __ice_qp_dis(priv->vsi, idx);
Does calling __ice_qp_dis() from ice_queue_stop() disrupt the global interface
carrier state and TX queues?
__ice_qp_dis() calls netif_carrier_off() and cleans TX rings, but
ice_queue_stop() is intended to be a granular, per-RX-queue operation.
> + if (ret)
> + return ret;
> +
> + page_pool_disable_direct_recycling(real->pp);
Can this dereference a NULL pointer if the queue is configured with an AF_XDP
zero-copy socket?
In that case, a standard page pool is not created and real->pp might be NULL,
causing a panic in page_pool_disable_direct_recycling().
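A NULL check mirroring the hdr_pp handling just below would avoid that
(untested):

        if (real->pp)
                page_pool_disable_direct_recycling(real->pp);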
> + if (real->hdr_pp)
> + page_pool_disable_direct_recycling(real->hdr_pp);
> +
> + ice_zero_rx_ring(real);
> + memcpy(per_queue_mem, real, sizeof(*real));
Could this lead to a memory leak of RX ring buffers?
ice_zero_rx_ring() clears the next_to_clean and next_to_use indices to 0
before the memcpy(). When the stack calls ice_queue_mem_free(per_queue_mem),
it skips cleaning because next_to_clean == next_to_use, leaking all active
skbs and page pool buffers.
Also, does this memcpy() leave dangling pointers in real that can lead to a
double-free?
The dynamically allocated pointers like rx_fqes and pp are not set to NULL
in real. When the interface is brought down later, ice_clean_rx_ring(real)
may attempt to destroy the already-freed page pools again.
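If both of those hold, one possible ordering (untested sketch; field
names taken from the quoted code and the concerns above) is to snapshot
the ring before zeroing it and then clear the live ring's buffer and
pool pointers so a later ifdown cannot free them twice:

        memcpy(per_queue_mem, real, sizeof(*real));
        ice_zero_rx_ring(real);

        /* ownership of the FQEs and page pools moved to per_queue_mem */
        real->rx_fqes = NULL;
        real->pp = NULL;
        real->hdr_pp = NULL;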
> +
> + return 0;
> +}