From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Simon Horman <horms@kernel.org>
Cc: <intel-wired-lan@lists.osuosl.org>, <anthony.l.nguyen@intel.com>,
	<przemyslaw.kitszel@intel.com>, <andrew+netdev@lunn.ch>,
	<davem@davemloft.net>, <edumazet@google.com>, <kuba@kernel.org>,
	<pabeni@redhat.com>, <kohei@enjuk.jp>, <jacob.e.keller@intel.com>,
	<aleksandr.loktionov@intel.com>,
	<nxne.cnse.osdt.itp.upstreaming@intel.com>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH iwl-next v5 4/5] ice: implement Rx queue management ops
Date: Fri, 8 May 2026 17:07:03 +0200	[thread overview]
Message-ID: <4c0cacc0-b5f4-47fa-8865-72bd157c1cb1@intel.com> (raw)
In-Reply-To: <20260508140758.1200568-3-horms@kernel.org>

From: Simon Horman <horms@kernel.org>
Date: Fri,  8 May 2026 15:08:00 +0100

> From: 'Simon Horman' <horms@kernel.org>
> 
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> ice: implement Rx queue management ops
> 
> This commit wires the ice driver into the netdev queue management ops
> framework, reusing the existing per-queue disable/reconfig/enable paths
> originally added for XSk. With it, ice reportedly passes the io_uring
> zcrx selftests.
> 
>> diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
>> index 1add82d894bb..4e0b8895c303 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_base.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_base.c
> 
> [ ... ]
> 
>> @@ -1577,3 +1593,81 @@ int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
>>
>>  	return fail;
>>  }
>> +
>> +/**
>> + * ice_qp_ena - Enables a queue pair
>> + * @vsi: VSI of interest
>> + * @q_idx: ring index in array
>> + *
>> + * Returns 0 on success, negative on failure.
>> + */
>> +int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
>> +{
>> +	return __ice_qp_ena(vsi, q_idx, ice_vsi_cfg_single_rxq(vsi, q_idx));
>> +}
>> +
>> +static int ice_queue_start(struct net_device *dev,
>> +			   struct netdev_queue_config *qcfg,
>> +			   void *per_queue_mem, int idx)
>> +{
>> +	const struct ice_netdev_priv *priv = netdev_priv(dev);
>> +	struct ice_rx_ring *real = priv->vsi->rx_rings[idx];
>> +	struct ice_rx_ring *new = per_queue_mem;
>> +	struct napi_struct *napi;
>> +	int ret;
>> +
>> +	real->pp = new->pp;
>> +	real->rx_fqes = new->rx_fqes;
>> +	real->hdr_fqes = new->hdr_fqes;
>> +	real->hdr_pp = new->hdr_pp;
>> +
>> +	real->hdr_truesize = new->hdr_truesize;
>> +	real->truesize = new->truesize;
>> +	real->rx_hdr_len = new->rx_hdr_len;
>> +	real->rx_buf_len = new->rx_buf_len;
>> +
>> +	memcpy(&real->xdp_rxq, &new->xdp_rxq, sizeof(new->xdp_rxq));
>> +
>> +	ret = ice_setup_rx_ctx(real);
>> +	if (ret)
>> +		return ret;
> 
> Simon says: I believe that the issue flagged immediately below is
>             a false positive; the cleanup logic unwinds things correctly.

Yup, our Sashiko also had concerns about this piece, but I rechecked and
it does seem to be a FP.

> 
> 	    However, the two other issues flagged further down do seem to be
> 	    of concern.
> 
> ice_queue_start() eagerly copies new->pp, new->rx_fqes, new->hdr_pp,
> new->hdr_fqes and the entire new->xdp_rxq into real before ice_setup_rx_ctx()
> has succeeded.

[...]

> 
>> +
>> +	napi = &real->q_vector->napi;
>> +
>> +	page_pool_enable_direct_recycling(real->pp, napi);
>> +	if (real->hdr_pp)
>> +		page_pool_enable_direct_recycling(real->hdr_pp, napi);
>> +
>> +	ret = ice_alloc_rx_bufs(real, ICE_DESC_UNUSED(real));
>> +
>> +	return __ice_qp_ena(priv->vsi, idx, ret);
> 
> ice_alloc_rx_bufs() is declared bool and returns !!cleaned_count, so it
> yields 0 on full success and 1 on partial/failed allocation:
> 
> bool ice_alloc_rx_bufs(struct ice_rx_ring *rx_ring, unsigned int cleaned_count)
> {
>     ...
>     return !!cleaned_count;
> }
> 
> That bool is then passed straight into __ice_qp_ena()'s fail parameter,
> which only overwrites fail via 'if (!fail) fail = err;', so a value of 1
> propagates unchanged to ice_queue_start()'s return value.
> 
> netdev_rx_queue_reconfig() treats any non-zero return as failure and
> propagates it back to userspace (netlink, ethtool, io_uring zcrx install),
> where the callers expect a negative errno.
> 
> Before this patch, ice_qp_ena() obtained its fail value from
> ice_vsi_cfg_single_rxq(), which returns a signed errno.
> 
> Should this be converted to a proper negative errno before being handed
> to __ice_qp_ena(), for example:
> 
> 	ret = ice_alloc_rx_bufs(real, ICE_DESC_UNUSED(real)) ? -ENOMEM : 0;
> 
>> +}
>> +
>> +static int ice_queue_stop(struct net_device *dev, void *per_queue_mem,
>> +			  int idx)
>> +{
>> +	const struct ice_netdev_priv *priv = netdev_priv(dev);
>> +	struct ice_rx_ring *real = priv->vsi->rx_rings[idx];
>> +	int ret;
>> +
>> +	ret = __ice_qp_dis(priv->vsi, idx);
>> +	if (ret)
>> +		return ret;
>> +
>> +	page_pool_disable_direct_recycling(real->pp);
>> +	if (real->hdr_pp)
>> +		page_pool_disable_direct_recycling(real->hdr_pp);
>> +
>> +	ice_zero_rx_ring(real);
>> +	memcpy(per_queue_mem, real, sizeof(*real));
>> +
>> +	return 0;
>> +}
> 
> Can this leak inflight page pool buffers?
> 
> ice_zero_rx_ring(real) is called before the memcpy to per_queue_mem, and
> ice_zero_rx_ring() resets both indices:
> 
> void ice_zero_rx_ring(struct ice_rx_ring *rx_ring)
> {
>     ...
>     rx_ring->next_to_clean = 0;
>     rx_ring->next_to_use = 0;
> }
> 
> So per_queue_mem captures a ring where next_to_clean == next_to_use == 0.
> 
> The core then invokes ndo_queue_mem_free(dev, old_mem), and the recycle
> loop in ice_queue_mem_free() is guarded by exactly those two indices:
> 
> void ice_queue_mem_free(struct net_device *dev, void *per_queue_mem)
> {
>     ...
>     for (u32 i = rx_ring->next_to_clean; i != rx_ring->next_to_use; ) {
>         libeth_rx_recycle_slow(rx_ring->rx_fqes[i].netmem);
> 
>         if (rx_ring->hdr_pp)
>             libeth_rx_recycle_slow(rx_ring->hdr_fqes[i].netmem);
> 
>         if (unlikely(++i == rx_ring->count))
>             i = 0;
>     }
>     ...
>     ice_rxq_pp_destroy(rx_ring);
> }
> 
> With 0 != 0 false on entry, the loop never runs, so any buffers that were
> in rx_fqes[old_ntc..old_ntu) (and hdr_fqes[] when header split is on) are
> never returned via libeth_rx_recycle_slow().  ice_rxq_pp_destroy() then
> kvfree()s the fqes arrays and calls page_pool_destroy() with outstanding
> inflight references that are no longer tracked anywhere.
> 
> The new ice_clean_rx_ring() wrapper keeps the original ordering:
> 
> void ice_clean_rx_ring(struct ice_rx_ring *rx_ring)
> {
>     ice_queue_mem_free(rx_ring->netdev, rx_ring);
>     ice_zero_rx_ring(rx_ring);
> }
> 
> Should ice_queue_stop() follow the same ordering and perform the memcpy
> before ice_zero_rx_ring(real), so the recycle loop in ice_queue_mem_free()
> sees the real ntc/ntu values?
> 
>> +
>> +const struct netdev_queue_mgmt_ops ice_queue_mgmt_ops = {
>> +	.ndo_queue_mem_alloc	= ice_queue_mem_alloc,
>> +	.ndo_queue_mem_free	= ice_queue_mem_free,
>> +	.ndo_queue_mem_size	= sizeof(struct ice_rx_ring),
>> +	.ndo_queue_start	= ice_queue_start,
>> +	.ndo_queue_stop		= ice_queue_stop,
>> +};

Those two are new to me, I'll double check.

Thanks,
Olek

