Re: [PATCH net-next v07 1/6] hinic3: Add ethtool queue ops

Netdev List
 help / color / mirror / Atom feed

From: Jakub Kicinski <kuba@kernel.org>
To: gongfan1@huawei.com
Cc: Jakub Kicinski <kuba@kernel.org>,
	zhuyikai1@h-partners.com, netdev@vger.kernel.org,
	davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
	horms@kernel.org, andrew+netdev@lunn.ch, ioana.ciornei@nxp.com,
	mohsin.bashr@gmail.com, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, luosifu@huawei.com,
	guoxin09@huawei.com, zhoushuai28@huawei.com, wulike1@huawei.com,
	shijing34@huawei.com, zhengjiezhen@h-partners.com,
	maxime.chevallier@bootlin.com
Subject: Re: [PATCH net-next v07 1/6] hinic3: Add ethtool queue ops
Date: Tue,  2 Jun 2026 14:38:24 -0700	[thread overview]
Message-ID: <20260602213825.2083877-2-kuba@kernel.org> (raw)
In-Reply-To: <1323cad7ba784efa29a3fd1475b744974e0a0cf5.1779940072.git.zhuyikai1@h-partners.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
hinic3: Add ethtool queue ops

This patch implements .get_ringparam and .set_ringparam for hinic3,
introduces a hinic3_change_channel_settings() helper plus a new
HINIC3_CHANGE_RES_INVALID flag, and refactors hinic3_open_channel() into
a hinic3_prepare_channel()/hinic3_open_channel() pair.

A few observations and questions on the patch as posted.

> hinic3: Add ethtool queue ops

Could the subject line be more accurate? The patch also introduces
hinic3_change_channel_settings(), the HINIC3_CHANGE_RES_INVALID flag with
checks injected into hinic3_link_status_change() and
hinic3_set_interrupt_moder(), splits hinic3_open_channel() into
hinic3_prepare_channel()+hinic3_open_channel(), renames hinic3_uninit_qps()
to hinic3_get_cur_qps(), and adds new hinic3_free_nicio_res() and
hinic3_destroy_num_qps() calls in hinic3_close() that look like a
pre-existing leak fix. Could these be split into separate commits, with
the leak fix carrying a Fixes: tag so it can be backported?

> diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c b/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
> index 90fc16288de9..e594bcbc8153 100644
> --- a/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
> +++ b/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
> @@ -10,6 +10,7 @@
>  #include <linux/etherdevice.h>
>  #include <linux/netdevice.h>
>  #include <linux/ethtool.h>
> +#include <net/devlink.h>

[Low]
Is net/devlink.h actually needed here? The new code uses
NL_SET_ERR_MSG_MOD / NL_SET_ERR_MSG_FMT_MOD (from linux/netlink.h) and
netif_running() / netdev_info(), none of which require devlink.

> @@ -409,6 +410,103 @@ hinic3_get_link_ksettings(struct net_device *netdev,
>  	return 0;
>  }
>  
> +static void hinic3_get_ringparam(struct net_device *netdev,
> +				 struct ethtool_ringparam *ring,
> +				 struct kernel_ethtool_ringparam *kernel_ring,
> +				 struct netlink_ext_ack *extack)
> +{
> +	struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
> +
> +	ring->rx_max_pending = HINIC3_MAX_RX_QUEUE_DEPTH;
> +	ring->tx_max_pending = HINIC3_MAX_TX_QUEUE_DEPTH;
> +	ring->rx_pending = nic_dev->rxqs[0].q_depth;
> +	ring->tx_pending = nic_dev->txqs[0].q_depth;
> +}

[Low]
Should the getter and setter consult the same canonical depth field? The
getter reads nic_dev->rxqs[0].q_depth / nic_dev->txqs[0].q_depth, while
hinic3_set_ringparam() compares against and updates
nic_dev->q_params.sq_depth / rq_depth. They are kept consistent today by
hinic3_alloc_rxqs() and hinic3_update_qp_depth(), but the asymmetry is
fragile to future refactors. Would it be clearer to read q_params here too?

[ ... ]

> diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c b/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
> index e7d6c2033b45..cc43773c1984 100644
> --- a/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
> +++ b/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
> @@ -137,7 +137,8 @@ static int hinic3_set_interrupt_moder(struct net_device *netdev, u16 q_id,
>  	struct hinic3_interrupt_info info = {};
>  	int err;
>  
> -	if (q_id >= nic_dev->q_params.num_qps)
> +	if (test_bit(HINIC3_CHANGE_RES_INVALID, &nic_dev->flags) ||
> +	    q_id >= nic_dev->q_params.num_qps)
>  		return 0;

[High]
Is HINIC3_CHANGE_RES_INVALID being used as an ad-hoc lock to coordinate
readers (this function, hinic3_link_status_change(), hinic3_close()) with
the teardown path in hinic3_change_channel_settings()?

The networking subsystem guidance flags exactly this pattern — bit flags
with set_bit()/test_bit()/clear_bit() guarding code sections rather than
expressing state — and recommends a real lock or RCU.

Concretely for hinic3_link_status_change(), can this race happen?

  CPU0 (auxiliary event)              CPU1 (set_ringparam under RTNL)
  test_bit(...) -> 0
  fall through                        test_and_set_bit(...) was 0, now 1
                                      hinic3_vport_down()
                                        netif_carrier_off()
                                        msleep(100)
  netif_carrier_on()
  nic_dev->link_status_up = true

End state would be carrier reported up while the device is mid-reconfigure
with the old channel torn down. The validation check runs outside any
exclusion region with the writer.

> diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_main.c b/drivers/net/ethernet/huawei/hinic3/hinic3_main.c
> index 0a888fe4c975..8cf605a0a5d2 100644
> --- a/drivers/net/ethernet/huawei/hinic3/hinic3_main.c
> +++ b/drivers/net/ethernet/huawei/hinic3/hinic3_main.c
> @@ -315,6 +315,9 @@ static void hinic3_link_status_change(struct net_device *netdev,
>  {
>  	struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
>  
> +	if (test_bit(HINIC3_CHANGE_RES_INVALID, &nic_dev->flags))
> +		return;
> +

(see comment above on the bit-as-lock pattern.)

> diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_netdev_ops.c b/drivers/net/ethernet/huawei/hinic3/hinic3_netdev_ops.c
> index da73811641a9..b7f9ba8ce43f 100644
> --- a/drivers/net/ethernet/huawei/hinic3/hinic3_netdev_ops.c
> +++ b/drivers/net/ethernet/huawei/hinic3/hinic3_netdev_ops.c

[ ... ]

> @@ -428,6 +439,72 @@ static void hinic3_vport_down(struct net_device *netdev)
>  	}
>  }
>  
> +int
> +hinic3_change_channel_settings(struct net_device *netdev,
> +			       struct hinic3_dyna_txrxq_params *trxq_params)
> +{
> +	struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
> +	struct hinic3_dyna_txrxq_params cur_trxq_params = {};
> +	struct hinic3_dyna_qp_params new_qp_params = {};
> +	struct hinic3_dyna_qp_params cur_qp_params = {};
> +	int err;
> +
> +	cur_trxq_params = nic_dev->q_params;
> +
> +	hinic3_config_num_qps(netdev, trxq_params);
> +
> +	err = hinic3_alloc_channel_resources(netdev, &new_qp_params,
> +					     trxq_params);
> +	if (err) {
> +		netdev_err(netdev, "Failed to alloc channel resources\n");
> +		return err;
> +	}
> +
> +	if (!test_and_set_bit(HINIC3_CHANGE_RES_INVALID, &nic_dev->flags)) {
> +		hinic3_vport_down(netdev);
> +		hinic3_close_channel(netdev);
> +		hinic3_get_cur_qps(nic_dev, &cur_qp_params);
> +	}
> +
> +	hinic3_init_qps(nic_dev, &new_qp_params);
> +
> +	err = hinic3_prepare_channel(netdev, trxq_params);
> +	if (err)
> +		goto err_uninit_qps;
> +
> +	if (nic_dev->num_qp_irq > trxq_params->num_qps)
> +		hinic3_qp_irq_change(netdev, trxq_params->num_qps);
> +
> +	nic_dev->q_params = *trxq_params;
> +
> +	err = hinic3_open_channel(netdev);
> +	if (err)
> +		goto err_qp_irq_reset;
> +
> +	err = hinic3_vport_up(netdev);
> +	if (err)
> +		goto err_close_channel;
> +
> +	hinic3_free_channel_resources(netdev, &cur_qp_params, &cur_trxq_params);
> +
> +	clear_bit(HINIC3_CHANGE_RES_INVALID, &nic_dev->flags);
> +
> +	return 0;
> +
> +err_close_channel:
> +	hinic3_close_channel(netdev);
> +err_qp_irq_reset:
> +	nic_dev->q_params = cur_trxq_params;
> +
> +	if (trxq_params->num_qps > cur_trxq_params.num_qps)
> +		hinic3_qp_irq_change(netdev, cur_trxq_params.num_qps);
> +err_uninit_qps:
> +	hinic3_get_cur_qps(nic_dev, &new_qp_params);
> +	hinic3_free_channel_resources(netdev, &new_qp_params, trxq_params);
> +
> +	return err;
> +}

[High]
Do the error paths leak the OLD QP resources?

After the test_and_set_bit() block runs, hinic3_close_channel() and
hinic3_get_cur_qps(&cur_qp_params) capture the OLD nic_io->sq/rq into
cur_qp_params. hinic3_init_qps(&new_qp_params) then overwrites nic_io with
the NEW pointers.

On the success path, cur_qp_params is freed via:

	hinic3_free_channel_resources(netdev, &cur_qp_params, &cur_trxq_params);

But err_close_channel, err_qp_irq_reset and err_uninit_qps only free
new_qp_params/trxq_params. cur_qp_params is never referenced again on these
paths, so the kzalloc'd sqs/rqs arrays plus the per-QP DMA-coherent WQ pages
created by hinic3_create_qp() appear to leak per failed ringparam change.

[High]
Is HINIC3_CHANGE_RES_INVALID ever cleared on failure?

Only the success path runs:

	clear_bit(HINIC3_CHANGE_RES_INVALID, &nic_dev->flags);

err_close_channel, err_qp_irq_reset and err_uninit_qps return without
clearing. Once stuck high:

  - hinic3_link_status_change() returns immediately on every event, so
    carrier transitions stop propagating;
  - hinic3_set_interrupt_moder() returns 0 without applying changes;
  - hinic3_close() takes the test_and_clear_bit() goto-out branch and
    skips hinic3_vport_down()/hinic3_close_channel()/free_channel_resources
    entirely while still calling hinic3_free_nicio_res() and
    hinic3_destroy_num_qps().

After a failed ethtool -G, would the only recovery be a driver reload or
device reset?

[High]
Does hinic3_free_qp_ctxts() get called when hinic3_open_channel() fails
here?

hinic3_prepare_channel() programs the HW root context via
hinic3_set_root_ctxt() and per-SQ hinic3_set_ci_table(); the matching
teardown is hinic3_free_qp_ctxts() -> hinic3_clean_root_ctxt(). On
hinic3_open_channel() failure this jumps to err_qp_irq_reset ->
err_uninit_qps, which only does hinic3_get_cur_qps() +
hinic3_free_channel_resources(). The latter dma_free's the WQ pages whose
physical addresses the firmware still has.

The pre-patch hinic3_open_channel() had an explicit err_free_qp_ctxts label
covering this case; was the symmetry intentionally dropped?

> @@ -458,6 +535,10 @@ static int hinic3_open(struct net_device *netdev)
>  
>  	hinic3_init_qps(nic_dev, &qp_params);
>  
> +	err = hinic3_prepare_channel(netdev, &nic_dev->q_params);
> +	if (err)
> +		goto err_uninit_qps;
> +
>  	err = hinic3_open_channel(netdev);
>  	if (err)
>  		goto err_uninit_qps;
> @@ -473,7 +554,7 @@ static int hinic3_open(struct net_device *netdev)
>  err_close_channel:
>  	hinic3_close_channel(netdev);
>  err_uninit_qps:
> -	hinic3_uninit_qps(nic_dev, &qp_params);
> +	hinic3_get_cur_qps(nic_dev, &qp_params);
>  	hinic3_free_channel_resources(netdev, &qp_params, &nic_dev->q_params);

[High]
Same observation as above for the open() path: when hinic3_open_channel()
fails after hinic3_prepare_channel() succeeded, control jumps to
err_uninit_qps which does not call hinic3_free_qp_ctxts(), so the HW root
context and per-SQ CI table programming is left in place while the WQ
pages backing the addresses are freed. Should hinic3_free_qp_ctxts() be
called in this path?

> @@ -493,10 +574,18 @@ static int hinic3_close(struct net_device *netdev)
>  		return 0;
>  	}
>  
> +	if (test_and_clear_bit(HINIC3_CHANGE_RES_INVALID, &nic_dev->flags))
> +		goto out;
> +
>  	hinic3_vport_down(netdev);
>  	hinic3_close_channel(netdev);
> -	hinic3_uninit_qps(nic_dev, &qp_params);
> -	hinic3_free_channel_resources(netdev, &qp_params, &nic_dev->q_params);
> +	hinic3_get_cur_qps(nic_dev, &qp_params);
> +	hinic3_free_channel_resources(netdev, &qp_params,
> +				      &nic_dev->q_params);
> +
> +out:
> +	hinic3_free_nicio_res(nic_dev);
> +	hinic3_destroy_num_qps(netdev);
>  
>  	return 0;
>  }

(see the high-severity comments above on HINIC3_CHANGE_RES_INVALID never
being cleared on failure — the goto-out branch here is the path that
silently skips channel teardown when that happens.)

[ ... ]

> diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_rx.c b/drivers/net/ethernet/huawei/hinic3/hinic3_rx.c
> index 309ab5901379..b5b601469517 100644
> --- a/drivers/net/ethernet/huawei/hinic3/hinic3_rx.c
> +++ b/drivers/net/ethernet/huawei/hinic3/hinic3_rx.c
> @@ -541,7 +541,7 @@ int hinic3_configure_rxqs(struct net_device *netdev, u16 num_rq,
>  		rq_associate_cqes(rxq);
>  
>  		pkts = hinic3_rx_fill_buffers(rxq);
> -		if (!pkts) {
> +		if (pkts < rxq->q_depth - 1) {
>  			netdev_err(netdev, "Failed to fill Rx buffer\n");
>  			return -ENOMEM;
>  		}

[Medium]
Could this hunk be called out in the commit message? The success criterion
for hinic3_rx_fill_buffers() changes from "any buffers filled" to "almost
all buffers filled" (q_depth - 1). hinic3_rx_fill_buffers() loops up to
q_depth - 1 entries and breaks on the first allocation failure, so any
short fill — for example a single page allocation failure under transient
memory pressure — now returns -ENOMEM and aborts hinic3_open() or
hinic3_change_channel_settings(). This seems unrelated to "Add ethtool
queue ops"; was the behavior change intentional?
-- 
pw-bot: cr

next prev parent reply	other threads:[~2026-06-02 21:40 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-28  9:29 [PATCH net-next v07 0/6] net: hinic3: PF initialization Fan Gong
2026-05-28  9:29 ` [PATCH net-next v07 1/6] hinic3: Add ethtool queue ops Fan Gong
2026-05-29  5:30   ` Mohsin Bashir
2026-06-02 21:38   ` Jakub Kicinski [this message]
2026-05-28  9:29 ` [PATCH net-next v07 2/6] hinic3: Add ethtool statistic ops Fan Gong
2026-05-28  9:29 ` [PATCH net-next v07 3/6] hinic3: Add ethtool coalesce ops Fan Gong
2026-05-28  9:29 ` [PATCH net-next v07 4/6] hinic3: Add ethtool rss ops Fan Gong
2026-05-28  9:29 ` [PATCH net-next v07 5/6] hinic3: Configure netdev->watchdog_timeo to set nic tx timeout Fan Gong
2026-05-28  9:29 ` [PATCH net-next v07 6/6] hinic3: Remove unneeded coalesce parameters Fan Gong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260602213825.2083877-2-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=gongfan1@huawei.com \
    --cc=guoxin09@huawei.com \
    --cc=horms@kernel.org \
    --cc=ioana.ciornei@nxp.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luosifu@huawei.com \
    --cc=maxime.chevallier@bootlin.com \
    --cc=mohsin.bashr@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=shijing34@huawei.com \
    --cc=wulike1@huawei.com \
    --cc=zhengjiezhen@h-partners.com \
    --cc=zhoushuai28@huawei.com \
    --cc=zhuyikai1@h-partners.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox