From: Jakub Kicinski <kuba@kernel.org>
To: jiawenwu@trustnetic.com
Cc: Jakub Kicinski <kuba@kernel.org>,
netdev@vger.kernel.org, mengyuanlou@net-swift.com,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
pabeni@redhat.com, richardcochran@gmail.com,
linux@armlinux.org.uk, horms@kernel.org,
michal.swiatkowski@linux.intel.com, jacob.e.keller@intel.com,
kees@kernel.org, joe@dama.to, larysa.zaremba@intel.com,
abdun.nihaal@gmail.com, leitao@debian.org
Subject: Re: [PATCH net-next v6 09/11] net: wangxun: add Tx timeout process
Date: Mon, 30 Mar 2026 17:55:45 -0700 [thread overview]
Message-ID: <20260331005546.3459790-1-kuba@kernel.org> (raw)
In-Reply-To: <20260326021406.30444-10-jiawenwu@trustnetic.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: wangxun: add Tx timeout process
Implement .ndo_tx_timeout to handle Tx side timeout event. When Tx
timeout event occur, it will triger driver into reset process.
The WX_HANG_CHECK_ARMED bit is set to indicate a potential hang. It will
be cleared if a pause frame is received to remove false hang detection
due to 802.3 frames, which is implemented in later commit.
> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> new file mode 100644
> index 0000000000000..bb77bdce69d22
> --- /dev/null
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> @@ -0,0 +1,126 @@
[ ... ]
> +static void wx_reset_subtask(struct wx *wx)
> +{
> + if (!test_bit(WX_FLAG_NEED_PF_RESET, wx->flags))
> + return;
> +
> + if (!netif_running(wx->netdev) ||
> + test_bit(WX_STATE_RESETTING, wx->state))
> + return;
> +
> + rtnl_lock();
> +
> + wx_warn(wx, "Reset adapter.\n");
> +
> + if (test_bit(WX_FLAG_NEED_PF_RESET, wx->flags)) {
> + if (wx->do_reset)
> + wx->do_reset(wx->netdev, true);
Can this race with dev_close()?
If the interface is brought down concurrently, dev_close() acquires rtnl_lock(),
clears the running state, and frees resources. When wx_reset_subtask() later
acquires the lock, it does not re-check netif_running().
Could this result in calling do_reset() on a closed interface, leading to
hardware state inconsistencies or use-after-free crashes?
> + clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
Could this unconditionally clear WX_FLAG_NEED_PF_RESET even if another
context sets it during the do_reset() execution? Should this use
test_and_clear_bit() before the reset operation to avoid losing concurrent
reset requests?
> + }
> +
> + rtnl_unlock();
> +}
[ ... ]
> +void wx_tx_timeout(struct net_device *netdev, unsigned int txqueue)
> +{
> + struct wx *wx = netdev_priv(netdev);
> + u32 head, tail;
> + int i;
> +
> + for (i = 0; i < wx->num_tx_queues; i++) {
> + struct wx_ring *tx_ring = wx->tx_ring[i];
> +
> + if (test_bit(WX_TX_DETECT_HANG, tx_ring->state) &&
> + wx_check_tx_hang(tx_ring))
Does calling wx_check_tx_hang() from the dev_watchdog softirq context cause
data races with the NAPI polling context (wx_clean_tx_irq)?
wx_check_tx_hang() modifies ring state bits (WX_HANG_CHECK_ARMED) and updates
the 64-bit statistic tx_done_old. Can this lead to torn reads or writes on
32-bit platforms and silent corruption of the hang detection state?
> + wx_warn(wx, "Real tx hang detected on queue %d\n", i);
> +
[ ... ]
> diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> index baffcadb22690..b8db772c37c5c 100644
> --- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> +++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
[ ... ]
> @@ -370,6 +391,7 @@ static void ngbe_disable_device(struct wx *wx)
> wx_napi_disable_all(wx);
> netif_tx_stop_all_queues(netdev);
> netif_tx_disable(netdev);
> + timer_delete_sync(&wx->service_timer);
Does this code leave the service task running?
timer_delete_sync() is called, but cancel_work_sync(&wx->service_task) is
missing. If the work is already queued, could wx_check_tx_hang_subtask() run
after the transmission and reception rings are freed, resulting in a
use-after-free when accessing wx->tx_ring[i]->state?
> if (wx->gpio_ctrl)
> ngbe_sfp_modules_txrx_powerctl(wx, false);
> wx_irq_disable(wx);
[ ... ]
> @@ -816,6 +845,9 @@ static void ngbe_remove(struct pci_dev *pdev)
> struct wx *wx = pci_get_drvdata(pdev);
> struct net_device *netdev;
>
> + timer_delete_sync(&wx->service_timer);
> + cancel_work_sync(&wx->service_task);
> +
> netdev = wx->netdev;
> wx_disable_sriov(wx);
> unregister_netdev(netdev);
Does this teardown order result in a use-after-free?
cancel_work_sync() is called before unregister_netdev(). Since the network
device is still active, could a concurrent timeout or interrupt re-queue the
service task before unregister_netdev() finishes? This might allow the
re-queued work item to execute and access the wx structure after it has been
freed by devres.
next prev parent reply other threads:[~2026-03-31 0:55 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-26 2:13 [PATCH net-next v6 00/11] Wangxun improvement and new support Jiawen Wu
2026-03-26 2:13 ` [PATCH net-next v6 01/11] net: ngbe: remove netdev->ethtool->wol_enabled setting Jiawen Wu
2026-03-26 2:13 ` [PATCH net-next v6 02/11] net: ngbe: move the WOL functions to libwx Jiawen Wu
2026-03-26 2:13 ` [PATCH net-next v6 03/11] net: ngbe: remove redundant macros Jiawen Wu
2026-03-26 2:13 ` [PATCH net-next v6 04/11] net: ngbe: implement libwx reset ops Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-31 7:19 ` Jiawen Wu
2026-03-31 13:03 ` Andrew Lunn
2026-03-31 23:05 ` Jacob Keller
2026-03-26 2:14 ` [PATCH net-next v6 05/11] net: wangxun: move reusable PCI driver ops functions into libwx Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-26 2:14 ` [PATCH net-next v6 06/11] net: txgbe: add power management support Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-26 2:14 ` [PATCH net-next v6 07/11] net: wangxun: move ethtool_ops.set_channels into libwx Jiawen Wu
2026-03-26 2:14 ` [PATCH net-next v6 08/11] net: wangxun: delete service_timer before cancel service_work Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-26 2:14 ` [PATCH net-next v6 09/11] net: wangxun: add Tx timeout process Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski [this message]
2026-03-26 2:14 ` [PATCH net-next v6 10/11] net: wangxun: improve flow control setting Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-26 2:14 ` [PATCH net-next v6 11/11] net: wangxun: implement pci_error_handlers ops Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-31 0:54 ` [PATCH net-next v6 00/11] Wangxun improvement and new support Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260331005546.3459790-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=abdun.nihaal@gmail.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=jacob.e.keller@intel.com \
--cc=jiawenwu@trustnetic.com \
--cc=joe@dama.to \
--cc=kees@kernel.org \
--cc=larysa.zaremba@intel.com \
--cc=leitao@debian.org \
--cc=linux@armlinux.org.uk \
--cc=mengyuanlou@net-swift.com \
--cc=michal.swiatkowski@linux.intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=richardcochran@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox