From: Jakub Kicinski <kuba@kernel.org>
To: jiawenwu@trustnetic.com
Cc: Jakub Kicinski <kuba@kernel.org>,
netdev@vger.kernel.org, mengyuanlou@net-swift.com,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
pabeni@redhat.com, richardcochran@gmail.com,
linux@armlinux.org.uk, horms@kernel.org,
michal.swiatkowski@linux.intel.com, jacob.e.keller@intel.com,
kees@kernel.org, joe@dama.to, larysa.zaremba@intel.com,
abdun.nihaal@gmail.com, leitao@debian.org
Subject: Re: [PATCH net-next v6 09/11] net: wangxun: add Tx timeout process
Date: Mon, 30 Mar 2026 17:55:45 -0700 [thread overview]
Message-ID: <20260331005546.3459790-1-kuba@kernel.org> (raw)
In-Reply-To: <20260326021406.30444-10-jiawenwu@trustnetic.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: wangxun: add Tx timeout process
Implement .ndo_tx_timeout to handle Tx side timeout event. When Tx
timeout event occur, it will triger driver into reset process.
The WX_HANG_CHECK_ARMED bit is set to indicate a potential hang. It will
be cleared if a pause frame is received to remove false hang detection
due to 802.3 frames, which is implemented in later commit.
> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> new file mode 100644
> index 0000000000000..bb77bdce69d22
> --- /dev/null
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> @@ -0,0 +1,126 @@
[ ... ]
> +static void wx_reset_subtask(struct wx *wx)
> +{
> + if (!test_bit(WX_FLAG_NEED_PF_RESET, wx->flags))
> + return;
> +
> + if (!netif_running(wx->netdev) ||
> + test_bit(WX_STATE_RESETTING, wx->state))
> + return;
> +
> + rtnl_lock();
> +
> + wx_warn(wx, "Reset adapter.\n");
> +
> + if (test_bit(WX_FLAG_NEED_PF_RESET, wx->flags)) {
> + if (wx->do_reset)
> + wx->do_reset(wx->netdev, true);
Can this race with dev_close()?
If the interface is brought down concurrently, dev_close() acquires rtnl_lock(),
clears the running state, and frees resources. When wx_reset_subtask() later
acquires the lock, it does not re-check netif_running().
Could this result in calling do_reset() on a closed interface, leading to
hardware state inconsistencies or use-after-free crashes?
> + clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
Could this unconditionally clear WX_FLAG_NEED_PF_RESET even if another
context sets it during the do_reset() execution? Should this use
test_and_clear_bit() before the reset operation to avoid losing concurrent
reset requests?
> + }
> +
> + rtnl_unlock();
> +}
[ ... ]
> +void wx_tx_timeout(struct net_device *netdev, unsigned int txqueue)
> +{
> + struct wx *wx = netdev_priv(netdev);
> + u32 head, tail;
> + int i;
> +
> + for (i = 0; i < wx->num_tx_queues; i++) {
> + struct wx_ring *tx_ring = wx->tx_ring[i];
> +
> + if (test_bit(WX_TX_DETECT_HANG, tx_ring->state) &&
> + wx_check_tx_hang(tx_ring))
Does calling wx_check_tx_hang() from the dev_watchdog softirq context cause
data races with the NAPI polling context (wx_clean_tx_irq)?
wx_check_tx_hang() modifies ring state bits (WX_HANG_CHECK_ARMED) and updates
the 64-bit statistic tx_done_old. Can this lead to torn reads or writes on
32-bit platforms and silent corruption of the hang detection state?
> + wx_warn(wx, "Real tx hang detected on queue %d\n", i);
> +
[ ... ]
> diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> index baffcadb22690..b8db772c37c5c 100644
> --- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> +++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
[ ... ]
> @@ -370,6 +391,7 @@ static void ngbe_disable_device(struct wx *wx)
> wx_napi_disable_all(wx);
> netif_tx_stop_all_queues(netdev);
> netif_tx_disable(netdev);
> + timer_delete_sync(&wx->service_timer);
Does this code leave the service task running?
timer_delete_sync() is called, but cancel_work_sync(&wx->service_task) is
missing. If the work is already queued, could wx_check_tx_hang_subtask() run
after the transmission and reception rings are freed, resulting in a
use-after-free when accessing wx->tx_ring[i]->state?
> if (wx->gpio_ctrl)
> ngbe_sfp_modules_txrx_powerctl(wx, false);
> wx_irq_disable(wx);
[ ... ]
> @@ -816,6 +845,9 @@ static void ngbe_remove(struct pci_dev *pdev)
> struct wx *wx = pci_get_drvdata(pdev);
> struct net_device *netdev;
>
> + timer_delete_sync(&wx->service_timer);
> + cancel_work_sync(&wx->service_task);
> +
> netdev = wx->netdev;
> wx_disable_sriov(wx);
> unregister_netdev(netdev);
Does this teardown order result in a use-after-free?
cancel_work_sync() is called before unregister_netdev(). Since the network
device is still active, could a concurrent timeout or interrupt re-queue the
service task before unregister_netdev() finishes? This might allow the
re-queued work item to execute and access the wx structure after it has been
freed by devres.
next prev parent reply other threads:[~2026-03-31 0:55 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-26 2:13 [PATCH net-next v6 00/11] Wangxun improvement and new support Jiawen Wu
2026-03-26 2:13 ` [PATCH net-next v6 01/11] net: ngbe: remove netdev->ethtool->wol_enabled setting Jiawen Wu
2026-03-26 2:13 ` [PATCH net-next v6 02/11] net: ngbe: move the WOL functions to libwx Jiawen Wu
2026-03-26 2:13 ` [PATCH net-next v6 03/11] net: ngbe: remove redundant macros Jiawen Wu
2026-03-26 2:13 ` [PATCH net-next v6 04/11] net: ngbe: implement libwx reset ops Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-31 7:19 ` Jiawen Wu
2026-03-31 13:03 ` Andrew Lunn
2026-03-31 23:05 ` Jacob Keller
2026-03-26 2:14 ` [PATCH net-next v6 05/11] net: wangxun: move reusable PCI driver ops functions into libwx Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-26 2:14 ` [PATCH net-next v6 06/11] net: txgbe: add power management support Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-26 2:14 ` [PATCH net-next v6 07/11] net: wangxun: move ethtool_ops.set_channels into libwx Jiawen Wu
2026-03-26 2:14 ` [PATCH net-next v6 08/11] net: wangxun: delete service_timer before cancel service_work Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-26 2:14 ` [PATCH net-next v6 09/11] net: wangxun: add Tx timeout process Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski [this message]
2026-03-26 2:14 ` [PATCH net-next v6 10/11] net: wangxun: improve flow control setting Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-03-26 2:14 ` [PATCH net-next v6 11/11] net: wangxun: implement pci_error_handlers ops Jiawen Wu
2026-03-31 0:55 ` Jakub Kicinski
2026-04-21 14:33 ` Lukas Wunner
2026-03-31 0:54 ` [PATCH net-next v6 00/11] Wangxun improvement and new support Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260331005546.3459790-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=abdun.nihaal@gmail.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=jacob.e.keller@intel.com \
--cc=jiawenwu@trustnetic.com \
--cc=joe@dama.to \
--cc=kees@kernel.org \
--cc=larysa.zaremba@intel.com \
--cc=leitao@debian.org \
--cc=linux@armlinux.org.uk \
--cc=mengyuanlou@net-swift.com \
--cc=michal.swiatkowski@linux.intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=richardcochran@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.