From: "Jiawen Wu" <jiawenwu@trustnetic.com>
To: "'Paolo Abeni'" <pabeni@redhat.com>, <netdev@vger.kernel.org>
Cc: "'Mengyuan Lou'" <mengyuanlou@net-swift.com>,
"'Andrew Lunn'" <andrew+netdev@lunn.ch>,
"'David S. Miller'" <davem@davemloft.net>,
"'Eric Dumazet'" <edumazet@google.com>,
"'Jakub Kicinski'" <kuba@kernel.org>,
"'Richard Cochran'" <richardcochran@gmail.com>,
"'Russell King'" <linux@armlinux.org.uk>,
"'Simon Horman'" <horms@kernel.org>,
"'Kees Cook'" <kees@kernel.org>,
"'Larysa Zaremba'" <larysa.zaremba@intel.com>,
"'Breno Leitao'" <leitao@debian.org>,
"'Joe Damato'" <joe@dama.to>,
"'Jacob Keller'" <jacob.e.keller@intel.com>,
"'Fabio Baltieri'" <fabio.baltieri@gmail.com>
Subject: RE: [PATCH net-next v1 2/5] net: wangxun: add Tx timeout process
Date: Thu, 30 Apr 2026 16:33:55 +0800
Message-ID: <040901dcd87c$155a0e40$400e2ac0$@trustnetic.com>
In-Reply-To: <fb036ffe-b2da-4e9a-a3f4-0f0e467b9e25@redhat.com>
On Thu, April 30, 2026 4:24 PM, Paolo Abeni wrote:
> On 4/28/26 4:11 AM, Jiawen Wu wrote:
> > +static void wx_reset_subtask(struct wx *wx)
> > +{
> > + if (!test_bit(WX_FLAG_NEED_PF_RESET, wx->flags))
> > + return;
> > +
> > + rtnl_lock();
> > +
> > + if (!netif_running(wx->netdev) ||
> > + test_bit(WX_STATE_RESETTING, wx->state))
> > + return;
>
> Sashiko says:
>
> Does this early return path leak the rtnl_lock?
> If the interface is brought down concurrently while a reset is scheduled,
> it appears this would return without calling rtnl_unlock(). Since all
> network configuration operations require the RTNL lock, could this lead
> to a system-wide deadlock in the networking subsystem?
Thanks for your review.
Unfortunately, I had just sent the v2 patch set, which addresses Sashiko's comments...
I'll prepare v3 patches according to your comments below.
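
For reference, the locking concern above can be illustrated with a small userspace model. This is only a sketch, not the code from any posted revision: `rtnl_depth`, `struct wx_model` and `wx_reset_subtask_model()` are hypothetical stand-ins for the RTNL lock, `struct wx` and `wx_reset_subtask()`. The point it shows is that every exit taken after `rtnl_lock()` goes through a single unlock label.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel helpers. All names here are
 * hypothetical models, not the real RTNL or libwx implementations.
 */
static int rtnl_depth;	/* > 0 while the modelled RTNL lock is held */

static void rtnl_lock(void)
{
	rtnl_depth++;
}

static void rtnl_unlock(void)
{
	assert(rtnl_depth > 0);	/* catch an unlock without a matching lock */
	rtnl_depth--;
}

struct wx_model {
	bool need_pf_reset;	/* stands in for WX_FLAG_NEED_PF_RESET */
	bool running;		/* stands in for netif_running() */
	bool resetting;		/* stands in for WX_STATE_RESETTING */
	int resets_done;	/* counts calls to the modelled do_reset */
};

static void wx_reset_subtask_model(struct wx_model *wx)
{
	/* The lock is not held yet, so a plain return is fine here. */
	if (!wx->need_pf_reset)
		return;

	rtnl_lock();

	/* Down or already resetting: bail out, but release the lock first. */
	if (!wx->running || wx->resetting)
		goto unlock;

	/* Models test_and_clear_bit() followed by the do_reset callback. */
	if (wx->need_pf_reset) {
		wx->need_pf_reset = false;
		wx->resets_done++;
	}

unlock:
	rtnl_unlock();
}
```

Calling `wx_reset_subtask_model()` on an interface that is down leaves `rtnl_depth` at 0, which is the property the early return in v1 does not have.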
>
> > +
> > + wx_warn(wx, "Reset adapter.\n");
> > +
> > + if (test_and_clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags)) {
> > + if (wx->do_reset)
> > + wx->do_reset(wx->netdev);
> > + }
> > +
> > + rtnl_unlock();
> > +}
> > +
> > +/*
> > + * wx_check_tx_hang_subtask - check for hung queues and dropped interrupts
> > + * @wx - pointer to the device wx structure
> > + *
> > + * This function serves two purposes. First it strobes the interrupt lines
> > + * in order to make certain interrupts are occurring. Secondly it sets the
> > + * bits needed to check for TX hangs. As a result we should immediately
> > + * determine if a hang has occurred.
> > + */
> > +static void wx_check_tx_hang_subtask(struct wx *wx)
> > +{
> > + int i;
> > +
> > + /* If we're down or resetting, just bail */
> > + if (!netif_running(wx->netdev) ||
> > + test_bit(WX_STATE_RESETTING, wx->state))
> > + return;
> > +
> > + /* Force detection of hung controller */
> > + if (netif_carrier_ok(wx->netdev)) {
> > + for (i = 0; i < wx->num_tx_queues; i++)
> > + set_bit(WX_TX_DETECT_HANG, wx->tx_ring[i]->state);
> > + }
> > +}
> > +
> > +void wx_handle_errors_subtask(struct wx *wx)
> > +{
> > + wx_reset_subtask(wx);
> > + wx_check_tx_hang_subtask(wx);
> > +}
> > +EXPORT_SYMBOL(wx_handle_errors_subtask);
> > +
> > +static void wx_tx_timeout_reset(struct wx *wx)
> > +{
> > + if (!netif_running(wx->netdev))
> > + return;
> > +
> > + set_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
> > + wx_warn(wx, "initiating reset due to tx timeout\n");
> > + wx_service_event_schedule(wx);
> > +}
> > +
> > +void wx_tx_timeout(struct net_device *netdev, unsigned int txqueue)
> > +{
> > + struct wx *wx = netdev_priv(netdev);
> > + u32 head, tail;
> > + int i;
> > +
> > + for (i = 0; i < wx->num_tx_queues; i++) {
> > + struct wx_ring *tx_ring = wx->tx_ring[i];
> > +
> > + if (test_bit(WX_TX_DETECT_HANG, tx_ring->state) &&
> > + wx_check_tx_hang(tx_ring))
> > + wx_warn(wx, "Real tx hang detected on queue %d\n", i);
> > +
> > + head = rd32(wx, WX_PX_TR_RP(tx_ring->reg_idx));
> > + tail = rd32(wx, WX_PX_TR_WP(tx_ring->reg_idx));
> > + wx_warn(wx,
> > + "tx ring %d next_to_use is %d, next_to_clean is %d\n",
> > + i, tx_ring->next_to_use,
> > + tx_ring->next_to_clean);
> > + wx_warn(wx, "tx ring %d hw rp is 0x%x, wp is 0x%x\n",
> > + i, head, tail);
> > + }
> > +
> > + wx_tx_timeout_reset(wx);
> > +}
> > +EXPORT_SYMBOL(wx_tx_timeout);
> > +
> > +void wx_handle_tx_hang(struct wx_ring *tx_ring, unsigned int next)
> > +{
> > + struct wx *wx = netdev_priv(tx_ring->netdev);
> > +
> > + wx_warn(wx, "Detected Tx Unit Hang\n"
> > + " Tx Queue <%d>\n"
> > + " TDH, TDT <%x>, <%x>\n"
> > + " next_to_use <%x>\n"
> > + " next_to_clean <%x>\n"
> > + "tx_buffer_info[next_to_clean]\n"
> > + " time_stamp <%lx>\n"
> > + " jiffies <%lx>\n",
>
> It's better to use a single string for the whole message, even if it
> would exceed the 80 chars limit
>
> > + tx_ring->queue_index,
> > + rd32(wx, WX_PX_TR_RP(tx_ring->reg_idx)),
> > + rd32(wx, WX_PX_TR_WP(tx_ring->reg_idx)),
> > + tx_ring->next_to_use, next,
> > + tx_ring->tx_buffer_info[next].time_stamp, jiffies);
> > +
> > + netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index);
> > +
> > + wx_warn(wx, "tx hang detected on queue %d, resetting adapter\n",
> > + tx_ring->queue_index);
>
> Possibly two warn messages for the same cause is a bit too verbose (same
> in wx_tx_timeout()).
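
The two comments above (one unbroken format string, one message per event) can be sketched in userspace as follows. The kernel coding style asks that user-visible strings not be split so they remain greppable. `format_tx_hang_report()` and its parameters are hypothetical, and `snprintf()` stands in for `wx_warn()`; the wording of the message is an illustration, not a proposal for the final text.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: builds the whole hang report from one unbroken
 * format string, so the message can be found with a single grep and is
 * emitted once per event. Returns what snprintf() returns.
 */
static int format_tx_hang_report(char *buf, size_t len, unsigned int queue,
				 unsigned int head, unsigned int tail,
				 unsigned int ntu, unsigned int ntc)
{
	return snprintf(buf, len,
			"Detected Tx Unit Hang: queue %u, TDH 0x%x, TDT 0x%x, next_to_use 0x%x, next_to_clean 0x%x\n",
			queue, head, tail, ntu, ntc);
}
```

The format line exceeds 80 columns on purpose; only the argument list is wrapped.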
>
> > +bool wx_check_tx_hang(struct wx_ring *ring)
> > +{
> > + u32 tx_done_old = ring->tx_stats.tx_done_old;
> > + u32 tx_pending = wx_get_tx_pending(ring);
> > + u32 tx_done = ring->stats.packets;
> > +
> > + clear_bit(WX_TX_DETECT_HANG, ring->state);
>
> It looks like every caller checks WX_TX_DETECT_HANG, it would be
> probably better to use test_and_clear_bit() here, and drop the test from
> the caller.
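
A minimal userspace sketch of this suggestion, under stated assumptions: `test_and_clear_bit_model()` imitates the kernel's `test_and_clear_bit()` with C11 atomics, and `struct wx_ring_model`, `WX_TX_DETECT_HANG_MODEL` and the progress check are simplified, hypothetical stand-ins rather than the logic of the actual patch. It shows the check function consuming the flag itself, so callers no longer need their own `test_bit()`.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's test_and_clear_bit(): atomically
 * clears bit nr and reports whether it was set beforehand.
 */
static bool test_and_clear_bit_model(unsigned int nr, atomic_ulong *addr)
{
	unsigned long mask;

	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	mask = 1UL << nr;

	return (atomic_fetch_and(addr, ~mask) & mask) != 0;
}

#define WX_TX_DETECT_HANG_MODEL 0	/* hypothetical bit number */

struct wx_ring_model {
	atomic_ulong state;
	unsigned int tx_done_old;	/* completions seen at the last check */
	unsigned int tx_done;		/* completions seen now */
	unsigned int tx_pending;	/* descriptors still outstanding */
};

/* Returns true only when detection was armed for this ring AND no
 * progress was made while work is still pending. The flag is consumed
 * here, so the caller does not test it separately.
 */
static bool wx_check_tx_hang_model(struct wx_ring_model *ring)
{
	if (!test_and_clear_bit_model(WX_TX_DETECT_HANG_MODEL, &ring->state))
		return false;

	if (ring->tx_done_old == ring->tx_done && ring->tx_pending)
		return true;

	/* Progress was made: remember it for the next check. */
	ring->tx_done_old = ring->tx_done;
	return false;
}
```

With this shape, a caller such as the loop in wx_tx_timeout() would reduce to a single call per ring; a second call without re-arming the bit returns false.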