From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Benjamin Berman <benjamin.s.berman@gmail.com>
Cc: Andreas Noever <andreas.noever@gmail.com>,
Mika Westerberg <westeri@kernel.org>,
Yehezkel Bernat <YehezkelShB@gmail.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
linux-usb@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] thunderbolt: drop start_poll guard in tb_ring_poll_complete()
Date: Tue, 28 Apr 2026 09:33:25 +0200 [thread overview]
Message-ID: <20260428073325.GO557136@black.igk.intel.com> (raw)
In-Reply-To: <20260428015521.3454006-2-benjamin.s.berman@gmail.com>
Hi,
On Mon, Apr 27, 2026 at 06:55:20PM -0700, Benjamin Berman wrote:
> Under concurrent load on a single NHI with several rings simultaneously
> in NAPI poll (e.g. a Maple Ridge TB4 transit forwarding tbnet traffic
> between two peers), one ring's interrupt enable bit in
> REG_RING_INTERRUPT_BASE can stay cleared. MSI-X stops for that ring,
> NAPI is never rescheduled, but carrier is reported up and no driver
> event fires. The ring stays masked until thunderbolt_net is reloaded.
>
> tb_ring_poll_complete() gated the unmask on @start_poll:
>
> if (ring->start_poll)
> __ring_interrupt_mask(ring, false);
>
> while the ISR path masks unconditionally via __ring_interrupt(). In a
> window where @start_poll is observed as NULL by the unmask path while
> the paired mask persists, the ring is left permanently masked.
>
> Gate on @running instead and add an ioread32() barrier so the posted
> enable reaches the device before the spinlock is dropped.
>
> On NHIs without QUIRK_AUTO_CLEAR_INT a second issue compounds the
> first: stale pending status in REG_RING_NOTIFY_BASE can prevent the
> hardware from re-arming its MSI-X generator when the ring is
> re-enabled. Clear the ring's bit in REG_RING_INT_CLEAR before setting
> the enable bit, mirroring what ring_msix() already does at ISR entry.
>
> Verified on a Maple Ridge 4C transit and two TB3 Titan Ridge endpoints
> running NCCL all-reduce over tb-lo: pre-patch the chain wedges in
> under 1 GB; post-patch a 192 GB run (3000 iterations of a 64 MiB
> all-reduce) completes with mask/unmask counters balanced.
I think this makes sense.
I do have few comments about the code itself. See below.
> Generated-by: Claude Opus 4.7 <claude-opus-4-7@anthropic.com>
> Tested-by: Benjamin Berman <benjamin.s.berman@gmail.com>
> Signed-off-by: Benjamin Berman <benjamin.s.berman@gmail.com>
> ---
> drivers/thunderbolt/nhi.c | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
> index 2bb2e79ca..bba45ec36 100644
> --- a/drivers/thunderbolt/nhi.c
> +++ b/drivers/thunderbolt/nhi.c
> @@ -389,10 +389,24 @@ static void __ring_interrupt_mask(struct tb_ring *ring, bool mask)
> u32 val;
>
> val = ioread32(ring->nhi->iobase + reg);
> - if (mask)
> + if (mask) {
> val &= ~BIT(bit);
> - else
> + } else {
> + if (!(ring->nhi->quirks & QUIRK_AUTO_CLEAR_INT)) {
> + int cbit = ring_interrupt_index(ring) & 31;
> +
> + if (ring->is_tx)
> + iowrite32(BIT(cbit),
> + ring->nhi->iobase +
> + REG_RING_INT_CLEAR);
> + else
> + iowrite32(BIT(cbit),
> + ring->nhi->iobase +
> + REG_RING_INT_CLEAR +
> + 4 * (ring->nhi->hop_count / 32));
> + }
This should be a separate helper function.
ring_interrupt_clear() or so. We actually have function with that name but
it clears with bit too big hammer for this. So I suggest to rework it with
the above code and user here and also in nhi_disable_interrupts().
> val |= BIT(bit);
> + }
> iowrite32(val, ring->nhi->iobase + reg);
> }
>
> @@ -423,8 +437,10 @@ void tb_ring_poll_complete(struct tb_ring *ring)
>
> spin_lock_irqsave(&ring->nhi->lock, flags);
> spin_lock(&ring->lock);
> - if (ring->start_poll)
> + if (ring->running) {
> __ring_interrupt_mask(ring, false);
> + (void)ioread32(ring->nhi->iobase + REG_RING_INTERRUPT_BASE);
Drop the (void) cast but add a comment that this is for posted write.
> + }
> spin_unlock(&ring->lock);
> spin_unlock_irqrestore(&ring->nhi->lock, flags);
> }
> --
> 2.43.0
next prev parent reply other threads:[~2026-04-28 7:33 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-28 1:55 [PATCH 0/2] thunderbolt: fix wedge under sustained tbnet load on AM4 and AM5 Benjamin Berman
2026-04-28 1:55 ` [PATCH 1/2] thunderbolt: drop start_poll guard in tb_ring_poll_complete() Benjamin Berman
2026-04-28 7:33 ` Mika Westerberg [this message]
2026-04-28 1:55 ` [PATCH 2/2] net: thunderbolt: enlarge RX/TX ring and set NAPI weight for sustained load Benjamin Berman
2026-04-28 7:42 ` Mika Westerberg
2026-04-28 12:54 ` Andrew Lunn
2026-04-28 14:19 ` Mika Westerberg
2026-04-28 14:39 ` Andrew Lunn
2026-04-28 17:27 ` Mika Westerberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260428073325.GO557136@black.igk.intel.com \
--to=mika.westerberg@linux.intel.com \
--cc=YehezkelShB@gmail.com \
--cc=andreas.noever@gmail.com \
--cc=andrew+netdev@lunn.ch \
--cc=benjamin.s.berman@gmail.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-usb@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=westeri@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.