From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Andrew Lunn <andrew@lunn.ch>
Cc: Benjamin Berman <benjamin.s.berman@gmail.com>,
Andreas Noever <andreas.noever@gmail.com>,
Mika Westerberg <westeri@kernel.org>,
Yehezkel Bernat <YehezkelShB@gmail.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
linux-usb@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] net: thunderbolt: enlarge RX/TX ring and set NAPI weight for sustained load
Date: Tue, 28 Apr 2026 16:19:54 +0200
Message-ID: <20260428141954.GT557136@black.igk.intel.com>
In-Reply-To: <e6a249d5-8b11-43cf-89ee-14d436c70cf8@lunn.ch>

On Tue, Apr 28, 2026 at 02:54:58PM +0200, Andrew Lunn wrote:
> On Tue, Apr 28, 2026 at 09:42:53AM +0200, Mika Westerberg wrote:
> > On Mon, Apr 27, 2026 at 06:55:21PM -0700, Benjamin Berman wrote:
> > > The default TBNET_RING_SIZE of 256 and the NAPI_POLL_WEIGHT of 64
> > > implicit in netif_napi_add() are too small for host-to-host Thunderbolt
> > > networking under sustained bulk traffic. Running NCCL all-reduce over
> > > tb-lo on a three-node chain (two TB3 endpoints plus a TB4 Maple Ridge
> > > transit) produces rx_missed_errors at ~1 % of rx_packets on the transit
> > > and ~0.6 % on the endpoints, with rx_packets stalling against a peer's
> > > continuing tx_packets.
> > >
> > > Raise TBNET_RING_SIZE to 2048 (8x) and use netif_napi_add_weight() with
> > > a per-NAPI weight of 256 so tbnet_poll() drains more frames per softirq
> > > invocation. With matching sysctls (net.core.netdev_budget=1024,
> > > net.core.netdev_budget_usecs=8000) rx_missed_errors stays below 0.005 %
> > > over a 192 GB all-reduce workload on the same hardware.
> > >
> > > Generated-by: Claude Opus 4.7 <claude-opus-4-7@anthropic.com>
> > > Tested-by: Benjamin Berman <benjamin.s.berman@gmail.com>
> > > Signed-off-by: Benjamin Berman <benjamin.s.berman@gmail.com>
> >
> > For ring size I don't have any objections. The current ring size 256 is
> > arbitrary and at the time seemed reasonable.
> >
> > For the poll weight there is the comment in netdevice.h:
> >
> > /* Default NAPI poll() weight
> > * Device drivers are strongly advised to not use bigger value
> > */
> > #define NAPI_POLL_WEIGHT 64
> >
> > But if you see improvement using 256 here I'm fine with that unless the
> > network folks advise otherwise.
>
> I just did a quick sample of other drivers which change the NAPI
> weight. Of the 10 I looked at, 9 reduced the weight. Only one
> increased it.

Yeah, I noticed it too. That's why I'm asking for advice :)

> I would like the core netdev people to comment on this, before it is
> accepted.
>
> Questions which come to mind:
>
> Why is the polling not happening frequently enough?
>
> Is it frequently swapping between polling and interrupts?
>
> Is there interrupt coalescing going on, with the coalesce time set too
> high, so that by the time the interrupt fires the ring is full? Can
> you play with ethtool -C?

Thanks!

I'll leave these for Benjamin and Claude AI to answer.

One thing that could have an effect is the interrupt throttling that the
hardware is doing. We have quite a big value there by default. Lowering it
may have an effect as well. I just posted a patch series where one of the
patches makes this configurable in the tbnet driver, so you could apply that
and play with the throttling value:

https://lore.kernel.org/linux-usb/20260428072209.3084930-6-mika.westerberg@linux.intel.com/
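
As a rough illustration of why the throttling value matters (this is not
taken from the driver or from the patch series): if RX interrupts are held
off for longer than it takes the peer to fill the ring, frames are dropped
before NAPI gets a chance to poll. The Python sketch below only does that
arithmetic. The function names, the 1,000,000 packets/s rate and the 500 us
throttle interval are hypothetical values picked for illustration, not
measurements or driver defaults. Only the ring sizes 256 and 2048 come from
the patch under discussion.

```python
def ring_fill_time_us(ring_size: int, packets_per_second: float) -> float:
    """Microseconds until an undrained ring of ring_size descriptors is full."""
    if ring_size <= 0 or packets_per_second <= 0:
        raise ValueError("ring_size and packets_per_second must be positive")
    # Multiply before dividing so round numbers stay exact in floating point.
    return ring_size * 1_000_000 / packets_per_second


def overflows_before_interrupt(ring_size: int,
                               packets_per_second: float,
                               throttle_us: float) -> bool:
    """True if the ring fills before a throttled interrupt can fire."""
    return ring_fill_time_us(ring_size, packets_per_second) < throttle_us


if __name__ == "__main__":
    ASSUMED_PPS = 1_000_000      # hypothetical sustained packet rate
    ASSUMED_THROTTLE_US = 500    # hypothetical interrupt throttle interval

    for ring_size in (256, 2048):  # current and proposed TBNET_RING_SIZE
        fill = ring_fill_time_us(ring_size, ASSUMED_PPS)
        risk = overflows_before_interrupt(ring_size, ASSUMED_PPS,
                                          ASSUMED_THROTTLE_US)
        print(f"ring={ring_size}: fills in {fill:.0f} us, "
              f"overflow before interrupt: {risk}")
```

Under these assumed numbers a larger ring and a lower throttling value both
widen the same margin, so it would be worth measuring them separately before
concluding that the NAPI weight is what helps.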