From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Ricard Bejarano <ricard@bejarano.io>
Cc: netdev@vger.kernel.org, michael.jamet@intel.com,
YehezkelShB@gmail.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com
Subject: Re: Poor thunderbolt-net interface performance when bridged
Date: Mon, 26 May 2025 12:22:20 +0300
Message-ID: <20250526092220.GO88033@black.fi.intel.com>
In-Reply-To: <5DE64000-782A-492C-A653-7EB758D28283@bejarano.io>
On Mon, May 26, 2025 at 10:50:43AM +0200, Ricard Bejarano wrote:
> Hey, thanks again for looking into this.
No problem.
> Yes, these are 8th generation Intel NUCs with Thunderbolt 3, not 4. And yes, the
> cable I have used so far is Thunderbolt "compatible" not "certified", and it
> doesn't have the lightning logo[1].
>
> I am not convinced, though.
>
> Part I: Thunderbolt 3
> ---------------------
>
> I first ran into this issue a few months ago with a set of 3 12/13th generation
> Intel NUCs, each of which has 2 Thunderbolt 4 ports, directly connected to each
> other so as to form a ring network. When hopping through one of them, bandwidth
> dropped from ~16Gbps to ~5Mbps. Both in routing and bridging. These 3 NUCs are
> in "production" so I didn't want to use them as my test bench. They are rocking
> "Thunderbolt 4 certified" cables with the lightning logo[2].
>
> I could justify running any one of the following disruptive tests if you think
> they would be helpful:
>
> Note: A is connected to B, B to C, and C to A (to form a ring).
I suggest keeping the "test case" as simple as possible.
Simple peer-to-peer, no routing, nothing else. Anything more makes things
hard to debug. Also note that this whole thing is meant to be used
peer-to-peer, not as some full-fledged networking solution.
> 1) Configure A and C to route to each other via B if the A<->C link is down,
> then disconnect A<->C and run iperfs in all directions, like in [4.6].
> If they run at ~16Gbps when hopping via B, then TB3 was (at least part of)
> the problem; otherwise it must be something wrong with the driver.
> I am very confident speed will drop when hopping via B, because this is how I
> first came across this issue. I wanted nodes of the ring to use the other way
> around if the direct path wasn't up, but that wasn't possible due to the huge
> bandwidth drop.
>
> 2) Same as #1 but configure B to bridge both of its Thunderbolt interfaces.
>
> 3) While pulling the A<->C cable for running one of the above, test that cable
> in the 8th gen test bench. This cable is known to run at ~16Gbps when
> connecting A and C via their Thunderbolt 4 ports.
> While very unlikely, if this somehow solves the red->purple bandwidth, then
> we know the current cable was to blame.
>
> These 12/13th gen NUCs are running non-upstream kernels, however, and while I
> can justify playing around a bit with their connections, I can't justify pulling
> them out of production to install upstream kernels and make them our test bench.
>
> Do you think any one of these tests would be helpful?
Let's forget bridges for now, and anything other than this:
Host A <- Thunderbolt Cable -> Host B
> Part II: the cable
> ------------------
>
> You also point to the cable as the likely culprit.
>
> 1) But then, why does iperf between red<->blue[4.6.1] show ~9Gbps both ways, but
> red->blue->purple[4.6.3a] drops to ~5Mbps? If the cable were to blame,
> wouldn't red->blue[4.6.1a] also drop to about the same?
I'm saying there are two things that will for sure limit the maximum
throughput you get:
1. You use non-certified cables, so you are limited to 10 Gb/s per lane
   instead of 20 Gb/s per lane.
2. Your system has a firmware connection manager which does not support
   lane bonding, so instead of 2 x 10 Gb/s = 20 Gb/s you only get
   1 x 10 Gb/s.
It is enough for one of the hosts to have these limitations; they affect
the whole link. So instead of 40 Gb/s with lane bonding you get 10 Gb/s
(there are some limitations on the DMA side, so you would not get the
full 40 Gb/s, but certainly more than what a single 10 Gb/s lane gives
you).
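A rough sketch of the arithmetic above, in Python. The function and
parameter names are made up for illustration and are not part of the
driver or any tool; the result is the nominal link rate only and
ignores the DMA-side limits mentioned above.

```python
def expected_link_rate_gbps(certified_cable: bool, lane_bonding: bool) -> int:
    """Nominal Thunderbolt 3 link rate in Gb/s (illustrative only)."""
    # Non-certified cables are limited to 10 Gb/s per lane instead of 20.
    per_lane = 20 if certified_cable else 10
    # Without lane bonding only one of the two lanes is used.
    lanes = 2 if lane_bonding else 1
    return per_lane * lanes


if __name__ == "__main__":
    # Certified cable with lane bonding on both hosts.
    print(expected_link_rate_gbps(True, True))
    # Non-certified cable, no lane bonding (the case discussed here).
    print(expected_link_rate_gbps(False, False))
```

Since one limited host is enough to degrade the whole link, each
argument should be true only if it holds for both ends.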
> 2) Also, if the problem were the cable's bandwidth in the red->blue direction,
> flipping the cable around should show a similar bandwidth drop in the (now)
> blue->red direction, right?
> I have tested this and it doesn't hold true, iperfs in all directions after
> flipping the cable around gave about the same results as in [4.6], further
> pointing at something else other than the cable itself.
You can check the link speed using the tool I referred to. It may be that
sometimes it manages to negotiate the 20 Gb/s link but sometimes not.
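If the tool is not at hand, the negotiated values can probably also be
read from sysfs. The Python sketch below assumes the attribute names
rx_speed, tx_speed, rx_lanes and tx_lanes under
/sys/bus/thunderbolt/devices/, which is how I recall them from the
kernel ABI documentation; verify them on your kernel before relying on
this. Devices that lack the attributes are simply skipped.

```python
from pathlib import Path

# Assumed attribute names; check Documentation/ABI for your kernel version.
LINK_ATTRS = ("rx_speed", "tx_speed", "rx_lanes", "tx_lanes")


def read_link_attrs(devices_dir="/sys/bus/thunderbolt/devices"):
    """Return {device_name: {attribute: value}} for devices exposing link attributes."""
    result = {}
    root = Path(devices_dir)
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        values = {}
        for attr in LINK_ATTRS:
            path = dev / attr
            if path.is_file():
                try:
                    values[attr] = path.read_text().strip()
                except OSError:
                    # Attribute exists but is not readable; skip it.
                    pass
        if values:
            result[dev.name] = values
    return result


if __name__ == "__main__":
    for name, attrs in read_link_attrs().items():
        print(name, attrs)
```

Running it a few times across cable replugs would show whether the
negotiated speed varies between attempts.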
> I've attached the output of 'tblist -Av'. It shows negotiated speed at 10Gb/s in
> both Rx/Tx, which lines up with the red<->blue iperf bandwidth tests of [4.6.1].
Did you miss the attachment? But anyway, as I suspected, it shows the same.
> How shall we proceed?
Well, if the link is degraded to 10 Gb/s then I'm not sure there is
anything more I can do here.
If that is not the case, e.g. you see that the link is 40 Gb/s but you
still see crappy throughput, then we need to investigate (but keep the
topology as simple as possible). In that case please provide full dmesg
(with thunderbolt.dyndbg=+p) on both sides of the link and I can take a look.
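Before capturing dmesg it may be worth confirming that the parameter
actually made it onto the kernel command line of both hosts. A minimal
Python sketch, assuming the parameter was passed at boot and is
therefore visible in /proc/cmdline; the helper name is made up for
illustration.

```python
def thunderbolt_dyndbg(cmdline: str):
    """Return the value of thunderbolt.dyndbg on a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "thunderbolt.dyndbg" and sep:
            # Strip optional quoting, e.g. thunderbolt.dyndbg="+p".
            return value.strip('"')
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(thunderbolt_dyndbg(f.read()))
    except OSError:
        print("could not read /proc/cmdline")
```

A result of "+p" on both sides means the debug messages should be
present in the logs you send.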