From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Steve Oswald <stevepeter.oswald@gmail.com>
Cc: linux-pci@vger.kernel.org
Subject: Re: [BUG] Thunderbolt eGPU PCI BARs incorrectly assigned, fails to assign memory
Date: Mon, 1 Sep 2025 16:25:26 +0300 (EEST)
Message-ID: <9254be77-46ea-992f-a1bd-98bea3943520@linux.intel.com>
In-Reply-To: <CAN95MYEaO8QYYL=5cN19nv_qDGuuP5QOD17pD_ed6a7UqFVZ-g@mail.gmail.com>
On Sun, 31 Aug 2025, Steve Oswald wrote:
> Hello,
>
> I’ve encountered an issue with Thunderbolt eGPU (externally connected
> gpu via thunderbolt 4). The change from kernel 6.10.14 to 6.11.0 broke
> the pci memory assignment of the external pcie device. I figured out
> which version broke it by using ubuntu 25.04 and downgrading the
> kernel (https://raw.githubusercontent.com/pimlie/ubuntu-mainline-kernel.sh/master/ubuntu-mainline-kernel.sh).
>
> From the dmesg output, on the broken 6.11.0 I see 'failed to assign'.
> The issue occurs (almost never) on previous kernel version 6.10.14.
> Using pci=realloc did not change the behavior (I can produce the dmesg
> output if necessary).
>
> The issue was tested with 2 egpus (Radeon Instinct MI50 32GB, NVIDIA
> 3080 10GB). Both the amd and the nvidia driver fail to initialize the
> device because they cannot write the pcie messages.
>
> System details:
> - Kernel: Linux 6.10.14-061014-generic (Ubuntu build) > 6.11.0-061100
> - Laptop: TUXEDO InfinityBook Pro 16 - Gen8 with Thunderbolt 4
> - eGPU: Radeon Instinct MI50 32GB, NVIDIA 3080 10GB
>
> Steps to reproduce:
> 1. Boot the system with the eGPU.
> 2. Observe PCI BAR message in `dmesg`.
>
> Logs:
> both kernel messages, lspci can be found here:
> https://gist.github.com/stepeos/cd060c7d66ab195f51ab4d5675b4e4af
> raw files:
> - dmesg_linux_6.11.0.log
> https://gist.githubusercontent.com/stepeos/cd060c7d66ab195f51ab4d5675b4e4af/raw/f9470a06ff929d386c50ec6b5d07e0ff3f053dcf/dmesg_linux_6.11.0.log
> - dmesg_linux_6.10.14.log
> https://gist.githubusercontent.com/stepeos/cd060c7d66ab195f51ab4d5675b4e4af/raw/f9470a06ff929d386c50ec6b5d07e0ff3f053dcf/dmesg_linux_6.10.14.log
>
> If additional info is needed, I'm happy to help.

Hi Steve,

Thanks for the report.

My analysis is that the problem boils down to the lack of this line with 6.11:

	pcieport 0000:00:07.0: resource 15 [mem 0x6000000000-0x601bffffff 64bit pref] released

It means one of the upstream bridge windows could not be released for
resize. The line is printed from pci_reassign_bridge_resources(), which is
likely called from within the pci_resize_resource() call made by amdgpu(?).

The very likely cause is this check:

	/* Ignore BARs which are still in use */
	if (res->child)
		continue;

...which (until very recently) was entirely silent, so there is no warning
whatsoever about the root cause.

What this means is that there is some assigned resource underneath
0000:00:07.0 with 6.11 that wasn't there with 6.10. That is because 6.11
tried harder to get your resources assigned and succeeded here and there,
which pins the bridge window in place, whereas 6.10 failed to assign the
same resource.

Could you provide the contents of /proc/iomem (it's enough to do that for
6.11 for now)?
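
In case it helps when going through that output, here is a rough Python
sketch (not part of any kernel tooling) that lists everything nested under
a given window in /proc/iomem text. It assumes the usual
two-spaces-per-level indentation. The function names and the SAMPLE text
are made up for illustration; only the 0x6000000000-0x601bffffff window is
taken from the log line quoted earlier.

```python
import re

# One /proc/iomem line: "<indent><start>-<end> : <name>", addresses in hex.
LINE_RE = re.compile(r"^(\s*)([0-9a-f]+)-([0-9a-f]+) : (.*)$")

# Hypothetical sample, NOT taken from the reporter's machine.
SAMPLE = """\
6000000000-7fffffffff : PCI Bus 0000:00
  6000000000-601bffffff : PCI Bus 0000:03
    6000000000-600fffffff : PCI Bus 0000:04
      6000000000-600fffffff : 0000:04:00.0
  6020000000-602fffffff : 0000:00:02.0
"""


def parse_iomem(text):
    """Return a list of (depth, start, end, name) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        indent, start, end, name = m.groups()
        # Assumption: nesting depth is encoded as two spaces per level.
        entries.append((len(indent) // 2, int(start, 16), int(end, 16), name))
    return entries


def nested_under(entries, start, end):
    """Return the entries nested below the first window matching [start, end]."""
    for i, (depth, s, e, _name) in enumerate(entries):
        if (s, e) == (start, end):
            children = []
            for child in entries[i + 1:]:
                if child[0] <= depth:
                    break  # left the window's subtree
                children.append(child)
            return children
    return []
```

On a real system one would pass open("/proc/iomem").read() instead of
SAMPLE, as root (otherwise the addresses are shown as zeros). Anything
returned for the bridge window is a candidate for the resource that keeps
the window pinned.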

You could try pci=hpmmioprefsize= on the kernel command line to reserve
more space for the bridge windows. The default is only 2M and these GPUs
need orders of magnitude more (gigabytes). You can check from 6.10 what the
sizes of the BARs on the GPU are, and round their sum up to the next power
of two.
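
As a rough illustration of that arithmetic, a small Python sketch follows.
The helper names and the example BAR sizes are made up (take the real sizes
from lspci -vvv on 6.10), and it assumes the option is given as a pci=
sub-option with a K/M/G suffix, as described in the kernel parameter
documentation. It simply sums all the sizes passed in, so only feed it the
prefetchable BARs.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("size must be positive")
    return 1 << (n - 1).bit_length()


def hpmmioprefsize_arg(bar_sizes):
    """Build a command-line fragment from a list of BAR sizes in bytes."""
    size = next_power_of_two(sum(bar_sizes))
    # Use the largest suffix that divides the size evenly.
    for unit, shift in (("G", 30), ("M", 20), ("K", 10)):
        if size % (1 << shift) == 0:
            return f"pci=hpmmioprefsize={size >> shift}{unit}"
    return f"pci=hpmmioprefsize={size}"


# Hypothetical example: one 16 GiB BAR plus one 2 MiB BAR.
GIB = 1 << 30
MIB = 1 << 20
example = hpmmioprefsize_arg([16 * GIB, 2 * MIB])
```

With the made-up sizes above the sum is slightly over 16 GiB, so the result
is rounded up to 32G.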

I'd also be interested to see why pci=realloc failed to solve this problem,
as it should reconfigure the entire resource tree, so please provide the
logs from a boot with that option as well. Please capture lspci with -vvv.

--
i.