From: Cristian Cocos <cristi@ieee.org>
To: "Geramy Loveless" <gloveless@jqluv.com>,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Cc: linux-pci@vger.kernel.org, "Christian König" <christian.koenig@amd.com>
Subject: Re: [PATCH v2] PCI: release empty sibling bridge windows during rebar expansion
Date: Fri, 10 Apr 2026 14:58:50 -0400
Message-ID: <ec0d287c884cbdd5131e1b09147b9b3cd56faf1d.camel@ieee.org>
In-Reply-To: <CAGpo2mdZ6Ge9ZSYK4kKYJ7etBu5JqoVR-_G7jnxfZYhfQxxryA@mail.gmail.com>
In my experience with a 9060XT Thunderbolt eGPU, the current amdgpu
driver is full of bugs, *specifically* in a Thunderbolt eGPU
configuration. I have attempted to document some of these bugs here:
https://pcforum.amd.com/s/question/0D5Pd00001S3Av9KAF/linux-9060xt-egpuoverthunderbolt-bugs-galore
Apologies for posting this here, as most of these may not be relevant
to ReBAR, but an AMD representative may still benefit from this
collection of bug reports.
C
On Fri, 2026-04-10 at 10:53 -0700, Geramy Loveless wrote:
> I'm going to loop in Christian Koenig over at AMD; he has been working
> with me on resolving, or attempting to figure out, what's going on with
> my gfx1201 connected through a TB5 dock to the host.
> I am currently having problems with the GPU basically losing MMIO and
> crashing randomly. I believe this recent patch change helped, but it's
> really hard to say at this point.
> Without this patch, of course, the BAR size would be 256MB and cause
> huge performance problems or feature loss. I am able to load up AI
> models and run workloads at nearly 100% GPU usage; I'm seeing 205W
> power draw out of the maximum 300W. But after sustained load I still
> get a crash.
>
> Maybe you would have an idea as to what is causing that crash, or where
> I should be looking to find the cause?
> Here are some relevant logs. From what I can tell, something is going
> on with MMIO, but the config BAR, as I understand it, is still alive.
> This led me to believe the router might be getting put into suspend
> mode, which wouldn't make sense for a GPU that is active and busy,
> because the PCIe tunnel would be active.
>
> Any advice or tips would be helpful. Thank you for the suggestions; I
> will get started on writing the patch based on those recommendations.
>
> ## SMU Firmware Version
>
> ```
> smu driver if version = 0x0000002e
> smu fw if version = 0x00000032
> smu fw program = 0
> smu fw version = 0x00684b00 (104.75.0)
> ```
>
> Note: Driver interface version (0x2e / 46) does not match firmware
> interface version (0x32 / 50).
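
For anyone cross-checking the numbers above, here is a minimal sketch of
how the raw firmware version word unpacks. It assumes a 16/8/8-bit split,
chosen only because it reproduces the logged pair 0x00684b00 / 104.75.0;
the struct and function names are made up for illustration and are not
amdgpu symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three version fields. */
struct toy_smu_version {
	unsigned int major;
	unsigned int minor;
	unsigned int debug;
};

/*
 * ASSUMPTION: major in bits 31:16, minor in bits 15:8, debug in bits 7:0.
 * This split is inferred from the log line "0x00684b00 (104.75.0)".
 */
static struct toy_smu_version toy_smu_decode(uint32_t raw)
{
	struct toy_smu_version v = {
		.major = (raw >> 16) & 0xffff,
		.minor = (raw >> 8) & 0xff,
		.debug = raw & 0xff,
	};
	return v;
}

/* True when the driver and firmware interface versions differ. */
static bool toy_smu_if_mismatch(uint32_t driver_if, uint32_t fw_if)
{
	return driver_if != fw_if;
}
```

Calling toy_smu_decode(0x00684b00) yields the three fields of the logged
version string, and toy_smu_if_mismatch(0x2e, 0x32) flags the mismatch
called out in the note.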
>
> ## PCI Topology
>
> ```
> 65:00.0 PCI bridge: Intel Barlow Ridge Host 80G (rev 84)
> 66:00.0 PCI bridge: Intel Barlow Ridge Host 80G (rev 84) → NHI
> 66:01.0 PCI bridge: Intel Barlow Ridge Host 80G (rev 84) → empty hotplug port
> 66:02.0 PCI bridge: Intel Barlow Ridge Host 80G (rev 84) → USB
> 66:03.0 PCI bridge: Intel Barlow Ridge Host 80G (rev 84) → dock
> 93:00.0 PCI bridge: Intel Barlow Ridge Hub 80G (rev 85) → dock switch
> 94:00.0 PCI bridge: Intel Barlow Ridge Hub 80G (rev 85) → downstream
> 95:00.0 PCI bridge: AMD Navi 10 XL Upstream Port (rev 24)
> 96:00.0 PCI bridge: AMD Navi 10 XL Downstream Port (rev 24)
> 97:00.0 VGA: AMD [1002:7551] (rev c0) ← GPU
> 97:00.1 Audio: AMD [1002:ab40]
> ```
>
> ## Workload
>
> GPU compute via llama.cpp (ROCm/HIP backend), running
> Qwen3.5-35B-A3B-Q4_K_M.gguf model (20.49 GiB, fully offloaded to
> VRAM). Flash attention enabled, 128K context, 32 threads.
>
> ## Crash Timeline
>
> All timestamps from `dmesg -T`, kernel boot-relative times in
> brackets.
>
> ### GPU initialization (successful)
>
> ```
> [603.644s] GPU probe: IP DISCOVERY 0x1002:0x7551
> [603.653s] Detected IP block: smu_v14_0_0, gfx_v12_0_0
> [603.771s] Detected VRAM RAM=32624M, BAR=32768M, RAM width 256bits GDDR6
> [604.014s] SMU driver IF 0x2e, FW IF 0x32, FW version 104.75.0
> [604.049s] SMU is initialized successfully!
> [604.119s] Runtime PM manually disabled (amdgpu.runpm=0)
> [604.119s] Initialized amdgpu 3.64.0 for 0000:97:00.0
> ```
>
> ### SMU stops responding [T+4238s after init, ~70 minutes]
>
> ```
> [4841.828s] SMU: No response msg_reg: 12 resp_reg: 0
> [4841.828s] [smu_v14_0_2_get_power_profile_mode] Failed to get activity monitor!
> [4849.393s] SMU: No response msg_reg: 12 resp_reg: 0
> [4849.393s] Failed to export SMU metrics table!
> ```
>
> 15 consecutive `SMU: No response` messages logged between [4841s] and
> [4948s], approximately every 7-8 seconds. All with `msg_reg: 12
> resp_reg: 0`. Failed operations include:
> - `smu_v14_0_2_get_power_profile_mode` — Failed to get activity
> monitor
> - `Failed to export SMU metrics table`
> - `Failed to get current clock freq`
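
The offsets and the message spacing in this timeline can be rechecked
from the bracketed timestamps. The following is only a sketch: the
constants are copied from the log excerpts in this report, and the helper
names are hypothetical.

```c
#include <stdio.h>

/* Boot-relative timestamps (seconds) copied from the quoted logs. */
#define T_INIT        604.119	/* "Initialized amdgpu 3.64.0" */
#define T_SMU_FIRST  4841.828	/* first "SMU: No response" */
#define T_PAGE_FAULT 4948.927	/* first gfxhub page fault */
#define T_MES_FAIL   4952.809	/* MES(0) REMOVE_QUEUE timeout */

/* Seconds elapsed between two boot-relative timestamps. */
static double toy_elapsed(double from, double to)
{
	return to - from;
}

/* Mean spacing of `count` messages logged between `first` and `last`. */
static double toy_mean_interval(double first, double last, int count)
{
	return count > 1 ? (last - first) / (count - 1) : 0.0;
}

/* Print each event as an offset from init and from the first SMU failure. */
static void toy_print_timeline(void)
{
	const double events[] = { T_SMU_FIRST, T_PAGE_FAULT, T_MES_FAIL };
	const char *names[] = { "SMU silent", "page faults", "MES failure" };

	for (int i = 0; i < 3; i++)
		printf("%-12s T+%.0fs after init, +%.0fs after first SMU failure\n",
		       names[i], toy_elapsed(T_INIT, events[i]),
		       toy_elapsed(T_SMU_FIRST, events[i]));
}
```

toy_mean_interval(4841.0, 4948.0, 15) gives the average gap between the
15 "No response" messages, using the rounded endpoints quoted above.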
>
> ### Page faults begin [T+4345s after init, ~107s after first SMU failure]
>
> ```
> [4948.927s] [gfxhub] page fault (src_id:0 ring:40 vmid:9 pasid:108)
> Process llama-cli pid 35632
> GCVM_L2_PROTECTION_FAULT_STATUS: 0x00941051
> Faulty UTCL2 client ID: TCP (0x8)
> PERMISSION_FAULTS: 0x5
> WALKER_ERROR: 0x0
> MAPPING_ERROR: 0x0
> RW: 0x1 (write)
> ```
>
> 10 page faults logged at [4948s], all from TCP (Texture Cache Pipe),
> all PERMISSION_FAULTS=0x5, WALKER_ERROR=0x0, MAPPING_ERROR=0x0. 7
> unique faulting addresses:
> - 0x000072ce90828000
> - 0x000072ce90a88000
> - 0x000072ce90a89000
> - 0x000072ce90cde000
> - 0x000072ce90ce1000
> - 0x000072ce90f51000
> - 0x000072ce90f52000
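
As a cross-check of the decoded fields, here is a minimal sketch that
unpacks the raw status word 0x00941051. The bit positions are an
assumption on my part: they were picked because they reproduce every
field printed in the log (PERMISSION_FAULTS=0x5, WALKER_ERROR=0x0,
MAPPING_ERROR=0x0, client ID 0x8, RW=0x1), not taken from a register
specification for this ASIC. All names are hypothetical.

```c
#include <stdint.h>

/* Decoded view of a GCVM_L2_PROTECTION_FAULT_STATUS value (toy struct). */
struct toy_fault_status {
	unsigned int more_faults;
	unsigned int walker_error;
	unsigned int permission_faults;
	unsigned int mapping_error;
	unsigned int client_id;
	unsigned int rw;
};

/* Extract `width` bits of `v` starting at bit `shift`. */
static unsigned int toy_bits(uint32_t v, unsigned int shift, unsigned int width)
{
	return (v >> shift) & ((1u << width) - 1u);
}

static struct toy_fault_status toy_decode_fault(uint32_t status)
{
	/* ASSUMED layout; it matches the logged values for 0x00941051. */
	struct toy_fault_status f = {
		.more_faults       = toy_bits(status, 0, 1),
		.walker_error      = toy_bits(status, 1, 3),
		.permission_faults = toy_bits(status, 4, 4),
		.mapping_error     = toy_bits(status, 8, 1),
		.client_id         = toy_bits(status, 9, 9),
		.rw                = toy_bits(status, 18, 1),
	};
	return f;
}
```

If the assumed layout holds, toy_decode_fault(0x00941051) also shows the
low bit set, which under this layout would mean further faults were
pending when the status was latched.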
>
> ### MES failure and GPU reset [T+4349s]
>
> ```
> [4952.809s] MES(0) failed to respond to msg=REMOVE_QUEUE
> [4952.809s] failed to remove hardware queue from MES, doorbell=0x1806
> [4952.809s] MES might be in unrecoverable state, issue a GPU reset
> [4952.809s] Failed to evict queue 4
> [4952.809s] Failed to evict process queues
> [4952.809s] GPU reset begin!. Source: 3
> ```
>
> ### GPU reset fails
>
> ```
> [4953.121s] Failed to evict queue 4
> [4953.121s] Failed to suspend process pid 28552
> [4953.121s] remove_all_kfd_queues_mes: Failed to remove queue 3 for dev 62536
> ```
>
> 6 MES(1) REMOVE_QUEUE failures, each timing out after ~2.5 seconds:
> ```
> [4955.720s] MES(1) failed to respond to msg=REMOVE_QUEUE → failed to unmap legacy queue
> [4958.283s] MES(1) failed to respond to msg=REMOVE_QUEUE → failed to unmap legacy queue
> [4960.847s] MES(1) failed to respond to msg=REMOVE_QUEUE → failed to unmap legacy queue
> [4963.411s] MES(1) failed to respond to msg=REMOVE_QUEUE → failed to unmap legacy queue
> [4965.976s] MES(1) failed to respond to msg=REMOVE_QUEUE → failed to unmap legacy queue
> [4968.540s] MES(1) failed to respond to msg=REMOVE_QUEUE → failed to unmap legacy queue
> ```
>
> ### PSP suspend fails
>
> ```
> [4971.164s] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x0)
> [4971.164s] Failed to terminate ras ta
> [4971.164s] suspend of IP block <psp> failed -22
> ```
>
> ### Suspend unwind fails — SMU not ready
>
> ```
> [4971.164s] SMU is resuming...
> [4971.164s] SMC is not ready
> [4971.164s] SMC engine is not correctly up!
> [4971.164s] resume of IP block <smu> failed -5
> [4971.164s] amdgpu_device_ip_resume_phase2 failed during unwind: -5
> [4971.164s] GPU pre asic reset failed with err, -22 for drm dev, 0000:97:00.0
> ```
>
> ### MODE1 reset — SMU still dead
>
> ```
> [4971.164s] MODE1 reset
> [4971.164s] GPU mode1 reset
> [4971.164s] GPU smu mode1 reset
> [4972.193s] GPU reset succeeded, trying to resume
> [4972.193s] VRAM is lost due to GPU reset!
> [4972.193s] SMU is resuming...
> [4972.193s] SMC is not ready
> [4972.193s] SMC engine is not correctly up!
> [4972.193s] resume of IP block <smu> failed -5
> [4972.193s] GPU reset end with ret = -5
> ```
>
>
>
>
> On Fri, Apr 10, 2026 at 3:09 AM Ilpo Järvinen
> <ilpo.jarvinen@linux.intel.com> wrote:
> >
> > On Thu, 9 Apr 2026, Geramy Loveless wrote:
> >
> > > When pbus_reassign_bridge_resources() walks up the bridge hierarchy
> > > to expand a window (e.g. for resizable BAR), it refuses to release
> > > any bridge window that has children. This prevents BAR resize on
> > > devices behind multi-port PCIe switches (such as Thunderbolt docks)
> > > where empty sibling downstream ports hold small reservations that
> > > block the parent bridge window from being freed and re-sized.
> > >
> > > Add pci_bus_subtree_empty() to check whether a bus subtree contains
> > > any assigned device BARs, and pci_bus_release_empty_bridges() to
> > > release bridge window resources of empty sibling bridges, saving
> > > them to the rollback list so failures can be properly unwound.
> > >
> > > In pbus_reassign_bridge_resources(), call pci_bus_release_empty_bridges()
> > > before checking res->child, so empty sibling windows are cleared first
> > > and the parent window can then be released and grown.
> > >
> > > Uses PCI bus/device iterators rather than walking the raw resource
> > > tree, which avoids issues with stale sibling pointers after resource
> > > release.
> >
> > This paragraph can be dropped. And it's not exactly correct either, as
> > the pointers are only stale for resource entries that reside outside of
> > the resource tree (after they've been released in a specific way), so if
> > you start from a resource tree entry, you should never encounter a stale
> > pointer.
> >
> > > Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > > Signed-off-by: Geramy Loveless <gloveless@jqluv.com>
> > > ---
> > > drivers/pci/setup-bus.c | 99
> > > ++++++++++++++++++++++++++++++++++++++++-
> > > 1 file changed, 97 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> > > index 4cf120ebe5a..7a182cd7e4d 100644
> > > --- a/drivers/pci/setup-bus.c
> > > +++ b/drivers/pci/setup-bus.c
> > > @@ -2292,6 +2292,94 @@ void
> > > pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
> > > }
> > > EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
> > >
> > > +/*
> > > + * pci_bus_subtree_empty - Check whether a bus subtree has any
> > > assigned
> > > + * non-bridge device resources.
> > > + * @bus: PCI bus to check
> > > + *
> > > + * Returns true if no device on @bus or its descendant buses has
> > > any
> > > + * assigned BARs (bridge window resources are not considered).
> > > + */
> > > +static bool pci_bus_subtree_empty(struct pci_bus *bus)
> > > +{
> > > + struct pci_dev *dev;
> > > +
> > > + list_for_each_entry(dev, &bus->devices, bus_list) {
> > > + struct resource *r;
> > > + unsigned int i;
> > > +
> > > + pci_dev_for_each_resource(dev, r, i) {
> > > + if (i >= PCI_BRIDGE_RESOURCES)
> > > + break;
> > > + if (resource_assigned(r))
> > > + return false;
> > > + }
> > > +
> > > + if (dev->subordinate &&
> > > + !pci_bus_subtree_empty(dev->subordinate))
> > > + return false;
> > > + }
> > > +
> > > + return true;
> > > +}
> > > +
> > > +/*
> > > + * pci_bus_release_empty_bridges - Release bridge window
> > > resources of
> > > + * empty sibling bridges so the parent window can be freed and
> > > re-sized.
> > > + * @bus: PCI bus whose child bridges to scan
> > > + * @b_win: Parent bridge window resource; only children of this
> > > window
> > > + * are released
> > > + * @saved: List to save released resources for rollback
> > > + *
> > > + * For each PCI-to-PCI bridge on @bus whose subtree is empty (no
> > > assigned
> > > + * device BARs), releases bridge window resources that are
> > > children of
> > > + * @b_win, saving them for rollback via @saved.
> > > + *
> > > + * Returns 0 on success, negative errno on failure.
> > > + */
> > > +static int pci_bus_release_empty_bridges(struct pci_bus *bus,
> > > + struct resource *b_win,
> > > + struct list_head *saved)
> > > +{
> > > + struct pci_dev *dev;
> > > +
> > > + list_for_each_entry(dev, &bus->devices, bus_list) {
> > > + struct resource *r;
> > > + unsigned int i;
> > > +
> > > + if (!dev->subordinate)
> > > + continue;
> > > +
> > > + if ((dev->class >> 8) != PCI_CLASS_BRIDGE_PCI)
> > > + continue;
> >
> > I suppose the dev->subordinate check is enough for what we're doing, so
> > this looks redundant.
> >
> > > +
> > > + if (!pci_bus_subtree_empty(dev->subordinate))
> > > + continue;
> > > +
> > > + pci_dev_for_each_resource(dev, r, i) {
> > > + int ret;
> > > +
> > > + if (!pci_resource_is_bridge_win(i))
> > > + continue;
> > > +
> > > + if (!resource_assigned(r))
> > > + continue;
> > > +
> > > + if (r->parent != b_win)
> > > + continue;
> > > +
> > > + ret = pci_dev_res_add_to_list(saved, dev,
> > > r, 0, 0);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + release_child_resources(r);
> >
> > Unfortunately you cannot call this low-level function because it
> > recursively frees child resources which means you won't be able to
> > rollback them as they were not added to the saved list.
> >
> > I think the release algorithm should basically do this:
> >
> > - Recurse to the subordinate buses
> > - Loop through bridge window resources of this bus
> > - Skip resources that are not assigned or are not parented
> > by b_win
> > - If the resource still has children, leave the resource alone
> > (+ log it for easier troubleshooting of these cases; any failure
> > will also cascade to upstream so it may be possible to
> > shortcut something but it will also make the algorithm more
> > complicated)
> > - Save and free the resource
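
The steps sketched above can be modeled in plain userspace C. This is a
toy under loud assumptions, not kernel code: struct toy_res and struct
toy_bridge are hypothetical stand-ins (one window per bridge, a simple
child counter instead of a real resource tree), and none of the names
below exist in drivers/pci. It only illustrates the ordering: recurse
first, skip windows not parented by b_win, leave windows that still have
children, and save each window before freeing it so it can be rolled back.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_SUB   4
#define TOY_MAX_SAVED 16

/* Toy stand-in for a bridge window; NOT the kernel's struct resource. */
struct toy_res {
	bool assigned;
	struct toy_res *parent;	/* window this one is nested in */
	int nr_children;	/* assigned resources nested inside */
};

/* Toy bridge: a single window plus the bridges on its subordinate bus. */
struct toy_bridge {
	const char *name;
	struct toy_res win;
	struct toy_bridge *sub[TOY_MAX_SUB];
	int nr_sub;
};

/* Rollback list: remembers each released window and its old parent. */
struct toy_saved {
	struct toy_res *res[TOY_MAX_SAVED];
	struct toy_res *old_parent[TOY_MAX_SAVED];
	int n;
};

static int toy_release_empty(struct toy_bridge **bridges, int nr,
			     struct toy_res *b_win, struct toy_saved *saved)
{
	for (int i = 0; i < nr; i++) {
		struct toy_bridge *br = bridges[i];
		int ret;

		/* Recurse first so leaf windows go before their parents. */
		ret = toy_release_empty(br->sub, br->nr_sub, &br->win, saved);
		if (ret)
			return ret;

		/* Skip windows that are unassigned or not parented by b_win. */
		if (!br->win.assigned || br->win.parent != b_win)
			continue;

		/* Still has children: leave it alone, but log it. */
		if (br->win.nr_children) {
			fprintf(stderr, "%s: window kept, %d child(ren)\n",
				br->name, br->win.nr_children);
			continue;
		}

		/* Save for rollback, then free. */
		if (saved->n == TOY_MAX_SAVED)
			return -1;
		saved->res[saved->n] = &br->win;
		saved->old_parent[saved->n] = b_win;
		saved->n++;
		br->win.assigned = false;
		br->win.parent = NULL;
		b_win->nr_children--;
	}
	return 0;
}

/* Demo: root window holds A (empty, nests empty A1) and B (one device BAR). */
static int toy_demo(void)
{
	struct toy_res root = { .assigned = true, .nr_children = 2 };
	struct toy_bridge a1 = { .name = "A1", .win = { true, NULL, 0 } };
	struct toy_bridge a = { .name = "A", .win = { true, &root, 1 },
				.sub = { &a1 }, .nr_sub = 1 };
	struct toy_bridge b = { .name = "B", .win = { true, &root, 1 } };
	struct toy_bridge *bus[] = { &a, &b };
	struct toy_saved saved = { .n = 0 };

	a1.win.parent = &a.win;
	if (toy_release_empty(bus, 2, &root, &saved))
		return -1;
	return saved.n;	/* number of windows saved and released */
}
```

In the demo topology, A1 is released before A because of the recursion,
while B is kept and logged because a device BAR still sits inside its
window. Every released window is on the saved list, which is the property
that calling release_child_resources() directly would lose.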
> >
> > It might be better to move some of the code from
> > pbus_reassign_bridge_resources() here as there's overlap with the
> > sketched algorithm (but I'm not sure until I see the updated version,
> > so keep this in mind).
> >
> > Doing pci_bus_subtree_empty() before any removal is fine with me, but I
> > see it as just an optimization.
> >
> > > + pci_release_resource(dev, i);
> > > + }
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > > /*
> > > * Walk to the root bus, find the bridge window relevant for
> > > @res and
> > > * release it when possible. If the bridge window contains
> > > assigned
> > > @@ -2316,7 +2404,14 @@ static int
> > > pbus_reassign_bridge_resources(struct pci_bus *bus, struct
> > > resource *
> > >
> > > i = pci_resource_num(bridge, res);
> > >
> > > - /* Ignore BARs which are still in use */
> >
> > I don't know why you removed this comment (I admit though "BARs" could
> > have been worded better as it's bridge windows we're dealing with here).
> >
> > > + /* Release empty sibling bridge windows first */
> > > + if (bridge->subordinate) {
> > > + ret = pci_bus_release_empty_bridges(
> > > + bridge->subordinate, res,
> > > saved);
> >
> > First arg fits to the previous line.
> >
> > Align the second line to (.
> >
> > But consider also rearranging code as I mentioned above.
> >
> > > + if (ret)
> > > + return ret;
> >
> > Consider proceeding with the resize even if something failed, as there
> > are cases where the bridge windows are large enough (admittedly, you
> > seem to only bail out in case of alloc error).
> >
> > In the same vein, there seems to be one existing goto restore (that was
> > added by me), which could also probably do continue instead (but
> > changing it would be worth another patch).
> >
> > > + }
> > > +
> > > if (!res->child) {
> > > ret = pci_dev_res_add_to_list(saved,
> > > bridge, res, 0, 0);
> > > if (ret)
> > > @@ -2327,7 +2422,7 @@ static int
> > > pbus_reassign_bridge_resources(struct pci_bus *bus, struct
> > > resource *
> > > const char *res_name =
> > > pci_resource_name(bridge, i);
> > >
> > > pci_warn(bridge,
> > > - "%s %pR: was not released (still
> > > contains assigned resources)\n",
> > > + "%s %pR: not released, active
> > > children present\n",
> > > res_name, res);
> > > }
> > >
> > >
> >
> > --
> > i.