From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Maarten Lankhorst <dev@lankhorst.se>
Cc: linux-pci@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
	 "intel-xe@lists.freedesktop.org"
	<intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH] PCI: Fix resizable bar fails due to bridge memory region
Date: Fri, 24 Apr 2026 21:40:34 +0300 (EEST)
Message-ID: <bdcc18c0-0d75-fcfc-5ab4-3b42bdcbe7e9@linux.intel.com>
In-Reply-To: <eddf0632-81c5-49cb-b3a8-ef172d3ccb1c@lankhorst.se>

On Fri, 24 Apr 2026, Maarten Lankhorst wrote:
> On 2026-04-24 at 19:42, Ilpo Järvinen wrote:
> > On Fri, 24 Apr 2026, Maarten Lankhorst wrote:
> >> On 2026-04-24 at 17:04, Ilpo Järvinen wrote:
> >>> On Fri, 24 Apr 2026, Maarten Lankhorst wrote:
> >>>
> >>>> I encountered a problem that I have on my system, where I cannot resize
> >>>> the bar because one of the bridges has a 
> >>>
> >>> You're missing words here. But I can guess you have that extra BAR on 
> >>> the in-card bridge.
> >>>
> >>>> If I take a look at the topology, the GPU shares the memory region with a bridge,
> >>>>
> >>>> +-[0000:64]-+-00.0-[65-68]----00.0-[66-68]--+-01.0-[67]----00.0
> >>>>
> >>>> The specific bridge likely causing a failure is:
> >>>>
> >>>> 65:00.0 PCI bridge: Intel Corporation Device e2ff (rev 01) (prog-if 00 [Normal decode])
> >>>>         Flags: bus master, fast devsel, latency 0, IRQ 32, IOMMU group 1
> >>>>         Memory at 382400000000 (64-bit, prefetchable) [size=8M]
> >>>>         ....
> >>>>
> >>>> Which causes upstream bridge 64:00.0 initially to allocate the region
> >>>> [382fe0000000-382ff0000000) for the GPU, and [382ff0000000..382ff07fffff]
> >>>> for the bridge device.
> >>>>
> >>>> Bridge 64 is big enough for 1 BMG with a 32 GB bar and the second 8 MB allocation:
> >>>> pci_bus 0000:64: resource 9 [mem 0x382000000000-0x382fffffffff window] (64GB window)
> >>>>
> >>>> The reason for failure is that bridge 65 has an 8 MB memory region assigned,
> >>>
> >>>> and previously it was ignored when reallocating.
> >>>
> >>> What does this mean? I don't think it was ever ignored while 
> >>> reallocating??
> >>>
> >>> Note that the kernel has become much more verbose in explaining why things 
> >>> fail, so perhaps the added messages are confusing you (they won't appear in 
> >>> the old kernels).
> >>>
> >>>> Failing case:
> >>>> xe 0000:67:00.0: [drm] Attempting to resize bar from 256MiB -> 16384MiB
> >>>> xe 0000:67:00.0: BAR 2 [mem 0x382fe0000000-0x382fefffffff 64bit pref]: releasing
> >>>> pcieport 0000:66:01.0: bridge window [mem 0x382fe0000000-0x382fefffffff 64bit pref]: releasing
> >>>> pcieport 0000:65:00.0: bridge window [mem 0x382fe0000000-0x382fefffffff 64bit pref]: releasing
> >>>> pcieport 0000:64:00.0: bridge window [mem 0x382fe0000000-0x382ff07fffff 64bit pref]: was not released (still contains assigned resources)
> >>>> pcieport 0000:65:00.0: bridge window [mem size 0x400000000 64bit pref]: can't assign; no space
> >>>> pcieport 0000:65:00.0: bridge window [mem size 0x400000000 64bit pref]: failed to assign
> >>>> pcieport 0000:65:00.0: bridge window [mem size 0x400000000 64bit pref]: can't assign; no space
> >>>> pcieport 0000:65:00.0: bridge window [mem size 0x400000000 64bit pref]: failed to assign
> >>>> pcieport 0000:66:01.0: bridge window [mem size 0x400000000 64bit pref]: can't assign; no space
> >>>> pcieport 0000:66:01.0: bridge window [mem size 0x400000000 64bit pref]: failed to assign
> >>>> pcieport 0000:66:01.0: bridge window [mem size 0x400000000 64bit pref]: can't assign; no space
> >>>> pcieport 0000:66:01.0: bridge window [mem size 0x400000000 64bit pref]: failed to assign
> >>>> xe 0000:67:00.0: BAR 2 [mem size 0x400000000 64bit pref]: can't assign; no space
> >>>> xe 0000:67:00.0: BAR 2 [mem size 0x400000000 64bit pref]: failed to assign
> >>>> xe 0000:67:00.0: BAR 2 [mem size 0x400000000 64bit pref]: can't assign; no space
> >>>> xe 0000:67:00.0: BAR 2 [mem size 0x400000000 64bit pref]: failed to assign
> >>>> pcieport 0000:64:00.0: PCI bridge to [bus 65-68]
> >>>> pcieport 0000:64:00.0:   bridge window [mem 0xd7000000-0xd83fffff]
> >>>> pcieport 0000:64:00.0:   bridge window [mem 0x382fe0000000-0x382ff07fffff 64bit pref]
> >>>> pcieport 0000:65:00.0: PCI bridge to [bus 66-68]
> >>>> pcieport 0000:65:00.0:   bridge window [mem 0xd7000000-0xd83fffff]
> >>>> pcieport 0000:65:00.0:   bridge window [mem 0x382fe0000000-0x382fefffffff 64bit pref]
> >>>> pcieport 0000:66:01.0: PCI bridge to [bus 67]
> >>>> pcieport 0000:66:01.0:   bridge window [mem 0xd7000000-0xd81fffff]
> >>>> pcieport 0000:66:01.0:   bridge window [mem 0x382fe0000000-0x382fefffffff 64bit pref]
> >>>>
> >>>> Working with the patch below:
> >>>> xe 0000:67:00.0: [drm] Attempting to resize bar from 256MiB -> 16384MiB
> >>>> xe 0000:67:00.0: BAR 2 [mem 0x382fe0000000-0x382fefffffff 64bit pref]: releasing
> >>>> pcieport 0000:66:01.0: bridge window [mem 0x382fe0000000-0x382fefffffff 64bit pref]: releasing
> >>>> pcieport 0000:65:00.0: bridge window [mem 0x382fe0000000-0x382fefffffff 64bit pref]: releasing
> >>>> pcieport 0000:65:00.0: BAR 0 [mem 0x382ff0000000-0x382ff07fffff 64bit pref]: releasing
> >>>> pcieport 0000:64:00.0: bridge window [mem 0x382fe0000000-0x382ff07fffff 64bit pref]: releasing
> >>>> pcieport 0000:64:00.0: bridge window [mem 0x382000000000-0x3824007fffff 64bit pref]: assigned
> >>>> pcieport 0000:65:00.0: bridge window [mem 0x382000000000-0x3823ffffffff 64bit pref]: assigned
> >>>> pcieport 0000:65:00.0: BAR 0 [mem 0x382400000000-0x3824007fffff 64bit pref]: assigned
> >>>> pcieport 0000:66:01.0: bridge window [mem 0x382000000000-0x3823ffffffff 64bit pref]: assigned
> >>>> xe 0000:67:00.0: BAR 2 [mem 0x382000000000-0x3823ffffffff 64bit pref]: assigned
> >>>> pcieport 0000:64:00.0: PCI bridge to [bus 65-68]
> >>>> pcieport 0000:64:00.0:   bridge window [mem 0xd7000000-0xd83fffff]
> >>>> pcieport 0000:64:00.0:   bridge window [mem 0x382000000000-0x3824007fffff 64bit pref]
> >>>> pcieport 0000:65:00.0: PCI bridge to [bus 66-68]
> >>>> pcieport 0000:65:00.0:   bridge window [mem 0xd7000000-0xd83fffff]
> >>>> pcieport 0000:65:00.0:   bridge window [mem 0x382000000000-0x3823ffffffff 64bit pref]
> >>>> pcieport 0000:65:00.0: PCI bridge to [bus 66-68]
> >>>> pcieport 0000:65:00.0:   bridge window [mem 0xd7000000-0xd83fffff]
> >>>> pcieport 0000:65:00.0:   bridge window [mem 0x382000000000-0x3823ffffffff 64bit pref]
> >>>> pcieport 0000:66:01.0: PCI bridge to [bus 67]
> >>>> pcieport 0000:66:01.0:   bridge window [mem 0xd7000000-0xd81fffff]
> >>>> pcieport 0000:66:01.0:   bridge window [mem 0x382000000000-0x3823ffffffff 64bit pref]
> >>>> xe 0000:67:00.0: [drm] BAR2 resized to 16384MiB
> >>>>
> >>>>
> >>>> Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
> >>>> ---
> >>>> I'm not 100% sure this is the correct fix, I don't know why the bridge itself has
> >>>> a memory region, why the kernel allocates it and when it's supposed to
> >>>> be used. Not a PCI expert. :-)
> >>>
> >>> I don't know why it is there either. Nothing in the portdrv really uses it 
> >>> for anything. There is a patchset lying around somewhere which adds a quirk 
> >>> that releases the "extra" BAR.
> >>
> >> Thank you, I've taken a look at that patch series. It looks like patch 1/2
> >> would help a lot.
> > 
> > I should one day find time to finish that series; Bjorn didn't like how 
> > the code didn't "disable" the BAR.
> > 
> > I did some (unsent) work towards placing the BARs outside the bridge 
> > window, but my algorithm was quite basic: it just found the min & max window
> > addresses and put the BAR outside of that range. Then I thought maybe I 
> > should add the possibility of placing the BAR into the gap in the middle, to
> > allow placement of a large BAR in cases where 32-bit MMIO blocks the low end and 
> > the 64-bit MMIO range blocks the top of the address space.
> > 
> >> I'm wondering if it will completely fix the issue, it is still possible
> >> that in the same config I have 2 identical cards, where resizing might
> >> work, but when GPU1's VRAM BAR is already bound, it might be unable
> >> to resize GPU2's VRAM BAR for the same reason as this patch.
> > 
> > It is always possible to find cases where a sibling pins the shared bridge 
> > window in place.
> > 
> >> What would be the best way to handle this case?
> > 
> > The best way is to size them into the largest possible sizes right from 
> > the beginning. But when doing that, nothing must break (introducing 
> > regressions is not okay), so the kernel must be able to gracefully fall back 
> > to smaller sizes when there isn't enough space available for the largest 
> > size.
> > 
> > What makes it complicated is that sizing and assignment are done very much 
> > separately, so retrying with a smaller size is complicated. I'm working 
> > toward this kind of solution but there are various things that have to be 
> > fixed first.
> 
> How feasible would it be to add a quirk to resize the bar (or reserve enough
> space for max) very early before any allocations are done?

I don't think that is viable.

Resizing is way more complicated than touching a single BAR because of how 
the bridge windows work and how resource sizes interact.
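
To illustrate with the numbers from the logs quoted above, here is a
minimal Python sketch (not kernel code; the helper names `size` and `fits`
are made up for illustration) of why a pinned upstream window blocks the
resize. It only compares sizes and ignores alignment, which the real
allocator must also honor.

```python
MiB = 1 << 20
GiB = 1 << 30

def size(start, end):
    """Size in bytes of an inclusive [start, end] resource range."""
    return end - start + 1

def fits(window, requests):
    """True if the requested sizes could fit in the window by size alone."""
    return sum(requests) <= size(*window)

# Values taken from the logs quoted above.
bus64_window  = (0x382000000000, 0x382fffffffff)      # pci_bus 0000:64 resource 9
pinned_window = (0x382fe0000000, 0x382ff07fffff)      # 64:00.0 window, not released
new_bar2      = 0x400000000                           # xe BAR 2 size after resize
bridge_bar0   = size(0x382ff0000000, 0x382ff07fffff)  # 65:00.0 BAR 0

assert size(*bus64_window) == 64 * GiB
assert new_bar2 == 16 * GiB
assert bridge_bar0 == 8 * MiB

# While 65:00.0 BAR 0 stays assigned, the 64:00.0 window stays where it is
# and is far too small for the resized BAR.
print(fits(pinned_window, [new_bar2]))
# Once everything below is released, the bus 64 window has enough room.
print(fits(bus64_window, [new_bar2, bridge_bar0]))
```

The point of the sketch is only that the check has to be made against every
window on the path up to bus 64, not against the BAR alone.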

No solution is allowed to cause regressions, which could easily happen 
_much later_ in the resource allocation (long after the quirk has 
finished). How would you roll back the size changes the quirk caused at 
that point?

What such a quirk would effectively do is make rolling back the size 
changes an intractable problem, so any regression would be unfixable 
except by piling in more and more hacks, which isn't a viable way forward.
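
As a rough sketch of the alternative mentioned earlier in the thread (size
to the largest possible, fall back to smaller sizes), assume a toy model
where a bridge window is just a byte budget shared by sibling BARs. The
function `assign_with_fallback` and its parameters are hypothetical and do
not correspond to any kernel function; alignment and the window hierarchy
are ignored.

```python
MiB = 1 << 20
GiB = 1 << 30

def assign_with_fallback(window_size, fixed_sizes, candidate_sizes):
    """Pick the largest candidate BAR size that fits next to fixed siblings.

    Toy model: size-only check, no alignment. Returns the chosen size in
    bytes, or None if no candidate fits.
    """
    used = sum(fixed_sizes)
    for cand in sorted(candidate_sizes, reverse=True):
        if used + cand <= window_size:
            # The size is chosen only after the fit is known.
            return cand
    # No state was changed along the way, so there is nothing to roll back.
    return None

sizes = [256 * MiB, 1 * GiB, 16 * GiB]

# Plenty of room: the largest size is chosen.
print(assign_with_fallback(64 * GiB, [8 * MiB], sizes) // MiB)  # 16384
# Cramped window: fall back to 256 MiB instead of failing.
print(assign_with_fallback(1 * GiB, [8 * MiB], sizes) // MiB)   # 256
```

In this toy model the size is decided and the fit is checked in one place,
which is what makes the fallback trivial. In the kernel, sizing and
assignment are done separately, and a quirk that changes the size up front
sits even further away from the place where the failure would be noticed.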

> I doubt any user of resizable bar is going to hotplug their GPUs.

eGPUs connected over Thunderbolt are hotpluggable (whether they actually 
get hotplugged in practice, or the eGPU just sits there always connected, 
is another question though).

-- 
 i.

Thread overview: 6+ messages
2026-04-24 14:43 [PATCH] PCI: Fix resizable bar fails due to bridge memory region Maarten Lankhorst
2026-04-24 15:04 ` Ilpo Järvinen
2026-04-24 16:03   ` Maarten Lankhorst
2026-04-24 17:42     ` Ilpo Järvinen
2026-04-24 17:58       ` Maarten Lankhorst
2026-04-24 18:40         ` Ilpo Järvinen [this message]
