linux-pci.vger.kernel.org archive mirror
* [REGRESSION] amdgpu fails to load external RX 580 since PCI: Allow relaxed bridge window tail sizing for optional resources
From: rio @ 2025-06-09  2:34 UTC
  To: Ilpo Järvinen
  Cc: Bjorn Helgaas, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, regressions@lists.linux.dev,
	amd-gfx@lists.freedesktop.org

Hello,

I have an external Radeon RX 580 connected to my machine via Thunderbolt, and
since upgrading from 6.14.1 the setup has stopped working. Dmesg shows a warning
from the resource sanity check, followed by a stack trace: https://pastebin.com/njR55rQW.
Relevant snippet:

[   12.134907] amdgpu 0000:06:00.0: BAR 2 [mem 0x6000000000-0x60001fffff 64bit pref]: releasing
[   12.134910] [drm:amdgpu_device_resize_fb_bar [amdgpu]] *ERROR* Problem resizing BAR0 (-16).
[   12.135456] amdgpu 0000:06:00.0: BAR 2 [mem 0x6000000000-0x60001fffff 64bit pref]: assigned
[   12.135524] amdgpu 0000:06:00.0: amdgpu: VRAM: 8192M 0x000000F400000000 - 0x000000F5FFFFFFFF (8192M used)
[   12.135527] amdgpu 0000:06:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[   12.135536] resource: resource sanity check: requesting [mem 0x0000000000000000-0xffffffffffffffff], which spans more than PCI Bus 0000:00 [mem 0x000a0000-0x000bffff window]
[   12.135542] ------------[ cut here ]------------
[   12.135543] WARNING: CPU: 6 PID: 599 at arch/x86/mm/pat/memtype.c:721 memtype_reserve_io+0xfc/0x110
[   12.135551] Modules linked in: ccm amdgpu(+) snd_hda_codec_realtek ...
[   12.135652] CPU: 6 UID: 0 PID: 599 Comm: (udev-worker) Tainted: G S                  6.15.0-13743-g8630c59e9936 #16 PREEMPT(full)  3b462c924b3ffd8156fc3b77bcc8ddbf7257fa57
[   12.135654] Tainted: [S]=CPU_OUT_OF_SPEC
[   12.135655] Hardware name: COPELION INTERNATIONAL INC. ZX Series/ZX Series, BIOS 1.07.08TCOP3 03/27/2020
[   12.135656] RIP: 0010:memtype_reserve_io+0xfc/0x110
[   12.135659] Code: aa fb ff ff b8 f0 ff ff ff eb 88 8b 54 24 04 4c 89 ee 48 89 df e8 04 fe ff ff 85 c0 75 db 8b 54 24 04 41 89 16 e9 69 ff ff ff <0f> 0b e9 4b ff ff ff e8 b8 5c fc 00 0f 1f 84 00 00 00 00 00 90 90

Bisecting the stable branch pointed me to the following commit:

commit 22df32c984be9e9145978acf011642da042a2af3 (HEAD)
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date:   Mon Dec 16 19:56:11 2024 +0200

    PCI: Allow relaxed bridge window tail sizing for optional resources
    
    [ Upstream commit 67f9085596ee55dd27b540ca6088ba0717ee511c ]

I've tested on stable (as of now 8630c59e99363c4b655788fd01134aef9bcd9264), and
the issue persists. Reverting the offending commit via `git revert -n
22df32c984be9e9145978acf011642da042a2af3` allowed amdgpu to load again.
Dmesg: https://pastebin.com/xd76rDsW.

Additional information
   - Distribution: Artix
   - Arch: x86_64
   - Kernel config: https://pastebin.com/DWSERJL5
   - eGPU adapter: https://www.adt.link/product/R43SG-TB3.html
   - Booting with pci=realloc,hpbussize=0x33,hpmmiosize=256M,hpmmioprefsize=1G

I'm reporting here as these are the contacts from the commit message.
Please let me know if there's a more appropriate place for this, as well
as any more information I can provide.

Thanks,
Rio



* Re: [REGRESSION] amdgpu fails to load external RX 580 since PCI: Allow relaxed bridge window tail sizing for optional resources
From: Ilpo Järvinen @ 2025-06-09  9:09 UTC
  To: rio
  Cc: Bjorn Helgaas, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, regressions@lists.linux.dev,
	amd-gfx@lists.freedesktop.org

Hi Rio,

Thanks for the report and I'm sorry about causing this issue. Could you
please check whether the patch below solves the issue?

--
From b94823a193032b5f87114cff9e8edc5c67e4ef40 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen@linux.intel.com>
Date: Mon, 9 Jun 2025 12:05:20 +0300
Subject: [PATCH 1/1] PCI: Relaxed alignment should never increase min_align
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When using relaxed tail alignment for the bridge window,
pbus_size_mem() also tries to minimize min_align, which can under
certain scenarios end up increasing min_align from that found by
calculate_mem_align().

Ensure min_align is not increased by the relaxed tail alignment.

Eventually, it would be better to add calculate_relaxed_head_align(),
similar to calculate_mem_align(), which finds out what alignment can be
used for the head without introducing any gaps into the bridge window,
to give flexibility on the head address too. But that looks like a
relatively complex algorithm, so it requires much more testing than
fixing the immediate problem causing the regression.

Reported-by: Rio <rio@r26.me>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 07c3d021a47e..f90d49cd07da 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1169,6 +1169,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 	resource_size_t children_add_size = 0;
 	resource_size_t children_add_align = 0;
 	resource_size_t add_align = 0;
+	resource_size_t relaxed_align;
 
 	if (!b_res)
 		return -ENOSPC;
@@ -1246,8 +1247,9 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 	if (bus->self && size0 &&
 	    !pbus_upstream_space_available(bus, mask | IORESOURCE_PREFETCH, type,
 					   size0, min_align)) {
-		min_align = 1ULL << (max_order + __ffs(SZ_1M));
-		min_align = max(min_align, win_align);
+		relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
+		relaxed_align = max(relaxed_align, win_align);
+		min_align = min(min_align, relaxed_align);
 		size0 = calculate_memsize(size, min_size, 0, 0, resource_size(b_res), win_align);
 		pci_info(bus->self, "bridge window %pR to %pR requires relaxed alignment rules\n",
 			 b_res, &bus->busn_res);
@@ -1261,8 +1263,9 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 		if (bus->self && size1 &&
 		    !pbus_upstream_space_available(bus, mask | IORESOURCE_PREFETCH, type,
 						   size1, add_align)) {
-			min_align = 1ULL << (max_order + __ffs(SZ_1M));
-			min_align = max(min_align, win_align);
+			relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
+			relaxed_align = max(relaxed_align, win_align);
+			min_align = min(min_align, relaxed_align);
 			size1 = calculate_memsize(size, min_size, add_size, children_add_size,
 						  resource_size(b_res), win_align);
 			pci_info(bus->self,

base-commit: 3719a04a80caf660f899a462cd8f3973bcfa676e
-- 
2.39.5
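
The effect of the change can be sketched outside the kernel. The block
below is only an illustration under stated assumptions, not the kernel
implementation: lowest_set_bit(), relaxed_align_for(), min_align_before()
and min_align_after() are hypothetical userspace helpers, with
lowest_set_bit() standing in for __ffs(). It contrasts the old behaviour,
where the relaxed value overwrites min_align, with the patched behaviour,
where it may only lower it.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1M (1ULL << 20)

/* Hypothetical userspace stand-in for the kernel's __ffs():
 * index of the lowest set bit of a non-zero value. */
static unsigned int lowest_set_bit(uint64_t v)
{
	unsigned int bit = 0;

	assert(v != 0);
	while (!(v & 1)) {
		v >>= 1;
		bit++;
	}
	return bit;
}

static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }
static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }

/* Alignment implied by the largest resource order, never below win_align. */
static uint64_t relaxed_align_for(unsigned int max_order, uint64_t win_align)
{
	unsigned int shift = max_order + lowest_set_bit(SZ_1M);

	assert(shift < 64);	/* keep the shift well defined */
	return max_u64(1ULL << shift, win_align);
}

/* Before the patch: the relaxed value replaces min_align unconditionally,
 * so it can end up larger than what was computed earlier. */
static uint64_t min_align_before(uint64_t min_align, unsigned int max_order,
				 uint64_t win_align)
{
	(void)min_align;
	return relaxed_align_for(max_order, win_align);
}

/* After the patch: the relaxed value may only lower min_align. */
static uint64_t min_align_after(uint64_t min_align, unsigned int max_order,
				uint64_t win_align)
{
	return min_u64(min_align, relaxed_align_for(max_order, win_align));
}
```

With made-up inputs min_align = 256 MiB, max_order = 13 and win_align =
1 MiB, min_align_before() returns 8 GiB (1 << 33), while min_align_after()
keeps the 256 MiB value. These numbers are illustrative only and are not
taken from the reporter's logs.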


* Re: [REGRESSION] amdgpu fails to load external RX 580 since PCI: Allow relaxed bridge window tail sizing for optional resources
From: Rio Liu @ 2025-06-09 14:22 UTC
  To: Ilpo Järvinen
  Cc: Bjorn Helgaas, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, regressions@lists.linux.dev,
	amd-gfx@lists.freedesktop.org

Hello Ilpo,

I've tested the patch and it seems to fix the issue. Thank you!

Rio Liu


* Re: [REGRESSION] amdgpu fails to load external RX 580 since PCI: Allow relaxed bridge window tail sizing for optional resources
From: Ilpo Järvinen @ 2025-06-09 17:07 UTC
  To: Rio Liu
  Cc: Bjorn Helgaas, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, regressions@lists.linux.dev,
	amd-gfx@lists.freedesktop.org

On Mon, 9 Jun 2025, Rio Liu wrote:
> Hello Ilpo,
> 
> I've tested the patch and it seems to fix the issue. Thank you!

Great, thanks for testing.

If you want, you can give your Tested-by tag so I can include it in the
official submission of the fix.

-- 
 i.


* Re: [REGRESSION] amdgpu fails to load external RX 580 since PCI: Allow relaxed bridge window tail sizing for optional resources
From: Rio Liu @ 2025-06-09 17:16 UTC
  To: Ilpo Järvinen
  Cc: Bjorn Helgaas, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, regressions@lists.linux.dev,
	amd-gfx@lists.freedesktop.org

On Monday, June 9th, 2025 at PM 1:07, Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> wrote:

>
>
> Great, thanks for testing.
>
> If you want, you can give your Tested-by tag so I can include it into the
> official submission of the fix.
>
> --
> i.

That sounds good, thanks!
