public inbox for stable@vger.kernel.org
* [PATCH 0/2] riscv: mm: Define DIRECT_MAP_PHYSMEM_END, fix ZONE_DEVICE
From: Vivian Wang @ 2026-03-09 11:09 UTC
  To: Paul Walmsley, Palmer Dabbelt, Alexandre Ghiti
  Cc: linux-riscv, linux-kernel, sophgo, Vivian Wang, stable, Han Gao

With HSA_AMD_SVM=y, RISC-V runs into the same problem that arm64 once
had [1]: it tries to use a struct page that lies outside of vmemmap.
See the crash log near the end.

On RISC-V, the actual mappable range of physical addresses depends on
the current MMU mode, i.e. satp_mode. Define DIRECT_MAP_PHYSMEM_END to
expose this information to get_free_mem_region().
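
As a rough illustration of the limit in the CONFIG_SPARSEMEM_VMEMMAP
case, the arithmetic from patch 2 can be modelled in plain userspace C.
This is only a sketch: the function name and its parameters are
hypothetical stand-ins for vmemmap_start_pfn, VMEMMAP_SIZE,
sizeof(struct page) and PAGE_SIZE, and the kernel uses a macro rather
than a function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of the CONFIG_SPARSEMEM_VMEMMAP case in patch 2:
 *   (vmemmap_start_pfn + VMEMMAP_SIZE / sizeof(struct page)) * PAGE_SIZE - 1
 * All parameters are plain integers standing in for the kernel's values.
 */
static uint64_t direct_map_physmem_end(uint64_t vmemmap_start_pfn,
				       uint64_t vmemmap_size,
				       uint64_t struct_page_size,
				       uint64_t page_size)
{
	assert(struct_page_size != 0);

	/* Number of struct page slots that fit in the vmemmap window. */
	uint64_t nr_pages = vmemmap_size / struct_page_size;

	/* Last physical byte whose struct page still falls inside vmemmap. */
	return (vmemmap_start_pfn + nr_pages) * page_size - 1;
}
```

For example, with made-up values of vmemmap_start_pfn = 0x80000, a
1 GiB vmemmap window, a 64-byte struct page and 4 KiB pages, this
evaluates to 0x107fffffff. These numbers are purely illustrative and are
not taken from the SG2044.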

See also commit eeb8fdfcf090 ("arm64: Expose the end of the linear map
in PHYSMEM_END") which fixed the same issue on arm64, although the
situation there is much less complicated.

Patch 1 copies a check in vmemmap_populate() over from arm64; I added
it while debugging this problem. Patch 2 is the actual fix.
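
Patch 1 itself is not quoted in this thread, so the following is only a
userspace sketch of the kind of range check it refers to, assuming it
mirrors the arm64 form of warning when the requested range falls outside
the vmemmap window. The function name and parameters are hypothetical;
the kernel version would use WARN_ON() with VMEMMAP_START and
VMEMMAP_END.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the range check: a request to populate [start, end) is bad
 * if any part of it lies outside [vmemmap_start, vmemmap_end).
 */
static bool vmemmap_range_is_bad(uint64_t start, uint64_t end,
				 uint64_t vmemmap_start, uint64_t vmemmap_end)
{
	assert(start <= end);

	return start < vmemmap_start || end > vmemmap_end;
}
```

A check like this does not fix anything by itself; it only turns a
later fault on a bogus struct page address into an earlier, more
readable warning.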

[1] https://lore.kernel.org/all/20240903164532.3874988-1-scott@os.amperecomputing.com

Crash log:

[   19.228335] Oops [#1]
[   19.230607] Modules linked in: amdgpu(+) [ ... many more modules omitted ... ]
[   19.309895] CPU: 2 UID: 0 PID: 844 Comm: (udev-worker) Not tainted 6.19.3-ztest #3 PREEMPTLAZY
[   19.318587] Hardware name: Sophgo SG2044 SRD3-10 (DT)
[   19.323632] epc : __init_single_page+0x16/0x78
[   19.328079]  ra : __init_zone_device_page.constprop.0+0x28/0xd0
[   19.333997] epc : ffffffff802ff1be ra : ffffffff80d10de8 sp : ffff8f8002ef3450
[   19.341210]  gp : ffffffff82290cc8 tp : ffffaf808f526c80 t0 : ffffffff800231a8
[   19.348423]  t1 : 0000100000000000 t2 : 0000000000000002 s0 : ffff8f8002ef3460
[   19.355636]  s1 : 00038d7fe6000000 a0 : 00038d7fe6000000 a1 : 00000fffffa00000
[   19.362848]  a2 : 3000000000000000 a3 : 0000000000000000 a4 : 0000200000000000
[   19.370060]  a5 : 0000000000600000 a6 : ffffffff82305608 a7 : 0000000000000001
[   19.377273]  s2 : 00000fffffa00000 s3 : ffffaf8091258028 s4 : 0000000000000000
[   19.384485]  s5 : 0000100000000000 s6 : 0000000000000001 s7 : 00000fffffa00000
[   19.391697]  s8 : ffffffff81708b78 s9 : ffffffff81708b38 s10: 0000000000000001
[   19.398909]  s11: ffffaf8091258028 t3 : ffffffff822b43c0 t4 : 000000000207ffff
[   19.406121]  t5 : ffffffffffffffff t6 : 0000000000000000
[   19.411425] status: 0000000200000120 badaddr: 00038d7fe6000030 cause: 000000000000000f
[   19.419331] [<ffffffff802ff1be>] __init_single_page+0x16/0x78
[   19.425071] [<ffffffff80d10de8>] __init_zone_device_page.constprop.0+0x28/0xd0
[   19.432285] [<ffffffff80d11140>] memmap_init_zone_device+0x108/0x278
[   19.438632] [<ffffffff803f6a7a>] memremap_pages+0x262/0x6b0
[   19.444201] [<ffffffff803f6ef0>] devm_memremap_pages+0x28/0x78
[   19.450029] [<ffffffff053f9370>] kgd2kfd_init_zone_device+0xe0/0x1e0 [amdgpu]
[   19.483691] [<ffffffff059a6df0>] amdgpu_device_ip_init+0xa68/0xad0 [amdgpu]
[   19.517108] [<ffffffff05120f6e>] amdgpu_device_init+0x1a5e/0x21e0 [amdgpu]
[   19.550497] [<ffffffff05122e50>] amdgpu_driver_load_kms+0x20/0xc8 [amdgpu]
[   19.583900] [<ffffffff05116dd6>] amdgpu_pci_probe+0x236/0x6c8 [amdgpu]
[   19.616947] [<ffffffff807b2c10>] local_pci_probe+0x40/0x98
[   19.622439] [<ffffffff807b393c>] pci_device_probe+0xcc/0x280
[   19.628091] [<ffffffff8093527c>] really_probe+0xa4/0x3c0
[   19.633397] [<ffffffff80935614>] __driver_probe_device+0x7c/0x158
[   19.639483] [<ffffffff809357d8>] driver_probe_device+0x38/0xd0
[   19.645308] [<ffffffff80935a3a>] __driver_attach+0xaa/0x200
[   19.650873] [<ffffffff8093297e>] bus_for_each_dev+0x6e/0xc8
[   19.656443] [<ffffffff80934996>] driver_attach+0x26/0x38
[   19.661750] [<ffffffff809340e4>] bus_add_driver+0x11c/0x248
[   19.667317] [<ffffffff80936d1c>] driver_register+0x54/0x100
[   19.672884] [<ffffffff807b1f4c>] __pci_register_driver+0x54/0x68
[   19.678889] [<ffffffff0430e0a8>] amdgpu_init+0xa0/0xff8 [amdgpu]
[   19.711387] [<ffffffff80013a02>] do_one_initcall+0x62/0x2c8
[   19.716960] [<ffffffff80127ad6>] do_init_module+0x5e/0x260
[   19.722445] [<ffffffff80129d1e>] load_module+0x1bd6/0x2388
[   19.727925] [<ffffffff8012a730>] init_module_from_file+0xc8/0x128
[   19.734012] [<ffffffff8012a9d0>] __riscv_sys_finit_module+0x1e0/0x338
[   19.740447] [<ffffffff80d0d62a>] do_trap_ecall_u+0x102/0x3f8
[   19.746101] [<ffffffff80d1dbb4>] handle_exception+0x154/0x160
[   19.751852] Code: ffd2 0013 0000 1141 e022 e406 0800 8a0d 16fa 1672 (3823) 0205
[   19.759334] ---[ end trace 0000000000000000 ]---

---
Vivian Wang (2):
      riscv: mm: WARN_ON() for bad addresses in vmemmap_populate()
      riscv: mm: Define DIRECT_MAP_PHYSMEM_END

 arch/riscv/include/asm/pgtable.h | 10 ++++++++++
 arch/riscv/mm/init.c             |  2 ++
 2 files changed, 12 insertions(+)
---
base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f
change-id: 20260309-riscv-sparsemem-vmemmap-limits-4734f1f67449

Best regards,
-- 
Vivian "dramforever" Wang



* [PATCH 2/2] riscv: mm: Define DIRECT_MAP_PHYSMEM_END
From: Vivian Wang @ 2026-03-09 11:09 UTC
  To: Paul Walmsley, Palmer Dabbelt, Alexandre Ghiti
  Cc: linux-riscv, linux-kernel, sophgo, stable, Han Gao, Vivian Wang

On RISC-V, the actual mappable range of physical address space depends
on the current MMU mode, i.e. satp_mode (see
Documentation/arch/riscv/vm-layout.rst).

Define the DIRECT_MAP_PHYSMEM_END macro based on the existing virtual
address space layout macros to expose this information to
get_free_mem_region(). Otherwise, it can return a region that cannot be
mapped, which breaks ZONE_DEVICE.

Cc: <stable@vger.kernel.org> # v6.13+
Tested-by: Han Gao <gaohan@iscas.ac.cn> # SG2044
Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
---
 arch/riscv/include/asm/pgtable.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 08d1ca047104..9c92a84e9755 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -93,6 +93,16 @@
  */
 #define vmemmap		((struct page *)VMEMMAP_START - vmemmap_start_pfn)
 
+/* Needed to limit get_free_mem_region() */
+#if defined(CONFIG_FLATMEM)
+#define DIRECT_MAP_PHYSMEM_END (phys_ram_base + KERN_VIRT_SIZE - 1)
+#elif defined(CONFIG_SPARSEMEM_VMEMMAP)
+#define DIRECT_MAP_PHYSMEM_END \
+	((vmemmap_start_pfn + VMEMMAP_SIZE / sizeof(struct page)) * PAGE_SIZE - 1)
+#elif defined(CONFIG_SPARSEMEM)
+/* DIRECT_MAP_PHYSMEM_END is not limited by VA space assignment in this case */
+#endif
+
 #define PCI_IO_SIZE      SZ_16M
 #define PCI_IO_END       VMEMMAP_START
 #define PCI_IO_START     (PCI_IO_END - PCI_IO_SIZE)

-- 
2.53.0



* Re: [PATCH 0/2] riscv: mm: Define DIRECT_MAP_PHYSMEM_END, fix ZONE_DEVICE
From: patchwork-bot+linux-riscv @ 2026-04-03 18:30 UTC
  To: Vivian Wang
  Cc: linux-riscv, pjw, palmer, alex, linux-kernel, sophgo, stable,
	gaohan

Hello:

This series was applied to riscv/linux.git (for-next)
by Paul Walmsley <pjw@kernel.org>:

On Mon, 09 Mar 2026 19:09:36 +0800 you wrote:
> With HSA_AMD_SVM=y, RISC-V runs into the same problem as arm64 at one
> point did [1], where it tries to use a struct page that is outside of
> vmemmap. See log near the end.
> 
> On RISC-V, the actual mappable range of physical addresses is dependent
> on the current MMU mode i.e. satp_mode. Define DIRECT_MAP_PHYSMEM_END to
> expose this information to get_free_mem_region().
> 
> [...]

Here is the summary with links:
  - [1/2] riscv: mm: WARN_ON() for bad addresses in vmemmap_populate()
    https://git.kernel.org/riscv/c/49a5cb2dc86c
  - [2/2] riscv: mm: Define DIRECT_MAP_PHYSMEM_END
    (no matching commit)

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



