* running mm/ksft_hmm.sh on arm64 results in a kernel panic
@ 2026-03-18 5:26 Zenghui Yu
2026-03-18 15:05 ` Lorenzo Stoakes (Oracle)
0 siblings, 1 reply; 4+ messages in thread
From: Zenghui Yu @ 2026-03-18 5:26 UTC (permalink / raw)
To: linux-mm, linux-kernel
Cc: jgg, leon, akpm, david, ljs, Liam.Howlett, vbabka, rppt, surenb,
mhocko
Hi all,
When running mm/ksft_hmm.sh in my arm64 virtual machine, I ran into the
following kernel panic:
[root@localhost mm]# ./ksft_hmm.sh
TAP version 13
# --------------------------------
# running bash ./test_hmm.sh smoke
# --------------------------------
# Running smoke test. Note, this test provides basic coverage.
# TAP version 13
# 1..74
# # Starting 74 tests from 4 test cases.
# # RUN hmm.hmm_device_private.benchmark_thp_migration ...
#
# HMM THP Migration Benchmark
# ---------------------------
# System page size: 16384 bytes
#
# === Small Buffer (512KB) (0.5 MB) ===
# | With THP | Without THP | Improvement
# ---------------------------------------------------------------------
# Sys->Dev Migration | 0.423 ms | 0.182 ms | -133.0%
# Dev->Sys Migration | 0.027 ms | 0.025 ms | -7.0%
# S->D Throughput | 1.15 GB/s | 2.69 GB/s | -57.1%
# D->S Throughput | 18.12 GB/s | 19.38 GB/s | -6.5%
#
# === Half THP Size (1MB) (1.0 MB) ===
# | With THP | Without THP | Improvement
# ---------------------------------------------------------------------
# Sys->Dev Migration | 0.367 ms | 1.187 ms | 69.0%
# Dev->Sys Migration | 0.048 ms | 0.049 ms | 2.2%
# S->D Throughput | 2.66 GB/s | 0.82 GB/s | 222.9%
# D->S Throughput | 20.53 GB/s | 20.08 GB/s | 2.3%
#
# === Single THP Size (2MB) (2.0 MB) ===
# | With THP | Without THP | Improvement
# ---------------------------------------------------------------------
# Sys->Dev Migration | 0.817 ms | 0.782 ms | -4.4%
# Dev->Sys Migration | 0.089 ms | 0.096 ms | 7.1%
# S->D Throughput | 2.39 GB/s | 2.50 GB/s | -4.2%
# D->S Throughput | 22.00 GB/s | 20.44 GB/s | 7.6%
#
# === Two THP Size (4MB) (4.0 MB) ===
# | With THP | Without THP | Improvement
# ---------------------------------------------------------------------
# Sys->Dev Migration | 3.419 ms | 2.337 ms | -46.3%
# Dev->Sys Migration | 0.321 ms | 0.225 ms | -42.6%
# S->D Throughput | 1.14 GB/s | 1.67 GB/s | -31.6%
# D->S Throughput | 12.17 GB/s | 17.36 GB/s | -29.9%
#
# === Four THP Size (8MB) (8.0 MB) ===
# | With THP | Without THP | Improvement
# ---------------------------------------------------------------------
# Sys->Dev Migration | 4.535 ms | 4.563 ms | 0.6%
# Dev->Sys Migration | 0.583 ms | 0.582 ms | -0.2%
# S->D Throughput | 1.72 GB/s | 1.71 GB/s | 0.6%
# D->S Throughput | 13.39 GB/s | 13.43 GB/s | -0.2%
#
# === Eight THP Size (16MB) (16.0 MB) ===
# | With THP | Without THP | Improvement
# ---------------------------------------------------------------------
# Sys->Dev Migration | 10.190 ms | 9.805 ms | -3.9%
# Dev->Sys Migration | 1.130 ms | 1.195 ms | 5.5%
# S->D Throughput | 1.53 GB/s | 1.59 GB/s | -3.8%
# D->S Throughput | 13.83 GB/s | 13.07 GB/s | 5.8%
#
# === One twenty eight THP Size (256MB) (256.0 MB) ===
# | With THP | Without THP | Improvement
# ---------------------------------------------------------------------
# Sys->Dev Migration | 80.464 ms | 92.764 ms | 13.3%
# Dev->Sys Migration | 9.528 ms | 18.166 ms | 47.6%
# S->D Throughput | 3.11 GB/s | 2.70 GB/s | 15.3%
# D->S Throughput | 26.24 GB/s | 13.76 GB/s | 90.7%
# # OK hmm.hmm_device_private.benchmark_thp_migration
# ok 1 hmm.hmm_device_private.benchmark_thp_migration
# # RUN hmm.hmm_device_private.migrate_anon_huge_zero_err ...
# # hmm-tests.c:2622:migrate_anon_huge_zero_err:Expected ret (-2) == 0 (0)
[ 154.077143] Unable to handle kernel paging request at virtual address 0000000000005268
[ 154.077179] Mem abort info:
[ 154.077203] ESR = 0x0000000096000007
[ 154.077219] EC = 0x25: DABT (current EL), IL = 32 bits
[ 154.078433] SET = 0, FnV = 0
[ 154.078434] EA = 0, S1PTW = 0
[ 154.078435] FSC = 0x07: level 3 translation fault
[ 154.078435] Data abort info:
[ 154.078436] ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
[ 154.078459] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 154.078479] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 154.078484] user pgtable: 16k pages, 47-bit VAs, pgdp=000000010b920000
[ 154.078487] [0000000000005268] pgd=0800000101b4c403, p4d=0800000101b4c403, pud=0800000101b4c403, pmd=0800000108cd8403, pte=0000000000000000
[ 154.078520] Internal error: Oops: 0000000096000007 [#1] SMP
[ 154.098664] Modules linked in: test_hmm rfkill drm fuse backlight ipv6
[ 154.100468] CPU: 7 UID: 0 PID: 1357 Comm: hmm-tests Kdump: loaded Not tainted 7.0.0-rc4-00029-ga989fde763f4-dirty #260 PREEMPT
[ 154.103855] Hardware name: QEMU QEMU Virtual Machine, BIOS edk2-stable202408-prebuilt.qemu.org 08/13/2024
[ 154.104409] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 154.104847] pc : dmirror_devmem_fault+0xe4/0x1c0 [test_hmm]
[ 154.105758] lr : dmirror_devmem_fault+0xcc/0x1c0 [test_hmm]
[ 154.109465] sp : ffffc000855ab430
[ 154.109677] x29: ffffc000855ab430 x28: ffff8000c9f73e40 x27: ffff8000c9f73e40
[ 154.110091] x26: ffff8000cb920000 x25: ffffc000812e0000 x24: 0000000000000000
[ 154.110540] x23: ffff8000c9f73e40 x22: 0000000000000000 x21: 0000000000000008
[ 154.110888] x20: ffff8000c07e1980 x19: ffffc000855ab618 x18: ffffc000855abc40
[ 154.111223] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 154.111563] x14: 0000000000000000 x13: 0000000000000000 x12: ffffc00080fedd68
[ 154.111903] x11: 00007fffa3bf7fff x10: 0000000000000000 x9 : 1ffff00019166a41
[ 154.112244] x8 : ffff8000c132df20 x7 : 0000000000000000 x6 : ffff8000c53bfe88
[ 154.112581] x5 : 0000000000000009 x4 : ffffc000855ab3d0 x3 : 0000000000000004
[ 154.112921] x2 : 0000000000000004 x1 : ffff8000c132df18 x0 : 0000000000005200
[ 154.113254] Call trace:
[ 154.113370] dmirror_devmem_fault+0xe4/0x1c0 [test_hmm] (P)
[ 154.113679] do_swap_page+0x132c/0x17b0
[ 154.113912] __handle_mm_fault+0x7e4/0x1af4
[ 154.114124] handle_mm_fault+0xb4/0x294
[ 154.114398] __get_user_pages+0x210/0xbfc
[ 154.114607] get_dump_page+0xd8/0x144
[ 154.114795] dump_user_range+0x70/0x2e8
[ 154.115020] elf_core_dump+0xb64/0xe40
[ 154.115212] vfs_coredump+0xfb4/0x1ce8
[ 154.115397] get_signal+0x6cc/0x844
[ 154.115582] arch_do_signal_or_restart+0x7c/0x33c
[ 154.115805] exit_to_user_mode_loop+0x104/0x16c
[ 154.116030] el0_svc+0x174/0x178
[ 154.116216] el0t_64_sync_handler+0xa0/0xe4
[ 154.116414] el0t_64_sync+0x198/0x19c
[ 154.116594] Code: d2800083 f9400280 f9003be0 2a0303e2 (b9406800)
[ 154.116891] ---[ end trace 0000000000000000 ]---
[ 158.741771] Kernel panic - not syncing: Oops: Fatal exception
[ 158.742164] SMP: stopping secondary CPUs
[ 158.742970] Kernel Offset: disabled
[ 158.743162] CPU features: 0x0000000,00060005,11210501,94067723
[ 158.743440] Memory Limit: none
[ 164.002089] Starting crashdump kernel...
[ 164.002867] Bye!
[root@localhost linux]# ./scripts/faddr2line lib/test_hmm.ko dmirror_devmem_fault+0xe4/0x1c0
dmirror_devmem_fault+0xe4/0x1c0:
dmirror_select_device at /root/code/linux/lib/test_hmm.c:153
(inlined by) dmirror_devmem_fault at /root/code/linux/lib/test_hmm.c:1659
The kernel is built with arm64's virt.config plus
+CONFIG_ARM64_16K_PAGES=y
+CONFIG_ZONE_DEVICE=y
+CONFIG_DEVICE_PRIVATE=y
+CONFIG_TEST_HMM=m
I *guess* the problem is that migrate_anon_huge_zero_err() has chosen an
incorrect THP size (which should be 32M on a system with a 16k page size),
leading the first hmm_migrate_sys_to_dev() to fail. The test program then
received a SIGABRT and initiated vfs_coredump(), and something in the
test_hmm module doesn't play well with the coredump process, which ends
up in a panic. I'm not familiar with that code.
Note that I can also reproduce the panic by aborting the test manually
with the following diff (and skipping migrate_anon_huge{,_zero}_err()):
diff --git a/tools/testing/selftests/mm/hmm-tests.c
b/tools/testing/selftests/mm/hmm-tests.c
index e8328c89d855..8d8ea8063a73 100644
--- a/tools/testing/selftests/mm/hmm-tests.c
+++ b/tools/testing/selftests/mm/hmm-tests.c
@@ -1027,6 +1027,8 @@ TEST_F(hmm, migrate)
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
+ ASSERT_TRUE(0);
+
/* Check what the device read. */
for (i = 0, ptr = buffer->mirror; i < size / sizeof(*ptr); ++i)
ASSERT_EQ(ptr[i], i);
Please have a look!
Thanks,
Zenghui
* Re: running mm/ksft_hmm.sh on arm64 results in a kernel panic
2026-03-18 5:26 running mm/ksft_hmm.sh on arm64 results in a kernel panic Zenghui Yu
@ 2026-03-18 15:05 ` Lorenzo Stoakes (Oracle)
2026-03-19 1:49 ` Alistair Popple
0 siblings, 1 reply; 4+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-18 15:05 UTC (permalink / raw)
To: Zenghui Yu
Cc: linux-mm, linux-kernel, jgg, leon, akpm, david, Liam.Howlett,
vbabka, rppt, surenb, mhocko
On Wed, Mar 18, 2026 at 01:26:39PM +0800, Zenghui Yu wrote:
> Hi all,
>
> When running mm/ksft_hmm.sh in my arm64 virtual machine, I ran into the
> following kernel panic:
>
> [...]
> [ 164.002089] Starting crashdump kernel...
> [ 164.002867] Bye!
That 'Bye!' is delightful :)
>
> [root@localhost linux]# ./scripts/faddr2line lib/test_hmm.ko
> dmirror_devmem_fault+0xe4/0x1c0
> dmirror_devmem_fault+0xe4/0x1c0:
> dmirror_select_device at /root/code/linux/lib/test_hmm.c:153
> (inlined by) dmirror_devmem_fault at /root/code/linux/lib/test_hmm.c:1659
>
> The kernel is built with arm64's virt.config plus
>
> +CONFIG_ARM64_16K_PAGES=y
> +CONFIG_ZONE_DEVICE=y
> +CONFIG_DEVICE_PRIVATE=y
> +CONFIG_TEST_HMM=m
>
> I *guess* the problem is that migrate_anon_huge_zero_err() has chosen an
> incorrect THP size (which should be 32M in a system with 16k page size),
Yeah, it hardcodes it to 2MB:
TEST_F(hmm, migrate_anon_huge_zero_err)
{
...
size = TWOMEG;
}
Which obviously isn't correct and needs to be fixed.
We should read /sys/kernel/mm/transparent_hugepage/hpage_pmd_size instead.
vm_util.h has read_pmd_pagesize(), so this can be fixed with:
size = read_pmd_pagesize();
We then madvise(..., MADV_HUGEPAGE) a region of this size, which is now too small:
TEST_F(hmm, migrate_anon_huge_zero_err)
{
...
size = TWOMEG;
...
ret = madvise(map, size, MADV_HUGEPAGE);
ASSERT_EQ(ret, 0); <-- but should succeed anyway, just won't do anything
...
ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_FLAGS, buffer,
HMM_DMIRROR_FLAG_FAIL_ALLOC);
}
Then we switch into lib/test_hmm.c:
static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
struct dmirror *dmirror)
{
...
for (addr = args->start; addr < args->end; ) {
...
if (dmirror->flags & HMM_DMIRROR_FLAG_FAIL_ALLOC) {
dmirror->flags &= ~HMM_DMIRROR_FLAG_FAIL_ALLOC;
dpage = NULL; <-- force failure for 1st page
...
if (!dpage) {
...
if (!is_large) <-- not large, as no THP was faulted in
goto next;
...
next:
src++;
dst++;
addr += PAGE_SIZE;
}
}
Back to the hmm-tests.c selftest:
TEST_F(hmm, migrate_anon_huge_zero_err)
{
...
ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_FLAGS, buffer,
HMM_DMIRROR_FLAG_FAIL_ALLOC);
ASSERT_EQ(ret, 0); <-- succeeds but...
ASSERT_EQ(buffer->cpages, npages); <-- cpages = npages - 1.
}
So then we try to tear down, which invokes:
FIXTURE_TEARDOWN(hmm)
{
int ret = close(self->fd); <-- triggers kernel dmirror_fops_release()
...
}
In the kernel:
static int dmirror_fops_release(struct inode *inode, struct file *filp)
{
struct dmirror *dmirror = filp->private_data;
...
kfree(dmirror); <-- frees dmirror...
return 0;
}
So dmirror is freed, but in dmirror_migrate_alloc_and_copy(), for all those pages
we DID migrate:
static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
struct dmirror *dmirror)
{
...
for (addr = args->start; addr < args->end; ) {
...
if (!dpage) { <-- allocation succeeds, so we don't branch.
...
}
rpage = BACKING_PAGE(dpage);
/*
* Normally, a device would use the page->zone_device_data to
* point to the mirror but here we use it to hold the page for
* the simulated device memory and that page holds the pointer
* to the mirror.
*/
rpage->zone_device_data = dmirror;
...
}
...
}
So now a bunch of device private pages have their zone_device_data set to a
dangling dmirror pointer.
Then on coredump, we walk the VMAs, meaning we fault in device private
pages and end up invoking do_swap_page(), which in turn calls
dmirror_devmem_fault() (via the struct dev_pagemap_ops callback
dmirror_devmem_ops->migrate_to_ram = dmirror_devmem_fault).
This is via get_dump_page() -> __get_user_pages_locked(..., FOLL_FORCE |
FOLL_DUMP | FOLL_GET) -> __get_user_pages() -> handle_mm_fault() ->
__handle_mm_fault() -> do_swap_page() and:
vm_fault_t do_swap_page(struct vm_fault *vmf)
{
...
entry = softleaf_from_pte(vmf->orig_pte);
if (unlikely(!softleaf_is_swap(entry))) {
if (softleaf_is_migration(entry)) {
...
} else if (softleaf_is_device_private(entry)) {
...
if (trylock_page(vmf->page)) {
...
ret = pgmap->ops->migrate_to_ram(vmf);
...
}
...
}
...
}
...
}
(BTW, we seriously need to clean this up).
And in the dmirror_devmem_fault() callback:
static vm_fault_t dmirror_devmem_fault(struct vm_fault *vmf)
{
...
/*
* Normally, a device would use the page->zone_device_data to point to
* the mirror but here we use it to hold the page for the simulated
* device memory and that page holds the pointer to the mirror.
*/
rpage = folio_zone_device_data(page_folio(vmf->page));
dmirror = rpage->zone_device_data;
...
args.pgmap_owner = dmirror->mdevice; <-- oops
...
}
So in terms of fixing:
1. Fix the test (trivial)
Use
size = read_pmd_pagesize();
Instead of:
size = TWOMEG;
2. Have dmirror_fops_release() migrate all the device private pages back to ram
before freeing dmirror, or something like this.
I think you'd want to abstract code from dmirror_migrate_to_system() so it can
be shared between the two functions.
But I leave that as an exercise for the reader :)
> leading to the failure of the first hmm_migrate_sys_to_dev(). The test
> program received a SIGABRT signal and initiated vfs_coredump(). And
> something in the test_hmm module doesn't play well with the coredump
> process, which ends up with a panic. I'm not familiar with that.
>
> Note that I can also reproduce the panic by aborting the test manually
> with following diff (and skipping migrate_anon_huge{,_zero}_err()):
>
> diff --git a/tools/testing/selftests/mm/hmm-tests.c
> b/tools/testing/selftests/mm/hmm-tests.c
> index e8328c89d855..8d8ea8063a73 100644
> --- a/tools/testing/selftests/mm/hmm-tests.c
> +++ b/tools/testing/selftests/mm/hmm-tests.c
> @@ -1027,6 +1027,8 @@ TEST_F(hmm, migrate)
> ASSERT_EQ(ret, 0);
> ASSERT_EQ(buffer->cpages, npages);
>
> + ASSERT_TRUE(0);
This makes sense as the same dangling dmirror pointer issue arises.
> +
> /* Check what the device read. */
> for (i = 0, ptr = buffer->mirror; i < size / sizeof(*ptr); ++i)
> ASSERT_EQ(ptr[i], i);
>
> Please have a look!
Hopefully did so usefully here :)
>
> Thanks,
> Zenghui
Cheers, Lorenzo
* Re: running mm/ksft_hmm.sh on arm64 results in a kernel panic
2026-03-18 15:05 ` Lorenzo Stoakes (Oracle)
@ 2026-03-19 1:49 ` Alistair Popple
2026-03-19 2:00 ` Balbir Singh
0 siblings, 1 reply; 4+ messages in thread
From: Alistair Popple @ 2026-03-19 1:49 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Zenghui Yu, linux-mm, linux-kernel, jgg, leon, akpm, david,
Liam.Howlett, vbabka, rppt, surenb, mhocko, balbirs
On 2026-03-19 at 02:05 +1100, "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote...
> On Wed, Mar 18, 2026 at 01:26:39PM +0800, Zenghui Yu wrote:
> > [...]
> > [ 164.002089] Starting crashdump kernel...
> > [ 164.002867] Bye!
>
> That 'Bye!' is delightful :)
>
> >
> > [root@localhost linux]# ./scripts/faddr2line lib/test_hmm.ko
> > dmirror_devmem_fault+0xe4/0x1c0
> > dmirror_devmem_fault+0xe4/0x1c0:
> > dmirror_select_device at /root/code/linux/lib/test_hmm.c:153
> > (inlined by) dmirror_devmem_fault at /root/code/linux/lib/test_hmm.c:1659
> >
> > The kernel is built with arm64's virt.config plus
> >
> > +CONFIG_ARM64_16K_PAGES=y
> > +CONFIG_ZONE_DEVICE=y
> > +CONFIG_DEVICE_PRIVATE=y
> > +CONFIG_TEST_HMM=m
> >
> > I *guess* the problem is that migrate_anon_huge_zero_err() has chosen an
> > incorrect THP size (which should be 32M in a system with 16k page size),
>
> Yeah, it hardcodes to 2mb:
>
> TEST_F(hmm, migrate_anon_huge_zero_err)
> {
> ...
>
> size = TWOMEG;
> }
>
> Which isn't correct obviously and needs to be fixed.
>
> We should read /sys/kernel/mm/transparent_hugepage/hpage_pmd_size instead.
>
> vm_utils.h has read_pmd_pagesize() So this can be fixed with:
>
> size = read_pmd_pagesize();
>
> We then madvise(.., MADV_HUGEPAGE) region of size, which is now too small.:
>
> TEST_F(hmm, migrate_anon_huge_zero_err)
> {
> ...
>
> size = TWOMEG;
>
> ...
>
> ret = madvise(map, size, MADV_HUGEPAGE);
> ASSERT_EQ(ret, 0); <-- but should succeed anyway, just won't do anything
>
> ...
>
> ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_FLAGS, buffer,
> HMM_DMIRROR_FLAG_FAIL_ALLOC);
> }
>
> Then we switch into lib/test_hmm.c:
>
> static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
> struct dmirror *dmirror)
> {
> ...
>
> for (addr = args->start; addr < args->end; ) {
> ...
>
> if (dmirror->flags & HMM_DMIRROR_FLAG_FAIL_ALLOC) {
> dmirror->flags &= ~HMM_DMIRROR_FLAG_FAIL_ALLOC;
> dpage = NULL; <-- force failure for 1st page
>
> ...
>
> if (!dpage) {
> ...
>
> if (!is_large) <-- isn't large, as MADV_HUGEPAGE failed
> goto next;
>
> ...
> next:
> src++;
> dst++;
> addr += PAGE_SIZE;
> }
> }
>
> Back to the hmm-tests.c selftest:
>
> TEST_F(hmm, migrate_anon_huge_zero_err)
> {
> ...
>
> ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_FLAGS, buffer,
> HMM_DMIRROR_FLAG_FAIL_ALLOC);
> ASSERT_EQ(ret, 0); <-- succeeds but...
> ASSERT_EQ(buffer->cpages, npages); <-- cpages = npages - 1.
> }
>
> So then we try to teardown which inokves:
>
> FIXTURE_TEARDOWN(hmm)
> {
> int ret = close(self->fd); <-- triggers kernel dmirror_fops_release()
> ...
> }
>
> In the kernel:
>
> static int dmirror_fops_release(struct inode *inode, struct file *filp)
> {
> struct dmirror *dmirror = filp->private_data;
> ...
>
> kfree(dmirror); <-- frees dmirror...
> return 0;
> }
>
> So dmirror is freed, but in dmirror_migrate_alloc_and_copy(), for all those
> pages we DID migrate:
>
> static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
> struct dmirror *dmirror)
> {
> ...
>
> for (addr = args->start; addr < args->end; ) {
> ...
>
> if (!dpage) { <-- we will succeed allocation so don't branch.
> ...
> }
>
> rpage = BACKING_PAGE(dpage);
>
> /*
> * Normally, a device would use the page->zone_device_data to
> * point to the mirror but here we use it to hold the page for
> * the simulated device memory and that page holds the pointer
> * to the mirror.
> */
> rpage->zone_device_data = dmirror;
>
> ...
> }
>
> ...
> }
>
> So now a bunch of device private pages have a zone_device_data set to a dangling
> dmirror pointer.
>
> Then on coredump, we walk the VMAs, meaning we fault in device private pages and
> end up invoking do_swap_page() which in turn calls dmirror_devmem_fault() (via
> the struct dev_pagemap_ops
> dmirror_devmem_ops->migrate_to_ram=dmirror_devmem_fault callback).
>
> This is via get_dump_page() -> __get_user_pages_locked(..., FOLL_FORCE |
> FOLL_DUMP | FOLL_GET) -> __get_user_pages() -> handle_mm_fault() ->
> __handle_mm_fault() -> do_swap_page() and:
>
> vm_fault_t do_swap_page(struct vm_fault *vmf)
> {
> ...
> entry = softleaf_from_pte(vmf->orig_pte);
> if (unlikely(!softleaf_is_swap(entry))) {
> if (softleaf_is_migration(entry)) {
> ...
> } else if (softleaf_is_device_private(entry)) {
> ...
>
> if (trylock_page(vmf->page)) {
> ...
>
> ret = pgmap->ops->migrate_to_ram(vmf);
>
> ...
> }
>
> ...
> }
>
> ...
> }
>
> ...
> }
>
> (BTW, we seriously need to clean this up).
What did you have in mind here?
> And in the dmirror_devmem_fault() callback:
>
> static vm_fault_t dmirror_devmem_fault(struct vm_fault *vmf)
> {
> ...
>
> /*
> * Normally, a device would use the page->zone_device_data to point to
> * the mirror but here we use it to hold the page for the simulated
> * device memory and that page holds the pointer to the mirror.
> */
> rpage = folio_zone_device_data(page_folio(vmf->page));
> dmirror = rpage->zone_device_data;
>
> ...
>
> args.pgmap_owner = dmirror->mdevice; <-- oops
>
> ...
> }
>
> So in terms of fixing:
>
> 1. Fix the test (trivial)
>
> Use
>
> size = read_pmd_pagesize();
>
> Instead of:
>
> size = TWOMEG;
Adding Balbir as this would have come in with his hugepage changes.
> 2. Have dmirror_fops_release() migrate all the device private pages back to ram
> before freeing dmirror or something like this
Oh yeah that's bad. We definitely need to do that migration once the file is
closed.
> You'd want to abstract code from dmirror_migrate_to_system() to be shared
> between the two functions I think.
>
> But I leave that as an exercise for the reader :)
Good thing I can't read :) I can try and put something together but that won't
happen before next week, so I won't complain if someone beats me to it. Thanks
for the detailed analysis and report though!
> > leading to the failure of the first hmm_migrate_sys_to_dev(). The test
> > program received a SIGABRT signal and initiated vfs_coredump(). And
> > something in the test_hmm module doesn't play well with the coredump
> > process, which ends up with a panic. I'm not familiar with that.
> >
> > Note that I can also reproduce the panic by aborting the test manually
> > with following diff (and skipping migrate_anon_huge{,_zero}_err()):
> >
> > diff --git a/tools/testing/selftests/mm/hmm-tests.c
> > b/tools/testing/selftests/mm/hmm-tests.c
> > index e8328c89d855..8d8ea8063a73 100644
> > --- a/tools/testing/selftests/mm/hmm-tests.c
> > +++ b/tools/testing/selftests/mm/hmm-tests.c
> > @@ -1027,6 +1027,8 @@ TEST_F(hmm, migrate)
> > ASSERT_EQ(ret, 0);
> > ASSERT_EQ(buffer->cpages, npages);
> >
> > + ASSERT_TRUE(0);
>
> This makes sense as the same dangling dmirror pointer issue arises.
>
> > +
> > /* Check what the device read. */
> > for (i = 0, ptr = buffer->mirror; i < size / sizeof(*ptr); ++i)
> > ASSERT_EQ(ptr[i], i);
> >
> > Please have a look!
>
> Hopefully did so usefully here :)
>
> >
> > Thanks,
> > Zenghui
>
> Cheers, Lorenzo
>
* Re: running mm/ksft_hmm.sh on arm64 results in a kernel panic
2026-03-19 1:49 ` Alistair Popple
@ 2026-03-19 2:00 ` Balbir Singh
0 siblings, 0 replies; 4+ messages in thread
From: Balbir Singh @ 2026-03-19 2:00 UTC (permalink / raw)
To: Alistair Popple, Lorenzo Stoakes (Oracle)
Cc: Zenghui Yu, linux-mm, linux-kernel, jgg, leon, akpm, david,
Liam.Howlett, vbabka, rppt, surenb, mhocko, Jordan Niethe
On 3/19/26 12:49, Alistair Popple wrote:
> On 2026-03-19 at 02:05 +1100, "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote...
>> On Wed, Mar 18, 2026 at 01:26:39PM +0800, Zenghui Yu wrote:
>>> Hi all,
>>>
>>> When running mm/ksft_hmm.sh in my arm64 virtual machine, I ran into the
>>> following kernel panic:
>>>
>>> [root@localhost mm]# ./ksft_hmm.sh
>>> TAP version 13
>>> # --------------------------------
>>> # running bash ./test_hmm.sh smoke
>>> # --------------------------------
>>> # Running smoke test. Note, this test provides basic coverage.
>>> # TAP version 13
>>> # 1..74
>>> # # Starting 74 tests from 4 test cases.
>>> # # RUN hmm.hmm_device_private.benchmark_thp_migration ...
>>> #
>>> # HMM THP Migration Benchmark
>>> # ---------------------------
>>> # System page size: 16384 bytes
>>> #
>>> # === Small Buffer (512KB) (0.5 MB) ===
>>> # | With THP | Without THP | Improvement
>>> # ---------------------------------------------------------------------
>>> # Sys->Dev Migration | 0.423 ms | 0.182 ms | -133.0%
>>> # Dev->Sys Migration | 0.027 ms | 0.025 ms | -7.0%
>>> # S->D Throughput | 1.15 GB/s | 2.69 GB/s | -57.1%
>>> # D->S Throughput | 18.12 GB/s | 19.38 GB/s | -6.5%
>>> #
>>> # === Half THP Size (1MB) (1.0 MB) ===
>>> # | With THP | Without THP | Improvement
>>> # ---------------------------------------------------------------------
>>> # Sys->Dev Migration | 0.367 ms | 1.187 ms | 69.0%
>>> # Dev->Sys Migration | 0.048 ms | 0.049 ms | 2.2%
>>> # S->D Throughput | 2.66 GB/s | 0.82 GB/s | 222.9%
>>> # D->S Throughput | 20.53 GB/s | 20.08 GB/s | 2.3%
>>> #
>>> # === Single THP Size (2MB) (2.0 MB) ===
>>> # | With THP | Without THP | Improvement
>>> # ---------------------------------------------------------------------
>>> # Sys->Dev Migration | 0.817 ms | 0.782 ms | -4.4%
>>> # Dev->Sys Migration | 0.089 ms | 0.096 ms | 7.1%
>>> # S->D Throughput | 2.39 GB/s | 2.50 GB/s | -4.2%
>>> # D->S Throughput | 22.00 GB/s | 20.44 GB/s | 7.6%
>>> #
>>> # === Two THP Size (4MB) (4.0 MB) ===
>>> # | With THP | Without THP | Improvement
>>> # ---------------------------------------------------------------------
>>> # Sys->Dev Migration | 3.419 ms | 2.337 ms | -46.3%
>>> # Dev->Sys Migration | 0.321 ms | 0.225 ms | -42.6%
>>> # S->D Throughput | 1.14 GB/s | 1.67 GB/s | -31.6%
>>> # D->S Throughput | 12.17 GB/s | 17.36 GB/s | -29.9%
>>> #
>>> # === Four THP Size (8MB) (8.0 MB) ===
>>> # | With THP | Without THP | Improvement
>>> # ---------------------------------------------------------------------
>>> # Sys->Dev Migration | 4.535 ms | 4.563 ms | 0.6%
>>> # Dev->Sys Migration | 0.583 ms | 0.582 ms | -0.2%
>>> # S->D Throughput | 1.72 GB/s | 1.71 GB/s | 0.6%
>>> # D->S Throughput | 13.39 GB/s | 13.43 GB/s | -0.2%
>>> #
>>> # === Eight THP Size (16MB) (16.0 MB) ===
>>> # | With THP | Without THP | Improvement
>>> # ---------------------------------------------------------------------
>>> # Sys->Dev Migration | 10.190 ms | 9.805 ms | -3.9%
>>> # Dev->Sys Migration | 1.130 ms | 1.195 ms | 5.5%
>>> # S->D Throughput | 1.53 GB/s | 1.59 GB/s | -3.8%
>>> # D->S Throughput | 13.83 GB/s | 13.07 GB/s | 5.8%
>>> #
>>> # === One twenty eight THP Size (256MB) (256.0 MB) ===
>>> # | With THP | Without THP | Improvement
>>> # ---------------------------------------------------------------------
>>> # Sys->Dev Migration | 80.464 ms | 92.764 ms | 13.3%
>>> # Dev->Sys Migration | 9.528 ms | 18.166 ms | 47.6%
>>> # S->D Throughput | 3.11 GB/s | 2.70 GB/s | 15.3%
>>> # D->S Throughput | 26.24 GB/s | 13.76 GB/s | 90.7%
>>> # # OK hmm.hmm_device_private.benchmark_thp_migration
>>> # ok 1 hmm.hmm_device_private.benchmark_thp_migration
>>> # # RUN hmm.hmm_device_private.migrate_anon_huge_zero_err ...
>>> # # hmm-tests.c:2622:migrate_anon_huge_zero_err:Expected ret (-2) == 0 (0)
>>>
>>> [ 154.077143] Unable to handle kernel paging request at virtual address
>>> 0000000000005268
>>> [ 154.077179] Mem abort info:
>>> [ 154.077203] ESR = 0x0000000096000007
>>> [ 154.077219] EC = 0x25: DABT (current EL), IL = 32 bits
>>> [ 154.078433] SET = 0, FnV = 0
>>> [ 154.078434] EA = 0, S1PTW = 0
>>> [ 154.078435] FSC = 0x07: level 3 translation fault
>>> [ 154.078435] Data abort info:
>>> [ 154.078436] ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
>>> [ 154.078459] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>>> [ 154.078479] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>> [ 154.078484] user pgtable: 16k pages, 47-bit VAs, pgdp=000000010b920000
>>> [ 154.078487] [0000000000005268] pgd=0800000101b4c403,
>>> p4d=0800000101b4c403, pud=0800000101b4c403, pmd=0800000108cd8403,
>>> pte=0000000000000000
>>> [ 154.078520] Internal error: Oops: 0000000096000007 [#1] SMP
>>> [ 154.098664] Modules linked in: test_hmm rfkill drm fuse backlight ipv6
>>> [ 154.100468] CPU: 7 UID: 0 PID: 1357 Comm: hmm-tests Kdump: loaded Not
>>> tainted 7.0.0-rc4-00029-ga989fde763f4-dirty #260 PREEMPT
>>> [ 154.103855] Hardware name: QEMU QEMU Virtual Machine, BIOS
>>> edk2-stable202408-prebuilt.qemu.org 08/13/2024
>>> [ 154.104409] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS
>>> BTYPE=--)
>>> [ 154.104847] pc : dmirror_devmem_fault+0xe4/0x1c0 [test_hmm]
>>> [ 154.105758] lr : dmirror_devmem_fault+0xcc/0x1c0 [test_hmm]
>>> [ 154.109465] sp : ffffc000855ab430
>>> [ 154.109677] x29: ffffc000855ab430 x28: ffff8000c9f73e40 x27:
>>> ffff8000c9f73e40
>>> [ 154.110091] x26: ffff8000cb920000 x25: ffffc000812e0000 x24:
>>> 0000000000000000
>>> [ 154.110540] x23: ffff8000c9f73e40 x22: 0000000000000000 x21:
>>> 0000000000000008
>>> [ 154.110888] x20: ffff8000c07e1980 x19: ffffc000855ab618 x18:
>>> ffffc000855abc40
>>> [ 154.111223] x17: 0000000000000000 x16: 0000000000000000 x15:
>>> 0000000000000000
>>> [ 154.111563] x14: 0000000000000000 x13: 0000000000000000 x12:
>>> ffffc00080fedd68
>>> [ 154.111903] x11: 00007fffa3bf7fff x10: 0000000000000000 x9 :
>>> 1ffff00019166a41
>>> [ 154.112244] x8 : ffff8000c132df20 x7 : 0000000000000000 x6 :
>>> ffff8000c53bfe88
>>> [ 154.112581] x5 : 0000000000000009 x4 : ffffc000855ab3d0 x3 :
>>> 0000000000000004
>>> [ 154.112921] x2 : 0000000000000004 x1 : ffff8000c132df18 x0 :
>>> 0000000000005200
>>> [ 154.113254] Call trace:
>>> [ 154.113370] dmirror_devmem_fault+0xe4/0x1c0 [test_hmm] (P)
>>> [ 154.113679] do_swap_page+0x132c/0x17b0
>>> [ 154.113912] __handle_mm_fault+0x7e4/0x1af4
>>> [ 154.114124] handle_mm_fault+0xb4/0x294
>>> [ 154.114398] __get_user_pages+0x210/0xbfc
>>> [ 154.114607] get_dump_page+0xd8/0x144
>>> [ 154.114795] dump_user_range+0x70/0x2e8
>>> [ 154.115020] elf_core_dump+0xb64/0xe40
>>> [ 154.115212] vfs_coredump+0xfb4/0x1ce8
>>> [ 154.115397] get_signal+0x6cc/0x844
>>> [ 154.115582] arch_do_signal_or_restart+0x7c/0x33c
>>> [ 154.115805] exit_to_user_mode_loop+0x104/0x16c
>>> [ 154.116030] el0_svc+0x174/0x178
>>> [ 154.116216] el0t_64_sync_handler+0xa0/0xe4
>>> [ 154.116414] el0t_64_sync+0x198/0x19c
>>> [ 154.116594] Code: d2800083 f9400280 f9003be0 2a0303e2 (b9406800)
>>> [ 154.116891] ---[ end trace 0000000000000000 ]---
>>> [ 158.741771] Kernel panic - not syncing: Oops: Fatal exception
>>> [ 158.742164] SMP: stopping secondary CPUs
>>> [ 158.742970] Kernel Offset: disabled
>>> [ 158.743162] CPU features: 0x0000000,00060005,11210501,94067723
>>> [ 158.743440] Memory Limit: none
>>> [ 164.002089] Starting crashdump kernel...
>>> [ 164.002867] Bye!
>>
>> That 'Bye!' is delightful :)
>>
>>>
>>> [root@localhost linux]# ./scripts/faddr2line lib/test_hmm.ko
>>> dmirror_devmem_fault+0xe4/0x1c0
>>> dmirror_devmem_fault+0xe4/0x1c0:
>>> dmirror_select_device at /root/code/linux/lib/test_hmm.c:153
>>> (inlined by) dmirror_devmem_fault at /root/code/linux/lib/test_hmm.c:1659
>>>
>>> The kernel is built with arm64's virt.config plus
>>>
>>> +CONFIG_ARM64_16K_PAGES=y
>>> +CONFIG_ZONE_DEVICE=y
>>> +CONFIG_DEVICE_PRIVATE=y
>>> +CONFIG_TEST_HMM=m
>>>
>>> I *guess* the problem is that migrate_anon_huge_zero_err() has chosen an
>>> incorrect THP size (which should be 32M in a system with 16k page size),
>>
>> Yeah, it hardcodes to 2mb:
>>
>> TEST_F(hmm, migrate_anon_huge_zero_err)
>> {
>> ...
>>
>> size = TWOMEG;
>> }
>>
>> Which isn't correct obviously and needs to be fixed.
>>
>> We should read /sys/kernel/mm/transparent_hugepage/hpage_pmd_size instead.
>>
>> vm_util.h has read_pmd_pagesize(), so this can be fixed with:
>>
>> size = read_pmd_pagesize();
>>
>> We then madvise(..., MADV_HUGEPAGE) a region of this size, which is smaller
>> than the PMD size here:
>>
>> TEST_F(hmm, migrate_anon_huge_zero_err)
>> {
>> ...
>>
>> size = TWOMEG;
>>
>> ...
>>
>> ret = madvise(map, size, MADV_HUGEPAGE);
>> ASSERT_EQ(ret, 0); <-- but should succeed anyway, just won't do anything
>>
>> ...
>>
>> ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_FLAGS, buffer,
>> HMM_DMIRROR_FLAG_FAIL_ALLOC);
>> }
>>
>> Then we switch into lib/test_hmm.c:
>>
>> static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
>> struct dmirror *dmirror)
>> {
>> ...
>>
>> for (addr = args->start; addr < args->end; ) {
>> ...
>>
>> if (dmirror->flags & HMM_DMIRROR_FLAG_FAIL_ALLOC) {
>> dmirror->flags &= ~HMM_DMIRROR_FLAG_FAIL_ALLOC;
>> dpage = NULL; <-- force failure for 1st page
>>
>> ...
>>
>> if (!dpage) {
>> ...
>>
>> if (!is_large) <-- isn't large, as no THP was formed
>> goto next;
>>
>> ...
>> next:
>> src++;
>> dst++;
>> addr += PAGE_SIZE;
>> }
>> }
>>
>> Back to the hmm-tests.c selftest:
>>
>> TEST_F(hmm, migrate_anon_huge_zero_err)
>> {
>> ...
>>
>> ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_FLAGS, buffer,
>> HMM_DMIRROR_FLAG_FAIL_ALLOC);
>> ASSERT_EQ(ret, 0); <-- succeeds but...
>> ASSERT_EQ(buffer->cpages, npages); <-- cpages = npages - 1.
>> }
>>
>> So then we try to tear down, which invokes:
>>
>> FIXTURE_TEARDOWN(hmm)
>> {
>> int ret = close(self->fd); <-- triggers kernel dmirror_fops_release()
>> ...
>> }
>>
>> In the kernel:
>>
>> static int dmirror_fops_release(struct inode *inode, struct file *filp)
>> {
>> struct dmirror *dmirror = filp->private_data;
>> ...
>>
>> kfree(dmirror); <-- frees dmirror...
>> return 0;
>> }
>>
>> So dmirror is freed, but in dmirror_migrate_alloc_and_copy(), for all those
>> pages we DID migrate:
>>
>> static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
>> struct dmirror *dmirror)
>> {
>> ...
>>
>> for (addr = args->start; addr < args->end; ) {
>> ...
>>
>> if (!dpage) { <-- we will succeed allocation so don't branch.
>> ...
>> }
>>
>> rpage = BACKING_PAGE(dpage);
>>
>> /*
>> * Normally, a device would use the page->zone_device_data to
>> * point to the mirror but here we use it to hold the page for
>> * the simulated device memory and that page holds the pointer
>> * to the mirror.
>> */
>> rpage->zone_device_data = dmirror;
>>
>> ...
>> }
>>
>> ...
>> }
>>
>> So now a bunch of device private pages have a zone_device_data set to a dangling
>> dmirror pointer.
>>
>> Then on coredump, we walk the VMAs, meaning we fault in device private pages and
>> end up invoking do_swap_page() which in turn calls dmirror_devmem_fault() (via
>> the struct dev_pagemap_ops
>> dmirror_devmem_ops->migrate_to_ram=dmirror_devmem_fault callback).
>>
>> This is via get_dump_page() -> __get_user_pages_locked(..., FOLL_FORCE |
>> FOLL_DUMP | FOLL_GET) -> __get_user_pages() -> handle_mm_fault() ->
>> __handle_mm_fault() -> do_swap_page() and:
>>
>> vm_fault_t do_swap_page(struct vm_fault *vmf)
>> {
>> ...
>> entry = softleaf_from_pte(vmf->orig_pte);
>> if (unlikely(!softleaf_is_swap(entry))) {
>> if (softleaf_is_migration(entry)) {
>> ...
>> } else if (softleaf_is_device_private(entry)) {
>> ...
>>
>> if (trylock_page(vmf->page)) {
>> ...
>>
>> ret = pgmap->ops->migrate_to_ram(vmf);
>>
>> ...
>> }
>>
>> ...
>> }
>>
>> ...
>> }
>>
>> ...
>> }
>>
>> (BTW, we seriously need to clean this up).
>
> What did you have in mind here?
>
I have the same question; cc's would be helpful as well. Adding (+) Jordan, who
has been running this test for his patchset; not sure if he ran into this.
>> And in the dmirror_devmem_fault() callback:
>>
>> static vm_fault_t dmirror_devmem_fault(struct vm_fault *vmf)
>> {
>> ...
>>
>> /*
>> * Normally, a device would use the page->zone_device_data to point to
>> * the mirror but here we use it to hold the page for the simulated
>> * device memory and that page holds the pointer to the mirror.
>> */
>> rpage = folio_zone_device_data(page_folio(vmf->page));
>> dmirror = rpage->zone_device_data;
>>
>> ...
>>
>> args.pgmap_owner = dmirror->mdevice; <-- oops
>>
>> ...
>> }
>>
>> So in terms of fixing:
>>
>> 1. Fix the test (trivial)
>>
>> Use
>>
>> size = read_pmd_pagesize();
>>
>> Instead of:
>>
>> size = TWOMEG;
>
> Adding Balbir as this would have come in with his hugepage changes.
>
Yes I did, agree with 1
>> 2. Have dmirror_fops_release() migrate all the device private pages back to ram
>> before freeing dmirror or something like this
>
> Oh yeah that's bad. We definitely need to do that migration once the file is
> closed.
>
Agreed, it's been that way for a while, it does need cleanup.
>> You'd want to abstract code from dmirror_migrate_to_system() to be shared
>> between the two functions I think.
>>
>> But I leave that as an exercise for the reader :)
>
> Good thing I can't read :) I can try and put something together but that won't
> happen before next week, so I won't complain if someone beats me to it. Thanks
> for the detailed analysis and report though!
>
I'll try that at my end as well and see if I can reproduce it, but I don't think
I'll win the race with Al or come close at this point.
>>> leading to the failure of the first hmm_migrate_sys_to_dev(). The test
>>> program received a SIGABRT signal and initiated vfs_coredump(). And
>>> something in the test_hmm module doesn't play well with the coredump
>>> process, which ends up with a panic. I'm not familiar with that.
>>>
>>> Note that I can also reproduce the panic by aborting the test manually
>>> with following diff (and skipping migrate_anon_huge{,_zero}_err()):
>>>
>>> diff --git a/tools/testing/selftests/mm/hmm-tests.c
>>> b/tools/testing/selftests/mm/hmm-tests.c
>>> index e8328c89d855..8d8ea8063a73 100644
>>> --- a/tools/testing/selftests/mm/hmm-tests.c
>>> +++ b/tools/testing/selftests/mm/hmm-tests.c
>>> @@ -1027,6 +1027,8 @@ TEST_F(hmm, migrate)
>>> ASSERT_EQ(ret, 0);
>>> ASSERT_EQ(buffer->cpages, npages);
>>>
>>> + ASSERT_TRUE(0);
>>
>> This makes sense as the same dangling dmirror pointer issue arises.
>>
>>> +
>>> /* Check what the device read. */
>>> for (i = 0, ptr = buffer->mirror; i < size / sizeof(*ptr); ++i)
>>> ASSERT_EQ(ptr[i], i);
>>>
>>> Please have a look!
>>
>> Hopefully did so usefully here :)
>>
>>>
>>> Thanks,
>>> Zenghui
>>
>> Cheers, Lorenzo
>>
Thanks for the bug report
Balbir
end of thread, other threads:[~2026-03-19 2:01 UTC | newest]
Thread overview: 4+ messages
2026-03-18 5:26 running mm/ksft_hmm.sh on arm64 results in a kernel panic Zenghui Yu
2026-03-18 15:05 ` Lorenzo Stoakes (Oracle)
2026-03-19 1:49 ` Alistair Popple
2026-03-19 2:00 ` Balbir Singh