* [BUG] KVM: NULL pointer dereference in kvm_tdp_mmu_map under memory pressure
@ 2026-04-08 10:29 punixcorn
2026-04-08 11:21 ` punixcorn
2026-04-08 14:18 ` Sean Christopherson
0 siblings, 2 replies; 6+ messages in thread
From: punixcorn @ 2026-04-08 10:29 UTC (permalink / raw)
To: seanjc, pbonzini; +Cc: kvm, linux-kernel, punixcorn
Under host memory pressure, a NULL pointer dereference occurs in
kvm_tdp_mmu_map() at offset 0x24. The exact root cause is unclear --
it may be an unhandled NULL return from tdp_mmu_alloc_sp(), or a
violated invariant elsewhere in the map path.
Crash log:
BUG: kernel NULL pointer dereference, address: 0000000000000024
#PF: supervisor read access in kernel mode
Oops: 0000 [#1] SMP NOPTI
CPU: 2 PID: 1110212 Comm: MainLoopThread Tainted: G U OE 6.19.10-arch1-1
Hardware name: Default Default/NLXB, BIOS BQ141 06/27/2024
RIP: 0010:kvm_tdp_mmu_map+0x471/0x880 [kvm]
Code: 00 00 00 80 48 2b 35 76 72 5c c8 48 c7 44 24 20 00 00 00 00 48 01 f1 48 c1 e9 0c 48 c1 e1 06 48 03 0d 4b 72 5c c8 48 8b 71 28 <0f> b6 4e 24 83 e1 0f 39 ca 0f 85 a7 02 00 00 f6 c4 08 74 26 80 7b
RSP: 0018:ffffce128333f790 EFLAGS: 00010286
Reproduction:
The issue was observed under heavy host memory pressure while running
a KVM guest (Android emulator via QEMU).
The crash is not reliably reproducible and appears to be
timing-dependent. Fault injection targeting tdp_mmu_alloc_sp()
increases the frequency of hitting the same code path without
triggering a panic, suggesting the retry path may be a viable
recovery, though the exact failure condition is still unclear.
Fault injection used:
	sp = tdp_mmu_alloc_sp(vcpu);
	if (!sp || (atomic_inc_return(&fail_counter) % 100 == 0)) {
		if (sp)
			tdp_mmu_free_sp(sp);
		goto retry;
	}
With this injection the guest continues running normally initially,
but eventually terminates after sustained injection pressure. This is
expected behavior given the repeated forced failures.
A speculative fix:
	if (!sp)
		goto retry;
This has not been fully verified. Sending for maintainer review.
Environment:
Linux 6.19.10-arch1-1 x86_64
GNU C 15.2.1
Binutils 2.46
Signed-off-by: punixcorn <ohyunwoods663@gmail.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [BUG] KVM: NULL pointer dereference in kvm_tdp_mmu_map under memory pressure
2026-04-08 10:29 [BUG] KVM: NULL pointer dereference in kvm_tdp_mmu_map under memory pressure punixcorn
@ 2026-04-08 11:21 ` punixcorn
2026-04-08 14:18 ` Sean Christopherson
1 sibling, 0 replies; 6+ messages in thread
From: punixcorn @ 2026-04-08 11:21 UTC (permalink / raw)
To: seanjc, pbonzini; +Cc: kvm, linux-kernel

Following up with additional analysis from gdb. The crash is at
spte.h:263 in to_shadow_page(), not at the tdp_mmu_alloc_sp() path as
initially suspected.

(gdb) list *(kvm_tdp_mmu_map+0x471)
0x79451 is in kvm_tdp_mmu_map (mmu/spte.h:263)
        return (struct kvm_mmu_page *)page_private(page);

The crash location suggests page_private() is returning 0 for the
parent shadow page in tdp_mmu_init_child_sp(). The exact cause is
unclear. Sharing for maintainer review.

My earlier speculative fix (checking sp == NULL) was incorrect. I am
not familiar enough with the KVM MMU internals to propose a correct
fix. Sharing this in case it helps maintainers narrow down the root
cause.
^ permalink raw reply [flat|nested] 6+ messages in thread
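Archive note: the gdb result above can be cross-checked by hand against the "Code:" line of the oops. The sketch below is a minimal userspace illustration, not kernel code. It decodes only the ModRM byte and 8-bit displacement of the two instructions around the faulting marker, "48 8b 71 28" and "<0f> b6 4e 24". The struct and helper names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded form of a ModRM byte followed by an 8-bit displacement. */
struct modrm_disp8 {
	unsigned mod;	/* addressing mode: 1 means [rm + disp8] */
	unsigned reg;	/* register operand (the destination for these loads) */
	unsigned rm;	/* base register of the memory operand */
	int8_t disp;	/* signed 8-bit displacement */
};

/* x86-64 encodings of the low eight general-purpose registers. */
enum { RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI };

/*
 * Split a ModRM byte into its fields and attach the displacement.
 * Only mod == 1 without a SIB byte (rm != 4) is handled, which is all
 * that the two loads from the oops need.
 */
static struct modrm_disp8 decode_modrm_disp8(uint8_t modrm, uint8_t disp)
{
	struct modrm_disp8 d = {
		.mod = modrm >> 6,
		.reg = (modrm >> 3) & 7,
		.rm = modrm & 7,
		.disp = (int8_t)disp,
	};

	assert(d.mod == 1 && d.rm != RSP);
	return d;
}

/* "48 8b 71 28": REX.W + 8b /r, i.e. mov 0x28(%rcx),%rsi */
static struct modrm_disp8 decode_private_load(void)
{
	return decode_modrm_disp8(0x71, 0x28);
}

/* "<0f> b6 4e 24": 0f b6 /r, i.e. movzbl 0x24(%rsi),%ecx (the faulting load) */
static struct modrm_disp8 decode_faulting_load(void)
{
	return decode_modrm_disp8(0x4e, 0x24);
}
```

Read this way, the first instruction loads a 64-bit value from 0x28(%rcx) into %rsi, and the faulting instruction reads one byte from 0x24(%rsi). If %rsi was 0, that read targets address 0x24, matching the reported fault address. Treating the first load as page_private() assumes `private` sits at offset 0x28 of struct page on this build, which is not confirmed in the thread. The following "83 e1 0f" (and $0xf,%ecx) would be consistent with extracting a 4-bit field from the loaded byte.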
* Re: [BUG] KVM: NULL pointer dereference in kvm_tdp_mmu_map under memory pressure
2026-04-08 10:29 [BUG] KVM: NULL pointer dereference in kvm_tdp_mmu_map under memory pressure punixcorn
2026-04-08 11:21 ` punixcorn
@ 2026-04-08 14:18 ` Sean Christopherson
1 sibling, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-04-08 14:18 UTC (permalink / raw)
To: punixcorn; +Cc: pbonzini, kvm, linux-kernel

On Wed, Apr 08, 2026, punixcorn wrote:
> Under host memory pressure, a NULL pointer dereference occurs in
> kvm_tdp_mmu_map() at offset 0x24. The exact root cause is unclear --
> it may be an unhandled NULL return from tdp_mmu_alloc_sp(), or a
> violated invariant elsewhere in the map path.

It's pretty much guaranteed to be the latter. tdp_mmu_alloc_sp() can't
fail, as KVM ensures vcpu->arch.mmu_page_header_cache holds enough
pre-allocated entries to service the page fault. Even if that invariant
fails and KVM exhausts the cache, it should still be impossible for
kvm_mmu_memory_cache_alloc() to return NULL because it will either use
a fallback allocation (after WARNing) and succeed, or BUG_ON() and
prevent hitting the NULL pointer deref.

  void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc)
  {
	void *p;

	if (WARN_ON(!mc->nobjs))
		p = mmu_memory_cache_alloc_obj(mc, GFP_ATOMIC | __GFP_ACCOUNT);
	else
		p = mc->objects[--mc->nobjs];
	BUG_ON(!p);
	return p;
  }

And even if _that_ didn't suffice, tdp_mmu_alloc_sp() itself
dereferences the returned sp, so the NULL pointer deref would happen
earlier.

  static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu)
  {
	struct kvm_mmu_page *sp;

	sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache);
	sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache);

	return sp;
  }

>
> Crash log:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000024
> #PF: supervisor read access in kernel mode
> Oops: 0000 [#1] SMP NOPTI
> CPU: 2 PID: 1110212 Comm: MainLoopThread Tainted: G U OE 6.19.10-arch1-1
> Hardware name: Default Default/NLXB, BIOS BQ141 06/27/2024
> RIP: 0010:kvm_tdp_mmu_map+0x471/0x880 [kvm]
> Code: 00 00 00 80 48 2b 35 76 72 5c c8 48 c7 44 24 20 00 00 00 00 48 01 f1 48 c1 e9 0c 48 c1 e1 06 48 03 0d 4b 72 5c c8 48 8b 71 28 <0f> b6 4e 24 83 e1 0f 39 ca 0f 85 a7 02 00 00 f6 c4 08 74 26 80 7b
> RSP: 0018:ffffce128333f790 EFLAGS: 00010286

As noted in your response, I'm 99% certain this is the first
dereference of the shadow page in tdp_mmu_map_handle_target_level():

  static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
					     struct kvm_page_fault *fault,
					     struct tdp_iter *iter)
  {
	struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(iter->sptep));
	u64 new_spte;
	int ret = RET_PF_FIXED;
	bool wrprot = false;

	if (WARN_ON_ONCE(sp->role.level != fault->goal_level)) <============= "sp" is NULL
		return RET_PF_RETRY;

The code stream lines up with that on my builds, and "role" is at
offset 0x24.

I can think of three possible sources of failure:

 1. KVM installed a non-leaf SPTE without doing set_page_private().
 2. iter->sptep is corrupted/garbage.
 3. iter->sptep points at a freed shadow page, i.e. page->private was
    nullified due to the page being freed and/or re-allocated.

#1 seems unlikely as I wouldn't expect such a bug to manifest
intermittently; the code is pretty fixed/straightforward.

#2 isn't very likely either, given that it's dereferencing the shadow
page that fails. I.e. if KVM did _not_ fail grabbing the shadow page
from iter->sptep, then iter->sptep isn't complete garbage. But it's
still a possibility, e.g. if sptep is garbage but happens to still
point at a valid struct page.

#3 is the most likely option, as it would "just" require a violation
of RCU protection somewhere.

Can you run with this as a debug patch? With luck, the output will
provide some hint as to what's going wrong.

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 7b1102d26f9c..0332faf8ef9a 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1174,6 +1174,17 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 	int ret = RET_PF_FIXED;
 	bool wrprot = false;
 
+	if (WARN_ON_ONCE(!sp)) {
+		pr_warn("NULL sp. sptep = %lx, spte = %llx, pt[0] = %lx, pt[1] = %lx, pt[2] = %lx, pt[3] = %lx, pt[4] = %lx\n",
+			(unsigned long)iter->sptep, iter->old_spte,
+			(unsigned long)iter->pt_path[0],
+			(unsigned long)iter->pt_path[1],
+			(unsigned long)iter->pt_path[2],
+			(unsigned long)iter->pt_path[3],
+			(unsigned long)iter->pt_path[4]);
+		return RET_PF_RETRY;
+	}
+
 	if (WARN_ON_ONCE(sp->role.level != fault->goal_level))
 		return RET_PF_RETRY;
 

> Reproduction:
>
> The issue was observed under heavy host memory pressure while running
> a KVM guest (Android emulator via QEMU).

Can you elaborate on the environment? Specifically, what is your host
setup? E.g. CPU and platform info, and your .config.

> This has not been fully verified. Sending for maintainer review.
>
> Environment:
> Linux 6.19.10-arch1-1 x86_64
> GNU C 15.2.1
> Binutils 2.46
>
> Signed-off-by: punixcorn <ohyunwoods663@gmail.com>
^ permalink raw reply related [flat|nested] 6+ messages in thread
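Archive note: the reasoning in the message above, where a zeroed page->private yields a NULL sp and the read of "role" then faults at address 0x24, can be modelled in userspace. The sketch below is only an illustration under stated assumptions. `struct fake_page`, `struct fake_sp` and every `model_*` helper are hypothetical stand-ins, not KVM types or functions. The padding is chosen purely to place the role word at offset 0x24 as reported in the thread, and modelling the level as the low 4 bits of that word is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; only `private` matters here. */
struct fake_page {
	unsigned long private;
};

/*
 * Hypothetical stand-in for struct kvm_mmu_page. The padding exists only
 * to put `role_word` at offset 0x24, the offset reported in the thread.
 */
struct fake_sp {
	unsigned char pad[0x24];
	uint32_t role_word;	/* low 4 bits model role.level (assumption) */
};

enum { MODEL_PF_RETRY, MODEL_PF_FIXED };

/* Models to_shadow_page(): sp is whatever page->private currently holds. */
static struct fake_sp *model_to_shadow_page(const struct fake_page *page)
{
	return (struct fake_sp *)page->private;
}

/* Address that a read of sp->role_word would touch, without touching it. */
static uintptr_t model_role_address(const struct fake_sp *sp)
{
	return (uintptr_t)sp + offsetof(struct fake_sp, role_word);
}

/*
 * Models the level check with a NULL guard in front of it, in the spirit
 * of the debug patch: bail out with "retry" instead of dereferencing NULL.
 */
static int model_handle_target_level(const struct fake_page *page,
				     unsigned int goal_level)
{
	struct fake_sp *sp = model_to_shadow_page(page);

	if (!sp)
		return MODEL_PF_RETRY;
	if ((sp->role_word & 0xf) != goal_level)
		return MODEL_PF_RETRY;
	return MODEL_PF_FIXED;
}
```

With `private` set to 0, `model_role_address()` evaluates to 0x24, the fault address in the oops. The model says nothing about which of the three candidate causes zeroed the field; it only shows why all three would produce the same faulting address.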
[parent not found: <202604081418.sean.christopherson@intel.com>]
* Re: [BUG] KVM: NULL pointer dereference in kvm_tdp_mmu_map under memory pressure
[not found] <202604081418.sean.christopherson@intel.com>
@ 2026-04-08 15:36 ` punixcorn
2026-04-08 16:33 ` Sean Christopherson
0 siblings, 1 reply; 6+ messages in thread
From: punixcorn @ 2026-04-08 15:36 UTC (permalink / raw)
To: seanjc, pbonzini; +Cc: kvm, linux-kernel, punixcorn

Hi Sean,

I attempted to trigger your debug patch via fault injection (zeroing
page_private on the allocated sp before it's linked), but the resulting
logs aren't meaningful -- every captured entry shows spte =
8000000000000000, a non-present SPTE, which doesn't reflect the real
crash scenario where the SPTE is present but page_private returns 0.
So I'm not sending those.

Natural reproduction is rare and I haven't caught it yet with your patch
applied.

Given that, what would you recommend as a next step?

Would lockdep, KASAN, or RCU debugging (CONFIG_PROVE_RCU) be worth enabling
to catch the violation when it happens naturally?

Environment:
- CPU: 13th Gen Intel(R) Core(TM) i5-13420H (12) @ 4.60 GHz
- RAM: 16GB (15Gi usable, 16Gi swap)
- OS: Arch Linux
- Kernel: 6.19.10-dirty #1 SMP PREEMPT_DYNAMIC Wed Apr 8 06:08:08 GMT 2026 x86_64
- /proc/cpuinfo: https://pastebin.com/pwvNYsCu
- .config: https://pastebin.com/z4fVZENs

The crash occurs while running an Android emulator (QEMU) under host
memory pressure.

Signed-off-by: punixcorn <ohyunwoods663@gmail.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [BUG] KVM: NULL pointer dereference in kvm_tdp_mmu_map under memory pressure
2026-04-08 15:36 ` punixcorn
@ 2026-04-08 16:33 ` Sean Christopherson
0 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-04-08 16:33 UTC (permalink / raw)
To: punixcorn; +Cc: pbonzini, kvm, linux-kernel

On Wed, Apr 08, 2026, punixcorn wrote:
> Hi Sean,
>
> I attempted to trigger your debug patch via fault injection (zeroing
> page_private on the allocated sp before it's linked), but the resulting
> logs aren't meaningful -- every captured entry shows spte =
> 8000000000000000, a non-present SPTE, which doesn't reflect the real
> crash scenario where the SPTE is present but page_private returns 0.
> So I'm not sending those.

Ya, I wouldn't expect synthetic injection to help root cause this.

> Natural reproduction is rare and I haven't caught it yet with your patch
> applied.

How rare is rare? Are we talking hours of runtime? Days?

> Given that, what would you recommend as a next step?

If it's not too onerous, keep trying to reproduce with that initial
debug patch. If the time to repro is several hours (or more), I can try
to provide a more elaborate debug patch.

> Would lockdep, KASAN, or RCU debugging (CONFIG_PROVE_RCU) be worth enabling
> to catch the violation when it happens naturally?

Hmm, of those, KASAN has the best chance of being useful. Though it
might make reproducing the bug even more difficult.

> Environment:
> - CPU: 13th Gen Intel(R) Core(TM) i5-13420H (12) @ 4.60 GHz
> - RAM: 16GB (15Gi usable, 16Gi swap)
> - OS: Arch Linux
> - Kernel: 6.19.10-dirty #1 SMP PREEMPT_DYNAMIC Wed Apr 8 06:08:08 GMT 2026 x86_64
> - /proc/cpuinfo: https://pastebin.com/pwvNYsCu
> - .config: https://pastebin.com/z4fVZENs
>
> The crash occurs while running an Android emulator (QEMU) under host
> memory pressure.
>
> Signed-off-by: punixcorn <ohyunwoods663@gmail.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <202604081633.sean.christopherson@intel.com>]
* Re: [BUG] KVM: NULL pointer dereference in kvm_tdp_mmu_map under memory pressure
[not found] <202604081633.sean.christopherson@intel.com>
@ 2026-04-08 18:43 ` punixcorn
0 siblings, 0 replies; 6+ messages in thread
From: punixcorn @ 2026-04-08 18:43 UTC (permalink / raw)
To: seanjc, pbonzini; +Cc: kvm, linux-kernel, punixcorn

To be honest, it could be days. The original crash happened only once
in a month of heavy use, though my system has been hitting 100% RAM
usage frequently. I suspect a specific transition -- like a guest
memory zap during high host contention -- is the trigger.

I am currently trying to reproduce this by scripting a loop that
reloads the guest project (Android emulator) while the host is under
heavy memory load, as that was the environment when the crash occurred.

I'll keep the current debug patch running. If I can't catch it within
the next 48 hours, I'd be very interested in that more elaborate debug
patch you mentioned to help track the SPTE lifecycle more closely.

Signed-off-by: punixcorn <ohyunwoods663@gmail.com>
^ permalink raw reply [flat|nested] 6+ messages in thread