* [PATCH] KVM: Reject wrapped offset in kvm_reset_dirty_gfn()
       [not found] <agFyH8UoHO5Q3Fta@1wt.eu>
@ 2026-05-12  6:07 ` Aaron Sacks
  2026-05-12 19:56   ` Paolo Bonzini
  0 siblings, 1 reply; 2+ messages in thread
From: Aaron Sacks @ 2026-05-12  6:07 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: security, Paolo Bonzini, kvm

kvm_reset_dirty_gfn() guards the gfn range with

	if (!memslot || (offset + __fls(mask)) >= memslot->npages)
		return;

but offset is u64 and the addition is unchecked. The check can be
silently bypassed by a u64 wrap.

The dirty ring backing those entries is MAP_SHARED at
KVM_DIRTY_LOG_PAGE_OFFSET of the vcpu fd, so the VMM can rewrite the
slot and offset fields of any entry between when the kernel pushes
them and when KVM_RESET_DIRTY_RINGS consumes them. On reset,
kvm_dirty_ring_reset() re-reads the values via READ_ONCE() and feeds
them straight back into this check; only the flags handshake is
treated as the handover, the slot/offset payload is taken on trust.

Crafting two entries

	entry[i].offset   = 0xffffffffffffffc1
	entry[i+1].offset = 0

makes the coalescing loop in kvm_dirty_ring_reset() compute

	delta = (s64)(0 - 0xffffffffffffffc1) = 63

which falls in [0, BITS_PER_LONG), so it folds entry[i+1] into the
existing mask by setting bit 63. The trailing kvm_reset_dirty_gfn()
call then sees offset = 0xffffffffffffffc1 and __fls(mask) = 63;
the sum is 0 in u64 and the bounds check passes.

That offset propagates into kvm_arch_mmu_enable_log_dirty_pt_masked()
unchanged. On the legacy MMU path -- kvm_memslots_have_rmaps() ==
true, i.e. shadow paging, any VM that has allocated shadow roots, or
a write-tracked slot -- it reaches gfn_to_rmap(), which indexes
slot->arch.rmap[0][] with a near-U64_MAX gfn. That is an
out-of-bounds load of a kvm_rmap_head, followed by a conditional
clear of PT_WRITABLE_MASK in whatever the loaded pointer points at.
The path is reachable from any process holding /dev/kvm.

Range-check offset on its own first, so the addition cannot wrap.
memslot->npages is bounded well below U64_MAX, so once offset <
npages holds, offset + __fls(mask) (with __fls(mask) < BITS_PER_LONG)
stays in range.

Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Sacks <contact@xchglabs.com>
---
Hi Willy,

Thanks for the review. Re-sending this one as a proper patch and
addressing your points below. I will reply separately to the
other three reports per your guidance there.

> Please could you check on an up-to-date kernel? 6.13.7 is long
> dead (1 year) and thousands of fixes were merged since.

Re-verified on v7.0.6 (current stable). The buggy bounds check
in virt/kvm/dirty_ring.c is unchanged:

	if (!memslot || (offset + __fls(mask)) >= memslot->npages)
		return;

Same PoC still produces the oops:

	Comm: poc Not tainted 7.0.6 #1
	BUG: unable to handle page fault for address: ffffa9bbc002ce08
	RIP: 0010:rmap_write_protect+0x6/0xf0
	RAX: ffffffffffffffc1 RBX: 8000000000000001
	Call Trace:
	 kvm_arch_mmu_enable_log_dirty_pt_masked+0x145/0x210
	 kvm_reset_dirty_gfn+0xcd/0x100
	 kvm_dirty_ring_reset+0x12c/0x1f0
	 kvm_vm_ioctl+0xb41/0x10b0
	Kernel panic - not syncing: Fatal exception

RAX is the crafted wrapped offset; RBX is the coalesced mask with
bit 63 set (__fls == 63). Both match the values planted in the
ring entries from userspace, so the wrapped sum still passes the
existing bounds check on v7.0.6.

> Please also wrap long lines

Done -- this reply and the patch are at <=72 cols.

> Was all of this generated by an LLM?

The reports were generated by an LLM, yes, but the analysis, PoC
and patch suggestions are mine. Happy to provide the PoC source,
kernel .config, or additional artifacts.

> Could you please turn this one into a real patch

Patch follows. Applies clean on v7.0.6; with it applied, the
same trigger no longer oopses (offset is rejected by the added
offset >= npages check before the addition can wrap).
Thanks,
Aaron

 virt/kvm/dirty_ring.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index 02bc6b00d..572b854ed 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -63,7 +63,8 @@ static void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask)
 	memslot = id_to_memslot(__kvm_memslots(kvm, as_id), id);
 
-	if (!memslot || (offset + __fls(mask)) >= memslot->npages)
+	if (!memslot || offset >= memslot->npages ||
+	    offset + __fls(mask) >= memslot->npages)
 		return;
 
 	KVM_MMU_LOCK(kvm);
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 2+ messages in thread
* Re: [PATCH] KVM: Reject wrapped offset in kvm_reset_dirty_gfn()
  2026-05-12  6:07 ` [PATCH] KVM: Reject wrapped offset in kvm_reset_dirty_gfn() Aaron Sacks
@ 2026-05-12 19:56   ` Paolo Bonzini
  0 siblings, 0 replies; 2+ messages in thread
From: Paolo Bonzini @ 2026-05-12 19:56 UTC (permalink / raw)
  To: Aaron Sacks, Willy Tarreau; +Cc: security, kvm

On 5/12/26 08:07, Aaron Sacks wrote:
> kvm_reset_dirty_gfn() guards the gfn range with
>
> 	if (!memslot || (offset + __fls(mask)) >= memslot->npages)
> 		return;
>
> but offset is u64 and the addition is unchecked. The check can be
> silently bypassed by a u64 wrap.
>
> The dirty ring backing those entries is MAP_SHARED at
> KVM_DIRTY_LOG_PAGE_OFFSET of the vcpu fd, so the VMM can rewrite the
> slot and offset fields of any entry between when the kernel pushes
> them and when KVM_RESET_DIRTY_RINGS consumes them. On reset,
> kvm_dirty_ring_reset() re-reads the values via READ_ONCE() and feeds
> them straight back into this check; only the flags handshake is
> treated as the handover, the slot/offset payload is taken on trust.
>
> Crafting two entries
>
> 	entry[i].offset   = 0xffffffffffffffc1
> 	entry[i+1].offset = 0
>
> makes the coalescing loop in kvm_dirty_ring_reset() compute
>
> 	delta = (s64)(0 - 0xffffffffffffffc1) = 63
>
> which falls in [0, BITS_PER_LONG), so it folds entry[i+1] into the
> existing mask by setting bit 63. The trailing kvm_reset_dirty_gfn()
> call then sees offset = 0xffffffffffffffc1 and __fls(mask) = 63;
> the sum is 0 in u64 and the bounds check passes.
>
> That offset propagates into kvm_arch_mmu_enable_log_dirty_pt_masked()
> unchanged. On the legacy MMU path -- kvm_memslots_have_rmaps() ==
> true, i.e. shadow paging, any VM that has allocated shadow roots, or
> a write-tracked slot -- it reaches gfn_to_rmap(), which indexes
> slot->arch.rmap[0][] with a near-U64_MAX gfn.
> That is an out-of-bounds load of a kvm_rmap_head, followed by a
> conditional clear of PT_WRITABLE_MASK in whatever the loaded
> pointer points at. The path is reachable from any process holding
> /dev/kvm.
>
> Range-check offset on its own first, so the addition cannot wrap.
> memslot->npages is bounded well below U64_MAX, so once offset <
> npages holds, offset + __fls(mask) (with __fls(mask) < BITS_PER_LONG)
> stays in range.
>
> Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
> Cc: stable@vger.kernel.org
> Signed-off-by: Aaron Sacks <contact@xchglabs.com>
> ---
> Hi Willy,
>
> Thanks for the review. Re-sending this one as a proper patch and
> addressing your points below. I will reply separately to the
> other three reports per your guidance there.
>
>> Please could you check on an up-to-date kernel? 6.13.7 is long
>> dead (1 year) and thousands of fixes were merged since.
>
> Re-verified on v7.0.6 (current stable). The buggy bounds check
> in virt/kvm/dirty_ring.c is unchanged:
>
> 	if (!memslot || (offset + __fls(mask)) >= memslot->npages)
> 		return;
>
> Same PoC still produces the oops:
>
> 	Comm: poc Not tainted 7.0.6 #1
> 	BUG: unable to handle page fault for address: ffffa9bbc002ce08
> 	RIP: 0010:rmap_write_protect+0x6/0xf0
> 	RAX: ffffffffffffffc1 RBX: 8000000000000001
> 	Call Trace:
> 	 kvm_arch_mmu_enable_log_dirty_pt_masked+0x145/0x210
> 	 kvm_reset_dirty_gfn+0xcd/0x100
> 	 kvm_dirty_ring_reset+0x12c/0x1f0
> 	 kvm_vm_ioctl+0xb41/0x10b0
> 	Kernel panic - not syncing: Fatal exception
>
> RAX is the crafted wrapped offset; RBX is the coalesced mask with
> bit 63 set (__fls == 63). Both match the values planted in the
> ring entries from userspace, so the wrapped sum still passes the
> existing bounds check on v7.0.6.
>
>> Please also wrap long lines
>
> Done -- this reply and the patch are at <=72 cols.
>
>> Was all of this generated by an LLM?
>
> The reports were generated by an LLM yes.
> But the analysis, PoC and patch suggestions are mine. Happy to
> provide the PoC source, kernel .config, or additional artifacts.
>
>> Could you please turn this one into a real patch
>
> Patch follows. Applies clean on v7.0.6; with it applied, the
> same trigger no longer oopses (offset is rejected by the added
> offset >= npages check before the addition can wrap).

Applied, thanks.

Paolo

^ permalink raw reply	[flat|nested] 2+ messages in thread