From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	 Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Yosry Ahmed <yosry.ahmed@linux.dev>,
	 Yan Zhao <yan.y.zhao@intel.com>
Subject: Re: [PATCH] KVM: x86/mmu: Don't create SPTEs for addresses that aren't mappable
Date: Thu, 19 Feb 2026 00:23:55 +0000
Message-ID: <aZZYG_TXe_kfvXh0@google.com>
In-Reply-To: <20260219002241.2908563-1-seanjc@google.com>

On Wed, Feb 18, 2026, Sean Christopherson wrote:
> Track the mask of guest physical address bits that can actually be mapped
> by a given MMU instance that utilizes TDP, and either exit to userspace
> with -EFAULT or go straight to emulation without creating an SPTE (for
> emulated MMIO) if KVM can't map the address.  Attempting to create an SPTE
> can cause KVM to drop the unmappable bits, and thus install a bad SPTE.
> E.g. when starting a walk, the TDP MMU will round the GFN based on the
> root level, and drop the upper bits.
> 
> Exit with -EFAULT in the unlikely scenario userspace is misbehaving and
> created a memslot that can't be addressed, e.g. if userspace installed
> memory above the guest.MAXPHYADDR defined in CPUID, as there's nothing KVM
> can do to make forward progress, and there _is_ a memslot for the address.
> For emulated MMIO, KVM can at least kick the bad address out to userspace
> via a normal MMIO exit.
> 
> The flaw has existed for a very long time, and was exposed by commit
> 988da7820206 ("KVM: x86/tdp_mmu: WARN if PFN changes for spurious faults")
> thanks to a syzkaller program that prefaults memory at GPA 0x1000000000000
> and then faults in memory at GPA 0x0 (the extra-large GPA gets wrapped to
> '0').
> 
>   WARNING: arch/x86/kvm/mmu/tdp_mmu.c:1183 at kvm_tdp_mmu_map+0x5c3/0xa30 [kvm], CPU#125: syz.5.22/18468
>   CPU: 125 UID: 0 PID: 18468 Comm: syz.5.22 Tainted: G S      W           6.19.0-smp--23879af241d6-next #57 NONE
>   Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
>   Hardware name: Google Izumi-EMR/izumi, BIOS 0.20250917.0-0 09/17/2025
>   RIP: 0010:kvm_tdp_mmu_map+0x5c3/0xa30 [kvm]
>   Call Trace:
>    <TASK>
>    kvm_tdp_page_fault+0x107/0x140 [kvm]
>    kvm_mmu_do_page_fault+0x121/0x200 [kvm]
>    kvm_arch_vcpu_pre_fault_memory+0x18c/0x230 [kvm]
>    kvm_vcpu_pre_fault_memory+0x116/0x1e0 [kvm]
>    kvm_vcpu_ioctl+0x3a5/0x6b0 [kvm]
>    __se_sys_ioctl+0x6d/0xb0
>    do_syscall_64+0x8d/0x900
>    entry_SYSCALL_64_after_hwframe+0x4b/0x53
>    </TASK>
> 
> In practice, the flaw is benign (other than the new WARN) as it only
> affects guests that ignore guest.MAXPHYADDR (e.g. on CPUs with 52-bit
> physical addresses but only 4-level paging) or guests being run by a
> misbehaving userspace VMM (e.g. a VMM that ignored allow_smaller_maxphyaddr
> or is pre-faulting bad addresses).
> 
> For non-TDP shadow paging, always clear the unmappable mask as the flaw
> only affects GPAs.  For 32-bit paging, 64-bit virtual addresses
> simply don't exist.  Even when software can shove a 64-bit address
> somewhere, e.g. into SYSENTER_EIP, the value is architecturally truncated
> before it reaches the page table walker.  And for 64-bit paging, KVM's use
> of 4-level vs. 5-level paging is tied to the guest's CR4.LA57, i.e. KVM
> won't observe a 57-bit virtual address with a 4-level MMU.
> 
> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
> Cc: Yan Zhao <yan.y.zhao@intel.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---

Here's the full syzkaller reproducer (found on a manual run internally, so syzbot
didn't do the work for us).

FWIW, I don't love this approach, but I couldn't come up with anything better.

ioctl$KVM_SET_LAPIC(0xffffffffffffffff, 0x4400ae8f, &(0x7f0000000100)={"b46474f815e8d5535f0887c44335cc824dc6121bc72a77f532ff5dad4d643a9cab29d2310e04be14eb26c0af4985fe45e3b3b0680b3ec92725d74b9716e0f7c3119a2c9a0ae65ff4772e2e12733cb013c4308fe40863480747c0a7ddb9361b1578015ca1bb2c1677ebae096f08345476f567443842946ed946434c75916d1db83fe305920de65bfaf9bd940672216846cb16b8ae67cd3affc61375381f91b3b9f1cc5e38cafe5239aee71dcd481fbe1ecd2547ffbaad4469a74697c28fb9beefa6a5d736712a55eb9110c2cf7964062ba8cbc1c038e84f0f5db7fc7053118bf5221e3efa6fc3edb5d0ca3cde7054dd0751a332520aa8478b1775d552c5cc24d3c2df9eb333e5ca3aa06c1c2cf8526714f5caff2f55b41976fc20b64f1fc61d5b44f50953582a1825d32130a31abfeafd1987317879e29ac51b93c9659e023fff3ddb5e39dd19cc3ef1d883c78b9e073d08a9197fb3717df238b9831831214b186693be9dd2568bb77272e80df5dfed03e8c467627bedfbd93359a9f79a3aa37e873dc1357b37b43d813ea85267b0dc8b1c4cc51bd985328833beb2679b7fb762555bbea2da936b36f8f1673fd5f606b2b6eb23b72bf947206e8dbfeb40ca6f265a3485c8446e0f0da652860b88328073d2282c14b48a7774e62754a968b60e92205e8fafcdd70a55c3c4d1a4821ff44e6e3681f15ae091262e3a3290a24d8ceae30ebbf9d24287bb8a5d73c608d47d287f9e716cf02b4796a83fb0c05e45b89de9ef8bce834e6d7a0be6e30d2c66cb6e640cb01898454ad361bc0701d8fe56113335ae6adec59300db04691cc4a689034272a8e086a32ce7061b4f79fa8afbb48a6ce4b62bdc44af013d78980457e1fa61eb9204818606f4c3b03c0f33cd2a841ac9bc2b73151a96e31ab99e6ec969b5f2c3edd5f9abc69845e487af992758ba445368da93dae1d44360d52a534a88276b8aaf349841d8a4788c60408618437c442308dbf70efeda2e54e9b9e4fe5f76997c9dcb945a26bd75748c85d19ca8b99264dce50580e8d4dbda401dad7df31e9a7a6a3a83bfbdfb5394abd581ac0824fbcd75d2f5205c0b7c9188e6f26bfd97734d9a20433f6cdba9d14a5f32a4d97a57f4603b21146fd1aebf082e863d463c224ad623c17d8043d3bf083f0322408dd6ead6915ac6a4222ab51480eb6e11a8913348219515170d9df90d72d7363bbda3e327d19f98c0a856f98076380e788e602e8a2ae0a1930786874dc21a2e99abda15f35457cf1dcb440c4b41350d0eda352aad7f57a0adc8a6914da06460635ed21c4c11cd1a8ec778064c9f62efba2927828b23f94b16619a5520731c2c40ab8583c9f2
e73233d74b84f4877ce6b35bb1180300"})
r0 = openat$kvm(0xffffff9c, &(0x7f00000000c0), 0x0, 0x0)
r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0)
r2 = ioctl$KVM_CREATE_VCPU(r1, 0xae41, 0x0)
ioctl$KVM_SET_USER_MEMORY_REGION(r1, 0x4020ae46, &(0x7f0000000080)={0x0, 0x0, 0x0, 0x2000, &(0x7f0000000000/0x2000)=nil})
ioctl$KVM_SET_REGS(r2, 0x4090ae82, &(0x7f0000000200)={[0x0, 0x6, 0xfffffffffffffffd, 0x0, 0xfffd, 0x1, 0x4002004c4, 0x1000, 0x0, 0x0, 0x0, 0x0, 0x3], 0x25000, 0x2011c0})
ioctl$KVM_RUN(r2, 0xae80, 0x0)
ioctl$KVM_PRE_FAULT_MEMORY(r2, 0xc040aed5, &(0x7f0000000000)={0x0, 0x18000})
ioctl$KVM_SET_PIT2(0xffffffffffffffff, 0x4070aea0, &(0x7f0000000100)={[{0x7ff, 0x93, 0x0, 0xc0, 0xc0, 0x92, 0x85, 0x8, 0x6, 0xa, 0x0, 0x7, 0x8001}, {0x5, 0x2, 0xf9, 0x8, 0x7c, 0xf, 0xd, 0x1, 0x5, 0x3, 0x7, 0xa, 0x7}, {0x7, 0x71b0, 0x3, 0x3, 0xf8, 0x1, 0x8, 0x3, 0x8, 0x82, 0xc, 0xa4, 0x6}], 0xfffffffa})
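
For readers following the changelog's "gets wrapped to '0'" remark, here is a
minimal sketch of the arithmetic, not the code from the patch.  It assumes the
usual x86 layout of 4KiB pages and 9 index bits per paging level, so a 4-level
root covers 48 GPA bits and a 5-level root covers 57.  All helper names below
(mappable_gpa_bits(), unmappable_gpa_mask(), truncated_gpa(),
gpa_is_mappable()) are hypothetical and exist only for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT	12	/* 4KiB base pages */
#define PT_LEVEL_BITS	9	/* 512 entries per paging level */

/* Number of GPA bits a walk rooted at 'root_level' can translate. */
static inline unsigned int mappable_gpa_bits(unsigned int root_level)
{
	return PAGE_SHIFT + PT_LEVEL_BITS * root_level;
}

/* Hypothetical helper: mask of the GPA bits the root cannot map. */
static inline uint64_t unmappable_gpa_mask(unsigned int root_level)
{
	unsigned int bits = mappable_gpa_bits(root_level);

	/* Avoid an undefined 64-bit shift if the root covers everything. */
	return bits >= 64 ? 0 : ~((UINT64_C(1) << bits) - 1);
}

/* The address a walk effectively uses if the upper bits are dropped. */
static inline uint64_t truncated_gpa(uint64_t gpa, unsigned int root_level)
{
	return gpa & ~unmappable_gpa_mask(root_level);
}

/* Hypothetical check: non-zero if no unmappable bit is set in 'gpa'. */
static inline int gpa_is_mappable(uint64_t gpa, unsigned int root_level)
{
	return !(gpa & unmappable_gpa_mask(root_level));
}
```

With a 4-level root, GPA 0x1000000000000 (bit 48) has only an unmappable bit
set, so truncated_gpa() yields 0x0, the same GFN as a legitimate access to
GPA 0x0.  That aliasing matches the pre-fault-then-fault sequence described in
the changelog.  Checking something like gpa_is_mappable() before starting the
walk is the general idea behind bailing out instead of installing an SPTE.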
