From: Sean Christopherson <seanjc@google.com>
To: Jon Kohler <jon@nutanix.com>
Cc: pbonzini@redhat.com, tglx@linutronix.de, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
hpa@zytor.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
"Mickaël Salaün" <mic@digikod.net>
Subject: Re: [RFC PATCH 11/18] KVM: VMX: Enhance EPT violation handler for PROT_USER_EXEC
Date: Mon, 12 May 2025 11:54:05 -0700 [thread overview]
Message-ID: <aCJDzU1p_SFNRIJd@google.com> (raw)
In-Reply-To: <20250313203702.575156-12-jon@nutanix.com>
On Thu, Mar 13, 2025, Jon Kohler wrote:
> From: Mickaël Salaün <mic@digikod.net>
>
> Add EPT_VIOLATION_PROT_USER_EXEC (6) to reflect the user executable
> permissions of a given address when Intel MBEC is enabled.
>
> Refactor usage of EPT_VIOLATION_RWX_TO_PROT to understand all of the
> specific bits that are now possible with MBEC.
>
> Intel SDM 'Exit Qualification for EPT Violations' states the following
> for Bit 6.
> If the “mode-based execute control” VM-execution control is 0, the
> value of this bit is undefined. If that control is 1, this bit is
> the logical-AND of bit 10 in the EPT paging-structure entries used
> to translate the guest-physical address of the access causing the
> EPT violation. In this case, it indicates whether the guest-physical
> address was executable for user-mode linear addresses.
>
> Bit 6 is cleared to 0 if (1) the “mode-based execute control”
> VM-execution control is 1; and (2) either (a) any of EPT
> paging-structure entries used to translate the guest-physical address
> of the access causing the EPT violation is not present; or
> (b) 4-level EPT is in use and the guest-physical address sets any
> bits in the range 51:48.
>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> Co-developed-by: Jon Kohler <jon@nutanix.com>
> Signed-off-by: Jon Kohler <jon@nutanix.com>
>
> ---
> arch/x86/include/asm/vmx.h | 7 ++++---
> arch/x86/kvm/mmu/paging_tmpl.h | 15 ++++++++++++---
> arch/x86/kvm/vmx/vmx.c | 7 +++++--
> 3 files changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index ffc90d672b5d..84c5be416f5c 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -587,6 +587,7 @@ enum vm_entry_failure_code {
> #define EPT_VIOLATION_PROT_READ BIT(3)
> #define EPT_VIOLATION_PROT_WRITE BIT(4)
> #define EPT_VIOLATION_PROT_EXEC BIT(5)
> +#define EPT_VIOLATION_PROT_USER_EXEC BIT(6)
Ugh, TDX added this as EPT_VIOLATION_EXEC_FOR_RING3_LIN (apparently the TDX module
enables MBEC?). I like your name a lot better.
> #define EPT_VIOLATION_PROT_MASK (EPT_VIOLATION_PROT_READ | \
> EPT_VIOLATION_PROT_WRITE | \
> EPT_VIOLATION_PROT_EXEC)
Hmm, so I think EPT_VIOLATION_PROT_MASK should include EPT_VIOLATION_PROT_USER_EXEC.
The existing TDX change does not, because unfortunately the bit is undefined if
MBEC is unsupported, but that's easy to solve by unconditionally clearing the bit
in handle_ept_violation(). And then when nested-EPT MBEC support comes along,
handle_ept_violation() can be modified to conditionally clear the bit based on
whether or not the current MMU supports MBEC.
I'll post a patch to include the bit in EPT_VIOLATION_PROT_MASK, and opportunistically
change the name.
> @@ -596,7 +597,7 @@ enum vm_entry_failure_code {
> #define EPT_VIOLATION_READ_TO_PROT(__epte) (((__epte) & VMX_EPT_READABLE_MASK) << 3)
> #define EPT_VIOLATION_WRITE_TO_PROT(__epte) (((__epte) & VMX_EPT_WRITABLE_MASK) << 3)
> #define EPT_VIOLATION_EXEC_TO_PROT(__epte) (((__epte) & VMX_EPT_EXECUTABLE_MASK) << 3)
> -#define EPT_VIOLATION_RWX_TO_PROT(__epte) (((__epte) & VMX_EPT_RWX_MASK) << 3)
Why? There's no escaping the fact that EXEC, a.k.a. X, is doing double duty as
"exec for all" and "kernel exec". And KVM has nearly two decades of history
using EXEC/X to refer to "exec for all". I see no reason to throw all of that
away and discard the intuitive and pervasive RWX logic.
> @@ -510,7 +511,15 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
> * Note, pte_access holds the raw RWX bits from the EPTE, not
> * ACC_*_MASK flags!
> */
> - walker->fault.exit_qualification |= EPT_VIOLATION_RWX_TO_PROT(pte_access);
> + walker->fault.exit_qualification |=
> + EPT_VIOLATION_READ_TO_PROT(pte_access);
> + walker->fault.exit_qualification |=
> + EPT_VIOLATION_WRITE_TO_PROT(pte_access);
> + walker->fault.exit_qualification |=
> + EPT_VIOLATION_EXEC_TO_PROT(pte_access);
IMO, this is a big net negative. I much prefer the existing code, as it highlights
that USER_EXEC is the oddball.
> + if (vcpu->arch.pt_guest_exec_control)
This is wrong on multiple fronts. As mentioned earlier in the series, this is a
property of the MMU (more specifically, the root role), not of the vCPU.
And consulting MBEC support *only* when synthesizing the exit qualifcation is
wrong, because it means pte_access contains bogus data when consumed by
FNAME(gpte_access). At a glance, FNAME(gpte_access) probably needs to be modified
to take in the page role, e.g. like FNAME(sync_spte) and FNAME(prefetch_gpte)
already adjust the access based on the owning shadow page's access mask.
> + walker->fault.exit_qualification |=
> + EPT_VIOLATION_USER_EXEC_TO_PROT(pte_access);
> }
> #endif
> walker->fault.address = addr;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 116910159a3f..0aadfa924045 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5809,7 +5809,7 @@ static int handle_task_switch(struct kvm_vcpu *vcpu)
>
> static int handle_ept_violation(struct kvm_vcpu *vcpu)
> {
> - unsigned long exit_qualification;
> + unsigned long exit_qualification, rwx_mask;
> gpa_t gpa;
> u64 error_code;
>
> @@ -5839,7 +5839,10 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
> error_code |= (exit_qualification & EPT_VIOLATION_ACC_INSTR)
> ? PFERR_FETCH_MASK : 0;
> /* ept page table entry is present? */
> - error_code |= (exit_qualification & EPT_VIOLATION_PROT_MASK)
> + rwx_mask = EPT_VIOLATION_PROT_MASK;
> + if (vcpu->arch.pt_guest_exec_control)
> + rwx_mask |= EPT_VIOLATION_PROT_USER_EXEC;
> + error_code |= (exit_qualification & rwx_mask)
> ? PFERR_PRESENT_MASK : 0;
As mentioned above, if KVM clears EPT_VIOLATION_PROT_USER_EXEC when it's
undefined, then this can simply use EPT_VIOLATION_PROT_MASK unchanged.
>
> if (error_code & EPT_VIOLATION_GVA_IS_VALID)
> --
> 2.43.0
>
next prev parent reply other threads:[~2025-05-12 18:54 UTC|newest]
Thread overview: 75+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-13 20:36 [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC) Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 01/18] KVM: VMX: Remove EPT_VIOLATIONS_ACC_*_BIT defines Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 02/18] KVM: nVMX: Decouple EPT RWX bits from EPT Violation protection bits Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 03/18] KVM: x86: Add module parameter for Intel MBEC Jon Kohler
2025-05-12 18:08 ` Sean Christopherson
2025-05-13 2:18 ` Jon Kohler
2025-05-13 7:57 ` Shah, Amit
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 04/18] KVM: VMX: add cpu_has_vmx_mbec helper Jon Kohler
2025-05-12 18:14 ` Sean Christopherson
2025-05-13 2:17 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 05/18] KVM: x86: Add pt_guest_exec_control to kvm_vcpu_arch Jon Kohler
2025-04-22 6:27 ` Chao Gao
2025-05-12 18:15 ` Sean Christopherson
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 06/18] KVM: VMX: Wire up Intel MBEC enable/disable logic Jon Kohler
2025-04-22 7:06 ` Chao Gao
2025-05-12 18:23 ` Sean Christopherson
2025-05-13 2:16 ` Jon Kohler
2025-05-13 13:28 ` Sean Christopherson
2025-05-14 11:14 ` Shah, Amit
2025-05-14 12:55 ` Sean Christopherson
2025-06-16 9:27 ` Shah, Amit
2025-06-17 14:13 ` Sean Christopherson
2025-07-09 13:40 ` Shah, Amit
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 07/18] KVM: VMX: Define VMX_EPT_USER_EXECUTABLE_MASK Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 08/18] KVM: x86/mmu: Remove SPTE_PERM_MASK Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 09/18] KVM: x86/mmu: Extend access bitfield in kvm_mmu_page_role Jon Kohler
2025-05-12 18:32 ` Sean Christopherson
2025-05-13 2:14 ` Jon Kohler
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 10/18] KVM: VMX: Extend EPT Violation protection bits Jon Kohler
2025-05-12 18:37 ` Sean Christopherson
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 11/18] KVM: VMX: Enhance EPT violation handler for PROT_USER_EXEC Jon Kohler
2025-05-12 18:54 ` Sean Christopherson [this message]
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 12/18] KVM: x86/mmu: Introduce shadow_ux_mask Jon Kohler
2025-04-23 3:06 ` Chao Gao
2025-05-12 19:13 ` Sean Christopherson
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 13/18] KVM: x86/mmu: Adjust SPTE_MMIO_ALLOWED_MASK to understand MBEC Jon Kohler
2025-04-23 5:37 ` Chao Gao
2025-05-12 19:37 ` Sean Christopherson
2025-05-13 2:11 ` Jon Kohler
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 14/18] KVM: x86/mmu: Extend is_executable_pte " Jon Kohler
2025-04-23 6:16 ` Chao Gao
2025-05-12 21:16 ` Sean Christopherson
2025-05-13 2:09 ` Jon Kohler
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 15/18] KVM: x86/mmu: Extend make_spte " Jon Kohler
2025-05-12 21:29 ` Sean Christopherson
2025-05-13 2:04 ` Jon Kohler
2025-05-13 17:54 ` Sean Christopherson
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 16/18] KVM: nVMX: Setup Intel MBEC in nested secondary controls Jon Kohler
2025-05-12 21:32 ` Sean Christopherson
2025-12-23 4:15 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 17/18] KVM: VMX: Allow MBEC with EVMCS Jon Kohler
2025-05-12 21:35 ` Sean Christopherson
2025-05-13 2:01 ` Jon Kohler
2025-12-23 4:16 ` Jon Kohler
2025-03-13 20:36 ` [RFC PATCH 18/18] KVM: x86: Enable module parameter for MBEC Jon Kohler
2025-04-15 9:29 ` [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC) Mickaël Salaün
2025-04-15 14:43 ` Sean Christopherson
2025-05-12 15:26 ` Jon Kohler
2025-04-15 14:43 ` Jon Kohler
2025-04-16 15:44 ` Mickaël Salaün
2025-04-23 13:54 ` Adrian-Ken Rueegsegger
2025-05-12 15:26 ` Jon Kohler
2025-05-12 21:46 ` Sean Christopherson
2025-05-13 1:59 ` Jon Kohler
2025-12-23 4:17 ` Jon Kohler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aCJDzU1p_SFNRIJd@google.com \
--to=seanjc@google.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=jon@nutanix.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mic@digikod.net \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.