* [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX
@ 2024-11-28 0:43 Sean Christopherson
2024-11-28 0:43 ` [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit() Sean Christopherson
` (7 more replies)
0 siblings, 8 replies; 39+ messages in thread
From: Sean Christopherson @ 2024-11-28 0:43 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
Kai Huang, Xiaoyao Li
Effectively v4 of Binbin's series to handle hypercall exits to userspace in
a generic manner, so that TDX can share KVM's common hypercall handling.
Binbin and Kai, this is fairly different from what we last discussed. While
sorting through Binbin's latest patch, I stumbled on what I think/hope is an
approach that will make life easier for TDX. Rather than have common code
set the return value, _and_ have TDX implement a callback to do the same for
user return MSRs, just use the callback for all paths.
As for abusing vcpu->run->hypercall.ret... It's obviously a bit gross, but
I think it's a lesser evil than having multiple one-line wrappers just to
trampoline in the return code.
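To make the shape of the proposal concrete, here is a rough userspace C sketch of "complete every hypercall via one callback, with the return value stashed in vcpu->run->hypercall.ret". All structures and names below are simplified stand-ins for illustration, not the actual KVM code:

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-ins for the KVM structures; illustrative only. */
struct kvm_run {
	struct {
		uint64_t ret;
	} hypercall;
};

struct kvm_vcpu {
	struct kvm_run *run;
	uint64_t rax;
	int is_64_bit;
};

/* One completion callback type; TDX could supply its own implementation. */
typedef int (*hypercall_complete_fn)(struct kvm_vcpu *vcpu);

static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
{
	uint64_t ret = vcpu->run->hypercall.ret;

	/* A 32-bit caller only sees EAX, so truncate the return value. */
	if (!vcpu->is_64_bit)
		ret = (uint32_t)ret;
	vcpu->rax = ret;
	return 1; /* "resume guest" */
}

/*
 * Common emulation stashes the result in run->hypercall.ret and invokes the
 * callback on every path, rather than writing RAX directly in common code.
 */
static int emulate_hypercall(struct kvm_vcpu *vcpu, uint64_t result,
			     hypercall_complete_fn complete)
{
	vcpu->run->hypercall.ret = result;
	return complete(vcpu);
}
```

The point of the sketch is only the control flow: the common path never touches RAX itself, so a vendor module with different GPR usage can swap in a different completion callback without duplicating the common logic.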
v4:
- Fix an SEV-* bug where KVM trips the WARN in is_64_bit_mode().
- Add a pile of reworks to (hopefully) avoid as much duplicate code when
TDX comes along.
v3: https://lore.kernel.org/all/20240826022255.361406-1-binbin.wu@linux.intel.com
Binbin Wu (1):
KVM: x86: Add a helper to check for user interception of KVM
hypercalls
Sean Christopherson (5):
KVM: x86: Play nice with protected guests in complete_hypercall_exit()
KVM: x86: Move "emulate hypercall" function declarations to x86.h
KVM: x86: Bump hypercall stat prior to fully completing hypercall
KVM: x86: Always complete hypercall via function callback
KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
arch/x86/include/asm/kvm_host.h | 6 ----
arch/x86/kvm/svm/sev.c | 4 +--
arch/x86/kvm/x86.c | 50 +++++++++++----------------------
arch/x86/kvm/x86.h | 28 ++++++++++++++++++
4 files changed, 47 insertions(+), 41 deletions(-)
base-commit: 4d911c7abee56771b0219a9fbf0120d06bdc9c14
--
2.47.0.338.g60cca15819-goog
^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit()
  2024-11-28  0:43 [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Sean Christopherson
@ 2024-11-28  0:43 ` Sean Christopherson
  2024-11-28  3:22   ` Xiaoyao Li
                     ` (3 more replies)
  2024-11-28  0:43 ` [PATCH v4 2/6] KVM: x86: Add a helper to check for user interception of KVM hypercalls Sean Christopherson
                     ` (6 subsequent siblings)
  7 siblings, 4 replies; 39+ messages in thread
From: Sean Christopherson @ 2024-11-28  0:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li

Use is_64_bit_hypercall() instead of is_64_bit_mode() to detect a 64-bit
hypercall when completing said hypercall.  For guests with protected state,
e.g. SEV-ES and SEV-SNP, KVM must assume the hypercall was made in 64-bit
mode as the vCPU state needed to detect 64-bit mode is unavailable.

Hacking the sev_smoke_test selftest to generate a KVM_HC_MAP_GPA_RANGE
hypercall via VMGEXIT trips the WARN:

  ------------[ cut here ]------------
  WARNING: CPU: 273 PID: 326626 at arch/x86/kvm/x86.h:180 complete_hypercall_exit+0x44/0xe0 [kvm]
  Modules linked in: kvm_amd kvm ... [last unloaded: kvm]
  CPU: 273 UID: 0 PID: 326626 Comm: sev_smoke_test Not tainted 6.12.0-smp--392e932fa0f3-feat #470
  Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024
  RIP: 0010:complete_hypercall_exit+0x44/0xe0 [kvm]
  Call Trace:
   <TASK>
   kvm_arch_vcpu_ioctl_run+0x2400/0x2720 [kvm]
   kvm_vcpu_ioctl+0x54f/0x630 [kvm]
   __se_sys_ioctl+0x6b/0xc0
   do_syscall_64+0x83/0x160
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
   </TASK>
  ---[ end trace 0000000000000000 ]---

Fixes: b5aead0064f3 ("KVM: x86: Assume a 64-bit hypercall for guests with protected state")
Cc: stable@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2e713480933a..0b2fe4aa04a2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9976,7 +9976,7 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
 {
 	u64 ret = vcpu->run->hypercall.ret;
 
-	if (!is_64_bit_mode(vcpu))
+	if (!is_64_bit_hypercall(vcpu))
 		ret = (u32)ret;
 	kvm_rax_write(vcpu, ret);
 	++vcpu->stat.hypercalls;
-- 
2.47.0.338.g60cca15819-goog

^ permalink raw reply related	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit()
  2024-11-28  0:43 ` [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit() Sean Christopherson
@ 2024-11-28  3:22   ` Xiaoyao Li
  2024-11-29  9:01   ` Nikunj A. Dadhania
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Xiaoyao Li @ 2024-11-28  3:22 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata, Kai Huang

On 11/28/2024 8:43 AM, Sean Christopherson wrote:
> Use is_64_bit_hypercall() instead of is_64_bit_mode() to detect a 64-bit
> hypercall when completing said hypercall.  For guests with protected state,
> e.g. SEV-ES and SEV-SNP, KVM must assume the hypercall was made in 64-bit
> mode as the vCPU state needed to detect 64-bit mode is unavailable.
> 
> Hacking the sev_smoke_test selftest to generate a KVM_HC_MAP_GPA_RANGE
> hypercall via VMGEXIT trips the WARN:
> 
>   ------------[ cut here ]------------
>   WARNING: CPU: 273 PID: 326626 at arch/x86/kvm/x86.h:180 complete_hypercall_exit+0x44/0xe0 [kvm]
>   Modules linked in: kvm_amd kvm ... [last unloaded: kvm]
>   CPU: 273 UID: 0 PID: 326626 Comm: sev_smoke_test Not tainted 6.12.0-smp--392e932fa0f3-feat #470
>   Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024
>   RIP: 0010:complete_hypercall_exit+0x44/0xe0 [kvm]
>   Call Trace:
>    <TASK>
>    kvm_arch_vcpu_ioctl_run+0x2400/0x2720 [kvm]
>    kvm_vcpu_ioctl+0x54f/0x630 [kvm]
>    __se_sys_ioctl+0x6b/0xc0
>    do_syscall_64+0x83/0x160
>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>    </TASK>
>   ---[ end trace 0000000000000000 ]---

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> Fixes: b5aead0064f3 ("KVM: x86: Assume a 64-bit hypercall for guests with protected state")
> Cc: stable@vger.kernel.org
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2e713480933a..0b2fe4aa04a2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9976,7 +9976,7 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
>  {
>  	u64 ret = vcpu->run->hypercall.ret;
>  
> -	if (!is_64_bit_mode(vcpu))
> +	if (!is_64_bit_hypercall(vcpu))
>  		ret = (u32)ret;
>  	kvm_rax_write(vcpu, ret);
>  	++vcpu->stat.hypercalls;

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit()
  2024-11-28  0:43 ` [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit() Sean Christopherson
  2024-11-28  3:22   ` Xiaoyao Li
@ 2024-11-29  9:01   ` Nikunj A. Dadhania
  2024-12-02 18:54   ` Tom Lendacky
  2024-12-03  7:29   ` Binbin Wu
  3 siblings, 0 replies; 39+ messages in thread
From: Nikunj A. Dadhania @ 2024-11-29  9:01 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li

On 11/28/2024 6:13 AM, Sean Christopherson wrote:
> Use is_64_bit_hypercall() instead of is_64_bit_mode() to detect a 64-bit
> hypercall when completing said hypercall.  For guests with protected state,
> e.g. SEV-ES and SEV-SNP, KVM must assume the hypercall was made in 64-bit
> mode as the vCPU state needed to detect 64-bit mode is unavailable.
> 
> Hacking the sev_smoke_test selftest to generate a KVM_HC_MAP_GPA_RANGE
> hypercall via VMGEXIT trips the WARN:
> 
>   ------------[ cut here ]------------
>   WARNING: CPU: 273 PID: 326626 at arch/x86/kvm/x86.h:180 complete_hypercall_exit+0x44/0xe0 [kvm]
>   Modules linked in: kvm_amd kvm ... [last unloaded: kvm]
>   CPU: 273 UID: 0 PID: 326626 Comm: sev_smoke_test Not tainted 6.12.0-smp--392e932fa0f3-feat #470
>   Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024
>   RIP: 0010:complete_hypercall_exit+0x44/0xe0 [kvm]
>   Call Trace:
>    <TASK>
>    kvm_arch_vcpu_ioctl_run+0x2400/0x2720 [kvm]
>    kvm_vcpu_ioctl+0x54f/0x630 [kvm]
>    __se_sys_ioctl+0x6b/0xc0
>    do_syscall_64+0x83/0x160
>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>    </TASK>
>   ---[ end trace 0000000000000000 ]---
> 
> Fixes: b5aead0064f3 ("KVM: x86: Assume a 64-bit hypercall for guests with protected state")
> Cc: stable@vger.kernel.org
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>

> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2e713480933a..0b2fe4aa04a2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9976,7 +9976,7 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
>  {
>  	u64 ret = vcpu->run->hypercall.ret;
>  
> -	if (!is_64_bit_mode(vcpu))
> +	if (!is_64_bit_hypercall(vcpu))
>  		ret = (u32)ret;
>  	kvm_rax_write(vcpu, ret);
>  	++vcpu->stat.hypercalls;

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit()
  2024-11-28  0:43 ` [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit() Sean Christopherson
  2024-11-28  3:22   ` Xiaoyao Li
  2024-11-29  9:01   ` Nikunj A. Dadhania
@ 2024-12-02 18:54   ` Tom Lendacky
  2024-12-03  7:29   ` Binbin Wu
  3 siblings, 0 replies; 39+ messages in thread
From: Tom Lendacky @ 2024-12-02 18:54 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li

On 11/27/24 18:43, Sean Christopherson wrote:
> Use is_64_bit_hypercall() instead of is_64_bit_mode() to detect a 64-bit
> hypercall when completing said hypercall.  For guests with protected state,
> e.g. SEV-ES and SEV-SNP, KVM must assume the hypercall was made in 64-bit
> mode as the vCPU state needed to detect 64-bit mode is unavailable.
> 
> Hacking the sev_smoke_test selftest to generate a KVM_HC_MAP_GPA_RANGE
> hypercall via VMGEXIT trips the WARN:
> 
>   ------------[ cut here ]------------
>   WARNING: CPU: 273 PID: 326626 at arch/x86/kvm/x86.h:180 complete_hypercall_exit+0x44/0xe0 [kvm]
>   Modules linked in: kvm_amd kvm ... [last unloaded: kvm]
>   CPU: 273 UID: 0 PID: 326626 Comm: sev_smoke_test Not tainted 6.12.0-smp--392e932fa0f3-feat #470
>   Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024
>   RIP: 0010:complete_hypercall_exit+0x44/0xe0 [kvm]
>   Call Trace:
>    <TASK>
>    kvm_arch_vcpu_ioctl_run+0x2400/0x2720 [kvm]
>    kvm_vcpu_ioctl+0x54f/0x630 [kvm]
>    __se_sys_ioctl+0x6b/0xc0
>    do_syscall_64+0x83/0x160
>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>    </TASK>
>   ---[ end trace 0000000000000000 ]---
> 
> Fixes: b5aead0064f3 ("KVM: x86: Assume a 64-bit hypercall for guests with protected state")
> Cc: stable@vger.kernel.org
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>

> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2e713480933a..0b2fe4aa04a2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9976,7 +9976,7 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
>  {
>  	u64 ret = vcpu->run->hypercall.ret;
>  
> -	if (!is_64_bit_mode(vcpu))
> +	if (!is_64_bit_hypercall(vcpu))
>  		ret = (u32)ret;
>  	kvm_rax_write(vcpu, ret);
>  	++vcpu->stat.hypercalls;

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit()
  2024-11-28  0:43 ` [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit() Sean Christopherson
                     ` (2 preceding siblings ...)
  2024-12-02 18:54   ` Tom Lendacky
@ 2024-12-03  7:29   ` Binbin Wu
  3 siblings, 0 replies; 39+ messages in thread
From: Binbin Wu @ 2024-12-03  7:29 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Isaku Yamahata, Kai Huang, Xiaoyao Li

On 11/28/2024 8:43 AM, Sean Christopherson wrote:
> Use is_64_bit_hypercall() instead of is_64_bit_mode() to detect a 64-bit
> hypercall when completing said hypercall.  For guests with protected state,
> e.g. SEV-ES and SEV-SNP, KVM must assume the hypercall was made in 64-bit
> mode as the vCPU state needed to detect 64-bit mode is unavailable.
> 
> Hacking the sev_smoke_test selftest to generate a KVM_HC_MAP_GPA_RANGE
> hypercall via VMGEXIT trips the WARN:
> 
>   ------------[ cut here ]------------
>   WARNING: CPU: 273 PID: 326626 at arch/x86/kvm/x86.h:180 complete_hypercall_exit+0x44/0xe0 [kvm]
>   Modules linked in: kvm_amd kvm ... [last unloaded: kvm]
>   CPU: 273 UID: 0 PID: 326626 Comm: sev_smoke_test Not tainted 6.12.0-smp--392e932fa0f3-feat #470
>   Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024
>   RIP: 0010:complete_hypercall_exit+0x44/0xe0 [kvm]
>   Call Trace:
>    <TASK>
>    kvm_arch_vcpu_ioctl_run+0x2400/0x2720 [kvm]
>    kvm_vcpu_ioctl+0x54f/0x630 [kvm]
>    __se_sys_ioctl+0x6b/0xc0
>    do_syscall_64+0x83/0x160
>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>    </TASK>
>   ---[ end trace 0000000000000000 ]---
> 
> Fixes: b5aead0064f3 ("KVM: x86: Assume a 64-bit hypercall for guests with protected state")
> Cc: stable@vger.kernel.org
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2e713480933a..0b2fe4aa04a2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9976,7 +9976,7 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
>  {
>  	u64 ret = vcpu->run->hypercall.ret;
>  
> -	if (!is_64_bit_mode(vcpu))
> +	if (!is_64_bit_hypercall(vcpu))
>  		ret = (u32)ret;
>  	kvm_rax_write(vcpu, ret);
>  	++vcpu->stat.hypercalls;

^ permalink raw reply	[flat|nested] 39+ messages in thread
* [PATCH v4 2/6] KVM: x86: Add a helper to check for user interception of KVM hypercalls
  2024-11-28  0:43 [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Sean Christopherson
  2024-11-28  0:43 ` [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit() Sean Christopherson
@ 2024-11-28  0:43 ` Sean Christopherson
  2024-12-02 18:57   ` Tom Lendacky
  2024-11-28  0:43 ` [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h Sean Christopherson
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2024-11-28  0:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li

From: Binbin Wu <binbin.wu@linux.intel.com>

Add and use user_exit_on_hypercall() to check if userspace wants to handle
a KVM hypercall instead of open-coding the logic everywhere.

No functional change intended.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
[sean: squash into one patch, keep explicit KVM_HC_MAP_GPA_RANGE check]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/sev.c | 4 ++--
 arch/x86/kvm/x86.c     | 2 +-
 arch/x86/kvm/x86.h     | 5 +++++
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 72674b8825c4..6ac6312c4d57 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3640,7 +3640,7 @@ static int snp_begin_psc_msr(struct vcpu_svm *svm, u64 ghcb_msr)
 		return 1; /* resume guest */
 	}
 
-	if (!(vcpu->kvm->arch.hypercall_exit_enabled & (1 << KVM_HC_MAP_GPA_RANGE))) {
+	if (!user_exit_on_hypercall(vcpu->kvm, KVM_HC_MAP_GPA_RANGE)) {
 		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
 		return 1; /* resume guest */
 	}
@@ -3723,7 +3723,7 @@ static int snp_begin_psc(struct vcpu_svm *svm, struct psc_buffer *psc)
 	bool huge;
 	u64 gfn;
 
-	if (!(vcpu->kvm->arch.hypercall_exit_enabled & (1 << KVM_HC_MAP_GPA_RANGE))) {
+	if (!user_exit_on_hypercall(vcpu->kvm, KVM_HC_MAP_GPA_RANGE)) {
 		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_GENERIC);
 		return 1;
 	}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0b2fe4aa04a2..13fe5d6eb8f3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10041,7 +10041,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
 		u64 gpa = a0, npages = a1, attrs = a2;
 
 		ret = -KVM_ENOSYS;
-		if (!(vcpu->kvm->arch.hypercall_exit_enabled & (1 << KVM_HC_MAP_GPA_RANGE)))
+		if (!user_exit_on_hypercall(vcpu->kvm, KVM_HC_MAP_GPA_RANGE))
 			break;
 
 		if (!PAGE_ALIGNED(gpa) || !npages ||
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index ec623d23d13d..45dd53284dbd 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -612,4 +612,9 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
 			 unsigned int port, void *data, unsigned int count,
 			 int in);
 
+static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
+{
+	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
+}
+
 #endif
-- 
2.47.0.338.g60cca15819-goog

^ permalink raw reply related	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 2/6] KVM: x86: Add a helper to check for user interception of KVM hypercalls
  2024-11-28  0:43 ` [PATCH v4 2/6] KVM: x86: Add a helper to check for user interception of KVM hypercalls Sean Christopherson
@ 2024-12-02 18:57   ` Tom Lendacky
  0 siblings, 0 replies; 39+ messages in thread
From: Tom Lendacky @ 2024-12-02 18:57 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li

On 11/27/24 18:43, Sean Christopherson wrote:
> From: Binbin Wu <binbin.wu@linux.intel.com>
> 
> Add and use user_exit_on_hypercall() to check if userspace wants to handle
> a KVM hypercall instead of open-coding the logic everywhere.
> 
> No functional change intended.
> 
> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Reviewed-by: Kai Huang <kai.huang@intel.com>
> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> [sean: squash into one patch, keep explicit KVM_HC_MAP_GPA_RANGE check]
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>

> ---
>  arch/x86/kvm/svm/sev.c | 4 ++--
>  arch/x86/kvm/x86.c     | 2 +-
>  arch/x86/kvm/x86.h     | 5 +++++
>  3 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 72674b8825c4..6ac6312c4d57 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3640,7 +3640,7 @@ static int snp_begin_psc_msr(struct vcpu_svm *svm, u64 ghcb_msr)
>  		return 1; /* resume guest */
>  	}
>  
> -	if (!(vcpu->kvm->arch.hypercall_exit_enabled & (1 << KVM_HC_MAP_GPA_RANGE))) {
> +	if (!user_exit_on_hypercall(vcpu->kvm, KVM_HC_MAP_GPA_RANGE)) {
>  		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
>  		return 1; /* resume guest */
>  	}
> @@ -3723,7 +3723,7 @@ static int snp_begin_psc(struct vcpu_svm *svm, struct psc_buffer *psc)
>  	bool huge;
>  	u64 gfn;
>  
> -	if (!(vcpu->kvm->arch.hypercall_exit_enabled & (1 << KVM_HC_MAP_GPA_RANGE))) {
> +	if (!user_exit_on_hypercall(vcpu->kvm, KVM_HC_MAP_GPA_RANGE)) {
>  		snp_complete_psc(svm, VMGEXIT_PSC_ERROR_GENERIC);
>  		return 1;
>  	}
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0b2fe4aa04a2..13fe5d6eb8f3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10041,7 +10041,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>  		u64 gpa = a0, npages = a1, attrs = a2;
>  
>  		ret = -KVM_ENOSYS;
> -		if (!(vcpu->kvm->arch.hypercall_exit_enabled & (1 << KVM_HC_MAP_GPA_RANGE)))
> +		if (!user_exit_on_hypercall(vcpu->kvm, KVM_HC_MAP_GPA_RANGE))
>  			break;
>  
>  		if (!PAGE_ALIGNED(gpa) || !npages ||
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index ec623d23d13d..45dd53284dbd 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -612,4 +612,9 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
>  			 unsigned int port, void *data, unsigned int count,
>  			 int in);
>  
> +static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
> +{
> +	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
> +}
> +
>  #endif

^ permalink raw reply	[flat|nested] 39+ messages in thread
* [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h
  2024-11-28  0:43 [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Sean Christopherson
  2024-11-28  0:43 ` [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit() Sean Christopherson
  2024-11-28  0:43 ` [PATCH v4 2/6] KVM: x86: Add a helper to check for user interception of KVM hypercalls Sean Christopherson
@ 2024-11-28  0:43 ` Sean Christopherson
  2024-11-28  3:23   ` Xiaoyao Li
                     ` (2 more replies)
  2024-11-28  0:43 ` [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall Sean Christopherson
                     ` (4 subsequent siblings)
  7 siblings, 3 replies; 39+ messages in thread
From: Sean Christopherson @ 2024-11-28  0:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li

Move the declarations for the hypercall emulation APIs to x86.h.  While
the helpers are exported, they are intended to be consumed only KVM vendor
modules, i.e. don't need to exposed to the kernel at-large.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 6 ------
 arch/x86/kvm/x86.h              | 6 ++++++
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e159e44a6a1b..c1251b371421 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2181,12 +2181,6 @@ static inline void kvm_clear_apicv_inhibit(struct kvm *kvm,
 	kvm_set_or_clear_apicv_inhibit(kvm, reason, false);
 }
 
-unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
-				      unsigned long a0, unsigned long a1,
-				      unsigned long a2, unsigned long a3,
-				      int op_64_bit, int cpl);
-int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
-
 int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
 		       void *insn, int insn_len);
 void kvm_mmu_print_sptes(struct kvm_vcpu *vcpu, gpa_t gpa, const char *msg);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 45dd53284dbd..6db13b696468 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -617,4 +617,10 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
 	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
 }
 
+unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
+				      unsigned long a0, unsigned long a1,
+				      unsigned long a2, unsigned long a3,
+				      int op_64_bit, int cpl);
+int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
+
 #endif
-- 
2.47.0.338.g60cca15819-goog

^ permalink raw reply related	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h
  2024-11-28  0:43 ` [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h Sean Christopherson
@ 2024-11-28  3:23   ` Xiaoyao Li
  2024-12-02 20:05   ` Tom Lendacky
  2024-12-03  7:33   ` Binbin Wu
  2 siblings, 0 replies; 39+ messages in thread
From: Xiaoyao Li @ 2024-11-28  3:23 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata, Kai Huang

On 11/28/2024 8:43 AM, Sean Christopherson wrote:
> Move the declarations for the hypercall emulation APIs to x86.h.  While
> the helpers are exported, they are intended to be consumed only KVM vendor
> modules, i.e. don't need to exposed to the kernel at-large.
> 
> No functional change intended.

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 6 ------
>  arch/x86/kvm/x86.h              | 6 ++++++
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e159e44a6a1b..c1251b371421 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2181,12 +2181,6 @@ static inline void kvm_clear_apicv_inhibit(struct kvm *kvm,
>  	kvm_set_or_clear_apicv_inhibit(kvm, reason, false);
>  }
>  
> -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> -				      unsigned long a0, unsigned long a1,
> -				      unsigned long a2, unsigned long a3,
> -				      int op_64_bit, int cpl);
> -int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
> -
>  int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
>  		       void *insn, int insn_len);
>  void kvm_mmu_print_sptes(struct kvm_vcpu *vcpu, gpa_t gpa, const char *msg);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 45dd53284dbd..6db13b696468 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -617,4 +617,10 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
>  	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
>  }
>  
> +unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> +				      unsigned long a0, unsigned long a1,
> +				      unsigned long a2, unsigned long a3,
> +				      int op_64_bit, int cpl);
> +int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
> +
>  #endif

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h
  2024-11-28  0:43 ` [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h Sean Christopherson
  2024-11-28  3:23   ` Xiaoyao Li
@ 2024-12-02 20:05   ` Tom Lendacky
  2024-12-03  7:33   ` Binbin Wu
  2 siblings, 0 replies; 39+ messages in thread
From: Tom Lendacky @ 2024-12-02 20:05 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li

On 11/27/24 18:43, Sean Christopherson wrote:
> Move the declarations for the hypercall emulation APIs to x86.h.  While
> the helpers are exported, they are intended to be consumed only KVM vendor

s/only/only by/

> modules, i.e. don't need to exposed to the kernel at-large.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>

> ---
>  arch/x86/include/asm/kvm_host.h | 6 ------
>  arch/x86/kvm/x86.h              | 6 ++++++
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e159e44a6a1b..c1251b371421 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2181,12 +2181,6 @@ static inline void kvm_clear_apicv_inhibit(struct kvm *kvm,
>  	kvm_set_or_clear_apicv_inhibit(kvm, reason, false);
>  }
>  
> -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> -				      unsigned long a0, unsigned long a1,
> -				      unsigned long a2, unsigned long a3,
> -				      int op_64_bit, int cpl);
> -int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
> -
>  int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
>  		       void *insn, int insn_len);
>  void kvm_mmu_print_sptes(struct kvm_vcpu *vcpu, gpa_t gpa, const char *msg);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 45dd53284dbd..6db13b696468 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -617,4 +617,10 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
>  	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
>  }
>  
> +unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> +				      unsigned long a0, unsigned long a1,
> +				      unsigned long a2, unsigned long a3,
> +				      int op_64_bit, int cpl);
> +int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
> +
>  #endif

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h
  2024-11-28  0:43 ` [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h Sean Christopherson
  2024-11-28  3:23   ` Xiaoyao Li
  2024-12-02 20:05   ` Tom Lendacky
@ 2024-12-03  7:33   ` Binbin Wu
  2 siblings, 0 replies; 39+ messages in thread
From: Binbin Wu @ 2024-12-03  7:33 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Isaku Yamahata, Kai Huang, Xiaoyao Li

On 11/28/2024 8:43 AM, Sean Christopherson wrote:
> Move the declarations for the hypercall emulation APIs to x86.h.  While
> the helpers are exported, they are intended to be consumed only KVM vendor

only -> only by

> modules, i.e. don't need to exposed to the kernel at-large.

to exposed -> to be exposed

> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

> ---
>  arch/x86/include/asm/kvm_host.h | 6 ------
>  arch/x86/kvm/x86.h              | 6 ++++++
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e159e44a6a1b..c1251b371421 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2181,12 +2181,6 @@ static inline void kvm_clear_apicv_inhibit(struct kvm *kvm,
>  	kvm_set_or_clear_apicv_inhibit(kvm, reason, false);
>  }
>  
> -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> -				      unsigned long a0, unsigned long a1,
> -				      unsigned long a2, unsigned long a3,
> -				      int op_64_bit, int cpl);
> -int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
> -
>  int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
>  		       void *insn, int insn_len);
>  void kvm_mmu_print_sptes(struct kvm_vcpu *vcpu, gpa_t gpa, const char *msg);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 45dd53284dbd..6db13b696468 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -617,4 +617,10 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
>  	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
>  }
>  
> +unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> +				      unsigned long a0, unsigned long a1,
> +				      unsigned long a2, unsigned long a3,
> +				      int op_64_bit, int cpl);
> +int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
> +
>  #endif

^ permalink raw reply	[flat|nested] 39+ messages in thread
* [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall
  2024-11-28  0:43 [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Sean Christopherson
                     ` (2 preceding siblings ...)
  2024-11-28  0:43 ` [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h Sean Christopherson
@ 2024-11-28  0:43 ` Sean Christopherson
  2024-11-28  3:24   ` Xiaoyao Li
                     ` (2 more replies)
  2024-11-28  0:43 ` [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback Sean Christopherson
                     ` (3 subsequent siblings)
  7 siblings, 3 replies; 39+ messages in thread
From: Sean Christopherson @ 2024-11-28  0:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li

Increment the "hypercalls" stat for KVM hypercalls as soon as KVM knows
it will skip the guest instruction, i.e. once KVM is committed to emulating
the hypercall.  Waiting until completion adds no known value, and creates a
discrepancy where the stat will be bumped if KVM exits to userspace as a
result of trying to skip the instruction, but not if the hypercall itself
exits.

Handling the stat in common code will also avoid the need for another
helper to dedup code when TDX comes along (TDX needs a separate completion
path due to GPR usage differences).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 13fe5d6eb8f3..11434752b467 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9979,7 +9979,6 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
 	if (!is_64_bit_hypercall(vcpu))
 		ret = (u32)ret;
 	kvm_rax_write(vcpu, ret);
-	++vcpu->stat.hypercalls;
 
 	return kvm_skip_emulated_instruction(vcpu);
 }
@@ -9990,6 +9989,8 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
 {
 	unsigned long ret;
 
+	++vcpu->stat.hypercalls;
+
 	trace_kvm_hypercall(nr, a0, a1, a2, a3);
 
 	if (!op_64_bit) {
@@ -10070,7 +10071,6 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
 	}
 
 out:
-	++vcpu->stat.hypercalls;
 	return ret;
 }
 EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall);
-- 
2.47.0.338.g60cca15819-goog

^ permalink raw reply related	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall 2024-11-28 0:43 ` [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall Sean Christopherson @ 2024-11-28 3:24 ` Xiaoyao Li 2024-12-02 20:13 ` Tom Lendacky 2024-12-03 7:37 ` Binbin Wu 2 siblings, 0 replies; 39+ messages in thread From: Xiaoyao Li @ 2024-11-28 3:24 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata, Kai Huang On 11/28/2024 8:43 AM, Sean Christopherson wrote: > Increment the "hypercalls" stat for KVM hypercalls as soon as KVM knows > it will skip the guest instruction, i.e. once KVM is committed to emulating > the hypercall. Waiting until completion adds no known value, and creates a > discrepancy where the stat will be bumped if KVM exits to userspace as a > result of trying to skip the instruction, but not if the hypercall itself > exits. > > Handling the stat in common code will also avoid the need for another > helper to dedup code when TDX comes along (TDX needs a separate completion > path due to GPR usage differences). 
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/x86.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 13fe5d6eb8f3..11434752b467 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -9979,7 +9979,6 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) > if (!is_64_bit_hypercall(vcpu)) > ret = (u32)ret; > kvm_rax_write(vcpu, ret); > - ++vcpu->stat.hypercalls; > return kvm_skip_emulated_instruction(vcpu); > } > > @@ -9990,6 +9989,8 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > { > unsigned long ret; > > + ++vcpu->stat.hypercalls; > + > trace_kvm_hypercall(nr, a0, a1, a2, a3); > > if (!op_64_bit) { > @@ -10070,7 +10071,6 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > } > > out: > - ++vcpu->stat.hypercalls; > return ret; > } > EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall 2024-11-28 0:43 ` [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall Sean Christopherson 2024-11-28 3:24 ` Xiaoyao Li @ 2024-12-02 20:13 ` Tom Lendacky 2024-12-02 20:33 ` Tom Lendacky 2024-12-03 7:37 ` Binbin Wu 2 siblings, 1 reply; 39+ messages in thread From: Tom Lendacky @ 2024-12-02 20:13 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini Cc: kvm, linux-kernel, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li On 11/27/24 18:43, Sean Christopherson wrote: > Increment the "hypercalls" stat for KVM hypercalls as soon as KVM knows > it will skip the guest instruction, i.e. once KVM is committed to emulating > the hypercall. Waiting until completion adds no known value, and creates a > discrepancy where the stat will be bumped if KVM exits to userspace as a > result of trying to skip the instruction, but not if the hypercall itself > exits. > > Handling the stat in common code will also avoid the need for another > helper to dedup code when TDX comes along (TDX needs a separate completion > path due to GPR usage differences). > > Signed-off-by: Sean Christopherson <seanjc@google.com> There's a comment in the KVM_HC_MAP_GPA_RANGE case that reads: /* stat is incremented on completion. 
*/ that should probably be deleted, but otherwise: Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Also, if you want, you could get rid of the 'out' label, too, by doing: if (cpl) return -KVM_EPERM; > --- > arch/x86/kvm/x86.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 13fe5d6eb8f3..11434752b467 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -9979,7 +9979,6 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) > if (!is_64_bit_hypercall(vcpu)) > ret = (u32)ret; > kvm_rax_write(vcpu, ret); > - ++vcpu->stat.hypercalls; > return kvm_skip_emulated_instruction(vcpu); > } > > @@ -9990,6 +9989,8 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > { > unsigned long ret; > > + ++vcpu->stat.hypercalls; > + > trace_kvm_hypercall(nr, a0, a1, a2, a3); > > if (!op_64_bit) { > @@ -10070,7 +10071,6 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > } > > out: > - ++vcpu->stat.hypercalls; > return ret; > } > EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall 2024-12-02 20:13 ` Tom Lendacky @ 2024-12-02 20:33 ` Tom Lendacky 0 siblings, 0 replies; 39+ messages in thread From: Tom Lendacky @ 2024-12-02 20:33 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini Cc: kvm, linux-kernel, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li On 12/2/24 14:13, Tom Lendacky wrote: > On 11/27/24 18:43, Sean Christopherson wrote: >> Increment the "hypercalls" stat for KVM hypercalls as soon as KVM knows >> it will skip the guest instruction, i.e. once KVM is committed to emulating >> the hypercall. Waiting until completion adds no known value, and creates a >> discrepancy where the stat will be bumped if KVM exits to userspace as a >> result of trying to skip the instruction, but not if the hypercall itself >> exits. >> >> Handling the stat in common code will also avoid the need for another >> helper to dedup code when TDX comes along (TDX needs a separate completion >> path due to GPR usage differences). >> >> Signed-off-by: Sean Christopherson <seanjc@google.com> > > There's a comment in the KVM_HC_MAP_GPA_RANGE case that reads: > > /* stat is incremented on completion. */ > > that should probably be deleted, but otherwise: > > Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> > > Also, if you want, you could get rid of the 'out' label, too, by doing: > > if (cpl) > return -KVM_EPERM; Until I saw the next patch... nevermind. 
Thanks, Tom > >> --- >> arch/x86/kvm/x86.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index 13fe5d6eb8f3..11434752b467 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -9979,7 +9979,6 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) >> if (!is_64_bit_hypercall(vcpu)) >> ret = (u32)ret; >> kvm_rax_write(vcpu, ret); >> - ++vcpu->stat.hypercalls; >> return kvm_skip_emulated_instruction(vcpu); >> } >> >> @@ -9990,6 +9989,8 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, >> { >> unsigned long ret; >> >> + ++vcpu->stat.hypercalls; >> + >> trace_kvm_hypercall(nr, a0, a1, a2, a3); >> >> if (!op_64_bit) { >> @@ -10070,7 +10071,6 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, >> } >> >> out: >> - ++vcpu->stat.hypercalls; >> return ret; >> } >> EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall 2024-11-28 0:43 ` [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall Sean Christopherson 2024-11-28 3:24 ` Xiaoyao Li 2024-12-02 20:13 ` Tom Lendacky @ 2024-12-03 7:37 ` Binbin Wu 2 siblings, 0 replies; 39+ messages in thread From: Binbin Wu @ 2024-12-03 7:37 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini Cc: kvm, linux-kernel, Tom Lendacky, Isaku Yamahata, Kai Huang, Xiaoyao Li On 11/28/2024 8:43 AM, Sean Christopherson wrote: > Increment the "hypercalls" stat for KVM hypercalls as soon as KVM knows > it will skip the guest instruction, i.e. once KVM is committed to emulating > the hypercall. Waiting until completion adds no known value, and creates a > discrepancy where the stat will be bumped if KVM exits to userspace as a > result of trying to skip the instruction, but not if the hypercall itself > exits. > > Handling the stat in common code will also avoid the need for another > helper to dedup code when TDX comes along (TDX needs a separate completion > path due to GPR usage differences). 
> > Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> > --- > arch/x86/kvm/x86.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 13fe5d6eb8f3..11434752b467 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -9979,7 +9979,6 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) > if (!is_64_bit_hypercall(vcpu)) > ret = (u32)ret; > kvm_rax_write(vcpu, ret); > - ++vcpu->stat.hypercalls; > return kvm_skip_emulated_instruction(vcpu); > } > > @@ -9990,6 +9989,8 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > { > unsigned long ret; > > + ++vcpu->stat.hypercalls; > + > trace_kvm_hypercall(nr, a0, a1, a2, a3); > > if (!op_64_bit) { > @@ -10070,7 +10071,6 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > } > > out: > - ++vcpu->stat.hypercalls; > return ret; > } > EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback 2024-11-28 0:43 [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Sean Christopherson ` (3 preceding siblings ...) 2024-11-28 0:43 ` [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall Sean Christopherson @ 2024-11-28 0:43 ` Sean Christopherson 2024-11-28 3:08 ` Xiaoyao Li 2024-12-02 20:57 ` Tom Lendacky 2024-11-28 0:43 ` [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro Sean Christopherson ` (2 subsequent siblings) 7 siblings, 2 replies; 39+ messages in thread From: Sean Christopherson @ 2024-11-28 0:43 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li Finish "emulation" of KVM hypercalls by function callback, even when the hypercall is handled entirely within KVM, i.e. doesn't require an exit to userspace, and refactor __kvm_emulate_hypercall()'s return value to *only* communicate whether or not KVM should exit to userspace or resume the guest. (Ab)Use vcpu->run->hypercall.ret to propagate the return value to the callback, purely to avoid having to add a trampoline for every completion callback. Using the function return value for KVM's control flow eliminates the multiplexed return value, where '0' for KVM_HC_MAP_GPA_RANGE (and only that hypercall) means "exit to userspace". Note, the unnecessary extra indirect call and thus potential retpoline will be eliminated in the near future by converting the intermediate layer to a macro. 
Suggested-by: Binbin Wu <binbin.wu@linux.intel.com> Suggested-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/x86.c | 29 ++++++++++++----------------- arch/x86/kvm/x86.h | 10 ++++++---- 2 files changed, 18 insertions(+), 21 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 11434752b467..39be2a891ab4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9982,10 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) return kvm_skip_emulated_instruction(vcpu); } -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, - unsigned long a0, unsigned long a1, - unsigned long a2, unsigned long a3, - int op_64_bit, int cpl) +int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, + unsigned long a0, unsigned long a1, + unsigned long a2, unsigned long a3, + int op_64_bit, int cpl, + int (*complete_hypercall)(struct kvm_vcpu *)) { unsigned long ret; @@ -10061,7 +10062,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, vcpu->run->hypercall.flags |= KVM_EXIT_HYPERCALL_LONG_MODE; WARN_ON_ONCE(vcpu->run->hypercall.flags & KVM_EXIT_HYPERCALL_MBZ); - vcpu->arch.complete_userspace_io = complete_hypercall_exit; + vcpu->arch.complete_userspace_io = complete_hypercall; /* stat is incremented on completion. 
*/ return 0; } @@ -10071,13 +10072,15 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, } out: - return ret; + vcpu->run->hypercall.ret = ret; + complete_hypercall(vcpu); + return 1; } EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) { - unsigned long nr, a0, a1, a2, a3, ret; + unsigned long nr, a0, a1, a2, a3; int op_64_bit; int cpl; @@ -10095,16 +10098,8 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) op_64_bit = is_64_bit_hypercall(vcpu); cpl = kvm_x86_call(get_cpl)(vcpu); - ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl); - if (nr == KVM_HC_MAP_GPA_RANGE && !ret) - /* MAP_GPA tosses the request to the user space. */ - return 0; - - if (!op_64_bit) - ret = (u32)ret; - kvm_rax_write(vcpu, ret); - - return kvm_skip_emulated_instruction(vcpu); + return __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, + complete_hypercall_exit); } EXPORT_SYMBOL_GPL(kvm_emulate_hypercall); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 6db13b696468..28adc8ea04bf 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -617,10 +617,12 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr) return kvm->arch.hypercall_exit_enabled & BIT(hc_nr); } -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, - unsigned long a0, unsigned long a1, - unsigned long a2, unsigned long a3, - int op_64_bit, int cpl); +int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, + unsigned long a0, unsigned long a1, + unsigned long a2, unsigned long a3, + int op_64_bit, int cpl, + int (*complete_hypercall)(struct kvm_vcpu *)); + int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); #endif -- 2.47.0.338.g60cca15819-goog ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback 2024-11-28 0:43 ` [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback Sean Christopherson @ 2024-11-28 3:08 ` Xiaoyao Li 2024-12-02 18:44 ` Sean Christopherson 2024-12-02 20:57 ` Tom Lendacky 1 sibling, 1 reply; 39+ messages in thread From: Xiaoyao Li @ 2024-11-28 3:08 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata, Kai Huang On 11/28/2024 8:43 AM, Sean Christopherson wrote: > Finish "emulation" of KVM hypercalls by function callback, even when the > hypercall is handled entirely within KVM, i.e. doesn't require an exit to > userspace, and refactor __kvm_emulate_hypercall()'s return value to *only* > communicate whether or not KVM should exit to userspace or resume the > guest. > > (Ab)Use vcpu->run->hypercall.ret to propagate the return value to the > callback, purely to avoid having to add a trampoline for every completion > callback. > > Using the function return value for KVM's control flow eliminates the > multiplexed return value, where '0' for KVM_HC_MAP_GPA_RANGE (and only > that hypercall) means "exit to userspace". > > Note, the unnecessary extra indirect call and thus potential retpoline > will be eliminated in the near future by converting the intermediate layer > to a macro. 
> > Suggested-by: Binbin Wu <binbin.wu@linux.intel.com> > Suggested-by: Kai Huang <kai.huang@intel.com> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/x86.c | 29 ++++++++++++----------------- > arch/x86/kvm/x86.h | 10 ++++++---- > 2 files changed, 18 insertions(+), 21 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 11434752b467..39be2a891ab4 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -9982,10 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) > return kvm_skip_emulated_instruction(vcpu); > } > > -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > - unsigned long a0, unsigned long a1, > - unsigned long a2, unsigned long a3, > - int op_64_bit, int cpl) > +int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > + unsigned long a0, unsigned long a1, > + unsigned long a2, unsigned long a3, > + int op_64_bit, int cpl, > + int (*complete_hypercall)(struct kvm_vcpu *)) > { > unsigned long ret; > > @@ -10061,7 +10062,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > vcpu->run->hypercall.flags |= KVM_EXIT_HYPERCALL_LONG_MODE; > > WARN_ON_ONCE(vcpu->run->hypercall.flags & KVM_EXIT_HYPERCALL_MBZ); > - vcpu->arch.complete_userspace_io = complete_hypercall_exit; > + vcpu->arch.complete_userspace_io = complete_hypercall; > /* stat is incremented on completion. */ > return 0; > } > @@ -10071,13 +10072,15 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > } > > out: > - return ret; > + vcpu->run->hypercall.ret = ret; > + complete_hypercall(vcpu); > + return 1; shouldn't it be return complete_hypercall(vcpu); ? Originally, kvm_emulate_hypercall() returns kvm_skip_emulated_instruction(). Now it becomes kvm_skip_emulated_instruction(); return 1; I don't go deep to see if kvm_skip_emulated_instruction() always return 1 for this case. 
But logically, return complete_hypercall(vcpu); looks more reasonable. > } > EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); > > int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) > { > - unsigned long nr, a0, a1, a2, a3, ret; > + unsigned long nr, a0, a1, a2, a3; > int op_64_bit; > int cpl; > > @@ -10095,16 +10098,8 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) > op_64_bit = is_64_bit_hypercall(vcpu); > cpl = kvm_x86_call(get_cpl)(vcpu); > > - ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl); > - if (nr == KVM_HC_MAP_GPA_RANGE && !ret) > - /* MAP_GPA tosses the request to the user space. */ > - return 0; > - > - if (!op_64_bit) > - ret = (u32)ret; > - kvm_rax_write(vcpu, ret); > - > - return kvm_skip_emulated_instruction(vcpu); > + return __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, > + complete_hypercall_exit); > } > EXPORT_SYMBOL_GPL(kvm_emulate_hypercall); > > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h > index 6db13b696468..28adc8ea04bf 100644 > --- a/arch/x86/kvm/x86.h > +++ b/arch/x86/kvm/x86.h > @@ -617,10 +617,12 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr) > return kvm->arch.hypercall_exit_enabled & BIT(hc_nr); > } > > -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > - unsigned long a0, unsigned long a1, > - unsigned long a2, unsigned long a3, > - int op_64_bit, int cpl); > +int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > + unsigned long a0, unsigned long a1, > + unsigned long a2, unsigned long a3, > + int op_64_bit, int cpl, > + int (*complete_hypercall)(struct kvm_vcpu *)); > + > int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); > > #endif ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback 2024-11-28 3:08 ` Xiaoyao Li @ 2024-12-02 18:44 ` Sean Christopherson 0 siblings, 0 replies; 39+ messages in thread From: Sean Christopherson @ 2024-12-02 18:44 UTC (permalink / raw) To: Xiaoyao Li Cc: Paolo Bonzini, kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata, Kai Huang On Thu, Nov 28, 2024, Xiaoyao Li wrote: > On 11/28/2024 8:43 AM, Sean Christopherson wrote: > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 11434752b467..39be2a891ab4 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -9982,10 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) > > return kvm_skip_emulated_instruction(vcpu); > > } > > -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > > - unsigned long a0, unsigned long a1, > > - unsigned long a2, unsigned long a3, > > - int op_64_bit, int cpl) > > +int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > > + unsigned long a0, unsigned long a1, > > + unsigned long a2, unsigned long a3, > > + int op_64_bit, int cpl, > > + int (*complete_hypercall)(struct kvm_vcpu *)) > > { > > unsigned long ret; > > @@ -10061,7 +10062,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > > vcpu->run->hypercall.flags |= KVM_EXIT_HYPERCALL_LONG_MODE; > > WARN_ON_ONCE(vcpu->run->hypercall.flags & KVM_EXIT_HYPERCALL_MBZ); > > - vcpu->arch.complete_userspace_io = complete_hypercall_exit; > > + vcpu->arch.complete_userspace_io = complete_hypercall; > > /* stat is incremented on completion. */ > > return 0; > > } > > @@ -10071,13 +10072,15 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > > } > > out: > > - return ret; > > + vcpu->run->hypercall.ret = ret; > > + complete_hypercall(vcpu); > > + return 1; > > shouldn't it be > > return complete_hypercall(vcpu); > > ? Ouch. Yes, it most definitely should be. 
> Originally, kvm_emulate_hypercall() returns kvm_skip_emulated_instruction(). > Now it becomes > > kvm_skip_emulated_instruction(); > return 1; > > I don't go deep to see if kvm_skip_emulated_instruction() always return 1 > for this case. It doesn't. KVM needs to exit to userspace if userspace is single-stepping, or in the extremely unlikely scenario that KVM can't skip the emulated instruction (which can very theoretically happen on older AMD CPUs). ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback 2024-11-28 0:43 ` [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback Sean Christopherson 2024-11-28 3:08 ` Xiaoyao Li @ 2024-12-02 20:57 ` Tom Lendacky 2024-12-02 20:59 ` Tom Lendacky 1 sibling, 1 reply; 39+ messages in thread From: Tom Lendacky @ 2024-12-02 20:57 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini Cc: kvm, linux-kernel, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li On 11/27/24 18:43, Sean Christopherson wrote: > Finish "emulation" of KVM hypercalls by function callback, even when the > hypercall is handled entirely within KVM, i.e. doesn't require an exit to > userspace, and refactor __kvm_emulate_hypercall()'s return value to *only* > communicate whether or not KVM should exit to userspace or resume the > guest. > > (Ab)Use vcpu->run->hypercall.ret to propagate the return value to the > callback, purely to avoid having to add a trampoline for every completion > callback. > > Using the function return value for KVM's control flow eliminates the > multiplexed return value, where '0' for KVM_HC_MAP_GPA_RANGE (and only > that hypercall) means "exit to userspace". > > Note, the unnecessary extra indirect call and thus potential retpoline > will be eliminated in the near future by converting the intermediate layer > to a macro. 
> > Suggested-by: Binbin Wu <binbin.wu@linux.intel.com> > Suggested-by: Kai Huang <kai.huang@intel.com> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/x86.c | 29 ++++++++++++----------------- > arch/x86/kvm/x86.h | 10 ++++++---- > 2 files changed, 18 insertions(+), 21 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 11434752b467..39be2a891ab4 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -9982,10 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) > return kvm_skip_emulated_instruction(vcpu); > } > > -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > - unsigned long a0, unsigned long a1, > - unsigned long a2, unsigned long a3, > - int op_64_bit, int cpl) > +int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > + unsigned long a0, unsigned long a1, > + unsigned long a2, unsigned long a3, > + int op_64_bit, int cpl, > + int (*complete_hypercall)(struct kvm_vcpu *)) > { > unsigned long ret; > > @@ -10061,7 +10062,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > vcpu->run->hypercall.flags |= KVM_EXIT_HYPERCALL_LONG_MODE; > > WARN_ON_ONCE(vcpu->run->hypercall.flags & KVM_EXIT_HYPERCALL_MBZ); > - vcpu->arch.complete_userspace_io = complete_hypercall_exit; > + vcpu->arch.complete_userspace_io = complete_hypercall; > /* stat is incremented on completion. */ > return 0; > } > @@ -10071,13 +10072,15 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > } > > out: > - return ret; > + vcpu->run->hypercall.ret = ret; > + complete_hypercall(vcpu); > + return 1; Should this do return complete_hypercall(vcpu) so that you get the return code from kvm_skip_emulated_instruction()? 
Thanks, Tom > } > EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); > > int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) > { > - unsigned long nr, a0, a1, a2, a3, ret; > + unsigned long nr, a0, a1, a2, a3; > int op_64_bit; > int cpl; > > @@ -10095,16 +10098,8 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) > op_64_bit = is_64_bit_hypercall(vcpu); > cpl = kvm_x86_call(get_cpl)(vcpu); > > - ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl); > - if (nr == KVM_HC_MAP_GPA_RANGE && !ret) > - /* MAP_GPA tosses the request to the user space. */ > - return 0; > - > - if (!op_64_bit) > - ret = (u32)ret; > - kvm_rax_write(vcpu, ret); > - > - return kvm_skip_emulated_instruction(vcpu); > + return __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, > + complete_hypercall_exit); > } > EXPORT_SYMBOL_GPL(kvm_emulate_hypercall); > > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h > index 6db13b696468..28adc8ea04bf 100644 > --- a/arch/x86/kvm/x86.h > +++ b/arch/x86/kvm/x86.h > @@ -617,10 +617,12 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr) > return kvm->arch.hypercall_exit_enabled & BIT(hc_nr); > } > > -unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > - unsigned long a0, unsigned long a1, > - unsigned long a2, unsigned long a3, > - int op_64_bit, int cpl); > +int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > + unsigned long a0, unsigned long a1, > + unsigned long a2, unsigned long a3, > + int op_64_bit, int cpl, > + int (*complete_hypercall)(struct kvm_vcpu *)); > + > int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); > > #endif ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback 2024-12-02 20:57 ` Tom Lendacky @ 2024-12-02 20:59 ` Tom Lendacky 2024-12-03 0:14 ` Sean Christopherson 0 siblings, 1 reply; 39+ messages in thread From: Tom Lendacky @ 2024-12-02 20:59 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini Cc: kvm, linux-kernel, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li On 12/2/24 14:57, Tom Lendacky wrote: > On 11/27/24 18:43, Sean Christopherson wrote: >> @@ -10071,13 +10072,15 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, >> } >> >> out: >> - return ret; >> + vcpu->run->hypercall.ret = ret; >> + complete_hypercall(vcpu); >> + return 1; > > Should this do return complete_hypercall(vcpu) so that you get the > return code from kvm_skip_emulated_instruction()? Bah, ignore... already commented on by Xiaoyao. Thanks, Tom > > Thanks, > Tom > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback 2024-12-02 20:59 ` Tom Lendacky @ 2024-12-03 0:14 ` Sean Christopherson 0 siblings, 0 replies; 39+ messages in thread From: Sean Christopherson @ 2024-12-03 0:14 UTC (permalink / raw) To: Tom Lendacky Cc: Paolo Bonzini, kvm, linux-kernel, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li On Mon, Dec 02, 2024, Tom Lendacky wrote: > On 12/2/24 14:57, Tom Lendacky wrote: > > On 11/27/24 18:43, Sean Christopherson wrote: > > >> @@ -10071,13 +10072,15 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > >> } > >> > >> out: > >> - return ret; > >> + vcpu->run->hypercall.ret = ret; > >> + complete_hypercall(vcpu); > >> + return 1; > > > > Should this do return complete_hypercall(vcpu) so that you get the > > return code from kvm_skip_emulated_instruction()? > > Bah, ignore... already commented on by Xiaoyao. Reviewers: 2, Sean: 0 :-) ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro 2024-11-28 0:43 [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Sean Christopherson ` (4 preceding siblings ...) 2024-11-28 0:43 ` [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback Sean Christopherson @ 2024-11-28 0:43 ` Sean Christopherson 2024-11-28 8:38 ` Adrian Hunter ` (2 more replies) 2024-11-28 1:06 ` [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Huang, Kai 2024-12-19 2:40 ` Sean Christopherson 7 siblings, 3 replies; 39+ messages in thread From: Sean Christopherson @ 2024-11-28 0:43 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li Rework __kvm_emulate_hypercall() into a macro so that completion of hypercalls that don't exit to userspace use direct function calls to the completion helper, i.e. don't trigger a retpoline when RETPOLINE=y. Opportunistically take the names of the input registers, as opposed to taking the input values, to preemptively dedup more of the calling code (TDX needs to use different registers). Use the direct GPR accessors to read values to avoid the pointless marking of the registers as available (KVM requires GPRs to always be available). 
Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/x86.c | 29 +++++++++-------------------- arch/x86/kvm/x86.h | 25 ++++++++++++++++++++----- 2 files changed, 29 insertions(+), 25 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 39be2a891ab4..fef8b4e63d25 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9982,11 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) return kvm_skip_emulated_instruction(vcpu); } -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, - unsigned long a0, unsigned long a1, - unsigned long a2, unsigned long a3, - int op_64_bit, int cpl, - int (*complete_hypercall)(struct kvm_vcpu *)) +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, + unsigned long a0, unsigned long a1, + unsigned long a2, unsigned long a3, + int op_64_bit, int cpl, + int (*complete_hypercall)(struct kvm_vcpu *)) { unsigned long ret; @@ -10073,32 +10073,21 @@ int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, out: vcpu->run->hypercall.ret = ret; - complete_hypercall(vcpu); return 1; } -EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); +EXPORT_SYMBOL_GPL(____kvm_emulate_hypercall); int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) { - unsigned long nr, a0, a1, a2, a3; - int op_64_bit; - int cpl; - if (kvm_xen_hypercall_enabled(vcpu->kvm)) return kvm_xen_hypercall(vcpu); if (kvm_hv_hypercall_enabled(vcpu)) return kvm_hv_hypercall(vcpu); - nr = kvm_rax_read(vcpu); - a0 = kvm_rbx_read(vcpu); - a1 = kvm_rcx_read(vcpu); - a2 = kvm_rdx_read(vcpu); - a3 = kvm_rsi_read(vcpu); - op_64_bit = is_64_bit_hypercall(vcpu); - cpl = kvm_x86_call(get_cpl)(vcpu); - - return __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, + return __kvm_emulate_hypercall(vcpu, rax, rbx, rcx, rdx, rsi, + is_64_bit_hypercall(vcpu), + kvm_x86_call(get_cpl)(vcpu), complete_hypercall_exit); } EXPORT_SYMBOL_GPL(kvm_emulate_hypercall); diff --git a/arch/x86/kvm/x86.h 
b/arch/x86/kvm/x86.h index 28adc8ea04bf..ad6fe6159dea 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -617,11 +617,26 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr) return kvm->arch.hypercall_exit_enabled & BIT(hc_nr); } -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, - unsigned long a0, unsigned long a1, - unsigned long a2, unsigned long a3, - int op_64_bit, int cpl, - int (*complete_hypercall)(struct kvm_vcpu *)); +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, + unsigned long a0, unsigned long a1, + unsigned long a2, unsigned long a3, + int op_64_bit, int cpl, + int (*complete_hypercall)(struct kvm_vcpu *)); + +#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall) \ +({ \ + int __ret; \ + \ + __ret = ____kvm_emulate_hypercall(_vcpu, \ + kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu), \ + kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu), \ + kvm_##a3##_read(_vcpu), op_64_bit, cpl, \ + complete_hypercall); \ + \ + if (__ret > 0) \ + complete_hypercall(_vcpu); \ + __ret; \ +}) int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); -- 2.47.0.338.g60cca15819-goog ^ permalink raw reply related [flat|nested] 39+ messages in thread
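The two tricks in the patch above — passing register *names* that token-pasting turns into accessor calls, and expanding the completion callback textually so the common path gets a direct call instead of an indirect one — can be sketched outside KVM. All names below are hypothetical stand-ins, not the real KVM types or accessors:

```c
#include <assert.h>

/* Hypothetical stand-ins for KVM's vCPU and per-register accessors. */
struct vcpu {
	unsigned long rax, rbx;
};

static unsigned long kvm_rax_read(struct vcpu *v) { return v->rax; }
static unsigned long kvm_rbx_read(struct vcpu *v) { return v->rbx; }

static int completions;

/* Completion helper, analogous to complete_hypercall_exit(). */
static int complete_hypercall(struct vcpu *v)
{
	(void)v;
	completions++;
	return 1;
}

/*
 * Inner worker, analogous to ____kvm_emulate_hypercall(): it no longer
 * invokes the completion callback itself; it only reports, via a
 * positive return value, that completion should run.
 */
static int do_hypercall(struct vcpu *v, unsigned long nr, unsigned long a0,
			int (*complete)(struct vcpu *))
{
	(void)v; (void)a0; (void)complete;
	return nr ? 1 : 0;
}

/*
 * Callers pass register *names*; kvm_##nr##_read expands "rax" into
 * kvm_rax_read().  Because "complete" is also expanded textually, the
 * call below is a direct call at each expansion site, not an indirect
 * call through the pointer handed to the worker -- i.e. no retpoline
 * on the common path.
 */
#define emulate_hypercall(v, nr, a0, complete)				\
({									\
	int __ret = do_hypercall(v, kvm_##nr##_read(v),			\
				 kvm_##a0##_read(v), complete);		\
	if (__ret > 0)							\
		complete(v);						\
	__ret;								\
})

static int run_demo(void)
{
	struct vcpu v = { .rax = 4, .rbx = 7 };

	return emulate_hypercall(&v, rax, rbx, complete_hypercall);
}
```

The statement-expression form (`({ ... })`) is a GCC extension, fine in kernel code; it lets the macro both run the completion and yield the worker's return value to the caller.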
* Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
  2024-11-28  0:43 ` [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro Sean Christopherson
@ 2024-11-28  8:38 ` Adrian Hunter
  2024-12-10 16:20   ` Paolo Bonzini
  2024-12-03  8:01 ` Binbin Wu
  2024-12-10 16:17 ` Paolo Bonzini
  2 siblings, 1 reply; 39+ messages in thread
From: Adrian Hunter @ 2024-11-28 8:38 UTC
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li, Dave Hansen

On 28/11/24 02:43, Sean Christopherson wrote:
> Rework __kvm_emulate_hypercall() into a macro so that completion of
> hypercalls that don't exit to userspace use direct function calls to the
> completion helper, i.e. don't trigger a retpoline when RETPOLINE=y.
>
> Opportunistically take the names of the input registers, as opposed to
> taking the input values, to preemptively dedup more of the calling code
> (TDX needs to use different registers).  Use the direct GPR accessors to
> read values to avoid the pointless marking of the registers as available
> (KVM requires GPRs to always be available).

For TDX, there is an RFC relating to using descriptively
named parameters instead of register names for tdh_vp_enter():

https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/

Please do give some feedback on that approach.  Note we
need both KVM and x86 maintainer approval for SEAMCALL
wrappers like tdh_vp_enter().
As proposed, that ends up with putting the values back into vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not pretty: static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu) { + struct vcpu_tdx *tdx = to_tdx(vcpu); int r; + kvm_r10_write(vcpu, tdx->vp_enter_args.tdcall.fn); + kvm_r11_write(vcpu, tdx->vp_enter_args.tdcall.subfn); + kvm_r12_write(vcpu, tdx->vp_enter_args.tdcall.vmcall.p2); + kvm_r13_write(vcpu, tdx->vp_enter_args.tdcall.vmcall.p3); + kvm_r14_write(vcpu, tdx->vp_enter_args.tdcall.vmcall.p4); + /* * ABI for KVM tdvmcall argument: * In Guest-Hypervisor Communication Interface(GHCI) specification, @@ -1092,13 +1042,12 @@ static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu) * vendor-specific. KVM uses this for KVM hypercall. NOTE: KVM * hypercall number starts from one. Zero isn't used for KVM hypercall * number. - * - * R10: KVM hypercall number - * arguments: R11, R12, R13, R14. */ r = __kvm_emulate_hypercall(vcpu, r10, r11, r12, r13, r14, true, 0, R10, complete_hypercall_exit); + tdvmcall_set_return_code(vcpu, kvm_r10_read(vcpu)); + return r > 0; } > > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/x86.c | 29 +++++++++-------------------- > arch/x86/kvm/x86.h | 25 ++++++++++++++++++++----- > 2 files changed, 29 insertions(+), 25 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 39be2a891ab4..fef8b4e63d25 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -9982,11 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) > return kvm_skip_emulated_instruction(vcpu); > } > > -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > - unsigned long a0, unsigned long a1, > - unsigned long a2, unsigned long a3, > - int op_64_bit, int cpl, > - int (*complete_hypercall)(struct kvm_vcpu *)) > +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > + unsigned long a0, unsigned long a1, > + unsigned long a2, unsigned long a3, > + int 
op_64_bit, int cpl, > + int (*complete_hypercall)(struct kvm_vcpu *)) > { > unsigned long ret; > > @@ -10073,32 +10073,21 @@ int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > > out: > vcpu->run->hypercall.ret = ret; > - complete_hypercall(vcpu); > return 1; > } > -EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); > +EXPORT_SYMBOL_GPL(____kvm_emulate_hypercall); > > int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) > { > - unsigned long nr, a0, a1, a2, a3; > - int op_64_bit; > - int cpl; > - > if (kvm_xen_hypercall_enabled(vcpu->kvm)) > return kvm_xen_hypercall(vcpu); > > if (kvm_hv_hypercall_enabled(vcpu)) > return kvm_hv_hypercall(vcpu); > > - nr = kvm_rax_read(vcpu); > - a0 = kvm_rbx_read(vcpu); > - a1 = kvm_rcx_read(vcpu); > - a2 = kvm_rdx_read(vcpu); > - a3 = kvm_rsi_read(vcpu); > - op_64_bit = is_64_bit_hypercall(vcpu); > - cpl = kvm_x86_call(get_cpl)(vcpu); > - > - return __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, > + return __kvm_emulate_hypercall(vcpu, rax, rbx, rcx, rdx, rsi, > + is_64_bit_hypercall(vcpu), > + kvm_x86_call(get_cpl)(vcpu), > complete_hypercall_exit); > } > EXPORT_SYMBOL_GPL(kvm_emulate_hypercall); > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h > index 28adc8ea04bf..ad6fe6159dea 100644 > --- a/arch/x86/kvm/x86.h > +++ b/arch/x86/kvm/x86.h > @@ -617,11 +617,26 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr) > return kvm->arch.hypercall_exit_enabled & BIT(hc_nr); > } > > -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > - unsigned long a0, unsigned long a1, > - unsigned long a2, unsigned long a3, > - int op_64_bit, int cpl, > - int (*complete_hypercall)(struct kvm_vcpu *)); > +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, > + unsigned long a0, unsigned long a1, > + unsigned long a2, unsigned long a3, > + int op_64_bit, int cpl, > + int (*complete_hypercall)(struct kvm_vcpu *)); > + > +#define 
__kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall) \ > +({ \ > + int __ret; \ > + \ > + __ret = ____kvm_emulate_hypercall(_vcpu, \ > + kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu), \ > + kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu), \ > + kvm_##a3##_read(_vcpu), op_64_bit, cpl, \ > + complete_hypercall); \ > + \ > + if (__ret > 0) \ > + complete_hypercall(_vcpu); \ > + __ret; \ > +}) > > int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); > ^ permalink raw reply [flat|nested] 39+ messages in thread
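The shape Adrian's snippet objects to — copying TDH.VP.ENTER output arguments back into the generic register array so the common hypercall path can read them through the usual accessors — can be sketched in isolation. Every name here is a hypothetical, simplified stand-in, not the real KVM or TDX definitions:

```c
#include <assert.h>

/* Hypothetical register file and TDVMCALL argument struct. */
enum kvm_reg { REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, NR_REGS };

struct vcpu {
	unsigned long regs[NR_REGS];
};

struct tdvmcall_args {
	unsigned long fn, subfn, p2, p3, p4;
};

/*
 * Mirror TDVMCALL arguments into the generic register array so that
 * common exit-handling code can consume them via ordinary register
 * accessors, with no TDX-specific plumbing.
 */
static void marshal_tdvmcall(struct vcpu *v, const struct tdvmcall_args *a)
{
	v->regs[REG_R10] = a->fn;
	v->regs[REG_R11] = a->subfn;
	v->regs[REG_R12] = a->p2;
	v->regs[REG_R13] = a->p3;
	v->regs[REG_R14] = a->p4;
}

/* Common code sees only the register file, not TDX-internal state. */
static unsigned long hypercall_nr(const struct vcpu *v)
{
	return v->regs[REG_R10];
}
```

The cost of this design is the extra copy on every TDVMCALL exit; the benefit, argued later in the thread, is that vendor-neutral code never has to know where the values originally lived.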
* Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
  2024-11-28  8:38 ` Adrian Hunter
@ 2024-12-10 16:20 ` Paolo Bonzini
  2024-12-10 20:03   ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2024-12-10 16:20 UTC
To: Adrian Hunter, Sean Christopherson
Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li, Dave Hansen

On 11/28/24 09:38, Adrian Hunter wrote:
>
> For TDX, there is an RFC relating to using descriptively
> named parameters instead of register names for tdh_vp_enter():
>
> https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/
>
> Please do give some feedback on that approach.  Note we
> need both KVM and x86 maintainer approval for SEAMCALL
> wrappers like tdh_vp_enter().
>
> As proposed, that ends up with putting the values back into
> vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not
> pretty:

If needed we can revert this patch, it's not a big problem.

Paolo
* Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
  2024-12-10 16:20 ` Paolo Bonzini
@ 2024-12-10 20:03 ` Sean Christopherson
  2024-12-12  7:32   ` Adrian Hunter
  0 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2024-12-10 20:03 UTC
To: Paolo Bonzini
Cc: Adrian Hunter, kvm, linux-kernel, Tom Lendacky, Binbin Wu,
	Isaku Yamahata, Kai Huang, Xiaoyao Li, Dave Hansen

On Tue, Dec 10, 2024, Paolo Bonzini wrote:
> On 11/28/24 09:38, Adrian Hunter wrote:
> >
> > For TDX, there is an RFC relating to using descriptively
> > named parameters instead of register names for tdh_vp_enter():
> >
> > https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/
> >
> > Please do give some feedback on that approach.  Note we
> > need both KVM and x86 maintainer approval for SEAMCALL
> > wrappers like tdh_vp_enter().
> >
> > As proposed, that ends up with putting the values back into
> > vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not
> > pretty:
>
> If needed we can revert this patch, it's not a big problem.

I don't care terribly about the SEAMCALL interfaces.  I have opinions on
what I think would be ideal, but I can live with whatever.

What I do deeply care about though is consistency within KVM, across
vendors and VM flavors.  And that means that guest registers absolutely
need to be captured in vcpu->arch.regs[].  TDX already requires too much
special cased code in KVM, there is zero reason to make TDX even more
different and thus more difficult to maintain.
* Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
  2024-12-10 20:03 ` Sean Christopherson
@ 2024-12-12  7:32 ` Adrian Hunter
  2024-12-12 15:42   ` Paolo Bonzini
  2024-12-12 18:40   ` Sean Christopherson
  0 siblings, 2 replies; 39+ messages in thread
From: Adrian Hunter @ 2024-12-12 7:32 UTC
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li, Dave Hansen

On 10/12/24 22:03, Sean Christopherson wrote:
> On Tue, Dec 10, 2024, Paolo Bonzini wrote:
>> On 11/28/24 09:38, Adrian Hunter wrote:
>>>
>>> For TDX, there is an RFC relating to using descriptively
>>> named parameters instead of register names for tdh_vp_enter():
>>>
>>> https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/
>>>
>>> Please do give some feedback on that approach.  Note we
>>> need both KVM and x86 maintainer approval for SEAMCALL
>>> wrappers like tdh_vp_enter().
>>>
>>> As proposed, that ends up with putting the values back into
>>> vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not
>>> pretty:
>>
>> If needed we can revert this patch, it's not a big problem.
>
> I don't care terribly about the SEAMCALL interfaces.  I have opinions on
> what I think would be ideal, but I can live with whatever.
>
> What I do deeply care about though is consistency within KVM, across
> vendors and VM flavors.  And that means that guest registers absolutely
> need to be captured in vcpu->arch.regs[].

In general, the TDX host VMM does not know what guest register
values are.

This case, where some GPRs are passed to the host VMM via
arguments of the TDG.VP.VMCALL TDCALL, is really just a
side effect of the choice of argument passing rather than
any attempt to share guest registers with the host VMM.

It could be regarded as more consistent to never use
vcpu->arch.regs[] for confidential guests.
* Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
  2024-12-12  7:32 ` Adrian Hunter
@ 2024-12-12 15:42 ` Paolo Bonzini
  0 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2024-12-12 15:42 UTC
To: Adrian Hunter, Sean Christopherson
Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li, Dave Hansen

On 12/12/24 08:32, Adrian Hunter wrote:
> On 10/12/24 22:03, Sean Christopherson wrote:
>> What I do deeply care about though is consistency within KVM, across
>> vendors and VM flavors.  And that means that guest registers absolutely
>> need to be captured in vcpu->arch.regs[].
>
> In general, TDX host VMM does not know what guest register
> values are.
>
> This case, where some GPRs are passed to the host VMM via
> arguments of the TDG.VP.VMCALL TDCALL, is really just a
> side effect of the choice of argument passing rather than
> any attempt to share guest registers with the host VMM.
>
> It could be regarded as more consistent to never use
> vcpu->arch.regs[] for confidential guests.

Yes, that's where I stand as well.  There are reasons to use
vcpu->arch.regs[] when "decrypted" values are available, and reasons to
not use it at all.  Both of them could be considered the more consistent
choice, and I think I slightly prefer the latter, but it's definitely
not a hill to die on...

Paolo
* Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro 2024-12-12 7:32 ` Adrian Hunter 2024-12-12 15:42 ` Paolo Bonzini @ 2024-12-12 18:40 ` Sean Christopherson 1 sibling, 0 replies; 39+ messages in thread From: Sean Christopherson @ 2024-12-12 18:40 UTC (permalink / raw) To: Adrian Hunter Cc: Paolo Bonzini, kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata, Kai Huang, Xiaoyao Li, Dave Hansen [-- Attachment #1: Type: text/plain, Size: 4294 bytes --] On Thu, Dec 12, 2024, Adrian Hunter wrote: > On 10/12/24 22:03, Sean Christopherson wrote: > > On Tue, Dec 10, 2024, Paolo Bonzini wrote: > >> On 11/28/24 09:38, Adrian Hunter wrote: > >>> > >>> For TDX, there is an RFC relating to using descriptively > >>> named parameters instead of register names for tdh_vp_enter(): > >>> > >>> https://lore.kernel.org/all/fa817f29-e3ba-4c54-8600-e28cf6ab1953@intel.com/ > >>> > >>> Please do give some feedback on that approach. Note we > >>> need both KVM and x86 maintainer approval for SEAMCALL > >>> wrappers like tdh_vp_enter(). > >>> > >>> As proposed, that ends up with putting the values back into > >>> vcpu->arch.regs[] for __kvm_emulate_hypercall() which is not > >>> pretty: > >> > >> If needed we can revert this patch, it's not a big problem. > > > > I don't care terribly about the SEAMCALL interfaces. I have opinions on what > > would I think would be ideal, but I can live with whatever. > > > > What I do deeply care about though is consistency within KVM, across vendors and > > VM flavors. And that means that guest registers absolutely need to be captured in > > vcpu->arch.regs[]. > > In general, TDX host VMM does not know what guest register values are. > > This case, where some GPRs are passed to the host VMM via arguments of the > TDG.VP.VMCALL TDCALL, is really just a side effect of the choice of argument > passing rather than any attempt to share guest registers with the host VMM. 
> It could be regarded as more consistent to never use vcpu->arch.regs[]
> for confidential guests.

SEV-ES+ marshals data to/from the GHCB to KVM's register array, because
the GHCB spec was intentionally crafted to allow hypervisors to reuse
exit-handling code.  Granted, that's only for R{A,B,C,D}X and RSI, but
the other GPRs should never be used and thus their data is irrelevant.
Which applies to TDX as well.

For regs[], it's really only TDVMCALL that I care about, i.e. cases
where GPRs hold guest values, versus things like EXIT_QUALIFICATION
where the GPR is simply TDX's way of communicating information to the
hypervisor.  Argh.

The reason I care about putting vCPU state into regs[] is that it helps
share code between vendors.  Looking at kvm-coco-queue, TDX support
wandered in the opposite direction.  E.g. TDX rolls its own RDMSR,
WRMSR, CPUID, and HYPERCALL implementations, which is quite frustrating.
Ditto for things like EXIT_REASON, EXIT_QUALIFICATION, EXIT_INTR_INFO,
etc.

For EXIT_REASON in particular, I think maintaining the guest-requested
exit reason (via TDVMCALL) in a separate field is a mistake.  Readers
shouldn't have to care that a HLT exit technically was requested via
TDVMCALL.  If KVM instead immediately morphs the requested exit reason
to KVM's tracked exit_reason, then there's no need to deal with the
TDVMCALL layer in flows that don't care.  The only danger is a collision
with an EXIT_REASON_EPT_MISCONFIG from the TDX module, but that's easy
enough to handle.

And even where TDX and VMX have shared some code, IMO it doesn't go far
enough.  E.g. having vcpu_tdx and vcpu_vmx open code their own version
of the posted interrupt fields, just to avoid minimal churn in the VMX
code, is beyond gross.  Even concepts like guest_state_loaded are
unnecessarily different for TDX.  Yes, I get that host state doesn't
need to be reloaded if KVM doesn't actually enter the guest.
But holy moly, we're talking about avoid _one_ WRMSR in an extremely rare path (late abort of entry), at the cost of making TDX frustratingly different from VMX. Making TDX look more like everything else isn't just about code sharing. It's also about providing a familiar setting so that readers who know almost nothing about TDX can find their way around without having to effectively learn an entirely new "architecture" *and* code base. Hacking around, I think the attached half-baked diff will provide a middle ground for the regs[] vs. struct/union issue. The basic gist is to essentially treat TDX's register ABI as a faster version of VMREAD/VMWRITE, e.g. marshall state to/from the appropriate x86 registers as needed. That way, regs[] holds the correct state and so TDX can reuse much of KVM's existing code verbatim, while allowing the kernel's VP_ENTER API to evolve independently. [-- Attachment #2: 0001-Make-TDX-look-more-like-VMX.patch --] [-- Type: text/x-diff, Size: 31837 bytes --] From 0319082fc23089f516618e193d94da18c837e35a Mon Sep 17 00:00:00 2001 From: Sean Christopherson <seanjc@google.com> Date: Thu, 12 Dec 2024 10:22:25 -0800 Subject: [PATCH] Make TDX look more like VMX --- arch/x86/kvm/vmx/common.h | 64 +++++- arch/x86/kvm/vmx/nested.c | 7 +- arch/x86/kvm/vmx/posted_intr.h | 11 - arch/x86/kvm/vmx/tdx.c | 375 +++++++++++---------------------- arch/x86/kvm/vmx/tdx.h | 11 +- arch/x86/kvm/vmx/vmx.c | 26 +-- arch/x86/kvm/vmx/vmx.h | 42 +--- 7 files changed, 201 insertions(+), 335 deletions(-) diff --git a/arch/x86/kvm/vmx/common.h b/arch/x86/kvm/vmx/common.h index 809ced4c6cd8..f1679e53cb4b 100644 --- a/arch/x86/kvm/vmx/common.h +++ b/arch/x86/kvm/vmx/common.h @@ -12,6 +12,61 @@ #include "vmcs.h" #include "x86.h" +struct vcpu_vt { + /* Posted interrupt descriptor */ + struct pi_desc pi_desc; + + /* Used if this vCPU is waiting for PI notification wakeup. 
*/ + struct list_head pi_wakeup_list; + + union vmx_exit_reason exit_reason; + + unsigned long exit_qualification; + u64 ext_qualification; + gpa_t exit_gpa; + u32 exit_intr_info; + u32 idt_vectoring_info; + + + /* + * If true, guest state has been loaded into hardware, and host state + * saved into vcpu_{vt,vmx,tdx}. If false, host state is loaded into + * hardware. + */ + bool guest_state_loaded; + +#ifdef CONFIG_X86_64 + u64 msr_host_kernel_gs_base; +#endif +}; + +static __always_inline unsigned long vmx_get_exit_reason(struct kvm_vcpu *vcpu) +{ + return to_vt(vcpu)->exit_reason; +} + +static __always_inline unsigned long vmx_get_exit_qual(struct kvm_vcpu *vcpu) +{ + struct vcpu_vt *vt = to_vcpu_vt(vcpu); + + if (!kvm_register_test_and_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1) && + !WARN_ON_ONCE(is_td_vcpu(vcpu)))) + vt->exit_qualification = vmcs_readl(EXIT_QUALIFICATION); + + return vt->exit_qualification; +} + +static __always_inline u32 vmx_get_intr_info(struct kvm_vcpu *vcpu) +{ + struct vcpu_vt *vt = to_vcpu_vt(vcpu); + + if (!kvm_register_test_and_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_2) && + !WARN_ON_ONCE(is_td_vcpu(vcpu))) + vt->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO); + + return vt->exit_intr_info; +} + extern unsigned long vmx_host_idt_base; void vmx_do_interrupt_irqoff(unsigned long entry); void vmx_do_nmi_irqoff(void); @@ -36,9 +91,10 @@ static inline void vmx_handle_nm_fault_irqoff(struct kvm_vcpu *vcpu) rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); } -static inline void vmx_handle_exception_irqoff(struct kvm_vcpu *vcpu, - u32 intr_info) +static inline void vmx_handle_exception_irqoff(struct kvm_vcpu *vcpu) { + u32 intr_info = vmx_get_intr_info(vcpu); + /* if exit due to PF check for async PF */ if (is_page_fault(intr_info)) vcpu->arch.apf.host_apf_flags = kvm_read_and_reset_apf_flags(); @@ -50,9 +106,9 @@ static inline void vmx_handle_exception_irqoff(struct kvm_vcpu *vcpu, kvm_machine_check(); } -static inline void 
vmx_handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu, - u32 intr_info) +static inline void vmx_handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu) { + u32 intr_info = vmx_get_intr_info(vcpu); unsigned int vector = intr_info & INTR_INFO_VECTOR_MASK; if (KVM_BUG(!is_external_intr(intr_info), vcpu->kvm, diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index aa78b6f38dfe..056b6ff1503e 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -410,6 +410,7 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, { struct vmcs12 *vmcs12 = get_vmcs12(vcpu); struct vcpu_vmx *vmx = to_vmx(vcpu); + struct vcpu_vt *vt = to_vt(vcpu); unsigned long exit_qualification; u32 vm_exit_reason; @@ -425,7 +426,7 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, * tables also changed, but KVM should not treat EPT Misconfig * VM-Exits as writes. */ - WARN_ON_ONCE(vmx->exit_reason.basic != EXIT_REASON_EPT_VIOLATION); + WARN_ON_ONCE(vt->exit_reason.basic != EXIT_REASON_EPT_VIOLATION); /* * PML Full and EPT Violation VM-Exits both use bit 12 to report @@ -6099,7 +6100,7 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu) * nested VM-Exit. Pass the original exit reason, i.e. don't hardcode * EXIT_REASON_VMFUNC as the exit reason. 
*/ - nested_vmx_vmexit(vcpu, vmx->exit_reason.full, + nested_vmx_vmexit(vcpu, vmx_get_exit_reason(vcpu).full, vmx_get_intr_info(vcpu), vmx_get_exit_qual(vcpu)); return 1; @@ -6544,7 +6545,7 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu, bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); - union vmx_exit_reason exit_reason = vmx->exit_reason; + union vmx_exit_reason exit_reason = vmx_get_exit_reason(vcpu); unsigned long exit_qual; u32 exit_intr_info; diff --git a/arch/x86/kvm/vmx/posted_intr.h b/arch/x86/kvm/vmx/posted_intr.h index 8b1dccfe4885..9ac4f6eafac5 100644 --- a/arch/x86/kvm/vmx/posted_intr.h +++ b/arch/x86/kvm/vmx/posted_intr.h @@ -5,17 +5,6 @@ #include <linux/find.h> #include <asm/posted_intr.h> -struct vcpu_pi { - struct kvm_vcpu vcpu; - - /* Posted interrupt descriptor */ - struct pi_desc pi_desc; - - /* Used if this vCPU is waiting for PI notification wakeup. */ - struct list_head pi_wakeup_list; - /* Until here common layout between vcpu_vmx and vcpu_tdx. */ -}; - struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu); void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu); diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 69ef9c967fbf..7eff717c9d0d 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -182,49 +182,6 @@ static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid) return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits); } -static __always_inline union vmx_exit_reason tdexit_exit_reason(struct kvm_vcpu *vcpu) -{ - return (union vmx_exit_reason)(u32)(to_tdx(vcpu)->vp_enter_ret); -} - -/* - * There is no simple way to check some bit(s) to decide whether the return - * value of TDH.VP.ENTER has a VMX exit reason or not. E.g., - * TDX_NON_RECOVERABLE_TD_WRONG_APIC_MODE has exit reason but with error bit - * (bit 63) set, TDX_NON_RECOVERABLE_TD_CORRUPTED_MD has no exit reason but with - * error bit cleared. 
- */ -static __always_inline bool tdx_has_exit_reason(struct kvm_vcpu *vcpu) -{ - u64 status = to_tdx(vcpu)->vp_enter_ret & TDX_SEAMCALL_STATUS_MASK; - - return status == TDX_SUCCESS || status == TDX_NON_RECOVERABLE_VCPU || - status == TDX_NON_RECOVERABLE_TD || - status == TDX_NON_RECOVERABLE_TD_NON_ACCESSIBLE || - status == TDX_NON_RECOVERABLE_TD_WRONG_APIC_MODE; -} - -static __always_inline bool tdx_check_exit_reason(struct kvm_vcpu *vcpu, u16 reason) -{ - return tdx_has_exit_reason(vcpu) && - (u16)tdexit_exit_reason(vcpu).basic == reason; -} - -static __always_inline unsigned long tdexit_exit_qual(struct kvm_vcpu *vcpu) -{ - return kvm_rcx_read(vcpu); -} - -static __always_inline unsigned long tdexit_ext_exit_qual(struct kvm_vcpu *vcpu) -{ - return kvm_rdx_read(vcpu); -} - -static __always_inline unsigned long tdexit_gpa(struct kvm_vcpu *vcpu) -{ - return kvm_r8_read(vcpu); -} - static __always_inline unsigned long tdexit_intr_info(struct kvm_vcpu *vcpu) { return kvm_r9_read(vcpu); @@ -246,23 +203,15 @@ BUILD_TDVMCALL_ACCESSORS(a1, r13); BUILD_TDVMCALL_ACCESSORS(a2, r14); BUILD_TDVMCALL_ACCESSORS(a3, r15); -static __always_inline unsigned long tdvmcall_exit_type(struct kvm_vcpu *vcpu) -{ - return kvm_r10_read(vcpu); -} -static __always_inline unsigned long tdvmcall_leaf(struct kvm_vcpu *vcpu) -{ - return kvm_r11_read(vcpu); -} static __always_inline void tdvmcall_set_return_code(struct kvm_vcpu *vcpu, long val) { - kvm_r10_write(vcpu, val); + ??? = val; } static __always_inline void tdvmcall_set_return_val(struct kvm_vcpu *vcpu, unsigned long val) { - kvm_r11_write(vcpu, val); + ??? = val; } static inline void tdx_hkid_free(struct kvm_tdx *kvm_tdx) @@ -742,11 +691,8 @@ bool tdx_interrupt_allowed(struct kvm_vcpu *vcpu) * interrupt is always allowed unless TDX guest calls TDVMCALL with HLT, * which passes the interrupt blocked flag. 
*/ - if (!tdx_check_exit_reason(vcpu, EXIT_REASON_TDCALL) || - tdvmcall_exit_type(vcpu) || tdvmcall_leaf(vcpu) != EXIT_REASON_HLT) - return true; - - return !tdvmcall_a0_read(vcpu); + return vmx_get_exit_reason(vcpu).basic != EXIT_REASON_HLT || + <don't care where this resides>; } bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) @@ -768,31 +714,30 @@ bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) */ void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) { - struct vcpu_tdx *tdx = to_tdx(vcpu); + struct vcpu_vt *vt = to_vt(vcpu); - if (!tdx->host_state_need_save) + if (vt->guest_state_loaded) return; if (likely(is_64bit_mm(current->mm))) - tdx->msr_host_kernel_gs_base = current->thread.gsbase; + vt->msr_host_kernel_gs_base = current->thread.gsbase; else - tdx->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE); + vt->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE); - tdx->host_state_need_save = false; + vt->guest_state_loaded = true; } static void tdx_prepare_switch_to_host(struct kvm_vcpu *vcpu) { struct vcpu_tdx *tdx = to_tdx(vcpu); - tdx->host_state_need_save = true; - if (!tdx->host_state_need_restore) + if (!vt->guest_state_loaded) return; ++vcpu->stat.host_state_reload; - wrmsrl(MSR_KERNEL_GS_BASE, tdx->msr_host_kernel_gs_base); - tdx->host_state_need_restore = false; + wrmsrl(MSR_KERNEL_GS_BASE, vt->msr_host_kernel_gs_base); + vt->guest_state_loaded = false; } void tdx_vcpu_put(struct kvm_vcpu *vcpu) @@ -897,57 +842,60 @@ static void tdx_restore_host_xsave_state(struct kvm_vcpu *vcpu) write_pkru(vcpu->arch.host_pkru); } +static union vmx_exit_reason tdx_to_vmx_exit_reason(struct kvm_vcpu *vcpu) +{ + struct vcpu_tdx *tdx = to_tdx(vcpu); + u32 exit_reason; + + switch (tdx->vp_enter_ret & TDX_SEAMCALL_STATUS_MASK) { + case TDX_SUCCESS: + case TDX_NON_RECOVERABLE_VCPU: + case TDX_NON_RECOVERABLE_TD; + case TDX_NON_RECOVERABLE_TD_NON_ACCESSIBLE: + case TDX_NON_RECOVERABLE_TD_WRONG_APIC_MODE: + break; + default: + return -1u; + 
 }
+
+	exit_reason = tdx->vp_enter_ret;
+	switch (exit_reason) {
+	case EXIT_REASON_TDCALL:
+		if (tdx->blah.tdvmcall_exit_type)
+			return EXIT_REASON_VMCALL;
+
+		if (tdx->blah.tdvmcall_leaf < 0x10000)
+			return tdx->blah.tdvmcall_leaf;
+		break;
+	case EXIT_REASON_EPT_MISCONFIG:
+		KVM_BUG_ON(1, vcpu->kvm);
+		return -1;
+	default:
+		break;
+	}
+	return exit_reason;
+}
+
 static void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 	struct tdx_module_args args;
+	u64 status;
 
 	guest_state_enter_irqoff();
 
-	/*
-	 * TODO: optimization:
-	 * - Eliminate copy between args and vcpu->arch.regs.
-	 * - copyin/copyout registers only if (tdx->tdvmvall.regs_mask != 0)
-	 *   which means TDG.VP.VMCALL.
-	 */
-	args = (struct tdx_module_args) {
-		.rcx = tdx->tdvpr_pa,
-#define REG(reg, REG)	.reg = vcpu->arch.regs[VCPU_REGS_ ## REG]
-		REG(rdx, RDX),
-		REG(r8, R8),
-		REG(r9, R9),
-		REG(r10, R10),
-		REG(r11, R11),
-		REG(r12, R12),
-		REG(r13, R13),
-		REG(r14, R14),
-		REG(r15, R15),
-		REG(rbx, RBX),
-		REG(rdi, RDI),
-		REG(rsi, RSI),
-#undef REG
-	};
-
 	tdx->vp_enter_ret = tdh_vp_enter(tdx->tdvpr_pa, &args);
 
-#define REG(reg, REG)	vcpu->arch.regs[VCPU_REGS_ ## REG] = args.reg
-	REG(rcx, RCX);
-	REG(rdx, RDX);
-	REG(r8, R8);
-	REG(r9, R9);
-	REG(r10, R10);
-	REG(r11, R11);
-	REG(r12, R12);
-	REG(r13, R13);
-	REG(r14, R14);
-	REG(r15, R15);
-	REG(rbx, RBX);
-	REG(rdi, RDI);
-	REG(rsi, RSI);
-#undef REG
+	vt->exit_reason.full = tdx_to_vmx_exit_reason(vcpu);
 
-	if (tdx_check_exit_reason(vcpu, EXIT_REASON_EXCEPTION_NMI) &&
-	    is_nmi(tdexit_intr_info(vcpu)))
+	tdx->exit.qualification = args.rcx;
+	tdx->exit.extended_qualification = args.rdx;
+	tdx->exit.intr_info = args.r9;
+	tdx->exit.guest_physical_address = args.r8;
+
+	if (vt->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
+	    is_nmi(vmx_get_intr_info(vcpu)))
 		__vmx_handle_nmi(vcpu);
 
 	guest_state_exit_irqoff();
@@ -971,11 +919,12 @@ static fastpath_t tdx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 {
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 
 	/* TDX exit handle takes care of this error case. */
 	if (unlikely(tdx->state != VCPU_TD_STATE_INITIALIZED)) {
-		/* Set to avoid collision with EXIT_REASON_EXCEPTION_NMI. */
 		tdx->vp_enter_ret = TDX_SW_ERROR;
+		vt->exit_reason.full = -1ul;
 		return EXIT_FASTPATH_NONE;
 	}
 
@@ -1005,7 +954,7 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 
 	trace_kvm_exit(vcpu, KVM_ISA_VMX);
 
-	if (unlikely(tdx_has_exit_reason(vcpu) && tdexit_exit_reason(vcpu).failed_vmentry))
+	if (unlikely(vmx_get_exit_reason(vcpu).failed_vmentry))
 		return EXIT_FASTPATH_NONE;
 
 	tdx_complete_interrupts(vcpu);
@@ -1032,15 +981,14 @@ void tdx_inject_nmi(struct kvm_vcpu *vcpu)
 void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
 {
 	if (tdx_check_exit_reason(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT))
-		vmx_handle_external_interrupt_irqoff(vcpu,
-						     tdexit_intr_info(vcpu));
+		vmx_handle_external_interrupt_irqoff(vcpu);
 	else if (tdx_check_exit_reason(vcpu, EXIT_REASON_EXCEPTION_NMI))
-		vmx_handle_exception_irqoff(vcpu, tdexit_intr_info(vcpu));
+		vmx_handle_exception_irqoff(vcpu);
 }
 
 static int tdx_handle_exception_nmi(struct kvm_vcpu *vcpu)
 {
-	u32 intr_info = tdexit_intr_info(vcpu);
+	u32 intr_info = vmx_get_intr_info(vcpu);
 
 	/*
 	 * Machine checks are handled by vmx_handle_exception_irqoff(), or by
@@ -1051,8 +999,7 @@ static int tdx_handle_exception_nmi(struct kvm_vcpu *vcpu)
 		return 1;
 
 	kvm_pr_unimpl("unexpected exception 0x%x(exit_reason 0x%llx qual 0x%lx)\n",
-		      intr_info,
-		      to_tdx(vcpu)->vp_enter_ret, tdexit_exit_qual(vcpu));
+		      intr_info, to_tdx(vcpu)->vp_enter_ret, vmx_get_exit_qual(vcpu));
 
 	vcpu->run->exit_reason = KVM_EXIT_EXCEPTION;
 	vcpu->run->ex.exception = intr_info & INTR_INFO_VECTOR_MASK;
@@ -1063,21 +1010,12 @@ static int tdx_handle_exception_nmi(struct kvm_vcpu *vcpu)
 static int tdx_handle_external_interrupt(struct kvm_vcpu *vcpu)
 {
-	++vcpu->stat.irq_exits;
 	return 1;
 }
 
-static int tdx_handle_triple_fault(struct kvm_vcpu *vcpu)
-{
-	vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
-	vcpu->mmio_needed = 0;
-	return 0;
-}
-
-
 static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
 {
-	kvm_r10_write(vcpu, vcpu->run->hypercall.ret);
+	<tdx thingie> = kvm_rax_read(vcpu);
 	return 1;
 }
 
@@ -1085,21 +1023,13 @@ static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu)
 {
 	int r;
 
-	/*
-	 * ABI for KVM tdvmcall argument:
-	 * In Guest-Hypervisor Communication Interface(GHCI) specification,
-	 * Non-zero leaf number (R10 != 0) is defined to indicate
-	 * vendor-specific.  KVM uses this for KVM hypercall.  NOTE: KVM
-	 * hypercall number starts from one.  Zero isn't used for KVM hypercall
-	 * number.
-	 *
-	 * R10: KVM hypercall number
-	 * arguments: R11, R12, R13, R14.
-	 */
-	r = __kvm_emulate_hypercall(vcpu, r10, r11, r12, r13, r14, true, 0,
-				    complete_hypercall_exit);
+	kvm_rax_write(vcpu, blah);
+	kvm_rbx_write(vcpu, blah);
+	kvm_rcx_write(vcpu, blah);
+	kvm_rdx_write(vcpu, blah);
+	kvm_rsi_write(vcpu, blah);
 
-	return r > 0;
+	return kvm_emulate_hypercall(vcpu, complete_hypercall_exit);
 }
 
 /*
@@ -1258,36 +1188,9 @@ static int tdx_report_fatal_error(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-static int tdx_emulate_cpuid(struct kvm_vcpu *vcpu)
-{
-	u32 eax, ebx, ecx, edx;
-
-	/* EAX and ECX for cpuid is stored in R12 and R13. */
-	eax = tdvmcall_a0_read(vcpu);
-	ecx = tdvmcall_a1_read(vcpu);
-
-	kvm_cpuid(vcpu, &eax, &ebx, &ecx, &edx, false);
-
-	tdvmcall_a0_write(vcpu, eax);
-	tdvmcall_a1_write(vcpu, ebx);
-	tdvmcall_a2_write(vcpu, ecx);
-	tdvmcall_a3_write(vcpu, edx);
-
-	tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_SUCCESS);
-
-	return 1;
-}
-
-static int tdx_emulate_hlt(struct kvm_vcpu *vcpu)
-{
-	tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_SUCCESS);
-	return kvm_emulate_halt_noskip(vcpu);
-}
-
 static int tdx_complete_pio_out(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.pio.count = 0;
-	tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_SUCCESS);
 	return 1;
 }
 
@@ -1301,10 +1204,7 @@ static int tdx_complete_pio_in(struct kvm_vcpu *vcpu)
 					 vcpu->arch.pio.port, &val, 1);
 	WARN_ON_ONCE(!ret);
 
-	tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_SUCCESS);
 	tdvmcall_set_return_val(vcpu, val);
-
 	return 1;
 }
 
@@ -1337,7 +1237,6 @@ static int tdx_emulate_io(struct kvm_vcpu *vcpu)
 	if (ret) {
 		if (!write)
 			tdvmcall_set_return_val(vcpu, val);
-		tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_SUCCESS);
 	} else {
 		if (write)
 			vcpu->arch.complete_userspace_io = tdx_complete_pio_out;
@@ -1348,22 +1247,18 @@ static int tdx_emulate_io(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
-static int tdx_complete_mmio(struct kvm_vcpu *vcpu)
+static int tdx_complete_mmio_read(struct kvm_vcpu *vcpu)
 {
 	unsigned long val = 0;
 	gpa_t gpa;
 	int size;
 
-	if (!vcpu->mmio_is_write) {
-		gpa = vcpu->mmio_fragments[0].gpa;
-		size = vcpu->mmio_fragments[0].len;
+	gpa = vcpu->mmio_fragments[0].gpa;
+	size = vcpu->mmio_fragments[0].len;
 
-		memcpy(&val, vcpu->run->mmio.data, size);
-		tdvmcall_set_return_val(vcpu, val);
-		trace_kvm_mmio(KVM_TRACE_MMIO_READ, size, gpa, &val);
-	}
-
-	tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_SUCCESS);
+	memcpy(&val, vcpu->run->mmio.data, size);
+	tdvmcall_set_return_val(vcpu, val);
+	trace_kvm_mmio(KVM_TRACE_MMIO_READ, size, gpa, &val);
 	return 1;
 }
 
@@ -1434,7 +1329,8 @@ static int tdx_emulate_mmio(struct kvm_vcpu *vcpu)
 	/* Request the device emulation to userspace device model. */
 	vcpu->mmio_is_write = write;
-	vcpu->arch.complete_userspace_io = tdx_complete_mmio;
+	if (!write)
+		vcpu->arch.complete_userspace_io = tdx_complete_mmio_read;
 
 	vcpu->run->mmio.phys_addr = gpa;
 	vcpu->run->mmio.len = size;
@@ -1455,39 +1351,15 @@ static int tdx_emulate_mmio(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
-static int tdx_emulate_rdmsr(struct kvm_vcpu *vcpu)
+int tdx_complete_emulated_msr(struct kvm_vcpu *vcpu)
 {
-	u32 index = tdvmcall_a0_read(vcpu);
-	u64 data;
-
-	if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_READ) ||
-	    kvm_get_msr(vcpu, index, &data)) {
-		trace_kvm_msr_read_ex(index);
-		tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_INVALID_OPERAND);
-		return 1;
-	}
-	trace_kvm_msr_read(index, data);
-
-	tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_SUCCESS);
-	tdvmcall_set_return_val(vcpu, data);
-	return 1;
-}
-
-static int tdx_emulate_wrmsr(struct kvm_vcpu *vcpu)
-{
-	u32 index = tdvmcall_a0_read(vcpu);
-	u64 data = tdvmcall_a1_read(vcpu);
-
-	if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_WRITE) ||
-	    kvm_set_msr(vcpu, index, data)) {
-		trace_kvm_msr_write_ex(index, data);
+	if (err) {
 		tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_INVALID_OPERAND);
 		return 1;
 	}
-	trace_kvm_msr_write(index, data);
-	tdvmcall_set_return_code(vcpu, TDVMCALL_STATUS_SUCCESS);
-	return 1;
+	if (vmx_get_exit_reason(vcpu).basic == EXIT_REASON_MSR_READ)
+		tdvmcall_set_return_val(vcpu, kvm_read_edx_eax(vcpu));
 }
 
 static int tdx_get_td_vm_call_info(struct kvm_vcpu *vcpu)
@@ -1506,26 +1378,11 @@ static int tdx_get_td_vm_call_info(struct kvm_vcpu *vcpu)
 
 static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 {
-	if (tdvmcall_exit_type(vcpu))
-		return tdx_emulate_vmcall(vcpu);
-
-	switch (tdvmcall_leaf(vcpu)) {
+	switch (to_tdx(vcpu)->blah.tdvmcall_leaf) {
 	case TDVMCALL_MAP_GPA:
 		return tdx_map_gpa(vcpu);
 	case TDVMCALL_REPORT_FATAL_ERROR:
 		return tdx_report_fatal_error(vcpu);
-	case EXIT_REASON_CPUID:
-		return tdx_emulate_cpuid(vcpu);
-	case EXIT_REASON_HLT:
-		return tdx_emulate_hlt(vcpu);
-	case EXIT_REASON_IO_INSTRUCTION:
-		return tdx_emulate_io(vcpu);
-	case EXIT_REASON_EPT_VIOLATION:
-		return tdx_emulate_mmio(vcpu);
-	case EXIT_REASON_MSR_READ:
-		return tdx_emulate_rdmsr(vcpu);
-	case EXIT_REASON_MSR_WRITE:
-		return tdx_emulate_wrmsr(vcpu);
 	case TDVMCALL_GET_TD_VM_CALL_INFO:
 		return tdx_get_td_vm_call_info(vcpu);
 	default:
@@ -1841,8 +1698,8 @@ void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
 
 static inline bool tdx_is_sept_violation_unexpected_pending(struct kvm_vcpu *vcpu)
 {
-	u64 eeq_type = tdexit_ext_exit_qual(vcpu) & TDX_EXT_EXIT_QUAL_TYPE_MASK;
-	u64 eq = tdexit_exit_qual(vcpu);
+	u64 eeq_type = vmx_get_ext_exit_qual(vcpu) & TDX_EXT_EXIT_QUAL_TYPE_MASK;
+	u64 eq = vmx_get_exit_qual(vcpu);
 
 	if (eeq_type != TDX_EXT_EXIT_QUAL_TYPE_PENDING_EPT_VIOLATION)
 		return false;
@@ -1852,7 +1709,7 @@ static inline bool tdx_is_sept_violation_unexpected_pending(struct kvm_vcp
 
 static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
 {
-	gpa_t gpa = tdexit_gpa(vcpu);
+	gpa_t gpa = vmx_get_exit_gpa(vcpu);
 	unsigned long exit_qual;
 
 	if (vt_is_tdx_private_gpa(vcpu->kvm, gpa)) {
@@ -1873,7 +1730,7 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
 		 */
 		exit_qual = EPT_VIOLATION_ACC_WRITE;
 	} else {
-		exit_qual = tdexit_exit_qual(vcpu);
+		exit_qual = vmx_get_exit_qual(vcpu);
 		/*
 		 * EPT violation due to instruction fetch should never be
 		 * triggered from shared memory in TDX guest.  If such EPT
@@ -1889,18 +1746,14 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
 
 int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
 {
+	union vmx_exit_reason exit_reason = vmx_get_exit_reason(vcpu);
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
 	u64 vp_enter_ret = tdx->vp_enter_ret;
-	union vmx_exit_reason exit_reason;
 
 	if (fastpath != EXIT_FASTPATH_NONE)
 		return 1;
 
-	/*
-	 * Handle TDX SW errors, including TDX_SEAMCALL_UD, TDX_SEAMCALL_GP and
-	 * TDX_SEAMCALL_VMFAILINVALID.
-	 */
-	if (unlikely((vp_enter_ret & TDX_SW_ERROR) == TDX_SW_ERROR)) {
+	if (unlikely(exit_reason.full == -1u)) {
 		KVM_BUG_ON(!kvm_rebooting, vcpu->kvm);
 		goto unhandled_exit;
 	}
@@ -1909,33 +1762,47 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
 	 * Without off-TD debug enabled, failed_vmentry case must have
 	 * TDX_NON_RECOVERABLE set.
 	 */
-	if (unlikely(vp_enter_ret & (TDX_ERROR | TDX_NON_RECOVERABLE))) {
-		/* Triple fault is non-recoverable. */
-		if (unlikely(tdx_check_exit_reason(vcpu, EXIT_REASON_TRIPLE_FAULT)))
-			return tdx_handle_triple_fault(vcpu);
-
+	if (unlikely(vp_enter_ret & (TDX_ERROR | TDX_NON_RECOVERABLE)) &&
+	    exit_reason.basic != EXIT_REASON_TRIPLE_FAULT) {
 		kvm_pr_unimpl("TD vp_enter_ret 0x%llx, hkid 0x%x hkid pa 0x%llx\n",
 			      vp_enter_ret, to_kvm_tdx(vcpu->kvm)->hkid,
 			      set_hkid_to_hpa(0, to_kvm_tdx(vcpu->kvm)->hkid));
 		goto unhandled_exit;
 	}
 
-	/* From now, the seamcall status should be TDX_SUCCESS. */
-	WARN_ON_ONCE((vp_enter_ret & TDX_SEAMCALL_STATUS_MASK) != TDX_SUCCESS);
-	exit_reason = tdexit_exit_reason(vcpu);
+	WARN_ON_ONCE(exit_reason.basic != EXIT_REASON_TRIPLE_FAULT &&
+		     (vp_enter_ret & TDX_SEAMCALL_STATUS_MASK) != TDX_SUCCESS);
 
 	switch (exit_reason.basic) {
 	case EXIT_REASON_EXCEPTION_NMI:
 		return tdx_handle_exception_nmi(vcpu);
 	case EXIT_REASON_EXTERNAL_INTERRUPT:
-		return tdx_handle_external_interrupt(vcpu);
+		++vcpu->stat.irq_exits;
+		return 1;
+	case EXIT_REASON_CPUID:
+		return tdx_emulate_cpuid(vcpu);
+	case EXIT_REASON_HLT:
+		return kvm_emulate_halt_noskip(vcpu);
+	case EXIT_REASON_VMCALL:
+		return tdx_emulate_vmcall(vcpu);
+	case EXIT_REASON_IO_INSTRUCTION:
+		return tdx_emulate_io(vcpu);
+	case EXIT_REASON_MSR_READ:
+		kvm_rcx_write(vcpu, <don't care where this comes from>);
+		return kvm_emulate_rdmsr(vcpu);
+	case EXIT_REASON_MSR_WRITE:
+		kvm_rcx_write(vcpu, <don't care where this comes from>);
+		return kvm_emulate_wrmsr(vcpu);
+	case EXIT_REASON_EPT_MISCONFIG:
+		return tdx_emulate_mmio(vcpu);
 	case EXIT_REASON_TDCALL:
 		return handle_tdvmcall(vcpu);
 	case EXIT_REASON_EPT_VIOLATION:
 		return tdx_handle_ept_violation(vcpu);
-	case EXIT_REASON_EPT_MISCONFIG:
-		KVM_BUG_ON(1, vcpu->kvm);
-		return -EIO;
+	case EXIT_REASON_TRIPLE_FAULT:
+		vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
+		vcpu->mmio_needed = 0;
+		return 0;
 	case EXIT_REASON_OTHER_SMI:
 		/*
 		 * Unlike VMX, SMI in SEAM non-root mode (i.e. when
@@ -1970,20 +1837,20 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
 	return 0;
 }
 
-void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
-		       u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code)
+void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, u64 *info1,
+		       u64 *info2, u32 *intr_info, u32 *error_code)
 {
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
 
-	if (tdx_has_exit_reason(vcpu)) {
+	if (tdx_get_exit_reason(vcpu).full != -1ul) {
 		/*
 		 * Encode some useful info from the 64 bit return code
 		 * into the 32 bit exit 'reason'.  If the VMX exit reason is
		 * valid, just set it to those bits.
		 */
		*reason = (u32)tdx->vp_enter_ret;
-		*info1 = tdexit_exit_qual(vcpu);
-		*info2 = tdexit_ext_exit_qual(vcpu);
+		*info1 = vmx_get_exit_qual(vcpu);
+		*info2 = vmx_get_ext_exit_qual(vcpu);
 	} else {
 		/*
 		 * When the VMX exit reason in vp_enter_ret is not valid,
@@ -1997,7 +1864,7 @@ void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
 		*info2 = 0;
 	}
 
-	*intr_info = tdexit_intr_info(vcpu);
+	*intr_info = vmx_get_intr_info(vcpu);
 	*error_code = 0;
 }
 
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 0833d1084331..33d316e81a7e 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -59,12 +59,7 @@ enum vcpu_tdx_state {
 struct vcpu_tdx {
 	struct kvm_vcpu	vcpu;
 
-	/* Posted interrupt descriptor */
-	struct pi_desc pi_desc;
-
-	/* Used if this vCPU is waiting for PI notification wakeup. */
-	struct list_head pi_wakeup_list;
-	/* Until here same layout to struct vcpu_pi. */
+	struct vcpu_vt vt;
 
 	unsigned long tdvpr_pa;
 	unsigned long *tdcx_pa;
@@ -75,10 +70,6 @@ struct vcpu_tdx {
 
 	enum vcpu_tdx_state state;
 
-	bool host_state_need_save;
-	bool host_state_need_restore;
-	u64 msr_host_kernel_gs_base;
-
 	u64 map_gpa_next;
 	u64 map_gpa_end;
 };
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 832387bea753..8302e429c82a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6099,9 +6099,9 @@ void vmx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
-	*reason = vmx->exit_reason.full;
+	*reason = vmx_get_exit_reason(vcpu).full;
 	*info1 = vmx_get_exit_qual(vcpu);
-	if (!(vmx->exit_reason.failed_vmentry)) {
+	if (!(vmx_get_exit_reason(vcpu).failed_vmentry)) {
 		*info2 = vmx->idt_vectoring_info;
 		*intr_info = vmx_get_intr_info(vcpu);
 		if (is_exception_with_error_code(*intr_info))
@@ -6380,7 +6380,7 @@ void dump_vmcs(struct kvm_vcpu *vcpu)
 static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	union vmx_exit_reason exit_reason = vmx->exit_reason;
+	union vmx_exit_reason exit_reason = vmx_get_exit_reason(vcpu);
 	u32 vectoring_info = vmx->idt_vectoring_info;
 	u16 exit_handler_index;
@@ -6901,11 +6901,10 @@ void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
 	if (vmx->emulation_required)
 		return;
 
-	if (vmx->exit_reason.basic == EXIT_REASON_EXTERNAL_INTERRUPT)
-		vmx_handle_external_interrupt_irqoff(vcpu,
-						     vmx_get_intr_info(vcpu));
-	else if (vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI)
-		vmx_handle_exception_irqoff(vcpu, vmx_get_intr_info(vcpu));
+	if (vmx_get_exit_reason(vcpu).basic == EXIT_REASON_EXTERNAL_INTERRUPT)
+		vmx_handle_external_interrupt_irqoff(vcpu);
+	else if (vmx_get_exit_reason(vcpu).basic == EXIT_REASON_EXCEPTION_NMI)
+		vmx_handle_exception_irqoff(vcpu);
 }
 
 /*
@@ -7154,6 +7153,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 					unsigned int flags)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 
 	guest_state_enter_irqoff();
 
@@ -7185,15 +7185,15 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	vmx_enable_fb_clear(vmx);
 
 	if (unlikely(vmx->fail)) {
-		vmx->exit_reason.full = 0xdead;
+		vt->exit_reason.full = 0xdead;
 		goto out;
 	}
 
-	vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
-	if (likely(!vmx->exit_reason.failed_vmentry))
+	vt->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
+	if (likely(!vt->exit_reason.failed_vmentry))
 		vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
 
-	if ((u16)vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
+	if ((u16)vt->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
 	    is_nmi(vmx_get_intr_info(vcpu)))
 		__vmx_handle_nmi(vcpu);
 
@@ -7331,7 +7331,7 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 	 * checking.
 	 */
 	if (vmx->nested.nested_run_pending &&
-	    !vmx->exit_reason.failed_vmentry)
+	    !vmx_get_exit_reason(vcpu).failed_vmentry)
 		++vcpu->stat.nested_run;
 
 	vmx->nested.nested_run_pending = 0;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a91e1610b0b7..7a385dcdb2d5 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -231,28 +231,11 @@ struct nested_vmx {
 struct vcpu_vmx {
 	struct kvm_vcpu	vcpu;
 
-	/* Posted interrupt descriptor */
-	struct pi_desc pi_desc;
-
-	/* Used if this vCPU is waiting for PI notification wakeup. */
-	struct list_head pi_wakeup_list;
-	/* Until here same layout to struct vcpu_pi. */
+	struct vcpu_vt vt;
 
 	u8 fail;
 	u8 x2apic_msr_bitmap_mode;
 
-	/*
-	 * If true, host state has been stored in vmx->loaded_vmcs for
-	 * the CPU registers that only need to be switched when transitioning
-	 * to/from the kernel, and the registers have been loaded with guest
-	 * values.  If false, host state is loaded in the CPU registers
-	 * and vmx->loaded_vmcs->host_state is invalid.
-	 */
-	bool guest_state_loaded;
-
-	unsigned long exit_qualification;
-	u32 exit_intr_info;
-
 	u32 idt_vectoring_info;
 	ulong rflags;
 
 	/*
@@ -263,11 +246,10 @@ struct vcpu_vmx {
 	 */
 	struct vmx_uret_msr guest_uret_msrs[MAX_NR_USER_RETURN_MSRS];
 	bool guest_uret_msrs_loaded;
+
 #ifdef CONFIG_X86_64
-	u64 msr_host_kernel_gs_base;
 	u64 msr_guest_kernel_gs_base;
 #endif
-
 	u64 spec_ctrl;
 	u32 msr_ia32_umwait_control;
 
@@ -649,26 +631,6 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu);
 int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu);
 void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu);
 
-static __always_inline unsigned long vmx_get_exit_qual(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-	if (!kvm_register_test_and_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1))
-		vmx->exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
-
-	return vmx->exit_qualification;
-}
-
-static __always_inline u32 vmx_get_intr_info(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-	if (!kvm_register_test_and_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_2))
-		vmx->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
-
-	return vmx->exit_intr_info;
-}
-
 struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags);
 void free_vmcs(struct vmcs *vmcs);
 int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs);

base-commit: 14cfaed7621d53af608fd96aa36188064937ca44
-- 
2.47.1.613.gc27f4b7a9f-goog

^ permalink raw reply related	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
  2024-11-28  0:43 ` [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro Sean Christopherson
  2024-11-28  8:38   ` Adrian Hunter
@ 2024-12-03  8:01   ` Binbin Wu
  2024-12-10 16:17   ` Paolo Bonzini
  2 siblings, 0 replies; 39+ messages in thread
From: Binbin Wu @ 2024-12-03  8:01 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Isaku Yamahata, Kai Huang, Xiaoyao Li

On 11/28/2024 8:43 AM, Sean Christopherson wrote:
> Rework __kvm_emulate_hypercall() into a macro so that completion of
> hypercalls that don't exit to userspace use direct function calls to the
> completion helper, i.e. don't trigger a retpoline when RETPOLINE=y.
>
> Opportunistically take the names of the input registers, as opposed to
> taking the input values, to preemptively dedup more of the calling code
> (TDX needs to use different registers).  Use the direct GPR accessors to
> read values to avoid the pointless marking of the registers as available
> (KVM requires GPRs to always be available).
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

> ---
>   arch/x86/kvm/x86.c | 29 +++++++++--------------------
>   arch/x86/kvm/x86.h | 25 ++++++++++++++++++++-----
>   2 files changed, 29 insertions(+), 25 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 39be2a891ab4..fef8b4e63d25 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9982,11 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
>   	return kvm_skip_emulated_instruction(vcpu);
>   }
>
> -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> -			    unsigned long a0, unsigned long a1,
> -			    unsigned long a2, unsigned long a3,
> -			    int op_64_bit, int cpl,
> -			    int (*complete_hypercall)(struct kvm_vcpu *))
> +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> +			      unsigned long a0, unsigned long a1,
> +			      unsigned long a2, unsigned long a3,
> +			      int op_64_bit, int cpl,
> +			      int (*complete_hypercall)(struct kvm_vcpu *))
>   {
>   	unsigned long ret;
>
> @@ -10073,32 +10073,21 @@ int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>
>   out:
>   	vcpu->run->hypercall.ret = ret;
> -	complete_hypercall(vcpu);
>   	return 1;
>   }
> -EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall);
> +EXPORT_SYMBOL_GPL(____kvm_emulate_hypercall);
>
>   int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>   {
> -	unsigned long nr, a0, a1, a2, a3;
> -	int op_64_bit;
> -	int cpl;
> -
>   	if (kvm_xen_hypercall_enabled(vcpu->kvm))
>   		return kvm_xen_hypercall(vcpu);
>
>   	if (kvm_hv_hypercall_enabled(vcpu))
>   		return kvm_hv_hypercall(vcpu);
>
> -	nr = kvm_rax_read(vcpu);
> -	a0 = kvm_rbx_read(vcpu);
> -	a1 = kvm_rcx_read(vcpu);
> -	a2 = kvm_rdx_read(vcpu);
> -	a3 = kvm_rsi_read(vcpu);
> -	op_64_bit = is_64_bit_hypercall(vcpu);
> -	cpl = kvm_x86_call(get_cpl)(vcpu);
> -
> -	return __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl,
> +	return __kvm_emulate_hypercall(vcpu, rax, rbx, rcx, rdx, rsi,
> +				       is_64_bit_hypercall(vcpu),
> +				       kvm_x86_call(get_cpl)(vcpu),
>   				       complete_hypercall_exit);
>   }
>   EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 28adc8ea04bf..ad6fe6159dea 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -617,11 +617,26 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
>   	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
>   }
>
> -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> -			    unsigned long a0, unsigned long a1,
> -			    unsigned long a2, unsigned long a3,
> -			    int op_64_bit, int cpl,
> -			    int (*complete_hypercall)(struct kvm_vcpu *));
> +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> +			      unsigned long a0, unsigned long a1,
> +			      unsigned long a2, unsigned long a3,
> +			      int op_64_bit, int cpl,
> +			      int (*complete_hypercall)(struct kvm_vcpu *));
> +
> +#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall)	\
> +({												\
> +	int __ret;										\
> +												\
> +	__ret = ____kvm_emulate_hypercall(_vcpu,						\
> +					  kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu),	\
> +					  kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu),	\
> +					  kvm_##a3##_read(_vcpu), op_64_bit, cpl,		\
> +					  complete_hypercall);					\
> +												\
> +	if (__ret > 0)										\
> +		complete_hypercall(_vcpu);							\
> +	__ret;											\
> +})
>
>   int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
>

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
  2024-11-28  0:43 ` [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro Sean Christopherson
  2024-11-28  8:38   ` Adrian Hunter
  2024-12-03  8:01   ` Binbin Wu
@ 2024-12-10 16:17   ` Paolo Bonzini
  2024-12-10 20:10     ` Sean Christopherson
  2 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2024-12-10 16:17 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li

On 11/28/24 01:43, Sean Christopherson wrote:
> Rework __kvm_emulate_hypercall() into a macro so that completion of
> hypercalls that don't exit to userspace use direct function calls to the
> completion helper, i.e. don't trigger a retpoline when RETPOLINE=y.
>
> Opportunistically take the names of the input registers, as opposed to
> taking the input values, to preemptively dedup more of the calling code
> (TDX needs to use different registers).  Use the direct GPR accessors to
> read values to avoid the pointless marking of the registers as available
> (KVM requires GPRs to always be available).
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
> ---
>   arch/x86/kvm/x86.c | 29 +++++++++--------------------
>   arch/x86/kvm/x86.h | 25 ++++++++++++++++++++-----
>   2 files changed, 29 insertions(+), 25 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 39be2a891ab4..fef8b4e63d25 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9982,11 +9982,11 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
>   	return kvm_skip_emulated_instruction(vcpu);
>   }
>
> -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> -			    unsigned long a0, unsigned long a1,
> -			    unsigned long a2, unsigned long a3,
> -			    int op_64_bit, int cpl,
> -			    int (*complete_hypercall)(struct kvm_vcpu *))
> +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> +			      unsigned long a0, unsigned long a1,
> +			      unsigned long a2, unsigned long a3,
> +			      int op_64_bit, int cpl,
> +			      int (*complete_hypercall)(struct kvm_vcpu *))
>   {
>   	unsigned long ret;
>
> @@ -10073,32 +10073,21 @@ int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>
>   out:
>   	vcpu->run->hypercall.ret = ret;
> -	complete_hypercall(vcpu);
>   	return 1;
>   }
> -EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall);
> +EXPORT_SYMBOL_GPL(____kvm_emulate_hypercall);
>
>   int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>   {
> -	unsigned long nr, a0, a1, a2, a3;
> -	int op_64_bit;
> -	int cpl;
> -
>   	if (kvm_xen_hypercall_enabled(vcpu->kvm))
>   		return kvm_xen_hypercall(vcpu);
>
>   	if (kvm_hv_hypercall_enabled(vcpu))
>   		return kvm_hv_hypercall(vcpu);
>
> -	nr = kvm_rax_read(vcpu);
> -	a0 = kvm_rbx_read(vcpu);
> -	a1 = kvm_rcx_read(vcpu);
> -	a2 = kvm_rdx_read(vcpu);
> -	a3 = kvm_rsi_read(vcpu);
> -	op_64_bit = is_64_bit_hypercall(vcpu);
> -	cpl = kvm_x86_call(get_cpl)(vcpu);
> -
> -	return __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl,
> +	return __kvm_emulate_hypercall(vcpu, rax, rbx, rcx, rdx, rsi,
> +				       is_64_bit_hypercall(vcpu),
> +				       kvm_x86_call(get_cpl)(vcpu),
>   				       complete_hypercall_exit);
>   }
>   EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 28adc8ea04bf..ad6fe6159dea 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -617,11 +617,26 @@ static inline bool user_exit_on_hypercall(struct kvm *kvm, unsigned long hc_nr)
>   	return kvm->arch.hypercall_exit_enabled & BIT(hc_nr);
>   }
>
> -int __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> -			    unsigned long a0, unsigned long a1,
> -			    unsigned long a2, unsigned long a3,
> -			    int op_64_bit, int cpl,
> -			    int (*complete_hypercall)(struct kvm_vcpu *));
> +int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> +			      unsigned long a0, unsigned long a1,
> +			      unsigned long a2, unsigned long a3,
> +			      int op_64_bit, int cpl,
> +			      int (*complete_hypercall)(struct kvm_vcpu *));
> +
> +#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall)	\
> +({												\
> +	int __ret;										\
> +												\
> +	__ret = ____kvm_emulate_hypercall(_vcpu,						\
> +					  kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu),	\
> +					  kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu),	\
> +					  kvm_##a3##_read(_vcpu), op_64_bit, cpl,		\
> +					  complete_hypercall);					\
> +												\
> +	if (__ret > 0)										\
> +		complete_hypercall(_vcpu);							\

So based on the review of the previous patch this should become

    __ret = complete_hypercall(_vcpu);

Applied with this change to kvm-coco-queue, thanks.

Paolo

> +	__ret;											\
> +})
>
>   int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
>

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
  2024-12-10 16:17   ` Paolo Bonzini
@ 2024-12-10 20:10     ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2024-12-10 20:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li

On Tue, Dec 10, 2024, Paolo Bonzini wrote:
> On 11/28/24 01:43, Sean Christopherson wrote:
> > +#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall)	\
> > +({												\
> > +	int __ret;										\
> > +												\
> > +	__ret = ____kvm_emulate_hypercall(_vcpu,						\
> > +					  kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu),	\
> > +					  kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu),	\
> > +					  kvm_##a3##_read(_vcpu), op_64_bit, cpl,		\
> > +					  complete_hypercall);					\
> > +												\
> > +	if (__ret > 0)										\
> > +		complete_hypercall(_vcpu);							\
> 
> So based on the review of the previous patch this should become
> 
>     __ret = complete_hypercall(_vcpu);
> 
> Applied with this change to kvm-coco-queue, thanks.

I was planning on applying this for 6.14.  Should I still do that, or do you
want to take the bulk of the series through kvm/next, or maybe let it sit in
kvm-coco-queue?

I can't think of any potential conflicts off the top of my head, and the
refactoring is really only useful for TDX.  Patch 1 should go in sooner than
later though.

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX
  2024-11-28  0:43 [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Sean Christopherson
                   ` (5 preceding siblings ...)
  2024-11-28  0:43 ` [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro Sean Christopherson
@ 2024-11-28  1:06 ` Huang, Kai
  2024-12-19  2:40 ` Sean Christopherson
  7 siblings, 0 replies; 39+ messages in thread
From: Huang, Kai @ 2024-11-28  1:06 UTC (permalink / raw)
  To: pbonzini@redhat.com, seanjc@google.com
  Cc: thomas.lendacky@amd.com, kvm@vger.kernel.org, Li, Xiaoyao,
	linux-kernel@vger.kernel.org, binbin.wu@linux.intel.com,
	Yamahata, Isaku

On Wed, 2024-11-27 at 16:43 -0800, Sean Christopherson wrote:
> Effectively v4 of Binbin's series to handle hypercall exits to userspace in
> a generic manner, so that TDX
> 
> Binbin and Kai, this is fairly different that what we last discussed.  While
> sorting through Binbin's latest patch, I stumbled on what I think/hope is an
> approach that will make life easier for TDX.  Rather than have common code
> set the return value, _and_ have TDX implement a callback to do the same for
> user return MSRs, just use the callback for all paths.
> 
> As for abusing vcpu->run->hypercall.ret...  It's obviously a bit gross, but
> I think it's a lesser evil than having multiple a one-line wrappers just to
> trampoline in the return code.

Doesn't seem to be "gross" to me, and AFAICT now for TDX we just need to
play with __kvm_emulate_hypercall() with a TDX-specific completion callback.
Which is nice.  Thanks!

For this series:

Reviewed-by: Kai Huang <kai.huang@intel.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX
  2024-11-28  0:43 [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Sean Christopherson
                   ` (6 preceding siblings ...)
  2024-11-28  1:06 ` [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Huang, Kai
@ 2024-12-19  2:40 ` Sean Christopherson
  2025-01-15  9:40   ` Binbin Wu
  7 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2024-12-19  2:40 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Binbin Wu, Isaku Yamahata,
	Kai Huang, Xiaoyao Li

On Wed, 27 Nov 2024 16:43:38 -0800, Sean Christopherson wrote:
> Effectively v4 of Binbin's series to handle hypercall exits to userspace in
> a generic manner, so that TDX
> 
> Binbin and Kai, this is fairly different that what we last discussed.  While
> sorting through Binbin's latest patch, I stumbled on what I think/hope is an
> approach that will make life easier for TDX.  Rather than have common code
> set the return value, _and_ have TDX implement a callback to do the same for
> user return MSRs, just use the callback for all paths.
> 
> [...]

Applied patch 1 to kvm-x86 fixes.  I'm going to hold off on the rest until the
dust settles on the SEAMCALL interfaces, e.g. in case TDX ends up marshalling
state into the "normal" GPRs.

[1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit()
      https://github.com/kvm-x86/linux/commit/a317794eefd0
[2/6] KVM: x86: Add a helper to check for user interception of KVM hypercalls
      (no commit info)
[3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h
      (no commit info)
[4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall
      (no commit info)
[5/6] KVM: x86: Always complete hypercall via function callback
      (no commit info)
[6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro
      (no commit info)

--
https://github.com/kvm-x86/linux/tree/next

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX
  2024-12-19  2:40 ` Sean Christopherson
@ 2025-01-15  9:40   ` Binbin Wu
  2025-01-17 19:31     ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Binbin Wu @ 2025-01-15  9:40 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Tom Lendacky, Isaku Yamahata, Kai Huang,
	Xiaoyao Li, Hunter, Adrian, Rick Edgecombe

On 12/19/2024 10:40 AM, Sean Christopherson wrote:
> On Wed, 27 Nov 2024 16:43:38 -0800, Sean Christopherson wrote:
>> Effectively v4 of Binbin's series to handle hypercall exits to userspace in
>> a generic manner, so that TDX
>>
>> Binbin and Kai, this is fairly different that what we last discussed.  While
>> sorting through Binbin's latest patch, I stumbled on what I think/hope is an
>> approach that will make life easier for TDX.  Rather than have common code
>> set the return value, _and_ have TDX implement a callback to do the same for
>> user return MSRs, just use the callback for all paths.
>>
>> [...]
> Applied patch 1 to kvm-x86 fixes.  I'm going to hold off on the rest until the
> dust settles on the SEAMCALL interfaces, e.g. in case TDX ends up marshalling
> state into the "normal" GPRs.

Hi Sean,

Based on your suggestions in the link
https://lore.kernel.org/kvm/Z1suNzg2Or743a7e@google.com, the v2 of
"KVM: TDX: TDX hypercalls may exit to userspace" is planned to morph the
TDG.VP.VMCALL with KVM hypercall to EXIT_REASON_VMCALL and marshall r10~r14
from vp_enter_args in struct vcpu_tdx to the appropriate x86 registers for
KVM hypercall handling.
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index ef66985ddc91..d5aaf66af835 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -935,6 +935,23 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit) return tdx_exit_handlers_fastpath(vcpu); } +static int complete_hypercall_exit(struct kvm_vcpu *vcpu) +{ + tdvmcall_set_return_code(vcpu, vcpu->run->hypercall.ret); + return 1; +} + +static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu) +{ + kvm_rax_write(vcpu, to_tdx(vcpu)->vp_enter_args.r10); + kvm_rbx_write(vcpu, to_tdx(vcpu)->vp_enter_args.r11); + kvm_rcx_write(vcpu, to_tdx(vcpu)->vp_enter_args.r12); + kvm_rdx_write(vcpu, to_tdx(vcpu)->vp_enter_args.r13); + kvm_rsi_write(vcpu, to_tdx(vcpu)->vp_enter_args.r14); + + return __kvm_emulate_hypercall(vcpu, 0, complete_hypercall_exit); +} + static int handle_tdvmcall(struct kvm_vcpu *vcpu) { switch (tdvmcall_leaf(vcpu)) { @@ -1286,6 +1303,8 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) return 0; case EXIT_REASON_TDCALL: return handle_tdvmcall(vcpu); + case EXIT_REASON_VMCALL: + return tdx_emulate_vmcall(vcpu); default: break; } To test TDX, I made some modifications to your patch "KVM: x86: Refactor __kvm_emulate_hypercall() into a macro" Are the following changes make sense to you? 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a2198807290b..2c5df57ad799 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10088,9 +10088,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) if (kvm_hv_hypercall_enabled(vcpu)) return kvm_hv_hypercall(vcpu); - return __kvm_emulate_hypercall(vcpu, rax, rbx, rcx, rdx, rsi, - is_64_bit_hypercall(vcpu), - kvm_x86_call(get_cpl)(vcpu), + return __kvm_emulate_hypercall(vcpu, kvm_x86_call(get_cpl)(vcpu), complete_hypercall_exit); } EXPORT_SYMBOL_GPL(kvm_emulate_hypercall); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index b00ecbfef000..989bed5b48b0 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -623,19 +623,18 @@ int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, int op_64_bit, int cpl, int (*complete_hypercall)(struct kvm_vcpu *)); -#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall) \ -({ \ - int __ret; \ - \ - __ret = ____kvm_emulate_hypercall(_vcpu, \ - kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu), \ - kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu), \ - kvm_##a3##_read(_vcpu), op_64_bit, cpl, \ - complete_hypercall); \ - \ - if (__ret > 0) \ - __ret = complete_hypercall(_vcpu); \ - __ret; \ +#define __kvm_emulate_hypercall(_vcpu, cpl, complete_hypercall) \ +({ \ + int __ret; \ + __ret = ____kvm_emulate_hypercall(_vcpu, kvm_rax_read(_vcpu), \ + kvm_rbx_read(_vcpu), kvm_rcx_read(_vcpu), \ + kvm_rdx_read(_vcpu), kvm_rsi_read(_vcpu), \ + is_64_bit_hypercall(_vcpu), cpl, \ + complete_hypercall); \ + \ + if (__ret > 0) \ + __ret = complete_hypercall(_vcpu); \ + __ret; \ }) int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX
  2025-01-15  9:40 ` Binbin Wu
@ 2025-01-17 19:31   ` Sean Christopherson
  2025-01-20  0:37     ` Binbin Wu
  0 siblings, 1 reply; 39+ messages in thread
From: Sean Christopherson @ 2025-01-17 19:31 UTC (permalink / raw)
To: Binbin Wu
Cc: Paolo Bonzini, kvm, linux-kernel, Tom Lendacky, Isaku Yamahata,
	Kai Huang, Xiaoyao Li, Adrian Hunter, Rick Edgecombe

On Wed, Jan 15, 2025, Binbin Wu wrote:
> On 12/19/2024 10:40 AM, Sean Christopherson wrote:
> > On Wed, 27 Nov 2024 16:43:38 -0800, Sean Christopherson wrote:
> > > Effectively v4 of Binbin's series to handle hypercall exits to userspace in
> > > a generic manner, so that TDX
> > >
> > > Binbin and Kai, this is fairly different from what we last discussed.  While
> > > sorting through Binbin's latest patch, I stumbled on what I think/hope is an
> > > approach that will make life easier for TDX.  Rather than have common code
> > > set the return value, _and_ have TDX implement a callback to do the same for
> > > user return MSRs, just use the callback for all paths.
> > >
> > > [...]
> > Applied patch 1 to kvm-x86 fixes.  I'm going to hold off on the rest until the
> > dust settles on the SEAMCALL interfaces, e.g. in case TDX ends up marshalling
> > state into the "normal" GPRs.
> Hi Sean,
>
> Based on your suggestions in the link
> https://lore.kernel.org/kvm/Z1suNzg2Or743a7e@google.com, v2 of "KVM: TDX:
> TDX hypercalls may exit to userspace" is planned to morph the TDG.VP.VMCALL
> with KVM hypercall to EXIT_REASON_VMCALL and marshal r10~r14 from
> vp_enter_args in struct vcpu_tdx to the appropriate x86 registers for KVM
> hypercall handling.

...

> To test TDX, I made some modifications to your patch
> "KVM: x86: Refactor __kvm_emulate_hypercall() into a macro".
> Do the following changes make sense to you?

Yes, but I think we can go a step further and effectively revert the bulk of
commit e913ef159fad ("KVM: x86: Split core of hypercall emulation to helper
function"), i.e. have ____kvm_emulate_hypercall() read the GPRs instead of
passing them in via the macro.

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a2198807290b..2c5df57ad799 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10088,9 +10088,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  	if (kvm_hv_hypercall_enabled(vcpu))
>  		return kvm_hv_hypercall(vcpu);
>
> -	return __kvm_emulate_hypercall(vcpu, rax, rbx, rcx, rdx, rsi,
> -				       is_64_bit_hypercall(vcpu),
> -				       kvm_x86_call(get_cpl)(vcpu),
> +	return __kvm_emulate_hypercall(vcpu, kvm_x86_call(get_cpl)(vcpu),
> 				       complete_hypercall_exit);
>  }
>  EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index b00ecbfef000..989bed5b48b0 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -623,19 +623,18 @@ int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>  			      int op_64_bit, int cpl,
>  			      int (*complete_hypercall)(struct kvm_vcpu *));
>
> -#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall) \
> -({								\
> -	int __ret;						\
> -								\
> -	__ret = ____kvm_emulate_hypercall(_vcpu,		\
> -		kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu),	\
> -		kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu),	\
> -		kvm_##a3##_read(_vcpu), op_64_bit, cpl,		\
> -		complete_hypercall);				\
> -								\
> -	if (__ret > 0)						\
> -		__ret = complete_hypercall(_vcpu);		\
> -	__ret;							\
> +#define __kvm_emulate_hypercall(_vcpu, cpl, complete_hypercall)	\
> +({								\
> +	int __ret;						\
> +	__ret = ____kvm_emulate_hypercall(_vcpu, kvm_rax_read(_vcpu),	\
> +		kvm_rbx_read(_vcpu), kvm_rcx_read(_vcpu),	\
> +		kvm_rdx_read(_vcpu), kvm_rsi_read(_vcpu),	\
> +		is_64_bit_hypercall(_vcpu), cpl,		\
> +		complete_hypercall);				\
> +								\
> +	if (__ret > 0)						\
> +		__ret = complete_hypercall(_vcpu);		\
> +	__ret;							\
> })
>
> int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX
  2025-01-17 19:31 ` Sean Christopherson
@ 2025-01-20  0:37   ` Binbin Wu
  2025-01-21 18:04     ` Sean Christopherson
  0 siblings, 1 reply; 39+ messages in thread
From: Binbin Wu @ 2025-01-20  0:37 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, Tom Lendacky, Isaku Yamahata,
	Kai Huang, Xiaoyao Li, Adrian Hunter, Rick Edgecombe

On 1/18/2025 3:31 AM, Sean Christopherson wrote:
> On Wed, Jan 15, 2025, Binbin Wu wrote:
>> On 12/19/2024 10:40 AM, Sean Christopherson wrote:
>>> On Wed, 27 Nov 2024 16:43:38 -0800, Sean Christopherson wrote:
>>>> Effectively v4 of Binbin's series to handle hypercall exits to userspace in
>>>> a generic manner, so that TDX
>>>>
>>>> Binbin and Kai, this is fairly different from what we last discussed.  While
>>>> sorting through Binbin's latest patch, I stumbled on what I think/hope is an
>>>> approach that will make life easier for TDX.  Rather than have common code
>>>> set the return value, _and_ have TDX implement a callback to do the same for
>>>> user return MSRs, just use the callback for all paths.
>>>>
>>>> [...]
>>> Applied patch 1 to kvm-x86 fixes.  I'm going to hold off on the rest until the
>>> dust settles on the SEAMCALL interfaces, e.g. in case TDX ends up marshalling
>>> state into the "normal" GPRs.
>> Hi Sean,
>>
>> Based on your suggestions in the link
>> https://lore.kernel.org/kvm/Z1suNzg2Or743a7e@google.com, v2 of "KVM: TDX:
>> TDX hypercalls may exit to userspace" is planned to morph the TDG.VP.VMCALL
>> with KVM hypercall to EXIT_REASON_VMCALL and marshal r10~r14 from
>> vp_enter_args in struct vcpu_tdx to the appropriate x86 registers for KVM
>> hypercall handling.
> ...
>
>> To test TDX, I made some modifications to your patch
>> "KVM: x86: Refactor __kvm_emulate_hypercall() into a macro".
>> Do the following changes make sense to you?
> Yes, but I think we can go a step further and effectively revert the bulk of commit
> e913ef159fad ("KVM: x86: Split core of hypercall emulation to helper function"),
> i.e. have ____kvm_emulate_hypercall() read the GPRs instead of passing them in
> via the macro.

Sure.

Are you OK if I send the change (as a prep patch) along with v2 of
"TDX hypercalls may exit to userspace"?

>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a2198807290b..2c5df57ad799 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -10088,9 +10088,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>>  	if (kvm_hv_hypercall_enabled(vcpu))
>>  		return kvm_hv_hypercall(vcpu);
>>
>> -	return __kvm_emulate_hypercall(vcpu, rax, rbx, rcx, rdx, rsi,
>> -				       is_64_bit_hypercall(vcpu),
>> -				       kvm_x86_call(get_cpl)(vcpu),
>> +	return __kvm_emulate_hypercall(vcpu, kvm_x86_call(get_cpl)(vcpu),
>> 				       complete_hypercall_exit);
>>  }
>>  EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index b00ecbfef000..989bed5b48b0 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -623,19 +623,18 @@ int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>>  			      int op_64_bit, int cpl,
>>  			      int (*complete_hypercall)(struct kvm_vcpu *));
>>
>> -#define __kvm_emulate_hypercall(_vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, complete_hypercall) \
>> -({								\
>> -	int __ret;						\
>> -								\
>> -	__ret = ____kvm_emulate_hypercall(_vcpu,		\
>> -		kvm_##nr##_read(_vcpu), kvm_##a0##_read(_vcpu),	\
>> -		kvm_##a1##_read(_vcpu), kvm_##a2##_read(_vcpu),	\
>> -		kvm_##a3##_read(_vcpu), op_64_bit, cpl,		\
>> -		complete_hypercall);				\
>> -								\
>> -	if (__ret > 0)						\
>> -		__ret = complete_hypercall(_vcpu);		\
>> -	__ret;							\
>> +#define __kvm_emulate_hypercall(_vcpu, cpl, complete_hypercall)	\
>> +({								\
>> +	int __ret;						\
>> +	__ret = ____kvm_emulate_hypercall(_vcpu, kvm_rax_read(_vcpu),	\
>> +		kvm_rbx_read(_vcpu), kvm_rcx_read(_vcpu),	\
>> +		kvm_rdx_read(_vcpu), kvm_rsi_read(_vcpu),	\
>> +		is_64_bit_hypercall(_vcpu), cpl,		\
>> +		complete_hypercall);				\
>> +								\
>> +	if (__ret > 0)						\
>> +		__ret = complete_hypercall(_vcpu);		\
>> +	__ret;							\
>> })
>>
>> int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX
  2025-01-20  0:37 ` Binbin Wu
@ 2025-01-21 18:04   ` Sean Christopherson
  0 siblings, 0 replies; 39+ messages in thread
From: Sean Christopherson @ 2025-01-21 18:04 UTC (permalink / raw)
To: Binbin Wu
Cc: Paolo Bonzini, kvm, linux-kernel, Tom Lendacky, Isaku Yamahata,
	Kai Huang, Xiaoyao Li, Adrian Hunter, Rick Edgecombe

On Mon, Jan 20, 2025, Binbin Wu wrote:
> On 1/18/2025 3:31 AM, Sean Christopherson wrote:
> > On Wed, Jan 15, 2025, Binbin Wu wrote:
> > > On 12/19/2024 10:40 AM, Sean Christopherson wrote:
> > > > On Wed, 27 Nov 2024 16:43:38 -0800, Sean Christopherson wrote:
> > > > > Effectively v4 of Binbin's series to handle hypercall exits to userspace in
> > > > > a generic manner, so that TDX
> > > > >
> > > > > Binbin and Kai, this is fairly different from what we last discussed.  While
> > > > > sorting through Binbin's latest patch, I stumbled on what I think/hope is an
> > > > > approach that will make life easier for TDX.  Rather than have common code
> > > > > set the return value, _and_ have TDX implement a callback to do the same for
> > > > > user return MSRs, just use the callback for all paths.
> > > > >
> > > > > [...]
> > > > Applied patch 1 to kvm-x86 fixes.  I'm going to hold off on the rest until the
> > > > dust settles on the SEAMCALL interfaces, e.g. in case TDX ends up marshalling
> > > > state into the "normal" GPRs.
> > > Hi Sean,
> > >
> > > Based on your suggestions in the link
> > > https://lore.kernel.org/kvm/Z1suNzg2Or743a7e@google.com, v2 of "KVM: TDX:
> > > TDX hypercalls may exit to userspace" is planned to morph the TDG.VP.VMCALL
> > > with KVM hypercall to EXIT_REASON_VMCALL and marshal r10~r14 from
> > > vp_enter_args in struct vcpu_tdx to the appropriate x86 registers for KVM
> > > hypercall handling.
> > ...
> >
> > > To test TDX, I made some modifications to your patch
> > > "KVM: x86: Refactor __kvm_emulate_hypercall() into a macro".
> > > Do the following changes make sense to you?
> > Yes, but I think we can go a step further and effectively revert the bulk of commit
> > e913ef159fad ("KVM: x86: Split core of hypercall emulation to helper function"),
> > i.e. have ____kvm_emulate_hypercall() read the GPRs instead of passing them in
> > via the macro.
>
> Sure.
>
> Are you OK if I send the change (as a prep patch) along with v2 of
> "TDX hypercalls may exit to userspace"?

Ya, go for it.

^ permalink raw reply	[flat|nested] 39+ messages in thread
end of thread, other threads:[~2025-01-21 18:04 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-28  0:43 [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Sean Christopherson
2024-11-28  0:43 ` [PATCH v4 1/6] KVM: x86: Play nice with protected guests in complete_hypercall_exit() Sean Christopherson
2024-11-28  3:22   ` Xiaoyao Li
2024-11-29  9:01   ` Nikunj A. Dadhania
2024-12-02 18:54   ` Tom Lendacky
2024-12-03  7:29   ` Binbin Wu
2024-11-28  0:43 ` [PATCH v4 2/6] KVM: x86: Add a helper to check for user interception of KVM hypercalls Sean Christopherson
2024-12-02 18:57   ` Tom Lendacky
2024-11-28  0:43 ` [PATCH v4 3/6] KVM: x86: Move "emulate hypercall" function declarations to x86.h Sean Christopherson
2024-11-28  3:23   ` Xiaoyao Li
2024-12-02 20:05   ` Tom Lendacky
2024-12-03  7:33   ` Binbin Wu
2024-11-28  0:43 ` [PATCH v4 4/6] KVM: x86: Bump hypercall stat prior to fully completing hypercall Sean Christopherson
2024-11-28  3:24   ` Xiaoyao Li
2024-12-02 20:13   ` Tom Lendacky
2024-12-02 20:33     ` Tom Lendacky
2024-12-03  7:37   ` Binbin Wu
2024-11-28  0:43 ` [PATCH v4 5/6] KVM: x86: Always complete hypercall via function callback Sean Christopherson
2024-11-28  3:08   ` Xiaoyao Li
2024-12-02 18:44     ` Sean Christopherson
2024-12-02 20:57   ` Tom Lendacky
2024-12-02 20:59     ` Tom Lendacky
2024-12-03  0:14       ` Sean Christopherson
2024-11-28  0:43 ` [PATCH v4 6/6] KVM: x86: Refactor __kvm_emulate_hypercall() into a macro Sean Christopherson
2024-11-28  8:38   ` Adrian Hunter
2024-12-10 16:20     ` Paolo Bonzini
2024-12-10 20:03       ` Sean Christopherson
2024-12-12  7:32         ` Adrian Hunter
2024-12-12 15:42           ` Paolo Bonzini
2024-12-12 18:40             ` Sean Christopherson
2024-12-03  8:01   ` Binbin Wu
2024-12-10 16:17   ` Paolo Bonzini
2024-12-10 20:10     ` Sean Christopherson
2024-11-28  1:06 ` [PATCH v4 0/6] KVM: x86: Prep KVM hypercall handling for TDX Huang, Kai
2024-12-19  2:40 ` Sean Christopherson
2025-01-15  9:40   ` Binbin Wu
2025-01-17 19:31     ` Sean Christopherson
2025-01-20  0:37       ` Binbin Wu
2025-01-21 18:04         ` Sean Christopherson