* [PATCH v2] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall [not found] <20230417122206.34647-1-metikaya@amazon.co.uk> @ 2023-04-17 12:22 ` Metin Kaya 2023-04-17 16:31 ` Sean Christopherson 0 siblings, 1 reply; 17+ messages in thread From: Metin Kaya @ 2023-04-17 12:22 UTC (permalink / raw) To: kvm, pbonzini Cc: x86, bp, dwmw, paul, seanjc, tglx, mingo, dave.hansen, joao.m.martins, Metin Kaya HVMOP_flush_tlbs suboperation of hvm_op hypercall allows a guest to flush all vCPU TLBs. There is no way for the VMM to flush TLBs from userspace. Hence, this patch adds support for flushing vCPU TLBs to KVM by making a KVM_REQ_TLB_FLUSH_GUEST request for all guest vCPUs. Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> --- arch/x86/kvm/xen.c | 31 ++++++++++++++++++++++++++++++ include/xen/interface/hvm/hvm_op.h | 3 +++ 2 files changed, 34 insertions(+) diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 40edf4d1974c..78fa6d08bebc 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -21,6 +21,7 @@ #include <xen/interface/vcpu.h> #include <xen/interface/version.h> #include <xen/interface/event_channel.h> +#include <xen/interface/hvm/hvm_op.h> #include <xen/interface/sched.h> #include <asm/xen/cpuid.h> @@ -1330,6 +1331,32 @@ static bool kvm_xen_hcall_sched_op(struct kvm_vcpu *vcpu, bool longmode, return false; } +static void kvm_xen_hvmop_flush_tlbs(struct kvm_vcpu *vcpu, bool longmode, + u64 arg, u64 *r) +{ + if (arg) { + *r = -EINVAL; + return; + } + + kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH_GUEST); + *r = 0; +} + +static bool kvm_xen_hcall_hvm_op(struct kvm_vcpu *vcpu, bool longmode, + int cmd, u64 arg, u64 *r) +{ + switch (cmd) { + case HVMOP_flush_tlbs: + kvm_xen_hvmop_flush_tlbs(vcpu, longmode, arg, r); + return true; + default: + break; + } + + return false; +} + struct compat_vcpu_set_singleshot_timer { uint64_t timeout_abs_ns; uint32_t flags; @@ -1501,6 +1528,10 @@ int kvm_xen_hypercall(struct kvm_vcpu *vcpu) timeout |= params[1] << 32; handled = kvm_xen_hcall_set_timer_op(vcpu, timeout, &r); break; + case __HYPERVISOR_hvm_op: + handled = kvm_xen_hcall_hvm_op(vcpu, longmode, params[0], params[1], + &r); + break; } default: break; diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h index 03134bf3cec1..373123226c6f 100644 --- a/include/xen/interface/hvm/hvm_op.h +++ b/include/xen/interface/hvm/hvm_op.h @@ -16,6 +16,9 @@ struct xen_hvm_param { }; DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param); +/* Flushes all VCPU TLBs: @arg must be NULL. */ +#define HVMOP_flush_tlbs 5 + /* Hint from PV drivers for pagetable destruction. */ #define HVMOP_pagetable_dying 9 struct xen_hvm_pagetable_dying { -- 2.39.2 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-17 12:22 ` [PATCH v2] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall Metin Kaya @ 2023-04-17 16:31 ` Sean Christopherson 2023-04-18 9:14 ` [EXTERNAL][PATCH " David Woodhouse 0 siblings, 1 reply; 17+ messages in thread From: Sean Christopherson @ 2023-04-17 16:31 UTC (permalink / raw) To: Metin Kaya Cc: kvm, pbonzini, x86, bp, dwmw, paul, tglx, mingo, dave.hansen, joao.m.martins On Mon, Apr 17, 2023, Metin Kaya wrote: > HVMOP_flush_tlbs suboperation of hvm_op hypercall allows a guest to > flush all vCPU TLBs. There is no way for the VMM to flush TLBs from > userspace. Ah, took me a minute to connect the dots. Monday morning is definitely partly to blame, but it would be helpful to expand this sentence to be more explicit as to why userspace's inability to efficiently flush TLBs. And strictly speaking, userspace _can_ flush TLBs, just not in a precise, efficient way. > Hence, this patch adds support for flushing vCPU TLBs to KVM Don't use "this patch", just state what the patch does as a command. > by making a KVM_REQ_TLB_FLUSH_GUEST request for all guest vCPUs. E.g. Implement in-KVM support for Xen's HVMOP_flush_tlbs hypercall, which allows the guest to flush all vCPU's TLBs. KVM doesn't provide an ioctl() to precisely flush guest TLBs, and punting to userspace would likely negate the performance benefits of avoiding a TLB shootdown in the guest. > Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> > > --- Regarding the cover letter, first and foremost, make sure the cover letter has a subject. `git format-patch --cover-letter` will generate what you want, i.e. there's no need hand generate it (or however you ended up with a nearly empty mail with no subjection. Second, there's no need to provide a cover letter for a standalone patch just to capture the version information. This area of the patch, between the three "---" and the diff, is ignored by `git am`, and can be used for version info. > arch/x86/kvm/xen.c | 31 ++++++++++++++++++++++++++++++ > include/xen/interface/hvm/hvm_op.h | 3 +++ Modifications to uapi headers is conspicuously missing. I.e. there likely needs to be a capability so that userspace can query support. > 2 files changed, 34 insertions(+) > > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c > index 40edf4d1974c..78fa6d08bebc 100644 > --- a/arch/x86/kvm/xen.c > +++ b/arch/x86/kvm/xen.c > @@ -21,6 +21,7 @@ > #include <xen/interface/vcpu.h> > #include <xen/interface/version.h> > #include <xen/interface/event_channel.h> > +#include <xen/interface/hvm/hvm_op.h> > #include <xen/interface/sched.h> > > #include <asm/xen/cpuid.h> > @@ -1330,6 +1331,32 @@ static bool kvm_xen_hcall_sched_op(struct kvm_vcpu *vcpu, bool longmode, > return false; > } > > +static void kvm_xen_hvmop_flush_tlbs(struct kvm_vcpu *vcpu, bool longmode, Passing @longmode all the way down here just to ignore just screams copy+paste :-) > + u64 arg, u64 *r) > +{ > + if (arg) { > + *r = -EINVAL; > + return; > + } > + > + kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH_GUEST); This doesn't even compile. I'll give you one guess as to how much confidence I have that this was properly tested. Aha! And QEMU appears to have Xen emulation support. That means KVM-Unit-Tests is an option. Specifically, extend the "access" test to use this hypercall instead of INVLPG. That'll verify that the flush is actually being performed as expteced. > + *r = 0; Using a out param in a void function is silly. But that's a moot point because this whole function is silly, just open code it in the caller. > +} > + > +static bool kvm_xen_hcall_hvm_op(struct kvm_vcpu *vcpu, bool longmode, There's no need for @longmode. > + int cmd, u64 arg, u64 *r) > +{ > + switch (cmd) { > + case HVMOP_flush_tlbs: > + kvm_xen_hvmop_flush_tlbs(vcpu, longmode, arg, r); This can simply be: if (arg) { *r = -EINVAL; } else { kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_TLB_FLUSH_GUEST); *r = 0; } return true; > + return true; > + default: > + break; > + } > + > + return false; > +} > + > struct compat_vcpu_set_singleshot_timer { > uint64_t timeout_abs_ns; > uint32_t flags; > @@ -1501,6 +1528,10 @@ int kvm_xen_hypercall(struct kvm_vcpu *vcpu) > timeout |= params[1] << 32; > handled = kvm_xen_hcall_set_timer_op(vcpu, timeout, &r); > break; > + case __HYPERVISOR_hvm_op: > + handled = kvm_xen_hcall_hvm_op(vcpu, longmode, params[0], params[1], > + &r); It'll be a moot point, but in the future let code like this poke out past 80 chars. The 80 char soft limit exists to make code more readable, wrapping a one-char variable at the tail end of a function call does more harm than good. > + break; > } > default: > break; > diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h > index 03134bf3cec1..373123226c6f 100644 > --- a/include/xen/interface/hvm/hvm_op.h > +++ b/include/xen/interface/hvm/hvm_op.h > @@ -16,6 +16,9 @@ struct xen_hvm_param { > }; > DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param); > > +/* Flushes all VCPU TLBs: s/VCPU/vCPU And "all vCPU TLBs" is ambiguous. It could mean "all TLBs for the target vCPU". Adding "guest" in there would also be helpful, e.g. to make it clear that this doesn't flush TDP mappings. > @arg must be NULL. */ Is the "NULL" part verbatim from Xen? Because "0" seems like a better description than "NULL" for a u64. E.g. "Flush guest TLBs for all vCPUs" > +#define HVMOP_flush_tlbs 5 > + > /* Hint from PV drivers for pagetable destruction. */ > #define HVMOP_pagetable_dying 9 > struct xen_hvm_pagetable_dying { > -- > 2.39.2 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [EXTERNAL][PATCH v2] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-17 16:31 ` Sean Christopherson @ 2023-04-18 9:14 ` David Woodhouse 2023-04-18 10:13 ` [PATCH v3] " Metin Kaya 2023-04-18 14:44 ` [EXTERNAL][PATCH v2] " Sean Christopherson 0 siblings, 2 replies; 17+ messages in thread From: David Woodhouse @ 2023-04-18 9:14 UTC (permalink / raw) To: Sean Christopherson, Metin Kaya Cc: kvm, pbonzini, x86, bp, paul, tglx, mingo, dave.hansen, joao.m.martins [-- Attachment #1: Type: text/plain, Size: 3650 bytes --] On Mon, 2023-04-17 at 09:31 -0700, Sean Christopherson wrote: > On Mon, Apr 17, 2023, Metin Kaya wrote: > > HVMOP_flush_tlbs suboperation of hvm_op hypercall allows a guest to > > flush all vCPU TLBs. There is no way for the VMM to flush TLBs from > > userspace. > > Ah, took me a minute to connect the dots. Monday morning is definitely partly > to blame, but it would be helpful to expand this sentence to be more explicit as > to why userspace's inability to efficiently flush TLBs. > > And strictly speaking, userspace _can_ flush TLBs, just not in a precise, efficient > way. Hm, how? We should probably implement that in userspace as a fallback, however much it sucks. > > arch/x86/kvm/xen.c | 31 ++++++++++++++++++++++++++++++ > > include/xen/interface/hvm/hvm_op.h | 3 +++ > > Modifications to uapi headers is conspicuously missing. I.e. there likely needs > to be a capability so that userspace can query support. Nah, nobody cares. If the kernel "accelerates" this hypercall, so be it. Userspace will just never get the KVM_EXIT_XEN for that hypercall because it'll be magically handled, like the others. > > > + u64 arg, u64 *r) > > +{ > > + if (arg) { > > + *r = -EINVAL; > > + return; > > + } > > + > > + kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH_GUEST); > > This doesn't even compile. I'll give you one guess as to how much confidence I > have that this was properly tested. > > Aha! And QEMU appears to have Xen emulation support. As of 8.0, yes :) > That means KVM-Unit-Tests > is an option. Specifically, extend the "access" test to use this hypercall instead > of INVLPG. That'll verify that the flush is actually being performed as expteced. > > > + int cmd, u64 arg, u64 *r) > > +{ > > + switch (cmd) { > > + case HVMOP_flush_tlbs: > > + kvm_xen_hvmop_flush_tlbs(vcpu, longmode, arg, r); > > This can simply be: > > if (arg) { > *r = -EINVAL; I don't even know that we care about in-kernel acceleration for the -EINVAL case. We could just return false for that, and let userspace (report and) handle it. It should never happen; Xen has never accepted anything but NULL as the first argument. For reasons unclear. case HVMOP_flush_tlbs: rc = guest_handle_is_null(arg) ? hvmop_flush_tlb_all() : -EINVAL; I don't think we'll be adding any more HVMOPs to that switch statement so we might as well make it: if (cmd == HVMOP_flush_tlbs && !arg) > } else { > kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_TLB_FLUSH_GUEST); > *r = 0; > } > return true; ... > > +/* Flushes all VCPU TLBs: > > s/VCPU/vCPU > > And "all vCPU TLBs" is ambiguous. It could mean "all TLBs for the target vCPU". > Adding "guest" in there would also be helpful, e.g. to make it clear that this > doesn't flush TDP mappings. > > > @arg must be NULL. */ > > Is the "NULL" part verbatim from Xen? Because "0" seems like a better description > than "NULL" for a u64. Yes, that's all verbatim from Xen. The first hypercall argument is typically a pointer. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 9:14 ` [EXTERNAL][PATCH " David Woodhouse @ 2023-04-18 10:13 ` Metin Kaya 2023-04-18 10:48 ` Paul Durrant ` (2 more replies) 2023-04-18 14:44 ` [EXTERNAL][PATCH v2] " Sean Christopherson 1 sibling, 3 replies; 17+ messages in thread From: Metin Kaya @ 2023-04-18 10:13 UTC (permalink / raw) To: kvm, pbonzini Cc: x86, bp, dwmw, paul, seanjc, tglx, mingo, dave.hansen, joao.m.martins, Metin Kaya Implement in-KVM support for Xen's HVMOP_flush_tlbs hypercall, which allows the guest to flush all vCPU's TLBs. KVM doesn't provide an ioctl() to precisely flush guest TLBs, and punting to userspace would likely negate the performance benefits of avoiding a TLB shootdown in the guest. Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> --- v3: - Addressed comments for v2. - Verified with XTF/invlpg test case. v2: - Removed an irrelevant URL from commit message. --- arch/x86/kvm/xen.c | 15 +++++++++++++++ include/xen/interface/hvm/hvm_op.h | 3 +++ 2 files changed, 18 insertions(+) diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 40edf4d1974c..a63c48e8d8fa 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -21,6 +21,7 @@ #include <xen/interface/vcpu.h> #include <xen/interface/version.h> #include <xen/interface/event_channel.h> +#include <xen/interface/hvm/hvm_op.h> #include <xen/interface/sched.h> #include <asm/xen/cpuid.h> @@ -1330,6 +1331,17 @@ static bool kvm_xen_hcall_sched_op(struct kvm_vcpu *vcpu, bool longmode, return false; } +static bool kvm_xen_hcall_hvm_op(struct kvm_vcpu *vcpu, int cmd, u64 arg, u64 *r) +{ + if (cmd == HVMOP_flush_tlbs && !arg) { + kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_TLB_FLUSH_GUEST); + *r = 0; + return true; + } + + return false; +} + struct compat_vcpu_set_singleshot_timer { uint64_t timeout_abs_ns; uint32_t flags; @@ -1501,6 +1513,9 @@ int kvm_xen_hypercall(struct kvm_vcpu *vcpu) timeout |= params[1] << 32; handled = kvm_xen_hcall_set_timer_op(vcpu, timeout, &r); break; + case __HYPERVISOR_hvm_op: + handled = kvm_xen_hcall_hvm_op(vcpu, params[0], params[1], &r); + break; } default: break; diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h index 03134bf3cec1..240d8149bc04 100644 --- a/include/xen/interface/hvm/hvm_op.h +++ b/include/xen/interface/hvm/hvm_op.h @@ -16,6 +16,9 @@ struct xen_hvm_param { }; DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param); +/* Flushes guest TLBs for all vCPUs: @arg must be 0. */ +#define HVMOP_flush_tlbs 5 + /* Hint from PV drivers for pagetable destruction. */ #define HVMOP_pagetable_dying 9 struct xen_hvm_pagetable_dying { -- 2.39.2 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 10:13 ` [PATCH v3] " Metin Kaya @ 2023-04-18 10:48 ` Paul Durrant 2023-04-18 11:04 ` Kaya, Metin 2023-04-18 11:05 ` [EXTERNAL][PATCH " David Woodhouse 2023-05-26 20:32 ` [PATCH " Sean Christopherson 2023-08-04 0:22 ` Sean Christopherson 2 siblings, 2 replies; 17+ messages in thread From: Paul Durrant @ 2023-04-18 10:48 UTC (permalink / raw) To: Metin Kaya, kvm, pbonzini Cc: x86, bp, dwmw, seanjc, tglx, mingo, dave.hansen, joao.m.martins On 18/04/2023 11:13, Metin Kaya wrote: > Implement in-KVM support for Xen's HVMOP_flush_tlbs hypercall, which > allows the guest to flush all vCPU's TLBs. KVM doesn't provide an > ioctl() to precisely flush guest TLBs, and punting to userspace would > likely negate the performance benefits of avoiding a TLB shootdown in > the guest. > > Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> > > --- > v3: > - Addressed comments for v2. > - Verified with XTF/invlpg test case. > > v2: > - Removed an irrelevant URL from commit message. > --- > arch/x86/kvm/xen.c | 15 +++++++++++++++ > include/xen/interface/hvm/hvm_op.h | 3 +++ > 2 files changed, 18 insertions(+) > > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c > index 40edf4d1974c..a63c48e8d8fa 100644 > --- a/arch/x86/kvm/xen.c > +++ b/arch/x86/kvm/xen.c > @@ -21,6 +21,7 @@ > #include <xen/interface/vcpu.h> > #include <xen/interface/version.h> > #include <xen/interface/event_channel.h> > +#include <xen/interface/hvm/hvm_op.h> > #include <xen/interface/sched.h> > > #include <asm/xen/cpuid.h> > @@ -1330,6 +1331,17 @@ static bool kvm_xen_hcall_sched_op(struct kvm_vcpu *vcpu, bool longmode, > return false; > } > > +static bool kvm_xen_hcall_hvm_op(struct kvm_vcpu *vcpu, int cmd, u64 arg, u64 *r) > +{ > + if (cmd == HVMOP_flush_tlbs && !arg) { > + kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_TLB_FLUSH_GUEST); > + *r = 0; > + return true; > + } > + > + return false; > +} This code structure means that arg != NULL will result in the guest seeing ENOSYS rather than EINVAL. Paul > + > struct compat_vcpu_set_singleshot_timer { > uint64_t timeout_abs_ns; > uint32_t flags; > @@ -1501,6 +1513,9 @@ int kvm_xen_hypercall(struct kvm_vcpu *vcpu) > timeout |= params[1] << 32; > handled = kvm_xen_hcall_set_timer_op(vcpu, timeout, &r); > break; > + case __HYPERVISOR_hvm_op: > + handled = kvm_xen_hcall_hvm_op(vcpu, params[0], params[1], &r); > + break; > } > default: > break; > diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h > index 03134bf3cec1..240d8149bc04 100644 > --- a/include/xen/interface/hvm/hvm_op.h > +++ b/include/xen/interface/hvm/hvm_op.h > @@ -16,6 +16,9 @@ struct xen_hvm_param { > }; > DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param); > > +/* Flushes guest TLBs for all vCPUs: @arg must be 0. */ > +#define HVMOP_flush_tlbs 5 > + > /* Hint from PV drivers for pagetable destruction. */ > #define HVMOP_pagetable_dying 9 > struct xen_hvm_pagetable_dying { ^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 10:48 ` Paul Durrant @ 2023-04-18 11:04 ` Kaya, Metin 2023-04-18 11:13 ` Paul Durrant 2023-04-18 11:05 ` [EXTERNAL][PATCH " David Woodhouse 1 sibling, 1 reply; 17+ messages in thread From: Kaya, Metin @ 2023-04-18 11:04 UTC (permalink / raw) To: paul@xen.org Cc: x86@kernel.org, bp@alien8.de, Woodhouse, David, seanjc@google.com, tglx@linutronix.de, mingo@redhat.com, dave.hansen@linux.intel.com, joao.m.martins@oracle.com, pbonzini@redhat.com, kvm@vger.kernel.org On 18/04/2023 11:48, Paul Durrant wrote: >On 18/04/2023 11:13, Metin Kaya wrote: >> Implement in-KVM support for Xen's HVMOP_flush_tlbs hypercall, which >> allows the guest to flush all vCPU's TLBs. KVM doesn't provide an >> ioctl() to precisely flush guest TLBs, and punting to userspace would >> likely negate the performance benefits of avoiding a TLB shootdown in >> the guest. >> >> Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> >> >> --- >> v3: >> - Addressed comments for v2. >> - Verified with XTF/invlpg test case. >> >> v2: >> - Removed an irrelevant URL from commit message. >> --- >> arch/x86/kvm/xen.c | 15 +++++++++++++++ >> include/xen/interface/hvm/hvm_op.h | 3 +++ >> 2 files changed, 18 insertions(+) >> >> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index >> 40edf4d1974c..a63c48e8d8fa 100644 >> --- a/arch/x86/kvm/xen.c >> +++ b/arch/x86/kvm/xen.c >> @@ -21,6 +21,7 @@ >> #include <xen/interface/vcpu.h> >> #include <xen/interface/version.h> >> #include <xen/interface/event_channel.h> >> +#include <xen/interface/hvm/hvm_op.h> >> #include <xen/interface/sched.h> >> >> #include <asm/xen/cpuid.h> >> @@ -1330,6 +1331,17 @@ static bool kvm_xen_hcall_sched_op(struct kvm_vcpu *vcpu, bool longmode, >> return false; >> } >> >> +static bool kvm_xen_hcall_hvm_op(struct kvm_vcpu *vcpu, int cmd, u64 >> +arg, u64 *r) { >> + if (cmd == HVMOP_flush_tlbs && !arg) { >> + kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_TLB_FLUSH_GUEST); >> + *r = 0; >> + return true; >> + } >> + >> + return false; >> +} > >This code structure means that arg != NULL will result in the guest seeing ENOSYS rather than EINVAL. > > Paul Yes, because of this comment in David's email: "I don't even know that we care about in-kernel acceleration for the -EINVAL case. We could just return false for that, and let userspace (report and) handle it." > >> + >> struct compat_vcpu_set_singleshot_timer { >> uint64_t timeout_abs_ns; >> uint32_t flags; >> @@ -1501,6 +1513,9 @@ int kvm_xen_hypercall(struct kvm_vcpu *vcpu) >> timeout |= params[1] << 32; >> handled = kvm_xen_hcall_set_timer_op(vcpu, timeout, &r); >> break; >> + case __HYPERVISOR_hvm_op: >> + handled = kvm_xen_hcall_hvm_op(vcpu, params[0], params[1], &r); >> + break; >> } >> default: >> break; >> diff --git a/include/xen/interface/hvm/hvm_op.h >> b/include/xen/interface/hvm/hvm_op.h >> index 03134bf3cec1..240d8149bc04 100644 >> --- a/include/xen/interface/hvm/hvm_op.h >> +++ b/include/xen/interface/hvm/hvm_op.h >> @@ -16,6 +16,9 @@ struct xen_hvm_param { >> }; >> DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param); >> >> +/* Flushes guest TLBs for all vCPUs: @arg must be 0. */ >> +#define HVMOP_flush_tlbs 5 >> + >> /* Hint from PV drivers for pagetable destruction. */ >> #define HVMOP_pagetable_dying 9 >> struct xen_hvm_pagetable_dying { > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 11:04 ` Kaya, Metin @ 2023-04-18 11:13 ` Paul Durrant 0 siblings, 0 replies; 17+ messages in thread From: Paul Durrant @ 2023-04-18 11:13 UTC (permalink / raw) To: Kaya, Metin Cc: x86@kernel.org, bp@alien8.de, Woodhouse, David, seanjc@google.com, tglx@linutronix.de, mingo@redhat.com, dave.hansen@linux.intel.com, joao.m.martins@oracle.com, pbonzini@redhat.com, kvm@vger.kernel.org On 18/04/2023 12:04, Kaya, Metin wrote: > On 18/04/2023 11:48, Paul Durrant wrote: >> On 18/04/2023 11:13, Metin Kaya wrote: >>> Implement in-KVM support for Xen's HVMOP_flush_tlbs hypercall, which >>> allows the guest to flush all vCPU's TLBs. KVM doesn't provide an >>> ioctl() to precisely flush guest TLBs, and punting to userspace would >>> likely negate the performance benefits of avoiding a TLB shootdown in >>> the guest. >>> >>> Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> >>> >>> --- >>> v3: >>> - Addressed comments for v2. >>> - Verified with XTF/invlpg test case. >>> >>> v2: >>> - Removed an irrelevant URL from commit message. >>> --- >>> arch/x86/kvm/xen.c | 15 +++++++++++++++ >>> include/xen/interface/hvm/hvm_op.h | 3 +++ >>> 2 files changed, 18 insertions(+) >>> >>> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index >>> 40edf4d1974c..a63c48e8d8fa 100644 >>> --- a/arch/x86/kvm/xen.c >>> +++ b/arch/x86/kvm/xen.c >>> @@ -21,6 +21,7 @@ >>> #include <xen/interface/vcpu.h> >>> #include <xen/interface/version.h> >>> #include <xen/interface/event_channel.h> >>> +#include <xen/interface/hvm/hvm_op.h> >>> #include <xen/interface/sched.h> >>> >>> #include <asm/xen/cpuid.h> >>> @@ -1330,6 +1331,17 @@ static bool kvm_xen_hcall_sched_op(struct kvm_vcpu *vcpu, bool longmode, >>> return false; >>> } >>> >>> +static bool kvm_xen_hcall_hvm_op(struct kvm_vcpu *vcpu, int cmd, u64 >>> +arg, u64 *r) { >>> + if (cmd == HVMOP_flush_tlbs && !arg) { >>> + kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_TLB_FLUSH_GUEST); >>> + *r = 0; >>> + return true; >>> + } >>> + >>> + return false; >>> +} >> >> This code structure means that arg != NULL will result in the guest seeing ENOSYS rather than EINVAL. >> >> Paul > > Yes, because of this comment in David's email: > "I don't even know that we care about in-kernel acceleration for the > -EINVAL case. We could just return false for that, and let userspace > (report and) handle it." > Ok, fair enough. Reviewed-by: Paul Durrant <paul@xen.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [EXTERNAL][PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 10:48 ` Paul Durrant 2023-04-18 11:04 ` Kaya, Metin @ 2023-04-18 11:05 ` David Woodhouse 1 sibling, 0 replies; 17+ messages in thread From: David Woodhouse @ 2023-04-18 11:05 UTC (permalink / raw) To: paul, Metin Kaya, kvm, pbonzini Cc: x86, bp, seanjc, tglx, mingo, dave.hansen, joao.m.martins [-- Attachment #1: Type: text/plain, Size: 2441 bytes --] On Tue, 2023-04-18 at 11:48 +0100, Paul Durrant wrote: > On 18/04/2023 11:13, Metin Kaya wrote: > > Implement in-KVM support for Xen's HVMOP_flush_tlbs hypercall, which > > allows the guest to flush all vCPU's TLBs. KVM doesn't provide an > > ioctl() to precisely flush guest TLBs, and punting to userspace would > > likely negate the performance benefits of avoiding a TLB shootdown in > > the guest. > > > > Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Although as noted on the internal review and by Sean, it would be good to have a test case that verifies that the TLBs are actually flushed. > > > > --- > > v3: > > - Addressed comments for v2. > > - Verified with XTF/invlpg test case. > > > > v2: > > - Removed an irrelevant URL from commit message. > > --- > > arch/x86/kvm/xen.c | 15 +++++++++++++++ > > include/xen/interface/hvm/hvm_op.h | 3 +++ > > 2 files changed, 18 insertions(+) > > > > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c > > index 40edf4d1974c..a63c48e8d8fa 100644 > > --- a/arch/x86/kvm/xen.c > > +++ b/arch/x86/kvm/xen.c > > @@ -21,6 +21,7 @@ > > #include <xen/interface/vcpu.h> > > #include <xen/interface/version.h> > > #include <xen/interface/event_channel.h> > > +#include <xen/interface/hvm/hvm_op.h> > > #include <xen/interface/sched.h> > > > > #include <asm/xen/cpuid.h> > > @@ -1330,6 +1331,17 @@ static bool kvm_xen_hcall_sched_op(struct kvm_vcpu *vcpu, bool longmode, > > return false; > > } > > > > +static bool kvm_xen_hcall_hvm_op(struct kvm_vcpu *vcpu, int cmd, u64 arg, u64 *r) > > +{ > > + if (cmd == HVMOP_flush_tlbs && !arg) { > > + kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_TLB_FLUSH_GUEST); > > + *r = 0; > > + return true; > > + } > > + > > + return false; > > +} > > This code structure means that arg != NULL will result in the guest > seeing ENOSYS rather than EINVAL. In kvm_xen_hypercall() the default for 'r' is -ENOSYS but because 'handled' never gets set to true, we don't hand that back to the guest. Instead we get to handle_in_userspace: and do the KVM_EXIT_XEN exit. So arg != NULL will cause the standard hypercall exit to userspace just as it does today. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 10:13 ` [PATCH v3] " Metin Kaya 2023-04-18 10:48 ` Paul Durrant @ 2023-05-26 20:32 ` Sean Christopherson 2023-07-25 12:56 ` David Woodhouse 2023-08-04 0:22 ` Sean Christopherson 2 siblings, 1 reply; 17+ messages in thread From: Sean Christopherson @ 2023-05-26 20:32 UTC (permalink / raw) To: Metin Kaya Cc: kvm, pbonzini, x86, bp, dwmw, paul, tglx, mingo, dave.hansen, joao.m.martins On Tue, Apr 18, 2023, Metin Kaya wrote: > Implement in-KVM support for Xen's HVMOP_flush_tlbs hypercall, which > allows the guest to flush all vCPU's TLBs. KVM doesn't provide an > ioctl() to precisely flush guest TLBs, and punting to userspace would > likely negate the performance benefits of avoiding a TLB shootdown in > the guest. > > Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> > > --- > v3: > - Addressed comments for v2. > - Verified with XTF/invlpg test case. > > v2: > - Removed an irrelevant URL from commit message. > --- > arch/x86/kvm/xen.c | 15 +++++++++++++++ > include/xen/interface/hvm/hvm_op.h | 3 +++ > 2 files changed, 18 insertions(+) I still don't see a testcase. : This doesn't even compile. I'll give you one guess as to how much confidence I : have that this was properly tested. : : Aha! And QEMU appears to have Xen emulation support. That means KVM-Unit-Tests : is an option. Specifically, extend the "access" test to use this hypercall instead : of INVLPG. That'll verify that the flush is actually being performed as expteced. Let me be more explicit this time: I am not applying this without a test. I don't care how trivial a patch may seem, I'm done taking patches without test coverage unless there's a *really* good reason for me to do so. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-05-26 20:32 ` [PATCH " Sean Christopherson @ 2023-07-25 12:56 ` David Woodhouse 2023-07-26 20:07 ` Sean Christopherson 0 siblings, 1 reply; 17+ messages in thread From: David Woodhouse @ 2023-07-25 12:56 UTC (permalink / raw) To: Sean Christopherson, Metin Kaya Cc: kvm, pbonzini, x86, bp, paul, tglx, mingo, dave.hansen, joao.m.martins [-- Attachment #1: Type: text/plain, Size: 3957 bytes --] On Fri, 2023-05-26 at 13:32 -0700, Sean Christopherson wrote: > : Aha! And QEMU appears to have Xen emulation support. That means KVM-Unit-Tests > : is an option. Specifically, extend the "access" test to use this hypercall instead > : of INVLPG. That'll verify that the flush is actually being performed as expteced. That works. Metin has a better version that actually sets up the hypercall page properly and uses it, but that one bails out when Xen support isn't present, and doesn't show the failure mode quite so clearly. This is the simple version: From: Metin Kaya <metikaya@amazon.co.uk> Subject: [PATCH] x86/access: Use hvm_op/flush_tlbs hypercall instead of invlpg instruction Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> --- x86/access.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/x86/access.c b/x86/access.c index 83c8221..6fa08c5 100644 --- a/x86/access.c +++ b/x86/access.c @@ -253,12 +253,31 @@ static void clear_user_mask(pt_element_t *ptep, int level, unsigned long virt) *ptep &= ~PT_USER_MASK; } +#define KVM_HYPERCALL_INTEL ".byte 0x0f,0x01,0xc1" +#define __HYPERVISOR_hvm_op 34 +#define HVMOP_flush_tlbs 5 + +static inline long hvm_op_flush_tlbs(void) +{ + long ret; + uint64_t nr = __HYPERVISOR_hvm_op; + uint64_t op = HVMOP_flush_tlbs; + uint64_t arg = 0; + + asm volatile(KVM_HYPERCALL_INTEL + : "=a"(ret) + : "a"(nr), "D" (op), "S" (arg) + : "memory"); + + return ret; +} + static void set_user_mask(pt_element_t *ptep, int level, unsigned long virt) { *ptep |= PT_USER_MASK; /* Flush to avoid spurious #PF */ - invlpg((void*)virt); + hvm_op_flush_tlbs(); } static unsigned set_cr4_smep(ac_test_t *at, int smep) @@ -577,7 +596,7 @@ fault: static void ac_set_expected_status(ac_test_t *at) { - invlpg(at->virt); + hvm_op_flush_tlbs(); if (at->ptep) at->expected_pte = *at->ptep; -- 2.34.1 Run it with Xen support enabled and it works... $ qemu-system-x86_64 -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -M pc -device pci-testdev --accel kvm,xen-version=0x4000a,kernel-irqchip=split -kernel x86/access_test.flat enabling apic starting test run CR4.PKE not available, disabling PKE tests CR4.SMEP not available, disabling SMEP tests ............................. 1916935 tests, 0 failures To be sure, run it without Xen support, the hypercall silently fails and the test fails.... $ qemu-system-x86_64 -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -M pc -device pci-testdev --accel kvm -kernel x86/access_test.flat enabling apic starting test run CR4.PKE not available, disabling PKE tests CR4.SMEP not available, disabling SMEP tests test pte.rw pte.p pde.p: FAIL: pte 2000003 expected 2000023 Dump mapping: address: 0xffff923400000000 ------L4 I292: 2100027 ------L3 I208: 2101027 ------L2 I0: 2102061 ------L1 I0: 2000003 test pte.user pte.p pde.p: FAIL: pte 2000005 expected 2000025 Dump mapping: address: 0xffff923400000000 ------L4 I292: 2100027 ------L3 I208: 2101027 ------L2 I0: 2102061 ------L1 I0: 2000005 … > Let me be more explicit this time: I am not applying this without a test. I don't > care how trivial a patch may seem, I'm done taking patches without test coverage > unless there's a *really* good reason for me to do so. Understood. So, we know it *works*, but the above is a one-off and not a *regression* test. It would definitely be nice to have regression tests that cover absolutely everything, but adding Xen guest support to the generic KVM- Unit-Tests might be considered overkill, because this *particular* thing is fairly unlikely to regress? It really is just calling kvm_flush_remote_tlbs(), and if that goes wrong we're probably likely to notice it anyway. What do you think? [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --] ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-07-25 12:56 ` David Woodhouse @ 2023-07-26 20:07 ` Sean Christopherson 2023-07-27 12:04 ` David Woodhouse 0 siblings, 1 reply; 17+ messages in thread From: Sean Christopherson @ 2023-07-26 20:07 UTC (permalink / raw) To: David Woodhouse Cc: Metin Kaya, kvm, pbonzini, x86, bp, paul, tglx, mingo, dave.hansen, joao.m.martins On Tue, Jul 25, 2023, David Woodhouse wrote: > On Fri, 2023-05-26 at 13:32 -0700, Sean Christopherson wrote: > > : Aha! And QEMU appears to have Xen emulation support. That means KVM-Unit-Tests > > : is an option. Specifically, extend the "access" test to use this hypercall instead > > : of INVLPG. That'll verify that the flush is actually being performed as expteced. > > That works. Metin has a better version that actually sets up the > hypercall page properly and uses it, but that one bails out when Xen > support isn't present, and doesn't show the failure mode quite so > clearly. This is the simple version: IIUC, y'all have already written both tests, so why not post both? I certainly won't object to more tests if they provide different coverage. > > Let me be more explicit this time: I am not applying this without a test. I don't > > care how trivial a patch may seem, I'm done taking patches without test coverage > > unless there's a *really* good reason for me to do so. > > Understood. > > So, we know it *works*, but the above is a one-off and not a > *regression* test. > > It would definitely be nice to have regression tests that cover > absolutely everything, but adding Xen guest support to the generic KVM- > Unit-Tests might be considered overkill, because this *particular* > thing is fairly unlikely to regress? It really is just calling > kvm_flush_remote_tlbs(), and if that goes wrong we're probably likely > to notice it anyway. > > What do you think? I'm a-ok with just having basic test coverage. We don't have anywhere near 100% (or even 10%...) coverage of KVM's TLB management, so it would be ridiculous to require that for Xen PV. I'd definitely love to change that, and raise the bar for test coverage in general, but that'll take time (and effort). ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-07-26 20:07 ` Sean Christopherson @ 2023-07-27 12:04 ` David Woodhouse 2023-07-28 7:31 ` Kaya, Metin 0 siblings, 1 reply; 17+ messages in thread From: David Woodhouse @ 2023-07-27 12:04 UTC (permalink / raw) To: Sean Christopherson Cc: Metin Kaya, kvm, pbonzini, x86, bp, paul, tglx, mingo, dave.hansen, joao.m.martins [-- Attachment #1: Type: text/plain, Size: 5085 bytes --] On Wed, 2023-07-26 at 13:07 -0700, Sean Christopherson wrote: > On Tue, Jul 25, 2023, David Woodhouse wrote: > > On Fri, 2023-05-26 at 13:32 -0700, Sean Christopherson wrote: > > > : Aha! And QEMU appears to have Xen emulation support. That means KVM-Unit-Tests > > > : is an option. Specifically, extend the "access" test to use this hypercall instead > > > : of INVLPG. That'll verify that the flush is actually being performed as expteced. > > > > That works. Metin has a better version that actually sets up the > > hypercall page properly and uses it, but that one bails out when Xen > > support isn't present, and doesn't show the failure mode quite so > > clearly. This is the simple version: > > IIUC, y'all have already written both tests, so why not post both? I certainly > won't object to more tests if they provide different coverage. Yeah, it just needed cleaning up. This is what we have; Metin will submit it for real after a little more polishing. It modifies the existing access test so that *if* it's run in a Xen environment, and *if* the HVMOP_flush_tlbs call returns success instead of -ENOSYS, it'll use that instead of invlpg. In itself, that doesn't give us an automatic regression tests, because you still need to run it manually — as before, qemu-system-x86_64 -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev --accel kvm,xen-version=0x4000a,kernel-irqchip=split -kernel ~/access_test.flat If we really want to, we can look at making it run that way when qemu and the host kernel support Xen guests...? diff --git a/x86/access.c b/x86/access.c index 83c8221..8c6e44a 100644 --- a/x86/access.c +++ b/x86/access.c @@ -4,6 +4,7 @@ #include "asm/page.h" #include "x86/vm.h" #include "access.h" +#include "alloc_page.h" #define true 1 #define false 0 @@ -253,12 +254,90 @@ static void clear_user_mask(pt_element_t *ptep, int level, unsigned long virt) *ptep &= ~PT_USER_MASK; } +uint8_t *hypercall_page; + +#define __HYPERVISOR_hvm_op 34 +#define HVMOP_flush_tlbs 5 + +static inline int do_hvm_op_flush_tlbs(void) +{ + long res = 0, _a1 = (long)(HVMOP_flush_tlbs), _a2 = (long)(NULL); + + asm volatile ("call *%[offset]" +#if defined(__x86_64__) + : "=a" (res), "+D" (_a1), "+S" (_a2) +#else + : "=a" (res), "+b" (_a1), "+c" (_a2) +#endif + : [offset] "r" (hypercall_page + (__HYPERVISOR_hvm_op * 32)) + : "memory"); + + if (res) + printf("hvm_op/HVMOP_flush_tlbs failed: %ld.", res); + + return (int)res; +} + +#define XEN_CPUID_FIRST_LEAF 0x40000000 +#define XEN_CPUID_SIGNATURE_EBX 0x566e6558 /* "XenV" */ +#define XEN_CPUID_SIGNATURE_ECX 0x65584d4d /* "MMXe" */ +#define XEN_CPUID_SIGNATURE_EDX 0x4d4d566e /* "nVMM" */ + +static void init_hypercalls(void) +{ + struct cpuid c; + u32 base; + bool found = false; + + for (base = XEN_CPUID_FIRST_LEAF; base < XEN_CPUID_FIRST_LEAF + 0x10000; + base += 0x100) { + c = cpuid(base); + if ((c.b == XEN_CPUID_SIGNATURE_EBX) && + (c.c == XEN_CPUID_SIGNATURE_ECX) && + (c.d == XEN_CPUID_SIGNATURE_EDX) && + ((c.a - base) >= 2)) { + found = true; + break; + } + } + if (!found) { + printf("Using native invlpg instruction\n"); + return; + } + + hypercall_page = alloc_pages_flags(0, AREA_ANY | FLAG_DONTZERO); + if (!hypercall_page) + report_abort("failed to allocate hypercall page"); + + memset(hypercall_page, 0xc3, PAGE_SIZE); + + c = cpuid(base + 2); + wrmsr(c.b, (u64)hypercall_page); + barrier(); + + if (hypercall_page[0] == 0xc3) + report_abort("Hypercall page not initialised correctly\n"); + + /* + * Fall back to invlpg instruction if HVMOP_flush_tlbs hypercall is + * unsuported. + */ + if (do_hvm_op_flush_tlbs()) { + printf("Using native invlpg instruction\n"); + free_page(hypercall_page); + hypercall_page = NULL; + return; + } + + printf("Using Xen HVMOP_flush_tlbs hypercall\n"); +} + static void set_user_mask(pt_element_t *ptep, int level, unsigned long virt) { *ptep |= PT_USER_MASK; /* Flush to avoid spurious #PF */ - invlpg((void*)virt); + hypercall_page ? do_hvm_op_flush_tlbs() : invlpg((void*)virt); } static unsigned set_cr4_smep(ac_test_t *at, int smep) @@ -577,7 +656,7 @@ fault: static void ac_set_expected_status(ac_test_t *at) { - invlpg(at->virt); + hypercall_page ? do_hvm_op_flush_tlbs() : invlpg((void*)at->virt); if (at->ptep) at->expected_pte = *at->ptep; @@ -1162,6 +1241,10 @@ int ac_test_run(int pt_levels) printf("run\n"); tests = successes = 0; + setup_vm(); + + init_hypercalls(); + shadow_cr0 = read_cr0(); shadow_cr4 = read_cr4(); shadow_cr3 = read_cr3(); @@ -1234,6 +1317,9 @@ int ac_test_run(int pt_levels) successes += ac_test_cases[i](&pt_env); } + if (hypercall_page) + free_page(hypercall_page); + printf("\n%d tests, %d failures\n", tests, tests - successes); return successes == tests; -- 2.34.1 [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --] ^ permalink raw reply related [flat|nested] 17+ messages in thread
* RE: [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-07-27 12:04 ` David Woodhouse @ 2023-07-28 7:31 ` Kaya, Metin 0 siblings, 0 replies; 17+ messages in thread From: Kaya, Metin @ 2023-07-28 7:31 UTC (permalink / raw) To: David Woodhouse, Sean Christopherson Cc: kvm@vger.kernel.org, pbonzini@redhat.com, x86@kernel.org, bp@alien8.de, paul@xen.org, tglx@linutronix.de, mingo@redhat.com, dave.hansen@linux.intel.com, joao.m.martins@oracle.com On Thu, Jul 27, 2023 at 1:04 PM, David Woodhouse wrote: > On Wed, 2023-07-26 at 13:07 -0700, Sean Christopherson wrote: > > On Tue, Jul 25, 2023, David Woodhouse wrote: > > > On Fri, 2023-05-26 at 13:32 -0700, Sean Christopherson wrote: > > > > : Aha! And QEMU appears to have Xen emulation support. That means KVM-Unit-Tests > > > > : is an option. Specifically, extend the "access" test to use this hypercall instead > > > > : of INVLPG. That'll verify that the flush is actually being performed as expteced. > > > > > > That works. Metin has a better version that actually sets up the > > > hypercall page properly and uses it, but that one bails out when Xen > > > support isn't present, and doesn't show the failure mode quite so > > > clearly. This is the simple version: > > > > IIUC, y'all have already written both tests, so why not post both? I certainly > > won't object to more tests if they provide different coverage. > > Yeah, it just needed cleaning up. > > This is what we have; Metin will submit it for real after a little more > polishing. It modifies the existing access test so that *if* it's run > in a Xen environment, and *if* the HVMOP_flush_tlbs call returns > success instead of -ENOSYS, it'll use that instead of invlpg. Patch is posted to kvm-unit-tests: https://marc.info/?l=kvm&m=169052867024191&w=2 > In itself, that doesn't give us an automatic regression tests, because > you still need to run it manually — as before, > qemu-system-x86_64 -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev --accel kvm,xen-version=0x4000a,kernel-irqchip=split -kernel ~/access_test.flat > > If we really want to, we can look at making it run that way when qemu > and the host kernel support Xen guests...? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 10:13 ` [PATCH v3] " Metin Kaya 2023-04-18 10:48 ` Paul Durrant 2023-05-26 20:32 ` [PATCH " Sean Christopherson @ 2023-08-04 0:22 ` Sean Christopherson 2 siblings, 0 replies; 17+ messages in thread From: Sean Christopherson @ 2023-08-04 0:22 UTC (permalink / raw) To: Metin Kaya Cc: kvm, pbonzini, x86, bp, dwmw, paul, tglx, mingo, dave.hansen, joao.m.martins On Tue, Apr 18, 2023, Metin Kaya wrote: > Implement in-KVM support for Xen's HVMOP_flush_tlbs hypercall, which > allows the guest to flush all vCPU's TLBs. KVM doesn't provide an > ioctl() to precisely flush guest TLBs, and punting to userspace would > likely negate the performance benefits of avoiding a TLB shootdown in > the guest. > > Signed-off-by: Metin Kaya <metikaya@amazon.co.uk> > > --- > v3: > - Addressed comments for v2. > - Verified with XTF/invlpg test case. > > v2: > - Removed an irrelevant URL from commit message. > --- I had applied this and even generated the "thank you", but then I actually ran the KUT access test[*]. It may very well be a pre-existing bug, but that doesn't change the fact that this patch breaks a test. I apologize for being snippy, especially if it turns out this is unique to HSW (or worse, my system), and your systems don't exhibit issues. Getting the test to run took way too long and I'm bit grumpy. Anyways, whatever is going wrong needs to be sorted out before this can be applied. [*] https://lore.kernel.org/all/ZMxDRg6gWsVLeYFL@google.com ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [EXTERNAL][PATCH v2] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 9:14 ` [EXTERNAL][PATCH " David Woodhouse 2023-04-18 10:13 ` [PATCH v3] " Metin Kaya @ 2023-04-18 14:44 ` Sean Christopherson 2023-04-18 15:47 ` David Woodhouse 1 sibling, 1 reply; 17+ messages in thread From: Sean Christopherson @ 2023-04-18 14:44 UTC (permalink / raw) To: David Woodhouse Cc: Metin Kaya, kvm, pbonzini, x86, bp, paul, tglx, mingo, dave.hansen, joao.m.martins On Tue, Apr 18, 2023, David Woodhouse wrote: > On Mon, 2023-04-17 at 09:31 -0700, Sean Christopherson wrote: > > On Mon, Apr 17, 2023, Metin Kaya wrote: > > > HVMOP_flush_tlbs suboperation of hvm_op hypercall allows a guest to > > > flush all vCPU TLBs. There is no way for the VMM to flush TLBs from > > > userspace. > > > > Ah, took me a minute to connect the dots.� Monday morning is definitely partly > > to blame, but it would be helpful to expand this sentence to be more explicit as > > to why userspace's inability to efficiently flush TLBs. > > > > And strictly speaking, userspace _can_ flush TLBs, just not in a precise, efficient > > way. > > Hm, how? We should probably implement that in userspace as a fallback, > however much it sucks. Oh, the suckage is high :-) Use KVM_{G,S}ET_SREGS2 to toggle any CR{0,3,4}/EFER bit and __set_sregs() will reset the MMU context. Note that without this fix[*] that I'm going to squeeze into 6.4, the MMU context reset may result in all TDP MMU roots being freed and reallocated. [*] https://lore.kernel.org/all/20230413231251.1481410-1-seanjc@google.com > > > > �arch/x86/kvm/xen.c���������������� | 31 ++++++++++++++++++++++++++++++ > > > �include/xen/interface/hvm/hvm_op.h |� 3 +++ > > > > Modifications to uapi headers is conspicuously missing.� I.e. there likely needs > > to be a capability so that userspace can query support. > > Nah, nobody cares. If the kernel "accelerates" this hypercall, so be > it. Userspace will just never get the KVM_EXIT_XEN for that hypercall > because it'll be magically handled, like the others. Ah, that makes sense, I was thinking userspace would complain if it got the "unexpected" exit. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [EXTERNAL][PATCH v2] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 14:44 ` [EXTERNAL][PATCH v2] " Sean Christopherson @ 2023-04-18 15:47 ` David Woodhouse 2023-04-18 16:08 ` David Woodhouse 0 siblings, 1 reply; 17+ messages in thread From: David Woodhouse @ 2023-04-18 15:47 UTC (permalink / raw) To: Sean Christopherson Cc: Metin Kaya, kvm, pbonzini, x86, bp, paul, tglx, mingo, dave.hansen, joao.m.martins [-- Attachment #1: Type: text/plain, Size: 3179 bytes --] On Tue, 2023-04-18 at 07:44 -0700, Sean Christopherson wrote: > On Tue, Apr 18, 2023, David Woodhouse wrote: > > On Mon, 2023-04-17 at 09:31 -0700, Sean Christopherson wrote: > > > On Mon, Apr 17, 2023, Metin Kaya wrote: > > > > HVMOP_flush_tlbs suboperation of hvm_op hypercall allows a guest to > > > > flush all vCPU TLBs. There is no way for the VMM to flush TLBs from > > > > userspace. > > > > > > Ah, took me a minute to connect the dots.� Monday morning is definitely partly > > > to blame, but it would be helpful to expand this sentence to be more explicit as > > > to why userspace's inability to efficiently flush TLBs. > > > > > > And strictly speaking, userspace _can_ flush TLBs, just not in a precise, efficient > > > way. > > > > Hm, how? We should probably implement that in userspace as a fallback, > > however much it sucks. > > Oh, the suckage is high :-) Use KVM_{G,S}ET_SREGS2 to toggle any CR{0,3,4}/EFER > bit and __set_sregs() will reset the MMU context. Oh, that suckage is quite high indeed. I'm not actually sure I'll bother doing this in QEMU. It's quite esoteric; Linux has never used it and I think that after launching <mumble> million production Xen guests this way we've only ever seen it used by some FreeBSD variant. It doesn't seem to be in mainline FreeBSD; there's a hint of it here https://people.freebsd.org/~dfr/freebsd-6.x-xen-31032009.diff for example but it's disabled even then: if (pmap == kernel_pmap || pmap->pm_active == all_cpus) { - invltlb(); - smp_invltlb(); +#if defined(XENHVM) && defined(notdef) + /* + * As far as I can tell, this makes things slower, at + * least where there are only two physical cpus and + * the host is not overcommitted. + */ + if (is_running_on_xen()) { + HYPERVISOR_hvm_op(HVMOP_flush_tlbs, NULL); + } else +#endif + { + invltlb(); + smp_invltlb(); + } > Note that without this fix[*] > that I'm going to squeeze into 6.4, the MMU context reset may result in all TDP > MMU roots being freed and reallocated. > > [*] https://lore.kernel.org/all/20230413231251.1481410-1-seanjc@google.com > > > > > > > �arch/x86/kvm/xen.c���������������� | 31 ++++++++++++++++++++++++++++++ > > > > �include/xen/interface/hvm/hvm_op.h |� 3 +++ > > > > > > Modifications to uapi headers is conspicuously missing.� I.e. there likely needs > > > to be a capability so that userspace can query support. > > > > Nah, nobody cares. If the kernel "accelerates" this hypercall, so be > > it. Userspace will just never get the KVM_EXIT_XEN for that hypercall > > because it'll be magically handled, like the others. > > Ah, that makes sense, I was thinking userspace would complain if it got the > "unexpected" exit. I've tried to follow a model where userspace is *always* expected to implement the hypercall, and if the kernel chooses to intervene to accelerate it, that's a bonus. It saves a bunch of complexity in error handling in the kernel when we can just say "screw this, let userspace cope". [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [EXTERNAL][PATCH v2] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall 2023-04-18 15:47 ` David Woodhouse @ 2023-04-18 16:08 ` David Woodhouse 0 siblings, 0 replies; 17+ messages in thread From: David Woodhouse @ 2023-04-18 16:08 UTC (permalink / raw) To: Sean Christopherson Cc: Metin Kaya, kvm, pbonzini, x86, bp, paul, tglx, mingo, dave.hansen, joao.m.martins [-- Attachment #1: Type: text/plain, Size: 675 bytes --] On Tue, 2023-04-18 at 16:47 +0100, David Woodhouse wrote: > > Oh, that suckage is quite high indeed. I'm not actually sure I'll > bother doing this in QEMU. It's quite esoteric; Linux has never used it > and I think that after launching <mumble> million production Xen guests > this way we've only ever seen it used by some FreeBSD variant. > > It doesn't seem to be in mainline FreeBSD; there's a hint of it here > https://people.freebsd.org/~dfr/freebsd-6.x-xen-31032009.diff for > example but it's disabled even then: For the record, Metin and Paul tell me I lied; it was actually seen with a Windows guest using the ancient closed-source Citrix drivers. [-- Attachment #2: smime.p7s --] [-- Type: application/pkcs7-signature, Size: 5965 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2023-08-04 0:22 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20230417122206.34647-1-metikaya@amazon.co.uk>
2023-04-17 12:22 ` [PATCH v2] KVM: x86/xen: Implement hvm_op/HVMOP_flush_tlbs hypercall Metin Kaya
2023-04-17 16:31 ` Sean Christopherson
2023-04-18 9:14 ` [EXTERNAL][PATCH " David Woodhouse
2023-04-18 10:13 ` [PATCH v3] " Metin Kaya
2023-04-18 10:48 ` Paul Durrant
2023-04-18 11:04 ` Kaya, Metin
2023-04-18 11:13 ` Paul Durrant
2023-04-18 11:05 ` [EXTERNAL][PATCH " David Woodhouse
2023-05-26 20:32 ` [PATCH " Sean Christopherson
2023-07-25 12:56 ` David Woodhouse
2023-07-26 20:07 ` Sean Christopherson
2023-07-27 12:04 ` David Woodhouse
2023-07-28 7:31 ` Kaya, Metin
2023-08-04 0:22 ` Sean Christopherson
2023-04-18 14:44 ` [EXTERNAL][PATCH v2] " Sean Christopherson
2023-04-18 15:47 ` David Woodhouse
2023-04-18 16:08 ` David Woodhouse
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).