public inbox for kvm@vger.kernel.org
* RE: [RFC PATCH 4.14] KVM: x86: Backport support for interrupt-based APF page-ready delivery in guest
@ 2023-10-17 13:44 Mancini, Riccardo
  2023-10-17 14:06 ` Vitaly Kuznetsov
  0 siblings, 1 reply; 7+ messages in thread
From: Mancini, Riccardo @ 2023-10-17 13:44 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Paolo Bonzini
  Cc: Batalov, Eugene, Graf (AWS), Alexander, kvm@vger.kernel.org,
	Farrell, Greg

Hey,

Thank you both for the quick feedback.

> > I've backported the guest-side of the patchset to 4.14.326, could you 
> > help us and take a look at the backport?
> > I only backported the original patchset, I'm not sure if there's any 
> > other patch (bug fix) that needs to be included in the backport.
>
> I remember us fixing PV feature enablement/disablement for hibernation/kdump later, see e.g.
>
> commit 8b79feffeca28c5459458fe78676b081e87c93a4
> Author: Vitaly Kuznetsov <vkuznets@redhat.com>
> Date:   Wed Apr 14 14:35:41 2021 +0200
>
>     x86/kvm: Teardown PV features on boot CPU as well
>
> commit 3d6b84132d2a57b5a74100f6923a8feb679ac2ce
> Author: Vitaly Kuznetsov <vkuznets@redhat.com>
> Date:   Wed Apr 14 14:35:43 2021 +0200
>
>     x86/kvm: Disable all PV features on crash
>
> if you're interested in such use-cases. I don't recall any required fixes for normal operation.

These look like issues already present in 4.14, not introduced by the
interrupt-based mechanism, correct?
If so, I wouldn't chase them.
Furthermore, I don't even think we hit those use cases in our scenario.

> 
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > On 10/16/23 16:18, Vitaly Kuznetsov wrote:
> >> In case keeping legacy mechanism is a must, I would suggest you
> >> somehow record the fact that the guest has opted for interrupt-based
> >> delivery (e.g. set a global variable or use a static key) and
> >> short-circuit
> >> do_async_page_fault() to immediately return and not do anything in
> >> this case.
> >
> > I guess you mean "not do anything for KVM_PV_REASON_PAGE_READY in this
> > case"?
> 
> Yes, of course: KVM_PV_REASON_PAGE_NOT_PRESENT is always a #PF.

I agree this is a difference from the upstream asyncpf-int implementation and
it's theoretically incorrect. I think this shouldn't happen in normal operation,
but it's better to keep it consistent.
I'll add a check that asyncpf-int is _not_ enabled before processing
KVM_PV_REASON_PAGE_READY. Draft diff below.

Thanks,
Riccardo

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 582a366b82d8..bdfdffd35939 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -79,6 +79,8 @@ static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
 static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
 static int has_steal_clock = 0;
 
+static DEFINE_PER_CPU(u32, kvm_apf_int_enabled);
+
 /*
  * No need for any "IO delay" on KVM
  */
@@ -277,7 +279,8 @@ do_async_page_fault(struct pt_regs *regs, unsigned long error_code)
                prev_state = exception_enter();
                kvm_async_pf_task_wait((u32)read_cr2(), !user_mode(regs));
                exception_exit(prev_state);
-       } else if (reason & KVM_PV_REASON_PAGE_READY) {
+       } else if (!__this_cpu_read(kvm_apf_int_enabled) && (reason & KVM_PV_REASON_PAGE_READY)) {
+               /* this event is only possible if interrupt-based mechanism is disabled */
                rcu_irq_enter();
                kvm_async_pf_task_wake((u32)read_cr2());
                rcu_irq_exit();
@@ -367,6 +370,7 @@ static void kvm_guest_cpu_init(void)
                if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT)) {
                        pa |= KVM_ASYNC_PF_DELIVERY_AS_INT;
                        wrmsrl(MSR_KVM_ASYNC_PF_INT, HYPERVISOR_CALLBACK_VECTOR);
+                       __this_cpu_write(kvm_apf_int_enabled, 1);
                }
 
                wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
@@ -396,6 +400,7 @@ static void kvm_pv_disable_apf(void)
 
        wrmsrl(MSR_KVM_ASYNC_PF_EN, 0);
        __this_cpu_write(apf_reason.enabled, 0);
+       __this_cpu_write(kvm_apf_int_enabled, 0);
 
        printk(KERN_INFO"Unregister pv shared memory for cpu %d\n",
               smp_processor_id());
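
For readers following the draft diff, the branch selection it aims for can be
modelled outside the kernel. This is a hypothetical userspace sketch, not
kernel code: apf_classify() and enum apf_action are names invented here, the
reason values are as recalled from the uapi kvm_para.h header and should be
checked against the tree, and only the two branches visible in the second hunk
are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, as recalled from the uapi header; verify before relying on them. */
#define KVM_PV_REASON_PAGE_NOT_PRESENT 1u
#define KVM_PV_REASON_PAGE_READY       2u

/* Hypothetical labels for what the #PF-path handler would do. */
enum apf_action {
	APF_WAIT,   /* page not present: park the faulting task */
	APF_WAKE,   /* legacy #PF-based page-ready: wake the waiter */
	APF_OTHER   /* anything outside the two branches shown in the hunk */
};

/*
 * Model of the branch selection in the patched do_async_page_fault().
 * int_enabled stands in for the per-CPU kvm_apf_int_enabled flag.
 */
static enum apf_action apf_classify(uint32_t reason, bool int_enabled)
{
	if (reason & KVM_PV_REASON_PAGE_NOT_PRESENT)
		return APF_WAIT;   /* always delivered as a #PF */
	if (!int_enabled && (reason & KVM_PV_REASON_PAGE_READY))
		return APF_WAKE;   /* only when interrupt delivery is off */
	return APF_OTHER;
}
```

With int_enabled set, a page-ready reason arriving on the #PF path falls
through to APF_OTHER, which is the short-circuit suggested earlier in the
thread; page-not-present is unaffected by the flag.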


* Re: Bug? Incompatible APF for 4.14 guest on 5.10 and later host
@ 2023-10-05 15:38 Vitaly Kuznetsov
  2023-10-13 16:36 ` [RFC PATCH 4.14] KVM: x86: Backport support for interrupt-based APF page-ready delivery in guest Riccardo Mancini
  0 siblings, 1 reply; 7+ messages in thread
From: Vitaly Kuznetsov @ 2023-10-05 15:38 UTC (permalink / raw)
  To: Mancini, Riccardo
  Cc: kvm@vger.kernel.org, Graf (AWS), Alexander, Teragni, Matias,
	Batalov, Eugene, pbonzini@redhat.com

"Mancini, Riccardo" <mancio@amazon.com> writes:

> Hi,
>
> when a 4.14 guest runs on a 5.10 host (and later), it cannot use APF (despite
> CPUID advertising KVM_FEATURE_ASYNC_PF) due to the new interrupt-based
> mechanism 2635b5c4a0 (KVM: x86: interrupt based APF 'page ready' event delivery).
> Kernels after 5.9 won't satisfy the guest request to enable APF through
> KVM_ASYNC_PF_ENABLED, requiring also KVM_ASYNC_PF_DELIVERY_AS_INT to be set.
> Furthermore, the patch set seems to be dropping parts of the legacy #PF handling
> as well.
> I consider this as a bug as it breaks APF compatibility for older guests running
> on newer kernels, by breaking the underlying ABI.
> What do you think? Was this a deliberate decision?

It was. #PF based "page ready" injection was found to be fragile as in
some cases it can collide with an actual #PF and nothing good is
expected if this ever happens. I don't think we've actually broken the
ABI as "asynchronous page fault" was always a "best effort" service: the
guest indicates its readiness to process 'page missing' events but the
host is under no obligation to actually send such notifications.

> Was this already reported in the past (I couldn't find anything in the mailing list
> but I might have missed it!)?

I think it was Andy Lutomirski who started the discussion, see
e.g. https://lore.kernel.org/lkml/ed71d0967113a35f670a9625a058b8e6e0b2f104.1583547991.git.luto@kernel.org/

the patch is about KVM_ASYNC_PF_SEND_ALWAYS but if you go down the
discussion you'll find more concerns expressed.

> Would it be much effort to support the legacy #PF based mechanism for older
> guests that choose to only set KVM_ASYNC_PF_ENABLED?

Personally, I wouldn't go down this road: #PF injection at a random time
(for page-ready events) is still considered fragile.

>
> The reason this is an issue for us now is that not having APF for older guests
> introduces a significant performance regression on 4.14 guests when paired to
> uffd handling of "remote" page-faults (similar to a live migration scenario)
> when we update from a 4.14 host kernel to a 5.10 host kernel.

What about backporting interrupt-based APF mechanism to older guests?

-- 
Vitaly
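
The incompatibility reported in the quoted text above can be illustrated with
a small userspace model. This is a sketch under assumptions, not the host
implementation: the two helper names are invented here, and the bit positions
are as recalled from the uapi kvm_para.h header. It only encodes what the
report states, namely that newer hosts want KVM_ASYNC_PF_DELIVERY_AS_INT set
in addition to KVM_ASYNC_PF_ENABLED.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in MSR_KVM_ASYNC_PF_EN; verify against the uapi header. */
#define KVM_ASYNC_PF_ENABLED         (1ull << 0)
#define KVM_ASYNC_PF_DELIVERY_AS_INT (1ull << 3)

/* Hypothetical model of an older host: the enable bit alone is sufficient. */
static bool apf_enabled_legacy(uint64_t msr_val)
{
	return (msr_val & KVM_ASYNC_PF_ENABLED) != 0;
}

/* Hypothetical model of a newer host: both bits must be set. */
static bool apf_enabled_int_only(uint64_t msr_val)
{
	const uint64_t mask = KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT;

	return (msr_val & mask) == mask;
}
```

A 4.14 guest without the backport writes only KVM_ASYNC_PF_ENABLED, so the
second predicate is false for it and no APF events are delivered, which is
consistent with the "best effort" reading of the interface.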


Thread overview: 7+ messages
2023-10-17 13:44 [RFC PATCH 4.14] KVM: x86: Backport support for interrupt-based APF page-ready delivery in guest Mancini, Riccardo
2023-10-17 14:06 ` Vitaly Kuznetsov
2023-10-18 14:40   ` [RFC PATCH 4.14 v2] " Riccardo Mancini
  -- strict thread matches above, loose matches on Subject: below --
2023-10-05 15:38 Bug? Incompatible APF for 4.14 guest on 5.10 and later host Vitaly Kuznetsov
2023-10-13 16:36 ` [RFC PATCH 4.14] KVM: x86: Backport support for interrupt-based APF page-ready delivery in guest Riccardo Mancini
2023-10-16 14:18   ` Vitaly Kuznetsov
2023-10-16 21:57     ` Paolo Bonzini
2023-10-17 11:22       ` Vitaly Kuznetsov
