From: Sean Christopherson <seanjc@google.com>
To: Wanpeng Li <kernellwp@gmail.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] KVM: X86: Properly account for guest CPU time when considering context tracking
Date: Mon, 29 Mar 2021 17:15:08 +0000 [thread overview]
Message-ID: <YGILHM7CHpjXtxaH@google.com> (raw)
In-Reply-To: <1617011036-11734-1-git-send-email-wanpengli@tencent.com>
+Thomas
On Mon, Mar 29, 2021, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
>
> The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
> reported that the guest time remains 0 when running a while true
> loop in the guest.
>
> The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
> belongs") moves guest_exit_irqoff() close to vmexit breaks the
> tick-based time accouting when the ticks that happen after IRQs are
> disabled are incorrectly accounted to the host/system time. This is
> because we exit the guest state too early.
>
> vtime-based time accounting is tied to context tracking, keep the
> guest_exit_irqoff() around vmexit code when both vtime-based time
> accounting and specific cpu is context tracking mode active.
> Otherwise, leave guest_exit_irqoff() after handle_exit_irqoff()
> and explicit IRQ window for tick-based time accouting.
>
> Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
> Cc: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
> arch/x86/kvm/svm/svm.c | 3 ++-
> arch/x86/kvm/vmx/vmx.c | 3 ++-
> arch/x86/kvm/x86.c | 2 ++
> 3 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 58a45bb..55fb5ce 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3812,7 +3812,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
> * into world and some more.
> */
> lockdep_hardirqs_off(CALLER_ADDR0);
> - guest_exit_irqoff();
> + if (vtime_accounting_enabled_this_cpu())
> + guest_exit_irqoff();
>
> instrumentation_begin();
> trace_hardirqs_off_finish();
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 32cf828..85695b3 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6689,7 +6689,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
> * into world and some more.
> */
> lockdep_hardirqs_off(CALLER_ADDR0);
> - guest_exit_irqoff();
> + if (vtime_accounting_enabled_this_cpu())
> + guest_exit_irqoff();
This looks ok, as CONFIG_CONTEXT_TRACKING and CONFIG_VIRT_CPU_ACCOUNTING_GEN are
selected by CONFIG_NO_HZ_FULL=y, and can't be enabled independently, e.g. the
rcu_user_exit() call won't be delayed because it will never be called in the
!vtime case. But it still feels wrong poking into those details, e.g. it'll
be weird and/or wrong guest_exit_irqoff() gains stuff that isn't vtime specific.
Maybe that will never happen though? And of course, my hack alternative also
pokes into the details[*].
Thomas, do you have an input on the least awful way to handle this? My horrible
hack was to force PF_VCPU around the window where KVM handles IRQs after guest
exit.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d9f931c63293..6ddf341cd755 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9118,6 +9118,13 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
vcpu->mode = OUTSIDE_GUEST_MODE;
smp_wmb();
+ /*
+ * Temporarily pretend this task is running a vCPU when potentially
+ * processing an IRQ exit, including the below opening of an IRQ
+ * window. Tick-based accounting of guest time relies on PF_VCPU
+ * being set when the tick IRQ handler runs.
+ */
+ current->flags |= PF_VCPU;
static_call(kvm_x86_handle_exit_irqoff)(vcpu);
/*
@@ -9132,6 +9139,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
++vcpu->stat.exits;
local_irq_disable();
kvm_after_interrupt(vcpu);
+ current->flags &= ~PF_VCPU;
if (lapic_in_kernel(vcpu)) {
s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
[*]https://lkml.kernel.org/r/20210206004218.312023-1-seanjc@google.com
> instrumentation_begin();
> trace_hardirqs_off_finish();
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fe806e8..234c8b3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9185,6 +9185,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> ++vcpu->stat.exits;
> local_irq_disable();
> kvm_after_interrupt(vcpu);
> + if (!vtime_accounting_enabled_this_cpu())
> + guest_exit_irqoff();
>
> if (lapic_in_kernel(vcpu)) {
> s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
> --
> 2.7.4
>
next prev parent reply other threads:[~2021-03-29 17:16 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-29 9:43 [PATCH] KVM: X86: Properly account for guest CPU time when considering context tracking Wanpeng Li
2021-03-29 17:15 ` Sean Christopherson [this message]
2021-03-30 1:33 ` Wanpeng Li
2021-04-06 22:16 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YGILHM7CHpjXtxaH@google.com \
--to=seanjc@google.com \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=kernellwp@gmail.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=tglx@linutronix.de \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.