From: Sean Christopherson <sean.j.christopherson@intel.com>
To: "Paolo Bonzini" <pbonzini@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org,
Sean Christopherson <sean.j.christopherson@intel.com>,
Liran Alon <liran.alon@oracle.com>,
Wanpeng Li <wanpengli@tencent.com>
Subject: [PATCH 0/7] KVM: lapic: Fix a variety of timer adv issues
Date: Fri, 12 Apr 2019 13:18:27 -0700
Message-ID: <20190412201834.10831-1-sean.j.christopherson@intel.com>
The recent change to automatically calculate lapic_timer_advance_ns
introduced a handful of gnarly bugs, and also exposed a latent bug by
virtue of the advancement logic being enabled by default. Inspection
and debug revealed several other opportunities for minor improvements.
The primary issue is that the auto-tuning of lapic_timer_advance_ns is
completely unbounded, e.g. there's nothing in the logic that prevents the
advancement from creeping up to hundreds of milliseconds. Adjusting the
timers by large amounts creates major discrepancies in the guest, e.g. a
timer that was configured to fire after multiple milliseconds may arrive
before the guest executes a single instruction. While technically correct
from a time perspective, this breaks the guest's reasonable assumption
that it can execute some number of instructions between timer events.
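
To illustrate the bounding problem, here is a minimal userspace sketch of a
capped auto-tune step. It is not the code from the patches: the function
name, the cap value and the step divisor are all hypothetical, chosen only
to show how a clamp keeps repeated "late" samples from growing the
advancement without limit.

```c
#include <stdint.h>

/* Hypothetical values for illustration; not taken from the patches. */
#define ADVANCE_MAX_NS   5000u	/* assumed hard cap on the advancement */
#define ADVANCE_STEP_DIV 8u	/* assumed: move by at most 1/8 per sample */

/*
 * Return the next advancement given the current one and the observed
 * error: error_ns > 0 means the timer fired late, error_ns < 0 early.
 */
static uint32_t tune_advance_ns(uint32_t advance_ns, int64_t error_ns)
{
	uint64_t next = advance_ns;
	uint64_t step = advance_ns / ADVANCE_STEP_DIV;
	uint64_t mag = error_ns < 0 ? -(uint64_t)error_ns : (uint64_t)error_ns;

	/* Never move by more than the observed error. */
	if (mag < step)
		step = mag;

	if (error_ns > 0)
		next += step;	/* fired late: advance further */
	else
		next -= step;	/* fired early: back off */

	/* Without this clamp, a run of late samples grows the value forever. */
	if (next > ADVANCE_MAX_NS)
		next = ADVANCE_MAX_NS;

	return (uint32_t)next;
}
```

With these assumed constants, feeding the function a long run of large
positive errors saturates at ADVANCE_MAX_NS instead of creeping upward.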
The other major issue is that the advancement is global, while TSC scaling
is done on a per-vCPU basis. Adjusting the advancement at runtime
exacerbates this as there is no protection against multiple vCPUs and/or
multiple VMs concurrently modifying the advancement value, e.g. it can
effectively become corrupted or never stabilize due to getting bounced all
over tarnation.
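
As a rough sketch of why per-vCPU tracking fits better with per-vCPU TSC
scaling, consider the simplified model below. The struct and helper names
are hypothetical stand-ins, not KVM's actual data structures; the only
point is that each vCPU owns its own value and converts it using its own
TSC frequency.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-vCPU timer state. */
struct vcpu_timer {
	uint32_t tsc_khz;		/* this vCPU's (possibly scaled) TSC rate */
	uint32_t timer_advance_ns;	/* owned by this vCPU, no shared writer */
};

/* Convert this vCPU's advancement from nanoseconds to TSC cycles. */
static uint64_t advance_cycles(const struct vcpu_timer *t)
{
	/* kHz is cycles per millisecond, i.e. cycles per 1,000,000 ns. */
	return (uint64_t)t->timer_advance_ns * t->tsc_khz / 1000000u;
}

/* Adjust only the given vCPU's value; other vCPUs are unaffected. */
static void set_advance_ns(struct vcpu_timer *t, uint32_t ns)
{
	t->timer_advance_ns = ns;
}
```

In this model, two vCPUs with different tsc_khz values get different cycle
counts for the same nanosecond advancement, and tuning one never perturbs
the other.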
As for the latent bug, when timer advancement was applied to the hv_timer,
i.e. the VMX preemption timer, the logic to trigger wait_for_lapic_timer()
was not updated. As a result, a timer interrupt emulated via the hv_timer
can easily arrive too early from a *time* perspective, as opposed to
simply arriving early from a "number of instructions executed" perspective.
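
The "too early from a time perspective" case can be expressed as a simple
check of the guest TSC against the deadline. The helper below is a
hypothetical userspace sketch, not wait_for_lapic_timer() itself; it only
shows the cycles-to-nanoseconds arithmetic a busy wait would need.

```c
#include <stdint.h>

/*
 * Nanoseconds remaining until the deadline, or 0 if the deadline has
 * already been reached. tsc_khz is TSC cycles per millisecond and must be
 * non-zero. The multiplication can overflow for deltas above roughly
 * 1.8e13 cycles, which this sketch does not guard against.
 */
static uint64_t ns_until_deadline(uint64_t guest_tsc, uint64_t tsc_deadline,
				  uint32_t tsc_khz)
{
	if (guest_tsc >= tsc_deadline)
		return 0;	/* on time or late: nothing to wait for */

	/* cycles * 1,000,000 / kHz == nanoseconds */
	return (tsc_deadline - guest_tsc) * 1000000ull / tsc_khz;
}
```

A non-zero result means the interrupt would be delivered before the
guest-visible deadline, so the caller would need to delay for about that
long before injecting it.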
As an aside, the test stage counting in kvm-unit-tests' vmx.flat/interrupt
is off by one, which causes it to misreport what it's testing, e.g. says
it's testing "direct interrupt" when the unit test's host is intercepting
interrupts and vice versa. That was a fun few hours of debug.
Sean Christopherson (7):
  KVM: lapic: Hard cap the auto-calculated timer advancement
  KVM: lapic: Delay 1ns at a time when waiting for timer to "expire"
  KVM: lapic: Track lapic timer advance per vCPU
  KVM: lapic: Allow user to override auto-tuning of timer advancement
  KVM: lapic: Busy wait for timer to expire when using hv_timer
  KVM: lapic: Clean up the code for handling of a pre-expired hv_timer
  KVM: VMX: Skip delta_tsc shift-and-divide if the dividend is zero

 arch/x86/kvm/lapic.c   | 56 +++++++++++++++++++++++++-----------------
 arch/x86/kvm/lapic.h   |  6 ++++-
 arch/x86/kvm/vmx/vmx.c |  9 ++++---
 arch/x86/kvm/x86.c     |  7 +++---
 arch/x86/kvm/x86.h     |  2 --
 5 files changed, 47 insertions(+), 33 deletions(-)
--
2.21.0