From: Luiz Capitulino <lcapitulino@redhat.com>
To: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, pbonzini@redhat.com,
rkrcmar@redhat.com, mtosatti@redhat.com, riel@redhat.com,
bsd@redhat.com
Subject: [PATCH] kvm: x86: make lapic hrtimer pinned
Date: Mon, 4 Apr 2016 16:46:07 -0400 [thread overview]
Message-ID: <20160404164607.09e306fa@redhat.com> (raw)
When a vCPU runs on a nohz_full core, the hrtimer used by
the lapic emulation code can be migrated to another core.
When this happens, it's possible to observe milisecond
latency when delivering timer IRQs to KVM guests.
The huge latency is mainly due to the fact that
apic_timer_fn() expects to run during a kvm exit. It
sets KVM_REQ_PENDING_TIMER and let it be handled on kvm
entry. However, if the timer fires on a different core,
we have to wait until the next kvm exit for the guest
to see KVM_REQ_PENDING_TIMER set.
This problem became visible after commit 9642d18ee. This
commit changed the timer migration code to always attempt
to migrate timers away from nohz_full cores. While it's
discussable if this is correct/desirable (I don't think
it is), it's clear that the lapic emulation code has
a requirement on firing the hrtimer in the same core
where it was started. This is achieved by making the
hrtimer pinned.
Lastly, note that KVM has code to migrate timers when a
vCPU is scheduled to run in different core. However, this
forced migration may fail. When this happens, we can have
the same problem. If we want 100% correctness, we'll have
to modify apic_timer_fn() to cause a kvm exit when it runs
on a different core than the vCPU. Not sure if this is
possible.
Here's a reproducer for the issue being fixed:
1. Set all cores but core0 to be nohz_full cores
2. Start a guest with a single vCPU
3. Trace apic_timer_fn() and kvm_inject_apic_timer_irqs()
You'll see that apic_timer_fn() will run in core0 while
kvm_inject_apic_timer_irqs() runs in a different core. If
you get both on core0, try running a program that takes 100%
of the CPU and pin it to core0 to force the vCPU out.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
arch/x86/kvm/lapic.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 443d2a5..1a2da0e 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1369,7 +1369,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
hrtimer_start(&apic->lapic_timer.timer,
ktime_add_ns(now, apic->lapic_timer.period),
- HRTIMER_MODE_ABS);
+ HRTIMER_MODE_ABS_PINNED);
apic_debug("%s: bus cycle is %" PRId64 "ns, now 0x%016"
PRIx64 ", "
@@ -1402,7 +1402,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
expire = ktime_add_ns(now, ns);
expire = ktime_sub_ns(expire, lapic_timer_advance_ns);
hrtimer_start(&apic->lapic_timer.timer,
- expire, HRTIMER_MODE_ABS);
+ expire, HRTIMER_MODE_ABS_PINNED);
} else
apic_timer_expired(apic);
@@ -1868,7 +1868,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
apic->vcpu = vcpu;
hrtimer_init(&apic->lapic_timer.timer, CLOCK_MONOTONIC,
- HRTIMER_MODE_ABS);
+ HRTIMER_MODE_ABS_PINNED);
apic->lapic_timer.timer.function = apic_timer_fn;
/*
@@ -2003,7 +2003,7 @@ void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
timer = &vcpu->arch.apic->lapic_timer.timer;
if (hrtimer_cancel(timer))
- hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
+ hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED);
}
/*
--
2.1.0
next reply other threads:[~2016-04-04 20:46 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-04 20:46 Luiz Capitulino [this message]
2016-04-04 21:00 ` [PATCH] kvm: x86: make lapic hrtimer pinned Rik van Riel
2016-04-05 6:18 ` Yang Zhang
2016-04-05 12:40 ` Luiz Capitulino
2016-04-21 23:12 ` Wanpeng Li
2016-04-22 13:12 ` Luiz Capitulino
2016-04-23 23:06 ` Wanpeng Li
2016-04-05 15:54 ` Radim Krčmář
2016-04-07 2:08 ` Yang Zhang
2016-04-05 10:05 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160404164607.09e306fa@redhat.com \
--to=lcapitulino@redhat.com \
--cc=bsd@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=pbonzini@redhat.com \
--cc=riel@redhat.com \
--cc=rkrcmar@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).