From: Paolo Bonzini <pbonzini@redhat.com>
To: Luiz Capitulino <lcapitulino@redhat.com>, kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, rkrcmar@redhat.com,
mtosatti@redhat.com, riel@redhat.com, bsd@redhat.com
Subject: Re: [PATCH] kvm: x86: make lapic hrtimer pinned
Date: Tue, 5 Apr 2016 12:05:12 +0200
Message-ID: <57038DD8.7070204@redhat.com>
In-Reply-To: <20160404164607.09e306fa@redhat.com>

On 04/04/2016 22:46, Luiz Capitulino wrote:
> When a vCPU runs on a nohz_full core, the hrtimer used by
> the lapic emulation code can be migrated to another core.
> When this happens, it's possible to observe millisecond
> latency when delivering timer IRQs to KVM guests.
>
> The huge latency is mainly due to the fact that
> apic_timer_fn() expects to run during a kvm exit. It
> sets KVM_REQ_PENDING_TIMER and lets it be handled on kvm
> entry. However, if the timer fires on a different core,
> we have to wait until the next kvm exit for the guest
> to see KVM_REQ_PENDING_TIMER set.
>
> This problem became visible after commit 9642d18ee. This
> commit changed the timer migration code to always attempt
> to migrate timers away from nohz_full cores. While it's
> debatable whether this is correct/desirable (I don't think
> it is), it's clear that the lapic emulation code has
> a requirement on firing the hrtimer on the same core
> where it was started. This is achieved by making the
> hrtimer pinned.
>
> Lastly, note that KVM has code to migrate timers when a
> vCPU is scheduled to run on a different core. However, this
> forced migration may fail. When this happens, we can have
> the same problem. If we want 100% correctness, we'll have
> to modify apic_timer_fn() to cause a kvm exit when it runs
> on a different core than the vCPU. Not sure if this is
> possible.
>
> Here's a reproducer for the issue being fixed:
>
> 1. Set all cores but core0 to be nohz_full cores
> 2. Start a guest with a single vCPU
> 3. Trace apic_timer_fn() and kvm_inject_apic_timer_irqs()
>
> You'll see that apic_timer_fn() will run on core0 while
> kvm_inject_apic_timer_irqs() runs on a different core. If
> you get both on core0, try running a program that takes 100%
> of the CPU and pin it to core0 to force the vCPU out.
>
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
> arch/x86/kvm/lapic.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 443d2a5..1a2da0e 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1369,7 +1369,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
>
> hrtimer_start(&apic->lapic_timer.timer,
> ktime_add_ns(now, apic->lapic_timer.period),
> - HRTIMER_MODE_ABS);
> + HRTIMER_MODE_ABS_PINNED);
>
> apic_debug("%s: bus cycle is %" PRId64 "ns, now 0x%016"
> PRIx64 ", "
> @@ -1402,7 +1402,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
> expire = ktime_add_ns(now, ns);
> expire = ktime_sub_ns(expire, lapic_timer_advance_ns);
> hrtimer_start(&apic->lapic_timer.timer,
> - expire, HRTIMER_MODE_ABS);
> + expire, HRTIMER_MODE_ABS_PINNED);
> } else
> apic_timer_expired(apic);
>
> @@ -1868,7 +1868,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
> apic->vcpu = vcpu;
>
> hrtimer_init(&apic->lapic_timer.timer, CLOCK_MONOTONIC,
> - HRTIMER_MODE_ABS);
> + HRTIMER_MODE_ABS_PINNED);
> apic->lapic_timer.timer.function = apic_timer_fn;
>
> /*
> @@ -2003,7 +2003,7 @@ void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>
> timer = &vcpu->arch.apic->lapic_timer.timer;
> if (hrtimer_cancel(timer))
> - hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
> + hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED);
> }
>
> /*
>
Queued for 4.6.0-rc3, thanks.
Paolo
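
The latency mechanism described in the quoted commit message can be
sketched as a small stand-alone C model. This is not kernel code: the
struct and both functions are hypothetical names invented for
illustration, and the model assumes (as a simplification) that an
unpinned timer started on a nohz_full core always moves to a single
housekeeping core, and that a timer firing on the vCPU's own core is
seen by the guest immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one vCPU; not a kernel structure. */
struct vcpu_model {
	int      cpu;           /* physical core the vCPU is running on */
	uint64_t next_exit_ns;  /* time of the guest's next unrelated VM exit */
};

/*
 * Pick the core on which the timer callback will run.
 * Assumption: a pinned timer stays on the core that started it, while
 * an unpinned timer started on a nohz_full core is migrated to the
 * housekeeping core.
 */
static int timer_target_cpu(int start_cpu, bool pinned,
			    bool start_cpu_is_nohz_full, int housekeeping_cpu)
{
	if (pinned || !start_cpu_is_nohz_full)
		return start_cpu;
	return housekeeping_cpu;
}

/*
 * Delay between the timer firing and the guest noticing the pending
 * timer request.
 */
static uint64_t delivery_latency_ns(const struct vcpu_model *v,
				    int fire_cpu, uint64_t fire_ns)
{
	assert(v != NULL);

	/*
	 * Same core: the host timer interrupt itself takes the guest out
	 * of guest mode, so the request is handled on the next entry.
	 */
	if (fire_cpu == v->cpu)
		return 0;

	/*
	 * Different core: the request bit is set, but nothing interrupts
	 * the guest, so it is only seen at the next unrelated exit.
	 */
	return v->next_exit_ns > fire_ns ? v->next_exit_ns - fire_ns : 0;
}
```

With a vCPU on core 3 (nohz_full), core 0 as the housekeeping core, a
timer firing at 1,000,000 ns and the next unrelated exit at
2,000,000 ns, this model gives 0 ns when the timer is pinned and
1,000,000 ns when it is not, which matches the millisecond-scale
latency the commit message describes.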