From: Radim Krcmar <rkrcmar@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>, X86 ML <x86@kernel.org>,
Marcelo Tosatti <mtosatti@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
kvm list <kvm@vger.kernel.org>, Alexander Graf <agraf@suse.de>
Subject: Re: [PATCH 1/5] x86/kvm: On KVM re-enable (e.g. after suspend), update clocks
Date: Thu, 17 Mar 2016 16:10:07 +0100
Message-ID: <20160317151007.GF20310@potion.brq.redhat.com>
In-Reply-To: <CALCETrUSWnuW5yWyu28kqV1JrBUU+TrVTFDJZ+a8VwQoi2FPYA@mail.gmail.com>
2016-03-16 16:07-0700, Andy Lutomirski:
> On Wed, Mar 16, 2016 at 3:59 PM, Radim Krcmar <rkrcmar@redhat.com> wrote:
>> 2016-03-16 15:15-0700, Andy Lutomirski:
>>> FWIW, if you ever intend to support ART ("always running timer")
>>> passthrough, this is going to be a giant clusterfsck. Good luck. I
>>> haven't gotten a straight answer as to what hardware actually supports
>>> that thing, so even testing isn't easy.
>>
>> Hm, AR TSC would be best handled by doing nothing ... dropping the
>> faking logic just became tempting.
ART is different from what I initially thought: it's the underlying
mechanism for invariant TSC and nothing more ... we already forbid
migrations when the guest knows about invariant TSC, so we could do the
same and let ART be virtualized. (Suspend has to be forbidden too.)
> As it stands, ART is screwed if you adjust the VMCS's tsc offset. But
Luckily, assigning real hardware can prevent migration or suspend, so we
won't need to adjust the offset during runtime. TSC is a generally
unmigratable device that just happens to live on the CPU.
(It would have been better to hide TSC capability from the guest and only
use rdtsc for kvmclock if the guest wanted fancy features.)
> I think it's also screwed if you migrate to a machine with a different
> ratio of guest TSC ticks to host ART ticks or a different offset,
> because the host isn't going to do the rdmsr every time it tries to
> access the ART, so passing it through might require a paravirt
> mechanism no matter what.
It's almost certain that the other host will have a different offset,
which makes TSC unmigratable in software without even considering ART
or frequencies. Well, KVM already emulates a different TSC frequency, so
we could emulate ART without sinking much lower. :)
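To make the offset problem concrete, here is a minimal sketch, assuming plain TSC offsetting with no frequency scaling (guest TSC = host TSC + offset). The function names are hypothetical and are not KVM's; the point is only that the destination host needs a freshly computed offset for the guest TSC to continue from where it stopped.

```c
#include <stdint.h>

/*
 * Assumption: plain TSC offsetting without scaling, so the guest
 * reads host_tsc + offset. Arithmetic wraps modulo 2^64.
 */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t offset)
{
	return host_tsc + offset;
}

/*
 * Hypothetical helper: offset that makes the guest continue from
 * saved_guest_tsc on a host whose TSC currently reads new_host_tsc.
 * The result may be a wrapped ("negative") value when the new host's
 * TSC is ahead of the saved guest value.
 */
static uint64_t offset_for_resume(uint64_t saved_guest_tsc,
				  uint64_t new_host_tsc)
{
	return saved_guest_tsc - new_host_tsc;
}
```

Whatever offset was in use on the source host is useless on the destination, because the two host TSCs are unrelated; only the saved guest value carries over.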
> ISTM that, if KVM tries to keep the guest TSC monotonic across
> migration, it should probably also keep it monotonic across host
> suspend/resume.
Yes, "pausing" TSC during suspend or migration is one way of improving
the TSC estimate. If we want to emulate ART, then the estimate is
noticeably lacking, because TSC and ART are defined by a simple
equation (SDM 2015-12, 17.14.4 Invariant Time-Keeping):
  TSC_Value = (ART_Value * CPUID.15H:EBX[31:0]) / CPUID.15H:EAX[31:0] + K
where the guest thinks that CPUID and K are constant (between events
that the guest knows of), so we should give the best estimate of how
many TSC cycles have passed. (The best estimate is still lacking.)
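As a rough illustration of that relation, a minimal sketch follows. It assumes the caller already has the CPUID.15H numerator (EBX) and denominator (EAX) and some value for K; nothing here reads CPUID or any MSR, and the function name is made up for the example.

```c
#include <stdint.h>

/*
 * Sketch of: TSC_Value = (ART_Value * EBX) / EAX + K
 *
 * numerator   = CPUID.15H:EBX[31:0] (supplied by the caller)
 * denominator = CPUID.15H:EAX[31:0] (supplied by the caller, must be
 *               non-zero; zero means the ratio is not enumerated)
 * k           = the constant K as seen by whoever does the conversion
 */
static uint64_t art_to_tsc(uint64_t art, uint32_t numerator,
			   uint32_t denominator, uint64_t k)
{
	/*
	 * Divide first and handle the remainder separately so that
	 * art * numerator never has to fit in 64 bits as a whole.
	 */
	uint64_t quot = art / denominator;
	uint64_t rem = art % denominator;

	return quot * numerator + (rem * numerator) / denominator + k;
}
```

From the guest's point of view any TSC offset applied by the host is folded into K, which is why changing the offset behind the guest's back breaks the relation the guest believes in.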
> After all, host suspend/resume is kind of like
> migrating from the pre-suspend host to the post-resume host. Maybe it
> could even share code.
Hopefully ... host suspend/resume is driven by the kernel and migration is
driven by userspace, which might complicate sharing.