From: Radim Krcmar <rkrcmar@redhat.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>, kvm list <kvm@vger.kernel.org>
Subject: Re: What's kvmclock's custom sched_clock for?
Date: Tue, 12 Jan 2016 16:33:28 +0100
Message-ID: <20160112153327.GA12521@potion.brq.redhat.com>
In-Reply-To: <20160111210045.GA13738@amt.cnet>
2016-01-11 19:00-0200, Marcelo Tosatti:
> On Fri, Jan 08, 2016 at 03:13:16PM +0100, Radim Krcmar wrote:
>> 2016-01-07 18:10-0200, Marcelo Tosatti:
>>> On Thu, Jan 07, 2016 at 04:18:11PM +0100, Radim Krcmar wrote:
>>>> Stable sched clock is quite unrelated to TSC features. KVMs from last
>>>> few years should always give good enough result to allow stable sched
>>>> clock. We wanted realtime guests and realtime linux needs no_hz=full
>>>> that depends on stable sched clock. The result is huge hack.
>>>>
>>>> We'd need to say that migration creates powerful gravity fields to
>>>> faithfully migrate constant/invariant TSC, but stable sched clock
>>>> doesn't have that strict expectations about time.
>>>
>>> Was that supposed to be a joke?
>>
>> Yes, if you mean the first sentence of the second paragraph.
>> (I think that we'll use a different disclaimer when we enable
>> best-effort migration with invariant TSC.)
>
> About getting rid of kvmclock,
I never wanted to get rid of kvmclock. In the first part of the email
in question, I meant that the shift and scale can be accelerated by
VMX-TSC hardware, leaving only a check that kvmclock is in the expected
mode and an rdtsc to get the result.
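For reference, the "shift and scale" step can be sketched roughly as below.
This is a minimal illustration, not kernel code: the parameter names mirror
the tsc_timestamp, system_time, tsc_to_system_mul and tsc_shift fields of
the pvclock time info structure, while the function names and the use of
Python are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wraparound


def pvclock_scale_delta(delta, mul_frac, shift):
    """Convert a TSC delta to nanoseconds with a shift and a 32.32 multiply."""
    if shift < 0:
        delta >>= -shift
    else:
        delta = (delta << shift) & MASK64
    # mul_frac is a 32-bit fixed-point fraction: ns = delta * mul_frac / 2^32
    return (delta * mul_frac) >> 32


def kvmclock_read(tsc_now, tsc_timestamp, system_time, mul_frac, shift):
    """Guest-side clock read: base system time plus the scaled TSC delta."""
    delta = (tsc_now - tsc_timestamp) & MASK64
    return system_time + pvclock_scale_delta(delta, mul_frac, shift)
```

With a 2 GHz TSC, each tick is 0.5 ns, so mul_frac would be 1 << 31 with a
shift of 0. If hardware already presents a suitably scaled TSC to the guest,
this arithmetic is the part that becomes unnecessary.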
> problem is steal time. Should
> separate steal time reporting from rest of kvmclock, so that you
> can use TSC clocksource and have steal time reporting.
We can already do that; steal time doesn't depend on the guest sched clock.
Steal time uses an MSR+memory based interface that is related to kvmclock
only by a shared notion of a second.
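To illustrate why this is independent of the clock source, here is a rough
sketch of how a guest might read the shared steal-time area. It assumes a
version/steal layout in which an odd version means the host is mid-update;
the class and function names are hypothetical, and the memory barriers a
real guest needs cannot be modelled in Python.

```python
from dataclasses import dataclass


@dataclass
class StealTimeArea:
    """Hypothetical stand-in for the guest memory registered via the MSR."""
    version: int = 0  # odd while the host is updating the record
    steal: int = 0    # nanoseconds the vCPU was runnable but not running


def read_steal_ns(area):
    """Retry until a consistent (even, unchanged) version brackets the read."""
    while True:
        version = area.version
        steal = area.steal
        if version % 2 == 0 and version == area.version:
            return steal
```

Note that nothing here reads the TSC or sched clock; the only thing shared
with kvmclock is that the value is expressed in nanoseconds.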
> Also, it's very clear why migration was disabled, because
> invariant tsc man page says:
>
> QEMU commit 68bfd0ad4a1dcc4c328d5db85dc746b49c1ec07e
>
> target-i386: block migration and savevm if invariant tsc is exposed
>
> Invariant TSC documentation mentions that "invariant TSC will run at a
> constant rate in all ACPI P-, C-. and T-states".
>
> This is not the case if migration to a host with different TSC frequency
> is allowed, or if savevm is performed. So block migration/savevm.
>
> The issue is, even with migration to a host with
> proper frequency, TSC counting will stop for the duration of migration.
Stopping is the easiest solution. We can also try to mitigate the
difference by synchronizing time on the source and destination hosts,
sharing what the UTC/TAI/... time was at one TSC read on the source,
and setting the appropriate TSC shift on the destination. (And solve
accumulation of the error, maybe by always using the initial pair.)
The result should be less off than when stopping, and the guest couldn't
tell that the TSC rate varied, as it can't have a more reliable time
source than the host.

The issue doesn't have a good solution and I think that some people will
prefer the drawbacks associated with invariant TSC migration.

(They do so for other time sources, and all have the issue; besides, we
already migrate constant TSC, which can only match the spec if we make
some excuses, like "migration forces CPUs into a deep C-state".)
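The mitigation described above (share one wall-clock/TSC pair from the
source, then set the TSC shift on the destination) comes down to simple
arithmetic. The sketch below assumes both hosts have synchronized wall
clocks and that the guest sees the same TSC frequency on both sides; all
names are hypothetical, and the shift is computed as a 64-bit offset to be
added to the destination host's TSC.

```python
MASK64 = (1 << 64) - 1  # TSC offsets wrap modulo 2^64
NSEC_PER_SEC = 1_000_000_000


def destination_tsc_offset(src_wall_ns, src_guest_tsc, tsc_hz,
                           dst_wall_ns, dst_host_tsc):
    """Offset that makes the guest TSC look as if it never stopped counting.

    (src_wall_ns, src_guest_tsc) is the pair sampled on the source;
    (dst_wall_ns, dst_host_tsc) is sampled on the destination at resume.
    """
    elapsed_ns = dst_wall_ns - src_wall_ns
    expected_guest_tsc = src_guest_tsc + elapsed_ns * tsc_hz // NSEC_PER_SEC
    return (expected_guest_tsc - dst_host_tsc) & MASK64
```

Always passing the pair from the very first host, rather than re-sampling
on each hop, is one way to keep rounding and synchronization errors from
accumulating across repeated migrations.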
> But I suppose you can document the fact (that "invariant TSC" behaviour
> as documented is different from what is exposed by virtualization),
Yep, that generic explanation is quite likely, right next to having no
documentation at all.
(There are some lawyerish explanations that don't need to violate the
spec, but I prefer the physics-based one.)
> and
> go for it.
I definitely won't be proactive.