From: Joao Martins <joao.m.martins@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
xen-devel@lists.xenproject.org
Subject: Re: [PATCH v3 5/6] x86/time: implement PVCLOCK_TSC_STABLE_BIT
Date: Tue, 30 Aug 2016 15:14:41 +0100
Message-ID: <57C594D1.1000404@oracle.com>
In-Reply-To: <57C59BF6020000780010A29A@prv-mh.provo.novell.com>
On 08/30/2016 01:45 PM, Jan Beulich wrote:
>>>> On 30.08.16 at 14:26, <joao.m.martins@oracle.com> wrote:
>> On 08/29/2016 11:06 AM, Jan Beulich wrote:
>>>>>> On 26.08.16 at 17:44, <joao.m.martins@oracle.com> wrote:
>>>> On 08/25/2016 11:37 AM, Jan Beulich wrote:
>>>>>>>> On 24.08.16 at 14:43, <joao.m.martins@oracle.com> wrote:
>>>>>> This patch proposes relying on host TSC synchronization and
>>>>>> passthrough to the guest, when running on a TSC-safe platform. On
>>>>>> time_calibration we retrieve the platform time in ns and the counter
>>>>>> read by the clocksource that was used to compute system time. We
>>>>>> introduce a new rendezvous function which doesn't require
>>>>>> synchronization between master and slave CPUs and just reads the
>>>>>> calibration_rendezvous struct and writes the stime and stamp
>>>>>> to the cpu_calibration struct to be used later on. We can guarantee that,
>>>>>> on a platform with a constant and reliable TSC, the time read on
>>>>>> vCPU B right after A is bigger, independently of the vCPU calibration
>>>>>> error. Since pvclock time infos are monotonic as seen by any vCPU, set
>>>>>> PVCLOCK_TSC_STABLE_BIT, which then enables usage of VDSO on Linux.
>>>>>> IIUC, this is similar to how it's implemented on KVM.
>>>>>
>>>>> Without any tools side change, how is it guaranteed that a guest
>>>>> which observed the stable bit won't get migrated to a host not
>>>>> providing that guarantee?
>>>> Do you want to prevent migration in such cases? The worst that can happen is that the
>>>> guest needs to fall back to a system call once this bit is 0, and keeps doing so for
>>>> as long as the bit stays 0.
>>>
>>> Whether migration needs preventing I'm not sure; all I was trying
>>> to indicate is that there seem to be pieces missing wrt migration.
>>> As to the guest falling back to a system call - are guest kernels and
>>> (as far as affected) applications required to cope with the flag
>>> changing from 1 to 0 behind their back?
>> It's expected they cope with this bit changing AFAIK. The vdso code (i.e.
>> applications) always checks this bit on every read to decide whether to fall back to a
>> system call. The same goes for the pvclock code in the guest kernel, in both Linux and
>> FreeBSD, which checks it on every read to decide whether to skip the monotonicity checks.
>
> Okay, but please make sure this is called out at least in the commit
> message, if not in a code comment.
Got it.
>>>> Other than the things above I am not sure how to go about this :( Should we start
>>>> adjusting the TSCs if we find disparities or observe skew in the long run? Or
>>>> allow only TSCs on vCPUs of the same package to expose this flag? Hmm, what's your
>>>> take on this? Appreciate your feedback.
>>>
>>> At least as an initial approach requiring affinities to be limited to a
>>> single socket would seem like a good compromise, provided HT
>>> aspects don't have a bad effect (in which case also excluding HT
>>> may be required). I'd also be fine with command line options
>>> allowing to further relax that, but a simple "clocksource=tsc"
>>> should imo result in a setup which from all we can tell will work as
>>> intended.
>> Sounds reasonable, so unless command line options are specified we disallow TSC as
>> clocksource on multi-socket systems. WRT command line options, how about extending the
>> "tsc" parameter to accept another possible value such as "global" or "socketsafe"?
>> Current values are "unstable" and "skewed".
>
> What about "stable", "stable:socket" (and then perhaps also
> "stable:node")?
Hmm, much nicer. Let me add these two options, along with the docs update for
the tsc param. I'll probably do so in a separate patch in the series.
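As a rough illustration of what accepting the new values could look like, below is a standalone sketch of the string matching only. The enum and parse_tsc_sketch() are hypothetical names, not existing Xen code, and the sketch deliberately assigns no behaviour to each value, since what "stable", "stable:socket" and "stable:node" would relax is still to be settled in the follow-up patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical modes for the "tsc" command line parameter. */
enum tsc_mode_sketch {
    TSC_MODE_INVALID,
    TSC_MODE_UNSTABLE,       /* existing value: "unstable" */
    TSC_MODE_SKEWED,         /* existing value: "skewed" */
    TSC_MODE_STABLE,         /* proposed value: "stable" */
    TSC_MODE_STABLE_SOCKET,  /* proposed value: "stable:socket" */
    TSC_MODE_STABLE_NODE     /* tentatively proposed value: "stable:node" */
};

/* Map an option string to a mode; unknown strings are rejected. */
enum tsc_mode_sketch parse_tsc_sketch(const char *s)
{
    assert(s != NULL);

    if (!strcmp(s, "unstable"))
        return TSC_MODE_UNSTABLE;
    if (!strcmp(s, "skewed"))
        return TSC_MODE_SKEWED;
    if (!strcmp(s, "stable"))
        return TSC_MODE_STABLE;
    if (!strcmp(s, "stable:socket"))
        return TSC_MODE_STABLE_SOCKET;
    if (!strcmp(s, "stable:node"))
        return TSC_MODE_STABLE_NODE;

    return TSC_MODE_INVALID;
}
```

Exact matching with strcmp() keeps "stable" and "stable:socket" distinct, so a plain "stable" never silently picks up a suffixed meaning.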