* [patch v2 0/3] kvmclock: allow stable sched clock
@ 2015-05-28 23:20 Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 1/3] x86: kvmclock: add flag to indicate pvclock counts from zero Marcelo Tosatti
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Marcelo Tosatti @ 2015-05-28 23:20 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Luiz Capitulino, kvm
kvmclock provides the behaviour sched_clock users expect.
Mark it as stable allowing nohz_full in guests.
See individual patches for more details.
v2: address comments from Paolo:
- set COUNTS_FROM_ZERO unconditionally (host).
- rely on KVM_FEATURE_CLOCKSOURCE_STABLE_BIT (guest).
^ permalink raw reply [flat|nested] 4+ messages in thread
* [patch v2 1/3] x86: kvmclock: add flag to indicate pvclock counts from zero
2015-05-28 23:20 [patch v2 0/3] kvmclock: allow stable sched clock Marcelo Tosatti
@ 2015-05-28 23:20 ` Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 2/3] x86: kvmclock: set scheduler clock stable Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 3/3] KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSR Marcelo Tosatti
2 siblings, 0 replies; 4+ messages in thread
From: Marcelo Tosatti @ 2015-05-28 23:20 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Luiz Capitulino, kvm, Marcelo Tosatti
[-- Attachment #1: 01-pvclock-add-flag --]
[-- Type: text/plain, Size: 880 bytes --]
Setting sched clock stable for kvmclock causes the printk timestamps
to not start from zero, which is different from baremetal and
can possibly break userspace. Add a flag to indicate that
hypervisor sets clock base at zero when kvmclock is initialized.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/include/asm/pvclock-abi.h | 1 +
1 file changed, 1 insertion(+)
Index: kvm/arch/x86/include/asm/pvclock-abi.h
===================================================================
--- kvm.orig/arch/x86/include/asm/pvclock-abi.h 2014-11-06 23:59:14.615913334 -0200
+++ kvm/arch/x86/include/asm/pvclock-abi.h 2015-05-27 17:40:53.435192771 -0300
@@ -41,5 +41,6 @@
#define PVCLOCK_TSC_STABLE_BIT (1 << 0)
#define PVCLOCK_GUEST_STOPPED (1 << 1)
+#define PVCLOCK_COUNTS_FROM_ZERO (1 << 2)
#endif /* __ASSEMBLY__ */
#endif /* _ASM_X86_PVCLOCK_ABI_H */
^ permalink raw reply [flat|nested] 4+ messages in thread
* [patch v2 2/3] x86: kvmclock: set scheduler clock stable
2015-05-28 23:20 [patch v2 0/3] kvmclock: allow stable sched clock Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 1/3] x86: kvmclock: add flag to indicate pvclock counts from zero Marcelo Tosatti
@ 2015-05-28 23:20 ` Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 3/3] KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSR Marcelo Tosatti
2 siblings, 0 replies; 4+ messages in thread
From: Marcelo Tosatti @ 2015-05-28 23:20 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Luiz Capitulino, kvm, Marcelo Tosatti
[-- Attachment #1: 02-guest-patch --]
[-- Type: text/plain, Size: 3357 bytes --]
From: Luiz Capitulino <lcapitulino@redhat.com>
If you try to enable NOHZ_FULL on a guest today, you'll get
the following error when the guest tries to deactivate the
scheduler tick:
WARNING: CPU: 3 PID: 2182 at kernel/time/tick-sched.c:192 can_stop_full_tick+0xb9/0x290()
NO_HZ FULL will not work with unstable sched clock
CPU: 3 PID: 2182 Comm: kworker/3:1 Not tainted 4.0.0-10545-gb9bb6fb #204
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: events flush_to_ldisc
ffffffff8162a0c7 ffff88011f583e88 ffffffff814e6ba0 0000000000000002
ffff88011f583ed8 ffff88011f583ec8 ffffffff8104d095 ffff88011f583eb8
0000000000000000 0000000000000003 0000000000000001 0000000000000001
Call Trace:
<IRQ> [<ffffffff814e6ba0>] dump_stack+0x4f/0x7b
[<ffffffff8104d095>] warn_slowpath_common+0x85/0xc0
[<ffffffff8104d146>] warn_slowpath_fmt+0x46/0x50
[<ffffffff810bd2a9>] can_stop_full_tick+0xb9/0x290
[<ffffffff810bd9ed>] tick_nohz_irq_exit+0x8d/0xb0
[<ffffffff810511c5>] irq_exit+0xc5/0x130
[<ffffffff814f180a>] smp_apic_timer_interrupt+0x4a/0x60
[<ffffffff814eff5e>] apic_timer_interrupt+0x6e/0x80
<EOI> [<ffffffff814ee5d1>] ? _raw_spin_unlock_irqrestore+0x31/0x60
[<ffffffff8108bbc8>] __wake_up+0x48/0x60
[<ffffffff8134836c>] n_tty_receive_buf_common+0x49c/0xba0
[<ffffffff8134a6bf>] ? tty_ldisc_ref+0x1f/0x70
[<ffffffff81348a84>] n_tty_receive_buf2+0x14/0x20
[<ffffffff8134b390>] flush_to_ldisc+0xe0/0x120
[<ffffffff81064d05>] process_one_work+0x1d5/0x540
[<ffffffff81064c81>] ? process_one_work+0x151/0x540
[<ffffffff81065191>] worker_thread+0x121/0x470
[<ffffffff81065070>] ? process_one_work+0x540/0x540
[<ffffffff8106b4df>] kthread+0xef/0x110
[<ffffffff8106b3f0>] ? __kthread_parkme+0xa0/0xa0
[<ffffffff814ef4f2>] ret_from_fork+0x42/0x70
[<ffffffff8106b3f0>] ? __kthread_parkme+0xa0/0xa0
---[ end trace 06e3507544a38866 ]---
However, it turns out that kvmclock does provide a stable
sched_clock callback. So, let the scheduler know this which
in turn makes NOHZ_FULL work in the guest.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
arch/x86/kernel/kvmclock.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
Index: kvm/arch/x86/kernel/kvmclock.c
===================================================================
--- kvm.orig/arch/x86/kernel/kvmclock.c 2015-05-27 18:00:53.616391551 -0300
+++ kvm/arch/x86/kernel/kvmclock.c 2015-05-28 19:19:13.878274636 -0300
@@ -24,6 +24,7 @@
#include <linux/percpu.h>
#include <linux/hardirq.h>
#include <linux/memblock.h>
+#include <linux/sched.h>
#include <asm/x86_init.h>
#include <asm/reboot.h>
@@ -217,8 +218,10 @@
void __init kvmclock_init(void)
{
+ struct pvclock_vcpu_time_info *vcpu_time;
unsigned long mem;
- int size;
+ int size, cpu;
+ u8 flags;
size = PAGE_ALIGN(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS);
@@ -264,7 +267,14 @@
pv_info.name = "KVM";
if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
- pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
+ pvclock_set_flags(~0);
+
+ cpu = get_cpu();
+ vcpu_time = &hv_clock[cpu].pvti;
+ flags = pvclock_read_flags(vcpu_time);
+ if (flags & PVCLOCK_COUNTS_FROM_ZERO)
+ set_sched_clock_stable();
+ put_cpu();
}
int __init kvm_setup_vsyscall_timeinfo(void)
^ permalink raw reply [flat|nested] 4+ messages in thread
* [patch v2 3/3] KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSR
2015-05-28 23:20 [patch v2 0/3] kvmclock: allow stable sched clock Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 1/3] x86: kvmclock: add flag to indicate pvclock counts from zero Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 2/3] x86: kvmclock: set scheduler clock stable Marcelo Tosatti
@ 2015-05-28 23:20 ` Marcelo Tosatti
2 siblings, 0 replies; 4+ messages in thread
From: Marcelo Tosatti @ 2015-05-28 23:20 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Luiz Capitulino, kvm, Marcelo Tosatti
[-- Attachment #1: 03-host-patch --]
[-- Type: text/plain, Size: 1008 bytes --]
Initialize kvmclock base, on kvmclock system MSR write time,
so that the guest sees kvmclock counting from zero.
This matches baremetal behaviour when kvmclock in guest
sets sched clock stable.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
arch/x86/kvm/x86.c | 4 ++++
1 file changed, 4 insertions(+)
Index: kvm/arch/x86/kvm/x86.c
===================================================================
--- kvm.orig/arch/x86/kvm/x86.c 2015-05-28 19:18:12.621372286 -0300
+++ kvm/arch/x86/kvm/x86.c 2015-05-28 19:19:17.738268690 -0300
@@ -1700,6 +1700,8 @@
vcpu->pvclock_set_guest_stopped_request = false;
}
+ /* pvclock counts from zero */
+ pvclock_flags |= PVCLOCK_COUNTS_FROM_ZERO;
/* If the host uses TSC clocksource, then it is stable */
if (use_master_clock)
pvclock_flags |= PVCLOCK_TSC_STABLE_BIT;
@@ -2282,6 +2284,8 @@
&vcpu->requests);
ka->boot_vcpu_runs_old_kvmclock = tmp;
+
+ ka->kvmclock_offset = -get_kernel_ns();
}
vcpu->arch.time = data;
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2015-05-28 23:24 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-28 23:20 [patch v2 0/3] kvmclock: allow stable sched clock Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 1/3] x86: kvmclock: add flag to indicate pvclock counts from zero Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 2/3] x86: kvmclock: set scheduler clock stable Marcelo Tosatti
2015-05-28 23:20 ` [patch v2 3/3] KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSR Marcelo Tosatti
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).