* [PATCH 0/3] KVM clock, new iteration
@ 2008-01-11 13:10 Glauber de Oliveira Costa
[not found] ` <12000570563874-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Glauber de Oliveira Costa @ 2008-01-11 13:10 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Cc: jeremy-TSDbQ3PG+2Y, avi-atKUWr5tajBWk0Htik3J/w
Hi folks,
A lot of time has been elapsed since I last sent a version of kvm clock,
with a lot of things in the middle, including my vacations and a nasty bug
in the clock itself ;-)
Here it is a new version. I'm following avi's last suggestion of using an msr
to make it more migration-proof. If it's not approved, we can easily switch back
to the hypercall mechanism. That's the easy part ;-)
Clock updates are done every time we're scheduled. It is _not_ enough to check
if we're changing cpus, because the tsc will change its frequency when the cpu is
idle (if using the mwait idle). So after a long idle period, one cpu will end up stepping
ahead causing a backward time (remember the bug I talked about? :-) )
As I fell it ready for inclusion (which obviously does not mean it is perfect, nor that
I don't value any further comments ), I'm also posting the userspace patches in a separate
series that will follow shortly.
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] put kvm_para.h include outside __KERNEL__
[not found] ` <12000570563874-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2008-01-11 13:10 ` Glauber de Oliveira Costa
[not found] ` <12000570732755-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 15:49 ` [PATCH 0/3] KVM clock, new iteration Anthony Liguori
1 sibling, 1 reply; 12+ messages in thread
From: Glauber de Oliveira Costa @ 2008-01-11 13:10 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Cc: jeremy-TSDbQ3PG+2Y, avi-atKUWr5tajBWk0Htik3J/w,
Glauber de Oliveira Costa
kvm_para.h potentially contains definitions that are to be used by kvm-userspace,
so it should not be included inside the __KERNEL__ block. To protect its own data structures,
kvm_para.h already includes its own __KERNEL__ block.
Signed-off-by: Glauber de Oliveira Costa <gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
include/linux/kvm_para.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index 6af91a5..5497aac 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
@@ -14,12 +14,12 @@
#define KVM_HC_VAPIC_POLL_IRQ 1
-#ifdef __KERNEL__
/*
* hypercalls use architecture specific
*/
#include <asm/kvm_para.h>
+#ifdef __KERNEL__
static inline int kvm_para_has_feature(unsigned int feature)
{
if (kvm_arch_para_features() & (1UL << feature))
--
1.5.0.6
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/3] [PATCH] kvmclock - the host part.
[not found] ` <12000570732755-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2008-01-11 13:10 ` Glauber de Oliveira Costa
[not found] ` <12000570802212-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 15:09 ` [PATCH 1/3] put kvm_para.h include outside __KERNEL__ Amit Shah
1 sibling, 1 reply; 12+ messages in thread
From: Glauber de Oliveira Costa @ 2008-01-11 13:10 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Cc: jeremy-TSDbQ3PG+2Y, avi-atKUWr5tajBWk0Htik3J/w,
Glauber de Oliveira Costa
This is the host part of kvm clocksource implementation. As it does
not include clockevents, it is a fairly simple implementation. We
only have to register a per-vcpu area, and start writting to it periodically.
The area is binary compatible with xen, as we use the same shadow_info structure.
Signed-off-by: Glauber de Oliveira Costa <gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
arch/x86/kvm/x86.c | 64 +++++++++++++++++++++++++++++++++++++++++++-
include/asm-x86/kvm_host.h | 5 +++
include/asm-x86/kvm_para.h | 5 +++
include/linux/kvm.h | 1 +
4 files changed, 74 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8a90403..d3fd920 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -19,6 +19,7 @@
#include "irq.h"
#include "mmu.h"
+#include <linux/clocksource.h>
#include <linux/kvm.h>
#include <linux/fs.h>
#include <linux/vmalloc.h>
@@ -412,7 +413,7 @@ static u32 msrs_to_save[] = {
#ifdef CONFIG_X86_64
MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
#endif
- MSR_IA32_TIME_STAMP_COUNTER,
+ MSR_IA32_TIME_STAMP_COUNTER, MSR_PARAVIRT_CLOCK,
};
static unsigned num_msrs_to_save;
@@ -467,6 +468,50 @@ static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
return kvm_set_msr(vcpu, index, *data);
}
+#define WC_OFFSET offsetof(struct shared_info, wc_version)
+
+static void kvm_write_guest_time(struct kvm_vcpu *v)
+{
+ struct timespec ts, wc_ts;
+ int wc_args[3]; /* version, wc_sec, wc_nsec */
+ unsigned long flags;
+ struct kvm_vcpu_arch *vcpu = &v->arch;
+
+ if (!vcpu->shared_info)
+ return;
+
+ /* Make the update of both version numbers visible to guest */
+ wc_args[0] = ++vcpu->wc_version;
+ kvm_write_guest(v->kvm, vcpu->shared_info + WC_OFFSET, wc_args,
+ sizeof(wc_args));
+ vcpu->hv_clock.version++;
+ kvm_write_guest(v->kvm, vcpu->clock_addr, &vcpu->hv_clock,
+ sizeof(vcpu->hv_clock));
+
+ /* Keep irq disabled to prevent changes to the clock */
+ local_irq_save(flags);
+ kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER,
+ &vcpu->hv_clock.tsc_timestamp);
+ wc_ts = current_kernel_time();
+ ktime_get_ts(&ts);
+ local_irq_restore(flags);
+
+ /* With all the info we got, fill in the values */
+ wc_args[1] = wc_ts.tv_sec;
+ wc_args[2] = wc_ts.tv_nsec;
+ wc_args[0] = ++vcpu->wc_version;
+
+ vcpu->hv_clock.system_time = ts.tv_nsec +
+ (NSEC_PER_SEC * (u64)ts.tv_sec);
+ vcpu->hv_clock.version++;
+
+ /* And finally, let the guest see them */
+ kvm_write_guest(v->kvm, vcpu->shared_info + WC_OFFSET, wc_args,
+ sizeof(wc_args));
+ kvm_write_guest(v->kvm, vcpu->clock_addr, &vcpu->hv_clock,
+ sizeof(vcpu->hv_clock));
+}
+
int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
{
@@ -494,6 +539,18 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
case MSR_IA32_MISC_ENABLE:
vcpu->arch.ia32_misc_enable_msr = data;
break;
+ case MSR_PARAVIRT_CLOCK: {
+ vcpu->arch.shared_info = data;
+
+ vcpu->arch.clock_addr = data + offsetof(struct vcpu_info, time)
+ + sizeof(struct vcpu_info) * vcpu->vcpu_id;
+
+ vcpu->arch.hv_clock.tsc_to_system_mul =
+ clocksource_khz2mult(tsc_khz, 22);
+ vcpu->arch.hv_clock.tsc_shift = 22;
+ kvm_write_guest_time(vcpu);
+ break;
+ }
default:
pr_unimpl(vcpu, "unhandled wrmsr: 0x%x data %llx\n", msr, data);
return 1;
@@ -553,6 +610,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
data = vcpu->arch.shadow_efer;
break;
#endif
+ case MSR_PARAVIRT_CLOCK:
+ data = vcpu->arch.shared_info;
+ break;
default:
pr_unimpl(vcpu, "unhandled rdmsr: 0x%x\n", msr);
return 1;
@@ -680,6 +740,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_USER_MEMORY:
case KVM_CAP_SET_TSS_ADDR:
case KVM_CAP_EXT_CPUID:
+ case KVM_CAP_CLOCKSOURCE:
r = 1;
break;
case KVM_CAP_VAPIC:
@@ -737,6 +798,7 @@ out:
void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
kvm_x86_ops->vcpu_load(vcpu, cpu);
+ kvm_write_guest_time(vcpu);
}
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index d6db0de..bbc4b51 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -261,6 +261,11 @@ struct kvm_vcpu_arch {
/* emulate context */
struct x86_emulate_ctxt emulate_ctxt;
+
+ struct vcpu_time_info hv_clock;
+ gpa_t shared_info;
+ gpa_t clock_addr; /* so we avoid computing it every time */
+ int wc_version;
};
struct kvm_mem_alias {
diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
index c6f3fd8..abe412a 100644
--- a/include/asm-x86/kvm_para.h
+++ b/include/asm-x86/kvm_para.h
@@ -1,5 +1,6 @@
#ifndef __X86_KVM_PARA_H
#define __X86_KVM_PARA_H
+#include <xen/interface/xen.h>
/* This CPUID returns the signature 'KVMKVMKVM' in ebx, ecx, and edx. It
* should be used to determine that a VM is running under KVM.
@@ -10,9 +11,13 @@
* paravirtualization, the appropriate feature bit should be checked.
*/
#define KVM_CPUID_FEATURES 0x40000001
+#define KVM_FEATURE_CLOCKSOURCE 0
+#define MSR_PARAVIRT_CLOCK 0x11
#ifdef __KERNEL__
#include <asm/processor.h>
+extern void kvmclock_init(void);
+
/* This instruction is vmcall. On non-VT architectures, it will generate a
* trap that we will then rewrite to the appropriate instruction.
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 4de4fd2..78ce53f 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -232,6 +232,7 @@ struct kvm_vapic_addr {
#define KVM_CAP_SET_TSS_ADDR 4
#define KVM_CAP_EXT_CPUID 5
#define KVM_CAP_VAPIC 6
+#define KVM_CAP_CLOCKSOURCE 7
/*
* ioctls for VM fds
--
1.5.0.6
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/3] [PATCH] kvmclock implementation, the guest part.
[not found] ` <12000570802212-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2008-01-11 13:10 ` Glauber de Oliveira Costa
[not found] ` <12000570863750-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 15:47 ` [PATCH 2/3] [PATCH] kvmclock - the host part Anthony Liguori
2008-01-12 20:38 ` Avi Kivity
2 siblings, 1 reply; 12+ messages in thread
From: Glauber de Oliveira Costa @ 2008-01-11 13:10 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Cc: jeremy-TSDbQ3PG+2Y, avi-atKUWr5tajBWk0Htik3J/w,
Glauber de Oliveira Costa
This is the guest part of kvm clock implementation
It does not do tsc-only timing, as tsc can have deltas
between cpus, and it did not seem worthy to me to keep
adjusting them.
We do use it, however, for fine-grained adjustment.
Other than that, time comes from the host.
Signed-off-by: Glauber de Oliveira Costa <gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
arch/x86/Kconfig | 10 +++
arch/x86/kernel/Makefile_32 | 1 +
arch/x86/kernel/kvmclock.c | 160 +++++++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/setup_32.c | 5 ++
arch/x86/xen/time.c | 6 +-
5 files changed, 179 insertions(+), 3 deletions(-)
create mode 100644 arch/x86/kernel/kvmclock.c
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ab2df55..968315e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -350,6 +350,16 @@ config VMI
at the moment), by linking the kernel to a GPL-ed ROM module
provided by the hypervisor.
+config KVM_CLOCK
+ bool "KVM paravirtualized clock"
+ select PARAVIRT
+ help
+ Turning on this option will allow you to run a paravirtualized clock
+ when running over the KVM hypervisor. Instead of relying on a PIT
+ (or probably other) emulation by the underlying device model, the host
+ provides the guest with timing infrastructure, as time of day, and
+ timer expiration.
+
source "arch/x86/lguest/Kconfig"
endif
diff --git a/arch/x86/kernel/Makefile_32 b/arch/x86/kernel/Makefile_32
index a7bc93c..f6332b6 100644
--- a/arch/x86/kernel/Makefile_32
+++ b/arch/x86/kernel/Makefile_32
@@ -44,6 +44,7 @@ obj-$(CONFIG_K8_NB) += k8.o
obj-$(CONFIG_MGEODE_LX) += geode_32.o mfgpt_32.o
obj-$(CONFIG_VMI) += vmi_32.o vmiclock_32.o
+obj-$(CONFIG_KVM_CLOCK) += kvmclock.o
obj-$(CONFIG_PARAVIRT) += paravirt_32.o
obj-y += pcspeaker.o
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
new file mode 100644
index 0000000..f2094f4
--- /dev/null
+++ b/arch/x86/kernel/kvmclock.c
@@ -0,0 +1,160 @@
+/* KVM paravirtual clock driver. A clocksource implementation
+ Copyright (C) 2007 Glauber de Oliveira Costa, Red Hat Inc.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+*/
+
+#include <linux/clocksource.h>
+#include <linux/kvm_para.h>
+#include <asm/arch_hooks.h>
+#include <asm/msr.h>
+
+
+#define KVM_SCALE 22
+
+static int kvmclock = 1;
+
+static int parse_no_kvmclock(char *arg)
+{
+ kvmclock = 0;
+ return 0;
+}
+early_param("no-kvmclock", parse_no_kvmclock);
+
+struct shared_info shared_info __attribute__((__aligned__(PAGE_SIZE)));
+
+/* The hypervisor will put information about time periodically here */
+DEFINE_PER_CPU(struct vcpu_time_info *, hv_clock);
+#define get_clock(cpu, field) per_cpu(hv_clock, cpu)->field
+
+static inline u64 kvm_get_delta(u64 last_tsc)
+{
+ int cpu = smp_processor_id();
+ u64 delta = native_read_tsc() - last_tsc;
+ return (delta * get_clock(cpu, tsc_to_system_mul)) >> KVM_SCALE;
+}
+
+/*
+ * The wallclock is the time of day when we booted. Since then, some time may
+ * have elapsed since the hypervisor wrote the data. So we try to account for
+ * that. Even if the tsc is not accurate, it gives us a more accurate timing
+ * than not adjusting at all
+ */
+unsigned long kvm_get_wallclock(void)
+{
+ u32 wc_sec, wc_nsec;
+ u64 delta, last_tsc;
+ struct timespec ts;
+ int version, nsec, cpu = smp_processor_id();
+
+ do {
+ version = shared_info.wc_version;
+ rmb();
+ wc_sec = shared_info.wc_sec;
+ wc_nsec = shared_info.wc_nsec;
+ last_tsc = get_clock(cpu, tsc_timestamp);
+ rmb();
+ } while ((shared_info.wc_version != version) || (version & 1));
+
+ delta = kvm_get_delta(last_tsc);
+ delta += wc_nsec;
+ nsec = do_div(delta, NSEC_PER_SEC);
+ set_normalized_timespec(&ts, wc_sec + delta, nsec);
+ /*
+ * Of all mechanisms of time adjustment I've tested, this one
+ * was the champion!
+ */
+ return ts.tv_sec + 1;
+}
+
+int kvm_set_wallclock(unsigned long now)
+{
+ return 0;
+}
+
+/*
+ * This is our read_clock function. The host puts an tsc timestamp each time
+ * it updates a new time, and then we can use it to derive a slightly more
+ * precise notion of elapsed time, converted to nanoseconds.
+ */
+static cycle_t kvm_clock_read(void)
+{
+ u64 last_tsc, now;
+ u32 version;
+ int cpu;
+
+ preempt_disable();
+ cpu = smp_processor_id();
+
+ do {
+ version = get_clock(cpu, version);
+ rmb();
+ last_tsc = get_clock(cpu, tsc_timestamp);
+ now = get_clock(cpu, system_time);
+ rmb();
+ } while ((get_clock(cpu, version) != version) || (version & 1));
+
+ now += kvm_get_delta(last_tsc);
+ preempt_enable();
+
+ return now;
+}
+
+static struct clocksource kvm_clock = {
+ .name = "kvm-clock",
+ .read = kvm_clock_read,
+ .rating = 400,
+ .mask = CLOCKSOURCE_MASK(64),
+ .mult = 1 << KVM_SCALE,
+ .shift = KVM_SCALE,
+ .flags = CLOCK_SOURCE_IS_CONTINUOUS,
+};
+
+static int kvm_register_clock(void)
+{
+ int cpu = smp_processor_id();
+ u64 kvm_clock_info = (u64)(u64 *)&shared_info;
+ kvm_clock_info = __pa(kvm_clock_info);
+ __get_cpu_var(hv_clock) = &(shared_info.vcpu_info[cpu].time);
+
+ return native_write_msr_safe(MSR_PARAVIRT_CLOCK, kvm_clock_info);
+}
+
+static void kvm_setup_secondary_clock(void)
+{
+ /*
+ * Now that the first cpu already had this clocksource initialized,
+ * we shouldn't fail.
+ */
+ WARN_ON(kvm_register_clock());
+ /* ok, done with our trickery, call native */
+ setup_secondary_APIC_clock();
+}
+
+void __init kvmclock_init(void)
+{
+ if (!kvm_para_available())
+ return;
+
+ if (kvmclock && kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE)) {
+ if (kvm_register_clock())
+ return;
+ pv_time_ops.get_wallclock = kvm_get_wallclock;
+ pv_time_ops.set_wallclock = kvm_set_wallclock;
+ pv_time_ops.sched_clock = kvm_clock_read;
+ pv_apic_ops.setup_secondary_clock = kvm_setup_secondary_clock;
+ clocksource_register(&kvm_clock);
+ }
+}
diff --git a/arch/x86/kernel/setup_32.c b/arch/x86/kernel/setup_32.c
index 9c24b45..89c0eb2 100644
--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -44,6 +44,7 @@
#include <linux/crash_dump.h>
#include <linux/dmi.h>
#include <linux/pfn.h>
+#include <linux/kvm_para.h>
#include <video/edid.h>
@@ -614,6 +615,10 @@ void __init setup_arch(char **cmdline_p)
max_low_pfn = setup_memory();
+#ifdef CONFIG_KVM_CLOCK
+ kvmclock_init();
+#endif
+
#ifdef CONFIG_VMI
/*
* Must be after max_low_pfn is determined, and before kernel
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index d083ff5..7728b87 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -28,7 +28,7 @@
#define TIMER_SLOP 100000
#define NS_PER_TICK (1000000000LL / HZ)
-static cycle_t xen_clocksource_read(void);
+cycle_t xen_clocksource_read(void);
/* These are perodically updated in shared_info, and then copied here. */
struct shadow_time_info {
@@ -304,7 +304,7 @@ static u64 get_nsec_offset(struct shadow_time_info *shadow)
return scale_delta(delta, shadow->tsc_to_nsec_mul, shadow->tsc_shift);
}
-static cycle_t xen_clocksource_read(void)
+cycle_t xen_clocksource_read(void)
{
struct shadow_time_info *shadow = &get_cpu_var(shadow_time);
cycle_t ret;
@@ -322,7 +322,7 @@ static cycle_t xen_clocksource_read(void)
return ret;
}
-static void xen_read_wallclock(struct timespec *ts)
+void xen_read_wallclock(struct timespec *ts)
{
const struct shared_info *s = HYPERVISOR_shared_info;
u32 version;
--
1.5.0.6
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] [PATCH] kvmclock implementation, the guest part.
[not found] ` <12000570863750-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2008-01-11 13:23 ` Gerd Hoffmann
[not found] ` <47876DB6.3020105-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Gerd Hoffmann @ 2008-01-11 13:23 UTC (permalink / raw)
To: Glauber de Oliveira Costa
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, jeremy-TSDbQ3PG+2Y,
avi-atKUWr5tajBWk0Htik3J/w
Hi,
> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
> index d083ff5..7728b87 100644
> --- a/arch/x86/xen/time.c
> +++ b/arch/x86/xen/time.c
> -static cycle_t xen_clocksource_read(void);
> +cycle_t xen_clocksource_read(void);
Huh? You kill the static, but don't use the functions anywhere? Looks
like half-done code sharing with xen paravirt clock ...
cheers,
Gerd
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] [PATCH] kvmclock implementation, the guest part.
[not found] ` <47876DB6.3020105-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2008-01-11 13:32 ` Glauber de Oliveira Costa
0 siblings, 0 replies; 12+ messages in thread
From: Glauber de Oliveira Costa @ 2008-01-11 13:32 UTC (permalink / raw)
To: Gerd Hoffmann
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, jeremy-TSDbQ3PG+2Y,
avi-atKUWr5tajBWk0Htik3J/w
Gerd Hoffmann wrote:
> Hi,
>
>> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
>> index d083ff5..7728b87 100644
>> --- a/arch/x86/xen/time.c
>> +++ b/arch/x86/xen/time.c
>> -static cycle_t xen_clocksource_read(void);
>> +cycle_t xen_clocksource_read(void);
>
> Huh? You kill the static, but don't use the functions anywhere? Looks
> like half-done code sharing with xen paravirt clock ...
>
> cheers,
> Gerd
>
It's called experimentation. I was sure I deleted this part of the patch
altogheter, but I must have
missed it. Thanks for the point out.
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] put kvm_para.h include outside __KERNEL__
[not found] ` <12000570732755-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 13:10 ` [PATCH 2/3] [PATCH] kvmclock - the host part Glauber de Oliveira Costa
@ 2008-01-11 15:09 ` Amit Shah
1 sibling, 0 replies; 12+ messages in thread
From: Amit Shah @ 2008-01-11 15:09 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Cc: jeremy-TSDbQ3PG+2Y, avi-atKUWr5tajBWk0Htik3J/w,
Glauber de Oliveira Costa
On Friday 11 January 2008 18:40:54 Glauber de Oliveira Costa wrote:
> kvm_para.h potentially contains definitions that are to be used by
> kvm-userspace, so it should not be included inside the __KERNEL__ block. To
> protect its own data structures, kvm_para.h already includes its own
> __KERNEL__ block.
>
> Signed-off-by: Glauber de Oliveira Costa <gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Acked-by: Amit Shah <amit.shah-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] [PATCH] kvmclock - the host part.
[not found] ` <12000570802212-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 13:10 ` [PATCH 3/3] [PATCH] kvmclock implementation, the guest part Glauber de Oliveira Costa
@ 2008-01-11 15:47 ` Anthony Liguori
[not found] ` <47878F8A.4010506-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2008-01-12 20:38 ` Avi Kivity
2 siblings, 1 reply; 12+ messages in thread
From: Anthony Liguori @ 2008-01-11 15:47 UTC (permalink / raw)
To: Glauber de Oliveira Costa
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, jeremy-TSDbQ3PG+2Y,
avi-atKUWr5tajBWk0Htik3J/w
Glauber de Oliveira Costa wrote:
> This is the host part of kvm clocksource implementation. As it does
> not include clockevents, it is a fairly simple implementation. We
> only have to register a per-vcpu area, and start writting to it periodically.
>
> The area is binary compatible with xen, as we use the same shadow_info structure.
>
> diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
> index c6f3fd8..abe412a 100644
> --- a/include/asm-x86/kvm_para.h
> +++ b/include/asm-x86/kvm_para.h
> @@ -1,5 +1,6 @@
> #ifndef __X86_KVM_PARA_H
> #define __X86_KVM_PARA_H
> +#include <xen/interface/xen.h>
Can we abstract that out into a neutral header instead of including the
Xen headers directly in KVM. Please rename the structure too to
something neutral.
Including something as generic as xen/interface/xen.h when CONFIG_XEN
may not be set is not a good thing, I believe.
Regards,
Anthony Liguori
>
> /* This CPUID returns the signature 'KVMKVMKVM' in ebx, ecx, and edx. It
> * should be used to determine that a VM is running under KVM.
> @@ -10,9 +11,13 @@
> * paravirtualization, the appropriate feature bit should be checked.
> */
> #define KVM_CPUID_FEATURES 0x40000001
> +#define KVM_FEATURE_CLOCKSOURCE 0
>
> +#define MSR_PARAVIRT_CLOCK 0x11
> #ifdef __KERNEL__
> #include <asm/processor.h>
> +extern void kvmclock_init(void);
> +
>
> /* This instruction is vmcall. On non-VT architectures, it will generate a
> * trap that we will then rewrite to the appropriate instruction.
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index 4de4fd2..78ce53f 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -232,6 +232,7 @@ struct kvm_vapic_addr {
> #define KVM_CAP_SET_TSS_ADDR 4
> #define KVM_CAP_EXT_CPUID 5
> #define KVM_CAP_VAPIC 6
> +#define KVM_CAP_CLOCKSOURCE 7
>
> /*
> * ioctls for VM fds
>
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/3] KVM clock, new iteration
[not found] ` <12000570563874-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 13:10 ` [PATCH 1/3] put kvm_para.h include outside __KERNEL__ Glauber de Oliveira Costa
@ 2008-01-11 15:49 ` Anthony Liguori
1 sibling, 0 replies; 12+ messages in thread
From: Anthony Liguori @ 2008-01-11 15:49 UTC (permalink / raw)
To: Glauber de Oliveira Costa
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, jeremy-TSDbQ3PG+2Y,
avi-atKUWr5tajBWk0Htik3J/w
Glauber de Oliveira Costa wrote:
> Hi folks,
>
> A lot of time has been elapsed since I last sent a version of kvm clock,
> with a lot of things in the middle, including my vacations and a nasty bug
> in the clock itself ;-)
>
> Here it is a new version. I'm following avi's last suggestion of using an msr
> to make it more migration-proof. If it's not approved, we can easily switch back
> to the hypercall mechanism. That's the easy part ;-)
>
> Clock updates are done every time we're scheduled. It is _not_ enough to check
> if we're changing cpus, because the tsc will change its frequency when the cpu is
> idle (if using the mwait idle). So after a long idle period, one cpu will end up stepping
> ahead causing a backward time (remember the bug I talked about? :-) )
>
> As I fell it ready for inclusion (which obviously does not mean it is perfect, nor that
> I don't value any further comments ), I'm also posting the userspace patches in a separate
> series that will follow shortly.
>
>
This patch series is looking good. I like the switch to an MSR.
Regards,
Anthony Liguori
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] [PATCH] kvmclock - the host part.
[not found] ` <12000570802212-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 13:10 ` [PATCH 3/3] [PATCH] kvmclock implementation, the guest part Glauber de Oliveira Costa
2008-01-11 15:47 ` [PATCH 2/3] [PATCH] kvmclock - the host part Anthony Liguori
@ 2008-01-12 20:38 ` Avi Kivity
2 siblings, 0 replies; 12+ messages in thread
From: Avi Kivity @ 2008-01-12 20:38 UTC (permalink / raw)
To: Glauber de Oliveira Costa
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, jeremy-TSDbQ3PG+2Y
Glauber de Oliveira Costa wrote:
> This is the host part of kvm clocksource implementation. As it does
> not include clockevents, it is a fairly simple implementation. We
> only have to register a per-vcpu area, and start writting to it periodically.
>
> The area is binary compatible with xen, as we use the same shadow_info structure.
>
> @@ -412,7 +413,7 @@ static u32 msrs_to_save[] = {
> #ifdef CONFIG_X86_64
> MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
> #endif
> - MSR_IA32_TIME_STAMP_COUNTER,
> + MSR_IA32_TIME_STAMP_COUNTER, MSR_PARAVIRT_CLOCK,
> };
>
Let us be good citizens and put the msr into our private MSR_KVM_*
namespace.
>
>
> +#define WC_OFFSET offsetof(struct shared_info, wc_version)
> +
> +static void kvm_write_guest_time(struct kvm_vcpu *v)
> +{
> + struct timespec ts, wc_ts;
> + int wc_args[3]; /* version, wc_sec, wc_nsec */
> + unsigned long flags;
> + struct kvm_vcpu_arch *vcpu = &v->arch;
> +
> + if (!vcpu->shared_info)
> + return;
> +
> + /* Make the update of both version numbers visible to guest */
> + wc_args[0] = ++vcpu->wc_version;
> + kvm_write_guest(v->kvm, vcpu->shared_info + WC_OFFSET, wc_args,
> + sizeof(wc_args));
> + vcpu->hv_clock.version++;
> + kvm_write_guest(v->kvm, vcpu->clock_addr, &vcpu->hv_clock,
> + sizeof(vcpu->hv_clock));
> +
> + /* Keep irq disabled to prevent changes to the clock */
> + local_irq_save(flags);
> + kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER,
> + &vcpu->hv_clock.tsc_timestamp);
> + wc_ts = current_kernel_time();
> + ktime_get_ts(&ts);
> + local_irq_restore(flags);
> +
> + /* With all the info we got, fill in the values */
> + wc_args[1] = wc_ts.tv_sec;
> + wc_args[2] = wc_ts.tv_nsec;
> + wc_args[0] = ++vcpu->wc_version;
> +
> + vcpu->hv_clock.system_time = ts.tv_nsec +
> + (NSEC_PER_SEC * (u64)ts.tv_sec);
> + vcpu->hv_clock.version++;
> +
> + /* And finally, let the guest see them */
> + kvm_write_guest(v->kvm, vcpu->shared_info + WC_OFFSET, wc_args,
> + sizeof(wc_args));
> + kvm_write_guest(v->kvm, vcpu->clock_addr, &vcpu->hv_clock,
> + sizeof(vcpu->hv_clock));
> +}
> +
>
Why write things twice? This is per-vcpu, and the guest cannot see
anything between the two writes.
Also, kvm_write_guest() needs mmap_sem for read. Suggest you grab the
page using gfn_to_page() and kmap_atomic() it instead.
--
Any sufficiently difficult bug is indistinguishable from a feature.
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] [PATCH] kvmclock - the host part.
[not found] ` <47878F8A.4010506-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
@ 2008-01-12 20:49 ` Avi Kivity
[not found] ` <478927DA.3000800-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2008-01-12 20:49 UTC (permalink / raw)
To: Anthony Liguori
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, jeremy-TSDbQ3PG+2Y,
Glauber de Oliveira Costa
Anthony Liguori wrote:
> Glauber de Oliveira Costa wrote:
>
>> This is the host part of kvm clocksource implementation. As it does
>> not include clockevents, it is a fairly simple implementation. We
>> only have to register a per-vcpu area, and start writting to it periodically.
>>
>> The area is binary compatible with xen, as we use the same shadow_info structure.
>>
>> diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
>> index c6f3fd8..abe412a 100644
>> --- a/include/asm-x86/kvm_para.h
>> +++ b/include/asm-x86/kvm_para.h
>> @@ -1,5 +1,6 @@
>> #ifndef __X86_KVM_PARA_H
>> #define __X86_KVM_PARA_H
>> +#include <xen/interface/xen.h>
>>
>
> Can we abstract that out into a neutral header instead of including the
> Xen headers directly in KVM. Please rename the structure too to
> something neutral.
>
> Including something as generic as xen/interface/xen.h when CONFIG_XEN
> may not be set is not a good thing, I believe.
>
Well, one of the motivations behind this is to actually supply a Xen
clock to Xen guests. So maybe we can call the feature a "kvm xenclock"
instead and feel justified in including Xen headers.
However, you are probably right in that we are getting into a dependency
and kconfig hell we do not yet deserve. Not even mentioning that
CONFIG_XEN is a guest thing while kvmclock is also a host thing.
So we are probably better of using a non-Xen structure that accidentally
happens to be binary compatible with the Xen interface. Stranger things
have happened.
If we agree on this, please define _only_ the time-related parts so we
have a decoupled interface. We'll need two msrs: one for wc and one
for vcpu time.
--
Any sufficiently difficult bug is indistinguishable from a feature.
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] [PATCH] kvmclock - the host part.
[not found] ` <478927DA.3000800-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2008-01-12 22:51 ` Anthony Liguori
0 siblings, 0 replies; 12+ messages in thread
From: Anthony Liguori @ 2008-01-12 22:51 UTC (permalink / raw)
To: Avi Kivity
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, jeremy-TSDbQ3PG+2Y,
Glauber de Oliveira Costa
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Glauber de Oliveira Costa wrote:
>>
>>> This is the host part of kvm clocksource implementation. As it does
>>> not include clockevents, it is a fairly simple implementation. We
>>> only have to register a per-vcpu area, and start writting to it
>>> periodically.
>>>
>>> The area is binary compatible with xen, as we use the same
>>> shadow_info structure.
>>>
>>> diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
>>> index c6f3fd8..abe412a 100644
>>> --- a/include/asm-x86/kvm_para.h
>>> +++ b/include/asm-x86/kvm_para.h
>>> @@ -1,5 +1,6 @@
>>> #ifndef __X86_KVM_PARA_H
>>> #define __X86_KVM_PARA_H
>>> +#include <xen/interface/xen.h>
>>>
>>
>> Can we abstract that out into a neutral header instead of including
>> the Xen headers directly in KVM. Please rename the structure too to
>> something neutral.
>>
>> Including something as generic as xen/interface/xen.h when CONFIG_XEN
>> may not be set is not a good thing, I believe.
>>
>
> Well, one of the motivations behind this is to actually supply a Xen
> clock to Xen guests. So maybe we can call the feature a "kvm
> xenclock" instead and feel justified in including Xen headers.
>
> However, you are probably right in that we are getting into a
> dependency and kconfig hell we do not yet deserve. Not even
> mentioning that CONFIG_XEN is a guest thing while kvmclock is also a
> host thing.
>
> So we are probably better of using a non-Xen structure that
> accidentally happens to be binary compatible with the Xen interface.
> Stranger things have happened.
Yes, this is what I meant. My concerns is dependencies and namespace
pollution. I think using the same interface is a very Good Thing.
Regards,
Anthony Liguori
> If we agree on this, please define _only_ the time-related parts so we
> have a decoupled interface. We'll need two msrs: one for wc and one
> for vcpu time.
>
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2008-01-12 22:51 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-11 13:10 [PATCH 0/3] KVM clock, new iteration Glauber de Oliveira Costa
[not found] ` <12000570563874-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 13:10 ` [PATCH 1/3] put kvm_para.h include outside __KERNEL__ Glauber de Oliveira Costa
[not found] ` <12000570732755-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 13:10 ` [PATCH 2/3] [PATCH] kvmclock - the host part Glauber de Oliveira Costa
[not found] ` <12000570802212-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 13:10 ` [PATCH 3/3] [PATCH] kvmclock implementation, the guest part Glauber de Oliveira Costa
[not found] ` <12000570863750-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 13:23 ` Gerd Hoffmann
[not found] ` <47876DB6.3020105-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-01-11 13:32 ` Glauber de Oliveira Costa
2008-01-11 15:47 ` [PATCH 2/3] [PATCH] kvmclock - the host part Anthony Liguori
[not found] ` <47878F8A.4010506-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2008-01-12 20:49 ` Avi Kivity
[not found] ` <478927DA.3000800-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2008-01-12 22:51 ` Anthony Liguori
2008-01-12 20:38 ` Avi Kivity
2008-01-11 15:09 ` [PATCH 1/3] put kvm_para.h include outside __KERNEL__ Amit Shah
2008-01-11 15:49 ` [PATCH 0/3] KVM clock, new iteration Anthony Liguori
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox