* [PATCH] kvm: Fix kvmclock initialization on !CONFIG_KVM_GUEST
@ 2012-08-15 14:05 OGAWA Hirofumi
2012-08-16 19:57 ` Marcelo Tosatti
2012-08-16 20:00 ` [PATCH] x86: KVM guest: merge CONFIG_KVM_CLOCK into CONFIG_KVM_GUEST Marcelo Tosatti
0 siblings, 2 replies; 6+ messages in thread
From: OGAWA Hirofumi @ 2012-08-15 14:05 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm
If !CONFIG_KVM_GUEST, kvm_smp_prepare_boot_cpu() is not defined. So,
kvm_register_clock("primary cpu clock") in kvm_smp_prepare_boot_cpu()
is not called.
The detail of problem is hv_clock percpu usage. hv_clock is percpu
variable, but kvmclock_init() is called _before_ initializing percpu
area, and doesn't update address after initialized percpu area.
So, host kvm modify the memory area _before_ initializing percpu. This
became the cause of strange memory corruption on guest OS.
This fixes it by adding kvm_smp_prepare_boot_cpu(). [we might be
better to kill the usage before percpu initialization.]
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
---
arch/x86/kernel/kvmclock.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff -puN arch/x86/kernel/kvmclock.c~kvm-fix-kvmclock-init arch/x86/kernel/kvmclock.c
--- tux3fs/arch/x86/kernel/kvmclock.c~kvm-fix-kvmclock-init 2012-08-15 22:15:22.000000000 +0900
+++ tux3fs-hirofumi/arch/x86/kernel/kvmclock.c 2012-08-15 22:16:14.000000000 +0900
@@ -145,6 +145,14 @@ static void kvm_restore_sched_clock_stat
kvm_register_clock("primary cpu clock, resume");
}
+#if defined(CONFIG_SMP) && !defined(CONFIG_KVM_GUEST)
+static void __init kvm_smp_prepare_boot_cpu(void)
+{
+ WARN_ON(kvm_register_clock("primary cpu clock"));
+ native_smp_prepare_boot_cpu();
+}
+#endif
+
#ifdef CONFIG_X86_LOCAL_APIC
static void __cpuinit kvm_setup_secondary_clock(void)
{
@@ -194,6 +202,12 @@ void __init kvmclock_init(void)
printk(KERN_INFO "kvm-clock: Using msrs %x and %x",
msr_kvm_system_time, msr_kvm_wall_clock);
+ /*
+ * This is temporary register until percpu is initialized.
+ * After percpu was initialized, we register real hv_clock via
+ * kvm_smp_prepare_boot_cpu().
+ * FIXME: it wouldn't be good to use before percpu is initialized.
+ */
if (kvm_register_clock("boot clock"))
return;
pv_time_ops.sched_clock = kvm_clock_read;
@@ -210,6 +224,9 @@ void __init kvmclock_init(void)
#ifdef CONFIG_KEXEC
machine_ops.crash_shutdown = kvm_crash_shutdown;
#endif
+#if defined(CONFIG_SMP) && !defined(CONFIG_KVM_GUEST)
+ smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu;
+#endif
kvm_get_preset_lpj();
clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
pv_info.paravirt_enabled = 1;
_
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kvm: Fix kvmclock initialization on !CONFIG_KVM_GUEST
2012-08-15 14:05 [PATCH] kvm: Fix kvmclock initialization on !CONFIG_KVM_GUEST OGAWA Hirofumi
@ 2012-08-16 19:57 ` Marcelo Tosatti
2012-08-17 0:10 ` OGAWA Hirofumi
2012-08-16 20:00 ` [PATCH] x86: KVM guest: merge CONFIG_KVM_CLOCK into CONFIG_KVM_GUEST Marcelo Tosatti
1 sibling, 1 reply; 6+ messages in thread
From: Marcelo Tosatti @ 2012-08-16 19:57 UTC (permalink / raw)
To: OGAWA Hirofumi; +Cc: Avi Kivity, kvm
On Wed, Aug 15, 2012 at 11:05:57PM +0900, OGAWA Hirofumi wrote:
>
> If !CONFIG_KVM_GUEST, kvm_smp_prepare_boot_cpu() is not defined. So,
> kvm_register_clock("primary cpu clock") in kvm_smp_prepare_boot_cpu()
> is not called.
>
> The detail of problem is hv_clock percpu usage. hv_clock is percpu
> variable, but kvmclock_init() is called _before_ initializing percpu
> area, and doesn't update address after initialized percpu area.
>
> So, host kvm modify the memory area _before_ initializing percpu. This
> became the cause of strange memory corruption on guest OS.
>
>
> This fixes it by adding kvm_smp_prepare_boot_cpu(). [we might be
> better to kill the usage before percpu initialization.]
>
> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
The distinction between CONFIG_KVM_CLOCK and CONFIG_KVM_GUEST is
not so clear anymore, as this bug demonstrates.
There is no point in having a separate config option, therefore i
propose to merge the two (see other reply) instead.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] x86: KVM guest: merge CONFIG_KVM_CLOCK into CONFIG_KVM_GUEST
2012-08-15 14:05 [PATCH] kvm: Fix kvmclock initialization on !CONFIG_KVM_GUEST OGAWA Hirofumi
2012-08-16 19:57 ` Marcelo Tosatti
@ 2012-08-16 20:00 ` Marcelo Tosatti
1 sibling, 0 replies; 6+ messages in thread
From: Marcelo Tosatti @ 2012-08-16 20:00 UTC (permalink / raw)
To: OGAWA Hirofumi, Avi Kivity; +Cc: kvm
The distinction between CONFIG_KVM_CLOCK and CONFIG_KVM_GUEST is
not so clear anymore, as demonstrated by recent bugs caused by poor
handling of on/off combinations of these options.
Merge CONFIG_KVM_CLOCK into CONFIG_KVM_GUEST.
Reported-By: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8ec3a1a..e0f3164 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -573,23 +573,18 @@ config PARAVIRT_TIME_ACCOUNTING
source "arch/x86/xen/Kconfig"
-config KVM_CLOCK
- bool "KVM paravirtualized clock"
- select PARAVIRT
- select PARAVIRT_CLOCK
- ---help---
- Turning on this option will allow you to run a paravirtualized clock
- when running over the KVM hypervisor. Instead of relying on a PIT
- (or probably other) emulation by the underlying device model, the host
- provides the guest with timing infrastructure such as time of day, and
- system time
-
config KVM_GUEST
- bool "KVM Guest support"
+ bool "KVM Guest support (including kvmclock)"
+ select PARAVIRT
select PARAVIRT
+ select PARAVIRT_CLOCK
+ default y if PARAVIRT_GUEST
---help---
This option enables various optimizations for running under the KVM
- hypervisor.
+ hypervisor. It includes a paravirtualized clock, so that instead
+ of relying on a PIT (or probably other) emulation by the
+ underlying device model, the host provides the guest with
+ timing infrastructure such as time of day, and system time
source "arch/x86/lguest/Kconfig"
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 20f5697..eb3e9d8 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -102,14 +102,14 @@ struct kvm_vcpu_pv_apf_data {
extern void kvmclock_init(void);
extern int kvm_register_clock(char *txt);
-#ifdef CONFIG_KVM_CLOCK
+#ifdef CONFIG_KVM_GUEST
bool kvm_check_and_clear_guest_paused(void);
#else
static inline bool kvm_check_and_clear_guest_paused(void)
{
return false;
}
-#endif /* CONFIG_KVMCLOCK */
+#endif /* CONFIG_KVM_GUEST */
/* This instruction is vmcall. On non-VT architectures, it will generate a
* trap that we will then rewrite to the appropriate instruction.
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 8215e56..7203298 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -81,8 +81,7 @@ obj-$(CONFIG_DEBUG_RODATA_TEST) += test_rodata.o
obj-$(CONFIG_DEBUG_NX_TEST) += test_nx.o
obj-$(CONFIG_DEBUG_NMI_SELFTEST) += nmi_selftest.o
-obj-$(CONFIG_KVM_GUEST) += kvm.o
-obj-$(CONFIG_KVM_CLOCK) += kvmclock.o
+obj-$(CONFIG_KVM_GUEST) += kvm.o kvmclock.o
obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch_$(BITS).o
obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
obj-$(CONFIG_PARAVIRT_CLOCK) += pvclock.o
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index c1d61ee..84a90b6 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -396,9 +396,7 @@ void kvm_disable_steal_time(void)
#ifdef CONFIG_SMP
static void __init kvm_smp_prepare_boot_cpu(void)
{
-#ifdef CONFIG_KVM_CLOCK
WARN_ON(kvm_register_clock("primary cpu clock"));
-#endif
kvm_guest_cpu_init();
native_smp_prepare_boot_cpu();
}
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f4b9b80..b3386ae 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -957,7 +957,7 @@ void __init setup_arch(char **cmdline_p)
initmem_init();
memblock_find_dma_reserve();
-#ifdef CONFIG_KVM_CLOCK
+#ifdef CONFIG_KVM_GUEST
kvmclock_init();
#endif
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] kvm: Fix kvmclock initialization on !CONFIG_KVM_GUEST
2012-08-16 19:57 ` Marcelo Tosatti
@ 2012-08-17 0:10 ` OGAWA Hirofumi
2012-08-17 17:54 ` Marcelo Tosatti
0 siblings, 1 reply; 6+ messages in thread
From: OGAWA Hirofumi @ 2012-08-17 0:10 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Avi Kivity, kvm
Marcelo Tosatti <mtosatti@redhat.com> writes:
> On Wed, Aug 15, 2012 at 11:05:57PM +0900, OGAWA Hirofumi wrote:
>>
>> If !CONFIG_KVM_GUEST, kvm_smp_prepare_boot_cpu() is not defined. So,
>> kvm_register_clock("primary cpu clock") in kvm_smp_prepare_boot_cpu()
>> is not called.
>>
>> The detail of problem is hv_clock percpu usage. hv_clock is percpu
>> variable, but kvmclock_init() is called _before_ initializing percpu
>> area, and doesn't update address after initialized percpu area.
>>
>> So, host kvm modify the memory area _before_ initializing percpu. This
>> became the cause of strange memory corruption on guest OS.
>>
>>
>> This fixes it by adding kvm_smp_prepare_boot_cpu(). [we might be
>> better to kill the usage before percpu initialization.]
>>
>> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
>
> The distinction between CONFIG_KVM_CLOCK and CONFIG_KVM_GUEST is
> not so clear anymore, as this bug demonstrates.
>
> There is no point in having a separate config option, therefore i
> propose to merge the two (see other reply) instead.
Yes, it was an another option to fix this. As note, the wrong percpu
usage (use it before initialization) is still true even if merged
KVM_CLOCK.
Thanks.
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kvm: Fix kvmclock initialization on !CONFIG_KVM_GUEST
2012-08-17 0:10 ` OGAWA Hirofumi
@ 2012-08-17 17:54 ` Marcelo Tosatti
2012-08-18 0:17 ` OGAWA Hirofumi
0 siblings, 1 reply; 6+ messages in thread
From: Marcelo Tosatti @ 2012-08-17 17:54 UTC (permalink / raw)
To: OGAWA Hirofumi; +Cc: Avi Kivity, kvm
On Fri, Aug 17, 2012 at 09:10:42AM +0900, OGAWA Hirofumi wrote:
> Marcelo Tosatti <mtosatti@redhat.com> writes:
>
> > On Wed, Aug 15, 2012 at 11:05:57PM +0900, OGAWA Hirofumi wrote:
> >>
> >> If !CONFIG_KVM_GUEST, kvm_smp_prepare_boot_cpu() is not defined. So,
> >> kvm_register_clock("primary cpu clock") in kvm_smp_prepare_boot_cpu()
> >> is not called.
> >>
> >> The detail of problem is hv_clock percpu usage. hv_clock is percpu
> >> variable, but kvmclock_init() is called _before_ initializing percpu
> >> area, and doesn't update address after initialized percpu area.
> >>
> >> So, host kvm modify the memory area _before_ initializing percpu. This
> >> became the cause of strange memory corruption on guest OS.
> >>
> >>
> >> This fixes it by adding kvm_smp_prepare_boot_cpu(). [we might be
> >> better to kill the usage before percpu initialization.]
> >>
> >> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
> >
> > The distinction between CONFIG_KVM_CLOCK and CONFIG_KVM_GUEST is
> > not so clear anymore, as this bug demonstrates.
> >
> > There is no point in having a separate config option, therefore i
> > propose to merge the two (see other reply) instead.
>
> Yes, it was an another option to fix this. As note, the wrong percpu
> usage (use it before initialization) is still true even if merged
> KVM_CLOCK.
Its fine, i believe, because there is a percpu area for the "boot
processor" (see __per_cpu_offset at arch/x86/kernel/setup_percpu.c)
before proper initialization.
Can you please confirm the proposed config merge fixes the problem
for you?
> Thanks.
> --
> OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kvm: Fix kvmclock initialization on !CONFIG_KVM_GUEST
2012-08-17 17:54 ` Marcelo Tosatti
@ 2012-08-18 0:17 ` OGAWA Hirofumi
0 siblings, 0 replies; 6+ messages in thread
From: OGAWA Hirofumi @ 2012-08-18 0:17 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Avi Kivity, kvm
Marcelo Tosatti <mtosatti@redhat.com> writes:
>> Yes, it was an another option to fix this. As note, the wrong percpu
>> usage (use it before initialization) is still true even if merged
>> KVM_CLOCK.
>
> Its fine, i believe, because there is a percpu area for the "boot
> processor" (see __per_cpu_offset at arch/x86/kernel/setup_percpu.c)
> before proper initialization.
Yes. I thought to use early_per_cpu() or similar though, finally I
decided to make a patch with minimum logic change.
> Can you please confirm the proposed config merge fixes the problem
> for you?
kvm-clock: Using msrs 4b564d01 and 4b564d00
kvm-clock: cpu 0, msr 0:1e80f01, boot clock
...
kvm-clock: cpu 0, msr 0:3e3d2f01, primary cpu clock
...
kvm-clock: cpu 1, msr 0:3e5d2f01, secondary cpu clock
I confirmed kvm-clock of boot-cpu switched address properly with your
patch (msr xxx:yyy is gpa). This fix the problem.
- hypervisor.
+ hypervisor. It includes a paravirtualized clock, so that instead
BTW, in the patch, above line is having the trailing space - "instead ".
+ of relying on a PIT (or probably other) emulation by the
+ underlying device model, the host provides the guest with
+ timing infrastructure such as time of day, and system time
Thanks.
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-08-18 0:17 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-15 14:05 [PATCH] kvm: Fix kvmclock initialization on !CONFIG_KVM_GUEST OGAWA Hirofumi
2012-08-16 19:57 ` Marcelo Tosatti
2012-08-17 0:10 ` OGAWA Hirofumi
2012-08-17 17:54 ` Marcelo Tosatti
2012-08-18 0:17 ` OGAWA Hirofumi
2012-08-16 20:00 ` [PATCH] x86: KVM guest: merge CONFIG_KVM_CLOCK into CONFIG_KVM_GUEST Marcelo Tosatti
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).