From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@bugzilla.kernel.org
Subject: [Bug 59521] KVM linux guest reads uninitialized pvclock values
before executing rdmsr MSR_KVM_WALL_CLOCK
Date: Sat, 15 Jun 2013 17:17:19 +0000 (UTC)
Message-ID: <20130615171719.98CD711FB38@bugzilla.kernel.org>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
To: kvm@vger.kernel.org
Return-path:
Received: from mail.kernel.org ([198.145.19.201]:38181 "EHLO mail.kernel.org"
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
id S1751837Ab3FORRX (ORCPT );
Sat, 15 Jun 2013 13:17:23 -0400
Received: from mail.kernel.org (localhost [127.0.0.1])
by mail.kernel.org (Postfix) with ESMTP id E7054201AA
for ; Sat, 15 Jun 2013 17:17:21 +0000 (UTC)
Received: from bugzilla.kernel.org (bugzilla.kernel.org [198.145.19.217])
by mail.kernel.org (Postfix) with ESMTP id 4D29B20198
for ; Sat, 15 Jun 2013 17:17:20 +0000 (UTC)
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:
https://bugzilla.kernel.org/show_bug.cgi?id=59521
--- Comment #2 from Eugene Batalov 2013-06-15 17:17:18 ---
KVM pv_clock initialization for SMP CPU is called very early on SMP CPU boot
stage:
arch/x86/kernel/smpboot.c:
__cpuinit start_secondary(void *unused)
...
cpu_init();
x86_cpuinit.early_percpu_clock_init();
early_percpu_clock_init is set as virtual function in
./arch/x86/kernel/kvmclock.c:
#ifdef CONFIG_X86_LOCAL_APIC
x86_cpuinit.early_percpu_clock_init =
kvm_setup_secondary_clock;
#endif
The problem is in cpu_init() which is called earlier.
cpu_init() calls printk and possibly other stuff which can use timestamps.
printk calls local_clock() to obtain a timestamp of a log message. On KVM
guests call sequence usually ends up in kvm_clock_read but needed rdmsr is
executed only in x86_cpuinit.early_percpu_clock_init().
I consider two approaches to fix the problem:
1. Swap cpu_init(); and x86_cpuinit.early_percpu_clock_init();
+ Simple
- We will get excessive restrictions on operations which allowed to be
performed in early_percpu_clock_init() because percpu specific data is
initialized only in cpu_init().
2. Return 0ULL from kvm_clock_read untill it is initialized.
+ Simple too
- Additional if statement inside kvm_clock_read (not serious even for
performance paranoiacs)
- Returning 0ULL looks ok because it is the same thing which kernel bootstrap
CPU does on early boot stages. But I am not quite sure. Better to ask guys who
maintain the needed relevant subsystem.
I prefer the second way. It doesn't add complex restrictions to CPU bootup
code. I'll send a patch soon which fixes the problem in the second way.
I don't propagate such a logic to levels higher then KVM clocksource
(pv_time_ops level for example) because of the following code:
void __init kvmclock_init(void)
...
263 pv_time_ops.sched_clock = kvm_clock_read;
264 x86_platform.calibrate_tsc = kvm_get_tsc_khz;
265 x86_platform.get_wallclock = kvm_get_wallclock;
266 x86_platform.set_wallclock = kvm_set_wallclock;
267 #ifdef CONFIG_X86_LOCAL_APIC
268 x86_cpuinit.early_percpu_clock_init =
269 kvm_setup_secondary_clock;
270 #endif
271 x86_platform.save_sched_clock_state = kvm_save_sched_clock_state;
272 x86_platform.restore_sched_clock_state =
kvm_restore_sched_clock_state;
To propagate the logic I need to make changes both in x86_platform and
pv_time_ops also I should make a similar fix for ia64 arch. It needs some
subsystems refactoring to make the changes clean. Dont' think that its worth to
fix the bug. Better to make a simple fix.
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.