kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [Bug 59521] New: KVM linux guest reads uninitialized pvclock values before executing rdmsr MSR_KVM_WALL_CLOCK
@ 2013-06-10 14:11 bugzilla-daemon
  2013-06-10 16:31 ` [PATCH] x86: kvmclock: zero initialize pvclock shared memory area Igor Mammedov
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: bugzilla-daemon @ 2013-06-10 14:11 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=59521

           Summary: KVM linux guest reads uninitialized pvclock values
                    before executing rdmsr MSR_KVM_WALL_CLOCK
           Product: Virtualization
           Version: unspecified
    Kernel Version: 3.8 and higher
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: kvm
        AssignedTo: virtualization_kvm@kernel-bugs.osdl.org
        ReportedBy: eabatalov89@gmail.com
        Regression: Yes


It turns out that since linux-image-3.8.0-1-generic_3.8.0-1.5 built from
sources Ubuntu kernel reads uninitialized KVM paravirtualized clocksource data
while working under hypervisor which supports KVM_FEATURE_CLOCKSOURCE.

I've trapped all the guest r/w memory accesses to paravirtualized clock
structures to proof it.

The event sequence is the following:
1. Kernel bootstrap cpu executes  rdmsr MSR_KVM_WALL_CLOCK and reads proper
initialized values of kvm clocksource
2. Memory for percpu paravirtualized KVM clocksources is allocated during
initialization of KVM subsystem.
arch/x86/kernel/kvmclock.c:
221 void __init kvmclock_init(void)
...
240         mem = memblock_alloc(size, PAGE_SIZE);
3. SMP boot is performed. One by one cpus 0, 1, 2, .. n are initialized.
4. Each cpu reads its uninitialized preallocated paravirtualized wallclock
structures before executing rdmsr MSR_KVM_WALL_CLOCK which should initialize
them. So the contents of structs are more or less random.

Here is the struct which is read:
struct pvclock_vcpu_time_info {
 62         u32   version;
 63         u32   pad0;
 64         u64   tsc_timestamp;
 65         u64   system_time;
 66         u32   tsc_to_system_mul;
 67         s8    tsc_shift;
 68         u8    flags;
 69         u8    pad[2];
 70 } __attribute__((__packed__)); /* 32 bytes */
Reading uninitialized contents creates 2 problems for us:
1. If version field is odd then cpu spins forever (see
./arch/x86/kernel/pvclock.c:pvclock_clocksource_read)
2. Depending on values of other struct fields and on the features of
virtualized hardware kernel scheduler can behave in different ways. I've
observed kernel boot hang after start of SMP initialization.

SMP cpu#0 doesn't read unintialized data because it reads bootstrap cpu struct
which was properly initialized using preliminary rdmsr.
At the same time SMP cpu#0 performs the same thing as other SMP cpus - executes
rdmsr MSR_KVM_WALL_CLOCK after reading its paravirtualized clock struct. So no
reproduction on UMP machine.

It may be hard to reproduce the bug because of its random nature.
To easily reproduce it I propose the following patch:
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 5bedbdd..44aaf25 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -219,6 +222,7 @@ void __init kvmclock_init(void)
 {
        unsigned long mem;
        int size;
+       int i;

        size = PAGE_ALIGN(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS);

@@ -238,6 +242,9 @@ void __init kvmclock_init(void)
        if (!mem)
                return;
        hv_clock = __va(mem);
+       for (i = 0; i < NR_CPUS; ++i) {
+               hv_clock[i].pvti.version = 0xDEADBEAF;
+       }

        if (kvm_register_clock("boot clock")) {
                hv_clock = NULL;

This patch set version field to an odd value so if a cpu reads it before rdmsr
it spins forever because hypervisor won't set it to even value.

The first patch which makes the bug reproducible is kernel mainline commit:
"x86: kvmclock: allocate pvclock shared memory area".

Before the patch memory for KVM clocksources was allocated in percpu variables.
Every SMP cpu which starts execution got copy of bootstrab cpu pvclock
variable.
That's why pvclock structs were "initialized" before the patch and there was no
bug with uninitialized memory read.
After the patch memory is allocated using memblock_alloc. Nothing is written to
the allocated memory before actual call of MSR_KVM_WALL_CLOCK.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2013-06-21  9:01 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-10 14:11 [Bug 59521] New: KVM linux guest reads uninitialized pvclock values before executing rdmsr MSR_KVM_WALL_CLOCK bugzilla-daemon
2013-06-10 16:31 ` [PATCH] x86: kvmclock: zero initialize pvclock shared memory area Igor Mammedov
2013-06-10 20:19   ` Marcelo Tosatti
2013-06-15 18:01     ` [PATCHv1] kvm guest: fix uninitialized kvmclock read by KVM guest Eugene Batalov
2013-06-18 22:21       ` Marcelo Tosatti
2013-06-19 13:05         ` Paolo Bonzini
     [not found]           ` <CAJF2t5sYHy9q9a7-fZauf1Z7_FkK1_DOP13GHji=8-vDUsnnsQ@mail.gmail.com>
2013-06-19 13:29             ` Paolo Bonzini
2013-06-20  8:30               ` Igor Mammedov
2013-06-20  8:35                 ` Paolo Bonzini
2013-06-11 16:03 ` [Bug 59521] KVM linux guest reads uninitialized pvclock values before executing rdmsr MSR_KVM_WALL_CLOCK bugzilla-daemon
2013-06-15 17:17 ` bugzilla-daemon
2013-06-17 15:50   ` Paolo Bonzini
2013-06-17 15:50 ` bugzilla-daemon
2013-06-17 21:29 ` bugzilla-daemon
2013-06-21  9:01 ` [PATCH 0/2 v2] x86: kvmclock: Prevent uninitialized per-cpu kvmclock usage Igor Mammedov
2013-06-21  9:01 ` [PATCH 1/2] x86: kvmclock: zero initialize pvclock shared memory area Igor Mammedov
2013-06-21  9:01 ` [PATCH 2/2] x86: kvmclock: register per-cpu kvmclock at earliest possible time Igor Mammedov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).