public inbox for kvm@vger.kernel.org
From: Fernando Luis Vazquez Cao <fernando_b1@lab.ntt.co.jp>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>, Will Auld <will.auld@intel.com>,
	qemu-devel@nongnu.org, kvm@vger.kernel.org,
	Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: [PATCH] target-i386: clear guest TSC on reset
Date: Thu, 05 Dec 2013 22:15:06 +0900
Message-ID: <52A07C5A.9090105@lab.ntt.co.jp>
In-Reply-To: <52A04732.4040105@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 2734 bytes --]

(2013/12/05 18:28), Paolo Bonzini wrote:
> Il 05/12/2013 07:15, Fernando Luis Vázquez Cao ha scritto:
>> VCPU TSC is not cleared by a warm reset (*), which leaves many Linux
>> guests vulnerable to the overflow in cyc2ns_offset fixed by upstream
>> commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix overflow
>> in cyc2ns_offset").
>>
>> To put it in a nutshell, if a Linux guest without the patch above applied
>> has been up more than 208 days and attempts a warm reset chances are that
>> the newly booted kernel will panic or hang.
>>
>> (*) Intel Xeon E5 processors show the same broken behavior due to
>>      the errata "TSC is Not Affected by Warm Reset" (Intel® Xeon®
>>      Processor E5 Family Specification Update - August 2013): "The
>>      TSC (Time Stamp Counter MSR 10H) should be cleared on
>>      reset. Due to this erratum the TSC is not affected by warm
>>      reset."
>>
>> Cc: stable@vger.kernel.org
>> Cc: Will Auld <will.auld@intel.com>
>> Cc: Marcelo Tosatti <mtosatti@redhat.com>
>> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
> I agree that the bug is in QEMU.  One small nit in your patch is that
> you should reset env->tsc_adjust and env->tsc in x86_cpu_reset.  This
> would already be pretty good.

Yes, that is certainly cleaner (I should try not to take shortcuts...).
I am attaching an updated patch. I apologize for not sending it inline;
for reasons better left untold I am writing this on a problematic email
client :).



> However, a bigger problem is that env->tsc is a useless duplicate of
> "cpu_get_ticks() + env->tsc_adjust".  It would be nice to drop env->tsc
> completely except for migration backwards compatibility.  Thus you can:
>
> - fill in env->tsc as mentioned above from target-i386/machine.c's
> cpu_pre_save function.  This guarantees backwards compatibility.
>
> - add a function cpu_set_ticks(int64_t ticks) to cpus.c.  The function
> does nothing if use_icount is true, otherwise it needs to have (roughly)
> the opposite logic compared to cpu_get_ticks.  You then call this
> function from x86_cpu_reset instead of setting env->tsc.  You can
> similarly call this function from kvm_get_msrs.
>
> - add a function kvm_set_ticks(int64_t ticks) to kvm-all.c and
> kvm-stub.c.  For kvm-all.c it calls kvm_arch_set_ticks(CPUState *cpu,
> int64_t ticks) in target-*/kvm.c.  The kvm_arch_set_tsc() function has a
> dummy implementation for all architectures except x86.  For x86 it calls
> KVM_SET_MSRS passing "ticks + env->tsc_offset".
>
> - call kvm_set_ticks() from cpu_set_ticks() and cpu_enable_ticks()
>
> Can you do this?

Can you pick my original fix first? I can do what you suggest in a follow-up
patch.
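
For reference, the cpu_set_ticks() idea quoted above ("roughly the
opposite logic compared to cpu_get_ticks") could be sketched as follows.
This is only a toy model, not QEMU code: TimersModel, host_ticks_model,
use_icount_model, model_get_ticks() and model_set_ticks() are all
hypothetical stand-ins for the state and helpers in cpus.c, and the
KVM side (kvm_set_ticks) is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the timer state kept in cpus.c. */
typedef struct {
    bool    ticks_enabled;  /* is the guest tick counter running? */
    int64_t ticks_offset;   /* guest ticks = host ticks + offset   */
} TimersModel;

static bool        use_icount_model;  /* stand-in for use_icount          */
static int64_t     host_ticks_model;  /* stand-in for the host counter    */
static TimersModel timers_model;

/* Model of the read side: frozen value while disabled, host + offset
 * while enabled. */
static int64_t model_get_ticks(void)
{
    if (!timers_model.ticks_enabled) {
        return timers_model.ticks_offset;
    }
    return host_ticks_model + timers_model.ticks_offset;
}

/* Model of the proposed write side: pick the offset that makes the next
 * read return 'ticks'. Does nothing in icount mode, as suggested. */
static void model_set_ticks(int64_t ticks)
{
    if (use_icount_model) {
        return;
    }
    if (!timers_model.ticks_enabled) {
        timers_model.ticks_offset = ticks;
    } else {
        timers_model.ticks_offset = ticks - host_ticks_model;
    }
}
```

With such a helper, a reset path would call the setter with 0 instead of
storing to a duplicated field, and later reads would count up from zero
as the host counter advances.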

Thanks,
Fernando

[-- Attachment #2: target-i386-clear-guest-TSC-on-reset-v2.patch --]
[-- Type: text/plain, Size: 3041 bytes --]

[PATCH v2] target-i386: clear guest TSC on reset

From: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>

VCPU TSC is not cleared by a warm reset (*), which leaves many Linux
guests vulnerable to the overflow in cyc2ns_offset fixed by upstream
commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix overflow
in cyc2ns_offset").

To put it in a nutshell, if a Linux guest without the patch above applied
has been up more than 208 days and attempts a warm reset, chances are that
the newly booted kernel will panic or hang.

(*) Intel Xeon E5 processors show the same broken behavior due to
    the errata "TSC is Not Affected by Warm Reset" (Intel® Xeon®
    Processor E5 Family Specification Update - August 2013): "The
    TSC (Time Stamp Counter MSR 10H) should be cleared on
    reset. Due to this erratum the TSC is not affected by warm
    reset."

Cc: Will Auld <will.auld@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---
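Note on the 208-day figure (not part of the commit message): a rough
sketch of where it comes from, assuming the affected kernels convert
cycles to nanoseconds as (cycles * scale) >> 10 in 64-bit arithmetic,
with scale chosen so that the product is about ns << 10. Under that
assumption the product stops fitting in 64 bits once the TSC represents
about 2^54 ns, regardless of CPU frequency. CYC2NS_SHIFT and both helper
names below are illustrative, not kernel symbols.

```c
#include <stdint.h>

/* Assumed to match the shift used by the kernel's cyc2ns conversion. */
#define CYC2NS_SHIFT 10

/* Uptime, in nanoseconds, at which (cycles * scale) reaches 2^64 when
 * the product is approximately ns << CYC2NS_SHIFT. */
static uint64_t cyc2ns_overflow_ns(void)
{
    return UINT64_C(1) << (64 - CYC2NS_SHIFT);   /* 2^54 ns */
}

/* Convert nanoseconds to days for a human-readable figure. */
static double ns_to_days(uint64_t ns)
{
    return (double)ns / 1e9 / 86400.0;
}
```

ns_to_days(cyc2ns_overflow_ns()) evaluates to roughly 208.5, which is
consistent with the "more than 208 days" wording above.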

diff -urNp qemu-orig/target-i386/cpu.c qemu/target-i386/cpu.c
--- qemu-orig/target-i386/cpu.c	2013-11-28 07:02:45.000000000 +0900
+++ qemu/target-i386/cpu.c	2013-12-05 21:45:19.980156320 +0900
@@ -2446,6 +2446,9 @@ static void x86_cpu_reset(CPUState *s)
     cpu_breakpoint_remove_all(env, BP_CPU);
     cpu_watchpoint_remove_all(env, BP_CPU);
 
+    env->tsc_adjust = 0;
+    env->tsc = 0;
+
 #if !defined(CONFIG_USER_ONLY)
     /* We hard-wire the BSP to the first CPU. */
     if (s->cpu_index == 0) {
diff -urNp qemu-orig/target-i386/kvm.c qemu/target-i386/kvm.c
--- qemu-orig/target-i386/kvm.c	2013-11-28 07:02:45.000000000 +0900
+++ qemu/target-i386/kvm.c	2013-12-05 21:45:28.900200552 +0900
@@ -1139,22 +1139,20 @@ static int kvm_put_msrs(X86CPU *cpu, int
         kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
     }
 #endif
-    if (level == KVM_PUT_FULL_STATE) {
+    /*
+     * The following MSRs have side effects on the guest or are too heavy
+     * for normal writeback. Limit them to reset or full state updates.
+     */
+    if (level >= KVM_PUT_RESET_STATE) {
         /*
          * KVM is yet unable to synchronize TSC values of multiple VCPUs on
          * writeback. Until this is fixed, we only write the offset to SMP
          * guests after migration, desynchronizing the VCPUs, but avoiding
          * huge jump-backs that would occur without any writeback at all.
          */
-        if (smp_cpus == 1 || env->tsc != 0) {
+        if (smp_cpus == 1 || env->tsc != 0 || level == KVM_PUT_RESET_STATE) {
             kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
         }
-    }
-    /*
-     * The following MSRs have side effects on the guest or are too heavy
-     * for normal writeback. Limit them to reset or full state updates.
-     */
-    if (level >= KVM_PUT_RESET_STATE) {
         kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
                           env->system_time_msr);
         kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
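
The effect of the kvm_put_msrs() hunk on the TSC writeback condition can
be modelled in isolation. The sketch below is not QEMU code: the PUT_*
constants are local stand-ins that only assume the ordering
runtime < reset < full, which the "level >= KVM_PUT_RESET_STATE" test in
the patch relies on, and both predicate names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the KVM_PUT_* levels; only the ordering matters. */
enum {
    PUT_RUNTIME_STATE = 1,
    PUT_RESET_STATE   = 2,
    PUT_FULL_STATE    = 3
};

/* Condition before the patch: TSC is written on full state updates only,
 * and for SMP guests only when env->tsc is non-zero. */
static bool tsc_written_before(int level, int smp_cpus, uint64_t tsc)
{
    return level == PUT_FULL_STATE && (smp_cpus == 1 || tsc != 0);
}

/* Condition after the patch: a reset always writes the TSC (now zeroed by
 * x86_cpu_reset), while full state updates keep the old SMP restriction. */
static bool tsc_written_after(int level, int smp_cpus, uint64_t tsc)
{
    return level >= PUT_RESET_STATE &&
           (smp_cpus == 1 || tsc != 0 || level == PUT_RESET_STATE);
}
```

The case that changes is a reset with tsc == 0: the old condition skips
the write for any number of VCPUs, the new one performs it. A full state
update of an SMP guest with tsc == 0 is still skipped, and runtime
writeback never touches the TSC in either version.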
