From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:34080) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Voqqa-0007pf-U8 for qemu-devel@nongnu.org; Fri, 06 Dec 2013 03:33:23 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VoqqT-0004lz-Bt for qemu-devel@nongnu.org; Fri, 06 Dec 2013 03:33:16 -0500
Received: from tama50.ecl.ntt.co.jp ([129.60.39.147]:58621) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VoqqS-0004iW-TC for qemu-devel@nongnu.org; Fri, 06 Dec 2013 03:33:09 -0500
Message-ID: <1386318781.20106.3.camel@nexus>
From: Fernando Luis Vázquez Cao
Date: Fri, 06 Dec 2013 17:33:01 +0900
In-Reply-To: <52A189B2.4060305@lab.ntt.co.jp>
References: <1386054500.25757.10.camel@nexus> <529D90A6.2080801@lab.ntt.co.jp> <52A0186A.2050207@lab.ntt.co.jp> <1386224104.3091.3.camel@nexus> <52A04732.4040105@redhat.com> <52A07C5A.9090105@lab.ntt.co.jp> <52A08541.6090702@redhat.com> <52A09EF4.5080800@lab.ntt.co.jp> <20131205161707.GB17277@amt.cnet> <52A0AC09.4090202@redhat.com> <52A189B2.4060305@lab.ntt.co.jp>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Subject: [Qemu-devel] [PATCH 1//2 v3] target-i386: clear guest TSC on reset
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Paolo Bonzini
Cc: Gleb Natapov , Will Auld , Marcelo Tosatti , qemu-devel@nongnu.org, kvm@vger.kernel.org

VCPU TSC is not cleared by a warm reset (*), which leaves some types of
Linux guests (non-pvops guests and those with the kernel parameter
no-kvmclock set) vulnerable to the overflow in cyc2ns_offset fixed by
upstream commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix
overflow in cyc2ns_offset").

To put it in a nutshell, if such a Linux guest without the patch above
applied has been up more than 208 days and attempts a warm reset chances
are that the newly booted kernel will panic or hang.

(*) Intel Xeon E5 processors show the same broken behavior due to
    the errata "TSC is Not Affected by Warm Reset" (Intel® Xeon®
    Processor E5 Family Specification Update - August 2013): "The TSC
    (Time Stamp Counter MSR 10H) should be cleared on reset. Due to this
    erratum the TSC is not affected by warm reset."

Cc: Will Auld
Cc: Marcelo Tosatti
Signed-off-by: Fernando Luis Vazquez Cao
---

diff -urNp qemu-orig/target-i386/cpu.c qemu/target-i386/cpu.c
--- qemu-orig/target-i386/cpu.c	2013-11-28 07:02:45.000000000 +0900
+++ qemu/target-i386/cpu.c	2013-12-05 21:45:19.980156320 +0900
@@ -2446,6 +2446,9 @@ static void x86_cpu_reset(CPUState *s)
     cpu_breakpoint_remove_all(env, BP_CPU);
     cpu_watchpoint_remove_all(env, BP_CPU);
 
+    env->tsc_adjust = 0;
+    env->tsc = 0;
+
 #if !defined(CONFIG_USER_ONLY)
     /* We hard-wire the BSP to the first CPU. */
     if (s->cpu_index == 0) {
diff -urNp qemu-orig/target-i386/kvm.c qemu/target-i386/kvm.c
--- qemu-orig/target-i386/kvm.c	2013-11-28 07:02:45.000000000 +0900
+++ qemu/target-i386/kvm.c	2013-12-05 21:45:28.900200552 +0900
@@ -1139,22 +1139,20 @@ static int kvm_put_msrs(X86CPU *cpu, int
         kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
     }
 #endif
-    if (level == KVM_PUT_FULL_STATE) {
+    /*
+     * The following MSRs have side effects on the guest or are too heavy
+     * for normal writeback. Limit them to reset or full state updates.
+     */
+    if (level >= KVM_PUT_RESET_STATE) {
         /*
          * KVM is yet unable to synchronize TSC values of multiple VCPUs on
          * writeback. Until this is fixed, we only write the offset to SMP
          * guests after migration, desynchronizing the VCPUs, but avoiding
          * huge jump-backs that would occur without any writeback at all.
          */
-        if (smp_cpus == 1 || env->tsc != 0) {
+        if (smp_cpus == 1 || env->tsc != 0 || level == KVM_PUT_RESET_STATE) {
             kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
         }
-    }
-    /*
-     * The following MSRs have side effects on the guest or are too heavy
-     * for normal writeback. Limit them to reset or full state updates.
-     */
-    if (level >= KVM_PUT_RESET_STATE) {
         kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
                           env->system_time_msr);
         kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);