From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fernando Luis Vázquez Cao
Subject: [PATCH] target-i386: clear guest TSC on reset
Date: Thu, 05 Dec 2013 15:15:04 +0900
Message-ID: <1386224104.3091.3.camel@nexus>
References: <1386054500.25757.10.camel@nexus> <529D90A6.2080801@lab.ntt.co.jp> <52A0186A.2050207@lab.ntt.co.jp>
Cc: Will Auld, qemu-devel@nongnu.org, kvm@vger.kernel.org, Marcelo Tosatti
To: Paolo Bonzini, Gleb Natapov
In-Reply-To: <52A0186A.2050207@lab.ntt.co.jp>

VCPU TSC is not cleared by a warm reset (*), which leaves many Linux
guests vulnerable to the overflow in cyc2ns_offset fixed by upstream
commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix
overflow in cyc2ns_offset").

In a nutshell, if a Linux guest without the patch above applied has
been up for more than 208 days and attempts a warm reset, chances are
that the newly booted kernel will panic or hang.

(*) Intel Xeon E5 processors show the same broken behavior due to the
    erratum "TSC is Not Affected by Warm Reset" (Intel® Xeon® Processor
    E5 Family Specification Update - August 2013): "The TSC (Time Stamp
    Counter MSR 10H) should be cleared on reset. Due to this erratum
    the TSC is not affected by warm reset."

Cc: stable@vger.kernel.org
Cc: Will Auld
Cc: Marcelo Tosatti
Signed-off-by: Fernando Luis Vazquez Cao
---

--- qemu-orig/target-i386/kvm.c	2013-11-28 07:02:45.000000000 +0900
+++ qemu/target-i386/kvm.c	2013-12-05 14:47:03.085738175 +0900
@@ -1125,6 +1125,8 @@ static int kvm_put_msrs(X86CPU *cpu, int
         kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
     }
     if (has_msr_tsc_adjust) {
+        if (level == KVM_PUT_RESET_STATE)
+            env->tsc_adjust = 0;
         kvm_msr_entry_set(&msrs[n++], MSR_TSC_ADJUST, env->tsc_adjust);
     }
     if (has_msr_misc_enable) {
@@ -1139,22 +1141,22 @@ static int kvm_put_msrs(X86CPU *cpu, int
         kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
     }
 #endif
-    if (level == KVM_PUT_FULL_STATE) {
+    /*
+     * The following MSRs have side effects on the guest or are too heavy
+     * for normal writeback. Limit them to reset or full state updates.
+     */
+    if (level >= KVM_PUT_RESET_STATE) {
+        if (level == KVM_PUT_RESET_STATE)
+            env->tsc = 0;
         /*
          * KVM is yet unable to synchronize TSC values of multiple VCPUs on
          * writeback. Until this is fixed, we only write the offset to SMP
          * guests after migration, desynchronizing the VCPUs, but avoiding
          * huge jump-backs that would occur without any writeback at all.
          */
-        if (smp_cpus == 1 || env->tsc != 0) {
+        if (smp_cpus == 1 || env->tsc != 0 || level == KVM_PUT_RESET_STATE) {
             kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
         }
-    }
-    /*
-     * The following MSRs have side effects on the guest or are too heavy
-     * for normal writeback. Limit them to reset or full state updates.
-     */
-    if (level >= KVM_PUT_RESET_STATE) {
         kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
                           env->system_time_msr);
         kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);