From: Marcelo Tosatti <mtosatti@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Gleb Natapov" <gleb@kernel.org>,
"Will Auld" <will.auld@intel.com>,
qemu-devel@nongnu.org, kvm@vger.kernel.org,
"Fernando Luis Vázquez Cao" <fernando_b1@lab.ntt.co.jp>
Subject: Re: [Qemu-devel] [PATCH] target-i386: clear guest TSC on reset
Date: Thu, 5 Dec 2013 14:12:34 -0200
Message-ID: <20131205161234.GA17277@amt.cnet>
In-Reply-To: <52A04732.4040105@redhat.com>
On Thu, Dec 05, 2013 at 10:28:18AM +0100, Paolo Bonzini wrote:
> On 05/12/2013 07:15, Fernando Luis Vázquez Cao wrote:
> > VCPU TSC is not cleared by a warm reset (*), which leaves many Linux
> > guests vulnerable to the overflow in cyc2ns_offset fixed by upstream
> > commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix overflow
> > in cyc2ns_offset").
> >
> > In a nutshell, if a Linux guest without the patch above applied has
> > been up for more than 208 days and attempts a warm reset, chances are
> > that the newly booted kernel will panic or hang.
> >
> > (*) Intel Xeon E5 processors show the same broken behavior due to
> > the erratum "TSC is Not Affected by Warm Reset" (Intel® Xeon®
> > Processor E5 Family Specification Update - August 2013): "The
> > TSC (Time Stamp Counter MSR 10H) should be cleared on
> > reset. Due to this erratum the TSC is not affected by warm
> > reset."
> >
> > Cc: stable@vger.kernel.org
> > Cc: Will Auld <will.auld@intel.com>
> > Cc: Marcelo Tosatti <mtosatti@redhat.com>
> > Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
>
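For reference, the 208-day figure can be reproduced with a little arithmetic.
This is only a sketch under stated assumptions: it assumes the pre-fix kernel
converts cycles to nanoseconds as (cyc * scale) >> 10 in 64-bit arithmetic,
with scale = (10^6 << 10) / cpu_khz. The function names below are
illustrative and are not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shift of the old x86 cycles-to-ns conversion (10 bits). */
#define SCALE_SHIFT 10

/* scale = (10^6 << SCALE_SHIFT) / cpu_khz; cpu_khz is the TSC rate in kHz. */
static uint64_t cyc2ns_scale(uint64_t cpu_khz)
{
    assert(cpu_khz > 0);
    return (UINT64_C(1000000) << SCALE_SHIFT) / cpu_khz;
}

/* Largest TSC value for which cyc * scale still fits in 64 bits. */
static uint64_t max_safe_cycles(uint64_t cpu_khz)
{
    return UINT64_MAX / cyc2ns_scale(cpu_khz);
}

/* Whole days of uptime after which cyc * scale no longer fits in 64 bits. */
static uint64_t overflow_days(uint64_t cpu_khz)
{
    uint64_t seconds = max_safe_cycles(cpu_khz) / (cpu_khz * 1000);
    return seconds / 86400;
}
```

With a 2 GHz TSC (cpu_khz = 2000000) the scale is exactly 512 and
overflow_days() evaluates to 208. The result barely depends on the clock
rate, because the limit is essentially 2^54 ns of accumulated time.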
> I agree that the bug is in QEMU. One small nit in your patch is that
> you should reset env->tsc_adjust and env->tsc in x86_cpu_reset. This
> would already be pretty good.
>
> However, a bigger problem is that env->tsc is a useless duplicate of
> "cpu_get_ticks() + env->tsc_adjust". It would be nice to drop env->tsc
> completely except for migration backwards compatibility. Thus you can:
>
> - fill in env->tsc as mentioned above from target-i386/machine.c's
> cpu_pre_save function. This guarantees backwards compatibility.
>
> - add a function cpu_set_ticks(int64_t ticks) to cpus.c. The function
> does nothing if use_icount is true, otherwise it needs to have (roughly)
> the opposite logic compared to cpu_get_ticks. You then call this
> function from x86_cpu_reset instead of setting env->tsc. You can
> similarly call this function from kvm_get_msrs.
>
> - add a function kvm_set_ticks(int64_t ticks) to kvm-all.c and
> kvm-stub.c. For kvm-all.c it calls kvm_arch_set_ticks(CPUState *cpu,
> int64_t ticks) in target-*/kvm.c. The kvm_arch_set_ticks() function has a
> dummy implementation for all architectures except x86. For x86 it calls
> KVM_SET_MSRS passing "ticks + env->tsc_offset".
>
> - call kvm_set_ticks() from cpu_set_ticks() and cpu_enable_ticks()
env->tsc is just a placeholder for the vcpu TSC.
From QEMU's point of view, a vcpu's TSC is a register initialized to zero
that has to be read from and written to KVM, and migrated.
I am not sure what the point of your idea is.
>
> Can you do this?
>
> Thanks,
>
> Paolo
>
> > ---
> >
> > --- qemu-orig/target-i386/kvm.c 2013-11-28 07:02:45.000000000 +0900
> > +++ qemu/target-i386/kvm.c 2013-12-05 14:47:03.085738175 +0900
> > @@ -1125,6 +1125,8 @@ static int kvm_put_msrs(X86CPU *cpu, int
> >          kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
> >      }
> >      if (has_msr_tsc_adjust) {
> > +        if (level == KVM_PUT_RESET_STATE)
> > +            env->tsc_adjust = 0;
> >          kvm_msr_entry_set(&msrs[n++], MSR_TSC_ADJUST, env->tsc_adjust);
> >      }
> >      if (has_msr_misc_enable) {
> > @@ -1139,22 +1141,22 @@ static int kvm_put_msrs(X86CPU *cpu, int
> >          kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
> >      }
> >  #endif
> > -    if (level == KVM_PUT_FULL_STATE) {
> > +    /*
> > +     * The following MSRs have side effects on the guest or are too heavy
> > +     * for normal writeback. Limit them to reset or full state updates.
> > +     */
> > +    if (level >= KVM_PUT_RESET_STATE) {
> > +        if (level == KVM_PUT_RESET_STATE)
> > +            env->tsc = 0;
> >          /*
> >           * KVM is yet unable to synchronize TSC values of multiple VCPUs on
> >           * writeback. Until this is fixed, we only write the offset to SMP
> >           * guests after migration, desynchronizing the VCPUs, but avoiding
> >           * huge jump-backs that would occur without any writeback at all.
> >           */
> > -        if (smp_cpus == 1 || env->tsc != 0) {
> > +        if (smp_cpus == 1 || env->tsc != 0 || level == KVM_PUT_RESET_STATE) {
> >              kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
> >          }
> > -    }
> > -    /*
> > -     * The following MSRs have side effects on the guest or are too heavy
> > -     * for normal writeback. Limit them to reset or full state updates.
> > -     */
> > -    if (level >= KVM_PUT_RESET_STATE) {
> >          kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
> >                            env->system_time_msr);
> >          kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
> >
> >