From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alan Jenkins
Subject: Re: [PATCH v2] KVM: x86: reset lapic_timer.expired_tscdeadline at SET_LAPIC time
Date: Mon, 20 Jun 2016 16:22:27 +0100
Message-ID: <0aba982a-5dcf-2338-69df-b8e0c968ecc6@gmail.com>
References: <20160617234126.GA24514@amt.cnet>
 <6bc78368-d559-fa09-7e77-389b0e87d695@gmail.com>
 <20160620130531.GA8139@amt.cnet>
In-Reply-To: <20160620130531.GA8139@amt.cnet>
Cc: kvm-devel, Paolo Bonzini
To: Marcelo Tosatti
Sender: kvm-owner@vger.kernel.org

On 20/06/16 14:05, Marcelo Tosatti wrote:
> Alan Jenkins reports hang at
> https://bugzilla.redhat.com/show_bug.cgi?id=1337667,
> due to guest TSC being set far behind than
> lapic_timer.expired_tscdeadline, when restoring VM state
> on top of currently active VM.
>
> It is not possible to disable LAPIC timer advancement
> (by setting lapic_timer.expired_tscdeadline = 0), at
> guest TSC write

I like that it acknowledges (though only implicitly) the guest can
trigger arbitrary lockups of host CPUs.

> because:
>
> * APIC write: expiration = 1000.
> * LAPIC tsc deadline code sets timer to 1000-30.
> * Timer fires at 970.
> * Guest writes TSC=w.
>
> Guest fails to VM-entry to process signal to perform
> "vmload" in userspace.
>
> Case 1: w > 970:
> Guest entry can be performed.
>
> Case 2: w < 970:
> Guest entry should not be performed because "An interrupt is generated
> when the logical processor’s time-stamp counter equals or exceeds the
> target value in the IA32_TSC_DEADLINE MSR."
>
> In case 2, hardware would not fire an interrupt.
>
> To fix the problem, disable timer advancement when
> userspace sets the LAPIC state. Setting of APIC
> resets all APIC state, including
> any pending interrupt.
>
> Signed-off-by: Marcelo Tosatti
> Reported-by: Alan Jenkins

However, I feel this doesn't admit (even implicitly) that host software
(not necessarily root) can still hard-lockup the CPU. It depends on the
sequence of operations, and the message doesn't show that sequence
explicitly. I now understand what the sequence that _is_ in the message
shows, but it's unfortunately distracting.

I.e. the lockup can still happen if you restore the LAPIC first (or omit
to do so at all), then restore the TSC deadline MSR, then the TSC MSR.

The patch assumes that the LAPIC is restored after the MSRs so it can
clear the incorrect value of expired_tscdeadline, right?

I didn't know whether this patch would work until I tested it, because I
didn't try to nail down the exact sequence QEMU is using.

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ea306ad..89be6e9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2991,6 +2991,7 @@ static int kvm_vcpu_ioctl_set_lapic(struct kvm_vcpu *vcpu,
>  {
>  	kvm_apic_post_state_restore(vcpu, s);
>  	update_cr8_intercept(vcpu);
> +	vcpu->arch.apic->lapic_timer.expired_tscdeadline = 0;
>
>  	return 0;
>  }
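
To make the ordering dependence concrete, here is a toy userspace model of
the two restore sequences discussed above. It is only a sketch, not kernel
code: struct toy_vcpu and every toy_* function are hypothetical names, the
"timer fired" step is an assumed simplification of the deadline MSR being
armed and expiring, and the values 1000 and 10 are illustrative. The only
thing taken from the patch is that SET_LAPIC clears expired_tscdeadline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the per-vCPU timer state; NOT the kernel's struct. */
struct toy_vcpu {
	uint64_t guest_tsc;           /* guest TSC as last written */
	uint64_t expired_tscdeadline; /* deadline believed expired, 0 = none */
};

/* Model of SET_LAPIC with the patch applied: forget any expired deadline. */
static void toy_set_lapic(struct toy_vcpu *v)
{
	assert(v != NULL);
	v->expired_tscdeadline = 0;
}

/* Assumed simplification: the deadline timer fired for this deadline. */
static void toy_timer_fired(struct toy_vcpu *v, uint64_t deadline)
{
	assert(v != NULL);
	v->expired_tscdeadline = deadline;
}

/* Model of a TSC MSR write, e.g. restoring an older snapshot. */
static void toy_set_tsc(struct toy_vcpu *v, uint64_t w)
{
	assert(v != NULL);
	v->guest_tsc = w;
}

/* Cycles the entry path would spin until guest TSC reaches the deadline. */
static uint64_t toy_entry_spin(const struct toy_vcpu *v)
{
	if (v->expired_tscdeadline == 0)
		return 0; /* nothing pending, no wait */
	if (v->guest_tsc >= v->expired_tscdeadline)
		return 0; /* deadline already reached */
	return v->expired_tscdeadline - v->guest_tsc;
}

/* MSRs first, LAPIC last: the stale deadline is cleared before entry. */
uint64_t toy_spin_lapic_last(void)
{
	struct toy_vcpu v = { .guest_tsc = 970, .expired_tscdeadline = 0 };

	toy_timer_fired(&v, 1000); /* stale state from the running VM */
	toy_set_tsc(&v, 10);       /* TSC restored far behind the deadline */
	toy_set_lapic(&v);         /* patch: clears expired_tscdeadline */
	return toy_entry_spin(&v);
}

/* LAPIC first, MSRs afterwards: nothing clears the deadline again. */
uint64_t toy_spin_lapic_first(void)
{
	struct toy_vcpu v = { .guest_tsc = 970, .expired_tscdeadline = 0 };

	toy_set_lapic(&v);         /* clears, but too early to help */
	toy_timer_fired(&v, 1000); /* deadline MSR restored and expires */
	toy_set_tsc(&v, 10);       /* TSC restored far behind the deadline */
	return toy_entry_spin(&v);
}
```

In this model the LAPIC-last ordering yields no wait, while the LAPIC-first
ordering leaves a wait proportional to how far the TSC was moved back, which
is unbounded in practice because the TSC write is under userspace control.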