From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42114)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1YdHjr-00063V-GK for qemu-devel@nongnu.org;
    Wed, 01 Apr 2015 08:27:23 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1YdHjo-000097-VA for qemu-devel@nongnu.org;
    Wed, 01 Apr 2015 08:27:19 -0400
Received: from mail-la0-x22c.google.com ([2a00:1450:4010:c03::22c]:34142)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1YdHjo-00008O-FA for qemu-devel@nongnu.org;
    Wed, 01 Apr 2015 08:27:16 -0400
Received: by lagg8 with SMTP id g8so35256125lag.1 for ;
    Wed, 01 Apr 2015 05:27:15 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20150401114923.GH13271@potion.brq.redhat.com>
References: <20150330185634.GE13271@potion.brq.redhat.com>
    <20150331134512.GG13271@potion.brq.redhat.com>
    <20150331164539.GD14262@potion.brq.redhat.com>
    <20150401114923.GH13271@potion.brq.redhat.com>
From: Andrey Korolyov
Date: Wed, 1 Apr 2015 15:26:53 +0300
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] E5-2620v2 - emulation stop error
To: Radim Krčmář
Cc: "kvm@vger.kernel.org", "qemu-devel@nongnu.org",
    "Dr. David Alan Gilbert", Bandan Das, Kevin O'Connor,
    Gerd Hoffmann, Paolo Bonzini

On Wed, Apr 1, 2015 at 2:49 PM, Radim Krčmář wrote:
> 2015-03-31 21:23+0300, Andrey Korolyov:
>> On Tue, Mar 31, 2015 at 9:04 PM, Bandan Das wrote:
>> > Bandan Das writes:
>> >> Andrey Korolyov writes:
>> >> ...
>> >>> http://xdel.ru/downloads/kvm-e5v2-issue/another-tracepoint-fail-with-apicv.dat.gz
>> >>>
>> >>> Something a bit more interesting, but the mess is happening just
>> >>> *after* NMI firing.
>> >>
>> >> What happens if NMI is turned off on the host ?
>> >
>> > Sorry, I meant the watchdog..
>>
>> Thanks, everything goes well (as it probably should go there):
>> http://xdel.ru/downloads/kvm-e5v2-issue/apicv-enabled-nmi-disabled.dat.gz
>
> Nice revelation!
>
> KVM doesn't expect host's NMIs to look like this so it doesn't pass them
> to the host. What was the watchdog that casually sent NMIs?
> (It worked after "nmi_watchdog=0" on the host?)
>
> (Guest's NMI should have a different result as well. NMI_EXCEPTION is
> an expected exit reason for guest's hard exceptions, they are then
> differentiated by intr_info and nothing hinted that this was a NMI.)

Yes, I disabled the host watchdog at runtime. Indeed, a guest-induced NMI
would look different, and there was no reason for one to fire inside the
guest at this stage. I'd suspect hardware misbehavior on the hypervisor
here, but I have very little idea how APICv behavior (which is entirely
microcode- and CPU-dependent but decoupled from peripheral hardware) could
vary at this point. I am using microcode version 1.20140913.1 from Debian,
if that matters. I will send the trace Paolo suggested in the next couple
of hours.

It would also be great to ask hardware folks at Intel who could confirm or
refute the hypothesis above (I have been unable to reproduce the problem on
a 2603v2 so far, so it has some chance of being correct).
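
For reference, the intr_info distinction Radim mentions (NMI versus hard exception under the same exit reason) can be sketched roughly as below. This is only an illustration, assuming the VM-exit interruption-information layout from the Intel SDM (vector in bits 7:0, type in bits 10:8, valid in bit 31). The function and constant names are made up for this example and are not taken from the KVM sources or from the traces above.

```python
# Rough sketch: decode a VM-exit interruption-information value (intr_info).
# Assumed layout (Intel SDM): bits 7:0 vector, bits 10:8 type,
# bit 11 error-code valid, bit 31 valid. All names here are hypothetical.

VECTOR_MASK = 0xFF
TYPE_MASK = 0x700
TYPE_SHIFT = 8
ERROR_CODE_VALID = 1 << 11
VALID = 1 << 31

# Interruption types that can appear in this field.
TYPE_NAMES = {
    0: "external interrupt",
    2: "NMI",
    3: "hardware exception",
    6: "software exception",
}


def decode_intr_info(intr_info):
    """Split intr_info into its fields and return them as a dict."""
    type_code = (intr_info & TYPE_MASK) >> TYPE_SHIFT
    return {
        "valid": bool(intr_info & VALID),
        "vector": intr_info & VECTOR_MASK,
        "type": TYPE_NAMES.get(type_code, "reserved/other (%d)" % type_code),
        "error_code_valid": bool(intr_info & ERROR_CODE_VALID),
    }


def is_nmi(intr_info):
    """True only if the field is valid and its type is NMI."""
    info = decode_intr_info(intr_info)
    return info["valid"] and info["type"] == "NMI"


if __name__ == "__main__":
    # Example values chosen by hand, not taken from the traces.
    for value in (0x80000202, 0x80000B0E):
        print(hex(value), decode_intr_info(value))
```

With something like this, an exit whose intr_info decodes to type "NMI" could be told apart from one carrying a hardware exception such as a page fault (vector 14).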