From mboxrd@z Thu Jan  1 00:00:00 1970
From: Kevin O'Connor <kevin@koconnor.net>
Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
 problem on qemu-kvm platform
Date: Fri, 18 Dec 2015 18:13:26 -0500
Message-ID: <20151218231326.GA4138@morn.lan>
References: <563955D4.7080000@huawei.com>
 <20151104174201.GA17784@morn.lan>
 <8E78D212B8C25246BE4CE7EA0E645FE52977E8@SZXEMI504-MBS.china.huawei.com>
 <20151109133253.GA1790@morn.lan>
 <20151109200618.GA29129@morn.lan>
 <20151109202726.GA31490@morn.lan>
 <8E78D212B8C25246BE4CE7EA0E645FE52B5BE3@SZXEMI504-MBS.china.huawei.com>
 <8E78D212B8C25246BE4CE7EA0E645FE52B72B7@SZXEMI504-MBS.china.huawei.com>
 <20151119134039.GA27717@morn.lan>
 <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: "Xulei (Stone)" <stone.xulei@huawei.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	"seabios@seabios.org" <seabios@seabios.org>,
	"Huangweidong (C)" <weidong.huang@huawei.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
To: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail-qg0-f46.google.com ([209.85.192.46]:34199 "EHLO
	mail-qg0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751674AbbLRXNd (ORCPT <rfc822;kvm@vger.kernel.org>);
	Fri, 18 Dec 2015 18:13:33 -0500
Received: by mail-qg0-f46.google.com with SMTP id p62so15022890qge.1
        for <kvm@vger.kernel.org>; Fri, 18 Dec 2015 15:13:33 -0800 (PST)
Content-Disposition: inline
In-Reply-To: <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Fri, Dec 18, 2015 at 03:04:58AM +0000, Gonglei (Arei) wrote:
> Hi Kevin & Paolo,
> 
> Luckily, I reproduced this problem last night. And I got the below log when SeaBIOS is stuck.
[...]
> [2015-12-18 10:38:10]  gonglei: finish while   
[...]
> <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
> <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
> <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
> <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0

This is an odd finding.  It seems to indicate that the code is caught
in an infinite irq loop once irqs are enabled.  That doesn't make
sense, though, because an NMI shouldn't depend on the cpu irq enable
flag.  Also, I can't explain why rip would be 0x03, nor why a #UD in
an exception handler wouldn't result in a triple fault.  Maybe someone
with more kvm knowledge could help here.
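
For reference, here is a minimal sketch of decoding the last value on
the kvm_exit line.  It assumes that value is the VMX VM-exit
interruption-information field, with the bit layout from the Intel
SDM.  decode_intr_info is a hypothetical helper written for
illustration, not a KVM or SeaBIOS function.

```python
# Hypothetical helper (not part of KVM or SeaBIOS).  Assumes the value
# is a VMX VM-exit interruption-information field: bits 7:0 vector,
# bits 10:8 type, bit 11 error-code-valid, bit 31 valid.
INTR_TYPES = {
    0: "external interrupt",
    2: "NMI",
    3: "hardware exception",
    6: "software exception",
}


def decode_intr_info(value):
    """Split an interruption-information value into its fields."""
    return {
        "valid": bool((value >> 31) & 1),
        "vector": value & 0xFF,
        "type": INTR_TYPES.get((value >> 8) & 0x7, "other"),
        "error_code_valid": bool((value >> 11) & 1),
    }


# Second "info" value from the kvm_exit line in the quoted trace.
info = decode_intr_info(0x80000306)
print(info)
```

Under that assumption, 0x80000306 decodes as a valid hardware
exception with vector 6 (#UD) rather than an NMI, which would be
consistent with the #UD on the kvm_inj_exception line.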

I did notice that you appear to be running with SeaBIOS v1.8.1 - I
recommend you upgrade to the latest.  There were two important fixes
in this area (8b9942fa and 3156b71a).  I don't think either of these
fixes would explain the log above, but it would be best to eliminate
the possibility.

-Kevin
