From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kevin O'Connor
Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
Date: Fri, 18 Dec 2015 18:13:26 -0500
Message-ID: <20151218231326.GA4138@morn.lan>
References: <563955D4.7080000@huawei.com> <20151104174201.GA17784@morn.lan> <8E78D212B8C25246BE4CE7EA0E645FE52977E8@SZXEMI504-MBS.china.huawei.com> <20151109133253.GA1790@morn.lan> <20151109200618.GA29129@morn.lan> <20151109202726.GA31490@morn.lan> <8E78D212B8C25246BE4CE7EA0E645FE52B5BE3@SZXEMI504-MBS.china.huawei.com> <8E78D212B8C25246BE4CE7EA0E645FE52B72B7@SZXEMI504-MBS.china.huawei.com> <20151119134039.GA27717@morn.lan> <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com>
In-Reply-To: <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
To: "Gonglei (Arei)"
Cc: "Xulei (Stone)", Paolo Bonzini, qemu-devel, "seabios@seabios.org", "Huangweidong (C)", "kvm@vger.kernel.org"
Sender: kvm-owner@vger.kernel.org

On Fri, Dec 18, 2015 at 03:04:58AM +0000, Gonglei (Arei) wrote:
> Hi Kevin & Paolo,
>
> Luckily, I reproduced this problem last night. And I got the below log when SeaBIOS is stuck.
[...]
> [2015-12-18 10:38:10] gonglei: finish while
[...]
> <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
> <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
> <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
> <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0

This is an odd finding.
It seems to indicate that the code is caught in an infinite irq loop once irqs are enabled. What doesn't make sense is that an NMI shouldn't depend on the cpu irq enable flag. Also, I can't explain why rip would be 0x03, nor why a #UD in an exception handler wouldn't result in a triple fault. Maybe someone with more kvm knowledge could help here.

I did notice that you appear to be running with SeaBIOS v1.8.1 - I recommend you upgrade to the latest. There were two important fixes in this area (8b9942fa and 3156b71a). I don't think either of these fixes would explain the log above, but it would be best to eliminate the possibility.

-Kevin
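
For anyone reading the quoted trace, here is a minimal sketch of how the last value on the kvm_exit line could be decoded. It assumes that the second "info" value (80000306) is the VMX VM-exit interruption-information field, which the thread does not confirm. The bit layout follows the Intel SDM description of that field. The function name `decode_intr_info` and the lookup tables are made up for illustration and are not part of KVM or SeaBIOS.

```python
# Hypothetical helper, not part of KVM or SeaBIOS.
# Assumed layout (Intel SDM, VM-exit interruption information):
#   bits 7:0  vector
#   bits 10:8 interruption type
#   bit 11    error code valid
#   bit 31    field valid
INTR_TYPES = {
    0: "external interrupt",
    2: "NMI",
    3: "hardware exception",
    6: "software exception",
}
VECTOR_NAMES = {2: "NMI", 6: "#UD", 8: "#DF", 13: "#GP", 14: "#PF"}


def decode_intr_info(value):
    """Split a 32-bit interruption-information value into its fields."""
    vector = value & 0xFF
    intr_type = (value >> 8) & 0x7
    return {
        "valid": bool(value & (1 << 31)),
        "type": INTR_TYPES.get(intr_type, "other (%d)" % intr_type),
        "vector": vector,
        "vector_name": VECTOR_NAMES.get(vector, "vector %d" % vector),
        "error_code_valid": bool(value & (1 << 11)),
    }


# Value taken from the quoted kvm_exit line.
info = decode_intr_info(0x80000306)
print(info)
```

If that assumption holds, the value decodes as a valid hardware exception with vector 6 (#UD). The EXCEPTION_NMI exit reason covers exceptions as well as NMIs, so the exit may be an intercepted #UD rather than an actual NMI. This does not explain why rip is 0x3.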