From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity
Subject: Re: Questions on the VMentry failure patch
Date: Mon, 14 Jul 2008 19:12:38 +0300
Message-ID: <487B7AF6.2060607@qumranet.com>
References: <52d4a3890807070707n4e0039ccgc07aa0fa3ab28d8e@mail.gmail.com>
 <48722720.7050409@qumranet.com>
 <52d4a3890807070744i66a9db56r787eecc62081c8e8@mail.gmail.com>
 <48722D99.2030009@qumranet.com>
 <4872363F.5010103@codemonkey.ws>
 <52d4a3890807071752g5675558el38bded8bd475c68a@mail.gmail.com>
 <52d4a3890807091056j1ff4db6fo16cf364dfa8a36de@mail.gmail.com>
 <52d4a3890807100648n2909eda1h1aeb993ae00aaa18@mail.gmail.com>
 <52d4a3890807140910v2157fc14p397dd78cc949dc5b@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Anthony Liguori , kvm@vger.kernel.org, Rik van Riel , Guillaume Thouvenin
To: Mohammed Gamal
Return-path:
Received: from il.qumranet.com ([212.179.150.194]:59219 "EHLO il.qumranet.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753491AbYGNQMk
 (ORCPT ); Mon, 14 Jul 2008 12:12:40 -0400
In-Reply-To: <52d4a3890807140910v2157fc14p397dd78cc949dc5b@mail.gmail.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Mohammed Gamal wrote:
> On Thu, Jul 10, 2008 at 4:48 PM, Mohammed Gamal wrote:
>
>>>> It's true indeed, the patch did increase the likelihood of the
>>>> problem with me (although it occurs every few runs). I modified
>>>> invalid_guest_state() to call kvm_report_emulation_failure() in all
>>>> cases and I noticed that whenever the crash happens it happens here:
>>>>
>>>> rip 6e10 66 b8 20 00
>>>>
>>>> It's too late at night here, so I'll not lookup the opcode map now :)
>>>> . I'll further look into it later.
>>>>
>>>>
>>> Another thing, I tried -no-kvm-pit switch and it tremendously increase
>>> the likelihood of the crash to almost a 100%.
>>>
>>>
>> I updated to the latest kvm-userspace git tree, and now the failure is
>> happening at completely random instructions whether or not we are
>> using -no-kvm-pit.
>>
>>
>
> I didn't have the gfxboot source code in hand, but now that I've got
> it. It clears out that the failure always occurs in the
> switch_to_pm_20 routine. However, the failure doesn't happen at one
> particular instruction, but either doesn't happen at all or happens at
> any instruction between addresses 6e10 and 6e27.
>
> I'm suspecting it might be some kind of a race condition, although I
> don't see where in the code - kernel side to specific - that this race
> exactly might occur. Maybe the locking changes in the userspace side
> helped some underlying issue to come up to the surface just like what
> happened with FreeDOS. I'll look further into it, any
> pointers/help/suggestions are appreciated.
>

I suspected an interrupt, which fits the scenario you describe.
Although Anthony tested this and found out interrupts were not involved,
IIRC.

-- 
error compiling committee.c: too many arguments to function