From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xiao Guangrong
Subject: Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
Date: Tue, 13 Oct 2015 02:20:28 +0800
Message-ID: <561BF9EC.5060605@linux.intel.com>
References: <55FBDB6D.4040207@gmail.com> <55FBE248.4010809@redhat.com> <55FC4E6F.8030104@gmail.com> <55FF7095.5060106@linux.intel.com> <55FF7C41.7070400@linux.intel.com> <560D3F31.5000703@gmail.com> <560D40C2.5080205@redhat.com> <560E96D8.9080007@gmail.com> <56196FF1.8060902@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: edk2-devel@ml01.01.org
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm@vger.kernel.org
Return-path:
Received: from mga14.intel.com ([192.55.52.115]:34670 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751373AbbJLS0i (ORCPT ); Mon, 12 Oct 2015 14:26:38 -0400
In-Reply-To: <56196FF1.8060902@linux.intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 10/11/2015 04:07 AM, Xiao Guangrong wrote:
>
>
> On 10/02/2015 10:38 PM, Janusz wrote:
>> On 01.10.2015 at 16:18, Paolo Bonzini wrote:
>>>
>>> On 01/10/2015 16:12, Janusz wrote:
>>>> Now, I can also add, that the problem is only when I allow VM to use
>>>> more than one core, so with option for example:
>>>> -smp 8,cores=4,threads=2,sockets=1 and other combinations like -smp
>>>> 4,threads=1 its not working, and without it I am always running VM
>>>> without problems
>>>>
>>>> Any ideas what can it be? or any idea what would help to find out what
>>>> is causing this?
>>> I am going to send a revert of the patch tomorrow.
>>>
>>> Paolo
>> Thanks, but revert patch doesn't help, so something else is wrong here
>>
>
> It seems i can reproduce it now ... and finally i get little free time now :(
> I will dig into it and fix it asap.
>
> Thank you, Janusz and Paolo!

I think I have figured out the root cause. I got these traces:

<...>-47935 [052] d... 20017.763244: kvm_exit: reason EPT_VIOLATION rip 0xa0000 info 184 0
<...>-47935 [052] .... 20017.763244: kvm_page_fault: address a0000 error_code 184
<...>-47935 [052] .... 20017.763269: mark_mmio_spte: sptep:ffff880841c3d500 gfn a0 access 6 gen fff94
<...>-47935 [052] .... 20017.763272: kvm_mmu_pagetable_walk: addr a0000 pferr 10 F
<...>-47935 [052] .... 20017.763272: kvm_mmu_paging_element: pte bfeff023 level 4
<...>-47935 [052] .... 20017.763273: kvm_mmu_paging_element: pte bff00023 level 3
<...>-47935 [052] .... 20017.763273: kvm_mmu_paging_element: pte e3 level 2
<...>-47935 [052] .... 20017.763274: kvm_emulate_insn: 0:a0000: (prot32)
<...>-47935 [052] .... 20017.763274: kvm_emulate_insn: 0:a0000: (prot32) failed
<...>-

This tells me that the guest is executing at address 0xa0000, but that is
an MMIO address, so KVM cannot emulate the instruction and complains with
an internal error.

Actually, 0xa0000 belongs to SMRAM (0x30000 is the default SMBASE and
0x8000 is the EIP offset of the SMI handler entry; 0xa0000 is the start of
the legacy SMRAM window). However, from QEMU's dump:

EAX=bfefe000 EBX=00000002 ECX=00000000 EDX=00000600
ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0

we see that the VCPU is not in SMM.

I dropped some patches (the MTRR patches). With them dropped, the bug is
triggered less frequently, but it cannot be completely avoided :(

I think we need to check OVMF's code to see if there is a rare case where
the SMM handler is called but KVM has not received the SMI at that time...
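
For anyone who wants to scan more dumps for the same symptom, here is a minimal sketch of the check described above: EIP inside SMRAM while QEMU reports SMM=0. It is only an illustration, not part of KVM, QEMU or OVMF. The helper names are hypothetical, and the 0xa0000-0xbffff range is an assumption (the legacy SMRAM window); it does not cover TSEG or a relocated SMBASE.

```python
import re

# Assumption: legacy (compatible) SMRAM window, 0xa0000-0xbffff.
# TSEG and relocated SMBASE areas are deliberately not handled here.
LEGACY_SMRAM_START = 0xA0000
LEGACY_SMRAM_END = 0xBFFFF


def in_legacy_smram(addr):
    """Return True if addr lies inside the assumed legacy SMRAM window."""
    return LEGACY_SMRAM_START <= addr <= LEGACY_SMRAM_END


def parse_qemu_dump(dump):
    """Extract (EIP, SMM flag) from a QEMU register dump string."""
    eip = int(re.search(r"\bEIP=([0-9a-fA-F]+)", dump).group(1), 16)
    smm = int(re.search(r"\bSMM=(\d)", dump).group(1))
    return eip, smm


def executing_smram_outside_smm(dump):
    """Flag the symptom from this thread: EIP in SMRAM but SMM=0."""
    eip, smm = parse_qemu_dump(dump)
    return in_legacy_smram(eip) and smm == 0


# Register dump copied from the report above.
dump = (
    "EAX=bfefe000 EBX=00000002 ECX=00000000 EDX=00000600 "
    "ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000 "
    "EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0"
)

print(executing_smram_outside_smm(dump))
```

For the dump in this mail the check reports the mismatch, which matches what the traces show: the VCPU fetches from 0xa0000 while QEMU does not consider it to be in SMM.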