From mboxrd@z Thu Jan 1 00:00:00 1970
From: Janusz
Subject: Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
Date: Thu, 15 Oct 2015 09:21:33 +0200
Message-ID: <561F53FD.7070303@gmail.com>
References: <55FBDB6D.4040207@gmail.com> <55FBE248.4010809@redhat.com>
 <55FC4E6F.8030104@gmail.com> <55FF7095.5060106@linux.intel.com>
 <55FF7C41.7070400@linux.intel.com> <560D3F31.5000703@gmail.com>
 <560D40C2.5080205@redhat.com> <560E96D8.9080007@gmail.com>
 <561DD2EC.5040800@linux.intel.com> <561E0655.8080508@gmail.com>
 <561E1121.7030502@linux.intel.com> <561E1329.5080109@linux.intel.com>
 <561E9A36.3080302@gmail.com> <561F2952.5060300@linux.intel.com>
 <561F4589.5050609@gmail.com> <561F4AAE.3060204@linux.intel.com>
 <561F4E92.3090403@gmail.com> <561F516D.7070504@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: edk2-devel@ml01.01.org, Alex Williamson
To: Xiao Guangrong, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm@vger.kernel.org
In-Reply-To: <561F516D.7070504@linux.intel.com>
Errors-To: edk2-devel-bounces@lists.01.org
Sender: "edk2-devel"
List-Id: kvm.vger.kernel.org

On 15.10.2015 at 09:10, Xiao Guangrong wrote:
>
>
> On 10/15/2015 02:58 PM, Janusz wrote:
>> On 15.10.2015 at 08:41, Xiao Guangrong wrote:
>>>
>>>
>>> On 10/15/2015 02:19 PM, Janusz wrote:
>>>> On 15.10.2015 at 06:19, Xiao Guangrong wrote:
>>>>>
>>>>>
>>>>>
>>>>> Well, the bug may not be in KVM.
>>>>> When this bug happened, I saw OVMF
>>>>> only checked 1 CPU out, there is the log from OVMF's debug input:
>>>>>
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCDs
>>>>> Detect CPU count: 1
>>>>>
>>>>> So the startup code has been freed while the APs are still
>>>>> running;
>>>>> I think that is why we saw the vCPUs executing at unexpected addresses.
>>>>>
>>>>> After digging into OVMF's code, I noticed that the BSP waits for the APs
>>>>> for a fixed timer period. However, recent KVM changes require zapping all
>>>>> mappings if CR0.CD is changed, which means the APs need more time to
>>>>> start up.
>>>>>
>>>>> After the following change to OVMF, the bug is completely gone on my
>>>>> side:
>>>>>
>>>>> --- a/UefiCpuPkg/CpuDxe/ApStartup.c
>>>>> +++ b/UefiCpuPkg/CpuDxe/ApStartup.c
>>>>> @@ -454,7 +454,9 @@ StartApsStackless (
>>>>>    //
>>>>>    // Wait 100 milliseconds for APs to arrive at the ApEntryPoint
>>>>> routine
>>>>>    //
>>>>> -  MicroSecondDelay (100 * 1000);
>>>>> +  MicroSecondDelay (10 * 100 * 1000);
>>>>>
>>>>>    return EFI_SUCCESS;
>>>>>  }
>>>>>
>>>>> Janusz, could you please check this instead? You can switch to your
>>>>> previous kernel to do this test.
>>>>>
>>>>>
>>>> Ok, now the first time I started the VM I was able to start the system
>>>> successfully. When I turned it off and started it again, it
>>>> restarted my
>>>> VM at system boot a couple of times. Sometimes I also get very high CPU
>>>> usage for no reason. Also, I get fewer fps in GTA 5 than on kernel
>>>> 4.1, I
>>>> get something like 30-55, but on 4.1 I get 60 fps all the time.
>>>> This is
>>>> my new log: https://bpaste.net/show/61a122ad7fe5
>>>>
>>>
>>> Just to confirm: the QEMU internal error did not appear any more, right?
>> Yes, when I reverted your first patch, switched to -vga std from -vga
>> none and didn't pass through my GPU (the case where I got this internal
>> error), the VM started without problems. I didn't even get any VM restarts
>> like with passthrough.
>>
>
> Wow, it seems we have fixed the QEMU internal error now. :)
>
> Recently, Paolo has reverted some MTRR patches, was your test
> based on these reverted patches?
>
> The GPU passthrough issue may be related to vfio (not sure), Alex, do
> you have any idea?
>
> Laszlo, could you please check whether the root cause is reasonable and fix it in
> OVMF if it's right?
>
> BTW, OVMF handles #UD with no trace - nothing is killed, and no call
> trace
> in the debug input...
>
Yes, the reverted MTRR code is already in the kernel I use - 4.3-r5+
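
For reference, the fixed-delay problem from Xiao's quoted analysis can be sketched in plain C. This is only an illustrative simulation under my own assumptions, not OVMF code: `sim_platform`, `sim_delay_us`, `sim_arrived`, `wait_fixed` and `wait_polling` are hypothetical names, and the AP arrival times a caller passes in are made up. It compares a single fixed wait (as in the quoted hunk) with polling up to a timeout, which is one possible alternative to simply enlarging the delay.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated platform (hypothetical): each AP checks in at a made-up time. */
typedef struct {
  uint64_t now_us;            /* simulated clock, in microseconds */
  const uint64_t *arrival_us; /* per-AP arrival times, in microseconds */
  size_t ap_count;            /* number of APs the BSP expects */
} sim_platform;

/* Stand-in for a delay primitive: only advances the simulated clock. */
static void sim_delay_us(sim_platform *p, uint64_t us) {
  p->now_us += us;
}

/* Number of APs that have checked in by the current simulated time. */
static size_t sim_arrived(const sim_platform *p) {
  size_t n = 0;
  for (size_t i = 0; i < p->ap_count; i++) {
    if (p->arrival_us[i] <= p->now_us) {
      n++;
    }
  }
  return n;
}

/* Fixed wait: delay once, then count whichever APs have shown up.
   Slow APs are silently missed. */
static size_t wait_fixed(sim_platform *p, uint64_t delay_us) {
  sim_delay_us(p, delay_us);
  return sim_arrived(p);
}

/* Bounded polling: stop as soon as every expected AP has arrived,
   or once timeout_us has elapsed, whichever comes first. */
static size_t wait_polling(sim_platform *p, uint64_t step_us,
                           uint64_t timeout_us) {
  uint64_t waited = 0;
  assert(step_us > 0); /* a zero step would never make progress */
  while (sim_arrived(p) < p->ap_count && waited < timeout_us) {
    sim_delay_us(p, step_us);
    waited += step_us;
  }
  return sim_arrived(p);
}
```

With three simulated APs arriving at 50 ms, 250 ms and 400 ms, `wait_fixed` with a 100 ms delay counts only one of them, while `wait_polling` with a 1 ms step and a 1 s timeout counts all three and returns at the 400 ms mark. Note that polling requires knowing the expected AP count up front, which the firmware may not have at that point.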