From mboxrd@z Thu Jan 1 00:00:00 1970
From: Erik Rull
Subject: Re: [Qemu-devel] QEMU with KVM does not start Win8 on kernel 3.4.67 and core2duo
Date: Fri, 03 Oct 2014 19:54:20 +0200
Message-ID: <542EE2CC.20306@rdsoftware.de>
References: <1588278346.272810.1407323965730.open-xchange@oxbaltgw03.schlund.de>
 <560458840.59096.1410441935489.open-xchange@oxbaltgw07.schlund.de>
 <5411A459.70905@siemens.com>
 <902712974.116667.1410524963893.open-xchange@oxbaltgw00.schlund.de>
 <54132A40.1050407@siemens.com>
 <54132D6E.404@siemens.com>
 <1548298201.268113.1410861978716.open-xchange@oxbaltgw00.schlund.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
To: Jan Kiszka , "kvm@vger.kernel.org"
Return-path:
Received: from mout.kundenserver.de ([212.227.126.130]:57213 "EHLO
 mout.kundenserver.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1751163AbaJCRyQ (ORCPT );
 Fri, 3 Oct 2014 13:54:16 -0400
In-Reply-To: <1548298201.268113.1410861978716.open-xchange@oxbaltgw00.schlund.de>
Sender: kvm-owner@vger.kernel.org
List-ID:

Erik Rull wrote:
>> On September 12, 2014 at 7:29 PM Jan Kiszka wrote:
>>
>>
>> On 2014-09-12 19:15, Jan Kiszka wrote:
>>> On 2014-09-12 14:29, Erik Rull wrote:
>>>>> On September 11, 2014 at 3:32 PM Jan Kiszka wrote:
>>>>>
>>>>>
>>>>> On 2014-09-11 15:25, Erik Rull wrote:
>>>>>>> On August 6, 2014 at 1:19 PM Erik Rull wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I did already several tests and I'm not completely sure what's going
>>>>>>> wrong, but here my scenario:
>>>>>>>
>>>>>>> When I start up QEMU w/ KVM 1.7.0 on a Core2Duo machine running a
>>>>>>> vanilla kernel 3.4.67 to run a Windows 8.0 guest, the guest freezes at
>>>>>>> boot without any error.
>>>>>>> When I dump the CPU registers via "info registers", nothing changes,
>>>>>>> that means the system really stalled. Same happens with QEMU 2.0.0.
>>>>>>>
>>>>>>> But - when I run the very same guest using Kernel 2.6.32.12 and QEMU
>>>>>>> 1.7.0 on the host side it works on the Core2Duo. Also the system above
>>>>>>> but just with an i3 or i5 CPU it works, too.
>>>>>>>
>>>>>>> I already disabled networking and USB for the guest and changed the
>>>>>>> graphics card - no effect. I assume that some mean bits and bytes have
>>>>>>> to be set up properly to get the thing running.
>>>>>>>
>>>>>>> Any hint what to change / test would be really appreciated.
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Erik
>>>>>>>
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I opened a qemu bug report on that and Jan helped me creating a kvm
>>>>>> trace. I attached it to the bug report.
>>>>>> https://bugs.launchpad.net/qemu/+bug/1366836
>>>>>>
>>>>>> If you have further questions, please let me know.
>>>>>
>>>>> "File possibly truncated. Need at least 346583040, but file size is
>>>>> 133414912."
>>>>>
>>>>> Does "trace-cmd report" work for you? Is your file larger?
>>>>>
>>>>> Again, please also validate the behavior on latest next branch from
>>>>> kvm.git.
>>>>>
>>>>> Jan
>>>>>
>>>>
>>>> Hi all,
>>>>
>>>> confirmed. The issue is still existing in the kvm.git Version of the
>>>> kernel.
>>>> The trace.tgz was uploaded to the bugtracker.
>>>
>>> Thanks. Could you provide a good-case of your setup as well, i.e. with
>>> that older kernel version? At least I'm not yet seeing something
>>> obviously wrong.
>>
>> Well, except that we have continuously EXTERNAL_INTERRUPTs, vector 0xf6,
>> throughout most of the trace. Maybe a self-IPI (this is single-core),
>> maybe something external that is stuck. You could do a full trace (-e
>> all) and check for what happens after things like
>>
>> kvm_exit: reason EXTERNAL_INTERRUPT rip 0x8168ed83 info 0 800000ef
>>
>> Jan
>>
>
> The huge number of interrupts seem to be rescheduling interrupts from qemu/kvm.
> I disabled SMP (kernel cmdline "nosmp") and retried - same effect, Windows 8
> does not boot.
> But I was able to get rid of the rescheduling interrupts. The trace after /
> around a kvm_exit looks like this:
>
> qemu-system-x86-954 [001] 261013.227405: kvm_entry: vcpu 0
> qemu-system-x86-952 [000] 261013.227405: kmem_cache_free: call_site=c10ef001 ptr=0xf1d2ae48
> qemu-system-x86-952 [000] 261013.227406: mm_filemap_delete_from_page_cache: dev 0:3 ino 0 page=0xf5bcc9c0 pfn=4122790336 ofs=507641856
> qemu-system-x86-952 [000] 261013.227406: kmem_cache_free: call_site=c10ef001 ptr=0xf1d2ae10
> qemu-system-x86-952 [000] 261013.227406: mm_filemap_delete_from_page_cache: dev 0:3 ino 0 page=0xf5bcc9e0 pfn=4122790368 ofs=507645952
> qemu-system-x86-954 [001] 261013.227406: kvm_exit: reason EXCEPTION_NMI rip 0x812a1d83 info 80201120 80000b0e
> qemu-system-x86-954 [001] 261013.227407: kvm_page_fault: address 80201120 error_code 3
> qemu-system-x86-952 [000] 261013.227407: mm_page_free_batched: page=0xf5bcc9e0 pfn=4122790368 order=0 cold=0
> qemu-system-x86-954 [001] 261013.227407: kvm_mmu_pagetable_walk: addr 80201120 pferr 3 P|W
> qemu-system-x86-952 [000] 261013.227407: mm_page_free: page=0xf5bcc9e0 pfn=4122790368 order=0
> qemu-system-x86-954 [001] 261013.227407: kvm_mmu_paging_element: pte 188001 level 3
> qemu-system-x86-952 [000] 261013.227407: mm_page_free_batched: page=0xf5bcc9c0 pfn=4122790336 order=0 cold=0
> qemu-system-x86-954 [001] 261013.227407: kvm_mmu_paging_element: pte 39b863 level 2
>
> or
>
> qemu-system-x86-954 [001] 261013.276282: kvm_mmu_paging_element: pte 188001 level 3
> qemu-system-x86-954 [001] 261013.276283: kvm_mmu_paging_element: pte 39b863 level 2
> qemu-system-x86-954 [001] 261013.276283: kvm_mmu_paging_element: pte 8000000000188963 level 1
> qemu-system-x86-954 [001] 261013.276284: rcu_utilization: Start context switch
> qemu-system-x86-954 [001] 261013.276284: rcu_utilization: End context switch
> qemu-system-x86-954 [001] 261013.276284: kvm_entry: vcpu 0
> qemu-system-x86-954 [001] 261013.276285: kvm_exit: reason EXCEPTION_NMI rip 0x812a1d83 info 80201120 80000b0e
> qemu-system-x86-954 [001] 261013.276286: kvm_page_fault: address 80201120 error_code 3
> qemu-system-x86-954 [001] 261013.276286: kvm_mmu_pagetable_walk: addr 80201120 pferr 3 P|W
> qemu-system-x86-954 [001] 261013.276286: kvm_mmu_paging_element: pte 188001 level 3
> qemu-system-x86-954 [001] 261013.276287: kvm_mmu_paging_element: pte 39b863 level 2
> qemu-system-x86-954 [001] 261013.276287: kvm_mmu_paging_element: pte 8000000000188963 level 1
> qemu-system-x86-954 [001] 261013.276288: kvm_mmu_pagetable_walk: addr 812a1d83 pferr 10 F
> qemu-system-x86-954 [001] 261013.276289: kvm_mmu_paging_element: pte 188001 level 3
> qemu-system-x86-954 [001] 261013.276289: kvm_mmu_paging_element: pte 387063 level 2
> qemu-system-x86-954 [001] 261013.276289: kvm_mmu_paging_element: pte 24a1121 level 1
> qemu-system-x86-954 [001] 261013.276290: kvm_emulate_insn: 0:812a1d83:f0 0f c7 0f (prot32)
> qemu-system-x86-954 [001] 261013.276290: kvm_mmu_pagetable_walk: addr 80201120 pferr 0
> qemu-system-x86-954 [001] 261013.276290: kvm_mmu_paging_element: pte 188001 level 3
> qemu-system-x86-954 [001] 261013.276291: kvm_mmu_paging_element: pte 39b863 level 2
> qemu-system-x86-954 [001] 261013.276291: kvm_mmu_paging_element: pte 8000000000188963 level 1
> qemu-system-x86-954 [001] 261013.276291: kvm_mmu_pagetable_walk: addr 80201120 pferr 2 W
> qemu-system-x86-954 [001] 261013.276291: kvm_mmu_paging_element: pte 188001 level 3
> qemu-system-x86-954 [001] 261013.276292: kvm_mmu_paging_element: pte 39b863 level 2
>
> Most of the exit reasons are NMIs. When filtering them out there are only few
> external interrupts, a lot of IO_INSTRUCTION, some CPUID.
>
> Is there a chance to catch the point where the virtual processor gets stuck?
>
> Best regards,
>
> Erik

Hi all,

I'm still stuck at the same point. I would like to proceed with my work, but
I need help evaluating the trace files I posted. This is becoming
time-critical on my side, because updates for about 100 systems worldwide
have to be rolled out within the next few weeks.

Thanks a lot.

Best regards,

Erik
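
On the question of catching the point where the vcpu gets stuck: a minimal
sketch, assuming the trace has been dumped as text with "trace-cmd report"
(one event per line, with kvm_exit events in the format shown in the quoted
excerpts). It tallies exits by reason and by (reason, rip) pair. If one pair
dominates the counts, as EXCEPTION_NMI at rip 0x812a1d83 does in the excerpts,
that rip is probably the guest instruction the vcpu keeps returning to. The
name tally_exits is hypothetical and not part of trace-cmd or KVM.

```python
import re
from collections import Counter

# Matches the kvm_exit event text as it appears in the quoted trace, e.g.
# "kvm_exit: reason EXCEPTION_NMI rip 0x812a1d83 info 80201120 80000b0e"
KVM_EXIT = re.compile(r"kvm_exit:\s+reason\s+(\S+)\s+rip\s+(0x[0-9a-fA-F]+)")


def tally_exits(lines):
    """Count kvm_exit events per reason and per (reason, rip) pair.

    `lines` is any iterable of text lines (a list, an open file, sys.stdin).
    Lines that are not kvm_exit events are ignored.
    """
    reasons = Counter()
    sites = Counter()
    for line in lines:
        match = KVM_EXIT.search(line)
        if match is None:
            continue
        reason, rip = match.groups()
        reasons[reason] += 1
        sites[(reason, rip)] += 1
    return reasons, sites


# Three events copied from this thread, used here only as sample input.
sample = [
    "qemu-system-x86-954 [001] 261013.227406: kvm_exit: reason EXCEPTION_NMI rip 0x812a1d83 info 80201120 80000b0e",
    "qemu-system-x86-954 [001] 261013.276284: kvm_entry: vcpu 0",
    "qemu-system-x86-954 [001] 261013.276285: kvm_exit: reason EXCEPTION_NMI rip 0x812a1d83 info 80201120 80000b0e",
]

reasons, sites = tally_exits(sample)
for (reason, rip), count in sites.most_common(5):
    print(f"{count:8d}  {reason:20s} {rip}")
```

To run it on a full trace, pass an open file (or sys.stdin) to tally_exits
instead of the sample list and print more entries from sites.most_common().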