From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: EFI Xen unstable crashes on Dell E6410 when calling efi_get_time.
Date: Wed, 22 Oct 2014 12:01:08 +0100
Message-ID: <54478E74.3010408@citrix.com>
References: <5446CAE2.8070206@oracle.com> <5446FA79.70602@oracle.com> <54477CCD.1010600@citrix.com> <5447A1790200007800040F54@mail.emea.novell.com>
In-Reply-To: <5447A1790200007800040F54@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1Xgtfa-0003yz-Uh for xen-devel@lists.xenproject.org; Wed, 22 Oct 2014 11:01:35 +0000
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich, "Marcos E. Matsunaga"
Cc: xen-devel@lists.xenproject.org
List-Id: xen-devel@lists.xenproject.org

On 22/10/14 11:22, Jan Beulich wrote:
>>>> On 22.10.14 at 11:45, wrote:
>> On 22/10/14 01:29, Marcos E. Matsunaga wrote:
>>> I went out and got the serial cable. Attached is the full output.
>>>
>>>
>>> On 10/21/2014 05:06 PM, Marcos E. Matsunaga wrote:
>>>> Folks,
>>>>
>>>> I am trying to boot Xen using efibootmgr on a Dell E6410 laptop with
>>>> 4GB RAM, running an Intel I5 dual core with VT and all the
>>>> virtualization options enabled.
>>>>
>>>> It crashes almost immediately. I am working on getting the serial
>>>> console up so that I can get a more detailed stack.
>>>>
>>>> A screenshot of the console is attached.
>>>>
>>>> The xen.cfg file is:
>>>>
>>>> [global]
>>>> default=xen
>>>>
>>>> [xen]
>>>> options=console=vga,com1 com1=115200,8n1 dom0_max_vcpus=2 vga="qxl"
>>>> kernel=vmlinuz-3.8.13-48.el7uek.Other_EFI_v1.x86_64
>>>> root=UUID=917bfc7f-8d9c-4acf-a98a-a9f558daccf2 ro console=hvc0
>>>> enforcing=0 biosdevname=0 earlyprintk=xen nomodeset
>>>> ramdisk=initramfs-3.8.13-48.el7uek.Other_EFI_v1.x86_64.img
>>>>
>>>>
>>>> The codepath is "(gdb) x/20i get_cmos_time
>>>> 0xffff82d080188825 : push %rbp
>>>> 0xffff82d080188826 : mov %rsp,%rbp
>>>> 0xffff82d080188829 : push %r12
>>>> 0xffff82d08018882b : push %rbx
>>>> 0xffff82d08018882c : cmpb $0x0,0xb620d(%rip) # 0xffff82d08023ea40
>>>> 0xffff82d080188833 : je 0xffff82d080188843
>>>>
>>>> 0xffff82d080188835 : callq 0xffff82d080100069 "
>>>>
>> Ok - there are two separate bugs here.
>>
>> The first is that we call into the efi runtime via efi_rs->GetTime, and
>> a pagefault happens for the instruction at 0x00000000db25a33d for the
>> virtual address 0x00000000fed1f410
>>
>> The memory map looks quite weird, but the faulting address is covered in
>> this range.
>>
>> (XEN) 00000fed1c000-00000fed1ffff type=11 attr=8000000000000000
>>
>> So I would expect it to be mapped into the EFI pagetables.
> Then you must have missed
>
> (XEN) Unknown cachability for MFNs 0xfed1c-0xfed1f
>
> which means no mapping got established (as we don't know what
> cachability attributes to give to it).
>
> This is a firmware bug.

I had indeed missed the secondary meaning of that message.

>> The second is that once the pagefault has happened, we trap back into
>> Xen and attempt to do a pagetable walk, falling over an assertion in
>> map_domain_page().
>>
>> For EFI calls, we run on the efi pagetables, not the idle pagetables, so
>> I am not surprised that the assertion has failed. I suspect that the
>> pagefault hander for hypervisor faults needs to become wise to the fact
>> that we may receive a fault when calling into the firmware.
>> As all the efi pagetables are xenheap pages, there is nothing
>> conceptually wrong with using map_domain_page() to do the walk.
> I'm not sure it's worth taking care of this special case. But yes, if
> we really want to, extending the condition to also consider
> efi_l4_pgtable would seem the right thing to do.

I think being able to do a pagetable walk from an EFI fault would be
useful, even if only to aid debugging. In this case, a non-debug build
would successfully perform the walk.

I have had a quick go, but it is rather hard to get the efi_l4_pgtable
symbol available to use in domain_page.c without some gross extern'ing.

It would be a nice fix if anyone has sufficient tuits.

~Andrew