* EFI boot Issue with setup_xenheap_mappings()
From: Suravee Suthikulanit @ 2014-10-08 23:58 UTC
  To: Ian Campbell, Stefano Stabellini, Julien Grall; +Cc: Roy Franz, xen-devel

Hi All,

After we resolved the flushing and tlbi issues previously discussed, I 
ran into a couple more issues. Please note that the available memory 
regions retrieved from the EFI system table are:

(XEN) RAM: 0000008001000000 - 0000008007ffdfff <- (1)
(XEN) RAM: 0000008007ffe000 - 0000008007ffffff
(XEN) RAM: 0000008008000000 - 000000801fffdfff
(XEN) RAM: 000000801fffe000 - 000000801fffffff
(XEN) RAM: 0000008020000000 - 000000802fffffff
(XEN) RAM: 0000008030001000 - 00000083f0ffffff
(XEN) RAM: 00000083f1000000 - 00000083f101ffff
(XEN) RAM: 00000083f1020000 - 00000083fbbb1fff
(XEN) RAM: 00000083fc4b8000 - 00000083fc4b8fff
(XEN) RAM: 00000083fc6a4000 - 00000083fec25fff
(XEN) RAM: 00000083fec26000 - 00000083fee8bfff
(XEN) RAM: 00000083fee8c000 - 00000083ff225fff
(XEN) RAM: 00000083ff226000 - 00000083ff263fff
(XEN) RAM: 00000083ff265000 - 00000083ff2c4fff
(XEN) RAM: 00000083ffe70000 - 00000083ffffffff

Note:
(1) Physical memory actually starts at 0x80_0000_0000; the first 16MB 
is used by UEFI.

ISSUE 1:
Data abort is happening at:
   start_xen()
     -> setup_mm()
       -> init_boot_pages()
         -> bootmem_region_add()
           -> Accessing bootmem_region_list causing data abort !!!

bootmem_region_list is initialized by init_boot_pages() on the first 
call to bootmem_region_add(). It points to VA 0x800000000000, which 
is the beginning of the 1:1 mapping of RAM (slot 256; see 
include/asm-arm/config.h for the complete ARM64 memory layout).
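
As a quick check of the slot arithmetic, here is a minimal sketch. It assumes 4KB pages and a four-level page table in which each level-0 (L0) entry covers 512GB (2^39 bytes). The helper names l0_slot() and l0_slot_base() are made up for illustration and are not Xen functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4KB granule, 4-level tables, so one level-0 (L0) entry
 * spans 2^39 bytes (512GB) and a table holds 512 entries. */
#define L0_SHIFT   39
#define L0_ENTRIES 512u

/* Hypothetical helper: L0 slot that a virtual address falls into. */
static unsigned int l0_slot(uint64_t va)
{
    return (unsigned int)((va >> L0_SHIFT) & (L0_ENTRIES - 1));
}

/* Hypothetical helper: first virtual address covered by an L0 slot. */
static uint64_t l0_slot_base(unsigned int slot)
{
    assert(slot < L0_ENTRIES);
    return (uint64_t)slot << L0_SHIFT;
}
```

Under these assumptions, l0_slot(0x800000000000) evaluates to 256, which matches the start of the 1:1 RAM mapping.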

So, I looked at how the page table is set up for this region. This is 
done at the following place:
   start_xen()
     -> setup_mm()
       -> setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT)

Inside setup_xenheap_mappings():
   offset = pfn_to_pdx(base_mfn - xenheap_mfn_start).
     where:
       base_mfn          = 0x8000000
       xenheap_mfn_start = 0x8010000

base_mfn is smaller than xenheap_mfn_start here, so the subtraction 
underflows, resulting in an invalid offset, an invalid vaddr, and an 
invalid pte for this region.
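
To make the failure concrete, here is a minimal sketch of the arithmetic. It assumes the MFNs are unsigned 64-bit values, pages are 4KB, and pfn_to_pdx() acts as the identity (which may not hold on a real system). DIRECTMAP_START and the three helper functions are illustrative names, not Xen code; the constant is the 1:1 mapping address quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT      12
#define DIRECTMAP_START 0x800000000000ULL /* start of the 1:1 RAM mapping */

/* Raw subtraction as in the report; wraps when base_mfn < heap_start_mfn. */
static uint64_t heap_offset(uint64_t base_mfn, uint64_t heap_start_mfn)
{
    return base_mfn - heap_start_mfn;
}

/* Virtual address derived from that offset (offset * PAGE_SIZE). */
static uint64_t heap_vaddr(uint64_t base_mfn, uint64_t heap_start_mfn)
{
    return DIRECTMAP_START +
           (heap_offset(base_mfn, heap_start_mfn) << PAGE_SHIFT);
}

/* Guarded variant: refuse a base below the heap start instead of wrapping. */
static uint64_t heap_vaddr_checked(uint64_t base_mfn, uint64_t heap_start_mfn)
{
    assert(base_mfn >= heap_start_mfn);
    return heap_vaddr(base_mfn, heap_start_mfn);
}
```

With base_mfn = 0x8000000 and xenheap_mfn_start = 0x8010000, heap_offset() wraps to 0xffffffffffff0000 and heap_vaddr() yields 0x7ffff0000000, which lies below the start of the 1:1 mapping. That is consistent with the invalid pte seen here.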

So, I put in a hack that sets both base_mfn and xenheap_mfn_start to 
0x8010000 in the code above (not sure if this is the right thing to 
do). This seems to resolve issue 1, and boot continues all the way to 
setup_frame_table_mapping().


ISSUE 2:
Here, I also get a data abort when accessing frame_table (at 
0x800000000 = 32GB). It seems that the page table for this region is 
also messed up.

Need a break and more investigation at this point.

Suravee


* Re: EFI boot Issue with setup_xenheap_mappings()
From: Ian Campbell @ 2014-10-09  8:51 UTC
  To: Suravee Suthikulanit
  Cc: Roy Franz, Julien Grall, xen-devel, Vijay Kilari,
	Stefano Stabellini

On Wed, 2014-10-08 at 18:58 -0500, Suravee Suthikulanit wrote:

Adding Vijay who raised what seems to be the same issue yesterday.

> Hi All,
> 
> After we resolved the flushing and tlbi issues previously discussed, I 
> ran into a couple more issues. Please note that the available memory 
> regions retrieved from the EFI system table are:
> 
> (XEN) RAM: 0000008001000000 - 0000008007ffdfff <- (1)
> (XEN) RAM: 0000008007ffe000 - 0000008007ffffff
> (XEN) RAM: 0000008008000000 - 000000801fffdfff
> (XEN) RAM: 000000801fffe000 - 000000801fffffff
> (XEN) RAM: 0000008020000000 - 000000802fffffff
> (XEN) RAM: 0000008030001000 - 00000083f0ffffff
> (XEN) RAM: 00000083f1000000 - 00000083f101ffff
> (XEN) RAM: 00000083f1020000 - 00000083fbbb1fff
> (XEN) RAM: 00000083fc4b8000 - 00000083fc4b8fff
> (XEN) RAM: 00000083fc6a4000 - 00000083fec25fff
> (XEN) RAM: 00000083fec26000 - 00000083fee8bfff
> (XEN) RAM: 00000083fee8c000 - 00000083ff225fff
> (XEN) RAM: 00000083ff226000 - 00000083ff263fff
> (XEN) RAM: 00000083ff265000 - 00000083ff2c4fff
> (XEN) RAM: 00000083ffe70000 - 00000083ffffffff
> 
> Note:
> (1) Physical memory actually starts at 0x80_0000_0000; the first 16MB 
> is used by UEFI.
> 
> ISSUE 1:
> Data abort is happening at:
>    start_xen()
>      -> setup_mm()
>        -> init_boot_pages()
>          -> bootmem_region_add()
>            -> Accessing bootmem_region_list causing data abort !!!
> 
> bootmem_region_list is initialized by init_boot_pages() on the first 
> call to bootmem_region_add(). It points to VA 0x800000000000, which 
> is the beginning of the 1:1 mapping of RAM (slot 256; see 
> include/asm-arm/config.h for the complete ARM64 memory layout).
> 
> So, I looked at how the page table is set up for this region. This is 
> done at the following place:
>    start_xen()
>      -> setup_mm()
>        -> setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT)
> 
> Inside setup_xenheap_mappings():
>    offset = pfn_to_pdx(base_mfn - xenheap_mfn_start).
>      where:
>        base_mfn          = 0x8000000
>        xenheap_mfn_start = 0x8010000
> 
> base_mfn is smaller than xenheap_mfn_start here, so the subtraction 
> underflows, resulting in an invalid offset, an invalid vaddr, and an 
> invalid pte for this region.
> 
> So, I put in a hack that sets both base_mfn and xenheap_mfn_start to 
> 0x8010000 in the code above (not sure if this is the right thing to 
> do). This seems to resolve issue 1, and boot continues all the way to 
> setup_frame_table_mapping().

Yes, looking again at the arm64 setup_xenheap_mappings() I think it is
pretty broken when facing non-aligned memory. I think it's just a
coincidence that EFI happens to expose this (by reserving/fragmenting
more memory), whereas u-boot based systems tend not to do so much of
that sort of thing.

Setting both base_mfn and xenheap_mfn_start to the aligned address seems
likely to be the correct path forward. I think though some care may need
to be taken with maddr_to_virt at the same time, since we need to
arrange for any offset in the frametable to match the offset in the
xenheap mapping (which is what I think I was trying to do with the
existing broken code).

virt_to_maddr uses a h/w pt walk, so should be fine whatever we do.
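
One way to picture the matching-offset requirement is the following minimal sketch. It assumes 4KB pages, 1GB alignment, and a single aligned base MFN shared by the direct-map and frame-table calculations. All names here are hypothetical; this is not the actual maddr_to_virt or frametable code, only a model of the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT      12
#define GB_SHIFT        30
#define DIRECTMAP_START 0x800000000000ULL /* 1:1 mapping VA from the report */

/* Round an MFN down to the previous 1GB boundary. */
static uint64_t mfn_align_down_1g(uint64_t mfn)
{
    return mfn & ~((1ULL << (GB_SHIFT - PAGE_SHIFT)) - 1);
}

/* Direct-map address of a machine address, relative to the aligned base. */
static uint64_t sketch_maddr_to_virt(uint64_t maddr, uint64_t base_mfn)
{
    assert((maddr >> PAGE_SHIFT) >= base_mfn);
    return DIRECTMAP_START + (maddr - (base_mfn << PAGE_SHIFT));
}

/* Frame-table index of an MFN, relative to the same aligned base. */
static uint64_t sketch_frame_index(uint64_t mfn, uint64_t base_mfn)
{
    assert(mfn >= base_mfn);
    return mfn - base_mfn;
}
```

Taking the first RAM region from the listing (0x8001000000, MFN 0x8001000), the aligned base is MFN 0x8000000. Because both helpers subtract the same base, the page offset into the direct map and the frame-table index agree (0x1000 in this case).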

> ISSUE 2:
> Here, I also get a data abort when accessing frame_table (at 
> 0x800000000 = 32GB). It seems that the page table for this region is 
> also messed up.

It's quite possible this is broken too, or this could potentially be
fallout from the mismatch I mention above.

> Need a break and more investigation at this point.

I'm going to have a look this morning.

Ian.


* Re: EFI boot Issue with setup_xenheap_mappings()
From: Ian Campbell @ 2014-10-09 12:11 UTC
  To: Suravee Suthikulanit
  Cc: Roy Franz, Stefano Stabellini, Julien Grall, Vijay Kilari,
	xen-devel

On Thu, 2014-10-09 at 09:51 +0100, Ian Campbell wrote:
> Yes, looking again at the arm64 setup_xenheap_mappings() I think it is
> pretty broken when facing non-aligned memory. I think it's just a
> coincidence that EFI happens to expose this (by reserving/fragmenting
> more memory), whereas u-boot based systems tend not to do so much of
> that sort of thing.

The key turns out to be this: u-boot/dtb presents all of the RAM in the
memory node (which is generally aligned in h/w) and uses the dtb
memreserve function to carve out holes, whereas EFI registers only the
unreserved RAM to start with, which is more likely to be misaligned.
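
The difference can be illustrated with a minimal sketch. It assumes a 1GB alignment requirement and uses the 16GB of RAM at 0x8000000000 with a 16MB firmware reservation at the start, as reported earlier in the thread. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GB(x) ((uint64_t)(x) << 30)
#define MB(x) ((uint64_t)(x) << 20)

struct region { uint64_t start; uint64_t size; };

static bool is_1g_aligned(uint64_t addr)
{
    return (addr & (GB(1) - 1)) == 0;
}

/* dtb style: the bank is reported whole; the hole is listed separately. */
static struct region dtb_bank(uint64_t ram_start, uint64_t ram_size)
{
    struct region r = { ram_start, ram_size };
    return r;
}

/* EFI style: only the memory after a reservation at the start is reported. */
static struct region efi_bank(uint64_t ram_start, uint64_t ram_size,
                              uint64_t reserved_at_start)
{
    assert(reserved_at_start <= ram_size);
    struct region r = { ram_start + reserved_at_start,
                        ram_size - reserved_at_start };
    return r;
}
```

Here dtb_bank() keeps the 1GB-aligned start at 0x8000000000, while efi_bank() with a 16MB reservation starts at 0x8001000000, which is not 1GB aligned.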

It is legitimate for EFI to do things this way, so it needs to be fixed.

I've now reproduced this locally, including the fix to
setup_xenheap_mappings() then resulting in an issue accessing the frame
table.

I'm still investigating, hopefully I'll have a fix today or tomorrow.

Ian.

