xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: David Vrabel <david.vrabel@citrix.com>
Cc: xen-devel@lists.xen.org
Subject: Re: [PATCH] x86/xen: do not identity map E820 memory regions that are UNUSABLE
Date: Fri, 12 Jul 2013 15:33:42 -0400	[thread overview]
Message-ID: <20130712193342.GA6364@phenom.dumpdata.com> (raw)
In-Reply-To: <51DE9546.8050206@citrix.com>

On Thu, Jul 11, 2013 at 12:21:42PM +0100, David Vrabel wrote:
> On 09/07/13 19:45, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jul 09, 2013 at 03:44:38PM +0100, David Vrabel wrote:
> >> On 09/07/13 15:13, Konrad Rzeszutek Wilk wrote:
> >>> On Tue, Jul 09, 2013 at 02:38:53PM +0100, David Vrabel wrote:
> >>>> From: David Vrabel <david.vrabel@citrix.com>
> >>>>
> >>>> If there are UNUSABLE regions in the machine memory map, dom0 will
> >>>> attempt to map them 1:1 which is not permitted by Xen and the kernel
> >>>> will crash.
> >>>>
> >>>> There isn't anything interesting in the UNUSABLE region that the dom0
> >>>> kernel needs access to so we can avoid making the 1:1 mapping and
> >>>> leave the region as RAM.
> >>>>
> >>>> Since the obtaining the memory map for dom0 and domU are now more
> >>>> different, refactor each into separate functions.
> >>>>
> >>>> This fixes a dom0 boot failure if tboot is used (because tboot was
> >>>> marking a memory region as UNUSABLE).
> >>>
> >>> Please also include the serial log that shows the crash.
> >>
> >> It's a domain crash by Xen and it isn't useful as none of the stack is
> >> decoded.
> > 
> > Could you include the E820 at least to get a sense of where and how
> > this looks? As in - without tboot and then with tboot?
> 
> This would take time to set up the host again and I don't think
> including a specific example of an E820 map with an UNUSABLE region
> really adds anything useful to the commit log.
> 
> You can look at some of the previous threads for examples.
> 
> >>>> +static int __init xen_get_memory_map_dom0(struct e820entry *map,
> >>>> +					  unsigned *nr_entries)
> >>>> +{
> >>>> +	struct xen_memory_map memmap;
> >>>> +	unsigned i;
> >>>> +	int ret;
> >>>> +
> >>>> +	/*
> >>>> +	 * Dom0 requires access to machine addresses for BIOS data and
> >>>> +	 * MMIO (e.g. PCI) devices.  The reset of the kernel expects
> >>>> +	 * to be able to access these through a 1:1 p2m mapping.
> >>>> +	 *
> >>>> +	 * We need to take the pseudo physical memory map and set up
> >>>> +	 * 1:1 mappings corresponding to the RESERVED regions and
> >>>> +	 * holes in the /machine/ memory map, adding/expanding the RAM
> >>>> +	 * region at the end of the map for the relocated RAM.
> >>
> >> This is the key paragraph.  The apparent use of the machine memory map
> >> for dom0  is a confusing fiction.
> > 
> > OK, but I don't follow when dom0 would be using the E820_UNUSED regions.
> > Is it the xen_do_chunk that is failing on said PFNs? Or is it in this
> > code xen_set_identity_and_release_chunk:
> > 
> > "217         /*                                                                      
> > 218          * If the PFNs are currently mapped, the VA mapping also needs          
> > 219          * to be updated to be 1:1.                                             
> > 220          */                                                                     
> > 221         for (pfn = start_pfn; pfn <= max_pfn_mapped && pfn < end_pfn; pfn++)    
> > 222                 (void)HYPERVISOR_update_va_mapping(                             
> > 223                         (unsigned long)__va(pfn << PAGE_SHIFT),                 
> > 224                         mfn_pte(pfn, PAGE_KERNEL_IO), 0);                       
> > 225                                                          "
> > 
> > which updates the initial PTE's with the 1-1 PFN and the E820_UNUSABLE is
> > somehow in between two E820_RAM regions?
> 
> It's here, yes.
> 
> >>>> +	 *
> >>>> +	 * This is more easily done if we just start with the machine
> >>>> +	 * memory map.
> >>>> +	 *
> >>>> +	 * UNUSABLE regions are awkward, they are not interesting to
> >>>> +	 * dom0 and Xen won't allow them to be mapped so we want to
> >>>> +	 * leave these as RAM in the pseudo physical map.
> >>>
> >>> We just want them as such in the P2M but not do any PTE creation for it?
> >>> Why do we care about it? We are not creating any page tables for
> >>> E820_UNUSABLE regions.
> >>
> >> I don't follow what you're asking here.
> > 
> > What code maps said PFNs.
> 
> See above.

So the 'HYPERVISOR_update_va_mapping' fails b/c we include it in the
xen_set_identify_and_release_chunk. Why not make the logic that sets "gaps"
and E820_RESERVED regions to omit E820_UNUSABLE regions? That would solve
it as well - and we won't be messing with the E820.
> 
> >> In dom0, UNUSABLE regions in the machine memory map are RAM regions on
> >> the pseudo-physical memory map.  So instead of playing games and making
> >> these regions special in the pseudo-physical map we just leave them as RAM.
> > 
> > .. And then exposing them to the kernel to be used as normal RAM?
> 
> Yes.
> 
> > With your change it is. But without your change it would not map it.
> 
> Incorrect. See above.

It would map it using the hypercall. But it would not create pagetables for it
(the generic code that is it).
> 
> >>>> +	 */
> >>>> +
> >>>> +	memmap.nr_entries = *nr_entries;
> >>>> +	set_xen_guest_handle(memmap.buffer, map);
> >>>> +
> >>>> +	ret = HYPERVISOR_memory_op(XENMEM_machine_memory_map, &memmap);
> >>>> +	if (ret < 0)
> >>>> +		return ret;
> >>>> +
> >>>> +	for (i = 0; i < memmap.nr_entries; i++) {
> >>>> +		if (map[i].type == E820_UNUSABLE)
> >>>
> >>> What if the E820_UNUSABLE regions were manufactured by the BIOS? Or
> >>> somebody booted Xen with mem=3G (in which we clip the memory) on a 16GB
> >>> box.
> >>
> >> The resulting memory map should be clipped by the result of the call to
> >> xen_get_max_pages().
> > 
> > OK. What about the BIOS manufacturing it?
> 
> What about it? As a PV guest we don't care what the machine memory map
> looks like, /except/ as a means to find interesting bits of hardware
> that we want 1:1 mappings for.

Right but now you are converting it from 1:1 to a RAM region - where we
don't do 1:1.
> 
> David

  reply	other threads:[~2013-07-12 19:33 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1373377133-11018-1-git-send-email-david.vrabel@citrix.com>
2013-07-09 14:13 ` [PATCH] x86/xen: do not identity map E820 memory regions that are UNUSABLE Konrad Rzeszutek Wilk
     [not found] ` <20130709141329.GC24897@phenom.dumpdata.com>
2013-07-09 14:44   ` David Vrabel
     [not found]   ` <51DC21D6.4000107@citrix.com>
2013-07-09 18:45     ` Konrad Rzeszutek Wilk
     [not found]     ` <20130709184538.GB10188@phenom.dumpdata.com>
2013-07-11 11:21       ` David Vrabel
2013-07-12 19:33         ` Konrad Rzeszutek Wilk [this message]
2013-07-12 21:38           ` David Vrabel
2013-07-15 12:48             ` Konrad Rzeszutek Wilk
2013-07-09 17:00 ` Aurelien Chartier
     [not found] ` <51DC41A1.6010909@citrix.com>
2013-07-09 18:36   ` Konrad Rzeszutek Wilk
     [not found]   ` <20130709183647.GA10188@phenom.dumpdata.com>
2013-07-10 14:24     ` Aurelien Chartier
2013-07-09 13:38 David Vrabel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130712193342.GA6364@phenom.dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=david.vrabel@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).