All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: David Vrabel <david.vrabel@citrix.com>
Cc: xen-devel@lists.xen.org
Subject: Re: [PATCH] x86/xen: do not identity map E820 memory regions that are UNUSABLE
Date: Fri, 12 Jul 2013 15:33:42 -0400	[thread overview]
Message-ID: <20130712193342.GA6364@phenom.dumpdata.com> (raw)
In-Reply-To: <51DE9546.8050206@citrix.com>

On Thu, Jul 11, 2013 at 12:21:42PM +0100, David Vrabel wrote:
> On 09/07/13 19:45, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jul 09, 2013 at 03:44:38PM +0100, David Vrabel wrote:
> >> On 09/07/13 15:13, Konrad Rzeszutek Wilk wrote:
> >>> On Tue, Jul 09, 2013 at 02:38:53PM +0100, David Vrabel wrote:
> >>>> From: David Vrabel <david.vrabel@citrix.com>
> >>>>
> >>>> If there are UNUSABLE regions in the machine memory map, dom0 will
> >>>> attempt to map them 1:1 which is not permitted by Xen and the kernel
> >>>> will crash.
> >>>>
> >>>> There isn't anything interesting in the UNUSABLE region that the dom0
> >>>> kernel needs access to so we can avoid making the 1:1 mapping and
> >>>> leave the region as RAM.
> >>>>
> >>>> Since the obtaining the memory map for dom0 and domU are now more
> >>>> different, refactor each into separate functions.
> >>>>
> >>>> This fixes a dom0 boot failure if tboot is used (because tboot was
> >>>> marking a memory region as UNUSABLE).
> >>>
> >>> Please also include the serial log that shows the crash.
> >>
> >> It's a domain crash by Xen and it isn't useful as none of the stack is
> >> decoded.
> > 
> > Could you include the E820 at least to get a sense of where and how
> > this looks? As in - without tboot and then with tboot?
> 
> This would take time to set up the host again and I don't think
> including a specific example of an E820 map with an UNUSABLE region
> really adds anything useful to the commit log.
> 
> You can look at some of the previous threads for examples.
> 
> >>>> +static int __init xen_get_memory_map_dom0(struct e820entry *map,
> >>>> +					  unsigned *nr_entries)
> >>>> +{
> >>>> +	struct xen_memory_map memmap;
> >>>> +	unsigned i;
> >>>> +	int ret;
> >>>> +
> >>>> +	/*
> >>>> +	 * Dom0 requires access to machine addresses for BIOS data and
> >>>> +	 * MMIO (e.g. PCI) devices.  The reset of the kernel expects
> >>>> +	 * to be able to access these through a 1:1 p2m mapping.
> >>>> +	 *
> >>>> +	 * We need to take the pseudo physical memory map and set up
> >>>> +	 * 1:1 mappings corresponding to the RESERVED regions and
> >>>> +	 * holes in the /machine/ memory map, adding/expanding the RAM
> >>>> +	 * region at the end of the map for the relocated RAM.
> >>
> >> This is the key paragraph.  The apparent use of the machine memory map
> >> for dom0  is a confusing fiction.
> > 
> > OK, but I don't follow when dom0 would be using the E820_UNUSED regions.
> > Is it the xen_do_chunk that is failing on said PFNs? Or is it in this
> > code xen_set_identity_and_release_chunk:
> > 
> > "217         /*                                                                      
> > 218          * If the PFNs are currently mapped, the VA mapping also needs          
> > 219          * to be updated to be 1:1.                                             
> > 220          */                                                                     
> > 221         for (pfn = start_pfn; pfn <= max_pfn_mapped && pfn < end_pfn; pfn++)    
> > 222                 (void)HYPERVISOR_update_va_mapping(                             
> > 223                         (unsigned long)__va(pfn << PAGE_SHIFT),                 
> > 224                         mfn_pte(pfn, PAGE_KERNEL_IO), 0);                       
> > 225                                                          "
> > 
> > which updates the initial PTE's with the 1-1 PFN and the E820_UNUSABLE is
> > somehow in between two E820_RAM regions?
> 
> It's here, yes.
> 
> >>>> +	 *
> >>>> +	 * This is more easily done if we just start with the machine
> >>>> +	 * memory map.
> >>>> +	 *
> >>>> +	 * UNUSABLE regions are awkward, they are not interesting to
> >>>> +	 * dom0 and Xen won't allow them to be mapped so we want to
> >>>> +	 * leave these as RAM in the pseudo physical map.
> >>>
> >>> We just want them as such in the P2M but not do any PTE creation for it?
> >>> Why do we care about it? We are not creating any page tables for
> >>> E820_UNUSABLE regions.
> >>
> >> I don't follow what you're asking here.
> > 
> > What code maps said PFNs.
> 
> See above.

So the 'HYPERVISOR_update_va_mapping' fails b/c we include it in the
xen_set_identify_and_release_chunk. Why not make the logic that sets "gaps"
and E820_RESERVED regions to omit E820_UNUSABLE regions? That would solve
it as well - and we won't be messing with the E820.
> 
> >> In dom0, UNUSABLE regions in the machine memory map are RAM regions on
> >> the pseudo-physical memory map.  So instead of playing games and making
> >> these regions special in the pseudo-physical map we just leave them as RAM.
> > 
> > .. And then exposing them to the kernel to be used as normal RAM?
> 
> Yes.
> 
> > With your change it is. But without your change it would not map it.
> 
> Incorrect. See above.

It would map it using the hypercall. But it would not create pagetables for it
(the generic code that is it).
> 
> >>>> +	 */
> >>>> +
> >>>> +	memmap.nr_entries = *nr_entries;
> >>>> +	set_xen_guest_handle(memmap.buffer, map);
> >>>> +
> >>>> +	ret = HYPERVISOR_memory_op(XENMEM_machine_memory_map, &memmap);
> >>>> +	if (ret < 0)
> >>>> +		return ret;
> >>>> +
> >>>> +	for (i = 0; i < memmap.nr_entries; i++) {
> >>>> +		if (map[i].type == E820_UNUSABLE)
> >>>
> >>> What if the E820_UNUSABLE regions were manufactured by the BIOS? Or
> >>> somebody booted Xen with mem=3G (in which we clip the memory) on a 16GB
> >>> box.
> >>
> >> The resulting memory map should be clipped by the result of the call to
> >> xen_get_max_pages().
> > 
> > OK. What about the BIOS manufacturing it?
> 
> What about it? As a PV guest we don't care what the machine memory map
> looks like, /except/ as a means to find interesting bits of hardware
> that we want 1:1 mappings for.

Right but now you are converting it from 1:1 to a RAM region - where we
don't do 1:1.
> 
> David

  reply	other threads:[~2013-07-12 19:33 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1373377133-11018-1-git-send-email-david.vrabel@citrix.com>
2013-07-09 14:13 ` [PATCH] x86/xen: do not identity map E820 memory regions that are UNUSABLE Konrad Rzeszutek Wilk
     [not found] ` <20130709141329.GC24897@phenom.dumpdata.com>
2013-07-09 14:44   ` David Vrabel
     [not found]   ` <51DC21D6.4000107@citrix.com>
2013-07-09 18:45     ` Konrad Rzeszutek Wilk
     [not found]     ` <20130709184538.GB10188@phenom.dumpdata.com>
2013-07-11 11:21       ` David Vrabel
2013-07-12 19:33         ` Konrad Rzeszutek Wilk [this message]
2013-07-12 21:38           ` David Vrabel
2013-07-15 12:48             ` Konrad Rzeszutek Wilk
2013-07-09 17:00 ` Aurelien Chartier
     [not found] ` <51DC41A1.6010909@citrix.com>
2013-07-09 18:36   ` Konrad Rzeszutek Wilk
     [not found]   ` <20130709183647.GA10188@phenom.dumpdata.com>
2013-07-10 14:24     ` Aurelien Chartier
2013-07-09 13:38 David Vrabel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130712193342.GA6364@phenom.dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=david.vrabel@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.