Re: Tracking "Cannot allocate memory" error in shadow_alloc_p2m_table

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Steven Rostedt <srostedt@redhat.com>
To: Tim Deegan <Tim.Deegan@xensource.com>
Cc: Chris Lalancette <clalance@redhat.com>, xen-devel@lists.xensource.com
Subject: Re: Tracking "Cannot allocate memory" error in	shadow_alloc_p2m_table
Date: Fri, 12 Jan 2007 00:13:45 -0500	[thread overview]
Message-ID: <45A71909.8070300@redhat.com> (raw)
In-Reply-To: <20070110113207.GD21843@york.uk.xensource.com>

Tim Deegan wrote:

> The code you cut out is start-of-day code that builds the p2m map of a
> domain from the m2p entries of its allocated pages.  It was needed
> originally because the domain builder tools would set up the guest's
> memory and m2p mapping and _then_ enable shadow-translate mode.  It may
> not be necessary now that HVM domains are put in shadow mode at creation
> time.
> 
> AFAICS, either the domain should have no memory assigned yet (in which
> case it does nothing), or the domain's pages should have m2p entries
> that are valid, explicitly invalid, or set to the "debug pattern" of all
> 5s.  I'll look at a more sensible failure mode then -ENOMEM, though.
> 

OK, I've been debugging the hell out of this one.  And I don't see where 
the m2p mapping is updated for the guest yet!  I put in prints to show 
me where the m2p is updated for the guest and this seems to have shown 
me that the m2p mapping doesn't know about the new guest at this point.

I used my internal log tracer (ring buffer) to map output, and this is 
what it showed me.

Note: I'm using RHEL5 HV for this, but it goes the same for Xen.

(XEN) [   98.395925] cpu:0 share_xen_page_with_guest:264 guest 1 mapping 
pages page=f6811280 mfn=2928 from:ff106fa2
(XEN) [   98.395927] cpu:0 share_xen_page_with_guest:264 guest 1 mapping 
pages page=f6811298 mfn=2929 from:ff106fa2
(XEN) [   98.395928] cpu:0 share_xen_page_with_guest:264 guest 1 mapping 
pages page=f68112b0 mfn=2930 from:ff106fa2
(XEN) [   98.395929] cpu:0 share_xen_page_with_guest:264 guest 1 mapping 
pages page=f68112c8 mfn=2931 from:ff106fa2
(XEN) [   98.395936] cpu:0 share_xen_page_with_guest:264 guest 1 mapping 
pages page=f68112f8 mfn=2933 from:ff11db10

The above is not filtered (meaning that I didn't put in any limit to the 
amount of prints it will make, or snip it from this email) so the above 
is the only calls to share_xen_page_with_guest (that updates m2p 
mapping) before .

These show me that this was called by grant_table_create (0xff106fa2), 
and arch_domain_create (0xff11db10).

(XEN) [   99.008441] cpu:1 set_sh_allocation:1201 alloc_domheap_pages 
(1) pages=768
(XEN) [   99.008444] cpu:1 set_sh_allocation:1205  arch total_pages=0

The above shows that we are going to call alloc_domheap_pages next (768 
pages to allocate) from set_sh_allocation, and the total_pages currently 
set by the domain is zero.

This only allocates the pages for the domain (puts it on its 
d->page_list), but it does not modify the m2p mapping here. Perhaps we 
should have this invalidate the m2p mapping for the page?

(XEN) [   99.054858] cpu:0 shadow_alloc_p2m_table:1040 page=f7049e70 
gfn=2669a mfn=362138 cnt=0
(XEN) [   99.054861] cpu:0 shadow_alloc_p2m_table:1040 page=f73b3ba8 
gfn=752e1 mfn=511271 cnt=1
(XEN) [   99.054867] cpu:0 shadow_alloc_p2m_table:1040 page=f7049e58 
gfn=26699 mfn=362137 cnt=2
(XEN) [   99.054868] cpu:0 shadow_alloc_p2m_table:1040 page=f7049e40 
gfn=26698 mfn=362136 cnt=3
(XEN) [   99.054869] cpu:0 shadow_alloc_p2m_table:1040 page=f73b3c98 
gfn=752d7 mfn=511281 cnt=4

The above is next, but I do filter it, since it prints out for every page.

mfn is in decimal (don't ask why, it was my first default)
gfn's will be in hex, and page is the pointer.

This is the code that we are having a problem with. Basically, we are 
updating the shadow page tables with bogus memory!  Unless we wanted the 
5  pages that were mapped for the grant table, and the one for the arch 
setup, but I don't even think those are in this.

Why do we need that code, it's using stale data, and updating the shadow 
page tables with a m2p mapping that is from a older domain. See more below:

[...]

(XEN) [   99.054911] cpu:0 shadow_alloc_p2m_table:1040 page=f73b2000 
gfn=2000 mfn=510976 cnt=49
(XEN) [   99.076309] cpu:0 shadow_guest_physmap_add_page:2901 guest 1 
calling sh_p2m_remove_page ogfn=2669a
(XEN) [   99.076317] cpu:0 sh_p2m_remove_page:2832 guest 1 mapping pages 
invalid(XEN) [   99.076319] cpu:0 shadow_guest_physmap_add_page:2912 
guest 1 mapping pages gfn=0 mfn=362138 from:ff12d10f
(XEN) [   99.076321] cpu:0 shadow_guest_physmap_add_page:2901 guest 1 
calling sh_p2m_remove_page ogfn=752e1
(XEN) [   99.076322] cpu:0 sh_p2m_remove_page:2832 guest 1 mapping pages 
invalid(XEN) [   99.076323] cpu:0 shadow_guest_physmap_add_page:2912 
guest 1 mapping pages gfn=1 mfn=511271 from:ff12d10f
(XEN) [   99.076324] cpu:0 shadow_guest_physmap_add_page:2901 guest 1 
calling sh_p2m_remove_page ogfn=26699
(XEN) [   99.076326] cpu:0 sh_p2m_remove_page:2832 guest 1 mapping pages 
invalid(XEN) [   99.076327] cpu:0 shadow_guest_physmap_add_page:2912 
guest 1 mapping pages gfn=2 mfn=362137 from:ff12d10f

Finally we are calling shadow_guest_physmap_add_page, and this is called 
from ff12d10f or do_mmu_update which I believe (correct me if I'm wrong) 
is being called by xend and friends.

First thing that is done, is that it *removes* the m2p mapping 
(sh_p2m_remove_page). It actually does both, remove the shadow mapping 
that we set up before, and it invalidates the m2p mapping.

We then set up this page to be the correct shadow mapping, with the call 
to shadow_guest_physmap_add_page and this also updates to the correct 
m2p mapping.

So in conclusion, I don't see why we need the code that does this bogus 
mapping, and that the patch that Chris proposed, in removing that code 
looks like the better approach.

If you think I'm wrong, please feel free to convince me. I've been wrong 
before, but I don't thing that we should have code that just undoes itself.

Oh, I also tried this logging after removing the code, and I got this at 
the end:

(XEN) [   52.764736] cpu:1 shadow_guest_physmap_add_page:2914 guest 1 
mapping pages gfn=0 mfn=346470 from:ff12d10f
(XEN) [   52.764739] cpu:1 shadow_guest_physmap_add_page:2914 guest 1 
mapping pages gfn=1 mfn=346467 from:ff12d10f
(XEN) [   52.764741] cpu:1 shadow_guest_physmap_add_page:2914 guest 1 
mapping pages gfn=2 mfn=346465 from:ff12d10f
(XEN) [   52.764742] cpu:1 shadow_guest_physmap_add_page:2914 guest 1 
mapping pages gfn=3 mfn=346462 from:ff12d10f
(XEN) [   52.764744] cpu:1 shadow_guest_physmap_add_page:2914 guest 1 
mapping pages gfn=4 mfn=346402 from:ff12d10f
(XEN) [   52.764749] cpu:1 shadow_guest_physmap_add_page:2914 guest 1 
mapping pages gfn=5 mfn=346426 from:ff12d10f

Notice that there's no effort in removing the pages before setting them up.

-- Steve

next prev parent reply	other threads:[~2007-01-12  5:13 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-01-09 21:16 Tracking "Cannot allocate memory" error in shadow_alloc_p2m_table Chris Lalancette
2007-01-10 11:32 ` Tim Deegan
2007-01-10 15:41   ` Chris Lalancette
2007-01-12  5:13   ` Steven Rostedt [this message]
2007-01-12 11:07     ` Tim Deegan
2007-01-12 13:05       ` Steven Rostedt
2007-01-12 14:36         ` Steven Rostedt
2007-01-12 14:37         ` Tim Deegan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45A71909.8070300@redhat.com \
    --to=srostedt@redhat.com \
    --cc=Tim.Deegan@xensource.com \
    --cc=clalance@redhat.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.