From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Simon Graham <simon.graham@virtualcomputer.com>
Cc: Andrew Lyon <andrew.lyon@gmail.com>,
	xen-devel@lists.xensource.com, Jan Beulich <JBeulich@novell.com>
Subject: Re: Re: [Xen-users] rebased openSUSE Xen dom0 Patches
Date: Tue, 20 Apr 2010 15:01:48 -0400
Message-ID: <20100420190148.GF32720@phenom.dumpdata.com>
In-Reply-To: <BC8BC73D12649F439B502B4CB6FD58B1014D5922@be23.exg4.exghost.com>

On Tue, Apr 20, 2010 at 11:07:54AM -0500, Simon Graham wrote:
> > >
> > > But that code is precisely what guarantees that the pages *can* be
> > > converted to page table pages (by completely unmapping them from
> > > the kernel image part of the address space). So your explanation is
> > > rather confusing than clarifying to me...
> > 
> > I agree that that is the intent of this code -- what we _seem_ to
> > observe (and this is hard to prove) is that the page type ref count
> > is not being decremented by this code, which would imply that the
> > unmapping is not happening for some reason. The only real evidence I
> > have for this is that the failure always occurs on one of these
> > pages.
> > 
> 
> We now think we've found the problem which seems to be due to the
> following two calls in Linux within mark_rodata_ro():
> 
>     free_init_pages("unused kernel memory",
>                     (unsigned long)
>                      page_address(virt_to_page(text_end)),
>                     (unsigned long)
>                      page_address(virt_to_page(rodata_start)));
>     free_init_pages("unused kernel memory",
>                     (unsigned long)
>                      page_address(virt_to_page(rodata_end)),
>                     (unsigned long)
>                      page_address(virt_to_page(data_start)));
> 
> The first of these calls is trying to free the range
> page_address(virt_to_page(text_end)) through
> page_address(virt_to_page(rodata_start)).
> 
> With text_end == 0xffffffff80610000 and  rodata_start ==
> 0xffffffff80800000 the actual values received by free_init_pages() are
> 0xffff880000610000 and 0xffff880000800000 (i.e. within the 64-bit direct
> mapping region).
> 
> In free_init_pages() there is a test of addr >= __START_KERNEL_map
> (which is 0xffffffff80000000). Because the addresses passed in are
> below that value, the test fails and the two calls to
> HYPERVISOR_update_va_mapping() are not made.
> 
> The net effect (we believe) is that this range of pages is freed from
> Linux's viewpoint but the pages are still marked as PGT_writable_page
> with a non-zero page type ref count in the hypervisor. When Linux tries
> to use these pages later on for page table pages, the hypervisor traps.
> 
> Note, we have traced all uses of the pages in question.  Apparently they
> are never used by Linux prior to the trap. Our traces show them being
> initialized in the hypervisor by construct_dom0(), marked as readonly in
> Linux by mark_rodata_ro() and then causing the hypervisor trap when
> Linux tries to use one of them for a page table.

Oh man, I remember this one. I submitted an initial patch for this.
https://patchwork.kernel.org/patch/79086/
> 
> Presumably the correct fix will be to change the address range test in
> free_init_pages...

And this was the final fix:
http://marc.info/?l=linux-kernel&m=126652277705569&w=2

The end result was to use a different mechanism to get the kernel address
and use that to set _PAGE_RW on the pages, and to ignore the other mapping.
I think; this was some time ago.
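
For anyone following along, the address arithmetic Simon describes can be
sketched in plain user-space C. This is an illustration only, not kernel
code: the two base constants are the x86-64 layout values implied by the
addresses quoted above, phys_base is assumed to be 0, and the helper names
kernel_va_to_direct_va(), in_kernel_image_map() and pages_in_range() are
hypothetical.

```c
#include <stdint.h>

/* Layout constants implied by the addresses in the mail (assumptions). */
#define START_KERNEL_MAP 0xffffffff80000000ULL /* __START_KERNEL_map */
#define DIRECT_MAP_BASE  0xffff880000000000ULL /* direct mapping base */
#define PAGE_SHIFT_4K    12                    /* 4 KiB pages */

/*
 * Hypothetical stand-in for page_address(virt_to_page(kva)) applied to a
 * kernel-image address: strip the kernel-map base to get the physical
 * offset (phys_base assumed 0), then rebase into the direct mapping.
 */
static uint64_t kernel_va_to_direct_va(uint64_t kva)
{
    uint64_t phys = kva - START_KERNEL_MAP;
    return DIRECT_MAP_BASE + phys;
}

/* Mirrors the range test described for free_init_pages(). */
static int in_kernel_image_map(uint64_t addr)
{
    return addr >= START_KERNEL_MAP;
}

/* Number of 4 KiB pages in the half-open range [begin, end). */
static uint64_t pages_in_range(uint64_t begin, uint64_t end)
{
    return (end - begin) >> PAGE_SHIFT_4K;
}
```

With text_end == 0xffffffff80610000, kernel_va_to_direct_va() yields
0xffff880000610000, and in_kernel_image_map() returns 0 for that value, so
the branch guarded by the test would be skipped for every page in the range.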
