From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Simon Graham <simon.graham@virtualcomputer.com>
Cc: Andrew Lyon <andrew.lyon@gmail.com>,
	xen-devel@lists.xensource.com, Jan Beulich <JBeulich@novell.com>
Subject: Re: Re: [Xen-users] rebased openSUSE Xen dom0 Patches
Date: Tue, 20 Apr 2010 15:01:48 -0400
Message-ID: <20100420190148.GF32720@phenom.dumpdata.com>
In-Reply-To: <BC8BC73D12649F439B502B4CB6FD58B1014D5922@be23.exg4.exghost.com>

On Tue, Apr 20, 2010 at 11:07:54AM -0500, Simon Graham wrote:
> > >
> > > But that code is precisely what guarantees that the pages *can* be
> > > converted to page table pages (by completely unmapping them from
> > > the kernel image part of the address space). So your explanation is
> > > rather confusing than clarifying to me...
> > 
> > I agree that that is the intent of this code -- what we _seem_ to
> > observe (and this is hard to prove) is that the page type ref count
> > is not being decremented by this code, which would imply that the
> > unmapping is not happening for some reason. The only real evidence I
> > have for this is that the failure always occurs on one of these pages.
> > 
> 
> We now think we've found the problem which seems to be due to the
> following two calls in Linux within mark_rodata_ro():
> 
>     free_init_pages("unused kernel memory",
>                     (unsigned long)
>                      page_address(virt_to_page(text_end)),
>                     (unsigned long)
>                      page_address(virt_to_page(rodata_start)));
>     free_init_pages("unused kernel memory",
>                     (unsigned long)
>                      page_address(virt_to_page(rodata_end)),
>                     (unsigned long)
>                      page_address(virt_to_page(data_start)));
> 
> The first of these calls is trying to free the range
> page_address(virt_to_page(text_end)) through
> page_address(virt_to_page(rodata_start)).
> 
> With text_end == 0xffffffff80610000 and rodata_start ==
> 0xffffffff80800000 the actual values received by free_init_pages() are
> 0xffff880000610000 and 0xffff880000800000 (i.e. within the 64-bit direct
> mapping region).
> 
> In free_init_pages() there is a test of addr >= __START_KERNEL_map
> (which is 0xffffffff80000000). Because the addresses passed in fail
> this test, the two calls to HYPERVISOR_update_va_mapping() are not made.
> 
> The net effect (we believe) is that this range of pages is freed from
> Linux's viewpoint but the pages are still marked as PGT_writable_page
> with a non-zero page type ref count in the hypervisor. When Linux tries
> to use these pages later on for page table pages, the hypervisor traps.
> 
> Note, we have traced all uses of the pages in question.  Apparently they
> are never used by Linux prior to the trap. Our traces show them being
> initialized in the hypervisor by construct_dom0(), marked as readonly in
> Linux by mark_rodata_ro() and then causing the hypervisor trap when
> Linux tries to use one of them for a page table.

Oh man, I remember this one. I submitted an initial patch for this.
https://patchwork.kernel.org/patch/79086/
> 
> Presumably the correct fix will be to change the address range test in
> free_init_pages...

And this was the final fix:
http://marc.info/?l=linux-kernel&m=126652277705569&w=2

The end result was to use a different mechanism to get the kernel address
and use that to set _PAGE_RW on those pages, ignoring the other mapping.
At least I think so; this was some time ago.
