* testing pmdval/pteval page presence bit
From: Pekka Paalanen @ 2009-02-09 19:58 UTC
To: linux-kernel; +Cc: Ingo Molnar, Vegard Nossum, Stuart Bennett
Hi all,

This question is related to mmiotrace which toggles the page presence
bit to trigger page faults on ioremapped regions. Page faults are used
to trace MMIO reads and writes of proprietary drivers.

I understood that large pages use pmd's instead of pte's. If there is a
union like this:

+	union {
+		pmdval_t pmdval;
+		pteval_t pteval;
+	} saved; /* stored value prior to arming */

and it is being assigned the proper content, as in the following:

+static int clear_page_present(struct kmmio_fault_page *f, bool clear)
 {
 	pteval_t pteval;
 	pmdval_t pmdval;
 	unsigned int level;
 	pmd_t *pmd;
+	pte_t *pte = lookup_address(f->page, &level);

 	if (!pte) {
+		pr_err("kmmio: no pte for page 0x%08lx\n", f->page);
 		return -1;
 	}

 	switch (level) {
 	case PG_LEVEL_2M:
 		pmd = (pmd_t *)pte;
+		if (clear) {
+			f->saved.pmdval = pmd_val(*pmd);
+			pmdval = f->saved.pmdval & ~_PAGE_PRESENT;
+		} else
+			pmdval = f->saved.pmdval;
 		set_pmd(pmd, __pmd(pmdval));
 		break;

 	case PG_LEVEL_4K:
+		if (clear) {
+			f->saved.pteval = pte_val(*pte);
+			pteval = f->saved.pteval & ~_PAGE_PRESENT;
+		} else
+			pteval = f->saved.pteval;
 		set_pte_atomic(pte, __pte(pteval));
 		break;

Then, regardless of whether pmdval or pteval was set, the test

	if (!(faultpage->saved.pteval & _PAGE_PRESENT))

should be ok. But is it?
Can the presence of a large page (pmd) be handled just like that of a
normal page (pte)?

Thanks.

--
Pekka Paalanen
http://www.iki.fi/pq/
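The union question above can be illustrated with a small user-space model. This is only a sketch, not kernel code: it assumes both value types are 64 bits wide (as on x86_64) and that the present flag is bit 0. The names fault_page_model, MODEL_PAGE_PRESENT and saved_was_present() are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: both value types are modeled as 64-bit, as on x86_64. */
typedef uint64_t pteval_t;
typedef uint64_t pmdval_t;

/* Assumption: the present flag is bit 0 at both levels (true on x86). */
#define MODEL_PAGE_PRESENT ((pteval_t)1 << 0)

/* Hypothetical stand-in for the relevant part of struct kmmio_fault_page. */
struct fault_page_model {
	union {
		pmdval_t pmdval;
		pteval_t pteval;
	} saved; /* stored value prior to arming */
};

/* Reading one member after writing the other needs matching widths. */
static_assert(sizeof(pmdval_t) == sizeof(pteval_t),
	      "pmdval_t and pteval_t must have the same size");

/* Tests the present bit through pteval, whichever member was stored. */
static bool saved_was_present(const struct fault_page_model *f)
{
	return (f->saved.pteval & MODEL_PAGE_PRESENT) != 0;
}
```

Under these assumptions, a value stored through saved.pmdval reads back unchanged through saved.pteval, so the open-coded test gives the same answer for both levels. The static_assert documents the one property the aliasing depends on.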
* Re: testing pmdval/pteval page presence bit
From: Jeremy Fitzhardinge @ 2009-02-10 22:42 UTC
To: Pekka Paalanen; +Cc: linux-kernel, Ingo Molnar, Vegard Nossum, Stuart Bennett
Pekka Paalanen wrote:
> Hi all,
>
> This question is related to mmiotrace which toggles the page presence
> bit to trigger page faults on ioremapped regions. Page faults are used
> to trace MMIO reads and writes of proprietary drivers.
>
> I understood that large pages use pmd's instead of pte's. If there is a
> union like this:
>
> +	union {
> +		pmdval_t pmdval;
> +		pteval_t pteval;
> +	} saved; /* stored value prior to arming */
>
> and it is being assigned the proper content, as in the following:
>
> +static int clear_page_present(struct kmmio_fault_page *f, bool clear)
>  {
>  	pteval_t pteval;
>  	pmdval_t pmdval;
>  	unsigned int level;
>  	pmd_t *pmd;
> +	pte_t *pte = lookup_address(f->page, &level);
>
>  	if (!pte) {
> +		pr_err("kmmio: no pte for page 0x%08lx\n", f->page);
>  		return -1;
>  	}
>
>  	switch (level) {
>  	case PG_LEVEL_2M:
>  		pmd = (pmd_t *)pte;
> +		if (clear) {
> +			f->saved.pmdval = pmd_val(*pmd);
> +			pmdval = f->saved.pmdval & ~_PAGE_PRESENT;
> +		} else
> +			pmdval = f->saved.pmdval;
>  		set_pmd(pmd, __pmd(pmdval));
>  		break;
>
>  	case PG_LEVEL_4K:
> +		if (clear) {
> +			f->saved.pteval = pte_val(*pte);
> +			pteval = f->saved.pteval & ~_PAGE_PRESENT;
> +		} else
> +			pteval = f->saved.pteval;
>  		set_pte_atomic(pte, __pte(pteval));
>  		break;
>
>
> Then, regardless of whether pmdval or pteval was set, the test
>
> 	if (!(faultpage->saved.pteval & _PAGE_PRESENT))
>
> should be ok. But is it?
> Can the presence of a large page (pmd) be handled just like that of a
> normal page (pte)?
>

_PAGE_PRESENT is meaningful for both ptes and pmds; you can use
pmd_present() to test for it rather than open-coding it.

But there's one other theoretical problem with this code. In general it
isn't safe to just toggle the _PAGE_PRESENT bit on its own, because the
rest of the non-present pte could get interpreted as a swap entry. If
you're guaranteed that these are kernel mappings then there's no problem
in practice.

J
* Re: testing pmdval/pteval page presence bit
From: Pekka Paalanen @ 2009-02-11 18:08 UTC
To: Jeremy Fitzhardinge
Cc: linux-kernel, Ingo Molnar, Vegard Nossum, Stuart Bennett,
Oliver McFadden, david jeannot

(Oliver and David, I added you to CC, since I recall you were planning
for user space tracing in mmiotrace.)

On Tue, 10 Feb 2009 14:42:56 -0800
Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> _PAGE_PRESENT is meaningful for both ptes and pmds; you can use
> pmd_present() to test for it rather than open-coding it.

Okay, will look into those, but it also means I need to record which
one I am dealing with.
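
Recording which level was armed could look roughly like the following user-space model. It is a sketch under stated assumptions, not the mmiotrace implementation: the enum, struct and function names are hypothetical, plain 64-bit integers stand in for pte_t and pmd_t, and the present flag is assumed to be bit 0. In the kernel, the recorded level would select between helpers such as pte_present() and pmd_present().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t entry_val_t;	/* stands in for pteval_t / pmdval_t */

#define MODEL_PAGE_PRESENT ((entry_val_t)1 << 0)

/* Hypothetical level tags; the kernel has its own PG_LEVEL_* constants. */
enum model_level { MODEL_LEVEL_4K = 1, MODEL_LEVEL_2M = 2 };

/* Saved state: the original entry value plus the level it came from. */
struct saved_entry {
	enum model_level level;
	entry_val_t val;
	bool armed;
};

/* Remember the entry and its level; return the value to write back. */
static entry_val_t model_arm(struct saved_entry *s, enum model_level level,
			     entry_val_t cur)
{
	assert(level == MODEL_LEVEL_4K || level == MODEL_LEVEL_2M);
	s->level = level;
	s->val = cur;
	s->armed = true;
	return cur & ~MODEL_PAGE_PRESENT;
}

/* Return the exact value that was in place before arming. */
static entry_val_t model_disarm(struct saved_entry *s)
{
	assert(s->armed);
	s->armed = false;
	return s->val;
}
```

Keeping the level next to the saved value means the disarm path never has to guess whether it is restoring a pte or a pmd.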

> But there's one other theoretical problem with this code. In general it
> isn't safe to just toggle the _PAGE_PRESENT bit on its own, because the
> rest of the non-present pte could get interpreted as a swap entry. If
> you're guaranteed that these are kernel mappings then there's no problem
> in practice.

This is good to know. So far these are kernel mappings, as they are all
created by ioremap*(), but there are plans to extend mmiotrace to
trace IO-mappings accessed from user space. Do you have hints for that?

OTOH, we are always dealing with PCI IO-mem-mappings, so would those ever
be not present, excluding the mmiotrace case?

Well, Stuart already found out that kernel ioremap*()'ed pages might
not really be present; some fixes are coming to mmiotrace to cope with
that. The plan is to restore the pte to the state it had before
mmiotrace cleared the _PAGE_PRESENT flag, and if the same instruction
and address fault again, fall through to the normal page fault
handling. How can/will this fail? And what if it is a user page?

Oh, we are on x86/x86_64 only.

Thanks.

--
Pekka Paalanen
http://www.iki.fi/pq/
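
The re-fault plan described in the message above can be sketched as a tiny state machine. This is a user-space illustration only: refault_ctx, model_on_fault() and model_on_step_done() are hypothetical names, and it assumes the tracer is told when the faulting instruction has completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum fault_action {
	FAULT_TRACED,		/* tracer restored the entry and logs it */
	FAULT_PASS_THROUGH	/* hand over to normal fault handling */
};

/* Remembers the fault that is currently being traced, if any. */
struct refault_ctx {
	uintptr_t ip;	/* faulting instruction address */
	uintptr_t addr;	/* faulting data address */
	bool active;	/* entry restored, instruction not yet done */
};

/*
 * If the same instruction faults on the same address while its entry
 * is already restored, the fault was not caused by the tracer, so it
 * is passed on. Otherwise the fault is recorded as a traced one.
 */
static enum fault_action model_on_fault(struct refault_ctx *ctx,
					uintptr_t ip, uintptr_t addr)
{
	assert(ctx != NULL);
	if (ctx->active && ctx->ip == ip && ctx->addr == addr) {
		ctx->active = false;
		return FAULT_PASS_THROUGH;
	}
	ctx->ip = ip;
	ctx->addr = addr;
	ctx->active = true;
	return FAULT_TRACED;
}

/* Called once the instruction has completed and the page is re-armed. */
static void model_on_step_done(struct refault_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->active = false;
}
```

The sketch only covers the decision itself. It says nothing about code outside the fault handler that may inspect the modified entries.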
* Re: testing pmdval/pteval page presence bit
From: Jeremy Fitzhardinge @ 2009-02-11 18:51 UTC
To: Pekka Paalanen
Cc: linux-kernel, Ingo Molnar, Vegard Nossum, Stuart Bennett,
Oliver McFadden, david jeannot
Pekka Paalanen wrote:
> This is good to know. So far these are kernel mappings, as they are all
> created by ioremap*(), but there are plans to extend mmiotrace to
> trace IO-mappings accessed from user space. Do you have hints for that?
>
Set _PAGE_PROTNONE, which will make the kernel logically treat it as a
present pte with no page permissions, which is what you're doing. Bear
in mind that _PAGE_PROTNONE is overloaded with _PAGE_GLOBAL, so you'll
need to make sure you restore that properly too (kernel mappings are
generally global).
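
A minimal user-space model of this suggestion, assuming the x86 layout in which the present flag is bit 0 and _PAGE_PROTNONE shares bit 8 with _PAGE_GLOBAL. All MODEL_* constants and model_* functions are made up for the illustration, and the present test only mirrors the idea of pte_present(), not its exact kernel definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Assumed x86 bit layout, modeled locally rather than taken from headers. */
#define MODEL_PAGE_PRESENT  ((pteval_t)1 << 0)
#define MODEL_PAGE_GLOBAL   ((pteval_t)1 << 8)
#define MODEL_PAGE_PROTNONE MODEL_PAGE_GLOBAL	/* same bit, overloaded */

static_assert(MODEL_PAGE_PROTNONE == MODEL_PAGE_GLOBAL,
	      "model assumes PROTNONE and GLOBAL share a bit");

/* What the MMU looks at: only the present bit. */
static bool model_hw_present(pteval_t v)
{
	return (v & MODEL_PAGE_PRESENT) != 0;
}

/* What the kernel logically treats as present. */
static bool model_pte_present(pteval_t v)
{
	return (v & (MODEL_PAGE_PRESENT | MODEL_PAGE_PROTNONE)) != 0;
}

/* Save the full original value, then clear PRESENT and set PROTNONE. */
static pteval_t model_arm_protnone(pteval_t cur, pteval_t *saved)
{
	*saved = cur;
	return (cur & ~MODEL_PAGE_PRESENT) | MODEL_PAGE_PROTNONE;
}

/* Restoring the saved value also restores the original GLOBAL state. */
static pteval_t model_disarm(pteval_t saved)
{
	return saved;
}
```

Because the whole original value is saved, disarming gives back the global bit exactly as it was, for global and non-global mappings alike. Recomputing the value by flipping bits in the armed entry would lose that information, since both cases arm to the same value.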

On the other hand, I assume you intercept mmiotrace faults fairly early
in the fault handler, so that path shouldn't ever see an unexpected
not-present pte. But other code, like mprotect, mlock, etc, may inspect
those ptes and get a nasty surprise if it sees them non-present.

(What happens if someone uses mprotect on a mapping that mmiotrace has
made non-present?)

> OTOH, we are always dealing with PCI IO-mem-mappings, so would those ever
> be not present, excluding the mmiotrace case?
>
> Well, Stuart already found out that kernel ioremap*()'ed pages might
> not really be present; some fixes are coming to mmiotrace to cope with
> that. The plan is to restore the pte to the state it had before
> mmiotrace cleared the _PAGE_PRESENT flag, and if the same instruction
> and address fault again, fall through to the normal page fault
> handling. How can/will this fail? And what if it is a user page?
>

My main concern is non-fault-handler code which looks at ptes. How will
it know what it's looking at if mmiotrace has fiddled with the state?

J