From: "H. Peter Anvin" <hpa@zytor.com>
To: Keir Fraser <Keir.Fraser@cl.cam.ac.uk>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
Ingo Molnar <mingo@elte.hu>, LKML <linux-kernel@vger.kernel.org>,
Andi Kleen <ak@suse.de>, Jan Beulich <jbeulich@novell.com>,
Eduardo Pereira Habkost <ehabkost@redhat.com>,
Ian Campbell <ijc@hellion.org.uk>,
William Irwin <wli@holomorphy.com>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH 11 of 11] x86: defer cr3 reload when doing pud_clear()
Date: Fri, 25 Jan 2008 16:10:48 -0800
Message-ID: <479A7A88.6010505@zytor.com>
In-Reply-To: <C3C0235A.12D8E%Keir.Fraser@cl.cam.ac.uk>

Keir Fraser wrote:
> On 25/1/08 22:54, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:
>
>> The only possibly relevant comment I can find in vol3a is:
>>
>> Older IA-32 processors that implement the PAE mechanism use uncached
>> accesses when loading page-directory-pointer table entries. This
>> behavior is model specific and not architectural. More recent IA-32
>> processors may cache page-directory-pointer table entries.
>
> Go read the Intel application note "TLBs, Paging-Structure Caches, and Their
> Invalidation" at http://www.intel.com/design/processor/applnots/317080.pdf
>
> Section 8.1 explains about the PDPTR cache in 32-bit PAE mode, which can
> only be refreshed by appropriate tickling of CR0, CR3 or CR4.
>
> It is also important to note that *any* valid page directory entry at *any*
> level in the page-table hierarchy can become cached at *any* time. Basically
> TLB lookup is performed as a longest-prefix match on the linear address to
> skip as many levels in a page-table walk as possible (where a walk is
> needed, because there is no full-length match on the linear address). So, if
> you modify a directory entry from present to not-present, or change the page
> directory that a valid pde points to, you probably need to flush the pde
> caching structure. One piece of good news is that all pde caches are flushed
> by any arbitrary INVLPG.
>
Actually, it's trickier than that. The PDPTRs, just like the segments,
aren't a real cache, and aren't invalidated by INVLPG. This means you
can't go from less permissive to more permissive without a reload,
which is normally permitted on x86. The PDPTRs should really be thought
of as an extended cr3 with four entries (this is also how it would
typically be implemented in hardware) rather than as a part of the
paging structure per se.

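To make the "extended cr3 with four entries" model concrete, here is a
minimal user-space sketch in C. It is a toy model of the behaviour
described above, not kernel code and not a description of any real
implementation; every name in it (toy_cpu, toy_write_cr3, toy_invlpg,
toy_pdpte_present) is hypothetical. It assumes the four PDPTEs are
copied into registers when cr3 is written and that INVLPG never touches
them.

```c
#include <assert.h>
#include <stdint.h>

#define PDPT_ENTRIES  4
#define PDPTE_PRESENT 0x1ULL

/* Hypothetical toy CPU: the four PDPTEs live in registers, not a cache. */
struct toy_cpu {
    const uint64_t *cr3;           /* in-memory PDPT that was last loaded */
    uint64_t pdptr[PDPT_ENTRIES];  /* register copies, filled on cr3 write */
};

/* Writing cr3 reloads all four PDPTR registers from memory. */
static void toy_write_cr3(struct toy_cpu *cpu, const uint64_t *pdpt)
{
    assert(cpu != 0 && pdpt != 0);
    cpu->cr3 = pdpt;
    for (int i = 0; i < PDPT_ENTRIES; i++)
        cpu->pdptr[i] = pdpt[i];
}

/* In this model INVLPG leaves the PDPTR registers untouched. */
static void toy_invlpg(struct toy_cpu *cpu, uint32_t linear)
{
    (void)cpu;
    (void)linear;
}

/* A walk starts from the register selected by linear address bits 31:30. */
static int toy_pdpte_present(const struct toy_cpu *cpu, uint32_t linear)
{
    return (cpu->pdptr[linear >> 30] & PDPTE_PRESENT) != 0;
}
```

In this model, setting pdpt[1] to a present entry in memory and then
calling toy_invlpg() leaves toy_pdpte_present() returning 0 for
0x40000000; only a second toy_write_cr3() makes the new entry visible.
That is the "less permissive to more permissive" case above.
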
We do NOT want to frob %cr4 unless we actually need to clear all the
global pages.

The stuff in chapter 10 sounds like they're flagging a revised INVLPG
instruction or mode which would fix some of the extremely serious
defects in INVLPG that were introduced by haphazard semantics from the
P5 and early P6 days.

In general, we should only assume that INVLPG invalidates the hierarchy
immediately above the page it names, and not rely on any side effects.
That's basically sane design anyway.

Now, all of this reminds me of something somewhat messy: if we share the
kernel page tables for trampoline page tables, as discussed elsewhere,
we HAVE to do a complete, all-TLB-including-global-pages flush after
use, since the kernel pages are global and will otherwise stick around.
Unlike the permission bits, there are no G enable bits on the higher
levels, only on the PTEs themselves.

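As a rough illustration of why a plain cr3 reload is not enough here,
the following user-space sketch models a TLB whose entries carry the G
bit. It is a toy under stated assumptions, not kernel code: the names
(toy_tlb, toy_reload_cr3, toy_toggle_cr4_pge, and so on) are
hypothetical, and the model only encodes two rules: a cr3 write drops
non-global entries, while toggling CR4.PGE drops everything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TLB_SIZE 8

struct toy_tlb_entry {
    bool valid;
    bool global;   /* copy of the G bit from the PTE */
    uint32_t vpn;  /* virtual page number */
};

struct toy_tlb {
    struct toy_tlb_entry e[TOY_TLB_SIZE];
};

/* Install a translation in a given slot. */
static void toy_tlb_fill(struct toy_tlb *tlb, size_t slot,
                         uint32_t vpn, bool global)
{
    assert(slot < TOY_TLB_SIZE);
    tlb->e[slot].valid = true;
    tlb->e[slot].global = global;
    tlb->e[slot].vpn = vpn;
}

/* cr3 reload: only non-global entries are dropped. */
static void toy_reload_cr3(struct toy_tlb *tlb)
{
    for (size_t i = 0; i < TOY_TLB_SIZE; i++)
        if (!tlb->e[i].global)
            tlb->e[i].valid = false;
}

/* Toggling CR4.PGE: every entry is dropped, global or not. */
static void toy_toggle_cr4_pge(struct toy_tlb *tlb)
{
    for (size_t i = 0; i < TOY_TLB_SIZE; i++)
        tlb->e[i].valid = false;
}

/* Count the entries that are still valid. */
static size_t toy_tlb_valid(const struct toy_tlb *tlb)
{
    size_t n = 0;
    for (size_t i = 0; i < TOY_TLB_SIZE; i++)
        if (tlb->e[i].valid)
            n++;
    return n;
}
```

With one global (kernel) entry and one non-global entry filled in,
toy_reload_cr3() leaves the global one behind; only
toy_toggle_cr4_pge() empties the model TLB. That is the stale-entry
hazard with trampoline page tables built on the kernel's global
mappings.
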
-hpa