From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: LKML <linux-kernel@vger.kernel.org>, Andi Kleen <ak@suse.de>,
	Jan Beulich <jbeulich@novell.com>,
	Eduardo Pereira Habkost <ehabkost@redhat.com>,
	Ian Campbell <ijc@hellion.org.uk>, H Peter Anvin <hpa@zytor.com>,
	William Irwin <wli@holomorphy.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 11 of 11] x86: defer cr3 reload when doing pud_clear()
Date: Fri, 25 Jan 2008 13:23:20 -0800
Message-ID: <c084def0f6afb2b2ef47.1201296200@localhost>
In-Reply-To: <patchbomb.1201296189@localhost>

PAE mode requires that we reload cr3 in order to guarantee that
changes to the pgd will be noticed by the processor.  This means that
in principle pud_clear needs to reload cr3 every time.  However,
because reloading cr3 implies a tlb flush, we want to avoid it where
possible.

pud_clear() is only used in a couple of places:
 - in free_pmd_range(), when pulling down a range of process address space, and
 - huge_pmd_unshare()

In both cases, the calling code will do a tlb flush anyway, so
there's no need to do it within pud_clear().

In free_pmd_range(), the pud_clear() is immediately followed by
pmd_free_tlb(); we can hook that to make the mmu_gather do an
unconditional full flush to make sure cr3 gets reloaded.

In huge_pmd_unshare(), it is followed by flush_tlb_range(), which on
x86 always results in a full cr3-reload tlb flush.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: William Irwin <wli@holomorphy.com>
---
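For readers who want to see the effect of the deferral described in the
changelog, here is a minimal user-space sketch. It is not kernel code:
every toy_* name, the toy_gather struct, and the reload counter are
hypothetical stand-ins invented for illustration. It also simplifies by
assuming that a gather finished without fullmm set does not reload cr3
at all. Under those assumptions it counts simulated cr3 reloads when each
pud clear reloads eagerly versus when the reload is deferred to the end
of the gather.

```c
#include <assert.h>

/* Hypothetical stand-in for struct mmu_gather; only the flag we need. */
struct toy_gather {
	int fullmm;	/* nonzero: do a full flush (cr3 reload) at finish */
};

/* Counts simulated cr3 reloads; purely illustrative. */
static int toy_cr3_reloads;

static void toy_reload_cr3(void)
{
	toy_cr3_reloads++;
}

/* Old behaviour: clear the entry and reload cr3 immediately. */
static void toy_pud_clear_eager(unsigned long *pudp)
{
	*pudp = 0;
	toy_reload_cr3();
}

/* New behaviour: clear the entry and leave the flush to the caller. */
static void toy_pud_clear_deferred(unsigned long *pudp)
{
	*pudp = 0;
}

/* Models the hook in __pmd_free_tlb(): request a full flush later. */
static void toy_pmd_free_tlb(struct toy_gather *tlb)
{
	tlb->fullmm = 1;
}

/* Models the end of the gather: one full flush if one was requested. */
static void toy_finish_mmu(struct toy_gather *tlb)
{
	if (tlb->fullmm)
		toy_reload_cr3();
}

/*
 * Tear down n_entries pud entries and return how many simulated cr3
 * reloads happened. defer selects the new (deferred) behaviour.
 */
int toy_teardown(int n_entries, int defer)
{
	struct toy_gather tlb = { .fullmm = 0 };
	unsigned long pud_entry;
	int i;

	assert(n_entries >= 0);
	toy_cr3_reloads = 0;

	for (i = 0; i < n_entries; i++) {
		pud_entry = 0x1000UL * (i + 1);	/* pretend it maps a pmd */
		if (defer) {
			toy_pud_clear_deferred(&pud_entry);
			toy_pmd_free_tlb(&tlb);
		} else {
			toy_pud_clear_eager(&pud_entry);
		}
		assert(pud_entry == 0);
	}

	toy_finish_mmu(&tlb);
	return toy_cr3_reloads;
}
```

In this model toy_teardown(n, 0) reloads once per cleared entry, while
toy_teardown(n, 1) reloads once in total for any n > 0, which is the
saving the patch is after.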
 include/asm-x86/pgalloc_32.h     |    7 +++++++
 include/asm-x86/pgtable-3level.h |   21 +++++++++++++++------
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/include/asm-x86/pgalloc_32.h b/include/asm-x86/pgalloc_32.h
--- a/include/asm-x86/pgalloc_32.h
+++ b/include/asm-x86/pgalloc_32.h
@@ -74,6 +74,13 @@ static inline void pmd_free(pmd_t *pmd)
 
 static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 {
+	/* This is called just after the pmd has been detached from
+	   the pgd, which requires a full tlb flush to be recognized
+	   by the CPU.  Rather than incurring multiple tlb flushes
+	   while the address space is being pulled down, make the tlb
+	   gathering machinery do a full flush when we're done. */
+	tlb->fullmm = 1;
+
 	paravirt_release_pd(__pa(pmd) >> PAGE_SHIFT);
 	tlb_remove_page(tlb, virt_to_page(pmd));
 }
diff --git a/include/asm-x86/pgtable-3level.h b/include/asm-x86/pgtable-3level.h
--- a/include/asm-x86/pgtable-3level.h
+++ b/include/asm-x86/pgtable-3level.h
@@ -96,14 +96,23 @@ static inline void pud_clear(pud_t *pudp
 	set_pud(pudp, __pud(0));
 
 	/*
-	 * Pentium-II erratum A13: in PAE mode we explicitly have to flush
-	 * the TLB via cr3 if the top-level pgd is changed...
+	 * In principle we need to do a cr3 reload here to make sure
+	 * the processor recognizes the changed pgd.  In practice, all
+	 * the places where pud_clear() gets called are followed by
+	 * full tlb flushes anyway, so we can defer the cost here.
 	 *
-	 * XXX I don't think we need to worry about this here, since
-	 * when clearing the pud, the calling code needs to flush the
-	 * tlb anyway.  But do it now for safety's sake. - jsgf
+	 * Specifically:
+	 *
+	 * mm/memory.c:free_pmd_range() - immediately after the
+	 * pud_clear() it does a pmd_free_tlb().  We change the
+	 * mmu_gather structure to do a full tlb flush (which has the
+	 * effect of reloading cr3) when the pagetable free is
+	 * complete.
+	 *
+	 * arch/x86/mm/hugetlbpage.c:huge_pmd_unshare() - the call to
+	 * this is followed by a flush_tlb_range, which on x86 does a
+	 * full tlb flush.
 	 */
-	write_cr3(read_cr3());
 }
 
 #define pud_page(pud) \



