* [PATCH 1 of 4] x86: unify PAE/non-PAE pgd_ctor
2008-01-28 23:48 [PATCH 0 of 4] x86: cleanups from pmd lifetime series Jeremy Fitzhardinge
@ 2008-01-28 23:48 ` Jeremy Fitzhardinge
2008-01-28 23:48 ` [PATCH 2 of 4] x86: revert "defer cr3 reload when doing pud_clear()" Jeremy Fitzhardinge
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Jeremy Fitzhardinge @ 2008-01-28 23:48 UTC
To: Ingo Molnar
Cc: LKML, Andi Kleen, Jan Beulich, Eduardo Pereira Habkost,
Ian Campbell, H Peter Anvin
The PAE and non-PAE versions of pgd_ctor are more or less identical,
and can be merged into a single function.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: William Irwin <wli@holomorphy.com>
---
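A minimal userspace sketch of the two decisions the unified constructor
makes, assuming the conditions read as in the diff below. The names
pgd_ctor_plan, plan_pgd_ctor and unshared_ptrs_per_pgd are hypothetical,
and the kernel macros (PAGETABLE_LEVELS, SHARED_KERNEL_PMD,
USER_PTRS_PER_PGD, PTRS_PER_PGD) are passed in as plain parameters so the
logic compiles outside the kernel. It models only the control flow, not
the locking or the actual pagetable writes.

```c
#include <stdbool.h>

/* Hypothetical summary of what the constructor would do for one pgd. */
struct pgd_ctor_plan {
	bool clone_kernel_entries;	/* copy kernel slots from swapper_pg_dir */
	bool add_to_pgd_list;		/* track pgd for kernel mapping syncs */
};

/*
 * Mirrors the two conditions in the unified pgd_ctor():
 *  - clone when the pgd points at a shared lower level
 *    (ptes in non-PAE, or a shared kernel PMD in PAE);
 *  - keep the pgd on the list only when the kernel PMD is not shared.
 */
static struct pgd_ctor_plan plan_pgd_ctor(int pagetable_levels,
					  bool shared_kernel_pmd)
{
	struct pgd_ctor_plan plan;

	plan.clone_kernel_entries =
		pagetable_levels == 2 ||
		(pagetable_levels == 3 && shared_kernel_pmd);
	plan.add_to_pgd_list = !shared_kernel_pmd;
	return plan;
}

/* Same selection as the UNSHARED_PTRS_PER_PGD macro, with explicit inputs. */
static int unshared_ptrs_per_pgd(bool shared_kernel_pmd,
				 int user_ptrs_per_pgd, int ptrs_per_pgd)
{
	return shared_kernel_pmd ? user_ptrs_per_pgd : ptrs_per_pgd;
}
```

With two levels (non-PAE) the kernel entries are cloned and the pgd goes
on the list. With PAE and a shared kernel PMD the entries are cloned but
the list is skipped. With PAE and an unshared kernel PMD nothing is cloned
in the constructor and the pgd is tracked on the list.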
arch/x86/mm/pgtable_32.c | 58 +++++++++++++++++-----------------------------
1 file changed, 22 insertions(+), 36 deletions(-)
diff --git a/arch/x86/mm/pgtable_32.c b/arch/x86/mm/pgtable_32.c
--- a/arch/x86/mm/pgtable_32.c
+++ b/arch/x86/mm/pgtable_32.c
@@ -219,50 +219,39 @@ static inline void pgd_list_del(pgd_t *p
list_del(&page->lru);
}
+#define UNSHARED_PTRS_PER_PGD \
+ (SHARED_KERNEL_PMD ? USER_PTRS_PER_PGD : PTRS_PER_PGD)
-
-#if (PTRS_PER_PMD == 1)
-/* Non-PAE pgd constructor */
-static void pgd_ctor(void *pgd)
+static void pgd_ctor(void *p)
{
+ pgd_t *pgd = p;
unsigned long flags;
- /* !PAE, no pagetable sharing */
+ /* Clear usermode parts of PGD */
memset(pgd, 0, USER_PTRS_PER_PGD*sizeof(pgd_t));
spin_lock_irqsave(&pgd_lock, flags);
- /* must happen under lock */
- clone_pgd_range((pgd_t *)pgd + USER_PTRS_PER_PGD,
- swapper_pg_dir + USER_PTRS_PER_PGD,
- KERNEL_PGD_PTRS);
- paravirt_alloc_pd_clone(__pa(pgd) >> PAGE_SHIFT,
- __pa(swapper_pg_dir) >> PAGE_SHIFT,
- USER_PTRS_PER_PGD,
+ /* If the pgd points to a shared pagetable level (either the
+ ptes in non-PAE, or shared PMD in PAE), then just copy the
+ references from swapper_pg_dir. */
+ if (PAGETABLE_LEVELS == 2 ||
+ (PAGETABLE_LEVELS == 3 && SHARED_KERNEL_PMD)) {
+ clone_pgd_range(pgd + USER_PTRS_PER_PGD,
+ swapper_pg_dir + USER_PTRS_PER_PGD,
KERNEL_PGD_PTRS);
- pgd_list_add(pgd);
+ paravirt_alloc_pd_clone(__pa(pgd) >> PAGE_SHIFT,
+ __pa(swapper_pg_dir) >> PAGE_SHIFT,
+ USER_PTRS_PER_PGD,
+ KERNEL_PGD_PTRS);
+ }
+
+ /* list required to sync kernel mapping updates */
+ if (!SHARED_KERNEL_PMD)
+ pgd_list_add(pgd);
+
spin_unlock_irqrestore(&pgd_lock, flags);
}
-#else /* PTRS_PER_PMD > 1 */
-/* PAE pgd constructor */
-static void pgd_ctor(void *pgd)
-{
- /* PAE, kernel PMD may be shared */
-
- if (SHARED_KERNEL_PMD) {
- clone_pgd_range((pgd_t *)pgd + USER_PTRS_PER_PGD,
- swapper_pg_dir + USER_PTRS_PER_PGD,
- KERNEL_PGD_PTRS);
- } else {
- unsigned long flags;
-
- memset(pgd, 0, USER_PTRS_PER_PGD*sizeof(pgd_t));
- spin_lock_irqsave(&pgd_lock, flags);
- pgd_list_add(pgd);
- spin_unlock_irqrestore(&pgd_lock, flags);
- }
-}
-#endif /* PTRS_PER_PMD */
static void pgd_dtor(void *pgd)
{
@@ -275,9 +264,6 @@ static void pgd_dtor(void *pgd)
pgd_list_del(pgd);
spin_unlock_irqrestore(&pgd_lock, flags);
}
-
-#define UNSHARED_PTRS_PER_PGD \
- (SHARED_KERNEL_PMD ? USER_PTRS_PER_PGD : PTRS_PER_PGD)
#ifdef CONFIG_X86_PAE
/*
* [PATCH 2 of 4] x86: revert "defer cr3 reload when doing pud_clear()"
From: Jeremy Fitzhardinge @ 2008-01-28 23:48 UTC
To: Ingo Molnar
Cc: LKML, Andi Kleen, Jan Beulich, Eduardo Pereira Habkost,
Ian Campbell, H Peter Anvin
Revert "defer cr3 reload when doing pud_clear()" since I'm going to
replace it.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
---
include/asm-x86/pgalloc_32.h | 7 -------
include/asm-x86/pgtable-3level.h | 21 ++++++---------------
2 files changed, 6 insertions(+), 22 deletions(-)
diff --git a/include/asm-x86/pgalloc_32.h b/include/asm-x86/pgalloc_32.h
--- a/include/asm-x86/pgalloc_32.h
+++ b/include/asm-x86/pgalloc_32.h
@@ -74,13 +74,6 @@ static inline void pmd_free(pmd_t *pmd)
static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
{
- /* This is called just after the pmd has been detached from
- the pgd, which requires a full tlb flush to be recognized
- by the CPU. Rather than incurring multiple tlb flushes
- while the address space is being pulled down, make the tlb
- gathering machinery do a full flush when we're done. */
- tlb->fullmm = 1;
-
paravirt_release_pd(__pa(pmd) >> PAGE_SHIFT);
tlb_remove_page(tlb, virt_to_page(pmd));
}
diff --git a/include/asm-x86/pgtable-3level.h b/include/asm-x86/pgtable-3level.h
--- a/include/asm-x86/pgtable-3level.h
+++ b/include/asm-x86/pgtable-3level.h
@@ -96,23 +96,14 @@ static inline void pud_clear(pud_t *pudp
set_pud(pudp, __pud(0));
/*
- * In principle we need to do a cr3 reload here to make sure
- * the processor recognizes the changed pgd. In practice, all
- * the places where pud_clear() gets called are followed by
- * full tlb flushes anyway, so we can defer the cost here.
+ * Pentium-II erratum A13: in PAE mode we explicitly have to flush
+ * the TLB via cr3 if the top-level pgd is changed...
*
- * Specifically:
- *
- * mm/memory.c:free_pmd_range() - immediately after the
- * pud_clear() it does a pmd_free_tlb(). We change the
- * mmu_gather structure to do a full tlb flush (which has the
- * effect of reloading cr3) when the pagetable free is
- * complete.
- *
- * arch/x86/mm/hugetlbpage.c:huge_pmd_unshare() - the call to
- * this is followed by a flush_tlb_range, which on x86 does a
- * full tlb flush.
+ * XXX I don't think we need to worry about this here, since
+ * when clearing the pud, the calling code needs to flush the
+ * tlb anyway. But do it now for safety's sake. - jsgf
*/
+ write_cr3(read_cr3());
}
#define pud_page(pud) \
* [PATCH 3 of 4] x86: pud_clear: only reload cr3 if necessary
From: Jeremy Fitzhardinge @ 2008-01-28 23:48 UTC
To: Ingo Molnar
Cc: LKML, Andi Kleen, Jan Beulich, Eduardo Pereira Habkost,
Ian Campbell, H Peter Anvin
Rather than unconditionally reloading cr3, only do so if the pud we're
updating is within the active pgd.
This eliminates TLB flushes most of the time. The
performance-critical uses of pud_clear are during execve and exit, but
in those cases cr3 is referring to some other pagetable. The only
other use of pud_clear is during a large (1Gbyte+) munmap, and those
are sufficiently rare that a couple of cr3 reloads won't hurt.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
---
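A minimal sketch of the range check described above, written as a
standalone function so it can be exercised in userspace. The name
pud_in_active_pgd and its parameters are hypothetical: pud_phys stands
in for __pa(pudp), cr3_phys for the value returned by read_cr3(), and
entry_size and entries for sizeof(pgd_t) and PTRS_PER_PGD. It assumes
cr3 holds just the physical base of the pgd with no flag bits set, and
that the pud is folded into the pgd as on 32-bit PAE, so that pudp
points directly at a top-level entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true when the entry at physical address pud_phys lies inside
 * the top-level table that cr3 currently points to, i.e. in the
 * half-open range [cr3_phys, cr3_phys + entry_size * entries).
 * Only in that case would a cr3 reload be needed after the update.
 */
static bool pud_in_active_pgd(uintptr_t pud_phys, uintptr_t cr3_phys,
			      size_t entry_size, size_t entries)
{
	return pud_phys >= cr3_phys &&
	       pud_phys < cr3_phys + entry_size * entries;
}
```

For a PAE pgd with 4 entries of 8 bytes each, the active range is 32
bytes long, so an entry belonging to any other process's pgd falls
outside it and the reload is skipped.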
include/asm-x86/pgtable-3level.h | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/include/asm-x86/pgtable-3level.h b/include/asm-x86/pgtable-3level.h
--- a/include/asm-x86/pgtable-3level.h
+++ b/include/asm-x86/pgtable-3level.h
@@ -93,17 +93,20 @@ static inline void native_pmd_clear(pmd_
static inline void pud_clear(pud_t *pudp)
{
+ unsigned long pgd;
+
set_pud(pudp, __pud(0));
/*
* Pentium-II erratum A13: in PAE mode we explicitly have to flush
* the TLB via cr3 if the top-level pgd is changed...
*
- * XXX I don't think we need to worry about this here, since
- * when clearing the pud, the calling code needs to flush the
- * tlb anyway. But do it now for safety's sake. - jsgf
+ * Make sure the pud entry we're updating is within the
+ * current pgd to avoid unnecessary TLB flushes.
*/
- write_cr3(read_cr3());
+ pgd = read_cr3();
+ if (__pa(pudp) >= pgd && __pa(pudp) < (pgd + sizeof(pgd_t)*PTRS_PER_PGD))
+ write_cr3(pgd);
}
#define pud_page(pud) \
* [PATCH 4 of 4] x86: update reference for PAE tlb flushing
From: Jeremy Fitzhardinge @ 2008-01-28 23:48 UTC
To: Ingo Molnar
Cc: LKML, Andi Kleen, Jan Beulich, Eduardo Pereira Habkost,
Ian Campbell, H Peter Anvin
Remove bogus reference to "Pentium-II erratum A13" and point to the
actual canonical source of information about what requirements x86
processors have for PAE pagetable updates.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
---
include/asm-x86/pgalloc_32.h | 6 ++++--
include/asm-x86/pgtable-3level.h | 6 ++++--
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/include/asm-x86/pgalloc_32.h b/include/asm-x86/pgalloc_32.h
--- a/include/asm-x86/pgalloc_32.h
+++ b/include/asm-x86/pgalloc_32.h
@@ -87,8 +87,10 @@ static inline void pud_populate(struct m
set_pud(pudp, __pud(__pa(pmd) | _PAGE_PRESENT));
/*
- * Pentium-II erratum A13: in PAE mode we explicitly have to flush
- * the TLB via cr3 if the top-level pgd is changed...
+ * According to Intel App note "TLBs, Paging-Structure Caches,
+ * and Their Invalidation", April 2007, document 317080-001,
+ * section 8.1: in PAE mode we explicitly have to flush the
+ * TLB via cr3 if the top-level pgd is changed...
*/
if (mm == current->active_mm)
write_cr3(read_cr3());
diff --git a/include/asm-x86/pgtable-3level.h b/include/asm-x86/pgtable-3level.h
--- a/include/asm-x86/pgtable-3level.h
+++ b/include/asm-x86/pgtable-3level.h
@@ -98,8 +98,10 @@ static inline void pud_clear(pud_t *pudp
set_pud(pudp, __pud(0));
/*
- * Pentium-II erratum A13: in PAE mode we explicitly have to flush
- * the TLB via cr3 if the top-level pgd is changed...
+ * According to Intel App note "TLBs, Paging-Structure Caches,
+ * and Their Invalidation", April 2007, document 317080-001,
+ * section 8.1: in PAE mode we explicitly have to flush the
+ * TLB via cr3 if the top-level pgd is changed...
*
* Make sure the pud entry we're updating is within the
* current pgd to avoid unnecessary TLB flushes.
* Re: [PATCH 0 of 4] x86: cleanups from pmd lifetime series
From: Ingo Molnar @ 2008-02-01 16:26 UTC
To: Jeremy Fitzhardinge
Cc: LKML, Andi Kleen, Jan Beulich, Eduardo Pereira Habkost,
Ian Campbell, H Peter Anvin
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> Hi Ingo,
>
> Here's a followup set from that last batch of patches:
> 1. fix up the pgd_ctor merge, so that non-PAE will end up getting
> kernel mappings
FYI, only this one applied to the latest x86.git tree, could you please
resend? I guess the pgalloc.h related revert interfered.
Ingo
* Re: [PATCH 0 of 4] x86: cleanups from pmd lifetime series
From: Jeremy Fitzhardinge @ 2008-02-01 16:35 UTC
To: Ingo Molnar
Cc: LKML, Andi Kleen, Jan Beulich, Eduardo Pereira Habkost,
Ian Campbell, H Peter Anvin
Ingo Molnar wrote:
> FYI, only this one applied to the latest x86.git tree, could you please
> resend? I guess the pgalloc.h related revert interfered.
>
OK. I'll do a quick rebase to this morning's tree and resend.
J
* Re: [PATCH 0 of 4] x86: cleanups from pmd lifetime series
From: Ingo Molnar @ 2008-02-01 17:08 UTC
To: Jeremy Fitzhardinge
Cc: LKML, Andi Kleen, Jan Beulich, Eduardo Pereira Habkost,
Ian Campbell, H Peter Anvin
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>> FYI, only this one applied to the latest x86.git tree, could you
>> please resend? I guess the pgalloc.h related revert interfered.
>
> OK. I'll do a quick rebase to this morning's tree and resend.
thanks, picked them up, and they just made today's x86.git#mm pushout as
well.
Ingo