From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: LKML <linux-kernel@vger.kernel.org>, Andi Kleen <ak@suse.de>,
Jan Beulich <jbeulich@novell.com>,
Eduardo Pereira Habkost <ehabkost@redhat.com>,
Ian Campbell <ijc@hellion.org.uk>, H Peter Anvin <hpa@zytor.com>,
William Irwin <wli@holomorphy.com>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 09 of 11] x86: preallocate pmds at pgd creation time
Date: Fri, 25 Jan 2008 13:23:18 -0800
Message-ID: <9e9568e9f481b44f50f7.1201296198@localhost>
In-Reply-To: <patchbomb.1201296189@localhost>
In PAE mode, an update to the pgd requires a cr3 reload to make sure
the processor notices the changes. Since this also has the
side-effect of flushing the tlb, it's an expensive operation which we
want to avoid where possible.
This patch mitigates the cost of installing the initial set of pmds on
process creation by preallocating them when the pgd is allocated.
This avoids up to three tlb flushes during exec, which otherwise
populates the new process address space while the pagetable is in
active use.
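As a rough illustration of the saving, here is a userspace toy model,
not kernel code. Every name in it is hypothetical, and it assumes the
default 3G/1G split, so that three of the four top-level entries cover
user space. It only counts how often a cr3 reload would be needed.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PGD_ENTRIES  4	/* a PAE top-level table has 4 entries */
#define MODEL_USER_ENTRIES 3	/* assumption: default 3G/1G split */

/* Hypothetical stand-in for a PAE pgd; tracks reloads, maps nothing. */
struct flush_model {
	bool present[MODEL_PGD_ENTRIES];
	bool active;		/* true once the table is "loaded into cr3" */
	int cr3_reloads;	/* reloads (and so tlb flushes) needed so far */
};

/* Install a top-level entry; costs a reload only if the table is live. */
static void model_set_entry(struct flush_model *m, int idx)
{
	assert(idx >= 0 && idx < MODEL_PGD_ENTRIES);
	if (m->present[idx])
		return;
	m->present[idx] = true;
	if (m->active)
		m->cr3_reloads++;
}

/* Entries are installed on demand while the table is already in use. */
int model_lazy_reloads(void)
{
	struct flush_model m = { .active = true };
	int i;

	for (i = 0; i < MODEL_USER_ENTRIES; i++)
		model_set_entry(&m, i);
	return m.cr3_reloads;
}

/* Entries are installed before the table is ever loaded. */
int model_prealloc_reloads(void)
{
	struct flush_model m = { .active = false };
	int i;

	for (i = 0; i < MODEL_USER_ENTRIES; i++)
		model_set_entry(&m, i);
	m.active = true;
	/* Later demand for the same entries finds them already present. */
	for (i = 0; i < MODEL_USER_ENTRIES; i++)
		model_set_entry(&m, i);
	return m.cr3_reloads;
}
```

In this model the lazy path counts one reload per user entry, while
the preallocating path counts none, matching the "up to three" above.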
The pmds will be freed as part of the normal pagetable teardown in
free_pgtables, which is called in munmap and process exit. However,
free_pgtables will only free parts of the pagetable which actually
contain mappings, so stray pmds may still be attached to the pgd at
pgd_free time. We must mop them up to prevent a memory leak.
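The allocate-all-or-roll-back pattern used here can be sketched in
userspace as follows. This is only an illustration under assumptions:
calloc/free stand in for the kernel's pmd allocator, the toy_* names
and the fail_at failure-injection parameter are hypothetical, and
three user slots assume the default 3G/1G split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_USER_SLOTS 3	/* assumption: default 3G/1G split */
#define TOY_PAGE_SIZE  4096

/* Hypothetical stand-in for the user part of a PAE pgd. */
struct toy_pgd {
	void *slot[TOY_USER_SLOTS];
};

int toy_live_pages;		/* pages currently allocated, for leak checks */

static void *toy_alloc_page(void)
{
	void *p = calloc(1, TOY_PAGE_SIZE);

	if (p)
		toy_live_pages++;
	return p;
}

static void toy_free_page(void *p)
{
	assert(toy_live_pages > 0);
	free(p);
	toy_live_pages--;
}

/* Free every page still attached; safe on a partially filled table. */
void toy_mop_up(struct toy_pgd *pgd)
{
	int i;

	for (i = 0; i < TOY_USER_SLOTS; i++) {
		void *p = pgd->slot[i];

		if (p) {
			pgd->slot[i] = NULL;	/* detach before freeing */
			toy_free_page(p);
		}
	}
}

/*
 * Fill every slot, or undo the partial work and return 0.
 * fail_at injects an allocation failure at that index (-1 for none).
 */
int toy_prepopulate(struct toy_pgd *pgd, int fail_at)
{
	int i;

	for (i = 0; i < TOY_USER_SLOTS; i++) {
		void *p = (i == fail_at) ? NULL : toy_alloc_page();

		if (!p) {
			toy_mop_up(pgd);
			return 0;
		}
		pgd->slot[i] = p;
	}
	return 1;
}
```

Because the mop-up routine skips empty slots, the same function can
serve both the error path and the final teardown, which is how the
patch uses pgd_mop_up_pmds.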
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: William Irwin <wli@holomorphy.com>
---
arch/x86/mm/pgtable_32.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/arch/x86/mm/pgtable_32.c b/arch/x86/mm/pgtable_32.c
--- a/arch/x86/mm/pgtable_32.c
+++ b/arch/x86/mm/pgtable_32.c
@@ -258,17 +258,87 @@ static void pgd_dtor(void *pgd)
 	spin_unlock_irqrestore(&pgd_lock, flags);
 }
+#ifdef CONFIG_X86_PAE
+/*
+ * Mop up any pmd pages which may still be attached to the pgd.
+ * Normally they will be freed by munmap/exit_mmap, but any pmd we
+ * preallocate which never got a corresponding vma will need to be
+ * freed manually.
+ */
+static void pgd_mop_up_pmds(pgd_t *pgdp)
+{
+	int i;
+
+	for (i = 0; i < USER_PTRS_PER_PGD; i++) {
+		pgd_t pgd = pgdp[i];
+
+		if (pgd_val(pgd) != 0) {
+			pmd_t *pmd = (pmd_t *)pgd_page_vaddr(pgd);
+
+			pgdp[i] = native_make_pgd(0);
+
+			paravirt_release_pd(pgd_val(pgd) >> PAGE_SHIFT);
+			pmd_free(pmd);
+		}
+	}
+}
+
+/*
+ * In PAE mode, we need to do a cr3 reload (=tlb flush) when
+ * updating the top-level pagetable entries to guarantee the
+ * processor notices the update. Since this is expensive, and
+ * all 4 top-level entries are used almost immediately in a
+ * new process's life, we just pre-populate them here.
+ */
+static int pgd_prepopulate_pmd(struct mm_struct *mm, pgd_t *pgd)
+{
+	pud_t *pud;
+	unsigned long addr;
+	int i;
+
+	pud = pud_offset(pgd, 0);
+	for (addr = i = 0; i < USER_PTRS_PER_PGD; i++, pud++, addr += PUD_SIZE) {
+		pmd_t *pmd = pmd_alloc_one(mm, addr);
+
+		if (!pmd) {
+			pgd_mop_up_pmds(pgd);
+			return 0;
+		}
+
+		pud_populate(mm, pud, pmd);
+	}
+
+	return 1;
+}
+#else /* !CONFIG_X86_PAE */
+/* No need to prepopulate any pagetable entries in non-PAE modes. */
+static int pgd_prepopulate_pmd(struct mm_struct *mm, pgd_t *pgd)
+{
+	return 1;
+}
+
+static void pgd_mop_up_pmds(pgd_t *pgd)
+{
+}
+#endif /* CONFIG_X86_PAE */
+
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
 	pgd_t *pgd = quicklist_alloc(0, GFP_KERNEL, pgd_ctor);
 	mm->pgd = pgd;		/* so that alloc_pd can use it */
+	if (pgd && !pgd_prepopulate_pmd(mm, pgd)) {
+		quicklist_free(0, pgd_dtor, pgd);
+		pgd = NULL;
+	}
+
 	return pgd;
 }
 void pgd_free(pgd_t *pgd)
 {
+	pgd_mop_up_pmds(pgd);
 	quicklist_free(0, pgd_dtor, pgd);
 }