From: Nick Piggin <nickpiggin@yahoo.com.au>
To: linux-ia64@vger.kernel.org
Subject: [PATCH] pte prefetching
Date: Thu, 24 Mar 2005 07:18:17 +0000
Message-ID: <424269B9.9020306@yahoo.com.au>
[-- Attachment #1: Type: text/plain, Size: 1184 bytes --]
Hi,
Sending this to the ia64 list, because that is so far the only platform
I have tested on, and because the patch may be more likely to have real
applications on ia64 systems.
I have been looking at different implementations of unmapping and page
table freeing recently. As a consequence, I came to notice that the
vast majority of L2 cache misses on ia64 (and probably all
architectures) in an unmapping workload come from the line:
	pte_t ptent = *pte;
in zap_pte_range, i.e. walking the bottom-level page table pages.
I should qualify that: this is the case only when the page tables
aren't in cache, so it does not apply to a simple lmbench fork/exit
test, for example.
Anyway, I tried prefetching the cache line after the one we're
currently working on, and put the prefetching into zap_pte_range and
copy_pte_range (which does a similar pte walk to set up page tables
on fork()).
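
A rough user-space sketch of the same idea, for readers who want to
try it outside the kernel. It is not the patch's code: count_present,
LINE_BYTES and ENTRIES_PER_LINE are hypothetical names, the 128-byte
line size is an assumption, and the GCC/Clang builtin
__builtin_prefetch stands in for the kernel's prefetch(). It also
assumes the table starts on a line boundary, which the patch does not
require.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; the patch derives the
 * equivalent constants from L1_CACHE_BYTES and sizeof(pte_t). */
#define LINE_BYTES 128u
#define ENTRIES_PER_LINE (LINE_BYTES / sizeof(uint64_t))

/* Walk entries [0, n) and count the non-zero ones. Each time the walk
 * reaches the first entry of a line, it requests the following line,
 * provided that line still holds entries the walk will visit. */
size_t count_present(const uint64_t *table, size_t n)
{
	size_t present = 0;

	for (size_t i = 0; i < n; i++) {
		if (i % ENTRIES_PER_LINE == 0 && i + ENTRIES_PER_LINE < n)
			__builtin_prefetch(&table[i + ENTRIES_PER_LINE]);
		if (table[i] != 0)
			present++;
	}
	return present;
}
```

The prefetch is only a hint, so the result is the same with or without
it; any benefit shows up only when the table is not already in cache.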
Microbenchmark results are pretty good, but I wonder if anyone might
have a real-world use for it?
After applying the recent freepgt patchset from Hugh (on lkml), the
time to fork+exit a process mapping 64GB of address space (32MB of
page tables) is 0.471s. With the prefetch patch, this drops to 0.357s.
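
As a rough check of those figures: the 32MB of page tables is
consistent with 16KB pages and 8-byte ptes, but both values are
assumptions here, not stated above. The helper names below are
hypothetical.

```c
#include <stdint.h>

/* Bytes of bottom-level page table needed to map mapped_bytes of
 * address space, for a given page size and pte size (all in bytes). */
uint64_t pte_table_bytes(uint64_t mapped_bytes, uint64_t page_bytes,
			 uint64_t pte_bytes)
{
	return mapped_bytes / page_bytes * pte_bytes;
}

/* Time saved as a whole percentage of the original time, rounded
 * down. Both times are in milliseconds. */
unsigned percent_saved(unsigned before_ms, unsigned after_ms)
{
	return (before_ms - after_ms) * 100u / before_ms;
}
```

With 64GB mapped, 16KB pages and 8-byte ptes, pte_table_bytes() gives
32MB, and percent_saved(471, 357) gives 24, i.e. roughly a quarter off
the fork+exit time.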
[-- Attachment #2: pte-prefetch --]
[-- Type: text/plain, Size: 2843 bytes --]
Index: linux-2.6/include/asm-generic/pgtable.h
===================================================================
--- linux-2.6.orig/include/asm-generic/pgtable.h 2005-03-24 10:43:38.000000000 +1100
+++ linux-2.6/include/asm-generic/pgtable.h 2005-03-24 12:08:57.000000000 +1100
@@ -160,6 +160,39 @@ static inline void ptep_set_wrprotect(st
})
#endif
+#define PTES_PER_LINE (L1_CACHE_BYTES / sizeof(pte_t))
+#define PTE_LINE_MASK (~(PTES_PER_LINE - 1))
+#define ADDR_PER_LINE (PTES_PER_LINE << PAGE_SHIFT)
+#define ADDR_LINE_MASK (~((PTES_PER_LINE << PAGE_SHIFT) - 1))
+
+#define pte_prefetch(pte, addr, end) \
+({ \
+ unsigned long nextline = ((addr) + ADDR_PER_LINE) & ADDR_LINE_MASK; \
+ if (nextline < (end)) \
+ prefetch(pte + PTES_PER_LINE); \
+})
+
+#define pte_prefetch_next(pte, addr, end) \
+({ \
+ unsigned long _addr = (addr); \
+ if (!(_addr & ~ADDR_LINE_MASK)) /* We hit a new cacheline */ \
+ pte_prefetch(pte, _addr, end); \
+})
+
+#define pte_prefetchw(pte, addr, end) \
+({ \
+ unsigned long nextline = ((addr) + ADDR_PER_LINE) & ADDR_LINE_MASK; \
+ if (nextline < (end)) \
+ prefetchw(pte + PTES_PER_LINE); \
+})
+
+#define pte_prefetchw_next(pte, addr, end) \
+({ \
+ unsigned long _addr = (addr); \
+ if (!(_addr & ~ADDR_LINE_MASK)) /* We hit a new cacheline */ \
+ pte_prefetchw(pte, _addr, end); \
+})
+
#ifndef __ASSEMBLY__
/*
* When walking page tables, we usually want to skip any p?d_none entries;
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c 2005-03-24 12:08:43.000000000 +1100
+++ linux-2.6/mm/memory.c 2005-03-24 12:08:57.000000000 +1100
@@ -411,6 +411,7 @@ again:
progress = 0;
spin_lock(&src_mm->page_table_lock);
+ pte_prefetch(src_pte, addr, end);
do {
/*
* We are holding two locks at this point - either of them
@@ -426,7 +427,9 @@ again:
}
copy_one_pte(dst_mm, src_mm, dst_pte, src_pte, vm_flags, addr);
progress += 8;
- } while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
+ } while (dst_pte++, src_pte++, addr += PAGE_SIZE,
+ pte_prefetch_next(src_pte, addr, end), addr != end);
+
spin_unlock(&src_mm->page_table_lock);
pte_unmap_nested(src_pte - 1);
@@ -512,6 +515,7 @@ static void zap_pte_range(struct mmu_gat
pte_t *pte;
pte = pte_offset_map(pmd, addr);
+ pte_prefetchw(pte, addr, end);
do {
pte_t ptent = *pte;
if (pte_none(ptent))
@@ -571,7 +575,8 @@ static void zap_pte_range(struct mmu_gat
if (!pte_file(ptent))
free_swap_and_cache(pte_to_swp_entry(ptent));
pte_clear(tlb->mm, addr, pte);
- } while (pte++, addr += PAGE_SIZE, addr != end);
+ } while (pte++, addr += PAGE_SIZE,
+ pte_prefetchw_next(pte, addr, end), addr != end);
pte_unmap(pte - 1);
}