From: Zachary Amsden <zach@vmware.com>
To: Andi Kleen <ak@suse.de>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Pratap Subrahmanyam <pratap@vmware.com>
Subject: [PATCH] x86_64 Avoid some atomic operations during address space destruction
Date: Sun, 07 Aug 2005 05:16:26 -0700
Message-ID: <42F5FB9A.5000708@vmware.com>
[-- Attachment #1: Type: text/plain, Size: 878 bytes --]
This turned out to be a huge win on 32-bit i386 in PAE mode, but it is
likely not as significant on x86_64; I don't know because I haven't
actually measured the cost. I don't have 64-bit hardware that I have
the luxury of rebooting right now, so this patch is untested, but if
someone wants to try this out, it might actually show a measurable win
on fork/exit. I lost my cycle count measurement diffs, but I don't
think they would apply cleanly to x86_64 anyway. This patch at least
looks good, and compiles cleanly on 2.6.13-rc5-mm1, thus passing some
level of testing.
Also, it might show reduced latency on preemptible kernels during heavy
fork/exit activity, possibly allowing ZAP_BLOCK_SIZE to be raised for
some architectures (I measured a ~30-50% reduction in cycle timings for
zap_pte_range on i386 with CONFIG_PREEMPT with the analogous patch).
Zach
[-- Attachment #2: x86_64-pte-destruction --]
[-- Type: text/plain, Size: 1576 bytes --]
Any architecture that has hardware-updated A/D (accessed/dirty) bits
that require synchronization against other processors during PTE
operations can benefit from doing non-atomic PTE updates during address
space destruction. Originally done on i386, now ported to x86_64.

Doing a read/write pair instead of an xchg() operation saves the
implicit lock, which turns out to be a big win on 32-bit (especially
with PAE).
Diffs-against: 2.6.13-rc5-mm1
Signed-off-by: Zachary Amsden <zach@vmware.com>
Index: linux-2.6.13-rc5-mm1/include/asm-x86_64/pgtable.h
===================================================================
--- linux-2.6.13-rc5-mm1.orig/include/asm-x86_64/pgtable.h 2005-08-07 04:56:37.000000000 -0700
+++ linux-2.6.13-rc5-mm1/include/asm-x86_64/pgtable.h 2005-08-07 04:59:18.601856096 -0700
@@ -104,6 +104,19 @@
((unsigned long) __va(pud_val(pud) & PHYSICAL_PAGE_MASK))
#define ptep_get_and_clear(mm,addr,xp) __pte(xchg(&(xp)->pte, 0))
+
+static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, unsigned long addr, pte_t *ptep, int full)
+{
+	pte_t pte;
+	if (full) {
+		pte = *ptep;
+		*ptep = __pte(0);
+	} else {
+		pte = ptep_get_and_clear(mm, addr, ptep);
+	}
+	return pte;
+}
+
#define pte_same(a, b) ((a).pte == (b).pte)
#define PMD_SIZE (1UL << PMD_SHIFT)
@@ -433,6 +446,7 @@
#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_DIRTY
#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
#define __HAVE_ARCH_PTEP_SET_WRPROTECT
#define __HAVE_ARCH_PTE_SAME
#include <asm-generic/pgtable.h>
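
A minimal user-space sketch of the idea in the patch above, not kernel code. It contrasts a single atomic exchange (the analogue of xchg(&(xp)->pte, 0)) with a separate read and write selected by a "full" flag. The names sim_pte_t, sim_get_and_clear and sim_get_and_clear_full are hypothetical stand-ins invented for illustration, and C11 relaxed atomics are assumed here as a portable way to express a plain load and a plain store. On x86, an exchange with a memory operand carries an implicit lock, while the relaxed load and store are ordinary moves.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a page table entry; this is not the kernel's pte_t. */
typedef struct {
	_Atomic uint64_t val;
} sim_pte_t;

/*
 * Atomic path: one exchange returns the old value and clears the entry
 * in a single indivisible step, so no concurrent update can be lost.
 */
static inline uint64_t sim_get_and_clear(sim_pte_t *p)
{
	return atomic_exchange(&p->val, 0);
}

/*
 * "full" path: a separate read and write. An update landing between the
 * two steps would be lost, so this is only acceptable when the caller
 * knows that cannot matter (the patch uses it only when "full" is set,
 * i.e. during address space destruction).
 */
static inline uint64_t sim_get_and_clear_full(sim_pte_t *p, int full)
{
	if (full) {
		uint64_t old = atomic_load_explicit(&p->val, memory_order_relaxed);
		atomic_store_explicit(&p->val, 0, memory_order_relaxed);
		return old;
	}
	return sim_get_and_clear(p);
}
```

Both paths return the previous value and leave the entry zeroed; they differ only in whether the read and the clear happen as one atomic step. Any speedup from the non-atomic path would have to be measured on real hardware, and this sketch makes no claim about its size.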