From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, "H. Peter Anvin" <hpa@linux.intel.com>,
linux-kernel@vger.kernel.org,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Arnd Bergmann <arnd@arndb.de>, Ingo Molnar <mingo@kernel.org>,
linux-arch@vger.kernel.org
Subject: Re: [PATCH 0/3] Virtual huge zero page
Date: Mon, 1 Oct 2012 16:49:48 +0300
Message-ID: <20121001134948.GA5812@otc-wbsnb-06>
In-Reply-To: <20120929143737.GF26989@redhat.com>
On Sat, Sep 29, 2012 at 04:37:37PM +0200, Andrea Arcangeli wrote:
> But I agree we need to verify it before taking a decision, and that
> the numbers are better than theory, or to rephrase it "let's check the
> theory is right" :)
Okay, microbenchmark:
% cat test_memcmp.c
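#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MB (1024ul * 1024ul)
#define GB (1024ul * MB)

int main(int argc, char **argv)
{
	char *p;
	int i;

	/* 8 GB, aligned to 2 MB so the range can be backed by huge pages. */
	if (posix_memalign((void **)&p, 2 * MB, 8 * GB))
		return 1;

	/*
	 * The buffer is never written; the test relies on the allocator
	 * returning fresh, zero-filled mappings.
	 */
	for (i = 0; i < 100; i++) {
		assert(memcmp(p, p + 4*GB, 4*GB) == 0);
		/* Compiler barrier between iterations. */
		asm volatile ("": : :"memory");
	}
	return 0;
}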
huge zero page (initial implementation):
Performance counter stats for './test_memcmp' (5 runs):
      32356.272845 task-clock               # 0.998 CPUs utilized           ( +- 0.13% )
                40 context-switches         # 0.001 K/sec                   ( +- 0.94% )
                 0 CPU-migrations           # 0.000 K/sec
             4,218 page-faults              # 0.130 K/sec                   ( +- 0.00% )
    76,712,481,765 cycles                   # 2.371 GHz                     ( +- 0.13% ) [83.31%]
    36,279,577,636 stalled-cycles-frontend  # 47.29% frontend cycles idle   ( +- 0.28% ) [83.35%]
     1,684,049,110 stalled-cycles-backend   # 2.20% backend cycles idle     ( +- 2.96% ) [66.67%]
   134,355,715,816 instructions             # 1.75 insns per cycle
                                            # 0.27 stalled cycles per insn  ( +- 0.10% ) [83.35%]
    13,526,169,702 branches                 # 418.039 M/sec                 ( +- 0.10% ) [83.31%]
         1,058,230 branch-misses            # 0.01% of all branches         ( +- 0.91% ) [83.36%]
      32.413866442 seconds time elapsed                                     ( +- 0.13% )
virtual huge zero page (the second implementation):
Performance counter stats for './test_memcmp' (5 runs):
      30327.183829 task-clock               # 0.998 CPUs utilized           ( +- 0.13% )
                38 context-switches         # 0.001 K/sec                   ( +- 1.53% )
                 0 CPU-migrations           # 0.000 K/sec
             4,218 page-faults              # 0.139 K/sec                   ( +- 0.01% )
    71,964,773,660 cycles                   # 2.373 GHz                     ( +- 0.13% ) [83.35%]
    31,191,284,231 stalled-cycles-frontend  # 43.34% frontend cycles idle   ( +- 0.40% ) [83.32%]
       773,484,474 stalled-cycles-backend   # 1.07% backend cycles idle     ( +- 6.61% ) [66.67%]
   134,982,215,437 instructions             # 1.88 insns per cycle
                                            # 0.23 stalled cycles per insn  ( +- 0.11% ) [83.32%]
    13,509,150,683 branches                 # 445.447 M/sec                 ( +- 0.11% ) [83.34%]
         1,017,667 branch-misses            # 0.01% of all branches         ( +- 1.07% ) [83.32%]
      30.381324695 seconds time elapsed                                     ( +- 0.13% )
On Westmere-EX, the virtual huge zero page is ~6.7% faster (30.38 s vs. 32.41 s elapsed).
--
Kirill A. Shutemov