From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Wang, Yalin" <Yalin.Wang@sonymobile.com>,
"'arnd@arndb.de'" <arnd@arndb.de>,
"'linux-arch@vger.kernel.org'" <linux-arch@vger.kernel.org>,
"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
"'linux@arm.linux.org.uk'" <linux@arm.linux.org.uk>,
"'linux-arm-kernel@lists.infradead.org'"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [RFC] change non-atomic bitops method
Date: Tue, 3 Feb 2015 12:39:32 +0200
Message-ID: <20150203103932.GA14259@node.dhcp.inet.fi>
In-Reply-To: <20150203011730.GA15653@node.dhcp.inet.fi>
On Tue, Feb 03, 2015 at 03:17:30AM +0200, Kirill A. Shutemov wrote:
> Results for 10 runs on my laptop -- i5-3427U (Ivy Bridge, 1.8 GHz, 2.8 GHz Turbo,
> with 3MB LLC):
I screwed up the inner loop condition and step. As a result, the benchmark
touched the same cache line 8 times and scanned only SIZE/8 of memory. The
fixed test is attached.
                                 Avg      Stddev
baseline                         14.0663  0.0182
-DCHECK_BEFORE_SET               13.8594  0.0458
-DCACHE_HOT                      12.3896  0.0867
-DCACHE_HOT -DCHECK_BEFORE_SET   11.7480  0.2497
And now it's faster *with* the check. Sometimes the CPU is just too clever. ;)
--
Kirill A. Shutemov
[-- Attachment #2: test.c --]
#define _GNU_SOURCE		/* for MAP_ANONYMOUS and MAP_POPULATE */
#include <stdio.h>
#include <time.h>
#include <sys/mman.h>

#ifdef CACHE_HOT
/* 2 MiB buffer, smaller than the 3MB LLC mentioned above */
#define SIZE	(2UL << 20)
#define TIMES	100000
#else
/* 1 GiB buffer, far larger than the LLC */
#define SIZE	(1UL << 30)
#define TIMES	100
#endif

#define CACHE_LINE	64

int main(int argc, char **argv)
{
	struct timespec a, b, diff;
	unsigned long i, *p, times = TIMES;

	p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
		 MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &a);
	while (times--) {
		/* Touch one word in each cache line of the buffer. */
		for (i = 0; i < SIZE / sizeof(*p);
		     i += CACHE_LINE / sizeof(*p)) {
#ifdef CHECK_BEFORE_SET
			if (p[i] != times)
#endif
				p[i] = times;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &b);

	/* Elapsed time, borrowing a second if the nanoseconds underflow. */
	diff.tv_sec = b.tv_sec - a.tv_sec;
	if (a.tv_nsec > b.tv_nsec) {
		diff.tv_sec--;
		diff.tv_nsec = 1000000000 + b.tv_nsec - a.tv_nsec;
	} else
		diff.tv_nsec = b.tv_nsec - a.tv_nsec;

	printf("%lu.%09lu\n", (unsigned long)diff.tv_sec,
	       (unsigned long)diff.tv_nsec);
	return 0;
}