From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, Davidlohr Bueso <dave@stgolabs.net>,
	Davidlohr Bueso <dbueso@suse.de>,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 01/15] perf bench futex: Avoid worker cacheline bouncing
Date: Thu, 27 Oct 2016 18:40:41 -0200
Message-ID: <1477600855-27580-2-git-send-email-acme@kernel.org>
In-Reply-To: <1477600855-27580-1-git-send-email-acme@kernel.org>

From: Davidlohr Bueso <dave@stgolabs.net>

Sebastian noted that the overhead of accounting for worker thread ops
(throughput) was causing 'perf' to appear in the profiles, consuming a
non-trivial (i.e. 13%) amount of CPU.

This is caused by cacheline bouncing from the increment of w->ops.

We can easily fix this by working on a local copy and updating the
actual worker only once it is done running and we are ready to show the
program summary. No other thread accesses a worker's counter while it
runs, so we can trust that no stale value is seen by another thread.

This also gets rid of the unnecessary cache alignment hack; it's not
worth it.
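
For illustration only, the local-copy pattern described above can be
sketched outside of perf roughly as follows. This is a minimal sketch,
not the benchmark code: the 'iterations' field and the run_workers()
helper are hypothetical, and a fixed-length loop stands in for the
benchmark's futex calls and its 'done' flag.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical worker record, loosely modelled on the benchmark's struct worker. */
struct worker {
	pthread_t thread;
	unsigned long iterations;	/* illustrative stand-in for "run until done" */
	unsigned long ops;		/* written once, just before the thread exits */
};

static void *workerfn(void *arg)
{
	struct worker *w = arg;
	unsigned long ops = w->ops;	/* private copy, not shared with other threads */
	unsigned long i;

	for (i = 0; i < w->iterations; i++)
		ops++;			/* no store to the shared struct in the hot loop */

	w->ops = ops;			/* single write-back once the work is done */
	return NULL;
}

/* Hypothetical driver: start nthreads workers, join them, and sum their counts. */
static unsigned long run_workers(unsigned int nthreads, unsigned long iterations)
{
	struct worker *workers = calloc(nthreads, sizeof(*workers));
	unsigned long total = 0;
	unsigned int i, started = 0;

	if (!workers)
		return 0;

	for (i = 0; i < nthreads; i++) {
		workers[i].iterations = iterations;
		if (pthread_create(&workers[i].thread, NULL, workerfn, &workers[i]))
			break;
		started++;
	}

	for (i = 0; i < started; i++) {
		/* The counter is only read after the join, so no stale value is seen. */
		pthread_join(workers[i].thread, NULL);
		total += workers[i].ops;
	}

	free(workers);
	return total;
}
```

Calling run_workers(4, 1000) from a test program should yield 4000 if
all four threads start. Because each worker touches its struct only at
entry and exit, adjacent array elements sharing a cacheline costs
little here, which is in line with dropping the alignment attribute.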

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/1477342613-9938-2-git-send-email-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/futex-hash.c    | 11 +++++------
 tools/perf/bench/futex-lock-pi.c |  4 +++-
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index d9e5e80bb4d0..da04b8c5568a 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -39,15 +39,12 @@ static unsigned int threads_starting;
 static struct stats throughput_stats;
 static pthread_cond_t thread_parent, thread_worker;
 
-#define SMP_CACHE_BYTES 256
-#define __cacheline_aligned __attribute__ ((aligned (SMP_CACHE_BYTES)))
-
 struct worker {
 	int tid;
 	u_int32_t *futex;
 	pthread_t thread;
 	unsigned long ops;
-} __cacheline_aligned;
+};
 
 static const struct option options[] = {
 	OPT_UINTEGER('t', "threads", &nthreads, "Specify amount of threads"),
@@ -66,8 +63,9 @@ static const char * const bench_futex_hash_usage[] = {
 static void *workerfn(void *arg)
 {
 	int ret;
-	unsigned int i;
 	struct worker *w = (struct worker *) arg;
+	unsigned int i;
+	unsigned long ops = w->ops; /* avoid cacheline bouncing */
 
 	pthread_mutex_lock(&thread_lock);
 	threads_starting--;
@@ -77,7 +75,7 @@ static void *workerfn(void *arg)
 	pthread_mutex_unlock(&thread_lock);
 
 	do {
-		for (i = 0; i < nfutexes; i++, w->ops++) {
+		for (i = 0; i < nfutexes; i++, ops++) {
 			/*
 			 * We want the futex calls to fail in order to stress
 			 * the hashing of uaddr and not measure other steps,
@@ -91,6 +89,7 @@ static void *workerfn(void *arg)
 		}
 	}  while (!done);
 
+	w->ops = ops;
 	return NULL;
 }
 
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index 936d89d30483..7032e4643c65 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -75,6 +75,7 @@ static void toggle_done(int sig __maybe_unused,
 static void *workerfn(void *arg)
 {
 	struct worker *w = (struct worker *) arg;
+	unsigned long ops = w->ops;
 
 	pthread_mutex_lock(&thread_lock);
 	threads_starting--;
@@ -103,9 +104,10 @@ static void *workerfn(void *arg)
 		if (ret && !silent)
 			warn("thread %d: Could not unlock pi-lock for %p (%d)",
 			     w->tid, w->futex, ret);
-		w->ops++; /* account for thread's share of work */
+		ops++; /* account for thread's share of work */
 	}  while (!done);
 
+	w->ops = ops;
 	return NULL;
 }
 
-- 
2.7.4
