From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, Davidlohr Bueso <dave@stgolabs.net>,
	Davidlohr Bueso <dbueso@suse.de>, Mel Gorman <mgorman@suse.de>,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 13/13] perf bench futex: Add lock_pi stresser
Date: Mon, 20 Jul 2015 17:58:44 -0300
Message-ID: <1437425924-31064-14-git-send-email-acme@kernel.org>
In-Reply-To: <1437425924-31064-1-git-send-email-acme@kernel.org>

From: Davidlohr Bueso <dave@stgolabs.net>

Adds a way to measure the low-level kernel implementation of FUTEX_LOCK_PI and
FUTEX_UNLOCK_PI.

The program comes in two flavors:

(i) single futex (default), all threads contend on the same uaddr.  For the
sake of the benchmark, we call into kernel space even when the lock is
uncontended.  The kernel will set it to the owner's TID; any waiters that come
in and contend for the pi futex will be handled accordingly by the kernel.

(ii) -M option for multiple futexes, where each thread deals with its own
futex. This is a trivial scenario and only measures kernel handling of the
0->TID transition.
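
As a rough illustration of the uncontended 0->TID transition described above,
the following minimal sketch locks and unlocks a private PI futex from a single
thread using the raw futex syscall. The demo_* names are hypothetical and are
not part of this patch; they only approximate what the futex_lock_pi() and
futex_unlock_pi() wrappers added to futex.h do, under the assumption of a Linux
system with PI futex support.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: always enter the kernel to take the PI futex. */
static long demo_lock_pi(uint32_t *uaddr)
{
	return syscall(SYS_futex, uaddr, FUTEX_LOCK_PI | FUTEX_PRIVATE_FLAG,
		       0, NULL, NULL, 0);
}

/* Hypothetical helper: release the PI futex through the kernel. */
static long demo_unlock_pi(uint32_t *uaddr)
{
	return syscall(SYS_futex, uaddr, FUTEX_UNLOCK_PI | FUTEX_PRIVATE_FLAG,
		       0, NULL, NULL, 0);
}

/*
 * One uncontended lock/unlock cycle on a thread-local futex word.
 * Returns 0 if the word went 0 -> TID -> 0, a negative step code otherwise.
 */
static int demo_uncontended_cycle(void)
{
	uint32_t futex_word = 0;
	pid_t tid = (pid_t)syscall(SYS_gettid);

	assert(tid > 0);

	if (demo_lock_pi(&futex_word))
		return -1;	/* lock failed, errno says why */
	if ((futex_word & FUTEX_TID_MASK) != (uint32_t)tid)
		return -2;	/* kernel did not store our TID */
	if (demo_unlock_pi(&futex_word))
		return -3;	/* unlock failed */
	return futex_word == 0 ? 0 : -4;
}
```

Calling demo_uncontended_cycle() in a loop from each thread, with its own futex
word, would roughly correspond to the -M flavor; the default flavor differs in
that every thread passes the same uaddr, so the kernel also has to queue and
wake waiters.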

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Mel Gorman <mgorman@suse.de>
Link: http://lkml.kernel.org/r/1436259353.12255.78.camel@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-bench.txt |   4 +
 tools/perf/bench/Build                  |   1 +
 tools/perf/bench/bench.h                |   2 +
 tools/perf/bench/futex-lock-pi.c        | 219 ++++++++++++++++++++++++++++++++
 tools/perf/bench/futex.h                |  20 +++
 tools/perf/builtin-bench.c              |   2 +
 6 files changed, 248 insertions(+)
 create mode 100644 tools/perf/bench/futex-lock-pi.c

diff --git a/tools/perf/Documentation/perf-bench.txt b/tools/perf/Documentation/perf-bench.txt
index bf3d0644bf10..ab632d9fbd7d 100644
--- a/tools/perf/Documentation/perf-bench.txt
+++ b/tools/perf/Documentation/perf-bench.txt
@@ -216,6 +216,10 @@ Suite for evaluating parallel wake calls.
 *requeue*::
 Suite for evaluating requeue calls.
 
+*lock-pi*::
+Suite for evaluating futex lock_pi calls.
+
+
 SEE ALSO
 --------
 linkperf:perf[1]
diff --git a/tools/perf/bench/Build b/tools/perf/bench/Build
index c3ab760e06b4..573e28896038 100644
--- a/tools/perf/bench/Build
+++ b/tools/perf/bench/Build
@@ -5,6 +5,7 @@ perf-y += futex-hash.o
 perf-y += futex-wake.o
 perf-y += futex-wake-parallel.o
 perf-y += futex-requeue.o
+perf-y += futex-lock-pi.o
 
 perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-asm.o
 perf-$(CONFIG_X86_64) += mem-memset-x86-64-asm.o
diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 70b2f718cc21..a50df86f2b9b 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -36,6 +36,8 @@ extern int bench_futex_wake(int argc, const char **argv, const char *prefix);
 extern int bench_futex_wake_parallel(int argc, const char **argv,
 				     const char *prefix);
 extern int bench_futex_requeue(int argc, const char **argv, const char *prefix);
+/* pi futexes */
+extern int bench_futex_lock_pi(int argc, const char **argv, const char *prefix);
 
 #define BENCH_FORMAT_DEFAULT_STR	"default"
 #define BENCH_FORMAT_DEFAULT		0
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
new file mode 100644
index 000000000000..bc6a16adbca8
--- /dev/null
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -0,0 +1,219 @@
+/*
+ * Copyright (C) 2015 Davidlohr Bueso.
+ */
+
+#include "../perf.h"
+#include "../util/util.h"
+#include "../util/stat.h"
+#include "../util/parse-options.h"
+#include "../util/header.h"
+#include "bench.h"
+#include "futex.h"
+
+#include <err.h>
+#include <stdlib.h>
+#include <sys/time.h>
+#include <pthread.h>
+
+struct worker {
+	int tid;
+	u_int32_t *futex;
+	pthread_t thread;
+	unsigned long ops;
+};
+
+static u_int32_t global_futex = 0;
+static struct worker *worker;
+static unsigned int nsecs = 10;
+static bool silent = false, multi = false;
+static bool done = false, fshared = false;
+static unsigned int ncpus, nthreads = 0;
+static int futex_flag = 0;
+struct timeval start, end, runtime;
+static pthread_mutex_t thread_lock;
+static unsigned int threads_starting;
+static struct stats throughput_stats;
+static pthread_cond_t thread_parent, thread_worker;
+
+static const struct option options[] = {
+	OPT_UINTEGER('t', "threads",  &nthreads, "Specify amount of threads"),
+	OPT_UINTEGER('r', "runtime", &nsecs,     "Specify runtime (in seconds)"),
+	OPT_BOOLEAN( 'M', "multi",   &multi,     "Use multiple futexes"),
+	OPT_BOOLEAN( 's', "silent",  &silent,    "Silent mode: do not display data/details"),
+	OPT_BOOLEAN( 'S', "shared",  &fshared,   "Use shared futexes instead of private ones"),
+	OPT_END()
+};
+
+static const char * const bench_futex_lock_pi_usage[] = {
+	"perf bench futex lock-pi <options>",
+	NULL
+};
+
+static void print_summary(void)
+{
+	unsigned long avg = avg_stats(&throughput_stats);
+	double stddev = stddev_stats(&throughput_stats);
+
+	printf("%sAveraged %ld operations/sec (+- %.2f%%), total secs = %d\n",
+	       !silent ? "\n" : "", avg, rel_stddev_stats(stddev, avg),
+	       (int) runtime.tv_sec);
+}
+
+static void toggle_done(int sig __maybe_unused,
+			siginfo_t *info __maybe_unused,
+			void *uc __maybe_unused)
+{
+	/* inform all threads that we're done for the day */
+	done = true;
+	gettimeofday(&end, NULL);
+	timersub(&end, &start, &runtime);
+}
+
+static void *workerfn(void *arg)
+{
+	struct worker *w = (struct worker *) arg;
+
+	pthread_mutex_lock(&thread_lock);
+	threads_starting--;
+	if (!threads_starting)
+		pthread_cond_signal(&thread_parent);
+	pthread_cond_wait(&thread_worker, &thread_lock);
+	pthread_mutex_unlock(&thread_lock);
+
+	do {
+		int ret;
+	again:
+		ret = futex_lock_pi(w->futex, NULL, 0, futex_flag);
+
+		if (ret) { /* handle lock acquisition */
+			if (!silent)
+				warn("thread %d: Could not lock pi-lock for %p (%d)",
+				     w->tid, w->futex, ret);
+			if (done)
+				break;
+
+			goto again;
+		}
+
+		usleep(1);
+		ret = futex_unlock_pi(w->futex, futex_flag);
+		if (ret && !silent)
+			warn("thread %d: Could not unlock pi-lock for %p (%d)",
+			     w->tid, w->futex, ret);
+		w->ops++; /* account for thread's share of work */
+	}  while (!done);
+
+	return NULL;
+}
+
+static void create_threads(struct worker *w, pthread_attr_t thread_attr)
+{
+	cpu_set_t cpu;
+	unsigned int i;
+
+	threads_starting = nthreads;
+
+	for (i = 0; i < nthreads; i++) {
+		worker[i].tid = i;
+
+		if (multi) {
+			worker[i].futex = calloc(1, sizeof(u_int32_t));
+			if (!worker[i].futex)
+				err(EXIT_FAILURE, "calloc");
+		} else
+			worker[i].futex = &global_futex;
+
+		CPU_ZERO(&cpu);
+		CPU_SET(i % ncpus, &cpu);
+
+		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu))
+			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
+
+		if (pthread_create(&w[i].thread, &thread_attr, workerfn, &worker[i]))
+			err(EXIT_FAILURE, "pthread_create");
+	}
+}
+
+int bench_futex_lock_pi(int argc, const char **argv,
+			const char *prefix __maybe_unused)
+{
+	int ret = 0;
+	unsigned int i;
+	struct sigaction act;
+	pthread_attr_t thread_attr;
+
+	argc = parse_options(argc, argv, options, bench_futex_lock_pi_usage, 0);
+	if (argc)
+		goto err;
+
+	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+
+	sigfillset(&act.sa_mask);
+	act.sa_sigaction = toggle_done;
+	sigaction(SIGINT, &act, NULL);
+
+	if (!nthreads)
+		nthreads = ncpus;
+
+	worker = calloc(nthreads, sizeof(*worker));
+	if (!worker)
+		err(EXIT_FAILURE, "calloc");
+
+	if (!fshared)
+		futex_flag = FUTEX_PRIVATE_FLAG;
+
+	printf("Run summary [PID %d]: %d threads doing pi lock/unlock pairing for %d secs.\n\n",
+	       getpid(), nthreads, nsecs);
+
+	init_stats(&throughput_stats);
+	pthread_mutex_init(&thread_lock, NULL);
+	pthread_cond_init(&thread_parent, NULL);
+	pthread_cond_init(&thread_worker, NULL);
+
+	threads_starting = nthreads;
+	pthread_attr_init(&thread_attr);
+	gettimeofday(&start, NULL);
+
+	create_threads(worker, thread_attr);
+	pthread_attr_destroy(&thread_attr);
+
+	pthread_mutex_lock(&thread_lock);
+	while (threads_starting)
+		pthread_cond_wait(&thread_parent, &thread_lock);
+	pthread_cond_broadcast(&thread_worker);
+	pthread_mutex_unlock(&thread_lock);
+
+	sleep(nsecs);
+	toggle_done(0, NULL, NULL);
+
+	for (i = 0; i < nthreads; i++) {
+		ret = pthread_join(worker[i].thread, NULL);
+		if (ret)
+			err(EXIT_FAILURE, "pthread_join");
+	}
+
+	/* cleanup & report results */
+	pthread_cond_destroy(&thread_parent);
+	pthread_cond_destroy(&thread_worker);
+	pthread_mutex_destroy(&thread_lock);
+
+	for (i = 0; i < nthreads; i++) {
+		unsigned long t = worker[i].ops/runtime.tv_sec;
+
+		update_stats(&throughput_stats, t);
+		if (!silent)
+			printf("[thread %3d] futex: %p [ %ld ops/sec ]\n",
+			       worker[i].tid, worker[i].futex, t);
+
+		if (multi)
+			free(worker[i].futex);
+	}
+
+	print_summary();
+
+	free(worker);
+	return ret;
+err:
+	usage_with_options(bench_futex_lock_pi_usage, options);
+	exit(EXIT_FAILURE);
+}
diff --git a/tools/perf/bench/futex.h b/tools/perf/bench/futex.h
index 7ed22ff1e1ac..d44de9f44281 100644
--- a/tools/perf/bench/futex.h
+++ b/tools/perf/bench/futex.h
@@ -56,6 +56,26 @@ futex_wake(u_int32_t *uaddr, int nr_wake, int opflags)
 }
 
 /**
+ * futex_lock_pi() - block on uaddr as a PI mutex
+ * @detect:	whether (1) or not (0) to perform deadlock detection
+ */
+static inline int
+futex_lock_pi(u_int32_t *uaddr, struct timespec *timeout, int detect,
+	      int opflags)
+{
+	return futex(uaddr, FUTEX_LOCK_PI, detect, timeout, NULL, 0, opflags);
+}
+
+/**
+ * futex_unlock_pi() - release uaddr as a PI mutex, waking the top waiter
+ */
+static inline int
+futex_unlock_pi(u_int32_t *uaddr, int opflags)
+{
+	return futex(uaddr, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0, opflags);
+}
+
+/**
 * futex_cmp_requeue() - requeue tasks from uaddr to uaddr2
 * @nr_wake:        wake up to this many tasks
 * @nr_requeue:        requeue up to this many tasks
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index b5314e452ec7..f67934d46d40 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -60,6 +60,8 @@ static struct bench futex_benchmarks[] = {
 	{ "wake",	"Benchmark for futex wake calls",               bench_futex_wake	},
 	{ "wake-parallel", "Benchmark for parallel futex wake calls",   bench_futex_wake_parallel },
 	{ "requeue",	"Benchmark for futex requeue calls",            bench_futex_requeue	},
+	/* pi-futexes */
+	{ "lock-pi",	"Benchmark for futex lock_pi calls",            bench_futex_lock_pi	},
 	{ "all",	"Test all futex benchmarks",			NULL			},
 	{ NULL,		NULL,						NULL			}
 };
-- 
2.1.0


