From: Arnaldo Carvalho de Melo
To: Ingo Molnar
Cc: linux-kernel@vger.kernel.org, Sebastian Andrzej Siewior, Davidlohr Bueso, Peter Zijlstra, Arnaldo Carvalho de Melo
Subject: [PATCH 33/37] perf bench futex: Cache align the worker struct
Date: Mon, 24 Oct 2016 13:20:53 -0300
Message-Id: <1477326057-24080-34-git-send-email-acme@kernel.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1477326057-24080-1-git-send-email-acme@kernel.org>
References: <1477326057-24080-1-git-send-email-acme@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Sebastian Andrzej Siewior

It popped up in perf testing that the worker consumes a noticeable
amount of CPU. It boils down to the increment of `ops`, which causes
cache line bouncing between the individual threads.

This patch aligns the struct to 256 bytes to ensure that no cache line
is shared among CPUs. 128 bytes is the x86 worst case, and grep says
that L1_CACHE_SHIFT is set to 8 on s390 (1 << 8 = 256 bytes).

Signed-off-by: Sebastian Andrzej Siewior
Cc: Davidlohr Bueso
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20161016190803.3392-1-bigeasy@linutronix.de
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/bench/futex-hash.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 8024cd5febd2..d9e5e80bb4d0 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -39,12 +39,15 @@ static unsigned int threads_starting;
 static struct stats throughput_stats;
 static pthread_cond_t thread_parent, thread_worker;
 
+#define SMP_CACHE_BYTES 256
+#define __cacheline_aligned __attribute__ ((aligned (SMP_CACHE_BYTES)))
+
 struct worker {
 	int tid;
 	u_int32_t *futex;
 	pthread_t thread;
 	unsigned long ops;
-};
+} __cacheline_aligned;
 
 static const struct option options[] = {
 	OPT_UINTEGER('t', "threads", &nthreads, "Specify amount of threads"),
-- 
2.7.4