From: Peter Zijlstra <peterz@infradead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Oleg Nesterov <oleg@redhat.com>,
linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
akpm@linux-foundation.org, josh@joshtriplett.org,
tglx@linutronix.de, Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
laijs@cn.fujitsu.com, dipankar@in.ibm.com,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier (v3a)
Date: Mon, 11 Jan 2010 18:50:40 +0100
Message-ID: <1263232240.4244.70.camel@laptop>
In-Reply-To: <20100111042903.GC32213@Krystal>
On Sun, 2010-01-10 at 23:29 -0500, Mathieu Desnoyers wrote:
> Here is an implementation of a new system call, sys_membarrier(), which
> executes a memory barrier on all threads of the current process.
Please start a new thread for new versions; I didn't find this one
until I started reading in date order instead of thread order.
> Index: linux-2.6-lttng/arch/x86/include/asm/unistd_64.h
> ===================================================================
> --- linux-2.6-lttng.orig/arch/x86/include/asm/unistd_64.h 2010-01-10 19:21:31.000000000 -0500
> +++ linux-2.6-lttng/arch/x86/include/asm/unistd_64.h 2010-01-10 19:21:37.000000000 -0500
> @@ -661,6 +661,8 @@ __SYSCALL(__NR_pwritev, sys_pwritev)
> __SYSCALL(__NR_rt_tgsigqueueinfo, sys_rt_tgsigqueueinfo)
> #define __NR_perf_event_open 298
> __SYSCALL(__NR_perf_event_open, sys_perf_event_open)
> +#define __NR_membarrier 299
> +__SYSCALL(__NR_membarrier, sys_membarrier)
>
> #ifndef __NO_STUBS
> #define __ARCH_WANT_OLD_READDIR
> Index: linux-2.6-lttng/kernel/sched.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/sched.c 2010-01-10 19:21:31.000000000 -0500
> +++ linux-2.6-lttng/kernel/sched.c 2010-01-10 22:22:40.000000000 -0500
> @@ -2861,12 +2861,26 @@ context_switch(struct rq *rq, struct tas
> */
> arch_start_context_switch(prev);
>
> + /*
> + * sys_membarrier IPI-mb scheme requires a memory barrier between
> + * user-space thread execution and update to mm_cpumask.
> + */
> + if (likely(oldmm) && likely(oldmm != mm))
> + smp_mb__before_clear_bit();
> +
> if (unlikely(!mm)) {
> next->active_mm = oldmm;
> atomic_inc(&oldmm->mm_count);
> enter_lazy_tlb(oldmm, next);
> - } else
> + } else {
> switch_mm(oldmm, mm, next);
> + /*
> + * sys_membarrier IPI-mb scheme requires a memory barrier
> + * between update to mm_cpumask and user-space thread execution.
> + */
> + if (likely(oldmm != mm))
> + smp_mb__after_clear_bit();
> + }
>
> if (unlikely(!prev->mm)) {
> prev->active_mm = NULL;
> @@ -10822,6 +10836,49 @@ struct cgroup_subsys cpuacct_subsys = {
> };
> #endif /* CONFIG_CGROUP_CPUACCT */
>
> +/*
> + * Execute a memory barrier on all active threads from the current process
> + * on SMP systems. Do not rely on implicit barriers in
> + * smp_call_function_many(), just in case they are ever relaxed in the future.
> + */
> +static void membarrier_ipi(void *unused)
> +{
> + smp_mb();
> +}
> +
> +/*
> + * sys_membarrier - issue memory barrier on current process running threads
> + *
> + * Execute a memory barrier on all running threads of the current process.
> + * Upon completion, the caller thread is ensured that all process threads
> + * have passed through a state where memory accesses match program order.
> + * (non-running threads are de facto in such a state)
> + */
> +SYSCALL_DEFINE0(membarrier)
> +{
> +#ifdef CONFIG_SMP
> + if (unlikely(thread_group_empty(current)))
> + return 0;
> + /*
> + * Memory barrier on the caller thread _before_ sending first
> + * IPI. Matches memory barriers around mm_cpumask modification in
> + * context_switch().
> + */
> + smp_mb();
> + preempt_disable();
> + smp_call_function_many(mm_cpumask(current->mm), membarrier_ipi,
> + NULL, 1);
> + preempt_enable();
> + /*
> + * Memory barrier on the caller thread _after_ we finished
> + * waiting for the last IPI. Matches memory barriers around mm_cpumask
> + * modification in context_switch().
> + */
> + smp_mb();
> +#endif /* #ifdef CONFIG_SMP */
> + return 0;
> +}
> +
> #ifndef CONFIG_SMP
>
> int rcu_expedited_torture_stats(char *page)
Right, so here you rely on the arch switch_mm() implementation to keep
mm_cpumask() current, but then stick a memory barrier in the generic
code... seems odd.
x86 switch_mm() does indeed keep it current, but writing cr3 is itself a
serializing instruction.
Furthermore, do we really need that smp_mb() in the membarrier_ipi()
function? Shouldn't we investigate if either:
- receiving an IPI implies an mb, or
- enter/leave kernelspace implies an mb
?
So while I much prefer the simplified version (the previous one was
heavily over-engineered), I
1) don't like that memory barrier in the generic code,
2) don't think that architectures necessarily keep that mask that tight.
[ even if for x86 smp_mb__{before,after}_clear_bit are a nop, tying that
to switch_mm() semantics just reeks ]
See for example the sparc64 implementation of switch_mm(), which only ever
sets CPUs in mm_cpumask(); only TLB flushes clear them. Also, I wouldn't
know if switch_mm() implies an mb on sparc64.