public inbox for linux-kernel@vger.kernel.org
From: David Howells <dhowells@redhat.com>
To: Ingo Molnar <mingo@elte.hu>, torvalds@osdl.org
Cc: dhowells@redhat.com, Oleg Nesterov <oleg@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	serue@us.ibm.com, viro@zeniv.linux.org.uk,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	linux-kernel@vger.kernel.org
Subject: [PATCH] It may not be assumed that wake_up(), finish_wait() and co. imply a memory barrier
Date: Thu, 23 Apr 2009 17:32:24 +0100
Message-ID: <21209.1240504344@redhat.com>
In-Reply-To: <20090422175753.GA14236@elte.hu>

Ingo Molnar <mingo@elte.hu> wrote:

> Why would an unlock be needed before a call to wake_up() variants?

Good point.  I've amended my patch again (see attached).

> In fact i'd encourage to _not_ document try_to_lock() as a write barrier
> either

Did you mean try_to_wake_up()?  Or did you mean things like spin_trylock()?
If the latter, it *has* to be a LOCK-class barrier if it succeeds - otherwise
what's the point?

> - but rather have explicit barriers where they are needed. Then we
> could remove that barrier from try_to_wake_up() too ;-)

I was wondering if wake_up() and friends should in fact imply smp_wmb(), but I
guess that they're often used in conjunction with spinlocks - and in such a
situation a barrier is unnecessary overhead.
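
To illustrate the lockless case the patch documents, here is a rough userspace
sketch, not kernel code.  It assumes C11 fences as stand-ins for smp_wmb() and
smp_rmb(), and a spin on an atomic flag as a stand-in for the wait queue; the
names payload, event_ready and run_wakeup_demo() are made up for the example.

```c
/* Userspace analogue only: C11 fences stand in for smp_wmb()/smp_rmb(),
 * and spinning on an atomic flag stands in for sleeping on a wait queue. */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;             /* state the waker publishes (hypothetical) */
static atomic_int event_ready;  /* condition the sleeper tests (hypothetical) */

static void *waker(void *unused)
{
	(void)unused;
	payload = 42;					/* set the state */
	atomic_thread_fence(memory_order_release);	/* ~ smp_wmb() before the wake-up */
	atomic_store_explicit(&event_ready, 1,
			      memory_order_relaxed);	/* ~ set condition + wake_up() */
	return NULL;
}

static void *sleeper(void *result)
{
	while (!atomic_load_explicit(&event_ready, memory_order_relaxed))
		;					/* ~ wait for the event */
	atomic_thread_fence(memory_order_acquire);	/* ~ smp_rmb() after waking */
	*(int *)result = payload;			/* now safe to read the state */
	return NULL;
}

/* Runs one waker/sleeper pair; returns what the sleeper saw, or -1 on error. */
int run_wakeup_demo(void)
{
	pthread_t w, s;
	int seen = -1;

	payload = 0;
	atomic_store(&event_ready, 0);
	if (pthread_create(&s, NULL, sleeper, &seen))
		return -1;
	if (pthread_create(&w, NULL, waker, NULL)) {
		atomic_store(&event_ready, 1);		/* release the spinning sleeper */
		pthread_join(s, NULL);
		return -1;
	}
	pthread_join(w, NULL);
	pthread_join(s, NULL);
	return seen;
}
```

If either fence is dropped, nothing orders the payload accesses against the
flag accesses, which is the situation the new section warns about when no lock
is shared between waker and sleeper.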

David
---
From: David Howells <dhowells@redhat.com>
Subject: [PATCH] It may not be assumed that wake_up(), finish_wait() and co. imply a memory barrier

Add to the memory barriers document to note that try_to_wake_up(), wake_up(),
complete(), finish_wait() and co. should not be assumed to imply any sort of
memory barrier.

This is because:

 (1) A lot of the time, memory barriers in the wake-up and sleeper paths would
     be superfluous due to the use of locks.

 (2) It is possible to pass right through wake_up() and co. executing no
     barrier at all, or nothing stronger than a spin_lock/spin_unlock pair (if
     try_to_wake_up() isn't invoked).

 (3) The smp_wmb() should probably move out of try_to_wake_up() and into
     suitable places in its callers.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 Documentation/memory-barriers.txt |   34 +++++++++++++++++++++++++++++++++-
 kernel/sched.c                    |   11 +++++++++++
 2 files changed, 44 insertions(+), 1 deletions(-)


diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index f5b7127..8c32e23 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -42,6 +42,7 @@ Contents:
 
      - Interprocessor interaction.
      - Atomic operations.
+     - Wake up of processes.
      - Accessing devices.
      - Interrupts.
 
@@ -1366,13 +1367,15 @@ WHERE ARE MEMORY BARRIERS NEEDED?
 
 Under normal operation, memory operation reordering is generally not going to
 be a problem as a single-threaded linear piece of code will still appear to
-work correctly, even if it's in an SMP kernel.  There are, however, three
+work correctly, even if it's in an SMP kernel.  There are, however, five
 circumstances in which reordering definitely _could_ be a problem:
 
  (*) Interprocessor interaction.
 
  (*) Atomic operations.
 
+ (*) Wake up of processes.
+
  (*) Accessing devices.
 
  (*) Interrupts.
@@ -1568,6 +1571,35 @@ and in such cases the special barrier primitives will be no-ops.
 See Documentation/atomic_ops.txt for more information.
 
 
+WAKE UP OF PROCESSES
+--------------------
+
+If locking is not used, and if the waker sets some state that the sleeper will
+need to see, a write memory barrier or a full memory barrier may be needed
+before one of the following calls is used to wake up another process:
+
+	complete();
+	try_to_wake_up();
+	wake_up();
+	wake_up_all();
+	wake_up_bit();
+	wake_up_interruptible();
+	wake_up_interruptible_all();
+	wake_up_interruptible_nr();
+	wake_up_interruptible_poll();
+	wake_up_interruptible_sync();
+	wake_up_interruptible_sync_poll();
+	wake_up_locked();
+	wake_up_locked_poll();
+	wake_up_nr();
+	wake_up_poll();
+
+After waking, and assuming it doesn't take a matching lock, the sleeper may
+need to insert a read or full memory barrier before accessing that state,
+as finish_wait() does not imply a barrier either, and schedule() only implies
+a barrier on entry.
+
+
 ACCESSING DEVICES
 -----------------
 
diff --git a/kernel/sched.c b/kernel/sched.c
index b902e58..2ef0479 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2337,6 +2337,9 @@ static int sched_balance_self(int cpu, int flag)
  * runnable without the overhead of this.
  *
  * returns failure only if the task is already active.
+ *
+ * It should not be assumed that this function implies any sort of memory
+ * barrier.
  */
 static int try_to_wake_up(struct task_struct *p, unsigned int state, int sync)
 {
@@ -5241,6 +5244,8 @@ void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
  * @mode: which threads
  * @nr_exclusive: how many wake-one or wake-many threads to wake up
  * @key: is directly passed to the wakeup function
+ *
+ * It may not be assumed that this function implies any sort of memory barrier.
  */
 void __wake_up(wait_queue_head_t *q, unsigned int mode,
 			int nr_exclusive, void *key)
@@ -5279,6 +5284,8 @@ void __wake_up_locked_key(wait_queue_head_t *q, unsigned int mode, void *key)
  * with each other. This can prevent needless bouncing between CPUs.
  *
  * On UP it can prevent extra preemption.
+ *
+ * It may not be assumed that this function implies any sort of memory barrier.
  */
 void __wake_up_sync_key(wait_queue_head_t *q, unsigned int mode,
 			int nr_exclusive, void *key)
@@ -5315,6 +5322,8 @@ EXPORT_SYMBOL_GPL(__wake_up_sync);	/* For internal use only */
  * awakened in the same order in which they were queued.
  *
  * See also complete_all(), wait_for_completion() and related routines.
+ *
+ * It may not be assumed that this function implies any sort of memory barrier.
  */
 void complete(struct completion *x)
 {
@@ -5332,6 +5341,8 @@ EXPORT_SYMBOL(complete);
  * @x:  holds the state of this particular completion
  *
  * This will wake up all threads waiting on this particular completion event.
+ *
+ * It may not be assumed that this function implies any sort of memory barrier.
  */
 void complete_all(struct completion *x)
 {
