From: David Howells <dhowells@redhat.com>
To: Ingo Molnar <mingo@elte.hu>, torvalds@osdl.org
Cc: dhowells@redhat.com, Oleg Nesterov <oleg@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	serue@us.ibm.com, viro@zeniv.linux.org.uk,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	linux-kernel@vger.kernel.org
Subject: [PATCH] It may not be assumed that wake_up(), finish_wait() and co. imply a memory barrier
Date: Thu, 23 Apr 2009 17:32:24 +0100
Message-ID: <21209.1240504344@redhat.com>
In-Reply-To: <20090422175753.GA14236@elte.hu>

Ingo Molnar <mingo@elte.hu> wrote:

> Why would an unlock be needed before a call to wake_up() variants?

Good point.  I've amended my patch again (see attached).

> In fact i'd encourage to _not_ document try_to_lock() as a write barrier
> either

Did you mean try_to_wake_up()?  Or did you mean things like spin_trylock()?
If the latter, it *has* to be a LOCK-class barrier if it succeeds - otherwise
what's the point?

> - but rather have explicit barriers where they are needed. Then we
> could remove that barrier from try_to_wake_up() too ;-)

I was wondering if wake_up() and friends should in fact imply smp_wmb(), but I
guess that they're often used in conjunction with spinlocks - and in such a
situation a barrier is unnecessary overhead.
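
As a rough userspace analogy of the locked case (not kernel code), a pthread
mutex and condition variable can stand in for the spinlock and wait queue.
All names here (event_ready, event_data, locked_waker() and so on) are
hypothetical.  Because both sides touch the state only under the same lock,
the lock and unlock operations already order the accesses, and an extra
barrier in the wake-up call would buy nothing:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state; every access is made with 'lock' held. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool event_ready;
static int event_data;

/* Waker: publish the data and the condition under the lock, then wake. */
static void *locked_waker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	event_data = 42;
	event_ready = true;
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Sleeper: re-check the condition under the same lock after each wake-up. */
static int locked_sleeper(void)
{
	int seen;

	pthread_mutex_lock(&lock);
	while (!event_ready)
		pthread_cond_wait(&cond, &lock);
	seen = event_data;
	pthread_mutex_unlock(&lock);
	return seen;
}

/* Run one waker thread against the calling thread acting as the sleeper. */
static int run_locked_demo(void)
{
	pthread_t waker;
	int seen;

	pthread_mutex_lock(&lock);
	event_ready = false;
	pthread_mutex_unlock(&lock);

	if (pthread_create(&waker, NULL, locked_waker, NULL) != 0)
		return -1;
	seen = locked_sleeper();
	pthread_join(waker, NULL);
	return seen;
}
```

Calling run_locked_demo() should return the value the waker stored, with no
explicit barrier anywhere in either path.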

David
---
From: David Howells <dhowells@redhat.com>
Subject: [PATCH] It may not be assumed that wake_up(), finish_wait() and co. imply a memory barrier

Add to the memory barriers document to note that try_to_wake_up(), wake_up(),
complete(), finish_wait() and co. should not be assumed to imply any sort of
memory barrier.

This is because:

 (1) A lot of the time, memory barriers in the wake-up and sleeper paths would
     be superfluous due to the use of locks.

 (2) It is possible to pass right through wake_up() and co. without hitting
     any barrier at all, or anything other than a spin_lock/spin_unlock pair
     (if try_to_wake_up() isn't invoked).

 (3) The smp_wmb() should probably move out of try_to_wake_up() and into
     suitable places in its callers.
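
The lockless case this is aimed at can be sketched in userspace C11 (this is
an analogy, not kernel code).  The names payload, woken, lockless_waker() and
lockless_sleeper() are hypothetical.  The release fence stands in for the
smp_wmb() a waker would issue before its wake-up call, and the acquire fence
stands in for the smp_rmb() a sleeper would issue after waking; both C11
fences are somewhat stronger than their kernel counterparts:

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static int payload;       /* plain state the sleeper needs to see */
static atomic_int woken;  /* stands in for the sleeper's wake-up condition */

/* Waker: no lock is held, so the ordering has to be requested explicitly. */
static void *lockless_waker(void *arg)
{
	(void)arg;
	payload = 42;
	/* Stand-in for smp_wmb(): order the payload store before the flag store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&woken, 1, memory_order_relaxed);
	return NULL;
}

/* Sleeper: poll for the condition, then order the flag load before the read. */
static int lockless_sleeper(void)
{
	while (!atomic_load_explicit(&woken, memory_order_relaxed))
		sched_yield();
	/* Stand-in for smp_rmb(): order the flag load before the payload load. */
	atomic_thread_fence(memory_order_acquire);
	return payload;
}

/* Run one waker thread against the calling thread acting as the sleeper. */
static int run_lockless_demo(void)
{
	pthread_t waker;
	int seen;

	payload = 0;
	atomic_store_explicit(&woken, 0, memory_order_relaxed);

	if (pthread_create(&waker, NULL, lockless_waker, NULL) != 0)
		return -1;
	seen = lockless_sleeper();
	pthread_join(waker, NULL);
	return seen;
}
```

Without the two fences, nothing would stop the flag store from becoming
visible before the payload store, or the payload load from being satisfied
before the flag load.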

Signed-off-by: David Howells <dhowells@redhat.com>
---

 Documentation/memory-barriers.txt |   34 +++++++++++++++++++++++++++++++++-
 kernel/sched.c                    |   11 +++++++++++
 2 files changed, 44 insertions(+), 1 deletions(-)


diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index f5b7127..8c32e23 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -42,6 +42,7 @@ Contents:
 
      - Interprocessor interaction.
      - Atomic operations.
+     - Wake up of processes.
      - Accessing devices.
      - Interrupts.
 
@@ -1366,13 +1367,15 @@ WHERE ARE MEMORY BARRIERS NEEDED?
 
 Under normal operation, memory operation reordering is generally not going to
 be a problem as a single-threaded linear piece of code will still appear to
-work correctly, even if it's in an SMP kernel.  There are, however, three
+work correctly, even if it's in an SMP kernel.  There are, however, five
 circumstances in which reordering definitely _could_ be a problem:
 
  (*) Interprocessor interaction.
 
  (*) Atomic operations.
 
+ (*) Wake up of processes.
+
  (*) Accessing devices.
 
  (*) Interrupts.
@@ -1568,6 +1571,35 @@ and in such cases the special barrier primitives will be no-ops.
 See Documentation/atomic_ops.txt for more information.
 
 
+WAKE UP OF PROCESSES
+--------------------
+
+If locking is not used, and if the waker sets some state that the sleeper will
+need to see, a write memory barrier or a full memory barrier may be needed
+before one of the following calls is used to wake up another process:
+
+	complete();
+	try_to_wake_up();
+	wake_up();
+	wake_up_all();
+	wake_up_bit();
+	wake_up_interruptible();
+	wake_up_interruptible_all();
+	wake_up_interruptible_nr();
+	wake_up_interruptible_poll();
+	wake_up_interruptible_sync();
+	wake_up_interruptible_sync_poll();
+	wake_up_locked();
+	wake_up_locked_poll();
+	wake_up_nr();
+	wake_up_poll();
+
+After waking, and assuming it doesn't take a matching lock, the sleeper may
+need to insert a read or full memory barrier before accessing that state,
+as finish_wait() does not imply a barrier either and schedule() implies a
+barrier only on entry.
+
+
 ACCESSING DEVICES
 -----------------
 
diff --git a/kernel/sched.c b/kernel/sched.c
index b902e58..2ef0479 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2337,6 +2337,9 @@ static int sched_balance_self(int cpu, int flag)
  * runnable without the overhead of this.
  *
  * returns failure only if the task is already active.
+ *
+ * It should not be assumed that this function implies any sort of memory
+ * barrier.
  */
 static int try_to_wake_up(struct task_struct *p, unsigned int state, int sync)
 {
@@ -5241,6 +5244,8 @@ void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
  * @mode: which threads
  * @nr_exclusive: how many wake-one or wake-many threads to wake up
  * @key: is directly passed to the wakeup function
+ *
+ * It may not be assumed that this function implies any sort of memory barrier.
  */
 void __wake_up(wait_queue_head_t *q, unsigned int mode,
 			int nr_exclusive, void *key)
@@ -5279,6 +5284,8 @@ void __wake_up_locked_key(wait_queue_head_t *q, unsigned int mode, void *key)
  * with each other. This can prevent needless bouncing between CPUs.
  *
  * On UP it can prevent extra preemption.
+ *
+ * It may not be assumed that this function implies any sort of memory barrier.
  */
 void __wake_up_sync_key(wait_queue_head_t *q, unsigned int mode,
 			int nr_exclusive, void *key)
@@ -5315,6 +5322,8 @@ EXPORT_SYMBOL_GPL(__wake_up_sync);	/* For internal use only */
  * awakened in the same order in which they were queued.
  *
  * See also complete_all(), wait_for_completion() and related routines.
+ *
+ * It may not be assumed that this function implies any sort of memory barrier.
  */
 void complete(struct completion *x)
 {
@@ -5332,6 +5341,8 @@ EXPORT_SYMBOL(complete);
  * @x:  holds the state of this particular completion
  *
  * This will wake up all threads waiting on this particular completion event.
+ *
+ * It may not be assumed that this function implies any sort of memory barrier.
  */
 void complete_all(struct completion *x)
 {

