public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Bruno Prémont" <bonbons@linux-vserver.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Peter Hurley" <peter@hurleysoftware.com>,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ilya Dryomov" <ilya.dryomov@inktank.com>,
	"Mike Galbraith" <umgwanakikbuti@gmail.com>,
	"Oleg Nesterov" <oleg@redhat.com>
Subject: Re: Linux 3.19-rc5
Date: Fri, 6 Feb 2015 17:39:47 +0100	[thread overview]
Message-ID: <20150206163947.GR21418@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <CA+55aFw21PK8gBMNk_7kp1MmJoEsRyvPx+qaXSgwLFX2UN793g@mail.gmail.com>

On Fri, Feb 06, 2015 at 08:02:41AM -0800, Linus Torvalds wrote:
> On Fri, Feb 6, 2015 at 3:50 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Also, set_current_state(TASK_RUNNING) is almost always pointless, nobody
> > cares about that barrier, so make it go away.
> 
> I'd rather not mix this with the patch, and wonder if we should just
> do that globally with some preprocessor magic. We do have a fair
> number of "set_current_state(TASK_RUNNING)" 

138

> and at least for the
> *documented* reason for the memory barrier, all of them could/should
> be barrier-less.
> 
> So something like
> 
>     if (__is_constant_p(state) && state == TASK_RUNNING)
>         tsk->state = state;
>     else
>         set_mb(tsk->state, state);
> 
> might be more general solution than randomly doing one at a time when
> changing code around it..

Yeah, or we could do the coccinelle thing and do a mass conversion.

I like the macro one though; I worry a wee bit about non-documented
cases through. If someone is doing something way subtle we'll break it
:/


---
Subject: sched: Avoid the full memory-barrier for set_current_state(TASK_RUNNING)

One should never need the full memory barrier implied by
set_current_state() to set TASK_RUNNING for the documented reason of
avoiding races against wakeup.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8db31ef98d2f..aea44c4eeed8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -243,6 +243,27 @@ extern char ___assert_task_state[1 - 2*!!(
 				((task->state & TASK_UNINTERRUPTIBLE) != 0 && \
 				 (task->flags & PF_FROZEN) == 0)
 
+/*
+ * set_current_state() includes a barrier so that the write of current->state
+ * is correctly serialised wrt the caller's subsequent test of whether to
+ * actually sleep:
+ *
+ *	set_current_state(TASK_UNINTERRUPTIBLE);
+ *	if (do_i_need_to_sleep())
+ *		schedule();
+ *
+ * If the caller does not need such serialisation then use
+ * __set_current_state(). This is always true for TASK_RUNNING since
+ * there is no race against wakeup -- both write the same value.
+ */
+#define ___set_current_state(state)				\
+do {								\
+	if (__is_constant_p(state) && (state) == TASK_RUNNING)	\
+		current->state = (state);			\
+	else							\
+		set_mb(current->state, (state));		\
+} while (0)
+
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 
 #define __set_task_state(tsk, state_value)			\
@@ -256,17 +277,6 @@ extern char ___assert_task_state[1 - 2*!!(
 		set_mb((tsk)->state, (state_value));		\
 	} while (0)
 
-/*
- * set_current_state() includes a barrier so that the write of current->state
- * is correctly serialised wrt the caller's subsequent test of whether to
- * actually sleep:
- *
- *	set_current_state(TASK_UNINTERRUPTIBLE);
- *	if (do_i_need_to_sleep())
- *		schedule();
- *
- * If the caller does not need such serialisation then use __set_current_state()
- */
 #define __set_current_state(state_value)			\
 	do {							\
 		current->task_state_change = _THIS_IP_;		\
@@ -275,7 +285,7 @@ extern char ___assert_task_state[1 - 2*!!(
 #define set_current_state(state_value)				\
 	do {							\
 		current->task_state_change = _THIS_IP_;		\
-		set_mb(current->state, (state_value));		\
+		___set_current_state(state_value);		\
 	} while (0)
 
 #else
@@ -285,21 +295,10 @@ extern char ___assert_task_state[1 - 2*!!(
 #define set_task_state(tsk, state_value)		\
 	set_mb((tsk)->state, (state_value))
 
-/*
- * set_current_state() includes a barrier so that the write of current->state
- * is correctly serialised wrt the caller's subsequent test of whether to
- * actually sleep:
- *
- *	set_current_state(TASK_UNINTERRUPTIBLE);
- *	if (do_i_need_to_sleep())
- *		schedule();
- *
- * If the caller does not need such serialisation then use __set_current_state()
- */
 #define __set_current_state(state_value)		\
 	do { current->state = (state_value); } while (0)
 #define set_current_state(state_value)			\
-	set_mb(current->state, (state_value))
+	___set_current_state(state_value);
 
 #endif
 



  reply	other threads:[~2015-02-06 16:40 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-18  9:17 Linux 3.19-rc5 Linus Torvalds
2015-01-19 18:02 ` Bruno Prémont
2015-01-20  6:24   ` Linus Torvalds
2015-01-21 20:37     ` Bruno Prémont
2015-01-21 21:37       ` Bruno Prémont
2015-01-21 22:12         ` Davidlohr Bueso
2015-01-21 22:54           ` Peter Hurley
2015-01-30  1:22             ` Rafael J. Wysocki
2015-01-30  1:12               ` Linus Torvalds
2015-01-30  1:25                 ` Linus Torvalds
2015-01-30  1:35                   ` Linus Torvalds
2015-01-30  2:06                   ` Rafael J. Wysocki
2015-01-30 15:47                   ` Oleg Nesterov
2015-01-31 18:32                     ` Linus Torvalds
2015-01-31 20:16                       ` Peter Zijlstra
2015-01-31 21:54                         ` Linus Torvalds
2015-02-02 13:19                           ` Peter Zijlstra
2015-02-01 19:43                         ` Oleg Nesterov
2015-02-01 20:09                           ` Linus Torvalds
2015-02-01 20:19                             ` Oleg Nesterov
2015-02-02 15:34                             ` Peter Zijlstra
2015-01-31  9:16                   ` Bruno Prémont
2015-01-31  9:48                   ` Peter Zijlstra
2015-02-03 11:18                     ` Ingo Molnar
2015-02-05 21:14                   ` Bruno Prémont
2015-02-06 11:50                     ` Peter Zijlstra
2015-02-06 16:02                       ` Linus Torvalds
2015-02-06 16:39                         ` Peter Zijlstra [this message]
2015-02-06 16:46                           ` Linus Torvalds
2015-01-30  1:49                 ` Rafael J. Wysocki
2015-02-02  9:48                   ` Zdenek Kabelac

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150206163947.GR21418@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=bonbons@linux-vserver.org \
    --cc=dave@stgolabs.net \
    --cc=ilya.dryomov@inktank.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peter@hurleysoftware.com \
    --cc=rjw@rjwysocki.net \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=umgwanakikbuti@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox