All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	nicolas.pitre@linaro.org, daniel.lezcano@linaro.org,
	Mike Galbraith <umgwanakikbuti@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC][PATCH 0/8] sched,idle: need resched polling rework
Date: Thu, 22 May 2014 14:58:18 +0200	[thread overview]
Message-ID: <20140522125818.GV30445@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <CALCETrWJZ1-T+qOY9v=KBOSJkck7YYe7khw_dYPc9Ar8WFgMKw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4858 bytes --]

On Fri, Apr 11, 2014 at 08:00:23AM -0700, Andy Lutomirski wrote:
> 
> That being said, I think that this addresses once one of the two major
> issues.  While the race you're fixing is more interesting, I think its
> impact is dwarfed by the fact that ttwu_queue_remote completely
> ignores polling.  (NB: I haven't actually tested this patch set, but I
> did try to instrument this stuff awhile ago.)
> 
> To fix this, presumably the wake-from-idle path needs a
> sched_ttwu_pending call, and ttwu_queue_remote could use resched_task.
>  sched_ttwu_pending could benefit from a straightforward optimization:
> it doesn't need rq->lock if llist is empty.
> 
> If you're not planning on trying to fix that, I can try to write up a
> patch in the next day or two.
> 
> Even with all of this fixed, what happens when ttwu_queue_remote is
> called with a task that has lower priority than whatever is currently
> running on the targeted cpu?  I think the result is an IPI that serves
> very little purpose other than avoiding taking a spinlock in the
> waking thread.  This may be a bad tradeoff.  I doubt that this matters
> for my particular workload, though.
> 

Ok, now that that all settled, how about something like this on top?

---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4ea7b3f1a087..a5da85fb3570 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -546,12 +546,38 @@ static bool set_nr_and_not_polling(struct task_struct *p)
 	struct thread_info *ti = task_thread_info(p);
 	return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
 }
+
+/*
+ * Atomically set TIF_NEED_RESCHED if TIF_POLLING_NRFLAG is set.
+ */
+static bool set_nr_if_polling(struct task_struct *p)
+{
+	struct thread_info *ti = task_thread_info(p);
+	typeof(ti->flags) old, val = ti->flags;
+
+	for (;;) {
+		if (!(val & _TIF_POLLING_NRFLAG))
+			return false;
+		if (val & _TIF_NEED_RESCHED)
+			return true;
+		old = cmpxchg(&ti->flags, val, val | _TIF_NEED_RESCHED);
+		if (old == val)
+			return true;
+		val = old;
+	}
+}
+
 #else
 static bool set_nr_and_not_polling(struct task_struct *p)
 {
 	set_tsk_need_resched(p);
 	return true;
 }
+
+static bool set_nr_if_polling(struct task_struct *p)
+{
+	return false;
+}
 #endif
 
 /*
@@ -652,16 +678,7 @@ static void wake_up_idle_cpu(int cpu)
 	if (rq->curr != rq->idle)
 		return;
 
-	/*
-	 * We can set TIF_RESCHED on the idle task of the other CPU
-	 * lockless. The worst case is that the other CPU runs the
-	 * idle task through an additional NOOP schedule()
-	 */
-	set_tsk_need_resched(rq->idle);
-
-	/* NEED_RESCHED must be visible before we test polling */
-	smp_mb();
-	if (!tsk_is_polling(rq->idle))
+	if (set_nr_and_not_polling(rq->idle))
 		smp_send_reschedule(cpu);
 }
 
@@ -1521,12 +1538,15 @@ static int ttwu_remote(struct task_struct *p, int wake_flags)
 }
 
 #ifdef CONFIG_SMP
-static void sched_ttwu_pending(void)
+void sched_ttwu_pending(void)
 {
 	struct rq *rq = this_rq();
 	struct llist_node *llist = llist_del_all(&rq->wake_list);
 	struct task_struct *p;
 
+	if (!llist)
+		return;
+
 	raw_spin_lock(&rq->lock);
 
 	while (llist) {
@@ -1581,8 +1601,10 @@ void scheduler_ipi(void)
 
 static void ttwu_queue_remote(struct task_struct *p, int cpu)
 {
-	if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list))
-		smp_send_reschedule(cpu);
+	if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) {
+		if (!set_nr_if_polling(p))
+			smp_send_reschedule(cpu);
+	}
 }
 
 bool cpus_share_cache(int this_cpu, int that_cpu)
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 34083c9ac976..4286f2e59136 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -12,6 +12,8 @@
 
 #include <trace/events/power.h>
 
+#include "sched.h"
+
 static int __read_mostly cpu_idle_force_poll;
 
 void cpu_idle_poll_ctrl(bool enable)
@@ -223,6 +225,7 @@ static void cpu_idle_loop(void)
 		 */
 		preempt_set_need_resched();
 		tick_nohz_idle_exit();
+		sched_ttwu_pending();
 		schedule_preempt_disabled();
 	}
 }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b2cbe81308af..2b35ec6df6b3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -670,6 +670,8 @@ extern int migrate_swap(struct task_struct *, struct task_struct *);
 
 #ifdef CONFIG_SMP
 
+extern void sched_ttwu_pending(void);
+
 #define rcu_dereference_check_sched_domain(p) \
 	rcu_dereference_check((p), \
 			      lockdep_is_held(&sched_domains_mutex))
@@ -787,6 +789,10 @@ static inline unsigned int group_first_cpu(struct sched_group *group)
 
 extern int group_balance_cpu(struct sched_group *sg);
 
+#else
+
+static inline void sched_ttwu_pending(void) { }
+
 #endif /* CONFIG_SMP */
 
 #include "stats.h"

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

  parent reply	other threads:[~2014-05-22 12:58 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-11 13:42 [RFC][PATCH 0/8] sched,idle: need resched polling rework Peter Zijlstra
2014-04-11 13:42 ` [RFC][PATCH 1/8] sched,idle,alpha: Switch from TS_POLLING to TIF_POLLING_NRFLAG Peter Zijlstra
2014-04-11 14:38   ` Richard Henderson
2014-04-11 13:42 ` [RFC][PATCH 2/8] sched,idle,tile: " Peter Zijlstra
2014-04-11 15:15   ` Chris Metcalf
2014-04-11 15:30     ` Peter Zijlstra
2014-04-11 13:42 ` [RFC][PATCH 3/8] sched,idle,ia64: " Peter Zijlstra
2014-04-11 13:42 ` [RFC][PATCH 4/8] sched,idle,x86: " Peter Zijlstra
2014-04-11 13:42 ` [RFC][PATCH 5/8] sched,idle: Remove TS_POLLING support Peter Zijlstra
2014-04-11 13:42 ` [RFC][PATCH 6/8] sched,idle: Avoid spurious wakeup IPIs Peter Zijlstra
2014-04-13 21:41   ` Nicolas Pitre
2014-05-09 13:37   ` James Hogan
2014-05-09 13:37     ` James Hogan
2014-05-09 14:15     ` Peter Zijlstra
2014-05-09 14:15       ` Peter Zijlstra
2014-05-09 14:40       ` Catalin Marinas
2014-05-09 14:40         ` Catalin Marinas
2014-05-09 14:50         ` Peter Zijlstra
2014-05-09 14:50           ` Peter Zijlstra
2014-05-09 14:57           ` Catalin Marinas
2014-05-09 14:57             ` Catalin Marinas
2014-05-09 17:02             ` Peter Zijlstra
2014-05-09 17:02               ` Peter Zijlstra
2014-05-09 17:06               ` Peter Zijlstra
2014-05-09 17:06                 ` Peter Zijlstra
2014-05-09 17:09                 ` Catalin Marinas
2014-05-09 17:09                   ` Catalin Marinas
2014-05-09 17:20                   ` Peter Zijlstra
2014-05-09 17:20                     ` Peter Zijlstra
2014-05-19 12:54                 ` [tip:sched/arch] arm64: Remove TIF_POLLING_NRFLAG tip-bot for Peter Zijlstra
2014-05-22 12:26                 ` [tip:sched/core] " tip-bot for Peter Zijlstra
2014-05-09 14:51       ` [RFC][PATCH 6/8] sched,idle: Avoid spurious wakeup IPIs James Hogan
2014-05-09 14:51         ` James Hogan
2014-05-09 14:51         ` James Hogan
     [not found]         ` <536CEB7E.9080007-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>
2014-05-15  9:17           ` James Hogan
2014-05-15  9:17             ` James Hogan
2014-05-15  9:17             ` James Hogan
2014-05-19 12:54         ` [tip:sched/arch] metag: Remove TIF_POLLING_NRFLAG tip-bot for James Hogan
2014-05-22 12:26         ` [tip:sched/core] " tip-bot for James Hogan
2014-04-11 13:42 ` [RFC][PATCH 7/8] sched,idle: Delay clearing the polling bit Peter Zijlstra
2014-04-13 21:51   ` Nicolas Pitre
2014-04-11 13:42 ` [RFC][PATCH 8/8] sched,idle: Reflow cpuidle_idle_call() Peter Zijlstra
2014-04-13 21:36   ` Nicolas Pitre
2014-04-14  8:59     ` Peter Zijlstra
2014-04-14  9:25       ` Peter Zijlstra
2014-04-14 13:55         ` Nicolas Pitre
2014-04-11 15:00 ` [RFC][PATCH 0/8] sched,idle: need resched polling rework Andy Lutomirski
2014-04-11 15:14   ` Peter Zijlstra
2014-05-22 12:58   ` Peter Zijlstra [this message]
2014-05-22 13:09     ` Peter Zijlstra
2014-05-29  0:01       ` Andy Lutomirski
2014-05-29  6:48         ` Peter Zijlstra
2014-06-03  6:40           ` Andy Lutomirski
2014-06-03  6:51             ` Peter Zijlstra
2014-06-03 10:43             ` Peter Zijlstra
2014-06-03 14:02               ` Peter Zijlstra
2014-06-03 16:05                 ` Andy Lutomirski
2014-06-03 16:19                   ` Peter Zijlstra
2014-06-03 16:52                     ` Andy Lutomirski
2014-06-03 17:00                       ` Peter Zijlstra
2014-06-03 18:28                         ` Peter Zijlstra
2014-06-03 18:44                           ` Andy Lutomirski
2014-06-03 20:07                             ` Andy Lutomirski
2014-04-12  8:35 ` Mike Galbraith

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140522125818.GV30445@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=daniel.lezcano@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mingo@kernel.org \
    --cc=nicolas.pitre@linaro.org \
    --cc=tglx@linutronix.de \
    --cc=umgwanakikbuti@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.