public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: torvalds@linux-foundation.org, peterz@infradead.org,
	tglx@linutronix.de, linux-pm@vger.kernel.org,
	Tejun Heo <tj@kernel.org>,
	stable@vger.kernel.org
Subject: [PATCH 1/9] workqueue: perform cpu down operations from low priority cpu_notifier()
Date: Tue, 17 Jul 2012 10:12:21 -0700	[thread overview]
Message-ID: <1342545149-3515-2-git-send-email-tj@kernel.org> (raw)
In-Reply-To: <1342545149-3515-1-git-send-email-tj@kernel.org>

Currently, all workqueue cpu hotplug operations run off
CPU_PRI_WORKQUEUE which is higher than normal notifiers.  This is to
ensure that workqueue is up and running while bringing up a CPU before
other notifiers try to use workqueue on the CPU.

Per-cpu workqueues are supposed to remain working and bound to the CPU
for normal CPU_DOWN_PREPARE notifiers.  This holds mostly true even
with workqueue offlining running with higher priority because
workqueue CPU_DOWN_PREPARE only creates a bound trustee thread which
runs the per-cpu workqueue without concurrency management without
explicitly detaching the existing workers.

However, if the trustee needs to create new workers, it creates
unbound workers which may wander off to other CPUs while
CPU_DOWN_PREPARE notifiers are in progress.  Furthermore, if the CPU
down is cancelled, the per-CPU workqueue may end up with workers which
aren't bound to the CPU.

While reliably reproducible with a convoluted artificial test-case
involving scheduling and flushing CPU burning work items from CPU down
notifiers, this isn't very likely to happen in the wild, and, even
when it happens, the effects are likely to be hidden by the following
successful CPU down.

Fix it by using different priorities for up and down notifiers - high
priority for up operations and low priority for down operations.

Workqueue cpu hotplug operations will soon go through further cleanup.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
---
 include/linux/cpu.h |    5 +++--
 kernel/workqueue.c  |   38 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 2e9b9eb..ce7a074 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -73,8 +73,9 @@ enum {
 	/* migration should happen before other stuff but after perf */
 	CPU_PRI_PERF		= 20,
 	CPU_PRI_MIGRATION	= 10,
-	/* prepare workqueues for other notifiers */
-	CPU_PRI_WORKQUEUE	= 5,
+	/* bring up workqueues before normal notifiers and down after */
+	CPU_PRI_WORKQUEUE_UP	= 5,
+	CPU_PRI_WORKQUEUE_DOWN	= -5,
 };
 
 #define CPU_ONLINE		0x0002 /* CPU (unsigned)v is up */
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4fa9e35..f59b7fd 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3644,6 +3644,41 @@ err_destroy:
 	return NOTIFY_BAD;
 }
 
+/*
+ * Workqueues should be brought up before normal priority CPU notifiers.
+ * This will be registered high priority CPU notifier.
+ */
+static int __devinit workqueue_cpu_up_callback(struct notifier_block *nfb,
+					       unsigned long action,
+					       void *hcpu)
+{
+	switch (action & ~CPU_TASKS_FROZEN) {
+	case CPU_UP_PREPARE:
+	case CPU_UP_CANCELED:
+	case CPU_DOWN_FAILED:
+	case CPU_ONLINE:
+		return workqueue_cpu_callback(nfb, action, hcpu);
+	}
+	return NOTIFY_OK;
+}
+
+/*
+ * Workqueues should be brought down after normal priority CPU notifiers.
+ * This will be registered as low priority CPU notifier.
+ */
+static int __devinit workqueue_cpu_down_callback(struct notifier_block *nfb,
+						 unsigned long action,
+						 void *hcpu)
+{
+	switch (action & ~CPU_TASKS_FROZEN) {
+	case CPU_DOWN_PREPARE:
+	case CPU_DYING:
+	case CPU_POST_DEAD:
+		return workqueue_cpu_callback(nfb, action, hcpu);
+	}
+	return NOTIFY_OK;
+}
+
 #ifdef CONFIG_SMP
 
 struct work_for_cpu {
@@ -3839,7 +3874,8 @@ static int __init init_workqueues(void)
 	unsigned int cpu;
 	int i;
 
-	cpu_notifier(workqueue_cpu_callback, CPU_PRI_WORKQUEUE);
+	cpu_notifier(workqueue_cpu_up_callback, CPU_PRI_WORKQUEUE_UP);
+	cpu_notifier(workqueue_cpu_down_callback, CPU_PRI_WORKQUEUE_DOWN);
 
 	/* initialize gcwqs */
 	for_each_gcwq_cpu(cpu) {
-- 
1.7.7.3


  reply	other threads:[~2012-07-17 17:12 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-17 17:12 [PATCHSET] workqueue: reimplement CPU hotplug to keep idle workers Tejun Heo
2012-07-17 17:12 ` Tejun Heo [this message]
2012-07-20 21:52   ` [PATCH 1/9] workqueue: perform cpu down operations from low priority cpu_notifier() Paul E. McKenney
2012-07-20 21:58     ` Tejun Heo
2012-07-21 21:36       ` Paul E. McKenney
2012-07-22 16:43         ` [PATCH] workqueue: fix spurious CPU locality WARN from process_one_work() Tejun Heo
2012-07-22 21:23           ` Paul E. McKenney
2012-07-17 17:12 ` [PATCH 2/9] workqueue: drop CPU_DYING notifier operation Tejun Heo
2012-07-17 17:12 ` [PATCH 3/9] workqueue: ROGUE workers are UNBOUND workers Tejun Heo
2012-07-17 17:12 ` [PATCH 4/9] workqueue: use mutex for global_cwq manager exclusion Tejun Heo
2012-07-17 17:12 ` [PATCH 5/9] workqueue: drop @bind from create_worker() Tejun Heo
2012-07-17 17:12 ` [PATCH 6/9] workqueue: reimplement CPU online rebinding to handle idle workers Tejun Heo
2012-07-17 17:12 ` [PATCH 7/9] workqueue: don't butcher idle workers on an offline CPU Tejun Heo
2012-07-17 17:12 ` [PATCH 8/9] workqueue: remove CPU offline trustee Tejun Heo
2012-07-17 17:12 ` [PATCH 9/9] workqueue: simplify CPU hotplug code Tejun Heo
2012-07-17 18:43 ` [PATCHSET] workqueue: reimplement CPU hotplug to keep idle workers Rafael J. Wysocki
2012-07-17 19:40   ` Tejun Heo
2012-07-20 15:48 ` Peter Zijlstra
2012-07-20 17:02   ` Tejun Heo
2012-07-20 17:21     ` Peter Zijlstra
2012-07-20 17:50       ` Tejun Heo
2012-07-20 18:22         ` Peter Zijlstra
2012-07-20 18:34           ` Tejun Heo
2012-07-20 19:44             ` Rafael J. Wysocki
2012-07-20 19:41               ` Tejun Heo
2012-07-21  6:42               ` Shilimkar, Santosh
2012-07-23  8:38               ` Peter De Schrijver
2012-07-20 16:39 ` Peter Zijlstra
2012-07-20 16:52   ` Tejun Heo
2012-07-20 17:01     ` Peter Zijlstra
2012-07-20 17:08       ` Tejun Heo
2012-07-20 17:19         ` Peter Zijlstra
2012-07-20 17:43           ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1342545149-3515-2-git-send-email-tj@kernel.org \
    --to=tj@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox