From: Tejun Heo <tj@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, Lai Jiangshan <jiangshanlai@gmail.com>
Subject: [GIT PULL] workqueue fixes for v4.13-rc3
Date: Mon, 31 Jul 2017 08:38:06 -0700 [thread overview]
Message-ID: <20170731153806.GC447614@devbig577.frc2.facebook.com> (raw)
Hello, Linus.
Two notable fixes.
* While adding NUMA affinity support to unbound workqueues, the
assumption that an unbound workqueue with max_active == 1 is ordered
was broken. The plan was to use explicit alloc_ordered_workqueue()
for those cases. Unfortunately, I forgot to update the
documentation properly and we grew a handful of use cases which
depend on that assumption.
While we want to convert them to alloc_ordered_workqueue(), we don't
really lose anything by enforcing ordered execution on unbound
max_active == 1 workqueues and it doesn't make sense to risk subtle
bugs. Restore the assumption.
* Workqueue assumes that CPU <-> NUMA node mapping remains static.
This is a general assumption - we don't have any synchronization
mechanism around CPU <-> node mapping. Unfortunately, powerpc may
change the mapping dynamically leading to crashes. Michael added a
workaround so that we at least don't crash while powerpc hotplug
code gets updated.
Thanks.
The following changes since commit 74cbd96bc2e00f5daa805e2ebf49e998f7045062:
Merge tag 'md/4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md (2017-07-18 11:51:08 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-4.13-fixes
for you to fetch changes up to 1ad0f0a7aa1bf3bd42dcd108a96713d255eacd9f:
workqueue: Work around edge cases for calc of pool's cpumask (2017-07-28 11:05:52 -0400)
----------------------------------------------------------------
Michael Bringmann (1):
workqueue: Work around edge cases for calc of pool's cpumask
Tejun Heo (2):
workqueue: restore WQ_UNBOUND/max_active==1 to be ordered
workqueue: implicit ordered attribute should be overridable
include/linux/workqueue.h | 4 +++-
kernel/workqueue.c | 30 ++++++++++++++++++++++++++----
2 files changed, 29 insertions(+), 5 deletions(-)
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index c102ef6..db6dc9d 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -323,6 +323,7 @@ enum {
__WQ_DRAINING = 1 << 16, /* internal: workqueue is draining */
__WQ_ORDERED = 1 << 17, /* internal: workqueue is ordered */
+ __WQ_ORDERED_EXPLICIT = 1 << 18, /* internal: alloc_ordered_workqueue() */
__WQ_LEGACY = 1 << 18, /* internal: create*_workqueue() */
WQ_MAX_ACTIVE = 512, /* I like 512, better ideas? */
@@ -422,7 +423,8 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
* Pointer to the allocated workqueue on success, %NULL on failure.
*/
#define alloc_ordered_workqueue(fmt, flags, args...) \
- alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
+ alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | \
+ __WQ_ORDERED_EXPLICIT | (flags), 1, ##args)
#define create_workqueue(name) \
alloc_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, 1, (name))
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index a86688f..ca937b0 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3577,6 +3577,13 @@ static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, int node,
/* yeap, return possible CPUs in @node that @attrs wants */
cpumask_and(cpumask, attrs->cpumask, wq_numa_possible_cpumask[node]);
+
+ if (cpumask_empty(cpumask)) {
+ pr_warn_once("WARNING: workqueue cpumask: online intersect > "
+ "possible intersect\n");
+ return false;
+ }
+
return !cpumask_equal(cpumask, attrs->cpumask);
use_dfl:
@@ -3744,8 +3751,12 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
return -EINVAL;
/* creating multiple pwqs breaks ordering guarantee */
- if (WARN_ON((wq->flags & __WQ_ORDERED) && !list_empty(&wq->pwqs)))
- return -EINVAL;
+ if (!list_empty(&wq->pwqs)) {
+ if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
+ return -EINVAL;
+
+ wq->flags &= ~__WQ_ORDERED;
+ }
ctx = apply_wqattrs_prepare(wq, attrs);
if (!ctx)
@@ -3929,6 +3940,16 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
struct workqueue_struct *wq;
struct pool_workqueue *pwq;
+ /*
+ * Unbound && max_active == 1 used to imply ordered, which is no
+ * longer the case on NUMA machines due to per-node pools. While
+ * alloc_ordered_workqueue() is the right way to create an ordered
+ * workqueue, keep the previous behavior to avoid subtle breakages
+ * on NUMA.
+ */
+ if ((flags & WQ_UNBOUND) && max_active == 1)
+ flags |= __WQ_ORDERED;
+
/* see the comment above the definition of WQ_POWER_EFFICIENT */
if ((flags & WQ_POWER_EFFICIENT) && wq_power_efficient)
flags |= WQ_UNBOUND;
@@ -4119,13 +4140,14 @@ void workqueue_set_max_active(struct workqueue_struct *wq, int max_active)
struct pool_workqueue *pwq;
/* disallow meddling with max_active for ordered workqueues */
- if (WARN_ON(wq->flags & __WQ_ORDERED))
+ if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
return;
max_active = wq_clamp_max_active(max_active, wq->flags, wq->name);
mutex_lock(&wq->mutex);
+ wq->flags &= ~__WQ_ORDERED;
wq->saved_max_active = max_active;
for_each_pwq(pwq, wq)
@@ -5253,7 +5275,7 @@ int workqueue_sysfs_register(struct workqueue_struct *wq)
* attributes breaks ordering guarantee. Disallow exposing ordered
* workqueues.
*/
- if (WARN_ON(wq->flags & __WQ_ORDERED))
+ if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
return -EINVAL;
wq->wq_dev = wq_dev = kzalloc(sizeof(*wq_dev), GFP_KERNEL);
next reply other threads:[~2017-07-31 15:38 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-31 15:38 Tejun Heo [this message]
2017-08-07 12:18 ` [GIT PULL] workqueue fixes for v4.13-rc3 Geert Uytterhoeven
2017-08-07 17:06 ` Tejun Heo
2017-08-23 8:10 ` Geert Uytterhoeven
2017-08-23 14:24 ` Tejun Heo
2017-08-23 14:47 ` Geert Uytterhoeven
2017-08-23 17:08 ` Tejun Heo
2017-08-24 13:32 ` Geert Uytterhoeven
2017-08-24 14:33 ` Tejun Heo
2017-08-28 9:07 ` Geert Uytterhoeven
2017-08-28 21:52 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170731153806.GC447614@devbig577.frc2.facebook.com \
--to=tj@kernel.org \
--cc=jiangshanlai@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.