* [RFC PATCH 0/6] workqueue: Introduce low-level unbound wq sysfs cpumask v2
@ 2014-05-07 16:36 Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 1/6] workqueue: Allow changing attributions of ordered workqueues Frederic Weisbecker
` (5 more replies)
0 siblings, 6 replies; 11+ messages in thread
From: Frederic Weisbecker @ 2014-05-07 16:36 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
Viresh Kumar
Hi,
This is the 2nd version of https://lkml.org/lkml/2014/4/24/427
Changes in this version include:
* Allow calling apply_workqueue_attrs() on ordered workqueues and remove
the associated special cases (thanks Lai!)
* Improve error handling
* Add some lockdep_assert_held() to check locking requirements
* Rename unbounds_cpumask file to cpumask (some prefix still in discussion
with Tejun).
* Better handle widening of the cpumask (see 6th patch), but this still
needs to be done more properly, see the changelog.
* Rebase on top of Tejun's workqueue next branch
Thanks,
Frederic
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
core/workqueue-v4
---
Frederic Weisbecker (5):
workqueue: Reorder sysfs code
workqueue: Create low-level unbound workqueues cpumask
workqueue: Split apply attrs code from its locking
workqueue: Allow modifying low level unbound workqueue cpumask
workqueue: Record real per-workqueue cpumask
Lai Jiangshan (1):
workqueue: Allow changing attributions of ordered workqueues
kernel/workqueue.c | 1672 ++++++++++++++++++++++++++++------------------------
1 file changed, 896 insertions(+), 776 deletions(-)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/6] workqueue: Allow changing attributions of ordered workqueues
2014-05-07 16:36 [RFC PATCH 0/6] workqueue: Introduce low-level unbound wq sysfs cpumask v2 Frederic Weisbecker
@ 2014-05-07 16:36 ` Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 2/6] workqueue: Reorder sysfs code Frederic Weisbecker
` (4 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Frederic Weisbecker @ 2014-05-07 16:36 UTC (permalink / raw)
To: LKML
Cc: Lai Jiangshan, Christoph Lameter, Kevin Hilman, Mike Galbraith,
Paul E. McKenney, Tejun Heo, Viresh Kumar, Frederic Weisbecker
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Changing the attributes of a workqueue implies the addition of new pwqs
to replace the old ones. But the current implementation doesn't handle
ordered workqueues, because they can't carry multiple pwqs without breaking
ordering. Hence ordered workqueues currently aren't allowed to call
apply_workqueue_attrs().
This results in several special cases in the workqueue code to handle
ordered workqueues. And with the addition of the global workqueue cpumask,
these special cases are going to spread even further as the number of
callers of apply_workqueue_attrs() increases.
So we want apply_workqueue_attrs() to be smarter and to be able to
handle all sorts of unbound workqueues.
This patch proposes to create new pwqs on ordered workqueues with
max_active initialized to zero. Then when the older pwq is finally
released, the new one becomes active and its max_active value is set to 1.
This way we make sure that only a single pwq ever runs at a given time
on an ordered workqueue. This enforces ordering and non-reentrancy at a
higher level.
Note that ordered works then become exceptions: they aren't subject to
the usual requeue to their previous pool, which normally guarantees
non-reentrancy when works requeue themselves back-to-back. Otherwise such
a requeue could prevent a pool switch from ever happening, by delaying the
release of the old pool forever and never letting the new one in.
Now that we can change ordered workqueue attributes, let's allow them
to be registered as WQ_SYSFS and allow changing their nice and cpumask
values. Note that in order to preserve the ordering guarantee, we still
disallow changing their max_active and no_numa values.
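The activation rule described above (new pwqs are linked at the head with
max_active = 0, and only the oldest pwq on the list may run) can be sketched
as a small userspace toy model. The array-based list and the function names
are illustrative only, this is not the kernel code:

```c
#include <assert.h>

/* Toy model of the ordered-workqueue rule: pwqs are kept newest-first,
 * and only the oldest pwq may have max_active > 0. */
#define MAX_PWQS 8

struct pwq { int max_active; };

static struct pwq *pwqs[MAX_PWQS];	/* index 0 = newest, count-1 = oldest */
static int count;

static struct pwq *oldest_pwq(void)
{
	return pwqs[count - 1];
}

/* A pwq is "active" only if it is the oldest one on the list. */
static int pwq_active(struct pwq *pwq)
{
	return pwq == oldest_pwq();
}

/* Analogue of pwq_adjust_max_active(): only the oldest pwq gets a
 * non-zero max_active; newer ones stay dormant at zero. */
static void adjust_max_active(struct pwq *pwq)
{
	pwq->max_active = pwq_active(pwq) ? 1 : 0;
}

/* Analogue of link_pwq(): new pwqs go on the head, then sync max_active. */
static void link_pwq(struct pwq *pwq)
{
	for (int i = count; i > 0; i--)
		pwqs[i] = pwqs[i - 1];
	pwqs[0] = pwq;
	count++;
	adjust_max_active(pwq);
}

/* Analogue of pwq_unbound_release_workfn(): when the oldest pwq is
 * released, the next oldest one is activated. */
static void release_oldest(void)
{
	count--;
	if (count > 0)
		adjust_max_active(oldest_pwq());
}
```

Linking a second pwq leaves it dormant until the first one is released,
which is exactly how only one pwq ever runs on an ordered workqueue.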
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
kernel/workqueue.c | 69 +++++++++++++++++++++++++++++++++---------------------
1 file changed, 42 insertions(+), 27 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c3f076f..c68e84f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1355,8 +1355,16 @@ retry:
* If @work was previously on a different pool, it might still be
* running there, in which case the work needs to be queued on that
* pool to guarantee non-reentrancy.
+ *
+ * We guarantee that only one pwq is active on an ordered workqueue.
+ * That alone enforces non-reentrancy for works. So ordered works don't
+ * need to be requeued to their previous pool. Not to mention that
+ * an ordered work requeing itself over and over on the same pool may
+ * prevent a pwq from being released in case of a pool switch. The
+ * newest pool in that case couldn't switch in and its pending works
+ * would starve.
*/
- last_pool = get_work_pool(work);
+ last_pool = wq->flags & __WQ_ORDERED ? NULL : get_work_pool(work);
if (last_pool && last_pool != pwq->pool) {
struct worker *worker;
@@ -3319,6 +3327,10 @@ static ssize_t wq_numa_store(struct device *dev, struct device_attribute *attr,
struct workqueue_attrs *attrs;
int v, ret;
+ /* Creating per-node pwqs breaks ordering guarantee. Keep no_numa = 1 */
+ if (WARN_ON(wq->flags & __WQ_ORDERED))
+ return -EINVAL;
+
attrs = wq_sysfs_prep_attrs(wq);
if (!attrs)
return -ENOMEM;
@@ -3379,14 +3391,6 @@ int workqueue_sysfs_register(struct workqueue_struct *wq)
struct wq_device *wq_dev;
int ret;
- /*
- * Adjusting max_active or creating new pwqs by applyting
- * attributes breaks ordering guarantee. Disallow exposing ordered
- * workqueues.
- */
- if (WARN_ON(wq->flags & __WQ_ORDERED))
- return -EINVAL;
-
wq->wq_dev = wq_dev = kzalloc(sizeof(*wq_dev), GFP_KERNEL);
if (!wq_dev)
return -ENOMEM;
@@ -3708,6 +3712,13 @@ static void rcu_free_pwq(struct rcu_head *rcu)
container_of(rcu, struct pool_workqueue, rcu));
}
+static struct pool_workqueue *oldest_pwq(struct workqueue_struct *wq)
+{
+ return list_last_entry(&wq->pwqs, struct pool_workqueue, pwqs_node);
+}
+
+static void pwq_adjust_max_active(struct pool_workqueue *pwq);
+
/*
* Scheduled on system_wq by put_pwq() when an unbound pwq hits zero refcnt
* and needs to be destroyed.
@@ -3723,14 +3734,12 @@ static void pwq_unbound_release_workfn(struct work_struct *work)
if (WARN_ON_ONCE(!(wq->flags & WQ_UNBOUND)))
return;
- /*
- * Unlink @pwq. Synchronization against wq->mutex isn't strictly
- * necessary on release but do it anyway. It's easier to verify
- * and consistent with the linking path.
- */
mutex_lock(&wq->mutex);
list_del_rcu(&pwq->pwqs_node);
is_last = list_empty(&wq->pwqs);
+ /* try to activate the oldest pwq when needed */
+ if (!is_last && (wq->flags & __WQ_ORDERED))
+ pwq_adjust_max_active(oldest_pwq(wq));
mutex_unlock(&wq->mutex);
mutex_lock(&wq_pool_mutex);
@@ -3749,6 +3758,16 @@ static void pwq_unbound_release_workfn(struct work_struct *work)
}
}
+static bool pwq_active(struct pool_workqueue *pwq)
+{
+ /* Only the oldest pwq is active in the ordered wq */
+ if (pwq->wq->flags & __WQ_ORDERED)
+ return pwq == oldest_pwq(pwq->wq);
+
+ /* All pwqs in the non-ordered wq are active */
+ return true;
+}
+
/**
* pwq_adjust_max_active - update a pwq's max_active to the current setting
* @pwq: target pool_workqueue
@@ -3771,7 +3790,8 @@ static void pwq_adjust_max_active(struct pool_workqueue *pwq)
spin_lock_irq(&pwq->pool->lock);
- if (!freezable || !(pwq->pool->flags & POOL_FREEZING)) {
+ if ((!freezable || !(pwq->pool->flags & POOL_FREEZING)) &&
+ pwq_active(pwq)) {
pwq->max_active = wq->saved_max_active;
while (!list_empty(&pwq->delayed_works) &&
@@ -3825,11 +3845,11 @@ static void link_pwq(struct pool_workqueue *pwq)
*/
pwq->work_color = wq->work_color;
+ /* link in @pwq on the head of &wq->pwqs */
+ list_add_rcu(&pwq->pwqs_node, &wq->pwqs);
+
/* sync max_active to the current setting */
pwq_adjust_max_active(pwq);
-
- /* link in @pwq */
- list_add_rcu(&pwq->pwqs_node, &wq->pwqs);
}
/* obtain a pool matching @attr and create a pwq associating the pool and @wq */
@@ -3955,8 +3975,8 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
if (WARN_ON(!(wq->flags & WQ_UNBOUND)))
return -EINVAL;
- /* creating multiple pwqs breaks ordering guarantee */
- if (WARN_ON((wq->flags & __WQ_ORDERED) && !list_empty(&wq->pwqs)))
+ /* creating multiple per-node pwqs breaks ordering guarantee */
+ if (WARN_ON((wq->flags & __WQ_ORDERED) && !attrs->no_numa))
return -EINVAL;
pwq_tbl = kzalloc(wq_numa_tbl_len * sizeof(pwq_tbl[0]), GFP_KERNEL);
@@ -4146,7 +4166,7 @@ out_unlock:
static int alloc_and_link_pwqs(struct workqueue_struct *wq)
{
bool highpri = wq->flags & WQ_HIGHPRI;
- int cpu, ret;
+ int cpu;
if (!(wq->flags & WQ_UNBOUND)) {
wq->cpu_pwqs = alloc_percpu(struct pool_workqueue);
@@ -4167,12 +4187,7 @@ static int alloc_and_link_pwqs(struct workqueue_struct *wq)
}
return 0;
} else if (wq->flags & __WQ_ORDERED) {
- ret = apply_workqueue_attrs(wq, ordered_wq_attrs[highpri]);
- /* there should only be single pwq for ordering guarantee */
- WARN(!ret && (wq->pwqs.next != &wq->dfl_pwq->pwqs_node ||
- wq->pwqs.prev != &wq->dfl_pwq->pwqs_node),
- "ordering guarantee broken for workqueue %s\n", wq->name);
- return ret;
+ return apply_workqueue_attrs(wq, ordered_wq_attrs[highpri]);
} else {
return apply_workqueue_attrs(wq, unbound_std_wq_attrs[highpri]);
}
--
1.8.3.1
* [PATCH 2/6] workqueue: Reorder sysfs code
2014-05-07 16:36 [RFC PATCH 0/6] workqueue: Introduce low-level unbound wq sysfs cpumask v2 Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 1/6] workqueue: Allow changing attributions of ordered workqueues Frederic Weisbecker
@ 2014-05-07 16:36 ` Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 3/6] workqueue: Create low-level unbound workqueues cpumask Frederic Weisbecker
` (3 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Frederic Weisbecker @ 2014-05-07 16:36 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
Viresh Kumar
The sysfs code usually belongs at the bottom of the file since it deals
with high-level objects. In the workqueue code it's misplaced, such that
we'd need to work around function references to allow the sysfs code to
call APIs like apply_workqueue_attrs().
Let's move that block further down in the file, right above
alloc_workqueue_key() which references it.
Suggested-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
kernel/workqueue.c | 626 ++++++++++++++++++++++++++---------------------------
1 file changed, 313 insertions(+), 313 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c68e84f..e5d7719 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3137,319 +3137,6 @@ int execute_in_process_context(work_func_t fn, struct execute_work *ew)
}
EXPORT_SYMBOL_GPL(execute_in_process_context);
-#ifdef CONFIG_SYSFS
-/*
- * Workqueues with WQ_SYSFS flag set is visible to userland via
- * /sys/bus/workqueue/devices/WQ_NAME. All visible workqueues have the
- * following attributes.
- *
- * per_cpu RO bool : whether the workqueue is per-cpu or unbound
- * max_active RW int : maximum number of in-flight work items
- *
- * Unbound workqueues have the following extra attributes.
- *
- * id RO int : the associated pool ID
- * nice RW int : nice value of the workers
- * cpumask RW mask : bitmask of allowed CPUs for the workers
- */
-struct wq_device {
- struct workqueue_struct *wq;
- struct device dev;
-};
-
-static struct workqueue_struct *dev_to_wq(struct device *dev)
-{
- struct wq_device *wq_dev = container_of(dev, struct wq_device, dev);
-
- return wq_dev->wq;
-}
-
-static ssize_t per_cpu_show(struct device *dev, struct device_attribute *attr,
- char *buf)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
-
- return scnprintf(buf, PAGE_SIZE, "%d\n", (bool)!(wq->flags & WQ_UNBOUND));
-}
-static DEVICE_ATTR_RO(per_cpu);
-
-static ssize_t max_active_show(struct device *dev,
- struct device_attribute *attr, char *buf)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
-
- return scnprintf(buf, PAGE_SIZE, "%d\n", wq->saved_max_active);
-}
-
-static ssize_t max_active_store(struct device *dev,
- struct device_attribute *attr, const char *buf,
- size_t count)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
- int val;
-
- if (sscanf(buf, "%d", &val) != 1 || val <= 0)
- return -EINVAL;
-
- workqueue_set_max_active(wq, val);
- return count;
-}
-static DEVICE_ATTR_RW(max_active);
-
-static struct attribute *wq_sysfs_attrs[] = {
- &dev_attr_per_cpu.attr,
- &dev_attr_max_active.attr,
- NULL,
-};
-ATTRIBUTE_GROUPS(wq_sysfs);
-
-static ssize_t wq_pool_ids_show(struct device *dev,
- struct device_attribute *attr, char *buf)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
- const char *delim = "";
- int node, written = 0;
-
- rcu_read_lock_sched();
- for_each_node(node) {
- written += scnprintf(buf + written, PAGE_SIZE - written,
- "%s%d:%d", delim, node,
- unbound_pwq_by_node(wq, node)->pool->id);
- delim = " ";
- }
- written += scnprintf(buf + written, PAGE_SIZE - written, "\n");
- rcu_read_unlock_sched();
-
- return written;
-}
-
-static ssize_t wq_nice_show(struct device *dev, struct device_attribute *attr,
- char *buf)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
- int written;
-
- mutex_lock(&wq->mutex);
- written = scnprintf(buf, PAGE_SIZE, "%d\n", wq->unbound_attrs->nice);
- mutex_unlock(&wq->mutex);
-
- return written;
-}
-
-/* prepare workqueue_attrs for sysfs store operations */
-static struct workqueue_attrs *wq_sysfs_prep_attrs(struct workqueue_struct *wq)
-{
- struct workqueue_attrs *attrs;
-
- attrs = alloc_workqueue_attrs(GFP_KERNEL);
- if (!attrs)
- return NULL;
-
- mutex_lock(&wq->mutex);
- copy_workqueue_attrs(attrs, wq->unbound_attrs);
- mutex_unlock(&wq->mutex);
- return attrs;
-}
-
-static ssize_t wq_nice_store(struct device *dev, struct device_attribute *attr,
- const char *buf, size_t count)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
- struct workqueue_attrs *attrs;
- int ret;
-
- attrs = wq_sysfs_prep_attrs(wq);
- if (!attrs)
- return -ENOMEM;
-
- if (sscanf(buf, "%d", &attrs->nice) == 1 &&
- attrs->nice >= MIN_NICE && attrs->nice <= MAX_NICE)
- ret = apply_workqueue_attrs(wq, attrs);
- else
- ret = -EINVAL;
-
- free_workqueue_attrs(attrs);
- return ret ?: count;
-}
-
-static ssize_t wq_cpumask_show(struct device *dev,
- struct device_attribute *attr, char *buf)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
- int written;
-
- mutex_lock(&wq->mutex);
- written = cpumask_scnprintf(buf, PAGE_SIZE, wq->unbound_attrs->cpumask);
- mutex_unlock(&wq->mutex);
-
- written += scnprintf(buf + written, PAGE_SIZE - written, "\n");
- return written;
-}
-
-static ssize_t wq_cpumask_store(struct device *dev,
- struct device_attribute *attr,
- const char *buf, size_t count)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
- struct workqueue_attrs *attrs;
- int ret;
-
- attrs = wq_sysfs_prep_attrs(wq);
- if (!attrs)
- return -ENOMEM;
-
- ret = cpumask_parse(buf, attrs->cpumask);
- if (!ret)
- ret = apply_workqueue_attrs(wq, attrs);
-
- free_workqueue_attrs(attrs);
- return ret ?: count;
-}
-
-static ssize_t wq_numa_show(struct device *dev, struct device_attribute *attr,
- char *buf)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
- int written;
-
- mutex_lock(&wq->mutex);
- written = scnprintf(buf, PAGE_SIZE, "%d\n",
- !wq->unbound_attrs->no_numa);
- mutex_unlock(&wq->mutex);
-
- return written;
-}
-
-static ssize_t wq_numa_store(struct device *dev, struct device_attribute *attr,
- const char *buf, size_t count)
-{
- struct workqueue_struct *wq = dev_to_wq(dev);
- struct workqueue_attrs *attrs;
- int v, ret;
-
- /* Creating per-node pwqs breaks ordering guarantee. Keep no_numa = 1 */
- if (WARN_ON(wq->flags & __WQ_ORDERED))
- return -EINVAL;
-
- attrs = wq_sysfs_prep_attrs(wq);
- if (!attrs)
- return -ENOMEM;
-
- ret = -EINVAL;
- if (sscanf(buf, "%d", &v) == 1) {
- attrs->no_numa = !v;
- ret = apply_workqueue_attrs(wq, attrs);
- }
-
- free_workqueue_attrs(attrs);
- return ret ?: count;
-}
-
-static struct device_attribute wq_sysfs_unbound_attrs[] = {
- __ATTR(pool_ids, 0444, wq_pool_ids_show, NULL),
- __ATTR(nice, 0644, wq_nice_show, wq_nice_store),
- __ATTR(cpumask, 0644, wq_cpumask_show, wq_cpumask_store),
- __ATTR(numa, 0644, wq_numa_show, wq_numa_store),
- __ATTR_NULL,
-};
-
-static struct bus_type wq_subsys = {
- .name = "workqueue",
- .dev_groups = wq_sysfs_groups,
-};
-
-static int __init wq_sysfs_init(void)
-{
- return subsys_virtual_register(&wq_subsys, NULL);
-}
-core_initcall(wq_sysfs_init);
-
-static void wq_device_release(struct device *dev)
-{
- struct wq_device *wq_dev = container_of(dev, struct wq_device, dev);
-
- kfree(wq_dev);
-}
-
-/**
- * workqueue_sysfs_register - make a workqueue visible in sysfs
- * @wq: the workqueue to register
- *
- * Expose @wq in sysfs under /sys/bus/workqueue/devices.
- * alloc_workqueue*() automatically calls this function if WQ_SYSFS is set
- * which is the preferred method.
- *
- * Workqueue user should use this function directly iff it wants to apply
- * workqueue_attrs before making the workqueue visible in sysfs; otherwise,
- * apply_workqueue_attrs() may race against userland updating the
- * attributes.
- *
- * Return: 0 on success, -errno on failure.
- */
-int workqueue_sysfs_register(struct workqueue_struct *wq)
-{
- struct wq_device *wq_dev;
- int ret;
-
- wq->wq_dev = wq_dev = kzalloc(sizeof(*wq_dev), GFP_KERNEL);
- if (!wq_dev)
- return -ENOMEM;
-
- wq_dev->wq = wq;
- wq_dev->dev.bus = &wq_subsys;
- wq_dev->dev.init_name = wq->name;
- wq_dev->dev.release = wq_device_release;
-
- /*
- * unbound_attrs are created separately. Suppress uevent until
- * everything is ready.
- */
- dev_set_uevent_suppress(&wq_dev->dev, true);
-
- ret = device_register(&wq_dev->dev);
- if (ret) {
- kfree(wq_dev);
- wq->wq_dev = NULL;
- return ret;
- }
-
- if (wq->flags & WQ_UNBOUND) {
- struct device_attribute *attr;
-
- for (attr = wq_sysfs_unbound_attrs; attr->attr.name; attr++) {
- ret = device_create_file(&wq_dev->dev, attr);
- if (ret) {
- device_unregister(&wq_dev->dev);
- wq->wq_dev = NULL;
- return ret;
- }
- }
- }
-
- kobject_uevent(&wq_dev->dev.kobj, KOBJ_ADD);
- return 0;
-}
-
-/**
- * workqueue_sysfs_unregister - undo workqueue_sysfs_register()
- * @wq: the workqueue to unregister
- *
- * If @wq is registered to sysfs by workqueue_sysfs_register(), unregister.
- */
-static void workqueue_sysfs_unregister(struct workqueue_struct *wq)
-{
- struct wq_device *wq_dev = wq->wq_dev;
-
- if (!wq->wq_dev)
- return;
-
- wq->wq_dev = NULL;
- device_unregister(&wq_dev->dev);
-}
-#else /* CONFIG_SYSFS */
-static void workqueue_sysfs_unregister(struct workqueue_struct *wq) { }
-#endif /* CONFIG_SYSFS */
-
/**
* free_workqueue_attrs - free a workqueue_attrs
* @attrs: workqueue_attrs to free
@@ -4163,6 +3850,319 @@ out_unlock:
put_pwq_unlocked(old_pwq);
}
+#ifdef CONFIG_SYSFS
+/*
+ * Workqueues with WQ_SYSFS flag set is visible to userland via
+ * /sys/bus/workqueue/devices/WQ_NAME. All visible workqueues have the
+ * following attributes.
+ *
+ * per_cpu RO bool : whether the workqueue is per-cpu or unbound
+ * max_active RW int : maximum number of in-flight work items
+ *
+ * Unbound workqueues have the following extra attributes.
+ *
+ * id RO int : the associated pool ID
+ * nice RW int : nice value of the workers
+ * cpumask RW mask : bitmask of allowed CPUs for the workers
+ */
+struct wq_device {
+ struct workqueue_struct *wq;
+ struct device dev;
+};
+
+static struct workqueue_struct *dev_to_wq(struct device *dev)
+{
+ struct wq_device *wq_dev = container_of(dev, struct wq_device, dev);
+
+ return wq_dev->wq;
+}
+
+static ssize_t per_cpu_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+
+ return scnprintf(buf, PAGE_SIZE, "%d\n", (bool)!(wq->flags & WQ_UNBOUND));
+}
+static DEVICE_ATTR_RO(per_cpu);
+
+static ssize_t max_active_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+
+ return scnprintf(buf, PAGE_SIZE, "%d\n", wq->saved_max_active);
+}
+
+static ssize_t max_active_store(struct device *dev,
+ struct device_attribute *attr, const char *buf,
+ size_t count)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+ int val;
+
+ if (sscanf(buf, "%d", &val) != 1 || val <= 0)
+ return -EINVAL;
+
+ workqueue_set_max_active(wq, val);
+ return count;
+}
+static DEVICE_ATTR_RW(max_active);
+
+static struct attribute *wq_sysfs_attrs[] = {
+ &dev_attr_per_cpu.attr,
+ &dev_attr_max_active.attr,
+ NULL,
+};
+ATTRIBUTE_GROUPS(wq_sysfs);
+
+static ssize_t wq_pool_ids_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+ const char *delim = "";
+ int node, written = 0;
+
+ rcu_read_lock_sched();
+ for_each_node(node) {
+ written += scnprintf(buf + written, PAGE_SIZE - written,
+ "%s%d:%d", delim, node,
+ unbound_pwq_by_node(wq, node)->pool->id);
+ delim = " ";
+ }
+ written += scnprintf(buf + written, PAGE_SIZE - written, "\n");
+ rcu_read_unlock_sched();
+
+ return written;
+}
+
+static ssize_t wq_nice_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+ int written;
+
+ mutex_lock(&wq->mutex);
+ written = scnprintf(buf, PAGE_SIZE, "%d\n", wq->unbound_attrs->nice);
+ mutex_unlock(&wq->mutex);
+
+ return written;
+}
+
+/* prepare workqueue_attrs for sysfs store operations */
+static struct workqueue_attrs *wq_sysfs_prep_attrs(struct workqueue_struct *wq)
+{
+ struct workqueue_attrs *attrs;
+
+ attrs = alloc_workqueue_attrs(GFP_KERNEL);
+ if (!attrs)
+ return NULL;
+
+ mutex_lock(&wq->mutex);
+ copy_workqueue_attrs(attrs, wq->unbound_attrs);
+ mutex_unlock(&wq->mutex);
+ return attrs;
+}
+
+static ssize_t wq_nice_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+ struct workqueue_attrs *attrs;
+ int ret;
+
+ attrs = wq_sysfs_prep_attrs(wq);
+ if (!attrs)
+ return -ENOMEM;
+
+ if (sscanf(buf, "%d", &attrs->nice) == 1 &&
+ attrs->nice >= MIN_NICE && attrs->nice <= MAX_NICE)
+ ret = apply_workqueue_attrs(wq, attrs);
+ else
+ ret = -EINVAL;
+
+ free_workqueue_attrs(attrs);
+ return ret ?: count;
+}
+
+static ssize_t wq_cpumask_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+ int written;
+
+ mutex_lock(&wq->mutex);
+ written = cpumask_scnprintf(buf, PAGE_SIZE, wq->unbound_attrs->cpumask);
+ mutex_unlock(&wq->mutex);
+
+ written += scnprintf(buf + written, PAGE_SIZE - written, "\n");
+ return written;
+}
+
+static ssize_t wq_cpumask_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+ struct workqueue_attrs *attrs;
+ int ret;
+
+ attrs = wq_sysfs_prep_attrs(wq);
+ if (!attrs)
+ return -ENOMEM;
+
+ ret = cpumask_parse(buf, attrs->cpumask);
+ if (!ret)
+ ret = apply_workqueue_attrs(wq, attrs);
+
+ free_workqueue_attrs(attrs);
+ return ret ?: count;
+}
+
+static ssize_t wq_numa_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+ int written;
+
+ mutex_lock(&wq->mutex);
+ written = scnprintf(buf, PAGE_SIZE, "%d\n",
+ !wq->unbound_attrs->no_numa);
+ mutex_unlock(&wq->mutex);
+
+ return written;
+}
+
+static ssize_t wq_numa_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct workqueue_struct *wq = dev_to_wq(dev);
+ struct workqueue_attrs *attrs;
+ int v, ret;
+
+ /* Creating per-node pwqs breaks ordering guarantee. Keep no_numa = 1 */
+ if (WARN_ON(wq->flags & __WQ_ORDERED))
+ return -EINVAL;
+
+ attrs = wq_sysfs_prep_attrs(wq);
+ if (!attrs)
+ return -ENOMEM;
+
+ ret = -EINVAL;
+ if (sscanf(buf, "%d", &v) == 1) {
+ attrs->no_numa = !v;
+ ret = apply_workqueue_attrs(wq, attrs);
+ }
+
+ free_workqueue_attrs(attrs);
+ return ret ?: count;
+}
+
+static struct device_attribute wq_sysfs_unbound_attrs[] = {
+ __ATTR(pool_ids, 0444, wq_pool_ids_show, NULL),
+ __ATTR(nice, 0644, wq_nice_show, wq_nice_store),
+ __ATTR(cpumask, 0644, wq_cpumask_show, wq_cpumask_store),
+ __ATTR(numa, 0644, wq_numa_show, wq_numa_store),
+ __ATTR_NULL,
+};
+
+static struct bus_type wq_subsys = {
+ .name = "workqueue",
+ .dev_groups = wq_sysfs_groups,
+};
+
+static int __init wq_sysfs_init(void)
+{
+ return subsys_virtual_register(&wq_subsys, NULL);
+}
+core_initcall(wq_sysfs_init);
+
+static void wq_device_release(struct device *dev)
+{
+ struct wq_device *wq_dev = container_of(dev, struct wq_device, dev);
+
+ kfree(wq_dev);
+}
+
+/**
+ * workqueue_sysfs_register - make a workqueue visible in sysfs
+ * @wq: the workqueue to register
+ *
+ * Expose @wq in sysfs under /sys/bus/workqueue/devices.
+ * alloc_workqueue*() automatically calls this function if WQ_SYSFS is set
+ * which is the preferred method.
+ *
+ * Workqueue user should use this function directly iff it wants to apply
+ * workqueue_attrs before making the workqueue visible in sysfs; otherwise,
+ * apply_workqueue_attrs() may race against userland updating the
+ * attributes.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int workqueue_sysfs_register(struct workqueue_struct *wq)
+{
+ struct wq_device *wq_dev;
+ int ret;
+
+ wq->wq_dev = wq_dev = kzalloc(sizeof(*wq_dev), GFP_KERNEL);
+ if (!wq_dev)
+ return -ENOMEM;
+
+ wq_dev->wq = wq;
+ wq_dev->dev.bus = &wq_subsys;
+ wq_dev->dev.init_name = wq->name;
+ wq_dev->dev.release = wq_device_release;
+
+ /*
+ * unbound_attrs are created separately. Suppress uevent until
+ * everything is ready.
+ */
+ dev_set_uevent_suppress(&wq_dev->dev, true);
+
+ ret = device_register(&wq_dev->dev);
+ if (ret) {
+ kfree(wq_dev);
+ wq->wq_dev = NULL;
+ return ret;
+ }
+
+ if (wq->flags & WQ_UNBOUND) {
+ struct device_attribute *attr;
+
+ for (attr = wq_sysfs_unbound_attrs; attr->attr.name; attr++) {
+ ret = device_create_file(&wq_dev->dev, attr);
+ if (ret) {
+ device_unregister(&wq_dev->dev);
+ wq->wq_dev = NULL;
+ return ret;
+ }
+ }
+ }
+
+ kobject_uevent(&wq_dev->dev.kobj, KOBJ_ADD);
+ return 0;
+}
+
+/**
+ * workqueue_sysfs_unregister - undo workqueue_sysfs_register()
+ * @wq: the workqueue to unregister
+ *
+ * If @wq is registered to sysfs by workqueue_sysfs_register(), unregister.
+ */
+static void workqueue_sysfs_unregister(struct workqueue_struct *wq)
+{
+ struct wq_device *wq_dev = wq->wq_dev;
+
+ if (!wq->wq_dev)
+ return;
+
+ wq->wq_dev = NULL;
+ device_unregister(&wq_dev->dev);
+}
+#else /* CONFIG_SYSFS */
+static void workqueue_sysfs_unregister(struct workqueue_struct *wq) { }
+#endif /* CONFIG_SYSFS */
+
static int alloc_and_link_pwqs(struct workqueue_struct *wq)
{
bool highpri = wq->flags & WQ_HIGHPRI;
--
1.8.3.1
* [PATCH 3/6] workqueue: Create low-level unbound workqueues cpumask
2014-05-07 16:36 [RFC PATCH 0/6] workqueue: Introduce low-level unbound wq sysfs cpumask v2 Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 1/6] workqueue: Allow changing attributions of ordered workqueues Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 2/6] workqueue: Reorder sysfs code Frederic Weisbecker
@ 2014-05-07 16:36 ` Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 4/6] workqueue: Split apply attrs code from its locking Frederic Weisbecker
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Frederic Weisbecker @ 2014-05-07 16:36 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
Viresh Kumar
Create a cpumask that limits the affinity of all unbound workqueues.
This cpumask is controlled through a file at the root of the workqueue
sysfs directory.
It works at a lower level than the per-workqueue WQ_SYSFS cpumask files,
such that the effective cpumask applied to a given unbound workqueue is
the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and
the new /sys/devices/virtual/workqueue/cpumask_unbounds file.
This patch implements the basic infrastructure and the read interface.
cpumask_unbounds is initially set to cpu_possible_mask.
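The two-level mask described above can be modeled with plain bitmasks: the
mask a workqueue effectively uses is the intersection of its own requested
cpumask and the global unbound mask. This is a toy sketch; the kernel uses
cpumask_and() on cpumask_var_t, not uint64_t:

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for wq_unbound_cpumask: one bit per CPU, all CPUs
 * allowed by default (cpu_possible_mask analogue for 8 CPUs). */
static uint64_t wq_unbound_cpumask = 0xff;

/* Analogue of the cpumask_and() in apply_workqueue_attrs(): the mask a
 * workqueue actually runs with is its own mask restricted by the
 * low-level global mask. */
static uint64_t effective_cpumask(uint64_t wq_cpumask)
{
	return wq_cpumask & wq_unbound_cpumask;
}
```

Narrowing the global mask (as the later write interface does) immediately
narrows every unbound workqueue's effective affinity.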
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
kernel/workqueue.c | 29 +++++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index e5d7719..1252a8c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -293,6 +293,8 @@ static DEFINE_SPINLOCK(wq_mayday_lock); /* protects wq->maydays list */
static LIST_HEAD(workqueues); /* PL: list of all workqueues */
static bool workqueue_freezing; /* PL: have wqs started freezing? */
+static cpumask_var_t wq_unbound_cpumask;
+
/* the per-cpu worker pools */
static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
cpu_worker_pools);
@@ -3674,7 +3676,7 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
/* make a copy of @attrs and sanitize it */
copy_workqueue_attrs(new_attrs, attrs);
- cpumask_and(new_attrs->cpumask, new_attrs->cpumask, cpu_possible_mask);
+ cpumask_and(new_attrs->cpumask, new_attrs->cpumask, wq_unbound_cpumask);
/*
* We may create multiple pwqs with differing cpumasks. Make a
@@ -4071,9 +4073,29 @@ static struct bus_type wq_subsys = {
.dev_groups = wq_sysfs_groups,
};
+static ssize_t unbounds_cpumask_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ int written;
+
+ written = cpumask_scnprintf(buf, PAGE_SIZE, wq_unbound_cpumask);
+ written += scnprintf(buf + written, PAGE_SIZE - written, "\n");
+
+ return written;
+}
+
+static struct device_attribute wq_sysfs_cpumask_attr =
+ __ATTR(cpumask, 0444, unbounds_cpumask_show, NULL);
+
static int __init wq_sysfs_init(void)
{
- return subsys_virtual_register(&wq_subsys, NULL);
+ int err;
+
+ err = subsys_virtual_register(&wq_subsys, NULL);
+ if (err)
+ return err;
+
+ return device_create_file(wq_subsys.dev_root, &wq_sysfs_cpumask_attr);
}
core_initcall(wq_sysfs_init);
@@ -5068,6 +5090,9 @@ static int __init init_workqueues(void)
WARN_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
+ BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
+ cpumask_copy(wq_unbound_cpumask, cpu_possible_mask);
+
pwq_cache = KMEM_CACHE(pool_workqueue, SLAB_PANIC);
cpu_notifier(workqueue_cpu_up_callback, CPU_PRI_WORKQUEUE_UP);
--
1.8.3.1
* [PATCH 4/6] workqueue: Split apply attrs code from its locking
2014-05-07 16:36 [RFC PATCH 0/6] workqueue: Introduce low-level unbound wq sysfs cpumask v2 Frederic Weisbecker
` (2 preceding siblings ...)
2014-05-07 16:36 ` [PATCH 3/6] workqueue: Create low-level unbound workqueues cpumask Frederic Weisbecker
@ 2014-05-07 16:36 ` Frederic Weisbecker
2014-05-07 16:37 ` [PATCH 5/6] workqueue: Allow modifying low level unbound workqueue cpumask Frederic Weisbecker
2014-05-07 16:37 ` [PATCH 6/6] workqueue: Record real per-workqueue cpumask Frederic Weisbecker
5 siblings, 0 replies; 11+ messages in thread
From: Frederic Weisbecker @ 2014-05-07 16:36 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
Viresh Kumar
In order to allow overriding the low-level cpumask of unbound workqueues,
we need to be able to call apply_workqueue_attrs() on all workqueues on
the global workqueue list.
Now, since traversing that list requires holding wq_pool_mutex, we can't
currently call apply_workqueue_attrs() from within the traversal: it takes
the mutex itself.
So let's provide a version of apply_workqueue_attrs() that can be called
with wq_pool_mutex already held.
Suggested-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
kernel/workqueue.c | 77 +++++++++++++++++++++++++++++++-----------------------
1 file changed, 44 insertions(+), 33 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1252a8c..2aa296d 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3637,24 +3637,9 @@ static struct pool_workqueue *numa_pwq_tbl_install(struct workqueue_struct *wq,
return old_pwq;
}
-/**
- * apply_workqueue_attrs - apply new workqueue_attrs to an unbound workqueue
- * @wq: the target workqueue
- * @attrs: the workqueue_attrs to apply, allocated with alloc_workqueue_attrs()
- *
- * Apply @attrs to an unbound workqueue @wq. Unless disabled, on NUMA
- * machines, this function maps a separate pwq to each NUMA node with
- * possibles CPUs in @attrs->cpumask so that work items are affine to the
- * NUMA node it was issued on. Older pwqs are released as in-flight work
- * items finish. Note that a work item which repeatedly requeues itself
- * back-to-back will stay on its current pwq.
- *
- * Performs GFP_KERNEL allocations.
- *
- * Return: 0 on success and -errno on failure.
- */
-int apply_workqueue_attrs(struct workqueue_struct *wq,
- const struct workqueue_attrs *attrs)
+static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
+ const struct workqueue_attrs *attrs,
+ cpumask_var_t unbounds_cpumask)
{
struct workqueue_attrs *new_attrs, *tmp_attrs;
struct pool_workqueue **pwq_tbl, *dfl_pwq;
@@ -3676,7 +3661,7 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
/* make a copy of @attrs and sanitize it */
copy_workqueue_attrs(new_attrs, attrs);
- cpumask_and(new_attrs->cpumask, new_attrs->cpumask, wq_unbound_cpumask);
+ cpumask_and(new_attrs->cpumask, new_attrs->cpumask, unbounds_cpumask);
/*
* We may create multiple pwqs with differing cpumasks. Make a
@@ -3686,15 +3671,6 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
copy_workqueue_attrs(tmp_attrs, new_attrs);
/*
- * CPUs should stay stable across pwq creations and installations.
- * Pin CPUs, determine the target cpumask for each node and create
- * pwqs accordingly.
- */
- get_online_cpus();
-
- mutex_lock(&wq_pool_mutex);
-
- /*
* If something goes wrong during CPU up/down, we'll fall back to
* the default pwq covering whole @attrs->cpumask. Always create
* it even if we don't use it immediately.
@@ -3714,8 +3690,6 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
}
}
- mutex_unlock(&wq_pool_mutex);
-
/* all pwqs have been created successfully, let's install'em */
mutex_lock(&wq->mutex);
@@ -3736,7 +3710,6 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
put_pwq_unlocked(pwq_tbl[node]);
put_pwq_unlocked(dfl_pwq);
- put_online_cpus();
ret = 0;
/* fall through */
out_free:
@@ -3750,14 +3723,52 @@ enomem_pwq:
for_each_node(node)
if (pwq_tbl && pwq_tbl[node] != dfl_pwq)
free_unbound_pwq(pwq_tbl[node]);
- mutex_unlock(&wq_pool_mutex);
- put_online_cpus();
enomem:
ret = -ENOMEM;
goto out_free;
}
/**
+ * apply_workqueue_attrs - apply new workqueue_attrs to an unbound workqueue
+ * @wq: the target workqueue
+ * @attrs: the workqueue_attrs to apply, allocated with alloc_workqueue_attrs()
+ *
+ * Apply @attrs to an unbound workqueue @wq. Unless disabled, on NUMA
+ * machines, this function maps a separate pwq to each NUMA node with
+ * possibles CPUs in @attrs->cpumask so that work items are affine to the
+ * NUMA node it was issued on. Older pwqs are released as in-flight work
+ * items finish. Note that a work item which repeatedly requeues itself
+ * back-to-back will stay on its current pwq.
+ *
+ * Performs GFP_KERNEL allocations.
+ *
+ * Return: 0 on success and -errno on failure.
+ */
+int apply_workqueue_attrs(struct workqueue_struct *wq,
+ const struct workqueue_attrs *attrs)
+{
+ int ret;
+
+ /*
+ * CPUs should stay stable across pwq creations and installations.
+ * Pin CPUs, determine the target cpumask for each node and create
+ * pwqs accordingly.
+ */
+
+ get_online_cpus();
+ /*
+ * Lock for alloc_unbound_pwq()
+ */
+ mutex_lock(&wq_pool_mutex);
+ ret = apply_workqueue_attrs_locked(wq, attrs, wq_unbound_cpumask);
+ mutex_unlock(&wq_pool_mutex);
+ put_online_cpus();
+
+ return ret;
+}
+
+
+/**
* wq_update_unbound_numa - update NUMA affinity of a wq for CPU hot[un]plug
* @wq: the target workqueue
* @cpu: the CPU coming up or going down
--
1.8.3.1
* [PATCH 5/6] workqueue: Allow modifying low level unbound workqueue cpumask
2014-05-07 16:36 [RFC PATCH 0/6] workqueue: Introduce low-level unbound wq sysfs cpumask v2 Frederic Weisbecker
` (3 preceding siblings ...)
2014-05-07 16:36 ` [PATCH 4/6] workqueue: Split apply attrs code from its locking Frederic Weisbecker
@ 2014-05-07 16:37 ` Frederic Weisbecker
2014-05-08 13:22 ` Lai Jiangshan
2014-05-07 16:37 ` [PATCH 6/6] workqueue: Record real per-workqueue cpumask Frederic Weisbecker
5 siblings, 1 reply; 11+ messages in thread
From: Frederic Weisbecker @ 2014-05-07 16:37 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
Viresh Kumar
Allow modifying the low-level unbound workqueue cpumask through
sysfs. This is performed by traversing the entire workqueue list
and calling apply_workqueue_attrs() on each unbound workqueue.
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
kernel/workqueue.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 63 insertions(+), 2 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2aa296d..5978cee 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -293,7 +293,7 @@ static DEFINE_SPINLOCK(wq_mayday_lock); /* protects wq->maydays list */
static LIST_HEAD(workqueues); /* PL: list of all workqueues */
static bool workqueue_freezing; /* PL: have wqs started freezing? */
-static cpumask_var_t wq_unbound_cpumask;
+static cpumask_var_t wq_unbound_cpumask; /* PL: low level cpumask for all unbound wqs */
/* the per-cpu worker pools */
static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
@@ -4084,19 +4084,80 @@ static struct bus_type wq_subsys = {
.dev_groups = wq_sysfs_groups,
};
+static int unbounds_cpumask_apply(cpumask_var_t cpumask)
+{
+ struct workqueue_struct *wq;
+ int ret = 0;
+
+ lockdep_assert_held(&wq_pool_mutex);
+
+ list_for_each_entry(wq, &workqueues, list) {
+ struct workqueue_attrs *attrs;
+
+ if (!(wq->flags & WQ_UNBOUND))
+ continue;
+
+ attrs = wq_sysfs_prep_attrs(wq);
+ if (!attrs)
+ return -ENOMEM;
+
+ ret = apply_workqueue_attrs_locked(wq, attrs, cpumask);
+ free_workqueue_attrs(attrs);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
+static ssize_t unbounds_cpumask_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ cpumask_var_t cpumask;
+ int ret = -EINVAL;
+
+ if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL))
+ return -ENOMEM;
+
+ ret = cpumask_parse(buf, cpumask);
+ if (ret)
+ goto out;
+
+ get_online_cpus();
+ if (cpumask_intersects(cpumask, cpu_online_mask)) {
+ mutex_lock(&wq_pool_mutex);
+ ret = unbounds_cpumask_apply(cpumask);
+ if (ret < 0) {
+ /* Warn if rollback itself fails */
+ WARN_ON_ONCE(unbounds_cpumask_apply(wq_unbound_cpumask));
+ } else {
+ cpumask_copy(wq_unbound_cpumask, cpumask);
+ }
+ mutex_unlock(&wq_pool_mutex);
+ }
+ put_online_cpus();
+out:
+ free_cpumask_var(cpumask);
+ return ret ? ret : count;
+}
+
static ssize_t unbounds_cpumask_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
int written;
+ mutex_lock(&wq_pool_mutex);
written = cpumask_scnprintf(buf, PAGE_SIZE, wq_unbound_cpumask);
+ mutex_unlock(&wq_pool_mutex);
+
written += scnprintf(buf + written, PAGE_SIZE - written, "\n");
return written;
}
static struct device_attribute wq_sysfs_cpumask_attr =
- __ATTR(cpumask, 0444, unbounds_cpumask_show, NULL);
+ __ATTR(cpumask, 0644, unbounds_cpumask_show, unbounds_cpumask_store);
static int __init wq_sysfs_init(void)
{
--
1.8.3.1
* [PATCH 6/6] workqueue: Record real per-workqueue cpumask
2014-05-07 16:36 [RFC PATCH 0/6] workqueue: Introduce low-level unbound wq sysfs cpumask v2 Frederic Weisbecker
` (4 preceding siblings ...)
2014-05-07 16:37 ` [PATCH 5/6] workqueue: Allow modifying low level unbound workqueue cpumask Frederic Weisbecker
@ 2014-05-07 16:37 ` Frederic Weisbecker
2014-05-08 13:20 ` Lai Jiangshan
5 siblings, 1 reply; 11+ messages in thread
From: Frederic Weisbecker @ 2014-05-07 16:37 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
Viresh Kumar
The real cpumask set by the user on WQ_SYSFS workqueues fails to be
recorded as is: what actually gets recorded as the per-workqueue attribute
is the per-workqueue cpumask intersected with the global unbound cpumask.
As a result, when the user overwrites a WQ_SYSFS cpumask and later reads
this attribute back, the value returned is not the last one written.
The other bad side effect is that widening the global unbound cpumask
doesn't actually widen the unbound workqueues' affinity, because their
own cpumask has already been shrunk.
In order to fix this, let's record the real per-workqueue cpumask on the
workqueue struct. We restore this value when the attributes are re-evaluated
later.
FIXME: Maybe I should rather invert that: have the user-set workqueue
cpumask live in the attributes and the effective one on the workqueue struct
instead. We'd just need some tweaking so that the attributes of the lower
layers (pools, worker pools, workers, ...) inherit the effective cpumask
rather than the user one.
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
kernel/workqueue.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 5978cee..504cf0a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -248,6 +248,7 @@ struct workqueue_struct {
int saved_max_active; /* WQ: saved pwq max_active */
struct workqueue_attrs *unbound_attrs; /* WQ: only for unbound wqs */
+ cpumask_var_t saved_cpumask; /* WQ: only for unbound wqs */
struct pool_workqueue *dfl_pwq; /* WQ: only for unbound wqs */
#ifdef CONFIG_SYSFS
@@ -3694,6 +3695,7 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
mutex_lock(&wq->mutex);
copy_workqueue_attrs(wq->unbound_attrs, new_attrs);
+ cpumask_copy(wq->saved_cpumask, attrs->cpumask);
/* save the previous pwq and install the new one */
for_each_node(node)
@@ -4326,6 +4328,11 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
wq->unbound_attrs = alloc_workqueue_attrs(GFP_KERNEL);
if (!wq->unbound_attrs)
goto err_free_wq;
+
+ if (!alloc_cpumask_var(&wq->saved_cpumask, GFP_KERNEL))
+ goto err_free_wq;
+
+ cpumask_copy(wq->saved_cpumask, cpu_possible_mask);
}
va_start(args, lock_name);
@@ -4397,6 +4404,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
return wq;
err_free_wq:
+ free_cpumask_var(wq->saved_cpumask);
free_workqueue_attrs(wq->unbound_attrs);
kfree(wq);
return NULL;
--
1.8.3.1
* Re: [PATCH 6/6] workqueue: Record real per-workqueue cpumask
2014-05-07 16:37 ` [PATCH 6/6] workqueue: Record real per-workqueue cpumask Frederic Weisbecker
@ 2014-05-08 13:20 ` Lai Jiangshan
2014-05-13 14:55 ` Frederic Weisbecker
0 siblings, 1 reply; 11+ messages in thread
From: Lai Jiangshan @ 2014-05-08 13:20 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Christoph Lameter, Kevin Hilman, Mike Galbraith,
Paul E. McKenney, Tejun Heo, Viresh Kumar
On Thu, May 8, 2014 at 12:37 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> The real cpumask set by the user on WQ_SYSFS workqueues fails to be
> recorded as is: What is actually recorded as per workqueue attribute
> is the per workqueue cpumask intersected with the global unbounds cpumask.
>
> Eventually when the user overwrites a WQ_SYSFS cpumask and later read
> this attibute, the value returned is not the last one written.
>
> The other bad side effect is that widening the global unbounds cpumask
> doesn't actually widen the unbound workqueues affinity because their
> own cpumask has been schrinked.
>
> In order to fix this, lets record the real per workqueue cpumask on the
> workqueue struct. We restore this value when attributes are re-evaluated
> later.
>
> FIXME: Maybe I should rather invert that. Have the user set workqueue
> cpumask on attributes and the effective one on the workqueue struct instead.
> We'll just need some tweaking in order to make the attributes of lower layers
> (pools, worker pools, worker, ...) to inherit the effective cpumask and not
> the user one.
>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Kevin Hilman <khilman@linaro.org>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> Cc: Mike Galbraith <bitbucket@online.de>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
> kernel/workqueue.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 5978cee..504cf0a 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -248,6 +248,7 @@ struct workqueue_struct {
> int saved_max_active; /* WQ: saved pwq max_active */
>
> struct workqueue_attrs *unbound_attrs; /* WQ: only for unbound wqs */
> + cpumask_var_t saved_cpumask; /* WQ: only for unbound wqs */
Forgot to use it? or use it in next patches?
> struct pool_workqueue *dfl_pwq; /* WQ: only for unbound wqs */
>
> #ifdef CONFIG_SYSFS
> @@ -3694,6 +3695,7 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
> mutex_lock(&wq->mutex);
>
> copy_workqueue_attrs(wq->unbound_attrs, new_attrs);
> + cpumask_copy(wq->saved_cpumask, attrs->cpumask);
I think you can use ->unbound_attrs directly:
copy_workqueue_attrs(wq->unbound_attrs, attrs);
and update wq_update_unbound_numa():
copy_workqueue_attrs(tmp_attrs, wq->unbound_attrs);
cpumask_and(&tmp_attrs->cpumask, wq_unbound_cpumask)
use tmp_attr instead of wq->unbound_attrs in the left code of
wq_update_unbound_numa()
>
> /* save the previous pwq and install the new one */
> for_each_node(node)
> @@ -4326,6 +4328,11 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
> wq->unbound_attrs = alloc_workqueue_attrs(GFP_KERNEL);
> if (!wq->unbound_attrs)
> goto err_free_wq;
> +
> + if (!alloc_cpumask_var(&wq->saved_cpumask, GFP_KERNEL))
> + goto err_free_wq;
> +
> + cpumask_copy(wq->saved_cpumask, cpu_possible_mask);
> }
>
> va_start(args, lock_name);
> @@ -4397,6 +4404,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
> return wq;
>
> err_free_wq:
> + free_cpumask_var(wq->saved_cpumask);
> free_workqueue_attrs(wq->unbound_attrs);
> kfree(wq);
> return NULL;
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
* Re: [PATCH 5/6] workqueue: Allow modifying low level unbound workqueue cpumask
2014-05-07 16:37 ` [PATCH 5/6] workqueue: Allow modifying low level unbound workqueue cpumask Frederic Weisbecker
@ 2014-05-08 13:22 ` Lai Jiangshan
2014-05-13 14:58 ` Frederic Weisbecker
0 siblings, 1 reply; 11+ messages in thread
From: Lai Jiangshan @ 2014-05-08 13:22 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Christoph Lameter, Kevin Hilman, Mike Galbraith,
Paul E. McKenney, Tejun Heo, Viresh Kumar
On Thu, May 8, 2014 at 12:37 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Allow to modify the low-level unbound workqueues cpumask through
> sysfs. This is performed by traversing the entire workqueue list
> and calling apply_workqueue_attrs() on the unbound workqueues.
>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Kevin Hilman <khilman@linaro.org>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> Cc: Mike Galbraith <bitbucket@online.de>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
> kernel/workqueue.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 63 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 2aa296d..5978cee 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -293,7 +293,7 @@ static DEFINE_SPINLOCK(wq_mayday_lock); /* protects wq->maydays list */
> static LIST_HEAD(workqueues); /* PL: list of all workqueues */
> static bool workqueue_freezing; /* PL: have wqs started freezing? */
>
> -static cpumask_var_t wq_unbound_cpumask;
> +static cpumask_var_t wq_unbound_cpumask; /* PL: low level cpumask for all unbound wqs */
>
> /* the per-cpu worker pools */
> static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
> @@ -4084,19 +4084,80 @@ static struct bus_type wq_subsys = {
> .dev_groups = wq_sysfs_groups,
> };
>
> +static int unbounds_cpumask_apply(cpumask_var_t cpumask)
> +{
> + struct workqueue_struct *wq;
> + int ret;
> +
> + lockdep_assert_held(&wq_pool_mutex);
> +
> + list_for_each_entry(wq, &workqueues, list) {
> + struct workqueue_attrs *attrs;
> +
> + if (!(wq->flags & WQ_UNBOUND))
> + continue;
> +
> + attrs = wq_sysfs_prep_attrs(wq);
> + if (!attrs)
> + return -ENOMEM;
> +
> + ret = apply_workqueue_attrs_locked(wq, attrs, cpumask);
> + free_workqueue_attrs(attrs);
> + if (ret)
> + break;
> + }
> +
> + return 0;
> +}
> +
> +static ssize_t unbounds_cpumask_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + cpumask_var_t cpumask;
> + int ret = -EINVAL;
> +
> + if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL))
> + return -ENOMEM;
> +
> + ret = cpumask_parse(buf, cpumask);
> + if (ret)
> + goto out;
cpumask_and(cpumask, cpumask, cpu_possible_mask);
> +
> + get_online_cpus();
> + if (cpumask_intersects(cpumask, cpu_online_mask)) {
> + mutex_lock(&wq_pool_mutex);
> + ret = unbounds_cpumask_apply(cpumask);
> + if (ret < 0) {
> + /* Warn if rollback itself fails */
> + WARN_ON_ONCE(unbounds_cpumask_apply(wq_unbound_cpumask));
> + } else {
> + cpumask_copy(wq_unbound_cpumask, cpumask);
> + }
> + mutex_unlock(&wq_pool_mutex);
> + }
> + put_online_cpus();
> +out:
> + free_cpumask_var(cpumask);
> + return ret ? ret : count;
> +}
> +
> static ssize_t unbounds_cpumask_show(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> int written;
>
> + mutex_lock(&wq_pool_mutex);
> written = cpumask_scnprintf(buf, PAGE_SIZE, wq_unbound_cpumask);
> + mutex_unlock(&wq_pool_mutex);
> +
> written += scnprintf(buf + written, PAGE_SIZE - written, "\n");
>
> return written;
> }
>
> static struct device_attribute wq_sysfs_cpumask_attr =
> - __ATTR(cpumask, 0444, unbounds_cpumask_show, NULL);
> + __ATTR(cpumask, 0644, unbounds_cpumask_show, unbounds_cpumask_store);
>
> static int __init wq_sysfs_init(void)
> {
> --
> 1.8.3.1
>
* Re: [PATCH 6/6] workqueue: Record real per-workqueue cpumask
2014-05-08 13:20 ` Lai Jiangshan
@ 2014-05-13 14:55 ` Frederic Weisbecker
0 siblings, 0 replies; 11+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 14:55 UTC (permalink / raw)
To: Lai Jiangshan
Cc: LKML, Christoph Lameter, Kevin Hilman, Mike Galbraith,
Paul E. McKenney, Tejun Heo, Viresh Kumar
On Thu, May 08, 2014 at 09:20:45PM +0800, Lai Jiangshan wrote:
> On Thu, May 8, 2014 at 12:37 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > The real cpumask set by the user on WQ_SYSFS workqueues fails to be
> > recorded as is: What is actually recorded as per workqueue attribute
> > is the per workqueue cpumask intersected with the global unbounds cpumask.
> >
> > Eventually when the user overwrites a WQ_SYSFS cpumask and later read
> > this attibute, the value returned is not the last one written.
> >
> > The other bad side effect is that widening the global unbounds cpumask
> > doesn't actually widen the unbound workqueues affinity because their
> > own cpumask has been schrinked.
> >
> > In order to fix this, lets record the real per workqueue cpumask on the
> > workqueue struct. We restore this value when attributes are re-evaluated
> > later.
> >
> > FIXME: Maybe I should rather invert that. Have the user set workqueue
> > cpumask on attributes and the effective one on the workqueue struct instead.
> > We'll just need some tweaking in order to make the attributes of lower layers
> > (pools, worker pools, worker, ...) to inherit the effective cpumask and not
> > the user one.
> >
> > Cc: Christoph Lameter <cl@linux.com>
> > Cc: Kevin Hilman <khilman@linaro.org>
> > Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> > Cc: Mike Galbraith <bitbucket@online.de>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > ---
> > kernel/workqueue.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index 5978cee..504cf0a 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -248,6 +248,7 @@ struct workqueue_struct {
> > int saved_max_active; /* WQ: saved pwq max_active */
> >
> > struct workqueue_attrs *unbound_attrs; /* WQ: only for unbound wqs */
> > + cpumask_var_t saved_cpumask; /* WQ: only for unbound wqs */
>
>
> Forgot to use it? or use it in next patches?
Hmm, no it's used below.
>
> > struct pool_workqueue *dfl_pwq; /* WQ: only for unbound wqs */
> >
> > #ifdef CONFIG_SYSFS
> > @@ -3694,6 +3695,7 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
> > mutex_lock(&wq->mutex);
> >
> > copy_workqueue_attrs(wq->unbound_attrs, new_attrs);
> > + cpumask_copy(wq->saved_cpumask, attrs->cpumask);
>
> I think you can use ->unbound_attrs directly:
> copy_workqueue_attrs(wq->unbound_attrs, attrs);
>
> and update wq_update_unbound_numa():
> copy_workqueue_attrs(tmp_attrs, wq->unbound_attrs);
> cpumask_and(&tmp_attrs->cpumask, wq_unbound_cpumask)
>
> use tmp_attr instead of wq->unbound_attrs in the left code of
> wq_update_unbound_numa()
But wq_update_unbound_numa() is only called on cpu hotplug operations
right? So this may have no effect after setting a cpumask in sysfs.
How about keeping the sysfs cpumask in the wq's unbound_attrs and passing
the effective one to pwq creation in apply_workqueue_attrs_locked()?
And also doing what you suggest in wq_update_unbound_numa() for hotplug
operations.
* Re: [PATCH 5/6] workqueue: Allow modifying low level unbound workqueue cpumask
2014-05-08 13:22 ` Lai Jiangshan
@ 2014-05-13 14:58 ` Frederic Weisbecker
0 siblings, 0 replies; 11+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 14:58 UTC (permalink / raw)
To: Lai Jiangshan
Cc: LKML, Christoph Lameter, Kevin Hilman, Mike Galbraith,
Paul E. McKenney, Tejun Heo, Viresh Kumar
On Thu, May 08, 2014 at 09:22:51PM +0800, Lai Jiangshan wrote:
> On Thu, May 8, 2014 at 12:37 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > Allow to modify the low-level unbound workqueues cpumask through
> > sysfs. This is performed by traversing the entire workqueue list
> > and calling apply_workqueue_attrs() on the unbound workqueues.
> >
> > Cc: Christoph Lameter <cl@linux.com>
> > Cc: Kevin Hilman <khilman@linaro.org>
> > Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> > Cc: Mike Galbraith <bitbucket@online.de>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > ---
> > kernel/workqueue.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 63 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index 2aa296d..5978cee 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -293,7 +293,7 @@ static DEFINE_SPINLOCK(wq_mayday_lock); /* protects wq->maydays list */
> > static LIST_HEAD(workqueues); /* PL: list of all workqueues */
> > static bool workqueue_freezing; /* PL: have wqs started freezing? */
> >
> > -static cpumask_var_t wq_unbound_cpumask;
> > +static cpumask_var_t wq_unbound_cpumask; /* PL: low level cpumask for all unbound wqs */
> >
> > /* the per-cpu worker pools */
> > static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
> > @@ -4084,19 +4084,80 @@ static struct bus_type wq_subsys = {
> > .dev_groups = wq_sysfs_groups,
> > };
> >
> > +static int unbounds_cpumask_apply(cpumask_var_t cpumask)
> > +{
> > + struct workqueue_struct *wq;
> > + int ret;
> > +
> > + lockdep_assert_held(&wq_pool_mutex);
> > +
> > + list_for_each_entry(wq, &workqueues, list) {
> > + struct workqueue_attrs *attrs;
> > +
> > + if (!(wq->flags & WQ_UNBOUND))
> > + continue;
> > +
> > + attrs = wq_sysfs_prep_attrs(wq);
> > + if (!attrs)
> > + return -ENOMEM;
> > +
> > + ret = apply_workqueue_attrs_locked(wq, attrs, cpumask);
> > + free_workqueue_attrs(attrs);
> > + if (ret)
> > + break;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static ssize_t unbounds_cpumask_store(struct device *dev,
> > + struct device_attribute *attr,
> > + const char *buf, size_t count)
> > +{
> > + cpumask_var_t cpumask;
> > + int ret = -EINVAL;
> > +
> > + if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL))
> > + return -ENOMEM;
> > +
> > + ret = cpumask_parse(buf, cpumask);
> > + if (ret)
> > + goto out;
>
>
> cpumask_and(cpumask, cpumask, cpu_possible_mask);
Is it really useful? I mean in the end we only apply online bits.
end of thread, other threads: [~2014-05-13 14:58 UTC | newest]
Thread overview: 11+ messages
2014-05-07 16:36 [RFC PATCH 0/6] workqueue: Introduce low-level unbound wq sysfs cpumask v2 Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 1/6] workqueue: Allow changing attributions of ordered workqueues Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 2/6] workqueue: Reorder sysfs code Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 3/6] workqueue: Create low-level unbound workqueues cpumask Frederic Weisbecker
2014-05-07 16:36 ` [PATCH 4/6] workqueue: Split apply attrs code from its locking Frederic Weisbecker
2014-05-07 16:37 ` [PATCH 5/6] workqueue: Allow modifying low level unbound workqueue cpumask Frederic Weisbecker
2014-05-08 13:22 ` Lai Jiangshan
2014-05-13 14:58 ` Frederic Weisbecker
2014-05-07 16:37 ` [PATCH 6/6] workqueue: Record real per-workqueue cpumask Frederic Weisbecker
2014-05-08 13:20 ` Lai Jiangshan
2014-05-13 14:55 ` Frederic Weisbecker