* [PATCH] workqueue: doc change for ST behavior on NUMA systems
@ 2017-07-18 18:12 Alexei Potashnik
2017-07-18 18:37 ` Tejun Heo
2017-07-18 19:18 ` Christoph Hellwig
0 siblings, 2 replies; 8+ messages in thread
From: Alexei Potashnik @ 2017-07-18 18:12 UTC (permalink / raw)
To: tj; +Cc: linux-kernel
NUMA rework of workqueue made the combination of max_active of 1 and
WQ_UNBOUND insufficient to guarantee ST behavior system wide.
alloc_ordered_queue should now be used instead.
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
---
Documentation/core-api/workqueue.rst | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/Documentation/core-api/workqueue.rst
b/Documentation/core-api/workqueue.rst
index ffdec94..3943b5b 100644
--- a/Documentation/core-api/workqueue.rst
+++ b/Documentation/core-api/workqueue.rst
@@ -243,11 +243,15 @@ throttling the number of active work items,
specifying '0' is
recommended.
Some users depend on the strict execution ordering of ST wq. The
-combination of ``@max_active`` of 1 and ``WQ_UNBOUND`` is used to
-achieve this behavior. Work items on such wq are always queued to the
-unbound worker-pools and only one work item can be active at any given
+combination of ``@max_active`` of 1 and ``WQ_UNBOUND`` used to
+achieve this behavior. Work items on such wq were always queued to the
+unbound worker-pools and only one work item could be active at any given
time thus achieving the same ordering property as ST wq.
+In the current implementation the above configuration only guarantees
+ST behavior within a given NUMA node. Instead alloc_ordered_queue should
+be used to achieve system wide ST behavior.
+
Example Execution Scenarios
===========================
--
1.9.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] workqueue: doc change for ST behavior on NUMA systems
2017-07-18 18:12 [PATCH] workqueue: doc change for ST behavior on NUMA systems Alexei Potashnik
@ 2017-07-18 18:37 ` Tejun Heo
2017-07-18 19:18 ` Christoph Hellwig
1 sibling, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2017-07-18 18:37 UTC (permalink / raw)
To: Alexei Potashnik; +Cc: linux-kernel
On Tue, Jul 18, 2017 at 11:12:53AM -0700, Alexei Potashnik wrote:
> NUMA rework of workqueue made the combination of max_active of 1 and
> WQ_UNBOUND insufficient to guarantee ST behavior system wide.
>
> alloc_ordered_queue should now be used instead.
>
> Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Patch was whitespace damanged in transit. Applied manually to
wq/for-4.14.
Thanks a lot for the fix!
--
tejun
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] workqueue: doc change for ST behavior on NUMA systems
2017-07-18 18:12 [PATCH] workqueue: doc change for ST behavior on NUMA systems Alexei Potashnik
2017-07-18 18:37 ` Tejun Heo
@ 2017-07-18 19:18 ` Christoph Hellwig
2017-07-18 19:36 ` Tejun Heo
` (3 more replies)
1 sibling, 4 replies; 8+ messages in thread
From: Christoph Hellwig @ 2017-07-18 19:18 UTC (permalink / raw)
To: Alexei Potashnik; +Cc: tj, linux-kernel
On Tue, Jul 18, 2017 at 11:12:53AM -0700, Alexei Potashnik wrote:
> NUMA rework of workqueue made the combination of max_active of 1 and
> WQ_UNBOUND insufficient to guarantee ST behavior system wide.
>
> alloc_ordered_queue should now be used instead.
Eww. And how many existing users might be broken by that?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] workqueue: doc change for ST behavior on NUMA systems
2017-07-18 19:18 ` Christoph Hellwig
@ 2017-07-18 19:36 ` Tejun Heo
2017-07-18 21:32 ` Alexei Potashnik
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2017-07-18 19:36 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Alexei Potashnik, linux-kernel
On Tue, Jul 18, 2017 at 12:18:29PM -0700, Christoph Hellwig wrote:
> On Tue, Jul 18, 2017 at 11:12:53AM -0700, Alexei Potashnik wrote:
> > NUMA rework of workqueue made the combination of max_active of 1 and
> > WQ_UNBOUND insufficient to guarantee ST behavior system wide.
> >
> > alloc_ordered_queue should now be used instead.
>
> Eww. And how many existing users might be broken by that?
Good point. create_singlethread_workqueue() maps to
alloc_ordered_workqueue(), so they're all good. Only the ones which
got converted to the new interface incorrectly would be broken. I
probably should make WQ_UNBOUND / 1 behave as ordered. I'll think
more about it.
Thanks!
--
tejun
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH] workqueue: doc change for ST behavior on NUMA systems
2017-07-18 19:18 ` Christoph Hellwig
2017-07-18 19:36 ` Tejun Heo
@ 2017-07-18 21:32 ` Alexei Potashnik
2017-07-18 22:25 ` Alexei Potashnik
2017-07-18 22:41 ` [PATCH wq/for-4.13-fixes] workqueue: restore WQ_UNBOUND/max_active==1 to be ordered Tejun Heo
3 siblings, 0 replies; 8+ messages in thread
From: Alexei Potashnik @ 2017-07-18 21:32 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: tj, linux-kernel
target has a bug in TMR handling.
dev->tmr_wq = alloc_workqueue("tmr-%s", WQ_MEM_RECLAIM |
WQ_UNBOUND, 1,
dev->transport->name);
LUN_RESET can race with TASK_ABORT in different sessions.
Will send a patch to target list.
-----Original Message-----
From: Christoph Hellwig [mailto:hch@infradead.org]
Sent: Tuesday, July 18, 2017 12:18 PM
To: Alexei Potashnik <alexei@purestorage.com>
Cc: tj@kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] workqueue: doc change for ST behavior on NUMA systems
On Tue, Jul 18, 2017 at 11:12:53AM -0700, Alexei Potashnik wrote:
> NUMA rework of workqueue made the combination of max_active of 1 and
> WQ_UNBOUND insufficient to guarantee ST behavior system wide.
>
> alloc_ordered_queue should now be used instead.
Eww. And how many existing users might be broken by that?
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH] workqueue: doc change for ST behavior on NUMA systems
2017-07-18 19:18 ` Christoph Hellwig
2017-07-18 19:36 ` Tejun Heo
2017-07-18 21:32 ` Alexei Potashnik
@ 2017-07-18 22:25 ` Alexei Potashnik
2017-07-18 22:41 ` [PATCH wq/for-4.13-fixes] workqueue: restore WQ_UNBOUND/max_active==1 to be ordered Tejun Heo
3 siblings, 0 replies; 8+ messages in thread
From: Alexei Potashnik @ 2017-07-18 22:25 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: tj, linux-kernel
Actually there are few more places in the tree that still do this.
I doubt they actually need per-NUMA node serialization:
arch/powerpc/platforms/pseries/dlpar.c:
WQ_UNBOUND, 1);
drivers/lightnvm/pblk-init.c:
WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
drivers/md/dm-integrity.c: ic->wait_wq =
alloc_workqueue("dm-integrity-wait", WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
drivers/md/dm.c: deferred_remove_workqueue =
alloc_workqueue("kdmremove", WQ_UNBOUND, 1);
drivers/media/platform/coda/coda-common.c: dev->workqueue =
alloc_workqueue("coda", WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
drivers/net/ethernet/cavium/thunder/nic_main.c:
WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
drivers/net/ethernet/intel/i40e/i40e_main.c: i40e_wq =
alloc_workqueue("%s", WQ_UNBOUND | WQ_MEM_RECLAIM, 1,
drivers/net/ethernet/intel/i40evf/i40evf_main.c: i40evf_wq =
alloc_workqueue("%s", WQ_UNBOUND | WQ_MEM_RECLAIM, 1,
drivers/net/wireless/intel/iwlwifi/pcie/rx.c:
WQ_HIGHPRI | WQ_UNBOUND, 1);
drivers/net/wireless/marvell/mwifiex/cfg80211.c:
WQ_UNBOUND, 1, name);
drivers/net/wireless/marvell/mwifiex/main.c:
WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
drivers/net/wireless/marvell/mwifiex/main.c:
WQ_UNBOUND, 1);
drivers/net/wireless/marvell/mwifiex/main.c:
WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
drivers/net/wireless/marvell/mwifiex/main.c:
WQ_UNBOUND, 1);
drivers/staging/greybus/connection.c: connection->wq =
alloc_workqueue("%s:%d", WQ_UNBOUND, 1,
drivers/staging/greybus/svc.c: svc->wq = alloc_workqueue("%s:svc",
WQ_UNBOUND, 1, dev_name(&hd->dev));
drivers/target/target_core_device.c: dev->tmr_wq =
alloc_workqueue("tmr-%s", WQ_MEM_RECLAIM | WQ_UNBOUND, 1,
fs/dlm/lowcomms.c: WQ_UNBOUND |
WQ_MEM_RECLAIM, 1);
fs/dlm/lowcomms.c: WQ_UNBOUND |
WQ_MEM_RECLAIM, 1);
fs/ext4/super.c: alloc_workqueue("ext4-rsv-conversion",
WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
-----Original Message-----
From: Alexei Potashnik [mailto:alexei@purestorage.com]
Sent: Tuesday, July 18, 2017 2:33 PM
To: 'Christoph Hellwig' <hch@infradead.org>
Cc: 'tj@kernel.org' <tj@kernel.org>; 'linux-kernel@vger.kernel.org'
<linux-kernel@vger.kernel.org>
Subject: RE: [PATCH] workqueue: doc change for ST behavior on NUMA systems
target has a bug in TMR handling.
dev->tmr_wq = alloc_workqueue("tmr-%s", WQ_MEM_RECLAIM |
WQ_UNBOUND, 1,
dev->transport->name);
LUN_RESET can race with TASK_ABORT in different sessions.
Will send a patch to target list.
-----Original Message-----
From: Christoph Hellwig [mailto:hch@infradead.org]
Sent: Tuesday, July 18, 2017 12:18 PM
To: Alexei Potashnik <alexei@purestorage.com>
Cc: tj@kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] workqueue: doc change for ST behavior on NUMA systems
On Tue, Jul 18, 2017 at 11:12:53AM -0700, Alexei Potashnik wrote:
> NUMA rework of workqueue made the combination of max_active of 1 and
> WQ_UNBOUND insufficient to guarantee ST behavior system wide.
>
> alloc_ordered_queue should now be used instead.
Eww. And how many existing users might be broken by that?
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH wq/for-4.13-fixes] workqueue: restore WQ_UNBOUND/max_active==1 to be ordered
2017-07-18 19:18 ` Christoph Hellwig
` (2 preceding siblings ...)
2017-07-18 22:25 ` Alexei Potashnik
@ 2017-07-18 22:41 ` Tejun Heo
2017-07-19 15:25 ` Tejun Heo
3 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2017-07-18 22:41 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Alexei Potashnik, linux-kernel, kernel-team
The combination of WQ_UNBOUND and max_active == 1 used to imply
ordered execution. After NUMA affinity 4c16bd327c74 ("workqueue:
implement NUMA affinity for unbound workqueues"), this is no longer
true due to per-node worker pools.
While the right way to create an ordered workqueue is
alloc_ordered_workqueue(), the documentation has been misleading for a
long time and people do use WQ_UNBOUND and max_active == 1 for ordered
workqueues which can lead to subtle bugs which are very difficult to
trigger.
It's unlikely that we'd see noticeable performance impact by enforcing
ordering on WQ_UNBOUND / max_active == 1 workqueues. Let's
automatically set __WQ_ORDERED for those workqueues.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Christoph Hellwig <hch@infradead.org>
Reported-by: Alexei Potashnik <alexei@purestorage.com>
Fixes: 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound workqueues")
Cc: stable@vger.kernel.org # v3.10+
---
Hello,
Unless somebody objects, I'll apply this to wq/for-4.13-fixes.
Thanks.
kernel/workqueue.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index a86688f..abe4a49 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3929,6 +3929,16 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
struct workqueue_struct *wq;
struct pool_workqueue *pwq;
+ /*
+ * Unbound && max_active == 1 used to imply ordered, which is no
+ * longer the case on NUMA machines due to per-node pools. While
+ * alloc_ordered_workqueue() is the right way to create an ordered
+ * workqueue, keep the previous behavior to avoid subtle breakages
+ * on NUMA.
+ */
+ if ((flags & WQ_UNBOUND) && max_active == 1)
+ flags |= __WQ_ORDERED;
+
/* see the comment above the definition of WQ_POWER_EFFICIENT */
if ((flags & WQ_POWER_EFFICIENT) && wq_power_efficient)
flags |= WQ_UNBOUND;
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH wq/for-4.13-fixes] workqueue: restore WQ_UNBOUND/max_active==1 to be ordered
2017-07-18 22:41 ` [PATCH wq/for-4.13-fixes] workqueue: restore WQ_UNBOUND/max_active==1 to be ordered Tejun Heo
@ 2017-07-19 15:25 ` Tejun Heo
0 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2017-07-19 15:25 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Alexei Potashnik, linux-kernel, kernel-team
On Tue, Jul 18, 2017 at 06:41:52PM -0400, Tejun Heo wrote:
> The combination of WQ_UNBOUND and max_active == 1 used to imply
> ordered execution. After NUMA affinity 4c16bd327c74 ("workqueue:
> implement NUMA affinity for unbound workqueues"), this is no longer
> true due to per-node worker pools.
>
> While the right way to create an ordered workqueue is
> alloc_ordered_workqueue(), the documentation has been misleading for a
> long time and people do use WQ_UNBOUND and max_active == 1 for ordered
> workqueues which can lead to subtle bugs which are very difficult to
> trigger.
>
> It's unlikely that we'd see noticeable performance impact by enforcing
> ordering on WQ_UNBOUND / max_active == 1 workqueues. Let's
> automatically set __WQ_ORDERED for those workqueues.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Christoph Hellwig <hch@infradead.org>
> Reported-by: Alexei Potashnik <alexei@purestorage.com>
> Fixes: 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound workqueues")
> Cc: stable@vger.kernel.org # v3.10+
Applied to wq/for-4.13-fixes.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-07-19 15:26 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-07-18 18:12 [PATCH] workqueue: doc change for ST behavior on NUMA systems Alexei Potashnik
2017-07-18 18:37 ` Tejun Heo
2017-07-18 19:18 ` Christoph Hellwig
2017-07-18 19:36 ` Tejun Heo
2017-07-18 21:32 ` Alexei Potashnik
2017-07-18 22:25 ` Alexei Potashnik
2017-07-18 22:41 ` [PATCH wq/for-4.13-fixes] workqueue: restore WQ_UNBOUND/max_active==1 to be ordered Tejun Heo
2017-07-19 15:25 ` Tejun Heo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox