From: Tejun Heo <tj@kernel.org>
To: jiangshanlai@gmail.com
Cc: linux-kernel@vger.kernel.org, Naohiro.Aota@wdc.com,
kernel-team@meta.com, Tejun Heo <tj@kernel.org>
Subject: [PATCH 10/10] workqueue: Reimplement ordered workqueue using shared nr_active
Date: Wed, 20 Dec 2023 16:24:41 +0900
Message-ID: <20231220072529.1036099-11-tj@kernel.org>
In-Reply-To: <20231220072529.1036099-1-tj@kernel.org>

Because nr_active used to be tied to pwq, an ordered workqueue had to have a
single pwq to guarantee strict ordering. This led to several contortions to
avoid creating multiple pwqs.

Now that nr_active can be shared across multiple pwqs, we can simplify the
ordered workqueue implementation. All that's necessary is ensuring that a
single wq_node_nr_active is shared across all pwqs, which is achieved by
making wq_node_nr_active() always return wq->node_nr_active[nr_node_ids] for
ordered workqueues.

The new implementation is simpler and allows ordered workqueues to share
locality-aware worker_pools with other unbound workqueues, which should
improve execution locality.

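For illustration, the slot-selection rule in the wq_node_nr_active() hunk
below can be modeled as a stand-alone userspace sketch. The flag values,
node count, and slot array here are illustrative stand-ins, not the
kernel's definitions:

```c
#include <stddef.h>

#define WQ_UNBOUND	(1 << 1)	/* illustrative stand-in flag values */
#define __WQ_ORDERED	(1 << 17)
#define NUMA_NO_NODE	(-1)

enum { NR_NODE_IDS = 4 };		/* e.g. a four-node NUMA system */

/* one nr_active slot per node, plus the shared [NR_NODE_IDS] slot */
static int node_nr_active[NR_NODE_IDS + 1];

/*
 * Model of the patched selection logic: per-cpu workqueues get NULL
 * (no shared nr_active); ordered workqueues and NUMA_NO_NODE lookups
 * both collapse onto the single shared [NR_NODE_IDS] slot, which is
 * what makes every pwq of an ordered workqueue count against the same
 * nr_active; other unbound workqueues use their own per-node slot.
 */
static int *wq_node_nr_active_sketch(unsigned int wq_flags, int node)
{
	if (!(wq_flags & WQ_UNBOUND))
		return NULL;
	if ((wq_flags & __WQ_ORDERED) || node == NUMA_NO_NODE)
		node = NR_NODE_IDS;
	return &node_nr_active[node];
}
```

With this rule, all pwqs of an ordered workqueue resolve to one shared
counter regardless of how many pwqs exist, which is why the single-pwq
restriction (and the ordered_wq_attrs machinery removed below) is no
longer needed.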
Signed-off-by: Tejun Heo <tj@kernel.org>
---
kernel/workqueue.c | 44 ++++++--------------------------------------
1 file changed, 6 insertions(+), 38 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0017e9094034..bae7ed9cd1b4 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -441,9 +441,6 @@ static DEFINE_HASHTABLE(unbound_pool_hash, UNBOUND_POOL_HASH_ORDER);
/* I: attributes used when instantiating standard unbound pools on demand */
static struct workqueue_attrs *unbound_std_wq_attrs[NR_STD_WORKER_POOLS];
-/* I: attributes used when instantiating ordered pools on demand */
-static struct workqueue_attrs *ordered_wq_attrs[NR_STD_WORKER_POOLS];
-
/*
* I: kthread_worker to release pwq's. pwq release needs to be bounced to a
* process context while holding a pool lock. Bounce to a dedicated kthread
@@ -1435,6 +1432,9 @@ work_func_t wq_worker_last_func(struct task_struct *task)
*
* - %NULL for per-cpu workqueues as they don't need to use shared nr_active.
*
+ * - node_nr_active[nr_node_ids] if the associated workqueue is ordered so that
+ * all pwq's are limited by the same nr_active.
+ *
* - node_nr_active[nr_node_ids] if @node is %NUMA_NO_NODE.
*
* - Otherwise, node_nr_active[@node].
@@ -1445,7 +1445,7 @@ static struct wq_node_nr_active *wq_node_nr_active(struct workqueue_struct *wq,
if (!(wq->flags & WQ_UNBOUND))
return NULL;
- if (node == NUMA_NO_NODE)
+ if ((wq->flags & __WQ_ORDERED) || node == NUMA_NO_NODE)
node = nr_node_ids;
return wq->node_nr_active[node];
@@ -4312,7 +4312,7 @@ static struct wq_node_nr_active **alloc_node_nr_active(void)
nna_ar[node] = nna;
}
- /* [nr_node_ids] is used as the fallback */
+ /* [nr_node_ids] is used for ordered workqueues and as the fallback */
nna = kzalloc_node(sizeof(*nna), GFP_KERNEL, NUMA_NO_NODE);
if (!nna)
goto err_free;
@@ -4799,14 +4799,6 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
if (WARN_ON(!(wq->flags & WQ_UNBOUND)))
return -EINVAL;
- /* creating multiple pwqs breaks ordering guarantee */
- if (!list_empty(&wq->pwqs)) {
- if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
- return -EINVAL;
-
- wq->flags &= ~__WQ_ORDERED;
- }
-
ctx = apply_wqattrs_prepare(wq, attrs, wq_unbound_cpumask);
if (IS_ERR(ctx))
return PTR_ERR(ctx);
@@ -4955,15 +4947,7 @@ static int alloc_and_link_pwqs(struct workqueue_struct *wq)
}
cpus_read_lock();
- if (wq->flags & __WQ_ORDERED) {
- ret = apply_workqueue_attrs(wq, ordered_wq_attrs[highpri]);
- /* there should only be single pwq for ordering guarantee */
- WARN(!ret && (wq->pwqs.next != &wq->dfl_pwq->pwqs_node ||
- wq->pwqs.prev != &wq->dfl_pwq->pwqs_node),
- "ordering guarantee broken for workqueue %s\n", wq->name);
- } else {
- ret = apply_workqueue_attrs(wq, unbound_std_wq_attrs[highpri]);
- }
+ ret = apply_workqueue_attrs(wq, unbound_std_wq_attrs[highpri]);
cpus_read_unlock();
/* for unbound pwq, flush the pwq_release_worker ensures that the
@@ -6220,13 +6204,6 @@ static int workqueue_apply_unbound_cpumask(const cpumask_var_t unbound_cpumask)
if (!(wq->flags & WQ_UNBOUND))
continue;
- /* creating multiple pwqs breaks ordering guarantee */
- if (!list_empty(&wq->pwqs)) {
- if (wq->flags & __WQ_ORDERED_EXPLICIT)
- continue;
- wq->flags &= ~__WQ_ORDERED;
- }
-
ctx = apply_wqattrs_prepare(wq, wq->unbound_attrs, unbound_cpumask);
if (IS_ERR(ctx)) {
ret = PTR_ERR(ctx);
@@ -7023,15 +7000,6 @@ void __init workqueue_init_early(void)
BUG_ON(!(attrs = alloc_workqueue_attrs()));
attrs->nice = std_nice[i];
unbound_std_wq_attrs[i] = attrs;
-
- /*
- * An ordered wq should have only one pwq as ordering is
- * guaranteed by max_active which is enforced by pwqs.
- */
- BUG_ON(!(attrs = alloc_workqueue_attrs()));
- attrs->nice = std_nice[i];
- attrs->ordered = true;
- ordered_wq_attrs[i] = attrs;
}
system_wq = alloc_workqueue("events", 0, 0);
--
2.43.0