The Linux Kernel Mailing List
* [PATCH v4 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq
@ 2025-06-12 13:33 Marco Crivellari
  2025-06-12 13:33 ` [PATCH v4 1/3] Workqueue: add system_percpu_wq and system_dfl_wq Marco Crivellari
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Marco Crivellari @ 2025-06-12 13:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Lai Jiangshan, Thomas Gleixner, Frederic Weisbecker,
	Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko

Hi!

Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:

    "workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
    https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/

=== Current situation: problems ===

Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is selected.

This leads to different scenarios when a work item is scheduled on an isolated
CPU, depending on whether the "delay" value is 0 or greater than 0:

    schedule_delayed_work(, 0);

This will be handled by __queue_work(), which queues the work item on the
current local (isolated) CPU, while:

    schedule_delayed_work(, 1);

will move the timer to a housekeeping CPU and schedule the work there.
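
To make the two cases concrete, here is a toy userspace model (not kernel
code) of where the work item ends up. The struct and function names are
hypothetical; the model only encodes the behavior described above, where a
delay of 0 queues on the local CPU and a non-zero delay lets the timer move
to a housekeeping CPU.

```c
/* Hypothetical toy model, not kernel code: all names are illustrative. */
struct toy_system {
	int local_cpu;        /* CPU issuing the call (isolated in this scenario) */
	int housekeeping_cpu; /* a CPU allowed by wq_unbound_cpumask */
};

/*
 * Return the CPU on which the work item ends up in this model.
 * delay == 0: queued directly, so the per-cpu wq picks the local CPU.
 * delay  > 0: a timer is armed first; it is moved to a housekeeping CPU
 *             and the work is queued from there.
 */
static int toy_delayed_work_cpu(const struct toy_system *sys,
				unsigned long delay)
{
	if (delay == 0)
		return sys->local_cpu;
	return sys->housekeeping_cpu;
}
```

With local_cpu = 3 (isolated) and housekeeping_cpu = 0, the same call lands
on CPU 3 for a delay of 0 and on CPU 0 for a delay of 1, which is the
inconsistency this series wants to remove.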

Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

=== Plan and future plans ===

This patchset is the first step of the refactoring needed to address the
points above; in the long term it will also have a positive impact on cpu
isolation, by moving away from per-cpu workqueues in favor of an unbound
model.

These are the main steps:
1)  API refactoring (introduced by this patchset)
    -   Make the system wq names clearer and more uniform, both per-cpu and
        unbound, to avoid any confusion about which one should be used.

    -   Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND.
        It is introduced in this patchset and is meant for all the callers
        that are not currently using WQ_UNBOUND.

        WQ_UNBOUND will be removed in a future release cycle.

        Most users don't need to be per-cpu because they have no locality
        requirements; for this reason, a future step will make "unbound"
        the default behavior.

2)  Check who really needs to be per-cpu
    -   Remove the WQ_PERCPU flag when it is not strictly required.

3)  Add a new API (prefer local cpu)
    -   There are users that don't require local execution, as mentioned
        above; despite that, local execution yields a performance gain.

        This new API will prefer local execution without requiring it.
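
As a rough illustration of step 1, the sketch below models the flag
annotation in plain userspace C. The TOY_* names and both helpers are
hypothetical. The WQ_PERCPU bit mirrors the value added in patch 2/3, while
the WQ_UNBOUND bit is an assumption about the current header.

```c
/* Toy stand-ins for the workqueue flags; not the kernel definitions. */
enum toy_wq_flags {
	TOY_WQ_UNBOUND = 1 << 1, /* assumed to match the current header */
	TOY_WQ_PERCPU  = 1 << 8, /* value introduced by patch 2/3 */
};

/*
 * Step 1: every caller that does not ask for WQ_UNBOUND gets WQ_PERCPU
 * added explicitly, so its behavior does not change.
 */
static unsigned int toy_annotate_flags(unsigned int flags)
{
	if (!(flags & TOY_WQ_UNBOUND))
		flags |= TOY_WQ_PERCPU;
	return flags;
}

/* Once annotated, exactly one of the two flags is expected to be set. */
static int toy_flags_consistent(unsigned int flags)
{
	return !(flags & TOY_WQ_UNBOUND) != !(flags & TOY_WQ_PERCPU);
}
```

Step 2 then amounts to dropping TOY_WQ_PERCPU again from the callers that
turn out not to need locality.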

=== Changes introduced by this patchset ===

1)	[P1] add system_percpu_wq and system_dfl_wq

	system_wq is a per-CPU workqueue, but its name does not make that clear.
	system_unbound_wq is to be used when locality is not required.

	Because of that, system_percpu_wq and system_dfl_wq have been
	introduced in order to replace, in the future, system_wq and
	system_unbound_wq.

2)	[P2] add new WQ_PERCPU flag

	This patch adds the new WQ_PERCPU flag to explicitly request per-cpu
	behavior. WQ_UNBOUND will be removed in a future release cycle.

3)	[P3] Doc change about WQ_PERCPU

	Added a short section about WQ_PERCPU and a Note under WQ_UNBOUND
	mentioning that it will be removed in the future.

---
Changes in v4:
-   Take a step back from the previous version, in order to first add the new
    wq(s) and the new flag (WQ_PERCPU), addressing all the other changes later.

Changes in v3:
-   The introduction of the new wq(s) and of the WQ_PERCPU flag has been moved
    into separate patches (1 for the wq(s) and 1 for WQ_PERCPU).
-   WQ_PERCPU is now added to all the alloc_workqueue() callers in separate
    patches, addressing a few subsystems first (fs, mm, net).

Changes in v2:
-   The introduction of WQ_PERCPU has been merged with the alloc_workqueue()
    patch that passes the WQ_PERCPU flag explicitly to every caller.
-   2 drivers in the code were not matched by Coccinelle; WQ_PERCPU has been
    added there as well.
-   WQ_PERCPU added to __WQ_BH_ALLOWS.
-   queue_work() now prints a warning (pr_warn_once()) if a user is using the
    old wq, and redirects the old wq to the new one.
-   Changes to workqueue.rst about the WQ_PERCPU flag and a Note about the
    future of WQ_UNBOUND.

Marco Crivellari (3):
  Workqueue: add system_percpu_wq and system_dfl_wq
  Workqueue: add new WQ_PERCPU flag
  [Doc] Workqueue: add WQ_PERCPU

 Documentation/core-api/workqueue.rst | 10 ++++++++++
 include/linux/workqueue.h            |  9 ++++++---
 kernel/workqueue.c                   |  4 ++++
 3 files changed, 20 insertions(+), 3 deletions(-)

-- 
2.49.0



* [PATCH v4 1/3] Workqueue: add system_percpu_wq and system_dfl_wq
  2025-06-12 13:33 [PATCH v4 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq Marco Crivellari
@ 2025-06-12 13:33 ` Marco Crivellari
  2025-06-13 13:05   ` Frederic Weisbecker
  2025-06-12 13:33 ` [PATCH v4 2/3] Workqueue: add new WQ_PERCPU flag Marco Crivellari
  2025-06-12 13:33 ` [PATCH v4 3/3] [Doc] Workqueue: add WQ_PERCPU Marco Crivellari
  2 siblings, 1 reply; 8+ messages in thread
From: Marco Crivellari @ 2025-06-12 13:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Lai Jiangshan, Thomas Gleixner, Frederic Weisbecker,
	Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko

Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

system_wq is a per-CPU workqueue, yet nothing in its name hints at that
CPU affinity constraint, which is very often not required by users. Make
it clear by adding a system_percpu_wq.

system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.

Add system_dfl_wq to encourage its use when unbound work should be used.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
 include/linux/workqueue.h | 8 +++++---
 kernel/workqueue.c        | 4 ++++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 6e30f275da77..502ec4a5e32c 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -427,7 +427,7 @@ enum wq_consts {
 /*
  * System-wide workqueues which are always present.
  *
- * system_wq is the one used by schedule[_delayed]_work[_on]().
+ * system_percpu_wq is the one used by schedule[_delayed]_work[_on]().
  * Multi-CPU multi-threaded.  There are users which expect relatively
  * short queue flush time.  Don't queue works which can run for too
  * long.
@@ -438,7 +438,7 @@ enum wq_consts {
  * system_long_wq is similar to system_wq but may host long running
  * works.  Queue flushing might take relatively long.
  *
- * system_unbound_wq is unbound workqueue.  Workers are not bound to
+ * system_dfl_wq is unbound workqueue.  Workers are not bound to
  * any specific CPU, not concurrency managed, and all queued works are
  * executed immediately as long as max_active limit is not reached and
  * resources are available.
@@ -455,10 +455,12 @@ enum wq_consts {
  * system_bh[_highpri]_wq are convenience interface to softirq. BH work items
  * are executed in the queueing CPU's BH context in the queueing order.
  */
-extern struct workqueue_struct *system_wq;
+extern struct workqueue_struct *system_wq; /* use system_percpu_wq, this will be removed */
+extern struct workqueue_struct *system_percpu_wq;
 extern struct workqueue_struct *system_highpri_wq;
 extern struct workqueue_struct *system_long_wq;
 extern struct workqueue_struct *system_unbound_wq;
+extern struct workqueue_struct *system_dfl_wq;
 extern struct workqueue_struct *system_freezable_wq;
 extern struct workqueue_struct *system_power_efficient_wq;
 extern struct workqueue_struct *system_freezable_power_efficient_wq;
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 97f37b5bae66..7a3f53a9841e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -505,12 +505,16 @@ static struct kthread_worker *pwq_release_worker __ro_after_init;
 
 struct workqueue_struct *system_wq __ro_after_init;
 EXPORT_SYMBOL(system_wq);
+struct workqueue_struct *system_percpu_wq __ro_after_init;
+EXPORT_SYMBOL(system_percpu_wq);
 struct workqueue_struct *system_highpri_wq __ro_after_init;
 EXPORT_SYMBOL_GPL(system_highpri_wq);
 struct workqueue_struct *system_long_wq __ro_after_init;
 EXPORT_SYMBOL_GPL(system_long_wq);
 struct workqueue_struct *system_unbound_wq __ro_after_init;
 EXPORT_SYMBOL_GPL(system_unbound_wq);
+struct workqueue_struct *system_dfl_wq __ro_after_init;
+EXPORT_SYMBOL_GPL(system_dfl_wq);
 struct workqueue_struct *system_freezable_wq __ro_after_init;
 EXPORT_SYMBOL_GPL(system_freezable_wq);
 struct workqueue_struct *system_power_efficient_wq __ro_after_init;
-- 
2.49.0



* [PATCH v4 2/3] Workqueue: add new WQ_PERCPU flag
  2025-06-12 13:33 [PATCH v4 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq Marco Crivellari
  2025-06-12 13:33 ` [PATCH v4 1/3] Workqueue: add system_percpu_wq and system_dfl_wq Marco Crivellari
@ 2025-06-12 13:33 ` Marco Crivellari
  2025-06-12 13:33 ` [PATCH v4 3/3] [Doc] Workqueue: add WQ_PERCPU Marco Crivellari
  2 siblings, 0 replies; 8+ messages in thread
From: Marco Crivellari @ 2025-06-12 13:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Lai Jiangshan, Thomas Gleixner, Frederic Weisbecker,
	Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko

Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

This patch adds a new WQ_PERCPU flag to explicitly request the use of
the per-CPU behavior. Both flags coexist for one release cycle to allow
callers to transition their calls.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
 include/linux/workqueue.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 502ec4a5e32c..6347b9b3e472 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -401,6 +401,7 @@ enum wq_flags {
 	 * http://thread.gmane.org/gmane.linux.kernel/1480396
 	 */
 	WQ_POWER_EFFICIENT	= 1 << 7,
+	WQ_PERCPU		= 1 << 8, /* bound to a specific cpu */
 
 	__WQ_DESTROYING		= 1 << 15, /* internal: workqueue is destroying */
 	__WQ_DRAINING		= 1 << 16, /* internal: workqueue is draining */
-- 
2.49.0



* [PATCH v4 3/3] [Doc] Workqueue: add WQ_PERCPU
  2025-06-12 13:33 [PATCH v4 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq Marco Crivellari
  2025-06-12 13:33 ` [PATCH v4 1/3] Workqueue: add system_percpu_wq and system_dfl_wq Marco Crivellari
  2025-06-12 13:33 ` [PATCH v4 2/3] Workqueue: add new WQ_PERCPU flag Marco Crivellari
@ 2025-06-12 13:33 ` Marco Crivellari
  2025-06-13 13:13   ` Frederic Weisbecker
  2 siblings, 1 reply; 8+ messages in thread
From: Marco Crivellari @ 2025-06-12 13:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Lai Jiangshan, Thomas Gleixner, Frederic Weisbecker,
	Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko

Update the workqueue documentation with a description of the newly
added flag, WQ_PERCPU.

Also add a note to the WQ_UNBOUND flag documentation.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
 Documentation/core-api/workqueue.rst | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/core-api/workqueue.rst b/Documentation/core-api/workqueue.rst
index e295835fc116..ae63a648a51b 100644
--- a/Documentation/core-api/workqueue.rst
+++ b/Documentation/core-api/workqueue.rst
@@ -183,6 +183,12 @@ resources, scheduled and executed.
   BH work items cannot sleep. All other features such as delayed queueing,
   flushing and canceling are supported.
 
+``WQ_PERCPU``
+  Work items queued to a per-cpu wq are bound to that specific CPU.
+  This flag it's the right choice when cpu locality is important.
+
+  This flag is the complement of ``WQ_UNBOUND``.
+
 ``WQ_UNBOUND``
   Work items queued to an unbound wq are served by the special
   worker-pools which host workers which are not bound to any
@@ -200,6 +206,10 @@ resources, scheduled and executed.
   * Long running CPU intensive workloads which can be better
     managed by the system scheduler.
 
+  **Note:** This flag will be removed in future and all the work
+  items that dosen't need to be bound to a specific CPU, should not
+  use this flags.
+
 ``WQ_FREEZABLE``
   A freezable wq participates in the freeze phase of the system
   suspend operations.  Work items on the wq are drained and no
-- 
2.49.0



* Re: [PATCH v4 1/3] Workqueue: add system_percpu_wq and system_dfl_wq
  2025-06-12 13:33 ` [PATCH v4 1/3] Workqueue: add system_percpu_wq and system_dfl_wq Marco Crivellari
@ 2025-06-13 13:05   ` Frederic Weisbecker
  2025-06-13 13:19     ` Marco Crivellari
  0 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2025-06-13 13:05 UTC (permalink / raw)
  To: Marco Crivellari
  Cc: linux-kernel, Tejun Heo, Lai Jiangshan, Thomas Gleixner,
	Sebastian Andrzej Siewior, Michal Hocko

On Thu, Jun 12, 2025 at 03:33:33PM +0200, Marco Crivellari wrote:
> Currently, if a user enqueues a work item using schedule_delayed_work(), the
> wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
> WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
> schedule_work(), which uses system_wq, and queue_work(), which again makes
> use of WORK_CPU_UNBOUND.
> 
> This lack of consistency cannot be addressed without refactoring the API.
> 
> system_wq is a per-CPU workqueue, yet nothing in its name hints at that
> CPU affinity constraint, which is very often not required by users. Make
> it clear by adding a system_percpu_wq.
> 
> system_unbound_wq should be the default workqueue so as not to enforce
> locality constraints for random work whenever it's not required.
> 
> Add system_dfl_wq to encourage its use when unbound work should be used.
> 
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
> ---
>  include/linux/workqueue.h | 8 +++++---
>  kernel/workqueue.c        | 4 ++++
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
> index 6e30f275da77..502ec4a5e32c 100644
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -427,7 +427,7 @@ enum wq_consts {
>  /*
>   * System-wide workqueues which are always present.
>   *
> - * system_wq is the one used by schedule[_delayed]_work[_on]().
> + * system_percpu_wq is the one used by schedule[_delayed]_work[_on]().
>   * Multi-CPU multi-threaded.  There are users which expect relatively
>   * short queue flush time.  Don't queue works which can run for too
>   * long.
> @@ -438,7 +438,7 @@ enum wq_consts {
>   * system_long_wq is similar to system_wq but may host long running
>   * works.  Queue flushing might take relatively long.
>   *
> - * system_unbound_wq is unbound workqueue.  Workers are not bound to
> + * system_dfl_wq is unbound workqueue.  Workers are not bound to
>   * any specific CPU, not concurrency managed, and all queued works are
>   * executed immediately as long as max_active limit is not reached and
>   * resources are available.
> @@ -455,10 +455,12 @@ enum wq_consts {
>   * system_bh[_highpri]_wq are convenience interface to softirq. BH work items
>   * are executed in the queueing CPU's BH context in the queueing order.
>   */
> -extern struct workqueue_struct *system_wq;
> +extern struct workqueue_struct *system_wq; /* use system_percpu_wq, this will be removed */
> +extern struct workqueue_struct *system_percpu_wq;
>  extern struct workqueue_struct *system_highpri_wq;
>  extern struct workqueue_struct *system_long_wq;
>  extern struct workqueue_struct *system_unbound_wq;
> +extern struct workqueue_struct *system_dfl_wq;
>  extern struct workqueue_struct *system_freezable_wq;
>  extern struct workqueue_struct *system_power_efficient_wq;
>  extern struct workqueue_struct *system_freezable_power_efficient_wq;
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 97f37b5bae66..7a3f53a9841e 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -505,12 +505,16 @@ static struct kthread_worker *pwq_release_worker __ro_after_init;
>  
>  struct workqueue_struct *system_wq __ro_after_init;
>  EXPORT_SYMBOL(system_wq);
> +struct workqueue_struct *system_percpu_wq __ro_after_init;
> +EXPORT_SYMBOL(system_percpu_wq);
>  struct workqueue_struct *system_highpri_wq __ro_after_init;
>  EXPORT_SYMBOL_GPL(system_highpri_wq);
>  struct workqueue_struct *system_long_wq __ro_after_init;
>  EXPORT_SYMBOL_GPL(system_long_wq);
>  struct workqueue_struct *system_unbound_wq __ro_after_init;
>  EXPORT_SYMBOL_GPL(system_unbound_wq);
> +struct workqueue_struct *system_dfl_wq __ro_after_init;
> +EXPORT_SYMBOL_GPL(system_dfl_wq);
>  struct workqueue_struct *system_freezable_wq __ro_after_init;
>  EXPORT_SYMBOL_GPL(system_freezable_wq);
>  struct workqueue_struct *system_power_efficient_wq __ro_after_init;

Shouldn't you allocate system_percpu_wq and system_dfl_wq in
workqueue_init_early()?

And yes, I think we should allocate them rather than make them pointers to
system_wq and system_unbound_wq; this way you can more easily warn about
deprecated uses of system_wq and system_unbound_wq in the future,
after upcoming merge windows.
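
A rough userspace mock of this idea, only to show why separate allocations
help. Every toy_* name, the queue name strings and the flag value are
hypothetical; this is not the workqueue_init_early() code.

```c
#include <stdlib.h>

/* Toy stand-ins, not the kernel's definitions. */
#define TOY_WQ_UNBOUND (1u << 1)

struct toy_wq {
	const char *name;
	unsigned int flags;
};

static struct toy_wq *toy_system_wq;         /* old per-cpu name */
static struct toy_wq *toy_system_percpu_wq;  /* new per-cpu name */
static struct toy_wq *toy_system_unbound_wq; /* old unbound name */
static struct toy_wq *toy_system_dfl_wq;     /* new unbound name */

static struct toy_wq *toy_alloc_wq(const char *name, unsigned int flags)
{
	struct toy_wq *wq = malloc(sizeof(*wq));

	if (wq) {
		wq->name = name;
		wq->flags = flags;
	}
	return wq;
}

/* Mimics an early-init step: every queue gets its own allocation. */
static int toy_init_early(void)
{
	toy_system_wq = toy_alloc_wq("toy_events", 0);
	toy_system_percpu_wq = toy_alloc_wq("toy_events_percpu", 0);
	toy_system_unbound_wq = toy_alloc_wq("toy_events_unbound", TOY_WQ_UNBOUND);
	toy_system_dfl_wq = toy_alloc_wq("toy_events_dfl", TOY_WQ_UNBOUND);

	if (!toy_system_wq || !toy_system_percpu_wq ||
	    !toy_system_unbound_wq || !toy_system_dfl_wq)
		return -1;
	return 0;
}

/* Only possible because old and new queues are distinct objects. */
static int toy_is_deprecated_wq(const struct toy_wq *wq)
{
	return wq == toy_system_wq || wq == toy_system_unbound_wq;
}
```

If the new names were plain aliases of the old pointers, a check like
toy_is_deprecated_wq() could not tell a deprecated user from a converted one.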

Thanks.

> -- 
> 2.49.0
> 

-- 
Frederic Weisbecker
SUSE Labs


* Re: [PATCH v4 3/3] [Doc] Workqueue: add WQ_PERCPU
  2025-06-12 13:33 ` [PATCH v4 3/3] [Doc] Workqueue: add WQ_PERCPU Marco Crivellari
@ 2025-06-13 13:13   ` Frederic Weisbecker
  2025-06-13 13:23     ` Marco Crivellari
  0 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2025-06-13 13:13 UTC (permalink / raw)
  To: Marco Crivellari
  Cc: linux-kernel, Tejun Heo, Lai Jiangshan, Thomas Gleixner,
	Sebastian Andrzej Siewior, Michal Hocko

On Thu, Jun 12, 2025 at 03:33:35PM +0200, Marco Crivellari wrote:
> Update the workqueue documentation with a description of the newly
> added flag, WQ_PERCPU.
> 
> Also add a note to the WQ_UNBOUND flag documentation.
> 
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>

Thanks, a few spelling nits below:

> ---
>  Documentation/core-api/workqueue.rst | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/Documentation/core-api/workqueue.rst b/Documentation/core-api/workqueue.rst
> index e295835fc116..ae63a648a51b 100644
> --- a/Documentation/core-api/workqueue.rst
> +++ b/Documentation/core-api/workqueue.rst
> @@ -183,6 +183,12 @@ resources, scheduled and executed.
>    BH work items cannot sleep. All other features such as delayed queueing,
>    flushing and canceling are supported.
>  
> +``WQ_PERCPU``
> +  Work items queued to a per-cpu wq are bound to that specific CPU.

s/that/a

> +  This flag it's the right choice when cpu locality is important.

s/it's/is

> +
> +  This flag is the complement of ``WQ_UNBOUND``.
> +
>  ``WQ_UNBOUND``
>    Work items queued to an unbound wq are served by the special
>    worker-pools which host workers which are not bound to any
> @@ -200,6 +206,10 @@ resources, scheduled and executed.
>    * Long running CPU intensive workloads which can be better
>      managed by the system scheduler.
>  
> +  **Note:** This flag will be removed in future and all the work

in the future

> +  items that dosen't need to be bound to a specific CPU, should not

s/dosen't/don't

> +  use this flags.

flag.

But since the support for this is not there yet, perhaps this note
should be added later? I.e., if someone omits the WQ_UNBOUND flag currently,
the workqueue will be per-cpu.
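
The difference between today's semantics and the planned ones can be
sketched in a few lines of userspace C. The TOY_* names and both helpers are
hypothetical, and the "planned" helper is only an assumption about how the
default could look once unbound becomes implicit.

```c
/* Toy stand-ins for the workqueue flags; not the kernel definitions. */
enum toy_wq_flags {
	TOY_WQ_UNBOUND = 1 << 1, /* assumed to match the current header */
	TOY_WQ_PERCPU  = 1 << 8, /* value introduced by patch 2/3 */
};

/* Current behavior: a wq without WQ_UNBOUND is per-cpu. */
static int toy_is_percpu_today(unsigned int flags)
{
	return !(flags & TOY_WQ_UNBOUND);
}

/* Assumed future behavior: per-cpu only when explicitly requested. */
static int toy_is_percpu_planned(unsigned int flags)
{
	return !!(flags & TOY_WQ_PERCPU);
}
```

A caller passing no flag at all is per-cpu under the first helper and unbound
under the second, so a note telling users to drop WQ_UNBOUND only becomes
correct once the second model is in place.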

Thanks.

> +
>  ``WQ_FREEZABLE``
>    A freezable wq participates in the freeze phase of the system
>    suspend operations.  Work items on the wq are drained and no
> -- 
> 2.49.0
> 

-- 
Frederic Weisbecker
SUSE Labs


* Re: [PATCH v4 1/3] Workqueue: add system_percpu_wq and system_dfl_wq
  2025-06-13 13:05   ` Frederic Weisbecker
@ 2025-06-13 13:19     ` Marco Crivellari
  0 siblings, 0 replies; 8+ messages in thread
From: Marco Crivellari @ 2025-06-13 13:19 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, Tejun Heo, Lai Jiangshan, Thomas Gleixner,
	Sebastian Andrzej Siewior, Michal Hocko

Hi Frederic,

I left the wq allocation together with the wq logic changes.
But if it's better to allocate them directly here, where the wq(s) are
added, I will do so.

Thank you.



--

Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com


* Re: [PATCH v4 3/3] [Doc] Workqueue: add WQ_PERCPU
  2025-06-13 13:13   ` Frederic Weisbecker
@ 2025-06-13 13:23     ` Marco Crivellari
  0 siblings, 0 replies; 8+ messages in thread
From: Marco Crivellari @ 2025-06-13 13:23 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, Tejun Heo, Lai Jiangshan, Thomas Gleixner,
	Sebastian Andrzej Siewior, Michal Hocko

Thank you!

> But since the support for this is not there yet, perhaps this note
> should be added later? Ie: if someone omits the WQ_UNBOUND flag currently,
> the workqueue will be percpu.

Yes, it makes sense.

I will send the v5 with all the corrections.

Thanks.

-- 

Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com
