* [PATCH v4 1/2] drm/xe: replace use of system_unbound_wq with system_dfl_wq
2026-02-02 10:37 [PATCH v4 0/2] replace system_unbound_wq, add WQ_PERCPU to alloc_workqueue Marco Crivellari
@ 2026-02-02 10:37 ` Marco Crivellari
2026-02-02 10:37 ` [PATCH v4 2/2] drm/xe: add WQ_PERCPU to alloc_workqueue users Marco Crivellari
2026-02-02 16:20 ` [PATCH v4 0/2] replace system_unbound_wq, add WQ_PERCPU to alloc_workqueue Rodrigo Vivi
2 siblings, 0 replies; 6+ messages in thread
From: Marco Crivellari @ 2026-02-02 10:37 UTC (permalink / raw)
To: linux-kernel, intel-xe, dri-devel
Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko,
Thomas Hellstrom, Rodrigo Vivi, David Airlie, Simona Vetter
This patch continues the effort to refactor workqueue APIs, which began
with the changes introducing new workqueues and a new alloc_workqueue flag:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.
Before that can happen, workqueue users must be converted to the better-named
new workqueues with no intended behaviour changes:
system_wq -> system_percpu_wq
system_unbound_wq -> system_dfl_wq
This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
removed in the future.
Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
drivers/gpu/drm/xe/xe_devcoredump.c | 2 +-
drivers/gpu/drm/xe/xe_execlist.c | 2 +-
drivers/gpu/drm/xe/xe_guc_ct.c | 4 ++--
drivers/gpu/drm/xe/xe_oa.c | 2 +-
drivers/gpu/drm/xe/xe_vm.c | 4 ++--
5 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
index cf41bb6d2172..558a1a9841a0 100644
--- a/drivers/gpu/drm/xe/xe_devcoredump.c
+++ b/drivers/gpu/drm/xe/xe_devcoredump.c
@@ -356,7 +356,7 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump,
xe_engine_snapshot_capture_for_queue(q);
- queue_work(system_unbound_wq, &ss->work);
+ queue_work(system_dfl_wq, &ss->work);
dma_fence_end_signalling(cookie);
}
diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c
index 005a5b2c36fe..dc25caf47813 100644
--- a/drivers/gpu/drm/xe/xe_execlist.c
+++ b/drivers/gpu/drm/xe/xe_execlist.c
@@ -421,7 +421,7 @@ static void execlist_exec_queue_kill(struct xe_exec_queue *q)
static void execlist_exec_queue_destroy(struct xe_exec_queue *q)
{
INIT_WORK(&q->execlist->destroy_async, execlist_exec_queue_destroy_async);
- queue_work(system_unbound_wq, &q->execlist->destroy_async);
+ queue_work(system_dfl_wq, &q->execlist->destroy_async);
}
static int execlist_exec_queue_set_priority(struct xe_exec_queue *q,
diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
index dfbf76037b04..a0498f36bd74 100644
--- a/drivers/gpu/drm/xe/xe_guc_ct.c
+++ b/drivers/gpu/drm/xe/xe_guc_ct.c
@@ -643,7 +643,7 @@ static int __xe_guc_ct_start(struct xe_guc_ct *ct, bool needs_register)
spin_lock_irq(&ct->dead.lock);
if (ct->dead.reason) {
ct->dead.reason |= (1 << CT_DEAD_STATE_REARM);
- queue_work(system_unbound_wq, &ct->dead.worker);
+ queue_work(system_dfl_wq, &ct->dead.worker);
}
spin_unlock_irq(&ct->dead.lock);
#endif
@@ -2165,7 +2165,7 @@ static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reaso
spin_unlock_irqrestore(&ct->dead.lock, flags);
- queue_work(system_unbound_wq, &(ct)->dead.worker);
+ queue_work(system_dfl_wq, &(ct)->dead.worker);
}
static void ct_dead_print(struct xe_dead_ct *dead)
diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
index abf87fe0b345..8b37e49f639f 100644
--- a/drivers/gpu/drm/xe/xe_oa.c
+++ b/drivers/gpu/drm/xe/xe_oa.c
@@ -969,7 +969,7 @@ static void xe_oa_config_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
struct xe_oa_fence *ofence = container_of(cb, typeof(*ofence), cb);
INIT_DELAYED_WORK(&ofence->work, xe_oa_fence_work_fn);
- queue_delayed_work(system_unbound_wq, &ofence->work,
+ queue_delayed_work(system_dfl_wq, &ofence->work,
usecs_to_jiffies(NOA_PROGRAM_ADDITIONAL_DELAY_US));
dma_fence_put(fence);
}
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 293b92ed2fdd..e6cfa5dc7f62 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1112,7 +1112,7 @@ static void vma_destroy_cb(struct dma_fence *fence,
struct xe_vma *vma = container_of(cb, struct xe_vma, destroy_cb);
INIT_WORK(&vma->destroy_work, vma_destroy_work_func);
- queue_work(system_unbound_wq, &vma->destroy_work);
+ queue_work(system_dfl_wq, &vma->destroy_work);
}
static void xe_vma_destroy(struct xe_vma *vma, struct dma_fence *fence)
@@ -1894,7 +1894,7 @@ static void xe_vm_free(struct drm_gpuvm *gpuvm)
struct xe_vm *vm = container_of(gpuvm, struct xe_vm, gpuvm);
/* To destroy the VM we need to be able to sleep */
- queue_work(system_unbound_wq, &vm->destroy_work);
+ queue_work(system_dfl_wq, &vm->destroy_work);
}
struct xe_vm *xe_vm_lookup(struct xe_file *xef, u32 id)
--
2.52.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v4 2/2] drm/xe: add WQ_PERCPU to alloc_workqueue users
2026-02-02 10:37 [PATCH v4 0/2] replace system_unbound_wq, add WQ_PERCPU to alloc_workqueue Marco Crivellari
2026-02-02 10:37 ` [PATCH v4 1/2] drm/xe: replace use of system_unbound_wq with system_dfl_wq Marco Crivellari
@ 2026-02-02 10:37 ` Marco Crivellari
2026-02-02 16:20 ` [PATCH v4 0/2] replace system_unbound_wq, add WQ_PERCPU to alloc_workqueue Rodrigo Vivi
2 siblings, 0 replies; 6+ messages in thread
From: Marco Crivellari @ 2026-02-02 10:37 UTC (permalink / raw)
To: linux-kernel, intel-xe, dri-devel
Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko,
Thomas Hellstrom, Rodrigo Vivi, David Airlie, Simona Vetter
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
The refactoring is going to alter the default behavior of
alloc_workqueue() to be unbound by default.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU. For more details see the Link tag below.
In order to keep alloc_workqueue() behavior identical, explicitly request
WQ_PERCPU.
Link: https://lore.kernel.org/all/20250221112
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
drivers/gpu/drm/xe/xe_device.c | 4 ++--
drivers/gpu/drm/xe/xe_ggtt.c | 2 +-
drivers/gpu/drm/xe/xe_hw_engine_group.c | 3 ++-
drivers/gpu/drm/xe/xe_sriov.c | 2 +-
4 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 9cf82bde36c4..9e5fb0d4b8e7 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -508,8 +508,8 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
xe->preempt_fence_wq = alloc_ordered_workqueue("xe-preempt-fence-wq",
WQ_MEM_RECLAIM);
xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);
- xe->unordered_wq = alloc_workqueue("xe-unordered-wq", 0, 0);
- xe->destroy_wq = alloc_workqueue("xe-destroy-wq", 0, 0);
+ xe->unordered_wq = alloc_workqueue("xe-unordered-wq", WQ_PERCPU, 0);
+ xe->destroy_wq = alloc_workqueue("xe-destroy-wq", WQ_PERCPU, 0);
if (!xe->ordered_wq || !xe->unordered_wq ||
!xe->preempt_fence_wq || !xe->destroy_wq) {
/*
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 60665ad1415b..8b9d7c0bbe90 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -367,7 +367,7 @@ int xe_ggtt_init_early(struct xe_ggtt *ggtt)
else
ggtt->pt_ops = &xelp_pt_ops;
- ggtt->wq = alloc_workqueue("xe-ggtt-wq", WQ_MEM_RECLAIM, 0);
+ ggtt->wq = alloc_workqueue("xe-ggtt-wq", WQ_MEM_RECLAIM | WQ_PERCPU, 0);
if (!ggtt->wq)
return -ENOMEM;
diff --git a/drivers/gpu/drm/xe/xe_hw_engine_group.c b/drivers/gpu/drm/xe/xe_hw_engine_group.c
index 2ef33dfbe3a2..4c2b113364d3 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine_group.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine_group.c
@@ -51,7 +51,8 @@ hw_engine_group_alloc(struct xe_device *xe)
if (!group)
return ERR_PTR(-ENOMEM);
- group->resume_wq = alloc_workqueue("xe-resume-lr-jobs-wq", 0, 0);
+ group->resume_wq = alloc_workqueue("xe-resume-lr-jobs-wq", WQ_PERCPU,
+ 0);
if (!group->resume_wq)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/xe/xe_sriov.c b/drivers/gpu/drm/xe/xe_sriov.c
index ea411944609b..f3835867fce5 100644
--- a/drivers/gpu/drm/xe/xe_sriov.c
+++ b/drivers/gpu/drm/xe/xe_sriov.c
@@ -120,7 +120,7 @@ int xe_sriov_init(struct xe_device *xe)
xe_sriov_vf_init_early(xe);
xe_assert(xe, !xe->sriov.wq);
- xe->sriov.wq = alloc_workqueue("xe-sriov-wq", 0, 0);
+ xe->sriov.wq = alloc_workqueue("xe-sriov-wq", WQ_PERCPU, 0);
if (!xe->sriov.wq)
return -ENOMEM;
--
2.52.0
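
As a rough user-space illustration of the rule stated in the commit message above (a caller that does not pass WQ_UNBOUND must now pass WQ_PERCPU), the sketch below models the flag conversion. The MODEL_* bit values and the helper name wq_flags_after_conversion() are assumptions made for this example only; they are not copied from the kernel headers, and this is not kernel code.

```c
#include <assert.h>

/*
 * Hypothetical bit values, for illustration only. The real flag
 * definitions live in include/linux/workqueue.h and may differ.
 */
#define MODEL_WQ_UNBOUND     (1u << 1)
#define MODEL_WQ_MEM_RECLAIM (1u << 3)
#define MODEL_WQ_PERCPU      (1u << 8)

/*
 * Model of the conversion applied by the patch: a caller that does not
 * ask for an unbound workqueue gets the per-CPU flag added explicitly,
 * so its behaviour stays the same once the default becomes unbound.
 */
unsigned int wq_flags_after_conversion(unsigned int flags)
{
	/* The two flags are described as opposites, so reject both at once. */
	assert(!((flags & MODEL_WQ_UNBOUND) && (flags & MODEL_WQ_PERCPU)));

	if (flags & MODEL_WQ_UNBOUND)
		return flags;			/* already explicit */
	return flags | MODEL_WQ_PERCPU;		/* make per-CPU explicit */
}
```

In this model the flags value 0 used for "xe-sriov-wq" becomes MODEL_WQ_PERCPU, and the WQ_MEM_RECLAIM used for "xe-ggtt-wq" becomes MODEL_WQ_MEM_RECLAIM | MODEL_WQ_PERCPU, mirroring the hunks above.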
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v4 0/2] replace system_unbound_wq, add WQ_PERCPU to alloc_workqueue
2026-02-02 10:37 [PATCH v4 0/2] replace system_unbound_wq, add WQ_PERCPU to alloc_workqueue Marco Crivellari
2026-02-02 10:37 ` [PATCH v4 1/2] drm/xe: replace use of system_unbound_wq with system_dfl_wq Marco Crivellari
2026-02-02 10:37 ` [PATCH v4 2/2] drm/xe: add WQ_PERCPU to alloc_workqueue users Marco Crivellari
@ 2026-02-02 16:20 ` Rodrigo Vivi
2026-02-02 16:22 ` Marco Crivellari
2 siblings, 1 reply; 6+ messages in thread
From: Rodrigo Vivi @ 2026-02-02 16:20 UTC (permalink / raw)
To: Marco Crivellari
Cc: linux-kernel, intel-xe, dri-devel, Tejun Heo, Lai Jiangshan,
Frederic Weisbecker, Sebastian Andrzej Siewior, Michal Hocko,
Thomas Hellstrom, David Airlie, Simona Vetter
On Mon, Feb 02, 2026 at 11:37:54AM +0100, Marco Crivellari wrote:
> Hi,
>
> === Current situation: problems ===
>
> Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
> set to the housekeeping CPUs, for !WQ_UNBOUND the local CPU is selected.
>
> This leads to different behaviour when a work item is scheduled on an
> isolated CPU, depending on whether the "delay" value is 0 or greater than 0:
> schedule_delayed_work(, 0);
>
> This will be handled by __queue_work() that will queue the work item on the
> current local (isolated) CPU, while:
>
> schedule_delayed_work(, 1);
>
> will move the timer to a housekeeping CPU and schedule the work there.
>
> Currently, if a user enqueues a work item using schedule_delayed_work(), the
> wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
> WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
> schedule_work(), which uses system_wq, and queue_work(), which again makes
> use of WORK_CPU_UNBOUND.
>
> This lack of consistency cannot be addressed without refactoring the API.
>
> === Recent changes to the WQ API ===
>
> The following commits introduced the recent changes in the workqueue API:
>
> - commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
> - commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
>
> The old workqueues will be removed in a future release cycle.
>
> === Changes introduced by this series ===
>
> 1) [P 1] Replace uses of system_unbound_wq
>
> system_unbound_wq is to be used when locality is not required.
>
> Because of that, system_unbound_wq has been replaced with
> system_dfl_wq, to make sure this would be the default choice
> when locality is not important.
>
> system_dfl_wq behaves like system_unbound_wq.
>
>
> 2) [P 2] add WQ_PERCPU to alloc_workqueue()
>
> This change adds a new WQ_PERCPU flag to explicitly request
> alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
>
> The behavior is the same.
>
> Thanks!
>
> ---
> Changes in v4:
> - rebased on drm-xe
series is
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
I just resent it for CI and will push to drm-xe-next as soon as I get
the greenlight from CI.
Thanks,
Rodrigo.
>
> Changes in v3:
> - rebased on v6.19-rc6 (on master specifically)
>
> - commit logs improved
>
> Changes in v2:
> - rebased on v6.18-rc4.
>
> - commit logs integrated with the appropriate workqueue API commit hash.
>
>
> Marco Crivellari (2):
> drm/xe: replace use of system_unbound_wq with system_dfl_wq
> drm/xe: add WQ_PERCPU to alloc_workqueue users
>
> drivers/gpu/drm/xe/xe_devcoredump.c | 2 +-
> drivers/gpu/drm/xe/xe_device.c | 4 ++--
> drivers/gpu/drm/xe/xe_execlist.c | 2 +-
> drivers/gpu/drm/xe/xe_ggtt.c | 2 +-
> drivers/gpu/drm/xe/xe_guc_ct.c | 4 ++--
> drivers/gpu/drm/xe/xe_hw_engine_group.c | 3 ++-
> drivers/gpu/drm/xe/xe_oa.c | 2 +-
> drivers/gpu/drm/xe/xe_sriov.c | 2 +-
> drivers/gpu/drm/xe/xe_vm.c | 4 ++--
> 9 files changed, 13 insertions(+), 12 deletions(-)
>
> --
> 2.52.0
>
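
The delay == 0 versus delay > 0 scenario described in the quoted cover letter can be pictured with a small user-space toy model. It only encodes the two outcomes as the cover letter states them for an isolated CPU on a nohz_full system; struct model_system and model_delayed_work_cpu() are hypothetical names introduced for this sketch and do not exist in the kernel.

```c
#include <assert.h>

/* Toy description of the system in the cover letter's scenario. */
struct model_system {
	int local_cpu;		/* CPU the caller runs on (isolated here) */
	int housekeeping_cpu;	/* a CPU outside the isolated set */
};

/*
 * Returns the CPU on which the work item ends up in this model:
 * delay == 0 queues directly on the local CPU, while delay > 0 goes
 * through a timer that is moved to a housekeeping CPU.
 */
int model_delayed_work_cpu(const struct model_system *sys, unsigned long delay)
{
	assert(sys);

	if (delay == 0)
		return sys->local_cpu;
	return sys->housekeeping_cpu;
}
```

With local_cpu = 3 and housekeeping_cpu = 0, a delay of 0 yields CPU 3 and a delay of 1 yields CPU 0, which is the inconsistency the cover letter points out.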
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4 0/2] replace system_unbound_wq, add WQ_PERCPU to alloc_workqueue
2026-02-02 16:20 ` [PATCH v4 0/2] replace system_unbound_wq, add WQ_PERCPU to alloc_workqueue Rodrigo Vivi
@ 2026-02-02 16:22 ` Marco Crivellari
2026-02-03 0:19 ` Rodrigo Vivi
0 siblings, 1 reply; 6+ messages in thread
From: Marco Crivellari @ 2026-02-02 16:22 UTC (permalink / raw)
To: Rodrigo Vivi
Cc: linux-kernel, intel-xe, dri-devel, Tejun Heo, Lai Jiangshan,
Frederic Weisbecker, Sebastian Andrzej Siewior, Michal Hocko,
Thomas Hellstrom, David Airlie, Simona Vetter
On Mon, Feb 2, 2026 at 5:21 PM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> [...]
> series is
>
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>
> I just resent it for CI and will push to drm-xe-next as soon as I get
> the greenlight from CI.
>
Many thanks Rodrigo!
--
Marco Crivellari
L3 Support Engineer
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v4 0/2] replace system_unbound_wq, add WQ_PERCPU to alloc_workqueue
2026-02-02 16:22 ` Marco Crivellari
@ 2026-02-03 0:19 ` Rodrigo Vivi
0 siblings, 0 replies; 6+ messages in thread
From: Rodrigo Vivi @ 2026-02-03 0:19 UTC (permalink / raw)
To: Marco Crivellari
Cc: linux-kernel, intel-xe, dri-devel, Tejun Heo, Lai Jiangshan,
Frederic Weisbecker, Sebastian Andrzej Siewior, Michal Hocko,
Thomas Hellstrom, David Airlie, Simona Vetter
On Mon, Feb 02, 2026 at 05:22:05PM +0100, Marco Crivellari wrote:
> On Mon, Feb 2, 2026 at 5:21 PM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > [...]
> > series is
> >
> > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >
> > I just resent it for CI and will push to drm-xe-next as soon as I get
> > the greenlight from CI.
> >
>
> Many thanks Rodrigo!
Thank you. Pushed to drm-xe-next.
>
> --
>
> Marco Crivellari
>
> L3 Support Engineer
^ permalink raw reply [flat|nested] 6+ messages in thread