From: Matthew Brost <matthew.brost@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, Tejun Heo <tj@kernel.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
linux-kernel@vger.kernel.org
Subject: [RFC PATCH 01/12] workqueue: Add interface to teach lockdep to warn on reclaim violations
Date: Sun, 15 Mar 2026 21:32:44 -0700 [thread overview]
Message-ID: <20260316043255.226352-2-matthew.brost@intel.com> (raw)
In-Reply-To: <20260316043255.226352-1-matthew.brost@intel.com>
Drivers often use workqueues that run in reclaim paths (e.g., DRM
scheduler workqueues). It is useful to teach lockdep that memory
allocations which can recurse into reclaim (e.g., GFP_KERNEL) are not
allowed on these workqueues. Add an interface that taints a workqueue’s
lockdep state with reclaim.
Also add a helper to test whether a workqueue is reclaim annotated,
allowing drivers to enforce reclaim-safe behavior.
Example of lockdep splat upon violation below:
[ 60.953095] =============================================
[ 73.023656] Console: switching to colour dummy device 80x25
[ 73.023684] [IGT] xe_exec_reset: executing
[ 73.038237] [IGT] xe_exec_reset: starting subtest gt-reset
[ 73.044163] xe 0000:03:00.0: [drm] Tile0: GT0: trying reset from force_reset_write [xe]
[ 73.044276] xe 0000:03:00.0: [drm] Tile0: GT0: reset queued
[ 73.045963] ======================================================
[ 73.052133] WARNING: possible circular locking dependency detected
[ 73.058302] 7.0.0-rc3-xe+ #31 Tainted: G U
[ 73.063866] ------------------------------------------------------
[ 73.070036] kworker/u64:5/158 is trying to acquire lock:
[ 73.075342] ffffffff829a87a0 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc_cache_noprof+0x39/0x420
[ 73.083791]
but task is already holding lock:
[ 73.089612] ffffc9000152fe60 ((work_completion)(>->reset.worker)){+.+.}-{0:0}, at: process_one_work+0x1d2/0x6a0
[ 73.099852]
which lock already depends on the new lock.
[ 73.108013]
the existing dependency chain (in reverse order) is:
[ 73.115481]
-> #2 ((work_completion)(>->reset.worker)){+.+.}-{0:0}:
[ 73.123381] process_one_work+0x1ec/0x6a0
[ 73.127906] worker_thread+0x183/0x330
[ 73.132173] kthread+0xe2/0x120
[ 73.135833] ret_from_fork+0x289/0x2f0
[ 73.140101] ret_from_fork_asm+0x1a/0x30
[ 73.144540]
-> #1 ((wq_completion)gt-ordered-wq){+.+.}-{0:0}:
[ 73.151749] workqueue_warn_on_reclaim.part.0+0x32/0x50
[ 73.157487] alloc_workqueue_noprof+0xef/0x100
[ 73.162445] xe_gt_alloc+0x92/0x220 [xe]
[ 73.166954] xe_pci_probe+0x734/0x1660 [xe]
[ 73.171720] pci_device_probe+0x98/0x140
[ 73.176161] really_probe+0xcf/0x2c0
[ 73.180256] __driver_probe_device+0x6e/0x120
[ 73.185126] driver_probe_device+0x19/0x90
[ 73.189740] __driver_attach+0x89/0x140
[ 73.194091] bus_for_each_dev+0x79/0xd0
[ 73.198446] bus_add_driver+0xe6/0x210
[ 73.202712] driver_register+0x5b/0x110
[ 73.207064] 0xffffffffa00aa0db
[ 73.210724] do_one_initcall+0x59/0x2e0
[ 73.215077] do_init_module+0x5f/0x230
[ 73.219345] init_module_from_file+0xc7/0xe0
[ 73.224128] idempotent_init_module+0x176/0x270
[ 73.229175] __x64_sys_finit_module+0x61/0xb0
[ 73.234047] do_syscall_64+0x9b/0x540
[ 73.238228] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 73.243793]
-> #0 (fs_reclaim){+.+.}-{0:0}:
[ 73.249442] __lock_acquire+0x1496/0x2510
[ 73.253970] lock_acquire+0xbd/0x2f0
[ 73.258062] fs_reclaim_acquire+0x98/0xd0
[ 73.262586] __kmalloc_cache_noprof+0x39/0x420
[ 73.267545] gt_reset_worker+0x27/0x1f0 [xe]
[ 73.272385] process_one_work+0x213/0x6a0
[ 73.276910] worker_thread+0x183/0x330
[ 73.281178] kthread+0xe2/0x120
[ 73.284838] ret_from_fork+0x289/0x2f0
[ 73.289104] ret_from_fork_asm+0x1a/0x30
[ 73.293542]
other info that might help us debug this:
[ 73.301528] Chain exists of:
fs_reclaim --> (wq_completion)gt-ordered-wq --> (work_completion)(>->reset.worker)
[ 73.314795] Possible unsafe locking scenario:
[ 73.320705] CPU0 CPU1
[ 73.325232] ---- ----
[ 73.329759] lock((work_completion)(>->reset.worker));
[ 73.335148] lock((wq_completion)gt-ordered-wq);
[ 73.342359] lock((work_completion)(>->reset.worker));
[ 73.350259] lock(fs_reclaim);
[ 73.353400]
*** DEADLOCK ***
v2:
- Add WQ flag to warn on reclaim violations (Tejun)
- Add a helper function to test if WQ is annotated
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
include/linux/workqueue.h | 3 +++
kernel/workqueue.c | 41 +++++++++++++++++++++++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index a4749f56398f..5ad3b92ddd75 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -403,6 +403,7 @@ enum wq_flags {
*/
WQ_POWER_EFFICIENT = 1 << 7,
WQ_PERCPU = 1 << 8, /* bound to a specific cpu */
+ WQ_MEM_WARN_ON_RECLAIM = 1 << 9, /* teach lockdep to warn on reclaim */
__WQ_DESTROYING = 1 << 15, /* internal: workqueue is destroying */
__WQ_DRAINING = 1 << 16, /* internal: workqueue is draining */
@@ -582,6 +583,8 @@ alloc_workqueue_lockdep_map(const char *fmt, unsigned int flags, int max_active,
extern void destroy_workqueue(struct workqueue_struct *wq);
+extern bool workqueue_is_reclaim_annotated(struct workqueue_struct *wq);
+
struct workqueue_attrs *alloc_workqueue_attrs_noprof(void);
#define alloc_workqueue_attrs(...) alloc_hooks(alloc_workqueue_attrs_noprof(__VA_ARGS__))
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b77119d71641..9c2c3a503e2c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5872,6 +5872,45 @@ static struct workqueue_struct *__alloc_workqueue(const char *fmt,
return NULL;
}
+#ifdef CONFIG_LOCKDEP
+static void workqueue_warn_on_reclaim(struct workqueue_struct *wq)
+{
+ if (wq->flags & WQ_MEM_WARN_ON_RECLAIM) {
+ fs_reclaim_acquire(GFP_KERNEL);
+ lock_map_acquire(wq->lockdep_map);
+ lock_map_release(wq->lockdep_map);
+ fs_reclaim_release(GFP_KERNEL);
+ }
+}
+#else
+static void workqueue_warn_on_reclaim(struct workqueue_struct *wq)
+{
+}
+#endif
+
+/**
+ * workqueue_is_reclaim_annotated() - Test whether a workqueue is annotated for
+ * reclaim safety
+ * @wq: workqueue to test
+ *
+ * Returns true if @wq is flags have both %WQ_MEM_WARN_ON_RECLAIM and
+ * %WQ_MEM_RECLAIM set. A workqueue marked with these flags indicates that it
+ * participates in reclaim paths, and therefore must not perform memory
+ * allocations that can recurse into reclaim (e.g., GFP_KERNEL is not allowed).
+ *
+ * Drivers can use this helper to enforce reclaim-safe behavior on workqueues
+ * that are created or provided elsewhere in the code.
+ *
+ * Return:
+ * true if the workqueue is reclaim-annotated, false otherwise.
+ */
+bool workqueue_is_reclaim_annotated(struct workqueue_struct *wq)
+{
+ return (wq->flags & WQ_MEM_WARN_ON_RECLAIM) &&
+ (wq->flags & WQ_MEM_RECLAIM);
+}
+EXPORT_SYMBOL_GPL(workqueue_is_reclaim_annotated);
+
__printf(1, 4)
struct workqueue_struct *alloc_workqueue_noprof(const char *fmt,
unsigned int flags,
@@ -5887,6 +5926,7 @@ struct workqueue_struct *alloc_workqueue_noprof(const char *fmt,
return NULL;
wq_init_lockdep(wq);
+ workqueue_warn_on_reclaim(wq);
return wq;
}
@@ -5908,6 +5948,7 @@ alloc_workqueue_lockdep_map(const char *fmt, unsigned int flags,
return NULL;
wq->lockdep_map = lockdep_map;
+ workqueue_warn_on_reclaim(wq);
return wq;
}
--
2.34.1
next prev parent reply other threads:[~2026-03-16 4:33 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-16 4:32 [RFC PATCH 00/12] Introduce DRM dep queue Matthew Brost
2026-03-16 4:32 ` Matthew Brost [this message]
2026-03-25 15:59 ` [RFC PATCH 01/12] workqueue: Add interface to teach lockdep to warn on reclaim violations Tejun Heo
2026-03-26 1:49 ` Matthew Brost
2026-03-26 2:19 ` Tejun Heo
2026-03-27 4:33 ` Matthew Brost
2026-03-27 17:25 ` Tejun Heo
2026-03-16 4:32 ` [RFC PATCH 02/12] drm/dep: Add DRM dependency queue layer Matthew Brost
2026-03-16 9:16 ` Boris Brezillon
2026-03-17 5:22 ` Matthew Brost
2026-03-17 8:48 ` Boris Brezillon
2026-03-16 10:25 ` Danilo Krummrich
2026-03-17 5:10 ` Matthew Brost
2026-03-17 12:19 ` Danilo Krummrich
2026-03-18 23:02 ` Matthew Brost
2026-03-17 2:47 ` Daniel Almeida
2026-03-17 5:45 ` Matthew Brost
2026-03-17 7:17 ` Miguel Ojeda
2026-03-17 8:26 ` Matthew Brost
2026-03-17 12:04 ` Daniel Almeida
2026-03-17 19:41 ` Miguel Ojeda
2026-03-23 17:31 ` Matthew Brost
2026-03-23 17:42 ` Miguel Ojeda
2026-03-17 18:14 ` Matthew Brost
2026-03-17 19:48 ` Daniel Almeida
2026-03-17 20:43 ` Boris Brezillon
2026-03-18 22:40 ` Matthew Brost
2026-03-19 9:57 ` Boris Brezillon
2026-03-22 6:43 ` Matthew Brost
2026-03-23 7:58 ` Matthew Brost
2026-03-23 10:06 ` Boris Brezillon
2026-03-23 17:11 ` Matthew Brost
2026-03-17 12:31 ` Danilo Krummrich
2026-03-17 14:25 ` Daniel Almeida
2026-03-17 14:33 ` Danilo Krummrich
2026-03-18 22:50 ` Matthew Brost
2026-03-17 8:47 ` Christian König
2026-03-17 14:55 ` Boris Brezillon
2026-03-18 23:28 ` Matthew Brost
2026-03-19 9:11 ` Boris Brezillon
2026-03-23 4:50 ` Matthew Brost
2026-03-23 9:55 ` Boris Brezillon
2026-03-23 17:08 ` Matthew Brost
2026-03-23 18:38 ` Matthew Brost
2026-03-24 9:23 ` Boris Brezillon
2026-03-24 16:06 ` Matthew Brost
2026-03-25 2:33 ` Matthew Brost
2026-03-24 8:49 ` Boris Brezillon
2026-03-24 16:51 ` Matthew Brost
2026-03-17 16:30 ` Shashank Sharma
2026-03-16 4:32 ` [RFC PATCH 03/12] drm/xe: Use WQ_MEM_WARN_ON_RECLAIM on all workqueues in the reclaim path Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 04/12] drm/xe: Issue GGTT invalidation under lock in ggtt_node_remove Matthew Brost
2026-03-26 5:45 ` Bhadane, Dnyaneshwar
2026-03-16 4:32 ` [RFC PATCH 05/12] drm/xe: Return fence from xe_sched_job_arm and adjust job references Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 06/12] drm/xe: Convert to DRM dep queue scheduler layer Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 07/12] drm/xe: Make scheduler message lock IRQ-safe Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 08/12] drm/xe: Rework exec queue object on top of DRM dep Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 09/12] drm/xe: Enable IRQ job put in " Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 10/12] drm/xe: Use DRM dep queue kill semantics Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 11/12] accel/amdxdna: Convert to drm_dep scheduler layer Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 12/12] drm/panthor: " Matthew Brost
2026-03-16 4:52 ` ✗ CI.checkpatch: warning for Introduce DRM dep queue Patchwork
2026-03-16 4:53 ` ✓ CI.KUnit: success " Patchwork
2026-03-16 5:28 ` ✓ Xe.CI.BAT: " Patchwork
2026-03-16 8:09 ` ✗ Xe.CI.FULL: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260316043255.226352-2-matthew.brost@intel.com \
--to=matthew.brost@intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-xe@lists.freedesktop.org \
--cc=jiangshanlai@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.