Intel-XE Archive on lore.kernel.org
* [PATCH] drm/xe/preempt_fence: enlarge the fence critical section
From: Matthew Auld @ 2024-04-18 14:46 UTC
  To: intel-xe; +Cc: Matthew Brost

It is really easy to introduce subtle deadlocks in
preempt_fence_work_func(), since we use a single global ordered-wq to
signal our preempt fences behind the scenes. Even after we have signalled
a particular fence, everything else in the callback should stay in the
fence critical section, since blocking in the callback will prevent
other published fences from signalling. If we enlarge the fence critical
section to cover the entire callback, then lockdep should be able to
understand this better, and complain if we grab a sensitive lock like
vm->lock, which is also held when waiting on preempt fences.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_preempt_fence.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c
index 7d50c6e89d8e..5b243b7feb59 100644
--- a/drivers/gpu/drm/xe/xe_preempt_fence.c
+++ b/drivers/gpu/drm/xe/xe_preempt_fence.c
@@ -23,11 +23,19 @@ static void preempt_fence_work_func(struct work_struct *w)
 		q->ops->suspend_wait(q);
 
 	dma_fence_signal(&pfence->base);
-	dma_fence_end_signalling(cookie);
-
+	/*
+	 * Opt for keeping everything in the fence critical section. This looks really strange since we
+	 * have just signalled the fence, however the preempt fences are all signalled via a single
+	 * global ordered-wq, therefore anything that happens in this callback can easily block
+	 * progress on the entire wq, which itself may prevent other published preempt fences from
+	 * ever signalling.  Therefore try to keep everything here in the callback in the fence
+	 * critical section. For example if something below grabs a scary lock like vm->lock,
+	 * lockdep should complain since we also hold that lock whilst waiting on preempt fences to
+	 * complete.
+	 */
 	xe_vm_queue_rebind_worker(q->vm);
-
 	xe_exec_queue_put(q);
+	dma_fence_end_signalling(cookie);
 }
 
 static const char *
-- 
2.44.0
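
The reordering in the diff above can be illustrated with a toy userspace
model. This is only a sketch under stated assumptions: every name in it
(toy_lock, toy_begin_signalling(), toy_worker_old() and so on) is
hypothetical and is not a kernel or lockdep API. It models just one idea
from the commit message: a checker that only complains about a
fence-sensitive lock while the fence critical section is open, so work done
after the section has been closed goes unchecked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy userspace model; every name below is hypothetical, not a kernel API. */
struct toy_lock {
	const char *name;
	bool held_while_waiting_on_fences; /* true for something like vm->lock */
};

struct toy_lock toy_vm_lock = { "vm->lock", true };

static int signalling_depth; /* > 0 while inside the fence critical section */
static int violations;       /* number of complaints raised so far */

void toy_begin_signalling(void)
{
	signalling_depth++;
}

void toy_end_signalling(void)
{
	assert(signalling_depth > 0);
	signalling_depth--;
}

/* Complain only if a fence-sensitive lock is taken inside the section. */
void toy_lock_acquire(const struct toy_lock *lock)
{
	if (signalling_depth > 0 && lock->held_while_waiting_on_fences) {
		violations++;
		fprintf(stderr, "toy-lockdep: %s taken in fence critical section\n",
			lock->name);
	}
}

/* Old ordering: the section ends before the post-signal work runs. */
int toy_worker_old(const struct toy_lock *lock)
{
	int before = violations;

	toy_begin_signalling();
	/* ... signal the fence ... */
	toy_end_signalling();
	toy_lock_acquire(lock); /* outside the section: goes unnoticed */
	return violations - before;
}

/* New ordering: the section covers the whole callback. */
int toy_worker_new(const struct toy_lock *lock)
{
	int before = violations;

	toy_begin_signalling();
	/* ... signal the fence ... */
	toy_lock_acquire(lock); /* inside the section: reported */
	toy_end_signalling();
	return violations - before;
}
```

Calling toy_worker_old(&toy_vm_lock) returns 0 complaints, while
toy_worker_new(&toy_vm_lock) returns 1. The real lockdep annotations track
lock dependencies rather than a simple flag, so this only mirrors the
ordering argument, not the actual mechanism.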


Thread overview: 19+ messages
2024-04-18 14:46 [PATCH] drm/xe/preempt_fence: enlarge the fence critical section Matthew Auld
2024-04-18 18:06 ` ✓ CI.Patch_applied: success for " Patchwork
2024-04-18 18:06 ` ✓ CI.checkpatch: " Patchwork
2024-04-18 18:11 ` ✓ CI.KUnit: " Patchwork
2024-04-18 18:22 ` ✓ CI.Build: " Patchwork
2024-04-18 18:32 ` ✓ CI.Hooks: " Patchwork
2024-04-18 18:35 ` ✓ CI.checksparse: " Patchwork
2024-04-18 19:40 ` ✗ CI.BAT: failure " Patchwork
2024-04-18 19:55 ` [PATCH] " Matthew Brost
2024-04-19  7:44   ` Matthew Auld
2024-04-19  7:43 ` ✓ CI.Patch_applied: success for drm/xe/preempt_fence: enlarge the fence critical section (rev2) Patchwork
2024-04-19  7:43 ` ✓ CI.checkpatch: " Patchwork
2024-04-19  7:45 ` ✓ CI.KUnit: " Patchwork
2024-04-19  7:57 ` ✓ CI.Build: " Patchwork
2024-04-19  8:00 ` ✓ CI.Hooks: " Patchwork
2024-04-19  8:01 ` ✓ CI.checksparse: " Patchwork
2024-04-19  8:31 ` ✓ CI.BAT: " Patchwork
2024-04-20  9:54 ` ✗ CI.FULL: failure for drm/xe/preempt_fence: enlarge the fence critical section Patchwork
2024-04-21  0:50 ` ✓ CI.FULL: success for drm/xe/preempt_fence: enlarge the fence critical section (rev2) Patchwork
