public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Matthew Brost <matthew.brost@intel.com>
To: Daniel Almeida <daniel.almeida@collabora.com>
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	"Boris Brezillon" <boris.brezillon@collabora.com>,
	"Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"David Airlie" <airlied@gmail.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Philipp Stanner" <phasta@kernel.org>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	linux-kernel@vger.kernel.org,
	"Sami Tolvanen" <samitolvanen@google.com>,
	"Jeffrey Vander Stoep" <jeffv@google.com>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Daniel Stone" <daniels@collabora.com>,
	"Alexandre Courbot" <acourbot@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	shashanks@nvidia.com, jajones@nvidia.com,
	"Eliot Courtney" <ecourtney@nvidia.com>,
	"Joel Fernandes" <joelagnelf@nvidia.com>,
	rust-for-linux <rust-for-linux@vger.kernel.org>
Subject: Re: [RFC PATCH 02/12] drm/dep: Add DRM dependency queue layer
Date: Mon, 16 Mar 2026 22:45:33 -0700	[thread overview]
Message-ID: <abjqfXERS6Xk4FAQ@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <7A8108C7-7CF0-4EA4-95ED-8003502DC35A@collabora.com>

On Mon, Mar 16, 2026 at 11:47:01PM -0300, Daniel Almeida wrote:
> (+cc a few other people + Rust-for-Linux ML)
> 
> Hi Matthew,
> 
> I agree with what Danilo said below, i.e.:  IMHO, with the direction that DRM
> is going, it is much more ergonomic to add a Rust component with a nice C
> interface than doing it the other way around.
>

Holy war? See my reply to Danilo — I’ll write this in Rust if needed,
but it’s not my first choice since I’m not yet a native speaker.
 
> > On 16 Mar 2026, at 01:32, Matthew Brost <matthew.brost@intel.com> wrote:
> > 
> > Diverging requirements between GPU drivers using firmware scheduling
> > and those using hardware scheduling have shown that drm_gpu_scheduler is
> > no longer sufficient for firmware-scheduled GPU drivers. The technical
> > debt, lack of memory-safety guarantees, absence of clear object-lifetime
> > rules, and numerous driver-specific hacks have rendered
> > drm_gpu_scheduler unmaintainable. It is time for a fresh design for
> > firmware-scheduled GPU drivers—one that addresses all of the
> > aforementioned shortcomings.
> > 
> > Add drm_dep, a lightweight GPU submission queue intended as a
> > replacement for drm_gpu_scheduler for firmware-managed GPU schedulers
> > (e.g. Xe, Panthor, AMDXDNA, PVR, Nouveau, Nova). Unlike
> > drm_gpu_scheduler, which separates the scheduler (drm_gpu_scheduler)
> > from the queue (drm_sched_entity) into two objects requiring external
> > coordination, drm_dep merges both roles into a single struct
> > drm_dep_queue. This eliminates the N:1 entity-to-scheduler mapping
> > that is unnecessary for firmware schedulers which manage their own
> > run-lists internally.
> > 
> > Unlike drm_gpu_scheduler, which relies on external locking and lifetime
> > management by the driver, drm_dep uses reference counting (kref) on both
> > queues and jobs to guarantee object lifetime safety. A job holds a queue
> 
> In a domain that has been plagued by lifetime issues, we really should be

Yes, drm sched is a mess. I’ve been suggesting we fix it for years and
have met pushback. This, however (drm dep), isn’t plagued by lifetime
issues — that’s the primary focus here.

> enforcing RAII for resource management instead of manual calls.
> 

You can do RAII in C - see cleanup.h. Clear object lifetimes and
ownership are what is important. Disciplined coding is the only to do
this regardless of language. RAII doesn't help with help with bad object
models / ownership / lifetime models either.

I don't buy the Rust solves everything argument but again non-native
speaker.

> > reference from init until its last put, and the queue holds a job reference
> > from dispatch until the put_job worker runs. This makes use-after-free
> > impossible even when completion arrives from IRQ context or concurrent
> > teardown is in flight.
> 
> It makes use-after-free impossible _if_ you’re careful. It is not a
> property of the type system, and incorrect code will compile just fine.
> 

Sure. If a driver puts a drm_dep object reference on a resource that
drm_dep owns, it will explode. That’s effectively putting a reference on
a resource the driver doesn’t own. A driver can write to any physical
memory and crash the system anyway, so I’m not really sure what we’re
talking about here. Rust doesn’t solve anything in this scenario — you
can always use an unsafe block and put a reference on a resource you
don’t own.

Object model, ownership, and lifetimes are what is important and that is
what drm dep is built around.

> > 
> > The core objects are:
> > 
> >  struct drm_dep_queue - a per-context submission queue owning an
> >    ordered submit workqueue, a TDR timeout workqueue, an SPSC job
> >    queue, and a pending-job list. Reference counted; drivers can embed
> >    it and provide a .release vfunc for RCU-safe teardown.
> > 
> >  struct drm_dep_job - a single unit of GPU work. Drivers embed this
> >    and provide a .release vfunc. Jobs carry an xarray of input
> >    dma_fence dependencies and produce a drm_dep_fence as their
> >    finished fence.
> > 
> >  struct drm_dep_fence - a dma_fence subclass wrapping an optional
> >    parent hardware fence. The finished fence is armed (sequence
> >    number assigned) before submission and signals when the hardware
> >    fence signals (or immediately on synchronous completion).
> > 
> > Job lifecycle:
> >  1. drm_dep_job_init() - allocate and initialise; job acquires a
> >     queue reference.
> >  2. drm_dep_job_add_dependency() and friends - register input fences;
> >     duplicates from the same context are deduplicated.
> >  3. drm_dep_job_arm() - assign sequence number, obtain finished fence.
> >  4. drm_dep_job_push() - submit to queue.
> 
> You cannot enforce this sequence easily in C code. Once again, we are trusting
> drivers that it is followed, but in Rust, you can simply reject code that does
> not follow this order at compile time.
> 

I don’t know Rust, but yes — you can enforce this in C. It’s called
lockdep and annotations. It’s not compile-time, but all of this is
strictly enforced. e.g., write some code that doesn't follow this and
report back if the kernel doesn't explode. It will, if doesn't I'll fix
it to complain.

> 
> > 
> > Submission paths under queue lock:
> >  - Bypass path: if DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED is set, the
> >    SPSC queue is empty, no dependencies are pending, and credits are
> >    available, the job is dispatched inline on the calling thread.
> >  - Queued path: job is pushed onto the SPSC queue and the run_job
> >    worker is kicked. The worker resolves remaining dependencies
> >    (installing wakeup callbacks for unresolved fences) before calling
> >    ops->run_job().
> > 
> > Credit-based throttling prevents hardware overflow: each job declares
> > a credit cost at init time; dispatch is deferred until sufficient
> > credits are available.
> 
> Why can’t we design an API where the driver can refuse jobs in
> ops->run_job() if there are no resources to run it? This would do away with the
> credit system that has been in place for quite a while. Has this approach been
> tried in the past?
> 

That seems possible if this is the preferred option. -EAGAIN is the way
to do this. I’m open to the idea, but we also need to weigh the cost of
converting drivers against the number of changes required.

Partial - reply with catch up the rest later.

Appreciate the feedback.

Matt

> 
> > 
> > Timeout Detection and Recovery (TDR): a per-queue delayed work item
> > fires when the head pending job exceeds q->job.timeout jiffies, calling
> > ops->timedout_job(). drm_dep_queue_trigger_timeout() forces immediate
> > expiry for device teardown.
> > 
> > IRQ-safe completion: queues flagged DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE
> > allow drm_dep_job_done() to be called from hardirq context (e.g. a
> > dma_fence callback). Dependency cleanup is deferred to process context
> > after ops->run_job() returns to avoid calling xa_destroy() from IRQ.
> > 
> > Zombie-state guard: workers use kref_get_unless_zero() on entry and
> > bail immediately if the queue refcount has already reached zero and
> > async teardown is in flight, preventing use-after-free.
> 
> In rust, when you queue work, you have to pass a reference-counted pointer
> (Arc<T>). We simply never have this problem in a Rust design. If there is work
> queued, the queue is alive.
> 
> By the way, why can’t we simply require synchronous teardowns?
> 
> > 
> > Teardown is always deferred to a module-private workqueue (dep_free_wq)
> > so that destroy_workqueue() is never called from within one of the
> > queue's own workers. Each queue holds a drm_dev_get() reference on its
> > owning struct drm_device, released as the final step of teardown via
> > drm_dev_put(). This prevents the driver module from being unloaded
> > while any queue is still alive without requiring a separate drain API.
> > 
> > Cc: Boris Brezillon <boris.brezillon@collabora.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Cc: Danilo Krummrich <dakr@kernel.org>
> > Cc: David Airlie <airlied@gmail.com>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Maxime Ripard <mripard@kernel.org>
> > Cc: Philipp Stanner <phasta@kernel.org>
> > Cc: Simona Vetter <simona@ffwll.ch>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > Assisted-by: GitHub Copilot:claude-sonnet-4.6
> > ---
> > drivers/gpu/drm/Kconfig             |    4 +
> > drivers/gpu/drm/Makefile            |    1 +
> > drivers/gpu/drm/dep/Makefile        |    5 +
> > drivers/gpu/drm/dep/drm_dep_fence.c |  406 +++++++
> > drivers/gpu/drm/dep/drm_dep_fence.h |   25 +
> > drivers/gpu/drm/dep/drm_dep_job.c   |  675 +++++++++++
> > drivers/gpu/drm/dep/drm_dep_job.h   |   13 +
> > drivers/gpu/drm/dep/drm_dep_queue.c | 1647 +++++++++++++++++++++++++++
> > drivers/gpu/drm/dep/drm_dep_queue.h |   31 +
> > include/drm/drm_dep.h               |  597 ++++++++++
> > 10 files changed, 3404 insertions(+)
> > create mode 100644 drivers/gpu/drm/dep/Makefile
> > create mode 100644 drivers/gpu/drm/dep/drm_dep_fence.c
> > create mode 100644 drivers/gpu/drm/dep/drm_dep_fence.h
> > create mode 100644 drivers/gpu/drm/dep/drm_dep_job.c
> > create mode 100644 drivers/gpu/drm/dep/drm_dep_job.h
> > create mode 100644 drivers/gpu/drm/dep/drm_dep_queue.c
> > create mode 100644 drivers/gpu/drm/dep/drm_dep_queue.h
> > create mode 100644 include/drm/drm_dep.h
> > 
> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > index 5386248e75b6..834f6e210551 100644
> > --- a/drivers/gpu/drm/Kconfig
> > +++ b/drivers/gpu/drm/Kconfig
> > @@ -276,6 +276,10 @@ config DRM_SCHED
> > tristate
> > depends on DRM
> > 
> > +config DRM_DEP
> > + tristate
> > + depends on DRM
> > +
> > # Separate option as not all DRM drivers use it
> > config DRM_PANEL_BACKLIGHT_QUIRKS
> > tristate
> > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> > index e97faabcd783..1ad87cc0e545 100644
> > --- a/drivers/gpu/drm/Makefile
> > +++ b/drivers/gpu/drm/Makefile
> > @@ -173,6 +173,7 @@ obj-y += clients/
> > obj-y += display/
> > obj-$(CONFIG_DRM_TTM) += ttm/
> > obj-$(CONFIG_DRM_SCHED) += scheduler/
> > +obj-$(CONFIG_DRM_DEP) += dep/
> > obj-$(CONFIG_DRM_RADEON)+= radeon/
> > obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/
> > obj-$(CONFIG_DRM_AMDGPU)+= amd/amdxcp/
> > diff --git a/drivers/gpu/drm/dep/Makefile b/drivers/gpu/drm/dep/Makefile
> > new file mode 100644
> > index 000000000000..335f1af46a7b
> > --- /dev/null
> > +++ b/drivers/gpu/drm/dep/Makefile
> > @@ -0,0 +1,5 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +drm_dep-y := drm_dep_queue.o drm_dep_job.o drm_dep_fence.o
> > +
> > +obj-$(CONFIG_DRM_DEP) += drm_dep.o
> > diff --git a/drivers/gpu/drm/dep/drm_dep_fence.c b/drivers/gpu/drm/dep/drm_dep_fence.c
> > new file mode 100644
> > index 000000000000..ae05b9077772
> > --- /dev/null
> > +++ b/drivers/gpu/drm/dep/drm_dep_fence.c
> > @@ -0,0 +1,406 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2026 Intel Corporation
> > + */
> > +
> > +/**
> > + * DOC: DRM dependency fence
> > + *
> > + * Each struct drm_dep_job has an associated struct drm_dep_fence that
> > + * provides a single dma_fence (@finished) signalled when the hardware
> > + * completes the job.
> > + *
> > + * The hardware fence returned by &drm_dep_queue_ops.run_job is stored as
> > + * @parent. @finished is chained to @parent via drm_dep_job_done_cb() and
> > + * is signalled once @parent signals (or immediately if run_job() returns
> > + * NULL or an error).
> 
> I thought this fence proxy mechanism was going away due to recent work being
> carried out by Christian?
> 
> > + *
> > + * Drivers should expose @finished as the out-fence for GPU work since it is
> > + * valid from the moment drm_dep_job_arm() returns, whereas the hardware fence
> > + * could be a compound fence, which is disallowed when installed into
> > + * drm_syncobjs or dma-resv.
> > + *
> > + * The fence uses the kernel's inline spinlock (NULL passed to dma_fence_init())
> > + * so no separate lock allocation is required.
> > + *
> > + * Deadline propagation is supported: if a consumer sets a deadline via
> > + * dma_fence_set_deadline(), it is forwarded to @parent when @parent is set.
> > + * If @parent has not been set yet the deadline is stored in @deadline and
> > + * forwarded at that point.
> > + *
> > + * Memory management: drm_dep_fence objects are allocated with kzalloc() and
> > + * freed via kfree_rcu() once the fence is released, ensuring safety with
> > + * RCU-protected fence accesses.
> > + */
> > +
> > +#include <linux/slab.h>
> > +#include <drm/drm_dep.h>
> > +#include "drm_dep_fence.h"
> > +
> > +/**
> > + * DRM_DEP_FENCE_FLAG_HAS_DEADLINE_BIT - a fence deadline hint has been set
> > + *
> > + * Set by the deadline callback on the finished fence to indicate a deadline
> > + * has been set which may need to be propagated to the parent hardware fence.
> > + */
> > +#define DRM_DEP_FENCE_FLAG_HAS_DEADLINE_BIT (DMA_FENCE_FLAG_USER_BITS + 1)
> > +
> > +/**
> > + * struct drm_dep_fence - fence tracking the completion of a dep job
> > + *
> > + * Contains a single dma_fence (@finished) that is signalled when the
> > + * hardware completes the job. The fence uses the kernel's inline_lock
> > + * (no external spinlock required).
> > + *
> > + * This struct is private to the drm_dep module; external code interacts
> > + * through the accessor functions declared in drm_dep_fence.h.
> > + */
> > +struct drm_dep_fence {
> > + /**
> > + * @finished: signalled when the job completes on hardware.
> > + *
> > + * Drivers should use this fence as the out-fence for a job since it
> > + * is available immediately upon drm_dep_job_arm().
> > + */
> > + struct dma_fence finished;
> > +
> > + /**
> > + * @deadline: deadline set on @finished which potentially needs to be
> > + * propagated to @parent.
> > + */
> > + ktime_t deadline;
> > +
> > + /**
> > + * @parent: The hardware fence returned by &drm_dep_queue_ops.run_job.
> > + *
> > + * @finished is signaled once @parent is signaled. The initial store is
> > + * performed via smp_store_release to synchronize with deadline handling.
> > + *
> > + * All readers must access this under the fence lock and take a reference to
> > + * it, as @parent is set to NULL under the fence lock when the drm_dep_fence
> > + * signals, and this drop also releases its internal reference.
> > + */
> > + struct dma_fence *parent;
> > +
> > + /**
> > + * @q: the queue this fence belongs to.
> > + */
> > + struct drm_dep_queue *q;
> > +};
> > +
> > +static const struct dma_fence_ops drm_dep_fence_ops;
> > +
> > +/**
> > + * to_drm_dep_fence() - cast a dma_fence to its enclosing drm_dep_fence
> > + * @f: dma_fence to cast
> > + *
> > + * Context: No context requirements (inline helper).
> > + * Return: pointer to the enclosing &drm_dep_fence.
> > + */
> > +static struct drm_dep_fence *to_drm_dep_fence(struct dma_fence *f)
> > +{
> > + return container_of(f, struct drm_dep_fence, finished);
> > +}
> > +
> > +/**
> > + * drm_dep_fence_set_parent() - store the hardware fence and propagate
> > + *   any deadline
> > + * @dfence: dep fence
> > + * @parent: hardware fence returned by &drm_dep_queue_ops.run_job, or NULL/error
> > + *
> > + * Stores @parent on @dfence under smp_store_release() so that a concurrent
> > + * drm_dep_fence_set_deadline() call sees the parent before checking the
> > + * deadline bit. If a deadline has already been set on @dfence->finished it is
> > + * forwarded to @parent immediately. Does nothing if @parent is NULL or an
> > + * error pointer.
> > + *
> > + * Context: Any context.
> > + */
> > +void drm_dep_fence_set_parent(struct drm_dep_fence *dfence,
> > +      struct dma_fence *parent)
> > +{
> > + if (IS_ERR_OR_NULL(parent))
> > + return;
> > +
> > + /*
> > + * smp_store_release() to ensure a thread racing us in
> > + * drm_dep_fence_set_deadline() sees the parent set before
> > + * it calls test_bit(HAS_DEADLINE_BIT).
> > + */
> > + smp_store_release(&dfence->parent, dma_fence_get(parent));
> > + if (test_bit(DRM_DEP_FENCE_FLAG_HAS_DEADLINE_BIT,
> > +     &dfence->finished.flags))
> > + dma_fence_set_deadline(parent, dfence->deadline);
> > +}
> > +
> > +/**
> > + * drm_dep_fence_finished() - signal the finished fence with a result
> > + * @dfence: dep fence to signal
> > + * @result: error code to set, or 0 for success
> > + *
> > + * Sets the fence error to @result if non-zero, then signals
> > + * @dfence->finished. Also removes parent visibility under the fence lock
> > + * and drops the parent reference. Dropping the parent here allows the
> > + * DRM dep fence to be completely decoupled from the DRM dep module.
> > + *
> > + * Context: Any context.
> > + */
> > +static void drm_dep_fence_finished(struct drm_dep_fence *dfence, int result)
> > +{
> > + struct dma_fence *parent;
> > + unsigned long flags;
> > +
> > + dma_fence_lock_irqsave(&dfence->finished, flags);
> > + if (result)
> > + dma_fence_set_error(&dfence->finished, result);
> > + dma_fence_signal_locked(&dfence->finished);
> > + parent = dfence->parent;
> > + dfence->parent = NULL;
> > + dma_fence_unlock_irqrestore(&dfence->finished, flags);
> > +
> > + dma_fence_put(parent);
> > +}
> 
> We should really try to move away from manual locks and unlocks.
> 
> > +
> > +static const char *drm_dep_fence_get_driver_name(struct dma_fence *fence)
> > +{
> > + return "drm_dep";
> > +}
> > +
> > +static const char *drm_dep_fence_get_timeline_name(struct dma_fence *f)
> > +{
> > + struct drm_dep_fence *dfence = to_drm_dep_fence(f);
> > +
> > + return dfence->q->name;
> > +}
> > +
> > +/**
> > + * drm_dep_fence_get_parent() - get a reference to the parent hardware fence
> > + * @dfence: dep fence to query
> > + *
> > + * Returns a new reference to @dfence->parent, or NULL if the parent has
> > + * already been cleared (i.e. @dfence->finished has signalled and the parent
> > + * reference was dropped under the fence lock).
> > + *
> > + * Uses smp_load_acquire() to pair with the smp_store_release() in
> > + * drm_dep_fence_set_parent(), ensuring that if we race a concurrent
> > + * drm_dep_fence_set_parent() call we observe the parent pointer only after
> > + * the store is fully visible — before set_parent() tests
> > + * %DRM_DEP_FENCE_FLAG_HAS_DEADLINE_BIT.
> > + *
> > + * Caller must hold the fence lock on @dfence->finished.
> > + *
> > + * Context: Any context, fence lock on @dfence->finished must be held.
> > + * Return: a new reference to the parent fence, or NULL.
> > + */
> > +static struct dma_fence *drm_dep_fence_get_parent(struct drm_dep_fence *dfence)
> > +{
> > + dma_fence_assert_held(&dfence->finished);
> 
> > +
> > + return dma_fence_get(smp_load_acquire(&dfence->parent));
> > +}
> > +
> > +/**
> > + * drm_dep_fence_set_deadline() - dma_fence_ops deadline callback
> > + * @f: fence on which the deadline is being set
> > + * @deadline: the deadline hint to apply
> > + *
> > + * Stores the earliest deadline under the fence lock, then propagates
> > + * it to the parent hardware fence via smp_load_acquire() to race
> > + * safely with drm_dep_fence_set_parent().
> > + *
> > + * Context: Any context.
> > + */
> > +static void drm_dep_fence_set_deadline(struct dma_fence *f, ktime_t deadline)
> > +{
> > + struct drm_dep_fence *dfence = to_drm_dep_fence(f);
> > + struct dma_fence *parent;
> > + unsigned long flags;
> > +
> > + dma_fence_lock_irqsave(f, flags);
> > +
> > + /* If we already have an earlier deadline, keep it: */
> > + if (test_bit(DRM_DEP_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
> > +    ktime_before(dfence->deadline, deadline)) {
> > + dma_fence_unlock_irqrestore(f, flags);
> > + return;
> > + }
> > +
> > + dfence->deadline = deadline;
> > + set_bit(DRM_DEP_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
> > +
> > + parent = drm_dep_fence_get_parent(dfence);
> > + dma_fence_unlock_irqrestore(f, flags);
> > +
> > + if (parent)
> > + dma_fence_set_deadline(parent, deadline);
> > +
> > + dma_fence_put(parent);
> > +}
> > +
> > +static const struct dma_fence_ops drm_dep_fence_ops = {
> > + .get_driver_name = drm_dep_fence_get_driver_name,
> > + .get_timeline_name = drm_dep_fence_get_timeline_name,
> > + .set_deadline = drm_dep_fence_set_deadline,
> > +};
> > +
> > +/**
> > + * drm_dep_fence_alloc() - allocate a dep fence
> > + *
> > + * Allocates a &drm_dep_fence with kzalloc() without initialising the
> > + * dma_fence. Call drm_dep_fence_init() to fully initialise it.
> > + *
> > + * Context: Process context.
> > + * Return: new &drm_dep_fence on success, NULL on allocation failure.
> > + */
> > +struct drm_dep_fence *drm_dep_fence_alloc(void)
> > +{
> > + return kzalloc_obj(struct drm_dep_fence);
> > +}
> > +
> > +/**
> > + * drm_dep_fence_init() - initialise the dma_fence inside a dep fence
> > + * @dfence: dep fence to initialise
> > + * @q: queue the owning job belongs to
> > + *
> > + * Initialises @dfence->finished using the context and sequence number from @q.
> > + * Passes NULL as the lock so the fence uses its inline spinlock.
> > + *
> > + * Context: Any context.
> > + */
> > +void drm_dep_fence_init(struct drm_dep_fence *dfence, struct drm_dep_queue *q)
> > +{
> > + u32 seq = ++q->fence.seqno;
> > +
> > + /*
> > + * XXX: Inline fence hazard: currently all expected users of DRM dep
> > + * hardware fences have a unique lockdep class. If that ever changes,
> > + * we will need to assign a unique lockdep class here so lockdep knows
> > + * this fence is allowed to nest with driver hardware fences.
> > + */
> > +
> > + dfence->q = q;
> > + dma_fence_init(&dfence->finished, &drm_dep_fence_ops,
> > +       NULL, q->fence.context, seq);
> > +}
> > +
> > +/**
> > + * drm_dep_fence_cleanup() - release a dep fence at job teardown
> > + * @dfence: dep fence to clean up
> > + *
> > + * Called from drm_dep_job_fini(). If the dep fence was armed (refcount > 0)
> > + * it is released via dma_fence_put() and will be freed by the RCU release
> > + * callback once all waiters have dropped their references. If it was never
> > + * armed it is freed directly with kfree().
> > + *
> > + * Context: Any context.
> > + */
> > +void drm_dep_fence_cleanup(struct drm_dep_fence *dfence)
> > +{
> > + if (drm_dep_fence_is_armed(dfence))
> > + dma_fence_put(&dfence->finished);
> > + else
> > + kfree(dfence);
> > +}
> > +
> > +/**
> > + * drm_dep_fence_is_armed() - check whether the fence has been armed
> > + * @dfence: dep fence to check
> > + *
> > + * Returns true if drm_dep_job_arm() has been called, i.e. @dfence->finished
> > + * has been initialised and its reference count is non-zero.  Used by
> > + * assertions to enforce correct job lifecycle ordering (arm before push,
> > + * add_dependency before arm).
> > + *
> > + * Context: Any context.
> > + * Return: true if the fence is armed, false otherwise.
> > + */
> > +bool drm_dep_fence_is_armed(struct drm_dep_fence *dfence)
> > +{
> > + return !!kref_read(&dfence->finished.refcount);
> > +}
> 
> > +
> > +/**
> > + * drm_dep_fence_is_finished() - test whether the finished fence has signalled
> > + * @dfence: dep fence to check
> > + *
> > + * Uses dma_fence_test_signaled_flag() to read %DMA_FENCE_FLAG_SIGNALED_BIT
> > + * directly without invoking the fence's ->signaled() callback or triggering
> > + * any signalling side-effects.
> > + *
> > + * Context: Any context.
> > + * Return: true if @dfence->finished has been signalled, false otherwise.
> > + */
> > +bool drm_dep_fence_is_finished(struct drm_dep_fence *dfence)
> > +{
> > + return dma_fence_test_signaled_flag(&dfence->finished);
> > +}
> > +
> > +/**
> > + * drm_dep_fence_is_complete() - test whether the job has completed
> > + * @dfence: dep fence to check
> > + *
> > + * Takes the fence lock on @dfence->finished and calls
> > + * drm_dep_fence_get_parent() to safely obtain a reference to the parent
> > + * hardware fence — or NULL if the parent has already been cleared after
> > + * signalling.  Calls dma_fence_is_signaled() on @parent outside the lock,
> > + * which may invoke the fence's ->signaled() callback and trigger signalling
> > + * side-effects if the fence has completed but the signalled flag has not yet
> > + * been set.  The finished fence is tested via dma_fence_test_signaled_flag(),
> > + * without side-effects.
> > + *
> > + * May only be called on a stopped queue (see drm_dep_queue_is_stopped()).
> > + *
> > + * Context: Process context. The queue must be stopped before calling this.
> > + * Return: true if the job is complete, false otherwise.
> > + */
> > +bool drm_dep_fence_is_complete(struct drm_dep_fence *dfence)
> > +{
> > + struct dma_fence *parent;
> > + unsigned long flags;
> > + bool complete;
> > +
> > + dma_fence_lock_irqsave(&dfence->finished, flags);
> > + parent = drm_dep_fence_get_parent(dfence);
> > + dma_fence_unlock_irqrestore(&dfence->finished, flags);
> > +
> > + complete = (parent && dma_fence_is_signaled(parent)) ||
> > + dma_fence_test_signaled_flag(&dfence->finished);
> > +
> > + dma_fence_put(parent);
> > +
> > + return complete;
> > +}
> > +
> > +/**
> > + * drm_dep_fence_to_dma() - return the finished dma_fence for a dep fence
> > + * @dfence: dep fence to query
> > + *
> > + * No reference is taken; the caller must hold its own reference to the owning
> > + * &drm_dep_job for the duration of the access.
> > + *
> > + * Context: Any context.
> > + * Return: the finished &dma_fence.
> > + */
> > +struct dma_fence *drm_dep_fence_to_dma(struct drm_dep_fence *dfence)
> > +{
> > + return &dfence->finished;
> > +}
> > +
> > +/**
> > + * drm_dep_fence_done() - signal the finished fence on job completion
> > + * @dfence: dep fence to signal
> > + * @result: job error code, or 0 on success
> > + *
> > + * Gets a temporary reference to @dfence->finished to guard against a racing
> > + * last-put, signals the fence with @result, then drops the temporary
> > + * reference. Called from drm_dep_job_done() in the queue core when a
> > + * hardware completion callback fires or when run_job() returns immediately.
> > + *
> > + * Context: Any context.
> > + */
> > +void drm_dep_fence_done(struct drm_dep_fence *dfence, int result)
> > +{
> > + dma_fence_get(&dfence->finished);
> > + drm_dep_fence_finished(dfence, result);
> > + dma_fence_put(&dfence->finished);
> > +}
> 
> Proper refcounting is automated (and enforced) in Rust.
> 
> > diff --git a/drivers/gpu/drm/dep/drm_dep_fence.h b/drivers/gpu/drm/dep/drm_dep_fence.h
> > new file mode 100644
> > index 000000000000..65a1582f858b
> > --- /dev/null
> > +++ b/drivers/gpu/drm/dep/drm_dep_fence.h
> > @@ -0,0 +1,25 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2026 Intel Corporation
> > + */
> > +
> > +#ifndef _DRM_DEP_FENCE_H_
> > +#define _DRM_DEP_FENCE_H_
> > +
> > +#include <linux/dma-fence.h>
> > +
> > +struct drm_dep_fence;
> > +struct drm_dep_queue;
> > +
> > +struct drm_dep_fence *drm_dep_fence_alloc(void);
> > +void drm_dep_fence_init(struct drm_dep_fence *dfence, struct drm_dep_queue *q);
> > +void drm_dep_fence_cleanup(struct drm_dep_fence *dfence);
> > +void drm_dep_fence_set_parent(struct drm_dep_fence *dfence,
> > +      struct dma_fence *parent);
> > +void drm_dep_fence_done(struct drm_dep_fence *dfence, int result);
> > +bool drm_dep_fence_is_armed(struct drm_dep_fence *dfence);
> > +bool drm_dep_fence_is_finished(struct drm_dep_fence *dfence);
> > +bool drm_dep_fence_is_complete(struct drm_dep_fence *dfence);
> > +struct dma_fence *drm_dep_fence_to_dma(struct drm_dep_fence *dfence);
> > +
> > +#endif /* _DRM_DEP_FENCE_H_ */
> > diff --git a/drivers/gpu/drm/dep/drm_dep_job.c b/drivers/gpu/drm/dep/drm_dep_job.c
> > new file mode 100644
> > index 000000000000..2d012b29a5fc
> > --- /dev/null
> > +++ b/drivers/gpu/drm/dep/drm_dep_job.c
> > @@ -0,0 +1,675 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright 2015 Advanced Micro Devices, Inc.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > + * OTHER DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright © 2026 Intel Corporation
> > + */
> > +
> > +/**
> > + * DOC: DRM dependency job
> > + *
> > + * A struct drm_dep_job represents a single unit of GPU work associated with
> > + * a struct drm_dep_queue. The lifecycle of a job is:
> > + *
> > + * 1. **Allocation**: the driver allocates memory for the job (typically by
> > + *    embedding struct drm_dep_job in a larger structure) and calls
> > + *    drm_dep_job_init() to initialise it. On success the job holds one
> > + *    kref reference and a reference to its queue.
> > + *
> > + * 2. **Dependency collection**: the driver calls drm_dep_job_add_dependency(),
> > + *    drm_dep_job_add_syncobj_dependency(), drm_dep_job_add_resv_dependencies(),
> > + *    or drm_dep_job_add_implicit_dependencies() to register dma_fence objects
> > + *    that must be signalled before the job can run. Duplicate fences from the
> > + *    same fence context are deduplicated automatically.
> > + *
> > + * 3. **Arming**: drm_dep_job_arm() initialises the job's finished fence,
> > + *    consuming a sequence number from the queue. After arming,
> > + *    drm_dep_job_finished_fence() returns a valid fence that may be passed to
> > + *    userspace or used as a dependency by other jobs.
> > + *
> > + * 4. **Submission**: drm_dep_job_push() submits the job to the queue. The
> > + *    queue takes a reference that it holds until the job's finished fence
> > + *    signals and the job is freed by the put_job worker.
> > + *
> > + * 5. **Completion**: when the job's hardware work finishes its finished fence
> > + *    is signalled and drm_dep_job_put() is called by the queue. The driver
> > + *    must release any driver-private resources in &drm_dep_job_ops.release.
> > + *
> > + * Reference counting uses drm_dep_job_get() / drm_dep_job_put(). The
> > + * internal drm_dep_job_fini() tears down the dependency xarray and fence
> > + * objects before the driver's release callback is invoked.
> > + */
> > +
> > +#include <linux/dma-resv.h>
> > +#include <linux/kref.h>
> > +#include <linux/slab.h>
> > +#include <drm/drm_dep.h>
> > +#include <drm/drm_file.h>
> > +#include <drm/drm_gem.h>
> > +#include <drm/drm_syncobj.h>
> > +#include "drm_dep_fence.h"
> > +#include "drm_dep_job.h"
> > +#include "drm_dep_queue.h"
> > +
> > +/**
> > + * drm_dep_job_init() - initialise a dep job
> > + * @job: dep job to initialise
> > + * @args: initialisation arguments
> > + *
> > + * Initialises @job with the queue, ops and credit count from @args.  Acquires
> > + * a reference to @args->q via drm_dep_queue_get(); this reference is held for
> > + * the lifetime of the job and released by drm_dep_job_release() when the last
> > + * job reference is dropped.
> > + *
> > + * Resources are released automatically when the last reference is dropped
> > + * via drm_dep_job_put(), which must be called to release the job; drivers
> > + * must not free the job directly.
> 
> Again, can’t enforce that in C.
> 
> > + *
> > + * Context: Process context. Allocates memory with GFP_KERNEL.
> > + * Return: 0 on success, -%EINVAL if credits is 0,
> > + *   -%ENOMEM on fence allocation failure.
> > + */
> > +int drm_dep_job_init(struct drm_dep_job *job,
> > +     const struct drm_dep_job_init_args *args)
> > +{
> > + if (unlikely(!args->credits)) {
> > + pr_err("drm_dep: %s: credits cannot be 0\n", __func__);
> > + return -EINVAL;
> > + }
> > +
> > + memset(job, 0, sizeof(*job));
> > +
> > + job->dfence = drm_dep_fence_alloc();
> > + if (!job->dfence)
> > + return -ENOMEM;
> > +
> > + job->ops = args->ops;
> > + job->q = drm_dep_queue_get(args->q);
> > + job->credits = args->credits;
> > +
> > + kref_init(&job->refcount);
> > + xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
> > + INIT_LIST_HEAD(&job->pending_link);
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_init);
> > +
> > +/**
> > + * drm_dep_job_drop_dependencies() - release all input dependency fences
> > + * @job: dep job whose dependency xarray to drain
> > + *
> > + * Walks @job->dependencies, puts each fence, and destroys the xarray.
> > + * Any slots still holding a %DRM_DEP_JOB_FENCE_PREALLOC sentinel —
> > + * i.e. slots that were pre-allocated but never replaced — are silently
> > + * skipped; the sentinel carries no reference.  Called from
> > + * drm_dep_queue_run_job() in process context immediately after
> > + * @ops->run_job() returns, before the final drm_dep_job_put().  Releasing
> > + * dependencies here — while still in process context — avoids calling
> > + * xa_destroy() from IRQ context if the job's last reference is later
> > + * dropped from a dma_fence callback.
> > + *
> > + * Context: Process context.
> > + */
> > +void drm_dep_job_drop_dependencies(struct drm_dep_job *job)
> > +{
> > + struct dma_fence *fence;
> > + unsigned long index;
> > +
> > + xa_for_each(&job->dependencies, index, fence) {
> > + if (unlikely(fence == DRM_DEP_JOB_FENCE_PREALLOC))
> > + continue;
> > + dma_fence_put(fence);
> > + }
> > + xa_destroy(&job->dependencies);
> > +}
> 
> This is automated in Rust. You also can’t “forget” to call this.
> 
> > +
> > +/**
> > + * drm_dep_job_fini() - clean up a dep job
> > + * @job: dep job to clean up
> > + *
> > + * Cleans up the dep fence and drops the queue reference held by @job.
> > + *
> > + * If the job was never armed (e.g. init failed before drm_dep_job_arm()),
> > + * the dependency xarray is also released here.  For armed jobs the xarray
> > + * has already been drained by drm_dep_job_drop_dependencies() in process
> > + * context immediately after run_job(), so it is left untouched to avoid
> > + * calling xa_destroy() from IRQ context.
> > + *
> > + * Warns if @job is still linked on the queue's pending list, which would
> > + * indicate a bug in the teardown ordering.
> > + *
> > + * Context: Any context.
> > + */
> > +static void drm_dep_job_fini(struct drm_dep_job *job)
> > +{
> > + bool armed = drm_dep_fence_is_armed(job->dfence);
> > +
> > + WARN_ON(!list_empty(&job->pending_link));
> > +
> > + drm_dep_fence_cleanup(job->dfence);
> > + job->dfence = NULL;
> > +
> > + /*
> > + * Armed jobs have their dependencies drained by
> > + * drm_dep_job_drop_dependencies() in process context after run_job().
> > + * Skip here to avoid calling xa_destroy() from IRQ context.
> > + */
> > + if (!armed)
> > + drm_dep_job_drop_dependencies(job);
> > +}
> 
> Same here.
> 
> > +
> > +/**
> > + * drm_dep_job_get() - acquire a reference to a dep job
> > + * @job: dep job to acquire a reference on, or NULL
> > + *
> > + * Context: Any context.
> > + * Return: @job with an additional reference held, or NULL if @job is NULL.
> > + */
> > +struct drm_dep_job *drm_dep_job_get(struct drm_dep_job *job)
> > +{
> > + if (job)
> > + kref_get(&job->refcount);
> > + return job;
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_get);
> > +
> 
> Same here.
> 
> > +/**
> > + * drm_dep_job_release() - kref release callback for a dep job
> > + * @kref: kref embedded in the dep job
> > + *
> > + * Calls drm_dep_job_fini(), then invokes &drm_dep_job_ops.release if set,
> > + * otherwise frees @job with kfree().  Finally, releases the queue reference
> > + * that was acquired by drm_dep_job_init() via drm_dep_queue_put().  The
> > + * queue put is performed last to ensure no queue state is accessed after
> > + * the job memory is freed.
> > + *
> > + * Context: Any context if %DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE is set on the
> > + *   job's queue; otherwise process context only, as the release callback may
> > + *   sleep.
> > + */
> > +static void drm_dep_job_release(struct kref *kref)
> > +{
> > + struct drm_dep_job *job =
> > + container_of(kref, struct drm_dep_job, refcount);
> > + struct drm_dep_queue *q = job->q;
> > +
> > + drm_dep_job_fini(job);
> > +
> > + if (job->ops && job->ops->release)
> > + job->ops->release(job);
> > + else
> > + kfree(job);
> > +
> > + drm_dep_queue_put(q);
> > +}
> 
> Same here.
> 
> > +
> > +/**
> > + * drm_dep_job_put() - release a reference to a dep job
> > + * @job: dep job to release a reference on, or NULL
> > + *
> > + * When the last reference is dropped, calls &drm_dep_job_ops.release if set,
> > + * otherwise frees @job with kfree(). Does nothing if @job is NULL.
> > + *
> > + * Context: Any context if %DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE is set on the
> > + *   job's queue; otherwise process context only, as the release callback may
> > + *   sleep.
> > + */
> > +void drm_dep_job_put(struct drm_dep_job *job)
> > +{
> > + if (job)
> > + kref_put(&job->refcount, drm_dep_job_release);
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_put);
> > +
> 
> Same here.
> 
> > +/**
> > + * drm_dep_job_arm() - arm a dep job for submission
> > + * @job: dep job to arm
> > + *
> > + * Initialises the finished fence on @job->dfence, assigning
> > + * it a sequence number from the job's queue. Must be called after
> > + * drm_dep_job_init() and before drm_dep_job_push(). Once armed,
> > + * drm_dep_job_finished_fence() returns a valid fence that may be passed to
> > + * userspace or used as a dependency by other jobs.
> > + *
> > + * Begins the DMA fence signalling path via dma_fence_begin_signalling().
> > + * After this point, memory allocations that could trigger reclaim are
> > + * forbidden; lockdep enforces this. arm() must always be paired with
> > + * drm_dep_job_push(); lockdep also enforces this pairing.
> > + *
> > + * Warns if the job has already been armed.
> > + *
> > + * Context: Process context if %DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED is set
> > + *   (takes @q->sched.lock, a mutex); any context otherwise. DMA fence signaling
> > + *   path.
> > + */
> > +void drm_dep_job_arm(struct drm_dep_job *job)
> > +{
> > + drm_dep_queue_push_job_begin(job->q);
> > + WARN_ON(drm_dep_fence_is_armed(job->dfence));
> > + drm_dep_fence_init(job->dfence, job->q);
> > + job->signalling_cookie = dma_fence_begin_signalling();
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_arm);
> > +
> > +/**
> > + * drm_dep_job_push() - submit a job to its queue for execution
> > + * @job: dep job to push
> > + *
> > + * Submits @job to the queue it was initialised with. Must be called after
> > + * drm_dep_job_arm(). Acquires a reference on @job on behalf of the queue,
> > + * held until the queue is fully done with it. The reference is released
> > + * directly in the finished-fence dma_fence callback for queues with
> > + * %DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE (where drm_dep_job_done() may run
> > + * from hardirq context), or via the put_job work item on the submit
> > + * workqueue otherwise.
> > + *
> > + * Ends the DMA fence signalling path begun by drm_dep_job_arm() via
> > + * dma_fence_end_signalling(). This must be paired with arm(); lockdep
> > + * enforces the pairing.
> > + *
> > + * Once pushed, &drm_dep_queue_ops.run_job is guaranteed to be called for
> > + * @job exactly once, even if the queue is killed or torn down before the
> > + * job reaches the head of the queue. Drivers can use this guarantee to
> > + * perform bookkeeping cleanup; the actual backend operation should be
> > + * skipped when drm_dep_queue_is_killed() returns true.
> > + *
> > + * If the queue does not support the bypass path, the job is pushed directly
> > + * onto the SPSC submission queue via drm_dep_queue_push_job() without holding
> > + * @q->sched.lock. Otherwise, @q->sched.lock is taken and the job is either
> > + * run immediately via drm_dep_queue_run_job() if it qualifies for bypass, or
> > + * enqueued via drm_dep_queue_push_job() for dispatch by the run_job work item.
> > + *
> > + * Warns if the job has not been armed.
> > + *
> > + * Context: Process context if %DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED is set
> > + *   (takes @q->sched.lock, a mutex); any context otherwise. DMA fence signaling
> > + *   path.
> > + */
> > +void drm_dep_job_push(struct drm_dep_job *job)
> > +{
> > + struct drm_dep_queue *q = job->q;
> > +
> > + WARN_ON(!drm_dep_fence_is_armed(job->dfence));
> > +
> > + drm_dep_job_get(job);
> > +
> > + if (!(q->sched.flags & DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED)) {
> > + drm_dep_queue_push_job(q, job);
> > + dma_fence_end_signalling(job->signalling_cookie);
> 
> Signaling is enforced in a more thorough way in Rust. I’ll expand on this later in this patch.
> 
> > + drm_dep_queue_push_job_end(job->q);
> > + return;
> > + }
> > +
> > + scoped_guard(mutex, &q->sched.lock) {
> > + if (drm_dep_queue_can_job_bypass(q, job))
> > + drm_dep_queue_run_job(q, job);
> > + else
> > + drm_dep_queue_push_job(q, job);
> > + }
> > +
> > + dma_fence_end_signalling(job->signalling_cookie);
> > + drm_dep_queue_push_job_end(job->q);
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_push);
> > +
> > +/**
> > + * drm_dep_job_add_dependency() - adds the fence as a job dependency
> > + * @job: dep job to add the dependencies to
> > + * @fence: the dma_fence to add to the list of dependencies, or
> > + *         %DRM_DEP_JOB_FENCE_PREALLOC to reserve a slot for later.
> > + *
> > + * Note that @fence is consumed in both the success and error cases (except
> > + * when @fence is %DRM_DEP_JOB_FENCE_PREALLOC, which carries no reference).
> > + *
> > + * Signalled fences and fences belonging to the same queue as @job (i.e. where
> > + * fence->context matches the queue's finished fence context) are silently
> > + * dropped; the job need not wait on its own queue's output.
> > + *
> > + * Warns if the job has already been armed (dependencies must be added before
> > + * drm_dep_job_arm()).
> > + *
> > + * **Pre-allocation pattern**
> > + *
> > + * When multiple jobs across different queues must be prepared and submitted
> > + * together in a single atomic commit — for example, where job A's finished
> > + * fence is an input dependency of job B — all jobs must be armed and pushed
> > + * within a single dma_fence_begin_signalling() / dma_fence_end_signalling()
> > + * region.  Once that region has started no memory allocation is permitted.
> > + *
> > + * To handle this, pass %DRM_DEP_JOB_FENCE_PREALLOC during the preparation
> > + * phase (before arming any job, while GFP_KERNEL allocation is still allowed)
> > + * to pre-allocate a slot in @job->dependencies.  The slot index assigned by
> > + * the underlying xarray must be tracked by the caller separately (e.g. it is
> > + * always index 0 when the dependency array is empty, as Xe relies on).
> > + * After all jobs have been armed and the finished fences are available, call
> > + * drm_dep_job_replace_dependency() with that index and the real fence.
> > + * drm_dep_job_replace_dependency() uses GFP_NOWAIT internally and may be
> > + * called from atomic or signalling context.
> > + *
> > + * The sentinel slot is never skipped by the signalled-fence fast-path,
> > + * ensuring a slot is always allocated even when the real fence is not yet
> > + * known.
> > + *
> > + * **Example: bind job feeding TLB invalidation jobs**
> > + *
> > + * Consider a GPU with separate queues for page-table bind operations and for
> > + * TLB invalidation.  A single atomic commit must:
> > + *
> > + *  1. Run a bind job that modifies page tables.
> > + *  2. Run one TLB-invalidation job per MMU that depends on the bind
> > + *     completing, so stale translations are flushed before the engines
> > + *     continue.
> > + *
> > + * Because all jobs must be armed and pushed inside a signalling region (where
> > + * GFP_KERNEL is forbidden), pre-allocate slots before entering the region::
> > + *
> > + *   // Phase 1 — process context, GFP_KERNEL allowed
> > + *   drm_dep_job_init(bind_job, bind_queue, ops);
> > + *   for_each_mmu(mmu) {
> > + *       drm_dep_job_init(tlb_job[mmu], tlb_queue[mmu], ops);
> > + *       // Pre-allocate slot at index 0; real fence not available yet
> > + *       drm_dep_job_add_dependency(tlb_job[mmu], DRM_DEP_JOB_FENCE_PREALLOC);
> > + *   }
> > + *
> > + *   // Phase 2 — inside signalling region, no GFP_KERNEL
> > + *   dma_fence_begin_signalling();
> > + *   drm_dep_job_arm(bind_job);
> > + *   for_each_mmu(mmu) {
> > + *       // Swap sentinel for bind job's finished fence
> > + *       drm_dep_job_replace_dependency(tlb_job[mmu], 0,
> > + *                                      dma_fence_get(bind_job->finished));
> > + *       drm_dep_job_arm(tlb_job[mmu]);
> > + *   }
> > + *   drm_dep_job_push(bind_job);
> > + *   for_each_mmu(mmu)
> > + *       drm_dep_job_push(tlb_job[mmu]);
> > + *   dma_fence_end_signalling();
> > + *
> > + * Context: Process context. May allocate memory with GFP_KERNEL.
> > + * Return: If fence == DRM_DEP_JOB_FENCE_PREALLOC index of allocation on
> > + * success, else 0 on success, or a negative error code.
> > + */
> 
> > +int drm_dep_job_add_dependency(struct drm_dep_job *job, struct dma_fence *fence)
> > +{
> > + struct drm_dep_queue *q = job->q;
> > + struct dma_fence *entry;
> > + unsigned long index;
> > + u32 id = 0;
> > + int ret;
> > +
> > + WARN_ON(drm_dep_fence_is_armed(job->dfence));
> > + might_alloc(GFP_KERNEL);
> > +
> > + if (!fence)
> > + return 0;
> > +
> > + if (fence == DRM_DEP_JOB_FENCE_PREALLOC)
> > + goto add_fence;
> > +
> > + /*
> > + * Ignore signalled fences or fences from our own queue — finished
> > + * fences use q->fence.context.
> > + */
> > + if (dma_fence_test_signaled_flag(fence) ||
> > +    fence->context == q->fence.context) {
> > + dma_fence_put(fence);
> > + return 0;
> > + }
> > +
> > + /* Deduplicate if we already depend on a fence from the same context.
> > + * This lets the size of the array of deps scale with the number of
> > + * engines involved, rather than the number of BOs.
> > + */
> > + xa_for_each(&job->dependencies, index, entry) {
> > + if (entry == DRM_DEP_JOB_FENCE_PREALLOC ||
> > +    entry->context != fence->context)
> > + continue;
> > +
> > + if (dma_fence_is_later(fence, entry)) {
> > + dma_fence_put(entry);
> > + xa_store(&job->dependencies, index, fence, GFP_KERNEL);
> > + } else {
> > + dma_fence_put(fence);
> > + }
> > + return 0;
> > + }
> > +
> > +add_fence:
> > + ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b,
> > +       GFP_KERNEL);
> > + if (ret != 0) {
> > + if (fence != DRM_DEP_JOB_FENCE_PREALLOC)
> > + dma_fence_put(fence);
> > + return ret;
> > + }
> > +
> > + return (fence == DRM_DEP_JOB_FENCE_PREALLOC) ? id : 0;
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_add_dependency);
> > +
> > +/**
> > + * drm_dep_job_replace_dependency() - replace a pre-allocated dependency slot
> > + * @job: dep job to update
> > + * @index: xarray index of the slot to replace, as returned when the sentinel
> > + *         was originally inserted via drm_dep_job_add_dependency()
> > + * @fence: the real dma_fence to store; its reference is always consumed
> > + *
> > + * Replaces the %DRM_DEP_JOB_FENCE_PREALLOC sentinel at @index in
> > + * @job->dependencies with @fence.  The slot must have been pre-allocated by
> > + * passing %DRM_DEP_JOB_FENCE_PREALLOC to drm_dep_job_add_dependency(); the
> > + * existing entry is asserted to be the sentinel.
> > + *
> > + * This is the second half of the pre-allocation pattern described in
> > + * drm_dep_job_add_dependency().  It is intended to be called inside a
> > + * dma_fence_begin_signalling() / dma_fence_end_signalling() region where
> > + * memory allocation with GFP_KERNEL is forbidden.  It uses GFP_NOWAIT
> > + * internally so it is safe to call from atomic or signalling context, but
> > + * since the slot has been pre-allocated no actual memory allocation occurs.
> > + *
> > + * If @fence is already signalled the slot is erased rather than storing a
> > + * redundant dependency.  The successful store is asserted — if the store
> > + * fails it indicates a programming error (slot index out of range or
> > + * concurrent modification).
> > + *
> > + * Must be called before drm_dep_job_arm(). @fence is consumed in all cases.
> 
> Can’t enforce this in C. Also, how is the fence “consumed” ? You can’t enforce that
> the user can’t access the fence anymore after this function returns, like we can do
> at compile time in Rust.
> 
> > + *
> > + * Context: Any context. DMA fence signaling path.
> > + */
> > +void drm_dep_job_replace_dependency(struct drm_dep_job *job, u32 index,
> > +    struct dma_fence *fence)
> > +{
> > + WARN_ON(xa_load(&job->dependencies, index) !=
> > + DRM_DEP_JOB_FENCE_PREALLOC);
> > +
> > + if (dma_fence_test_signaled_flag(fence)) {
> > + xa_erase(&job->dependencies, index);
> > + dma_fence_put(fence);
> > + return;
> > + }
> > +
> > + if (WARN_ON(xa_is_err(xa_store(&job->dependencies, index, fence,
> > +       GFP_NOWAIT)))) {
> > + dma_fence_put(fence);
> > + return;
> > + }
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_replace_dependency);
> > +
> > +/**
> > + * drm_dep_job_add_syncobj_dependency() - adds a syncobj's fence as a
> > + *   job dependency
> > + * @job: dep job to add the dependencies to
> > + * @file: drm file private pointer
> > + * @handle: syncobj handle to lookup
> > + * @point: timeline point
> > + *
> > + * This adds the fence matching the given syncobj to @job.
> > + *
> > + * Context: Process context.
> > + * Return: 0 on success, or a negative error code.
> > + */
> > +int drm_dep_job_add_syncobj_dependency(struct drm_dep_job *job,
> > +       struct drm_file *file, u32 handle,
> > +       u32 point)
> > +{
> > + struct dma_fence *fence;
> > + int ret;
> > +
> > + ret = drm_syncobj_find_fence(file, handle, point, 0, &fence);
> > + if (ret)
> > + return ret;
> > +
> > + return drm_dep_job_add_dependency(job, fence);
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_add_syncobj_dependency);
> > +
> > +/**
> > + * drm_dep_job_add_resv_dependencies() - add all fences from the resv to the job
> > + * @job: dep job to add the dependencies to
> > + * @resv: the dma_resv object to get the fences from
> > + * @usage: the dma_resv_usage to use to filter the fences
> > + *
> > + * This adds all fences matching the given usage from @resv to @job.
> > + * Must be called with the @resv lock held.
> > + *
> > + * Context: Process context.
> > + * Return: 0 on success, or a negative error code.
> > + */
> > +int drm_dep_job_add_resv_dependencies(struct drm_dep_job *job,
> > +      struct dma_resv *resv,
> > +      enum dma_resv_usage usage)
> > +{
> > + struct dma_resv_iter cursor;
> > + struct dma_fence *fence;
> > + int ret;
> > +
> > + dma_resv_assert_held(resv);
> > +
> > + dma_resv_for_each_fence(&cursor, resv, usage, fence) {
> > + /*
> > + * As drm_dep_job_add_dependency always consumes the fence
> > + * reference (even when it fails), and dma_resv_for_each_fence
> > + * is not obtaining one, we need to grab one before calling.
> > + */
> > + ret = drm_dep_job_add_dependency(job, dma_fence_get(fence));
> > + if (ret)
> > + return ret;
> > + }
> > + return 0;
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_add_resv_dependencies);
> > +
> > +/**
> > + * drm_dep_job_add_implicit_dependencies() - adds implicit dependencies
> > + *   as job dependencies
> > + * @job: dep job to add the dependencies to
> > + * @obj: the gem object to add new dependencies from.
> > + * @write: whether the job might write the object (so we need to depend on
> > + * shared fences in the reservation object).
> > + *
> > + * This should be called after drm_gem_lock_reservations() on your array of
> > + * GEM objects used in the job but before updating the reservations with your
> > + * own fences.
> > + *
> > + * Context: Process context.
> > + * Return: 0 on success, or a negative error code.
> > + */
> > +int drm_dep_job_add_implicit_dependencies(struct drm_dep_job *job,
> > +  struct drm_gem_object *obj,
> > +  bool write)
> > +{
> > + return drm_dep_job_add_resv_dependencies(job, obj->resv,
> > + dma_resv_usage_rw(write));
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_add_implicit_dependencies);
> > +
> > +/**
> > + * drm_dep_job_is_signaled() - check whether a dep job has completed
> > + * @job: dep job to check
> > + *
> > + * Determines whether @job has signalled. The queue should be stopped before
> > + * calling this to obtain a stable snapshot of state. Both the parent hardware
> > + * fence and the finished software fence are checked.
> > + *
> > + * Context: Process context. The queue must be stopped before calling this.
> > + * Return: true if the job is signalled, false otherwise.
> > + */
> > +bool drm_dep_job_is_signaled(struct drm_dep_job *job)
> > +{
> > + WARN_ON(!drm_dep_queue_is_stopped(job->q));
> > + return drm_dep_fence_is_complete(job->dfence);
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_is_signaled);
> > +
> > +/**
> > + * drm_dep_job_is_finished() - test whether a dep job's finished fence has signalled
> > + * @job: dep job to check
> > + *
> > + * Tests whether the job's software finished fence has been signalled, using
> > + * dma_fence_test_signaled_flag() to avoid any signalling side-effects. Unlike
> > + * drm_dep_job_is_signaled(), this does not require the queue to be stopped and
> > + * does not check the parent hardware fence — it is a lightweight test of the
> > + * finished fence only.
> > + *
> > + * Context: Any context.
> > + * Return: true if the job's finished fence has been signalled, false otherwise.
> > + */
> > +bool drm_dep_job_is_finished(struct drm_dep_job *job)
> > +{
> > + return drm_dep_fence_is_finished(job->dfence);
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_is_finished);
> > +
> > +/**
> > + * drm_dep_job_invalidate_job() - increment the invalidation count for a job
> > + * @job: dep job to invalidate
> > + * @threshold: threshold above which the job is considered invalidated
> > + *
> > + * Increments @job->invalidate_count and returns true if it exceeds @threshold,
> > + * indicating the job should be considered hung and discarded. The queue must
> > + * be stopped before calling this function.
> > + *
> > + * Context: Process context. The queue must be stopped before calling this.
> > + * Return: true if @job->invalidate_count exceeds @threshold, false otherwise.
> > + */
> > +bool drm_dep_job_invalidate_job(struct drm_dep_job *job, int threshold)
> > +{
> > + WARN_ON(!drm_dep_queue_is_stopped(job->q));
> > + return ++job->invalidate_count > threshold;
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_invalidate_job);
> > +
> > +/**
> > + * drm_dep_job_finished_fence() - return the finished fence for a job
> > + * @job: dep job to query
> > + *
> > + * No reference is taken on the returned fence; the caller must hold its own
> > + * reference to @job for the duration of any access.
> 
> Can’t enforce this in C.
> 
> > + *
> > + * Context: Any context.
> > + * Return: the finished &dma_fence for @job.
> > + */
> > +struct dma_fence *drm_dep_job_finished_fence(struct drm_dep_job *job)
> > +{
> > + return drm_dep_fence_to_dma(job->dfence);
> > +}
> > +EXPORT_SYMBOL(drm_dep_job_finished_fence);
> > diff --git a/drivers/gpu/drm/dep/drm_dep_job.h b/drivers/gpu/drm/dep/drm_dep_job.h
> > new file mode 100644
> > index 000000000000..35c61d258fa1
> > --- /dev/null
> > +++ b/drivers/gpu/drm/dep/drm_dep_job.h
> > @@ -0,0 +1,13 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2026 Intel Corporation
> > + */
> > +
> > +#ifndef _DRM_DEP_JOB_H_
> > +#define _DRM_DEP_JOB_H_
> > +
> > +struct drm_dep_queue;
> > +
> > +void drm_dep_job_drop_dependencies(struct drm_dep_job *job);
> > +
> > +#endif /* _DRM_DEP_JOB_H_ */
> > diff --git a/drivers/gpu/drm/dep/drm_dep_queue.c b/drivers/gpu/drm/dep/drm_dep_queue.c
> > new file mode 100644
> > index 000000000000..dac02d0d22c4
> > --- /dev/null
> > +++ b/drivers/gpu/drm/dep/drm_dep_queue.c
> > @@ -0,0 +1,1647 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright 2015 Advanced Micro Devices, Inc.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > + * OTHER DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright © 2026 Intel Corporation
> > + */
> > +
> > +/**
> > + * DOC: DRM dependency queue
> > + *
> > + * The drm_dep subsystem provides a lightweight GPU submission queue that
> > + * combines the roles of drm_gpu_scheduler and drm_sched_entity into a
> > + * single object (struct drm_dep_queue). Each queue owns its own ordered
> > + * submit workqueue, timeout workqueue, and TDR delayed-work.
> > + *
> > + * **Job lifecycle**
> > + *
> > + * 1. Allocate and initialise a job with drm_dep_job_init().
> > + * 2. Add dependency fences with drm_dep_job_add_dependency() and friends.
> > + * 3. Arm the job with drm_dep_job_arm() to obtain its out-fences.
> > + * 4. Submit with drm_dep_job_push().
> > + *
> > + * **Submission paths**
> > + *
> > + * drm_dep_job_push() decides between two paths under @q->sched.lock:
> > + *
> > + * - **Bypass path** (drm_dep_queue_can_job_bypass()): if
> > + *   %DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED is set, the queue is not stopped,
> > + *   the SPSC queue is empty, the job has no dependency fences, and credits
> > + *   are available, the job is submitted inline on the calling thread without
> > + *   touching the submit workqueue.
> > + *
> > + * - **Queued path** (drm_dep_queue_push_job()): the job is pushed onto an
> > + *   SPSC queue and the run_job worker is kicked. The run_job worker pops the
> > + *   job, resolves any remaining dependency fences (installing wakeup
> > + *   callbacks for unresolved ones), and calls drm_dep_queue_run_job().
> > + *
> > + * **Running a job**
> > + *
> > + * drm_dep_queue_run_job() accounts credits, appends the job to the pending
> > + * list (starting the TDR timer only when the list was previously empty),
> > + * calls @ops->run_job(), stores the returned hardware fence as the parent
> > + * of the job's dep fence, then installs a callback on it. When the hardware
> > + * fence fires (or the job completes synchronously), drm_dep_job_done()
> > + * signals the finished fence, returns credits, and kicks the put_job worker
> > + * to free the job.
> > + *
> > + * **Timeout detection and recovery (TDR)**
> > + *
> > + * A delayed work item fires when a job on the pending list takes longer than
> > + * @q->job.timeout jiffies. It calls @ops->timedout_job() and acts on the
> > + * returned status (%DRM_DEP_TIMEDOUT_STAT_JOB_SIGNALED or
> > + * %DRM_DEP_TIMEDOUT_STAT_REQUEUE_JOB).
> > + * drm_dep_queue_trigger_timeout() forces the timer to fire immediately (without
> > + * changing the stored timeout), for example during device teardown.
> > + *
> > + * **Reference counting**
> > + *
> > + * Jobs and queues are both reference counted.
> > + *
> > + * A job holds a reference to its queue from drm_dep_job_init() until
> > + * drm_dep_job_put() drops the job's last reference and its release callback
> > + * runs. This ensures the queue remains valid for the entire lifetime of any
> > + * job that was submitted to it.
> > + *
> > + * The queue holds its own reference to a job for as long as the job is
> > + * internally tracked: from the moment the job is added to the pending list
> > + * in drm_dep_queue_run_job() until drm_dep_job_done() kicks the put_job
> > + * worker, which calls drm_dep_job_put() to release that reference.
> 
> Why not simply keep track that the job was completed, instead of relinquishing
> the reference? We can then release the reference once the job is cleaned up
> (by the queue, using a worker) in process context.
> 
> 
> > + *
> > + * **Hazard: use-after-free from within a worker**
> > + *
> > + * Because a job holds a queue reference, drm_dep_job_put() dropping the last
> > + * job reference will also drop a queue reference via the job's release path.
> > + * If that happens to be the last queue reference, drm_dep_queue_fini() can be
> > + * called, which queues @q->free_work on dep_free_wq and returns immediately.
> > + * free_work calls disable_work_sync() / disable_delayed_work_sync() on the
> > + * queue's own workers before destroying its workqueues, so in practice a
> > + * running worker always completes before the queue memory is freed.
> > + *
> > + * However, there is a secondary hazard: a worker can be queued while the
> > + * queue is in a "zombie" state — refcount has already reached zero and async
> > + * teardown is in flight, but the work item has not yet been disabled by
> > + * free_work.  To guard against this every worker uses
> > + * drm_dep_queue_get_unless_zero() at entry; if the refcount is already zero
> > + * the worker bails immediately without touching the queue state.
> 
> Again, this problem is gone in Rust.
> 
> > + *
> > + * Because all actual teardown (disable_*_sync, destroy_workqueue) runs on
> > + * dep_free_wq — which is independent of the queue's own submit/timeout
> > + * workqueues — there is no deadlock risk.  Each queue holds a drm_dev_get()
> > + * reference on its owning &drm_device, which is released as the last step of
> > + * teardown.  This ensures the driver module cannot be unloaded while any queue
> > + * is still alive.
> > + */
> > +
> > +#include <linux/dma-resv.h>
> > +#include <linux/kref.h>
> > +#include <linux/module.h>
> > +#include <linux/overflow.h>
> > +#include <linux/slab.h>
> > +#include <linux/wait.h>
> > +#include <linux/workqueue.h>
> > +#include <drm/drm_dep.h>
> > +#include <drm/drm_drv.h>
> > +#include <drm/drm_print.h>
> > +#include "drm_dep_fence.h"
> > +#include "drm_dep_job.h"
> > +#include "drm_dep_queue.h"
> > +
> > +/*
> > + * Dedicated workqueue for deferred drm_dep_queue teardown.  Using a
> > + * module-private WQ instead of system_percpu_wq keeps teardown isolated
> > + * from unrelated kernel subsystems.
> > + */
> > +static struct workqueue_struct *dep_free_wq;
> > +
> > +/**
> > + * drm_dep_queue_flags_set() - set a flag on the queue under sched.lock
> > + * @q: dep queue
> > + * @flag: flag to set (one of &enum drm_dep_queue_flags)
> > + *
> > + * Sets @flag in @q->sched.flags. Must be called with @q->sched.lock
> > + * held; the lockdep assertion enforces this.
> > + *
> > + * Context: Process context. Must hold @q->sched.lock. DMA fence signaling path.
> > + */
> > +static void drm_dep_queue_flags_set(struct drm_dep_queue *q,
> > +    enum drm_dep_queue_flags flag)
> > +{
> > + lockdep_assert_held(&q->sched.lock);
> 
> We can enforce this in Rust at compile-time. The code does not compile if the
> lock is not taken. Same here and everywhere else where the sched lock has
> to be taken.
> 
> 
> > + q->sched.flags |= flag;
> > +}
> > +
> > +/**
> > + * drm_dep_queue_flags_clear() - clear a flag on the queue under sched.lock
> > + * @q: dep queue
> > + * @flag: flag to clear (one of &enum drm_dep_queue_flags)
> > + *
> > + * Clears @flag in @q->sched.flags. Must be called with @q->sched.lock
> > + * held; the lockdep assertion enforces this.
> > + *
> > + * Context: Process context. Must hold @q->sched.lock. DMA fence signaling path.
> > + */
> > +static void drm_dep_queue_flags_clear(struct drm_dep_queue *q,
> > +      enum drm_dep_queue_flags flag)
> > +{
> > + lockdep_assert_held(&q->sched.lock);
> > + q->sched.flags &= ~flag;
> > +}
> > +
> > +/**
> > + * drm_dep_queue_has_credits() - check whether the queue has enough credits
> > + * @q: dep queue
> > + * @job: job requesting credits
> > + *
> > + * Checks whether the queue has enough available credits to dispatch
> > + * @job. If @job->credits exceeds the queue's credit limit, it is
> > + * clamped with a WARN.
> > + *
> > + * Context: Process context. Must hold @q->sched.lock. DMA fence signaling path.
> > + * Return: true if available credits >= @job->credits, false otherwise.
> > + */
> > +static bool drm_dep_queue_has_credits(struct drm_dep_queue *q,
> > +      struct drm_dep_job *job)
> > +{
> > + u32 available;
> > +
> > + lockdep_assert_held(&q->sched.lock);
> > +
> > + if (job->credits > q->credit.limit) {
> > + drm_warn(q->drm,
> > + "Jobs may not exceed the credit limit, truncate.\n");
> > + job->credits = q->credit.limit;
> > + }
> > +
> > + WARN_ON(check_sub_overflow(q->credit.limit,
> > +   atomic_read(&q->credit.count),
> > +   &available));
> > +
> > + return available >= job->credits;
> > +}
> > +
> > +/**
> > + * drm_dep_queue_run_job_queue() - kick the run-job worker
> > + * @q: dep queue
> > + *
> > + * Queues @q->sched.run_job on @q->sched.submit_wq unless the queue is stopped
> > + * or the job queue is empty.  The empty-queue check avoids queueing a work item
> > + * that would immediately return with nothing to do.
> > + *
> > + * Context: Any context.
> > + */
> > +static void drm_dep_queue_run_job_queue(struct drm_dep_queue *q)
> > +{
> > + if (!drm_dep_queue_is_stopped(q) && spsc_queue_count(&q->job.queue))
> > + queue_work(q->sched.submit_wq, &q->sched.run_job);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_put_job_queue() - kick the put-job worker
> > + * @q: dep queue
> > + *
> > + * Queues @q->sched.put_job on @q->sched.submit_wq unless the queue
> > + * is stopped.
> > + *
> > + * Context: Any context.
> > + */
> > +static void drm_dep_queue_put_job_queue(struct drm_dep_queue *q)
> > +{
> > + if (!drm_dep_queue_is_stopped(q))
> > + queue_work(q->sched.submit_wq, &q->sched.put_job);
> > +}
> > +
> > +/**
> > + * drm_queue_start_timeout() - arm or re-arm the TDR delayed work
> > + * @q: dep queue
> > + *
> > + * Arms the TDR delayed work with @q->job.timeout. No-op if
> > + * @q->ops->timedout_job is NULL, the timeout is MAX_SCHEDULE_TIMEOUT,
> > + * or the pending list is empty.
> > + *
> > + * Context: Process context. Must hold @q->sched.lock. DMA fence signaling path.
> > + */
> > +static void drm_queue_start_timeout(struct drm_dep_queue *q)
> > +{
> > + lockdep_assert_held(&q->job.lock);
> > +
> > + if (!q->ops->timedout_job ||
> > +    q->job.timeout == MAX_SCHEDULE_TIMEOUT ||
> > +    list_empty(&q->job.pending))
> > + return;
> > +
> > + mod_delayed_work(q->sched.timeout_wq, &q->sched.tdr, q->job.timeout);
> > +}
> > +
> > +/**
> > + * drm_queue_start_timeout_unlocked() - arm TDR, acquiring job.lock
> > + * @q: dep queue
> > + *
> > + * Acquires @q->job.lock with irq and calls
> > + * drm_queue_start_timeout().
> > + *
> > + * Context: Process context (workqueue).
> > + */
> > +static void drm_queue_start_timeout_unlocked(struct drm_dep_queue *q)
> > +{
> > + guard(spinlock_irq)(&q->job.lock);
> > + drm_queue_start_timeout(q);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_remove_dependency() - clear the active dependency and wake
> > + *   the run-job worker
> > + * @q: dep queue
> > + * @f: the dependency fence being removed
> > + *
> > + * Stores @f into @q->dep.removed_fence via smp_store_release() so that the
> > + * run-job worker can drop the reference to it in drm_dep_queue_is_ready(),
> > + * paired with smp_load_acquire().  Clears @q->dep.fence and kicks the
> > + * run-job worker.
> > + *
> > + * The fence reference is not dropped here; it is deferred to the run-job
> > + * worker via @q->dep.removed_fence to keep this path suitable dma_fence
> > + * callback removal in drm_dep_queue_kill().
> 
> This is a comment in C, but in Rust this is encoded directly in the type system.
> 
> > + *
> > + * Context: Any context.
> > + */
> > +static void drm_dep_queue_remove_dependency(struct drm_dep_queue *q,
> > +    struct dma_fence *f)
> > +{
> > + /* removed_fence must be visible to the reader before &q->dep.fence */
> > + smp_store_release(&q->dep.removed_fence, f);
> > +
> > + WRITE_ONCE(q->dep.fence, NULL);
> > + drm_dep_queue_run_job_queue(q);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_wakeup() - dma_fence callback to wake the run-job worker
> > + * @f: the signalled dependency fence
> > + * @cb: callback embedded in the dep queue
> > + *
> > + * Called from dma_fence_signal() when the active dependency fence signals.
> > + * Delegates to drm_dep_queue_remove_dependency() to clear @q->dep.fence and
> > + * kick the run-job worker.  The fence reference is not dropped here; it is
> > + * deferred to the run-job worker via @q->dep.removed_fence.
> 
> Same here.
> 
> > + *
> > + * Context: Any context.
> > + */
> > +static void drm_dep_queue_wakeup(struct dma_fence *f, struct dma_fence_cb *cb)
> > +{
> > + struct drm_dep_queue *q =
> > + container_of(cb, struct drm_dep_queue, dep.cb);
> > +
> > + drm_dep_queue_remove_dependency(q, f);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_is_ready() - check whether the queue has a dispatchable job
> > + * @q: dep queue
> > + *
> > + * Context: Process context. Must hold @q->sched.lock. DMA fence signaling path.
> 
> Can’t call this in Rust if the lock is not taken.
> 
> > + * Return: true if SPSC queue non-empty and no dep fence pending,
> > + *   false otherwise.
> > + */
> > +static bool drm_dep_queue_is_ready(struct drm_dep_queue *q)
> > +{
> > + lockdep_assert_held(&q->sched.lock);
> > +
> > + if (!spsc_queue_count(&q->job.queue))
> > + return false;
> > +
> > + if (READ_ONCE(q->dep.fence))
> > + return false;
> > +
> > + /* Paired with smp_store_release in drm_dep_queue_remove_dependency() */
> > + dma_fence_put(smp_load_acquire(&q->dep.removed_fence));
> > +
> > + q->dep.removed_fence = NULL;
> > +
> > + return true;
> > +}
> > +
> > +/**
> > + * drm_dep_queue_is_killed() - check whether a dep queue has been killed
> > + * @q: dep queue to check
> > + *
> > + * Return: true if %DRM_DEP_QUEUE_FLAGS_KILLED is set on @q, false otherwise.
> > + *
> > + * Context: Any context.
> > + */
> > +bool drm_dep_queue_is_killed(struct drm_dep_queue *q)
> > +{
> > + return !!(q->sched.flags & DRM_DEP_QUEUE_FLAGS_KILLED);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_is_killed);
> > +
> > +/**
> > + * drm_dep_queue_is_initialized() - check whether a dep queue has been initialized
> > + * @q: dep queue to check
> > + *
> > + * A queue is considered initialized once its ops pointer has been set by a
> > + * successful call to drm_dep_queue_init().  Drivers that embed a
> > + * &drm_dep_queue inside a larger structure may call this before attempting any
> > + * other queue operation to confirm that initialization has taken place.
> > + * drm_dep_queue_put() must be called if this function returns true to drop the
> > + * initialization reference from drm_dep_queue_init().
> > + *
> > + * Return: true if @q has been initialized, false otherwise.
> > + *
> > + * Context: Any context.
> > + */
> > +bool drm_dep_queue_is_initialized(struct drm_dep_queue *q)
> > +{
> > + return !!q->ops;
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_is_initialized);
> > +
> > +/**
> > + * drm_dep_queue_set_stopped() - pre-mark a queue as stopped before first use
> > + * @q: dep queue to mark
> > + *
> > + * Sets %DRM_DEP_QUEUE_FLAGS_STOPPED directly on @q without going through the
> > + * normal drm_dep_queue_stop() path.  This is only valid during the driver-side
> > + * queue initialisation sequence — i.e. after drm_dep_queue_init() returns but
> > + * before the queue is made visible to other threads (e.g. before it is added
> > + * to any lookup structures).  Using this after the queue is live is a driver
> > + * bug; use drm_dep_queue_stop() instead.
> > + *
> > + * Context: Process context, queue not yet visible to other threads.
> > + */
> > +void drm_dep_queue_set_stopped(struct drm_dep_queue *q)
> > +{
> > + q->sched.flags |= DRM_DEP_QUEUE_FLAGS_STOPPED;
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_set_stopped);
> > +
> > +/**
> > + * drm_dep_queue_refcount() - read the current reference count of a queue
> > + * @q: dep queue to query
> > + *
> > + * Returns the instantaneous kref value.  The count may change immediately
> > + * after this call; callers must not make safety decisions based solely on
> > + * the returned value.  Intended for diagnostic snapshots and debugfs output.
> > + *
> > + * Context: Any context.
> > + * Return: current reference count.
> > + */
> > +unsigned int drm_dep_queue_refcount(const struct drm_dep_queue *q)
> > +{
> > + return kref_read(&q->refcount);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_refcount);
> > +
> > +/**
> > + * drm_dep_queue_timeout() - read the per-job TDR timeout for a queue
> > + * @q: dep queue to query
> > + *
> > + * Returns the per-job timeout in jiffies as set at init time.
> > + * %MAX_SCHEDULE_TIMEOUT means no timeout is configured.
> > + *
> > + * Context: Any context.
> > + * Return: timeout in jiffies.
> > + */
> > +long drm_dep_queue_timeout(const struct drm_dep_queue *q)
> > +{
> > + return q->job.timeout;
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_timeout);
> > +
> > +/**
> > + * drm_dep_queue_is_job_put_irq_safe() - test whether job-put from IRQ is allowed
> > + * @q: dep queue
> > + *
> > + * Context: Any context.
> > + * Return: true if %DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE is set,
> > + *   false otherwise.
> > + */
> > +static bool drm_dep_queue_is_job_put_irq_safe(const struct drm_dep_queue *q)
> > +{
> > + return !!(q->sched.flags & DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_job_dependency() - get next unresolved dep fence
> > + * @q: dep queue
> > + * @job: job whose dependencies to advance
> > + *
> > + * Returns NULL immediately if the queue has been killed via
> > + * drm_dep_queue_kill(), bypassing all dependency waits so that jobs
> > + * drain through run_job as quickly as possible.
> > + *
> > + * Context: Process context. Must hold @q->sched.lock. DMA fence signaling path.
> > + * Return: next unresolved &dma_fence with a new reference, or NULL
> > + *   when all dependencies have been consumed (or the queue is killed).
> > + */
> > +static struct dma_fence *
> > +drm_dep_queue_job_dependency(struct drm_dep_queue *q,
> > +     struct drm_dep_job *job)
> > +{
> > + struct dma_fence *f;
> > +
> > + lockdep_assert_held(&q->sched.lock);
> > +
> > + if (drm_dep_queue_is_killed(q))
> > + return NULL;
> > +
> > + f = xa_load(&job->dependencies, job->last_dependency);
> > + if (f) {
> > + job->last_dependency++;
> > + if (WARN_ON(DRM_DEP_JOB_FENCE_PREALLOC == f))
> > + return dma_fence_get_stub();
> > + return dma_fence_get(f);
> > + }
> > +
> > + return NULL;
> > +}
> > +
> > +/**
> > + * drm_dep_queue_add_dep_cb() - install wakeup callback on dep fence
> > + * @q: dep queue
> > + * @job: job whose dependency fence is stored in @q->dep.fence
> > + *
> > + * Installs a wakeup callback on @q->dep.fence. Returns true if the
> > + * callback was installed (the queue must wait), false if the fence is
> > + * already signalled or is a self-fence from the same queue context.
> > + *
> > + * Context: Process context. Must hold @q->sched.lock. DMA fence signaling path.
> > + * Return: true if callback installed, false if fence already done.
> > + */
> 
> In Rust, we can encode the signaling paths with a “token type”. So any
> sections that are part of the signaling path can simply take this token as an
> argument. This type also enforces that end_signaling() is called automatically when it
> goes out of scope.
> 
> By the way, we can easily offer an irq handler type where we enforce this:
> 
> fn handle_threaded_irq(&self, device: &Device<Bound>) -> IrqReturn { 
>  let _annotation = DmaFenceSignallingAnnotation::new();  // Calls begin_signaling()
>  self.driver.handle_threaded_irq(device) 
> 
>  // end_signaling() is called here automatically.
> }
> 
> Same for workqueues:
> 
> fn work_fn(&self, device: &Device<Bound>) {
>  let _annotation = DmaFenceSignallingAnnotation::new();  // Calls begin_signaling()
>  self.driver.work_fn(device) 
> 
>  // end_signaling() is called here automatically.
> }
> 
> This is not Rust-specific, of course, but it is more ergonomic to write in Rust.
> 
> > +static bool drm_dep_queue_add_dep_cb(struct drm_dep_queue *q,
> > +     struct drm_dep_job *job)
> > +{
> > + struct dma_fence *fence = q->dep.fence;
> > +
> > + lockdep_assert_held(&q->sched.lock);
> > +
> > + if (WARN_ON(fence->context == q->fence.context)) {
> > + dma_fence_put(q->dep.fence);
> > + q->dep.fence = NULL;
> > + return false;
> > + }
> > +
> > + if (!dma_fence_add_callback(q->dep.fence, &q->dep.cb,
> > +    drm_dep_queue_wakeup))
> > + return true;
> > +
> > + dma_fence_put(q->dep.fence);
> > + q->dep.fence = NULL;
> > +
> > + return false;
> > +}
> 
> In rust we can enforce that all callbacks take a reference to the fence
> automatically. If the callback is “forgotten” in a buggy path, it is
> automatically removed, and the fence is automatically signaled with -ECANCELED.
> 
> > +
> > +/**
> > + * drm_dep_queue_pop_job() - pop a dispatchable job from the SPSC queue
> > + * @q: dep queue
> > + *
> > + * Peeks at the head of the SPSC queue and drains all resolved
> > + * dependencies. If a dependency is still pending, installs a wakeup
> > + * callback and returns NULL. On success pops the job and returns it.
> > + *
> > + * Context: Process context. Must hold @q->sched.lock. DMA fence signaling path.
> > + * Return: next dispatchable job, or NULL if a dep is still pending.
> > + */
> > +static struct drm_dep_job *drm_dep_queue_pop_job(struct drm_dep_queue *q)
> > +{
> > + struct spsc_node *node;
> > + struct drm_dep_job *job;
> > +
> > + lockdep_assert_held(&q->sched.lock);
> > +
> > + node = spsc_queue_peek(&q->job.queue);
> > + if (!node)
> > + return NULL;
> > +
> > + job = container_of(node, struct drm_dep_job, queue_node);
> > +
> > + while ((q->dep.fence = drm_dep_queue_job_dependency(q, job))) {
> > + if (drm_dep_queue_add_dep_cb(q, job))
> > + return NULL;
> > + }
> > +
> > + spsc_queue_pop(&q->job.queue);
> > +
> > + return job;
> > +}
> > +
> > +/*
> > + * drm_dep_queue_get_unless_zero() - try to acquire a queue reference
> > + *
> > + * Workers use this instead of drm_dep_queue_get() to guard against the zombie
> > + * state: the queue's refcount has already reached zero (async teardown is in
> > + * flight) but a work item was queued before free_work had a chance to cancel
> > + * it.  If kref_get_unless_zero() fails the caller must bail immediately.
> > + *
> > + * Context: Any context.
> > + * Returns true if the reference was acquired, false if the queue is zombie.
> > + */
> 
> Again, this function is totally gone in Rust.
> 
> > +bool drm_dep_queue_get_unless_zero(struct drm_dep_queue *q)
> > +{
> > + return kref_get_unless_zero(&q->refcount);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_get_unless_zero);
> > +
> > +/**
> > + * drm_dep_queue_run_job_work() - run-job worker
> > + * @work: work item embedded in the dep queue
> > + *
> > + * Acquires @q->sched.lock, checks stopped state, queue readiness and
> > + * available credits, pops the next job via drm_dep_queue_pop_job(),
> > + * dispatches it via drm_dep_queue_run_job(), then re-kicks itself.
> > + *
> > + * Uses drm_dep_queue_get_unless_zero() at entry and bails immediately if the
> > + * queue is in zombie state (refcount already zero, async teardown in flight).
> > + *
> > + * Context: Process context (workqueue). DMA fence signaling path.
> > + */
> > +static void drm_dep_queue_run_job_work(struct work_struct *work)
> > +{
> > + struct drm_dep_queue *q =
> > + container_of(work, struct drm_dep_queue, sched.run_job);
> > + struct spsc_node *node;
> > + struct drm_dep_job *job;
> > + bool cookie = dma_fence_begin_signalling();
> > +
> > + /* Bail if queue is zombie (refcount already zero, teardown in flight). */
> > + if (!drm_dep_queue_get_unless_zero(q)) {
> > + dma_fence_end_signalling(cookie);
> > + return;
> > + }
> > +
> > + mutex_lock(&q->sched.lock);
> > +
> > + if (drm_dep_queue_is_stopped(q))
> > + goto put_queue;
> > +
> > + if (!drm_dep_queue_is_ready(q))
> > + goto put_queue;
> > +
> > + /* Peek to check credits before committing to pop and dep resolution */
> > + node = spsc_queue_peek(&q->job.queue);
> > + if (!node)
> > + goto put_queue;
> > +
> > + job = container_of(node, struct drm_dep_job, queue_node);
> > + if (!drm_dep_queue_has_credits(q, job))
> > + goto put_queue;
> > +
> > + job = drm_dep_queue_pop_job(q);
> > + if (!job)
> > + goto put_queue;
> > +
> > + drm_dep_queue_run_job(q, job);
> > + drm_dep_queue_run_job_queue(q);
> > +
> > +put_queue:
> > + mutex_unlock(&q->sched.lock);
> > + drm_dep_queue_put(q);
> > + dma_fence_end_signalling(cookie);
> > +}
> > +
> > +/*
> > + * drm_dep_queue_remove_job() - unlink a job from the pending list and reset TDR
> > + * @q:   dep queue owning @job
> > + * @job: job to remove
> > + *
> > + * Splices @job out of @q->job.pending, cancels any pending TDR delayed work,
> > + * and arms the timeout for the new list head (if any).
> > + *
> > + * Context: Process context. Must hold @q->sched.lock. DMA fence signaling path.
> > + */
> > +static void drm_dep_queue_remove_job(struct drm_dep_queue *q,
> > +     struct drm_dep_job *job)
> > +{
> > + lockdep_assert_held(&q->job.lock);
> > +
> > + list_del_init(&job->pending_link);
> > + cancel_delayed_work(&q->sched.tdr);
> > + drm_queue_start_timeout(q);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_get_finished_job() - dequeue a finished job
> > + * @q: dep queue
> > + *
> > + * Under @q->job.lock checks the head of the pending list for a
> > + * finished dep fence. If found, removes the job from the list,
> > + * cancels the TDR, and re-arms it for the new head.
> > + *
> > + * Context: Process context (workqueue). DMA fence signaling path.
> > + * Return: the finished &drm_dep_job, or NULL if none is ready.
> > + */
> > +static struct drm_dep_job *
> > +drm_dep_queue_get_finished_job(struct drm_dep_queue *q)
> > +{
> > + struct drm_dep_job *job;
> > +
> > + guard(spinlock_irq)(&q->job.lock);
> > +
> > + job = list_first_entry_or_null(&q->job.pending, struct drm_dep_job,
> > +       pending_link);
> > + if (job && drm_dep_fence_is_finished(job->dfence))
> > + drm_dep_queue_remove_job(q, job);
> > + else
> > + job = NULL;
> > +
> > + return job;
> > +}
> > +
> > +/**
> > + * drm_dep_queue_put_job_work() - put-job worker
> > + * @work: work item embedded in the dep queue
> > + *
> > + * Drains all finished jobs by calling drm_dep_job_put() in a loop,
> > + * then kicks the run-job worker.
> > + *
> > + * Uses drm_dep_queue_get_unless_zero() at entry and bails immediately if the
> > + * queue is in zombie state (refcount already zero, async teardown in flight).
> > + *
> > + * Wraps execution in dma_fence_begin_signalling() / dma_fence_end_signalling()
> > + * because workqueue is shared with other items in the fence signaling path.
> > + *
> > + * Context: Process context (workqueue). DMA fence signaling path.
> > + */
> > +static void drm_dep_queue_put_job_work(struct work_struct *work)
> > +{
> > + struct drm_dep_queue *q =
> > + container_of(work, struct drm_dep_queue, sched.put_job);
> > + struct drm_dep_job *job;
> > + bool cookie = dma_fence_begin_signalling();
> > +
> > + /* Bail if queue is zombie (refcount already zero, teardown in flight). */
> > + if (!drm_dep_queue_get_unless_zero(q)) {
> > + dma_fence_end_signalling(cookie);
> > + return;
> > + }
> > +
> > + while ((job = drm_dep_queue_get_finished_job(q)))
> > + drm_dep_job_put(job);
> > +
> > + drm_dep_queue_run_job_queue(q);
> > +
> > + drm_dep_queue_put(q);
> > + dma_fence_end_signalling(cookie);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_tdr_work() - TDR worker
> > + * @work: work item embedded in the delayed TDR work
> > + *
> > + * Removes the head job from the pending list under @q->job.lock,
> > + * asserts @q->ops->timedout_job is non-NULL, calls it outside the lock,
> > + * requeues the job if %DRM_DEP_TIMEDOUT_STAT_REQUEUE_JOB, drops the
> > + * queue's job reference on %DRM_DEP_TIMEDOUT_STAT_JOB_SIGNALED, and always
> > + * restarts the TDR timer after handling the job (unless @q is stopping).
> > + * Any other return value triggers a WARN.
> > + *
> > + * The TDR is never armed when @q->ops->timedout_job is NULL, so firing
> > + * this worker without a timedout_job callback is a driver bug.
> > + *
> > + * Uses drm_dep_queue_get_unless_zero() at entry and bails immediately if the
> > + * queue is in zombie state (refcount already zero, async teardown in flight).
> > + *
> > + * Wraps execution in dma_fence_begin_signalling() / dma_fence_end_signalling()
> > + * because timedout_job() is expected to signal the guilty job's fence as part
> > + * of reset.
> > + *
> > + * Context: Process context (workqueue). DMA fence signaling path.
> > + */
> > +static void drm_dep_queue_tdr_work(struct work_struct *work)
> > +{
> > + struct drm_dep_queue *q =
> > + container_of(work, struct drm_dep_queue, sched.tdr.work);
> > + struct drm_dep_job *job;
> > + bool cookie = dma_fence_begin_signalling();
> > +
> > + /* Bail if queue is zombie (refcount already zero, teardown in flight). */
> > + if (!drm_dep_queue_get_unless_zero(q)) {
> > + dma_fence_end_signalling(cookie);
> > + return;
> > + }
> > +
> > + scoped_guard(spinlock_irq, &q->job.lock) {
> > + job = list_first_entry_or_null(&q->job.pending,
> > +       struct drm_dep_job,
> > +       pending_link);
> > + if (job)
> > + /*
> > + * Remove from pending so it cannot be freed
> > + * concurrently by drm_dep_queue_get_finished_job() or
> > + * .drm_dep_job_done().
> > + */
> > + list_del_init(&job->pending_link);
> > + }
> > +
> > + if (job) {
> > + enum drm_dep_timedout_stat status;
> > +
> > + if (WARN_ON(!q->ops->timedout_job)) {
> > + drm_dep_job_put(job);
> > + goto out;
> > + }
> > +
> > + status = q->ops->timedout_job(job);
> > +
> > + switch (status) {
> > + case DRM_DEP_TIMEDOUT_STAT_REQUEUE_JOB:
> > + scoped_guard(spinlock_irq, &q->job.lock)
> > + list_add(&job->pending_link, &q->job.pending);
> > + drm_dep_queue_put_job_queue(q);
> > + break;
> > + case DRM_DEP_TIMEDOUT_STAT_JOB_SIGNALED:
> > + drm_dep_job_put(job);
> > + break;
> > + default:
> > + WARN_ON("invalid drm_dep_timedout_stat");
> > + break;
> > + }
> > + }
> > +
> > +out:
> > + drm_queue_start_timeout_unlocked(q);
> > + drm_dep_queue_put(q);
> > + dma_fence_end_signalling(cookie);
> > +}
> > +
> > +/**
> > + * drm_dep_alloc_submit_wq() - allocate an ordered submit workqueue
> > + * @name: name for the workqueue
> > + * @flags: DRM_DEP_QUEUE_FLAGS_* flags
> > + *
> > + * Allocates an ordered workqueue for job submission with %WQ_MEM_RECLAIM and
> > + * %WQ_MEM_WARN_ON_RECLAIM set, ensuring the workqueue is safe to use from
> > + * memory reclaim context and properly annotated for lockdep taint tracking.
> > + * Adds %WQ_HIGHPRI if %DRM_DEP_QUEUE_FLAGS_HIGHPRI is set. When
> > + * CONFIG_LOCKDEP is enabled, uses a dedicated lockdep map for annotation.
> > + *
> > + * Context: Process context.
> > + * Return: the new &workqueue_struct, or NULL on failure.
> > + */
> > +static struct workqueue_struct *
> > +drm_dep_alloc_submit_wq(const char *name, enum drm_dep_queue_flags flags)
> > +{
> > + unsigned int wq_flags = WQ_MEM_RECLAIM | WQ_MEM_WARN_ON_RECLAIM;
> > +
> > + if (flags & DRM_DEP_QUEUE_FLAGS_HIGHPRI)
> > + wq_flags |= WQ_HIGHPRI;
> > +
> > +#if IS_ENABLED(CONFIG_LOCKDEP)
> > + static struct lockdep_map map = {
> > + .name = "drm_dep_submit_lockdep_map"
> > + };
> > + return alloc_ordered_workqueue_lockdep_map(name, wq_flags, &map);
> > +#else
> > + return alloc_ordered_workqueue(name, wq_flags);
> > +#endif
> > +}
> > +
> > +/**
> > + * drm_dep_alloc_timeout_wq() - allocate an ordered TDR workqueue
> > + * @name: name for the workqueue
> > + *
> > + * Allocates an ordered workqueue for timeout detection and recovery with
> > + * %WQ_MEM_RECLAIM and %WQ_MEM_WARN_ON_RECLAIM set, ensuring consistent taint
> > + * annotation with the submit workqueue. When CONFIG_LOCKDEP is enabled, uses
> > + * a dedicated lockdep map for annotation.
> > + *
> > + * Context: Process context.
> > + * Return: the new &workqueue_struct, or NULL on failure.
> > + */
> > +static struct workqueue_struct *drm_dep_alloc_timeout_wq(const char *name)
> > +{
> > + unsigned int wq_flags = WQ_MEM_RECLAIM | WQ_MEM_WARN_ON_RECLAIM;
> > +
> > +#if IS_ENABLED(CONFIG_LOCKDEP)
> > + static struct lockdep_map map = {
> > + .name = "drm_dep_timeout_lockdep_map"
> > + };
> > + return alloc_ordered_workqueue_lockdep_map(name, wq_flags, &map);
> > +#else
> > + return alloc_ordered_workqueue(name, wq_flags);
> > +#endif
> > +}
> > +
> > +/**
> > + * drm_dep_queue_init() - initialize a dep queue
> > + * @q: dep queue to initialize
> > + * @args: initialization arguments
> > + *
> > + * Initializes all fields of @q from @args. If @args->submit_wq is NULL an
> > + * ordered workqueue is allocated and owned by the queue
> > + * (%DRM_DEP_QUEUE_FLAGS_OWN_SUBMIT_WQ). If @args->timeout_wq is NULL an
> > + * ordered workqueue is allocated and owned by the queue
> > + * (%DRM_DEP_QUEUE_FLAGS_OWN_TIMEDOUT_WQ). On success the queue holds one kref
> > + * reference and drm_dep_queue_put() must be called to drop this reference
> > + * (i.e., drivers cannot directly free the queue).
> > + *
> > + * When CONFIG_LOCKDEP is enabled, @q->sched.lock is primed against the
> > + * fs_reclaim pseudo-lock so that lockdep can detect any lock ordering
> > + * inversion between @sched.lock and memory reclaim.
> > + *
> > + * Return: 0 on success, %-EINVAL when @args->credit_limit is zero, @args->ops
> > + * is NULL, @args->drm is NULL, @args->ops->run_job is NULL, or when
> > + * @args->submit_wq or @args->timeout_wq is non-NULL but was not allocated with
> > + * %WQ_MEM_WARN_ON_RECLAIM; %-ENOMEM when workqueue allocation fails.
> > + *
> > + * Context: Process context. May allocate memory and create workqueues.
> > + */
> > +int drm_dep_queue_init(struct drm_dep_queue *q,
> > +       const struct drm_dep_queue_init_args *args)
> > +{
> > + if (!args->credit_limit || !args->drm || !args->ops ||
> > +    !args->ops->run_job)
> > + return -EINVAL;
> > +
> > + if (args->submit_wq && !workqueue_is_reclaim_annotated(args->submit_wq))
> > + return -EINVAL;
> > +
> > + if (args->timeout_wq &&
> > +    !workqueue_is_reclaim_annotated(args->timeout_wq))
> > + return -EINVAL;
> > +
> > + memset(q, 0, sizeof(*q));
> > +
> > + q->name = args->name;
> > + q->drm = args->drm;
> > + q->credit.limit = args->credit_limit;
> > + q->job.timeout = args->timeout ? args->timeout : MAX_SCHEDULE_TIMEOUT;
> > +
> > + init_rcu_head(&q->rcu);
> > + INIT_LIST_HEAD(&q->job.pending);
> > + spin_lock_init(&q->job.lock);
> > + spsc_queue_init(&q->job.queue);
> > +
> > + mutex_init(&q->sched.lock);
> > + if (IS_ENABLED(CONFIG_LOCKDEP)) {
> > + fs_reclaim_acquire(GFP_KERNEL);
> > + might_lock(&q->sched.lock);
> > + fs_reclaim_release(GFP_KERNEL);
> > + }
> > +
> > + if (args->submit_wq) {
> > + q->sched.submit_wq = args->submit_wq;
> > + } else {
> > + q->sched.submit_wq = drm_dep_alloc_submit_wq(args->name ?: "drm_dep",
> > +     args->flags);
> > + if (!q->sched.submit_wq)
> > + return -ENOMEM;
> > +
> > + q->sched.flags |= DRM_DEP_QUEUE_FLAGS_OWN_SUBMIT_WQ;
> > + }
> > +
> > + if (args->timeout_wq) {
> > + q->sched.timeout_wq = args->timeout_wq;
> > + } else {
> > + q->sched.timeout_wq = drm_dep_alloc_timeout_wq(args->name ?: "drm_dep");
> > + if (!q->sched.timeout_wq)
> > + goto err_submit_wq;
> > +
> > + q->sched.flags |= DRM_DEP_QUEUE_FLAGS_OWN_TIMEDOUT_WQ;
> > + }
> > +
> > + q->sched.flags |= args->flags &
> > + ~(DRM_DEP_QUEUE_FLAGS_OWN_SUBMIT_WQ |
> > +  DRM_DEP_QUEUE_FLAGS_OWN_TIMEDOUT_WQ);
> > +
> > + INIT_DELAYED_WORK(&q->sched.tdr, drm_dep_queue_tdr_work);
> > + INIT_WORK(&q->sched.run_job, drm_dep_queue_run_job_work);
> > + INIT_WORK(&q->sched.put_job, drm_dep_queue_put_job_work);
> > +
> > + q->fence.context = dma_fence_context_alloc(1);
> > +
> > + kref_init(&q->refcount);
> > + q->ops = args->ops;
> > + drm_dev_get(q->drm);
> > +
> > + return 0;
> > +
> > +err_submit_wq:
> > + if (q->sched.flags & DRM_DEP_QUEUE_FLAGS_OWN_SUBMIT_WQ)
> > + destroy_workqueue(q->sched.submit_wq);
> > + mutex_destroy(&q->sched.lock);
> > +
> > + return -ENOMEM;
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_init);
> > +
> > +#if IS_ENABLED(CONFIG_PROVE_LOCKING)
> > +/**
> > + * drm_dep_queue_push_job_begin() - mark the start of an arm/push critical section
> > + * @q: dep queue the job belongs to
> > + *
> > + * Called at the start of drm_dep_job_arm() and warns if the push context is
> > + * already owned by another task, which would indicate concurrent arm/push on
> > + * the same queue.
> > + *
> > + * No-op when CONFIG_PROVE_LOCKING is disabled.
> > + *
> > + * Context: Process context. DMA fence signaling path.
> > + */
> > +void drm_dep_queue_push_job_begin(struct drm_dep_queue *q)
> > +{
> > + WARN_ON(q->job.push.owner);
> > + q->job.push.owner = current;
> > +}
> > +
> > +/**
> > + * drm_dep_queue_push_job_end() - mark the end of an arm/push critical section
> > + * @q: dep queue the job belongs to
> > + *
> > + * Called at the end of drm_dep_job_push() and warns if the push context is not
> > + * owned by the current task, which would indicate a mismatched begin/end pair
> > + * or a push from the wrong thread.
> > + *
> > + * No-op when CONFIG_PROVE_LOCKING is disabled.
> > + *
> > + * Context: Process context. DMA fence signaling path.
> > + */
> > +void drm_dep_queue_push_job_end(struct drm_dep_queue *q)
> > +{
> > + WARN_ON(q->job.push.owner != current);
> > + q->job.push.owner = NULL;
> > +}
> > +#endif
> > +
> > +/**
> > + * drm_dep_queue_assert_teardown_invariants() - assert teardown invariants
> > + * @q: dep queue being torn down
> > + *
> > + * Warns if the pending-job list, the SPSC submission queue, or the credit
> > + * counter is non-zero when called, or if the queue still has a non-zero
> > + * reference count.
> > + *
> > + * Context: Any context.
> > + */
> > +static void drm_dep_queue_assert_teardown_invariants(struct drm_dep_queue *q)
> > +{
> > + WARN_ON(!list_empty(&q->job.pending));
> > + WARN_ON(spsc_queue_count(&q->job.queue));
> > + WARN_ON(atomic_read(&q->credit.count));
> > + WARN_ON(drm_dep_queue_refcount(q));
> > +}
> > +
> > +/**
> > + * drm_dep_queue_release() - final internal cleanup of a dep queue
> > + * @q: dep queue to clean up
> > + *
> > + * Asserts teardown invariants and destroys internal resources allocated by
> > + * drm_dep_queue_init() that cannot be torn down earlier in the teardown
> > + * sequence.  Currently this destroys @q->sched.lock.
> > + *
> > + * Drivers that implement &drm_dep_queue_ops.release **must** call this
> > + * function after removing @q from any internal bookkeeping (e.g. lookup
> > + * tables or lists) but before freeing the memory that contains @q.  When
> > + * &drm_dep_queue_ops.release is NULL, drm_dep follows the default teardown
> > + * path and calls this function automatically.
> > + *
> > + * Context: Any context.
> > + */
> > +void drm_dep_queue_release(struct drm_dep_queue *q)
> > +{
> > + drm_dep_queue_assert_teardown_invariants(q);
> > + mutex_destroy(&q->sched.lock);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_release);
> > +
> > +/**
> > + * drm_dep_queue_free() - final cleanup of a dep queue
> > + * @q: dep queue to free
> > + *
> > + * Invokes &drm_dep_queue_ops.release if set, in which case the driver is
> > + * responsible for calling drm_dep_queue_release() and freeing @q itself.
> > + * If &drm_dep_queue_ops.release is NULL, calls drm_dep_queue_release()
> > + * and then frees @q with kfree_rcu().
> > + *
> > + * In either case, releases the drm_dev_get() reference taken at init time
> > + * via drm_dev_put(), allowing the owning &drm_device to be unloaded once
> > + * all queues have been freed.
> > + *
> > + * Context: Process context (workqueue), reclaim safe.
> > + */
> > +static void drm_dep_queue_free(struct drm_dep_queue *q)
> > +{
> > + struct drm_device *drm = q->drm;
> > +
> > + if (q->ops->release) {
> > + q->ops->release(q);
> > + } else {
> > + drm_dep_queue_release(q);
> > + kfree_rcu(q, rcu);
> > + }
> > + drm_dev_put(drm);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_free_work() - deferred queue teardown worker
> > + * @work: free_work item embedded in the dep queue
> > + *
> > + * Runs on dep_free_wq. Disables all work items synchronously
> > + * (preventing re-queue and waiting for in-flight instances),
> > + * destroys any owned workqueues, then calls drm_dep_queue_free().
> > + * Running on dep_free_wq ensures destroy_workqueue() is never
> > + * called from within one of the queue's own workers (deadlock)
> > + * and disable_*_sync() cannot deadlock either.
> > + *
> > + * Context: Process context (workqueue), reclaim safe.
> > + */
> > +static void drm_dep_queue_free_work(struct work_struct *work)
> > +{
> > + struct drm_dep_queue *q =
> > + container_of(work, struct drm_dep_queue, free_work);
> > +
> > + drm_dep_queue_assert_teardown_invariants(q);
> > +
> > + disable_delayed_work_sync(&q->sched.tdr);
> > + disable_work_sync(&q->sched.run_job);
> > + disable_work_sync(&q->sched.put_job);
> > +
> > + if (q->sched.flags & DRM_DEP_QUEUE_FLAGS_OWN_TIMEDOUT_WQ)
> > + destroy_workqueue(q->sched.timeout_wq);
> > +
> > + if (q->sched.flags & DRM_DEP_QUEUE_FLAGS_OWN_SUBMIT_WQ)
> > + destroy_workqueue(q->sched.submit_wq);
> > +
> > + drm_dep_queue_free(q);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_fini() - tear down a dep queue
> > + * @q: dep queue to tear down
> > + *
> > + * Asserts teardown invariants  and nitiates teardown of @q by queuing the
> > + * deferred free work onto tht module-private dep_free_wq workqueue.  The work
> > + * item disables any pending TDR and run/put-job work synchronously, destroys
> > + * any workqueues that were allocated by drm_dep_queue_init(), and then releases
> > + * the queue memory.
> > + *
> > + * Running teardown from dep_free_wq ensures that destroy_workqueue() is never
> > + * called from within one of the queue's own workers (e.g. via
> > + * drm_dep_queue_put()), which would deadlock.
> > + *
> > + * Drivers can wait for all outstanding deferred work to complete by waiting
> > + * for the last drm_dev_put() reference on their &drm_device, which is
> > + * released as the final step of each queue's teardown.
> > + *
> > + * Drivers that implement &drm_dep_queue_ops.fini **must** call this
> > + * function after removing @q from any device bookkeeping but before freeing the
> > + * memory that contains @q.  When &drm_dep_queue_ops.fini is NULL, drm_dep
> > + * follows the default teardown path and calls this function automatically.
> > + *
> > + * Context: Any context.
> > + */
> > +void drm_dep_queue_fini(struct drm_dep_queue *q)
> > +{
> > + drm_dep_queue_assert_teardown_invariants(q);
> > +
> > + INIT_WORK(&q->free_work, drm_dep_queue_free_work);
> > + queue_work(dep_free_wq, &q->free_work);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_fini);
> > +
> > +/**
> > + * drm_dep_queue_get() - acquire a reference to a dep queue
> > + * @q: dep queue to acquire a reference on, or NULL
> > + *
> > + * Return: @q with an additional reference held, or NULL if @q is NULL.
> > + *
> > + * Context: Any context.
> > + */
> > +struct drm_dep_queue *drm_dep_queue_get(struct drm_dep_queue *q)
> > +{
> > + if (q)
> > + kref_get(&q->refcount);
> > + return q;
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_get);
> > +
> > +/**
> > + * __drm_dep_queue_release() - kref release callback for a dep queue
> > + * @kref: kref embedded in the dep queue
> > + *
> > + * Calls &drm_dep_queue_ops.fini if set, otherwise calls
> > + * drm_dep_queue_fini() to initiate deferred teardown.
> > + *
> > + * Context: Any context.
> > + */
> > +static void __drm_dep_queue_release(struct kref *kref)
> > +{
> > + struct drm_dep_queue *q =
> > + container_of(kref, struct drm_dep_queue, refcount);
> > +
> > + if (q->ops->fini)
> > + q->ops->fini(q);
> > + else
> > + drm_dep_queue_fini(q);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_put() - release a reference to a dep queue
> > + * @q: dep queue to release a reference on, or NULL
> > + *
> > + * When the last reference is dropped, calls &drm_dep_queue_ops.fini if set,
> > + * otherwise calls drm_dep_queue_fini(). Final memory release is handled by
> > + * &drm_dep_queue_ops.release (which must call drm_dep_queue_release()) if set,
> > + * or drm_dep_queue_release() followed by kfree_rcu() otherwise.
> > + * Does nothing if @q is NULL.
> > + *
> > + * Context: Any context.
> > + */
> > +void drm_dep_queue_put(struct drm_dep_queue *q)
> > +{
> > + if (q)
> > + kref_put(&q->refcount, __drm_dep_queue_release);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_put);
> > +
> > +/**
> > + * drm_dep_queue_stop() - stop a dep queue from processing new jobs
> > + * @q: dep queue to stop
> > + *
> > + * Sets %DRM_DEP_QUEUE_FLAGS_STOPPED on @q under both @q->sched.lock (mutex)
> > + * and @q->job.lock (spinlock_irq), making the flag safe to test from finished
> > + * fenced signaling context. Then cancels any in-flight run_job and put_job work
> > + * items. Once stopped, the bypass path and the submit workqueue will not
> > + * dispatch further jobs nor will any jobs be removed from the pending list.
> > + * Call drm_dep_queue_start() to resume processing.
> > + *
> > + * Context: Process context. Waits for in-flight workers to complete.
> > + */
> > +void drm_dep_queue_stop(struct drm_dep_queue *q)
> > +{
> > + scoped_guard(mutex, &q->sched.lock) {
> > + scoped_guard(spinlock_irq, &q->job.lock)
> > + drm_dep_queue_flags_set(q, DRM_DEP_QUEUE_FLAGS_STOPPED);
> > + }
> > + cancel_work_sync(&q->sched.run_job);
> > + cancel_work_sync(&q->sched.put_job);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_stop);
> > +
> > +/**
> > + * drm_dep_queue_start() - resume a stopped dep queue
> > + * @q: dep queue to start
> > + *
> > + * Clears %DRM_DEP_QUEUE_FLAGS_STOPPED on @q under both @q->sched.lock (mutex)
> > + * and @q->job.lock (spinlock_irq), making the flag safe to test from IRQ
> > + * context. Then re-queues the run_job and put_job work items so that any jobs
> > + * pending since the queue was stopped are processed. Must only be called after
> > + * drm_dep_queue_stop().
> > + *
> > + * Context: Process context.
> > + */
> > +void drm_dep_queue_start(struct drm_dep_queue *q)
> > +{
> > + scoped_guard(mutex, &q->sched.lock) {
> > + scoped_guard(spinlock_irq, &q->job.lock)
> > + drm_dep_queue_flags_clear(q, DRM_DEP_QUEUE_FLAGS_STOPPED);
> > + }
> > + drm_dep_queue_run_job_queue(q);
> > + drm_dep_queue_put_job_queue(q);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_start);
> > +
> > +/**
> > + * drm_dep_queue_trigger_timeout() - trigger the TDR immediately for
> > + *   all pending jobs
> > + * @q: dep queue to trigger timeout on
> > + *
> > + * Sets @q->job.timeout to 1 and arms the TDR delayed work with a one-jiffy
> > + * delay, causing it to fire almost immediately without hot-spinning at zero
> > + * delay. This is used to force-expire any pendind jobs on the queue, for
> > + * example when the device is being torn down or has encountered an
> > + * unrecoverable error.
> > + *
> > + * It is suggested that when this function is used, the first timedout_job call
> > + * causes the driver to kick the queue off the hardware and signal all pending
> > + * job fences. Subsequent calls continue to signal all pending job fences.
> > + *
> > + * Has no effect if the pending list is empty.
> > + *
> > + * Context: Any context.
> > + */
> > +void drm_dep_queue_trigger_timeout(struct drm_dep_queue *q)
> > +{
> > + guard(spinlock_irqsave)(&q->job.lock);
> > + q->job.timeout = 1;
> > + drm_queue_start_timeout(q);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_trigger_timeout);
> > +
> > +/**
> > + * drm_dep_queue_cancel_tdr_sync() - cancel any pending TDR and wait
> > + *   for it to finish
> > + * @q: dep queue whose TDR to cancel
> > + *
> > + * Cancels the TDR delayed work item if it has not yet started, and waits for
> > + * it to complete if it is already running.  After this call returns, the TDR
> > + * worker is guaranteed not to be executing and will not fire again until
> > + * explicitly rearmed (e.g. via drm_dep_queue_resume_timeout() or by a new
> > + * job being submitted).
> > + *
> > + * Useful during error recovery or queue teardown when the caller needs to
> > + * know that no timeout handling races with its own reset logic.
> > + *
> > + * Context: Process context. May sleep waiting for the TDR worker to finish.
> > + */
> > +void drm_dep_queue_cancel_tdr_sync(struct drm_dep_queue *q)
> > +{
> > + cancel_delayed_work_sync(&q->sched.tdr);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_cancel_tdr_sync);
> > +
> > +/**
> > + * drm_dep_queue_resume_timeout() - restart the TDR timer with the
> > + *   configured timeout
> > + * @q: dep queue to resume the timeout for
> > + *
> > + * Restarts the TDR delayed work using @q->job.timeout. Called after device
> > + * recovery to give pending jobs a fresh full timeout window. Has no effect
> > + * if the pending list is empty.
> > + *
> > + * Context: Any context.
> > + */
> > +void drm_dep_queue_resume_timeout(struct drm_dep_queue *q)
> > +{
> > + drm_queue_start_timeout_unlocked(q);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_resume_timeout);
> > +
> > +/**
> > + * drm_dep_queue_is_stopped() - check whether a dep queue is stopped
> > + * @q: dep queue to check
> > + *
> > + * Return: true if %DRM_DEP_QUEUE_FLAGS_STOPPED is set on @q, false otherwise.
> > + *
> > + * Context: Any context.
> > + */
> > +bool drm_dep_queue_is_stopped(struct drm_dep_queue *q)
> > +{
> > + return !!(q->sched.flags & DRM_DEP_QUEUE_FLAGS_STOPPED);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_is_stopped);
> > +
> > +/**
> > + * drm_dep_queue_kill() - kill a dep queue and flush all pending jobs
> > + * @q: dep queue to kill
> > + *
> > + * Sets %DRM_DEP_QUEUE_FLAGS_KILLED on @q under @q->sched.lock.  If a
> > + * dependency fence is currently being waited on, its callback is removed and
> > + * the run-job worker is kicked immediately so that the blocked job drains
> > + * without waiting.
> > + *
> > + * Once killed, drm_dep_queue_job_dependency() returns NULL for all jobs,
> > + * bypassing dependency waits so that every queued job drains through
> > + * &drm_dep_queue_ops.run_job without blocking.
> > + *
> > + * The &drm_dep_queue_ops.run_job callback is guaranteed to be called for every
> > + * job that was pushed before or after drm_dep_queue_kill(), even during queue
> > + * teardown.  Drivers should use this guarantee to perform any necessary
> > + * bookkeeping cleanup without executing the actual backend operation when the
> > + * queue is killed.
> > + *
> > + * Unlike drm_dep_queue_stop(), killing is one-way: there is no corresponding
> > + * start function.
> > + *
> > + * **Driver safety requirement**
> > + *
> > + * drm_dep_queue_kill() must only be called once the driver can guarantee that
> > + * no job in the queue will touch memory associated with any of its fences
> > + * (i.e., the queue has been removed from the device and will never be put back
> > + * on).
> > + *
> > + * Context: Process context.
> > + */
> > +void drm_dep_queue_kill(struct drm_dep_queue *q)
> > +{
> > + scoped_guard(mutex, &q->sched.lock) {
> > + struct dma_fence *fence;
> > +
> > + drm_dep_queue_flags_set(q, DRM_DEP_QUEUE_FLAGS_KILLED);
> > +
> > + /*
> > + * Holding &q->sched.lock guarantees that the run-job work item
> > + * cannot drop its reference to q->dep.fence concurrently, so
> > + * reading q->dep.fence here is safe.
> > + */
> > + fence = READ_ONCE(q->dep.fence);
> > + if (fence && dma_fence_remove_callback(fence, &q->dep.cb))
> > + drm_dep_queue_remove_dependency(q, fence);
> > + }
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_kill);
> > +
> > +/**
> > + * drm_dep_queue_submit_wq() - retrieve the submit workqueue of a dep queue
> > + * @q: dep queue whose workqueue to retrieve
> > + *
> > + * Drivers may use this to queue their own work items alongside the queue's
> > + * internal run-job and put-job workers — for example to process incoming
> > + * messages in the same serialisation domain.
> > + *
> > + * Prefer drm_dep_queue_work_enqueue() when the only need is to enqueue a
> > + * work item, as it additionally checks the stopped state.  Use this accessor
> > + * when the workqueue itself is required (e.g. for alloc_ordered_workqueue
> > + * replacement or drain_workqueue calls).
> > + *
> > + * Context: Any context.
> > + * Return: the &workqueue_struct used by @q for job submission.
> > + */
> > +struct workqueue_struct *drm_dep_queue_submit_wq(struct drm_dep_queue *q)
> > +{
> > + return q->sched.submit_wq;
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_submit_wq);
> > +
> > +/**
> > + * drm_dep_queue_timeout_wq() - retrieve the timeout workqueue of a dep queue
> > + * @q: dep queue whose workqueue to retrieve
> > + *
> > + * Returns the workqueue used by @q to run TDR (timeout detection and recovery)
> > + * work.  Drivers may use this to queue their own timeout-domain work items, or
> > + * to call drain_workqueue() when tearing down and needing to ensure all pending
> > + * timeout callbacks have completed before proceeding.
> > + *
> > + * Context: Any context.
> > + * Return: the &workqueue_struct used by @q for TDR work.
> > + */
> > +struct workqueue_struct *drm_dep_queue_timeout_wq(struct drm_dep_queue *q)
> > +{
> > + return q->sched.timeout_wq;
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_timeout_wq);
> > +
> > +/**
> > + * drm_dep_queue_work_enqueue() - queue work on the dep queue's submit workqueue
> > + * @q: dep queue to enqueue work on
> > + * @work: work item to enqueue
> > + *
> > + * Queues @work on @q->sched.submit_wq if the queue is not stopped.  This
> > + * allows drivers to schedule custom work items that run serialised with the
> > + * queue's own run-job and put-job workers.
> > + *
> > + * Return: true if the work was queued, false if the queue is stopped or the
> > + * work item was already pending.
> > + *
> > + * Context: Any context.
> > + */
> > +bool drm_dep_queue_work_enqueue(struct drm_dep_queue *q,
> > + struct work_struct *work)
> > +{
> > + if (drm_dep_queue_is_stopped(q))
> > + return false;
> > +
> > + return queue_work(q->sched.submit_wq, work);
> > +}
> > +EXPORT_SYMBOL(drm_dep_queue_work_enqueue);
> > +
> > +/**
> > + * drm_dep_queue_can_job_bypass() - test whether a job can skip the SPSC queue
> > + * @q: dep queue
> > + * @job: job to test
> > + *
> > + * A job may bypass the submit workqueue and run inline on the calling thread
> > + * if all of the following hold:
> > + *
> > + *  - %DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED is set on the queue
> > + *  - the queue is not stopped
> > + *  - the SPSC submission queue is empty (no other jobs waiting)
> > + *  - the queue has enough credits for @job
> > + *  - @job has no unresolved dependency fences
> > + *
> > + * Must be called under @q->sched.lock.
> > + *
> > + * Context: Process context. Must hold @q->sched.lock (a mutex).
> > + * Return: true if the job may be run inline, false otherwise.
> > + */
> > +bool drm_dep_queue_can_job_bypass(struct drm_dep_queue *q,
> > +  struct drm_dep_job *job)
> > +{
> > + lockdep_assert_held(&q->sched.lock);
> > +
> > + return q->sched.flags & DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED &&
> > + !drm_dep_queue_is_stopped(q) &&
> > + !spsc_queue_count(&q->job.queue) &&
> > + drm_dep_queue_has_credits(q, job) &&
> > + xa_empty(&job->dependencies);
> > +}
> > +
> > +/**
> > + * drm_dep_job_done() - mark a job as complete
> > + * @job: the job that finished
> > + * @result: error code to propagate, or 0 for success
> > + *
> > + * Subtracts @job->credits from the queue credit counter, then signals the
> > + * job's dep fence with @result.
> > + *
> > + * When %DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE is set (IRQ-safe path), a
> > + * temporary extra reference is taken on @job before signalling the fence.
> > + * This prevents a concurrent put-job worker — which may be woken by timeouts or
> > + * queue starting — from freeing the job while this function still holds a
> > + * pointer to it.  The extra reference is released at the end of the function.
> > + *
> > + * After signalling, the IRQ-safe path removes the job from the pending list
> > + * under @q->job.lock, provided the queue is not stopped.  Removal is skipped
> > + * when the queue is stopped so that drm_dep_queue_for_each_pending_job() can
> > + * iterate the list without racing with the completion path.  On successful
> > + * removal, kicks the run-job worker so the next queued job can be dispatched
> > + * immediately, then drops the job reference.  If the job was already removed
> > + * by TDR, or removal was skipped because the queue is stopped, kicks the
> > + * put-job worker instead to allow the deferred put to complete.
> > + *
> > + * Context: Any context.
> > + */
> > +static void drm_dep_job_done(struct drm_dep_job *job, int result)
> > +{
> > + struct drm_dep_queue *q = job->q;
> > + bool irq_safe = drm_dep_queue_is_job_put_irq_safe(q), removed = false;
> > +
> > + /*
> > + * Local ref to ensure the put worker—which may be woken by external
> > + * forces (TDR, driver-side queue starting)—doesn't free the job behind
> > + * this function's back after drm_dep_fence_done() while it is still on
> > + * the pending list.
> > + */
> > + if (irq_safe)
> > + drm_dep_job_get(job);
> > +
> > + atomic_sub(job->credits, &q->credit.count);
> > + drm_dep_fence_done(job->dfence, result);
> > +
> > + /* Only safe to touch job after fence signal if we have a local ref. */
> > +
> > + if (irq_safe) {
> > + scoped_guard(spinlock_irqsave, &q->job.lock) {
> > + removed = !list_empty(&job->pending_link) &&
> > + !drm_dep_queue_is_stopped(q);
> > +
> > + /* Guard against TDR operating on job */
> > + if (removed)
> > + drm_dep_queue_remove_job(q, job);
> > + }
> > + }
> > +
> > + if (removed) {
> > + drm_dep_queue_run_job_queue(q);
> > + drm_dep_job_put(job);
> > + } else {
> > + drm_dep_queue_put_job_queue(q);
> > + }
> > +
> > + if (irq_safe)
> > + drm_dep_job_put(job);
> > +}
> > +
> > +/**
> > + * drm_dep_job_done_cb() - dma_fence callback to complete a job
> > + * @f: the hardware fence that signalled
> > + * @cb: fence callback embedded in the dep job
> > + *
> > + * Extracts the job from @cb and calls drm_dep_job_done() with
> > + * @f->error as the result.
> > + *
> > + * Context: Any context, but with IRQ disabled. May not sleep.
> > + */
> > +static void drm_dep_job_done_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> > +{
> > + struct drm_dep_job *job = container_of(cb, struct drm_dep_job, cb);
> > +
> > + drm_dep_job_done(job, f->error);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_run_job() - submit a job to hardware and set up
> > + *   completion tracking
> > + * @q: dep queue
> > + * @job: job to run
> > + *
> > + * Accounts @job->credits against the queue, appends the job to the pending
> > + * list, then calls @q->ops->run_job(). The TDR timer is started only when
> > + * @job is the first entry on the pending list; subsequent jobs added while
> > + * a TDR is already in flight do not reset the timer (which would otherwise
> > + * extend the deadline for the already-running head job). Stores the returned
> > + * hardware fence as the parent of the job's dep fence, then installs
> > + * drm_dep_job_done_cb() on it. If the hardware fence is already signalled
> > + * (%-ENOENT from dma_fence_add_callback()) or run_job() returns NULL/error,
> > + * the job is completed immediately. Must be called under @q->sched.lock.
> > + *
> > + * Context: Process context. Must hold @q->sched.lock (a mutex). DMA fence
> > + * signaling path.
> > + */
> > +void drm_dep_queue_run_job(struct drm_dep_queue *q, struct drm_dep_job *job)
> > +{
> > + struct dma_fence *fence;
> > + int r;
> > +
> > + lockdep_assert_held(&q->sched.lock);
> > +
> > + drm_dep_job_get(job);
> > + atomic_add(job->credits, &q->credit.count);
> > +
> > + scoped_guard(spinlock_irq, &q->job.lock) {
> > + bool first = list_empty(&q->job.pending);
> > +
> > + list_add_tail(&job->pending_link, &q->job.pending);
> > + if (first)
> > + drm_queue_start_timeout(q);
> > + }
> > +
> > + fence = q->ops->run_job(job);
> > + drm_dep_fence_set_parent(job->dfence, fence);
> > +
> > + if (!IS_ERR_OR_NULL(fence)) {
> > + r = dma_fence_add_callback(fence, &job->cb,
> > +   drm_dep_job_done_cb);
> > + if (r == -ENOENT)
> > + drm_dep_job_done(job, fence->error);
> > + else if (r)
> > + drm_err(q->drm, "fence add callback failed (%d)\n", r);
> > + dma_fence_put(fence);
> > + } else {
> > + drm_dep_job_done(job, IS_ERR(fence) ? PTR_ERR(fence) : 0);
> > + }
> > +
> > + /*
> > + * Drop all input dependency fences now, in process context, before the
> > + * final job put. Once the job is on the pending list its last reference
> > + * may be dropped from a dma_fence callback (IRQ context), where calling
> > + * xa_destroy() would be unsafe.
> > + */
> 
> I assume that “pending” is the list of jobs that have been handed to the driver
> via ops->run_job()?
> 
> Can’t this problem be solved by not doing anything inside a dma_fence callback
> other than scheduling the queue worker?
> 
> > + drm_dep_job_drop_dependencies(job);
> > + drm_dep_job_put(job);
> > +}
> > +
> > +/**
> > + * drm_dep_queue_push_job() - enqueue a job on the SPSC submission queue
> > + * @q: dep queue
> > + * @job: job to push
> > + *
> > + * Pushes @job onto the SPSC queue. If the queue was previously empty
> > + * (i.e. this is the first pending job), kicks the run_job worker so it
> > + * processes the job promptly without waiting for the next wakeup.
> > + * May be called with or without @q->sched.lock held.
> > + *
> > + * Context: Any context. DMA fence signaling path.
> > + */
> > +void drm_dep_queue_push_job(struct drm_dep_queue *q, struct drm_dep_job *job)
> > +{
> > + /*
> > + * spsc_queue_push() returns true if the queue was previously empty,
> > + * i.e. this is the first pending job. Kick the run_job worker so it
> > + * picks it up without waiting for the next wakeup.
> > + */
> > + if (spsc_queue_push(&q->job.queue, &job->queue_node))
> > + drm_dep_queue_run_job_queue(q);
> > +}
> > +
> > +/**
> > + * drm_dep_init() - module initialiser
> > + *
> > + * Allocates the module-private dep_free_wq unbound workqueue used for
> > + * deferred queue teardown.
> > + *
> > + * Return: 0 on success, %-ENOMEM if workqueue allocation fails.
> > + */
> > +static int __init drm_dep_init(void)
> > +{
> > + dep_free_wq = alloc_workqueue("drm_dep_free", WQ_UNBOUND, 0);
> > + if (!dep_free_wq)
> > + return -ENOMEM;
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * drm_dep_exit() - module exit
> > + *
> > + * Destroys the module-private dep_free_wq workqueue.
> > + */
> > +static void __exit drm_dep_exit(void)
> > +{
> > + destroy_workqueue(dep_free_wq);
> > + dep_free_wq = NULL;
> > +}
> > +
> > +module_init(drm_dep_init);
> > +module_exit(drm_dep_exit);
> > +
> > +MODULE_DESCRIPTION("DRM dependency queue");
> > +MODULE_LICENSE("Dual MIT/GPL");
> > diff --git a/drivers/gpu/drm/dep/drm_dep_queue.h b/drivers/gpu/drm/dep/drm_dep_queue.h
> > new file mode 100644
> > index 000000000000..e5c217a3fab5
> > --- /dev/null
> > +++ b/drivers/gpu/drm/dep/drm_dep_queue.h
> > @@ -0,0 +1,31 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2026 Intel Corporation
> > + */
> > +
> > +#ifndef _DRM_DEP_QUEUE_H_
> > +#define _DRM_DEP_QUEUE_H_
> > +
> > +#include <linux/types.h>
> > +
> > +struct drm_dep_job;
> > +struct drm_dep_queue;
> > +
> > +bool drm_dep_queue_can_job_bypass(struct drm_dep_queue *q,
> > +  struct drm_dep_job *job);
> > +void drm_dep_queue_run_job(struct drm_dep_queue *q, struct drm_dep_job *job);
> > +void drm_dep_queue_push_job(struct drm_dep_queue *q, struct drm_dep_job *job);
> > +
> > +#if IS_ENABLED(CONFIG_PROVE_LOCKING)
> > +void drm_dep_queue_push_job_begin(struct drm_dep_queue *q);
> > +void drm_dep_queue_push_job_end(struct drm_dep_queue *q);
> > +#else
> > +static inline void drm_dep_queue_push_job_begin(struct drm_dep_queue *q)
> > +{
> > +}
> > +static inline void drm_dep_queue_push_job_end(struct drm_dep_queue *q)
> > +{
> > +}
> > +#endif
> > +
> > +#endif /* _DRM_DEP_QUEUE_H_ */
> > diff --git a/include/drm/drm_dep.h b/include/drm/drm_dep.h
> > new file mode 100644
> > index 000000000000..615926584506
> > --- /dev/null
> > +++ b/include/drm/drm_dep.h
> > @@ -0,0 +1,597 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright 2015 Advanced Micro Devices, Inc.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > + * OTHER DEALINGS IN THE SOFTWARE.
> > + *
> > + * Copyright © 2026 Intel Corporation
> > + */
> > +
> > +#ifndef _DRM_DEP_H_
> > +#define _DRM_DEP_H_
> > +
> > +#include <drm/spsc_queue.h>
> > +#include <linux/dma-fence.h>
> > +#include <linux/xarray.h>
> > +#include <linux/workqueue.h>
> > +
> > +enum dma_resv_usage;
> > +struct dma_resv;
> > +struct drm_dep_fence;
> > +struct drm_dep_job;
> > +struct drm_dep_queue;
> > +struct drm_file;
> > +struct drm_gem_object;
> > +
> > +/**
> > + * enum drm_dep_timedout_stat - return value of &drm_dep_queue_ops.timedout_job
> > + * @DRM_DEP_TIMEDOUT_STAT_JOB_SIGNALED: driver signaled the job's finished
> > + *   fence during reset; drm_dep may safely drop its reference to the job.
> > + * @DRM_DEP_TIMEDOUT_STAT_REQUEUE_JOB: timeout was a false alarm; reinsert the
> > + *   job at the head of the pending list so it can complete normally.
> > + */
> > +enum drm_dep_timedout_stat {
> > + DRM_DEP_TIMEDOUT_STAT_JOB_SIGNALED,
> > + DRM_DEP_TIMEDOUT_STAT_REQUEUE_JOB,
> > +};
> > +
> > +/**
> > + * struct drm_dep_queue_ops - driver callbacks for a dep queue
> > + */
> > +struct drm_dep_queue_ops {
> > + /**
> > + * @run_job: submit the job to hardware. Returns the hardware completion
> > + * fence (with a reference held for the scheduler), or NULL/ERR_PTR on
> > + * synchronous completion or error.
> > + */
> > + struct dma_fence *(*run_job)(struct drm_dep_job *job);
> > +
> > + /**
> > + * @timedout_job: called when the TDR fires for the head job. Must stop
> > + * the hardware, then return %DRM_DEP_TIMEDOUT_STAT_JOB_SIGNALED if the
> > + * job's fence was signalled during reset, or
> > + * %DRM_DEP_TIMEDOUT_STAT_REQUEUE_JOB if the timeout was spurious or
> > + * signalling was otherwise delayed, and the job should be re-inserted
> > + * at the head of the pending list. Any other value triggers a WARN.
> > + */
> > + enum drm_dep_timedout_stat (*timedout_job)(struct drm_dep_job *job);
> > +
> > + /**
> > + * @release: called when the last kref on the queue is dropped and
> > + * drm_dep_queue_fini() has completed.  The driver is responsible for
> > + * removing @q from any internal bookkeeping, calling
> > + * drm_dep_queue_release(), and then freeing the memory containing @q
> > + * (e.g. via kfree_rcu() using @q->rcu).  If NULL, drm_dep calls
> > + * drm_dep_queue_release() and frees @q automatically via kfree_rcu().
> > + * Use this when the queue is embedded in a larger structure.
> > + */
> > + void (*release)(struct drm_dep_queue *q);
> > +
> > + /**
> > + * @fini: if set, called instead of drm_dep_queue_fini() when the last
> > + * kref is dropped. The driver is responsible for calling
> > + * drm_dep_queue_fini() itself after it is done with the queue. Use this
> > + * when additional teardown logic must run before fini (e.g., cleanup
> > + * firmware resources associated with the queue).
> > + */
> > + void (*fini)(struct drm_dep_queue *q);
> > +};
> > +
> > +/**
> > + * enum drm_dep_queue_flags - flags for &drm_dep_queue and
> > + *   &drm_dep_queue_init_args
> > + *
> > + * Flags are divided into three categories:
> > + *
> > + * - **Private static**: set internally at init time and never changed.
> > + *   Drivers must not read or write these.
> > + *   %DRM_DEP_QUEUE_FLAGS_OWN_SUBMIT_WQ,
> > + *   %DRM_DEP_QUEUE_FLAGS_OWN_TIMEDOUT_WQ.
> > + *
> > + * - **Public dynamic**: toggled at runtime by drivers via accessors.
> > + *   Any modification must be performed under &drm_dep_queue.sched.lock.
> 
> Can’t enforce that in C.
> 
> > + *   Accessor functions provide unstable reads.
> > + *   %DRM_DEP_QUEUE_FLAGS_STOPPED,
> > + *   %DRM_DEP_QUEUE_FLAGS_KILLED.
> 
> > + *
> > + * - **Public static**: supplied by the driver in
> > + *   &drm_dep_queue_init_args.flags at queue creation time and not modified
> > + *   thereafter.
> 
> Same here.
> 
> > + *   %DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED,
> > + *   %DRM_DEP_QUEUE_FLAGS_HIGHPRI,
> > + *   %DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE.
> 
> > + *
> > + * @DRM_DEP_QUEUE_FLAGS_OWN_SUBMIT_WQ: (private, static) submit workqueue was
> > + *   allocated by drm_dep_queue_init() and will be destroyed by
> > + *   drm_dep_queue_fini().
> > + * @DRM_DEP_QUEUE_FLAGS_OWN_TIMEDOUT_WQ: (private, static) timeout workqueue
> > + *   was allocated by drm_dep_queue_init() and will be destroyed by
> > + *   drm_dep_queue_fini().
> > + * @DRM_DEP_QUEUE_FLAGS_STOPPED: (public, dynamic) the queue is stopped and
> > + *   will not dispatch new jobs or remove jobs from the pending list, dropping
> > + *   the drm_dep-owned reference. Set by drm_dep_queue_stop(), cleared by
> > + *   drm_dep_queue_start().
> > + * @DRM_DEP_QUEUE_FLAGS_KILLED: (public, dynamic) the queue has been killed
> > + *   via drm_dep_queue_kill(). Any active dependency wait is cancelled
> > + *   immediately.  Jobs continue to flow through run_job for bookkeeping
> > + *   cleanup, but dependency waiting is skipped so that queued work drains
> > + *   as quickly as possible.
> > + * @DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED: (public, static) the queue supports
> > + *   the bypass path where eligible jobs skip the SPSC queue and run inline.
> > + * @DRM_DEP_QUEUE_FLAGS_HIGHPRI: (public, static) the submit workqueue owned
> > + *   by the queue is created with %WQ_HIGHPRI, causing run-job and put-job
> > + *   workers to execute at elevated priority. Only privileged clients (e.g.
> > + *   drivers managing time-critical or real-time GPU contexts) should request
> > + *   this flag; granting it to unprivileged userspace would allow priority
> > + *   inversion attacks.
> > + *   @drm_dep_queue_init_args.submit_wq is provided.
> > + * @DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE: (public, static) when set,
> > + *   drm_dep_job_done() may be called from hardirq context (e.g. from a
> > + *   hardware-signalled dma_fence callback). drm_dep_job_done() will directly
> > + *   dequeue the job and call drm_dep_job_put() without deferring to a
> > + *   workqueue. The driver's &drm_dep_job_ops.release callback must therefore
> > + *   be safe to invoke from IRQ context.
> > + */
> > +enum drm_dep_queue_flags {
> > + DRM_DEP_QUEUE_FLAGS_OWN_SUBMIT_WQ = BIT(0),
> > + DRM_DEP_QUEUE_FLAGS_OWN_TIMEDOUT_WQ = BIT(1),
> > + DRM_DEP_QUEUE_FLAGS_STOPPED = BIT(2),
> > + DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED = BIT(3),
> > + DRM_DEP_QUEUE_FLAGS_HIGHPRI = BIT(4),
> > + DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE = BIT(5),
> > + DRM_DEP_QUEUE_FLAGS_KILLED = BIT(6),
> > +};
> > +
> > +/**
> > + * struct drm_dep_queue - a dependency-tracked GPU submission queue
> > + *
> > + * Combines the role of &drm_gpu_scheduler and &drm_sched_entity into a single
> > + * object.  Each queue owns a submit workqueue (or borrows one), a timeout
> > + * workqueue, an SPSC submission queue, and a pending-job list used for TDR.
> > + *
> > + * Initialise with drm_dep_queue_init(), tear down with drm_dep_queue_fini().
> > + * Reference counted via drm_dep_queue_get() / drm_dep_queue_put().
> > + *
> > + * All fields are **opaque to drivers**.  Do not read or write any field
> 
> Can’t enforce this in C.
> 
> > + * directly; use the provided helper functions instead.  The sole exception
> > + * is @rcu, which drivers may pass to kfree_rcu() when the queue is embedded
> > + * inside a larger driver-managed structure and the &drm_dep_queue_ops.release
> > + * vfunc performs an RCU-deferred free.
> 
> > + */
> > +struct drm_dep_queue {
> > + /** @ops: driver callbacks, set at init time. */
> > + const struct drm_dep_queue_ops *ops;
> > + /** @name: human-readable name used for workqueue and fence naming. */
> > + const char *name;
> > + /** @drm: owning DRM device; a drm_dev_get() reference is held for the
> > + *  lifetime of the queue to prevent module unload while queues are live.
> > + */
> > + struct drm_device *drm;
> > + /** @refcount: reference count; use drm_dep_queue_get/put(). */
> > + struct kref refcount;
> > + /**
> > + * @free_work: deferred teardown work queued unconditionally by
> > + * drm_dep_queue_fini() onto the module-private dep_free_wq.  The work
> > + * item disables pending workers synchronously and destroys any owned
> > + * workqueues before releasing the queue memory and dropping the
> > + * drm_dev_get() reference.  Running on dep_free_wq ensures
> > + * destroy_workqueue() is never called from within one of the queue's
> > + * own workers.
> > + */
> > + struct work_struct free_work;
> > + /**
> > + * @rcu: RCU head for deferred freeing.
> > + *
> > + * This is the **only** field drivers may access directly.  When the
> 
> We can enforce this in Rust at compile time.
> 
> > + * queue is embedded in a larger structure, implement
> > + * &drm_dep_queue_ops.release, call drm_dep_queue_release() to destroy
> > + * internal resources, then pass this field to kfree_rcu() so that any
> > + * in-flight RCU readers referencing the queue's dma_fence timeline name
> > + * complete before the memory is returned.  All other fields must be
> > + * accessed through the provided helpers.
> > + */
> > + struct rcu_head rcu;
> > +
> > + /** @sched: scheduling and workqueue state. */
> > + struct {
> > + /** @sched.submit_wq: ordered workqueue for run/put-job work. */
> > + struct workqueue_struct *submit_wq;
> > + /** @sched.timeout_wq: workqueue for the TDR delayed work. */
> > + struct workqueue_struct *timeout_wq;
> > + /**
> > + * @sched.run_job: work item that dispatches the next queued
> > + * job.
> > + */
> > + struct work_struct run_job;
> > + /** @sched.put_job: work item that frees finished jobs. */
> > + struct work_struct put_job;
> > + /** @sched.tdr: delayed work item for timeout/reset (TDR). */
> > + struct delayed_work tdr;
> > + /**
> > + * @sched.lock: mutex serialising job dispatch, bypass
> > + * decisions, stop/start, and flag updates.
> > + */
> > + struct mutex lock;
> > + /**
> > + * @sched.flags: bitmask of &enum drm_dep_queue_flags.
> > + * Any modification after drm_dep_queue_init() must be
> > + * performed under @sched.lock.
> > + */
> > + enum drm_dep_queue_flags flags;
> > + } sched;
> > +
> > + /** @job: pending-job tracking state. */
> > + struct {
> > + /**
> > + * @job.pending: list of jobs that have been dispatched to
> > + * hardware and not yet freed. Protected by @job.lock.
> > + */
> > + struct list_head pending;
> > + /**
> > + * @job.queue: SPSC queue of jobs waiting to be dispatched.
> > + * Producers push via drm_dep_queue_push_job(); the run_job
> > + * work item pops from the consumer side.
> > + */
> > + struct spsc_queue queue;
> > + /**
> > + * @job.lock: spinlock protecting @job.pending, TDR start, and
> > + * the %DRM_DEP_QUEUE_FLAGS_STOPPED flag. Always acquired with
> > + * irqsave (spin_lock_irqsave / spin_unlock_irqrestore) to
> > + * support %DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE queues where
> > + * drm_dep_job_done() may run from hardirq context.
> > + */
> > + spinlock_t lock;
> > + /**
> > + * @job.timeout: per-job TDR timeout in jiffies.
> > + * %MAX_SCHEDULE_TIMEOUT means no timeout.
> > + */
> > + long timeout;
> > +#if IS_ENABLED(CONFIG_PROVE_LOCKING)
> > + /**
> > + * @job.push: lockdep annotation tracking the arm-to-push
> > + * critical section.
> > + */
> > + struct {
> > + /*
> > + * @job.push.owner: task that currently holds the push
> > + * context, used to assert single-owner invariants.
> > + * NULL when idle.
> > + */
> > + struct task_struct *owner;
> > + } push;
> > +#endif
> > + } job;
> > +
> > + /** @credit: hardware credit accounting. */
> > + struct {
> > + /** @credit.limit: maximum credits the queue can hold. */
> > + u32 limit;
> > + /** @credit.count: credits currently in flight (atomic). */
> > + atomic_t count;
> > + } credit;
> > +
> > + /** @dep: current blocking dependency for the head SPSC job. */
> > + struct {
> > + /**
> > + * @dep.fence: fence being waited on before the head job can
> > + * run. NULL when no dependency is pending.
> > + */
> > + struct dma_fence *fence;
> > + /**
> > + * @dep.removed_fence: dependency fence whose callback has been
> > + * removed.  The run-job worker must drop its reference to this
> > + * fence before proceeding to call run_job.
> 
> We can enforce this in Rust automatically.
> 
> > + */
> > + struct dma_fence *removed_fence;
> > + /** @dep.cb: callback installed on @dep.fence. */
> > + struct dma_fence_cb cb;
> > + } dep;
> > +
> > + /** @fence: fence context and sequence number state. */
> > + struct {
> > + /**
> > + * @fence.seqno: next sequence number to assign, incremented
> > + * each time a job is armed.
> > + */
> > + u32 seqno;
> > + /**
> > + * @fence.context: base DMA fence context allocated at init
> > + * time. Finished fences use this context.
> > + */
> > + u64 context;
> > + } fence;
> > +};
> > +
> > +/**
> > + * struct drm_dep_queue_init_args - arguments for drm_dep_queue_init()
> > + */
> > +struct drm_dep_queue_init_args {
> > + /** @ops: driver callbacks; must not be NULL. */
> > + const struct drm_dep_queue_ops *ops;
> > + /** @name: human-readable name for workqueues and fence timelines. */
> > + const char *name;
> > + /** @drm: owning DRM device. A drm_dev_get() reference is taken at
> > + *  queue init and released when the queue is freed, preventing module
> > + *  unload while any queue is still alive.
> > + */
> > + struct drm_device *drm;
> > + /**
> > + * @submit_wq: workqueue for job dispatch. If NULL, an ordered
> > + * workqueue is allocated and owned by the queue.  If non-NULL, the
> > + * workqueue must have been allocated with %WQ_MEM_RECLAIM_TAINT;
> > + * drm_dep_queue_init() returns %-EINVAL otherwise.
> > + */
> > + struct workqueue_struct *submit_wq;
> > + /**
> > + * @timeout_wq: workqueue for TDR. If NULL, an ordered workqueue
> > + * is allocated and owned by the queue.  If non-NULL, the workqueue
> > + * must have been allocated with %WQ_MEM_RECLAIM_TAINT;
> > + * drm_dep_queue_init() returns %-EINVAL otherwise.
> > + */
> > + struct workqueue_struct *timeout_wq;
> > + /** @credit_limit: maximum hardware credits; must be non-zero. */
> > + u32 credit_limit;
> > + /**
> > + * @timeout: per-job TDR timeout in jiffies. Zero means no timeout
> > + * (%MAX_SCHEDULE_TIMEOUT is used internally).
> > + */
> > + long timeout;
> > + /**
> > + * @flags: initial queue flags. %DRM_DEP_QUEUE_FLAGS_OWN_SUBMIT_WQ
> > + * and %DRM_DEP_QUEUE_FLAGS_OWN_TIMEDOUT_WQ are managed internally
> > + * and will be ignored if set here. Setting
> > + * %DRM_DEP_QUEUE_FLAGS_HIGHPRI requests a high-priority submit
> > + * workqueue; drivers must only set this for privileged clients.
> > + */
> > + enum drm_dep_queue_flags flags;
> > +};
> > +
> > +/**
> > + * struct drm_dep_job_ops - driver callbacks for a dep job
> > + */
> > +struct drm_dep_job_ops {
> > + /**
> > + * @release: called when the last reference to the job is dropped.
> > + *
> > + * If set, the driver is responsible for freeing the job. If NULL,
> 
> And if they don’t?
> 
> By the way, we can also enforce this in Rust.
> 
> > + * drm_dep_job_put() will call kfree() on the job directly.
> > + */
> > + void (*release)(struct drm_dep_job *job);
> > +};
> > +
> > +/**
> > + * struct drm_dep_job - a unit of work submitted to a dep queue
> > + *
> > + * All fields are **opaque to drivers**.  Do not read or write any field
> > + * directly; use the provided helper functions instead.
> > + */
> > +struct drm_dep_job {
> > + /** @ops: driver callbacks for this job. */
> > + const struct drm_dep_job_ops *ops;
> > + /** @refcount: reference count, managed by drm_dep_job_get/put(). */
> > + struct kref refcount;
> > + /**
> > + * @dependencies: xarray of &dma_fence dependencies before the job can
> > + * run.
> > + */
> > + struct xarray dependencies;
> > + /** @q: the queue this job is submitted to. */
> > + struct drm_dep_queue *q;
> > + /** @queue_node: SPSC queue linkage for pending submission. */
> > + struct spsc_node queue_node;
> > + /**
> > + * @pending_link: list entry in the queue's pending job list. Protected
> > + * by @job.q->job.lock.
> > + */
> > + struct list_head pending_link;
> > + /** @dfence: finished fence for this job. */
> > + struct drm_dep_fence *dfence;
> > + /** @cb: fence callback used to watch for dependency completion. */
> > + struct dma_fence_cb cb;
> > + /** @credits: number of credits this job consumes from the queue. */
> > + u32 credits;
> > + /**
> > + * @last_dependency: index into @dependencies of the next fence to
> > + * check. Advanced by drm_dep_queue_job_dependency() as each
> > + * dependency is consumed.
> > + */
> > + u32 last_dependency;
> > + /**
> > + * @invalidate_count: number of times this job has been invalidated.
> > + * Incremented by drm_dep_job_invalidate_job().
> > + */
> > + u32 invalidate_count;
> > + /**
> > + * @signalling_cookie: return value of dma_fence_begin_signalling()
> > + * captured in drm_dep_job_arm() and consumed by drm_dep_job_push().
> > + * Not valid outside the arm→push window.
> > + */
> > + bool signalling_cookie;
> > +};
> > +
> > +/**
> > + * struct drm_dep_job_init_args - arguments for drm_dep_job_init()
> > + */
> > +struct drm_dep_job_init_args {
> > + /**
> > + * @ops: driver callbacks for the job, or NULL for default behaviour.
> > + */
> > + const struct drm_dep_job_ops *ops;
> > + /** @q: the queue to associate the job with. A reference is taken. */
> > + struct drm_dep_queue *q;
> > + /** @credits: number of credits this job consumes; must be non-zero. */
> > + u32 credits;
> > +};
> > +
> > +/* Queue API */
> > +
> > +/**
> > + * drm_dep_queue_sched_guard() - acquire the queue scheduler lock as a guard
> > + * @__q: dep queue whose scheduler lock to acquire
> > + *
> > + * Acquires @__q->sched.lock as a scoped mutex guard (released automatically
> > + * when the enclosing scope exits).  This lock serialises all scheduler state
> > + * transitions — stop/start/kill flag changes, bypass-path decisions, and the
> > + * run-job worker — so it must be held when the driver needs to atomically
> > + * inspect or modify queue state in relation to job submission.
> > + *
> > + * **When to use**
> > + *
> > + * Drivers that set %DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED and wish to
> > + * serialise their own submit work against the bypass path must acquire this
> > + * guard.  Without it, a concurrent caller of drm_dep_job_push() could take
> > + * the bypass path and call ops->run_job() inline between the driver's
> > + * eligibility check and its corresponding action, producing a race.
> 
> So if you’re not careful, you have just introduced a race :/
> 
> > + *
> > + * **Constraint: only from submit_wq worker context**
> > + *
> > + * This guard must only be acquired from a work item running on the queue's
> > + * submit workqueue (@q->sched.submit_wq) by drivers.
> > + *
> > + * Context: Process context only; must be called from submit_wq work by
> > + * drivers.
> > + */
> > +#define drm_dep_queue_sched_guard(__q) \
> > + guard(mutex)(&(__q)->sched.lock)
> > +
> > +int drm_dep_queue_init(struct drm_dep_queue *q,
> > +       const struct drm_dep_queue_init_args *args);
> > +void drm_dep_queue_fini(struct drm_dep_queue *q);
> > +void drm_dep_queue_release(struct drm_dep_queue *q);
> > +struct drm_dep_queue *drm_dep_queue_get(struct drm_dep_queue *q);
> > +bool drm_dep_queue_get_unless_zero(struct drm_dep_queue *q);
> > +void drm_dep_queue_put(struct drm_dep_queue *q);
> > +void drm_dep_queue_stop(struct drm_dep_queue *q);
> > +void drm_dep_queue_start(struct drm_dep_queue *q);
> > +void drm_dep_queue_kill(struct drm_dep_queue *q);
> > +void drm_dep_queue_trigger_timeout(struct drm_dep_queue *q);
> > +void drm_dep_queue_cancel_tdr_sync(struct drm_dep_queue *q);
> > +void drm_dep_queue_resume_timeout(struct drm_dep_queue *q);
> > +bool drm_dep_queue_work_enqueue(struct drm_dep_queue *q,
> > + struct work_struct *work);
> > +bool drm_dep_queue_is_stopped(struct drm_dep_queue *q);
> > +bool drm_dep_queue_is_killed(struct drm_dep_queue *q);
> > +bool drm_dep_queue_is_initialized(struct drm_dep_queue *q);
> > +void drm_dep_queue_set_stopped(struct drm_dep_queue *q);
> > +unsigned int drm_dep_queue_refcount(const struct drm_dep_queue *q);
> > +long drm_dep_queue_timeout(const struct drm_dep_queue *q);
> > +struct workqueue_struct *drm_dep_queue_submit_wq(struct drm_dep_queue *q);
> > +struct workqueue_struct *drm_dep_queue_timeout_wq(struct drm_dep_queue *q);
> > +
> > +/* Job API */
> > +
> > +/**
> > + * DRM_DEP_JOB_FENCE_PREALLOC - sentinel value for pre-allocating a dependency slot
> > + *
> > + * Pass this to drm_dep_job_add_dependency() instead of a real fence to
> > + * pre-allocate a slot in the job's dependency xarray during the preparation
> > + * phase (where GFP_KERNEL is available).  The returned xarray index identifies
> > + * the slot.  Call drm_dep_job_replace_dependency() later — inside a
> > + * dma_fence_begin_signalling() region if needed — to swap in the real fence
> > + * without further allocation.
> > + *
> > + * This sentinel is never treated as a dma_fence; it carries no reference count
> > + * and must not be passed to dma_fence_put().  It is only valid as an argument
> > + * to drm_dep_job_add_dependency() and as the expected stored value checked by
> > + * drm_dep_job_replace_dependency().
> > + */
> > +#define DRM_DEP_JOB_FENCE_PREALLOC ((struct dma_fence *)-1)
> > +
> > +int drm_dep_job_init(struct drm_dep_job *job,
> > +     const struct drm_dep_job_init_args *args);
> > +struct drm_dep_job *drm_dep_job_get(struct drm_dep_job *job);
> > +void drm_dep_job_put(struct drm_dep_job *job);
> > +void drm_dep_job_arm(struct drm_dep_job *job);
> > +void drm_dep_job_push(struct drm_dep_job *job);
> > +int drm_dep_job_add_dependency(struct drm_dep_job *job,
> > +       struct dma_fence *fence);
> > +void drm_dep_job_replace_dependency(struct drm_dep_job *job, u32 index,
> > +    struct dma_fence *fence);
> > +int drm_dep_job_add_syncobj_dependency(struct drm_dep_job *job,
> > +       struct drm_file *file, u32 handle,
> > +       u32 point);
> > +int drm_dep_job_add_resv_dependencies(struct drm_dep_job *job,
> > +      struct dma_resv *resv,
> > +      enum dma_resv_usage usage);
> > +int drm_dep_job_add_implicit_dependencies(struct drm_dep_job *job,
> > +  struct drm_gem_object *obj,
> > +  bool write);
> > +bool drm_dep_job_is_signaled(struct drm_dep_job *job);
> > +bool drm_dep_job_is_finished(struct drm_dep_job *job);
> > +bool drm_dep_job_invalidate_job(struct drm_dep_job *job, int threshold);
> > +struct dma_fence *drm_dep_job_finished_fence(struct drm_dep_job *job);
> > +
> > +/**
> > + * struct drm_dep_queue_pending_job_iter - iterator state for
> > + *   drm_dep_queue_for_each_pending_job()
> > + * @q: queue being iterated
> > + */
> > +struct drm_dep_queue_pending_job_iter {
> > + struct drm_dep_queue *q;
> > +};
> > +
> > +/* Drivers should never call this directly */
> 
> Not enforceable in C.
> 
> > +static inline struct drm_dep_queue_pending_job_iter
> > +__drm_dep_queue_pending_job_iter_begin(struct drm_dep_queue *q)
> > +{
> > + struct drm_dep_queue_pending_job_iter iter = {
> > + .q = q,
> > + };
> > +
> > + WARN_ON(!drm_dep_queue_is_stopped(q));
> > + return iter;
> > +}
> > +
> > +/* Drivers should never call this directly */
> > +static inline void
> > +__drm_dep_queue_pending_job_iter_end(struct drm_dep_queue_pending_job_iter iter)
> > +{
> > + WARN_ON(!drm_dep_queue_is_stopped(iter.q));
> > +}
> > +
> > +/* clang-format off */
> > +DEFINE_CLASS(drm_dep_queue_pending_job_iter,
> > +     struct drm_dep_queue_pending_job_iter,
> > +     __drm_dep_queue_pending_job_iter_end(_T),
> > +     __drm_dep_queue_pending_job_iter_begin(__q),
> > +     struct drm_dep_queue *__q);
> > +/* clang-format on */
> > +static inline void *
> > +class_drm_dep_queue_pending_job_iter_lock_ptr(
> > + class_drm_dep_queue_pending_job_iter_t *_T)
> > +{ return _T; }
> > +#define class_drm_dep_queue_pending_job_iter_is_conditional false
> > +
> > +/**
> > + * drm_dep_queue_for_each_pending_job() - iterate over all pending jobs
> > + *   in a queue
> > + * @__job: loop cursor, a &struct drm_dep_job pointer
> > + * @__q: &struct drm_dep_queue to iterate
> > + *
> > + * Iterates over every job currently on @__q->job.pending. The queue must be
> > + * stopped (drm_dep_queue_stop() called) before using this iterator; a WARN_ON
> > + * fires at the start and end of the scope if it is not.
> > + *
> > + * Context: Any context.
> > + */
> > +#define drm_dep_queue_for_each_pending_job(__job, __q) \
> > + scoped_guard(drm_dep_queue_pending_job_iter, (__q)) \
> > + list_for_each_entry((__job), &(__q)->job.pending, pending_link)
> > +
> > +#endif
> > -- 
> > 2.34.1
> > 
> 
> 
> By the way:
> 
> I invite you to have a look at this implementation [0]. It currently works in real
> hardware i.e.: our downstream "Tyr" driver for Arm Mali is using that at the
> moment. It is a mere prototype that we’ve put together to test different
> approaches, so it’s not meant to be a “solution” at all. It’s a mere data point
> for further discussion.
> 
> Philip Stanner is working on this “Job Queue” concept too, but from an upstream
> perspective.
> 
> [0]: https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/61

  reply	other threads:[~2026-03-17  5:45 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260316043255.226352-1-matthew.brost@intel.com>
2026-03-16  4:32 ` [RFC PATCH 01/12] workqueue: Add interface to teach lockdep to warn on reclaim violations Matthew Brost
2026-03-25 15:59   ` Tejun Heo
2026-03-26  1:49     ` Matthew Brost
2026-03-26  2:19       ` Tejun Heo
2026-03-27  4:33         ` Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 02/12] drm/dep: Add DRM dependency queue layer Matthew Brost
2026-03-16  9:16   ` Boris Brezillon
2026-03-17  5:22     ` Matthew Brost
2026-03-17  8:48       ` Boris Brezillon
2026-03-16 10:25   ` Danilo Krummrich
2026-03-17  5:10     ` Matthew Brost
2026-03-17 12:19       ` Danilo Krummrich
2026-03-18 23:02         ` Matthew Brost
2026-03-17  2:47   ` Daniel Almeida
2026-03-17  5:45     ` Matthew Brost [this message]
2026-03-17  7:17       ` Miguel Ojeda
2026-03-17  8:26         ` Matthew Brost
2026-03-17 12:04           ` Daniel Almeida
2026-03-17 19:41           ` Miguel Ojeda
2026-03-23 17:31             ` Matthew Brost
2026-03-23 17:42               ` Miguel Ojeda
2026-03-17 18:14       ` Matthew Brost
2026-03-17 19:48         ` Daniel Almeida
2026-03-17 20:43         ` Boris Brezillon
2026-03-18 22:40           ` Matthew Brost
2026-03-19  9:57             ` Boris Brezillon
2026-03-22  6:43               ` Matthew Brost
2026-03-23  7:58                 ` Matthew Brost
2026-03-23 10:06                   ` Boris Brezillon
2026-03-23 17:11                     ` Matthew Brost
2026-03-17 12:31     ` Danilo Krummrich
2026-03-17 14:25       ` Daniel Almeida
2026-03-17 14:33         ` Danilo Krummrich
2026-03-18 22:50           ` Matthew Brost
2026-03-17  8:47   ` Christian König
2026-03-17 14:55   ` Boris Brezillon
2026-03-18 23:28     ` Matthew Brost
2026-03-19  9:11       ` Boris Brezillon
2026-03-23  4:50         ` Matthew Brost
2026-03-23  9:55           ` Boris Brezillon
2026-03-23 17:08             ` Matthew Brost
2026-03-23 18:38               ` Matthew Brost
2026-03-24  9:23                 ` Boris Brezillon
2026-03-24 16:06                   ` Matthew Brost
2026-03-25  2:33                     ` Matthew Brost
2026-03-24  8:49               ` Boris Brezillon
2026-03-24 16:51                 ` Matthew Brost
2026-03-17 16:30   ` Shashank Sharma
2026-03-16  4:32 ` [RFC PATCH 11/12] accel/amdxdna: Convert to drm_dep scheduler layer Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 12/12] drm/panthor: " Matthew Brost

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=abjqfXERS6Xk4FAQ@lstrano-desk.jf.intel.com \
    --to=matthew.brost@intel.com \
    --cc=acourbot@nvidia.com \
    --cc=airlied@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=boris.brezillon@collabora.com \
    --cc=christian.koenig@amd.com \
    --cc=dakr@kernel.org \
    --cc=daniel.almeida@collabora.com \
    --cc=daniels@collabora.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=ecourtney@nvidia.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jajones@nvidia.com \
    --cc=jeffv@google.com \
    --cc=jhubbard@nvidia.com \
    --cc=joelagnelf@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=mripard@kernel.org \
    --cc=phasta@kernel.org \
    --cc=rodrigo.vivi@intel.com \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=samitolvanen@google.com \
    --cc=shashanks@nvidia.com \
    --cc=simona@ffwll.ch \
    --cc=sumit.semwal@linaro.org \
    --cc=thomas.hellstrom@linux.intel.com \
    --cc=tvrtko.ursulin@igalia.com \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox