Re: [PATCH v3 10/14] drm/panthor: Add the scheduler logical block

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Boris Brezillon <boris.brezillon@collabora.com>
To: Steven Price <steven.price@arm.com>
Cc: "Nicolas Boichat" <drinkcat@chromium.org>,
	kernel@collabora.com, "Daniel Stone" <daniels@collabora.com>,
	"Neil Armstrong" <neil.armstrong@linaro.org>,
	"Liviu Dudau" <Liviu.Dudau@arm.com>,
	dri-devel@lists.freedesktop.org,
	"Clément Péron" <peron.clem@gmail.com>,
	"Marty E . Plummer" <hanetzer@startmail.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Faith Ekstrand" <faith.ekstrand@collabora.com>
Subject: Re: [PATCH v3 10/14] drm/panthor: Add the scheduler logical block
Date: Mon, 15 Jan 2024 14:03:51 +0100	[thread overview]
Message-ID: <20240115140351.687ae919@collabora.com> (raw)
In-Reply-To: <d100db02-fc65-452e-8081-ff6794284e70@arm.com>

On Mon, 11 Dec 2023 16:27:36 +0000
Steven Price <steven.price@arm.com> wrote:

> On 04/12/2023 17:33, Boris Brezillon wrote:
> > This is the piece of software interacting with the FW scheduler, and
> > taking care of some scheduling aspects when the FW comes short of slots
> > scheduling slots. Indeed, the FW only expose a few slots, and the kernel
> > has to give all submission contexts, a chance to execute their jobs.
> > 
> > The kernel-side scheduler is timeslice-based, with a round-robin queue
> > per priority level.
> > 
> > Job submission is handled with a 1:1 drm_sched_entity:drm_gpu_scheduler,
> > allowing us to delegate the dependency tracking to the core.
> > 
> > All the gory details should be documented inline.
> > 
> > v3:
> > - Rework the FW event handling logic to avoid races
> > - Make sure MMU faults kill the group immediately
> > - Use the panthor_kernel_bo abstraction for group/queue buffers
> > - Make in_progress an atomic_t, so we can check it without the reset lock
> >   held
> > - Don't limit the number of groups per context to the FW scheduler
> >   capacity. Fix the limit to 128 for now.
> > - Add a panthor_job_vm() helper
> > - Account for panthor_vm changes
> > - Add our job fence as DMA_RESV_USAGE_WRITE to all external objects
> >   (was previously DMA_RESV_USAGE_BOOKKEEP). I don't get why, given
> >   we're supposed to be fully-explicit, but other drivers do that, so
> >   there must be a good reason
> > - Account for drm_sched changes
> > - Provide a panthor_queue_put_syncwait_obj()
> > - Unconditionally return groups to their idle list in
> >   panthor_sched_suspend()
> > - Condition of sched_queue_{,delayed_}work fixed to be only when a reset
> >   isn't pending or in progress.
> > - Several typos in comments fixed.
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > Signed-off-by: Steven Price <steven.price@arm.com>  
> 
> Two minor comments below, but either way:
> 
> Reviewed-by: Steven Price <steven.price@arm.com>
> 
> > ---
> >  drivers/gpu/drm/panthor/panthor_sched.c | 3410 +++++++++++++++++++++++
> >  drivers/gpu/drm/panthor/panthor_sched.h |   48 +
> >  2 files changed, 3458 insertions(+)
> >  create mode 100644 drivers/gpu/drm/panthor/panthor_sched.c
> >  create mode 100644 drivers/gpu/drm/panthor/panthor_sched.h
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> > new file mode 100644
> > index 000000000000..08e5662f4879
> > --- /dev/null
> > +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> > @@ -0,0 +1,3410 @@
> > +// SPDX-License-Identifier: GPL-2.0 or MIT
> > +/* Copyright 2023 Collabora ltd. */
> > +
> > +#include <drm/panthor_drm.h>
> > +#include <drm/drm_drv.h>
> > +#include <drm/drm_exec.h>
> > +#include <drm/drm_gem_shmem_helper.h>
> > +#include <drm/drm_managed.h>
> > +#include <drm/gpu_scheduler.h>
> > +
> > +#include <linux/build_bug.h>
> > +#include <linux/clk.h>
> > +#include <linux/delay.h>
> > +#include <linux/dma-mapping.h>
> > +#include <linux/firmware.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/io.h>
> > +#include <linux/iopoll.h>
> > +#include <linux/iosys-map.h>
> > +#include <linux/module.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/pm_runtime.h>
> > +#include <linux/dma-resv.h>
> > +
> > +#include "panthor_sched.h"
> > +#include "panthor_devfreq.h"
> > +#include "panthor_device.h"
> > +#include "panthor_gem.h"
> > +#include "panthor_heap.h"
> > +#include "panthor_regs.h"
> > +#include "panthor_gpu.h"
> > +#include "panthor_fw.h"
> > +#include "panthor_mmu.h"
> > +
> > +/**
> > + * DOC: Scheduler
> > + *
> > + * Mali CSF hardware adopts a firmware-assisted scheduling model, where
> > + * the firmware takes care of scheduling aspects, to some extend.
> > + *
> > + * The scheduling happens at the scheduling group level, each group
> > + * contains 1 to N queues (N is FW/hardware dependent, and exposed
> > + * through the firmware interface). Each queue is assigned a command
> > + * stream ring buffer, which serves as a way to get jobs submitted to
> > + * the GPU, among other things.
> > + *
> > + * The firmware can schedule a maximum of M groups (M is FW/hardware
> > + * dependent, and exposed through the firmware interface). Passed
> > + * this maximum number of groups, the kernel must take care of
> > + * rotating the groups passed to the firmware so every group gets
> > + * a chance to have his queues scheduled for execution.
> > + *
> > + * The current implementation only supports with kernel-mode queues.
> > + * In other terms, userspace doesn't have access to the ring-buffer.
> > + * Instead, userspace passes indirect command stream buffers that are
> > + * called from the queue ring-buffer by the kernel using a pre-defined
> > + * sequence of command stream instructions to ensure the userspace driver
> > + * always gets consistent results (cache maintenance,
> > + * synchronization, ...).
> > + *
> > + * We rely on the drm_gpu_scheduler framework to deal with job
> > + * dependencies and submission. As any other driver dealing with a
> > + * FW-scheduler, we use the 1:1 entity:scheduler mode, such that each
> > + * entity has its own job scheduler. When a job is ready to be executed
> > + * (all its dependencies are met), it is pushed to the appropriate
> > + * queue ring-buffer, and the group is scheduled for execution if it
> > + * wasn't already active.
> > + *
> > + * Kernel-side group scheduling is timeslice-based. When we have less
> > + * groups than there are slots, the periodic tick is disabled and we
> > + * just let the FW schedule the active groups. When there are more
> > + * groups than slots, we let each group a chance to execute stuff for
> > + * a given amount of time, and then re-evaluate and pick new groups
> > + * to schedule. The group selection algorithm is based on
> > + * priority+round-robin.
> > + *
> > + * Even though user-mode queues is out of the scope right now, the
> > + * current design takes them into account by avoiding any guess on the
> > + * group/queue state that would be based on information we wouldn't have
> > + * if userspace was in charge of the ring-buffer. That's also one of the
> > + * reason we don't do 'cooperative' scheduling (encoding FW group slot
> > + * reservation as dma_fence that would be returned from the
> > + * drm_gpu_scheduler::prepare_job() hook, and treating group rotation as
> > + * a queue of waiters, ordered by job submission order). This approach
> > + * would work for kernel-mode queues, but would make user-mode queues a
> > + * lot more complicated to retrofit.
> > + */
> > +
> > +#define JOB_TIMEOUT_MS				5000
> > +
> > +#define MIN_CS_PER_CSG				8
> > +
> > +#define MIN_CSGS				3
> > +#define MAX_CSG_PRIO				0xf
> > +
> > +struct panthor_group;
> > +
> > +/**
> > + * struct panthor_csg_slot - Command stream group slot
> > + *
> > + * This represents a FW slot for a scheduling group.
> > + */
> > +struct panthor_csg_slot {
> > +	/** @group: Scheduling group bound to this slot. */
> > +	struct panthor_group *group;
> > +
> > +	/** @priority: Group priority. */
> > +	u8 priority;
> > +
> > +	/**
> > +	 * @idle: True if the group bound to this slot is idle.
> > +	 *
> > +	 * A group is idle when it has nothing waiting for execution on
> > +	 * all its queues, or when queues are blocked waiting for something
> > +	 * to happen (synchronization object).
> > +	 */
> > +	bool idle;
> > +};
> > +
> > +/**
> > + * enum panthor_csg_priority - Group priority
> > + */
> > +enum panthor_csg_priority {
> > +	/** @PANTHOR_CSG_PRIORITY_LOW: Low priority group. */
> > +	PANTHOR_CSG_PRIORITY_LOW = 0,
> > +
> > +	/** @PANTHOR_CSG_PRIORITY_MEDIUM: Medium priority group. */
> > +	PANTHOR_CSG_PRIORITY_MEDIUM,
> > +
> > +	/** @PANTHOR_CSG_PRIORITY_HIGH: High priority group. */
> > +	PANTHOR_CSG_PRIORITY_HIGH,
> > +
> > +	/**
> > +	 * @PANTHOR_CSG_PRIORITY_RT: Real-time priority group.
> > +	 *
> > +	 * Real-time priority allows one to preempt scheduling of other
> > +	 * non-real-time groups. When such a group becomes executable,
> > +	 * it will evict the group with the lowest non-rt priority if
> > +	 * there's no free group slot available.
> > +	 *
> > +	 * Currently not exposed to userspace.
> > +	 */
> > +	PANTHOR_CSG_PRIORITY_RT,
> > +
> > +	/** @PANTHOR_CSG_PRIORITY_COUNT: Number of priority levels. */
> > +	PANTHOR_CSG_PRIORITY_COUNT,
> > +};
> > +
> > +/**
> > + * struct panthor_scheduler - Object used to manage the scheduler
> > + */
> > +struct panthor_scheduler {
> > +	/** @ptdev: Device. */
> > +	struct panthor_device *ptdev;
> > +
> > +	/**
> > +	 * @wq: Workqueue used by our internal scheduler logic.
> > +	 *
> > +	 * Used for the scheduler tick, group update or other kind of FW
> > +	 * event processing that can't be handled in the threaded interrupt
> > +	 * path.
> > +	 */
> > +	struct workqueue_struct *wq;
> > +
> > +	/**
> > +	 * @drm_sched_wq: Workqueue passed to the drm_gpu_scheduler.
> > +	 *
> > +	 * The driver doesn't use this queue, it's left entirely to the
> > +	 * drm_sched for job dequeuing/cleanup.
> > +	 */
> > +	struct workqueue_struct *drm_sched_wq;
> > +
> > +	/** @tick_work: Work executed on a scheduling tick. */
> > +	struct delayed_work tick_work;
> > +
> > +	/**
> > +	 * @sync_upd_work: Work used to process synchronization object updates.
> > +	 *
> > +	 * We use this work to unblock queues/groups that were waiting on a
> > +	 * synchronization object.
> > +	 */
> > +	struct work_struct sync_upd_work;
> > +
> > +	/**
> > +	 * @fw_events_work: Work used to process FW events outside the interrupt path.
> > +	 *
> > +	 * Even if the interrupt is threaded, we need any event processing
> > +	 * that require taking the panthor_scheduler::lock to be processed
> > +	 * outside the interrupt path so we don't block the tick logic when
> > +	 * it calls panthor_fw_{csg,wait}_wait_acks(). Since most of the
> > +	 * even processing require taking this lock, we just delegate all  
> 
>            ^^^^^^^^^^^^^^^^^^^^^^^
>            event processing requires

Oops, I'll fix that.

> 
> > +	 * FW event processing to the scheduler workqueue.
> > +	 */
> > +	struct work_struct fw_events_work;
> > +
> > +	/**
> > +	 * @fw_events: Bitmask encoding pending FW events.
> > +	 */
> > +	atomic_t fw_events;
> > +
> > +	/**
> > +	 * @resched_target: When the next tick should occur.
> > +	 *
> > +	 * Expressed in jiffies.
> > +	 */
> > +	u64 resched_target;
> > +
> > +	/**
> > +	 * @last_tick: When the last tick occurred.
> > +	 *
> > +	 * Expressed in jiffies.
> > +	 */
> > +	u64 last_tick;
> > +
> > +	/** @tick_period: Tick period in jiffies. */
> > +	u64 tick_period;
> > +
> > +	/**
> > +	 * @lock: Lock protecting access to all the scheduler fields.
> > +	 *
> > +	 * Should be taken in the tick work, the irq handler, and anywhere the @groups
> > +	 * fields are touched.
> > +	 */
> > +	struct mutex lock;
> > +
> > +	/** @groups: Various lists used to classify groups. */
> > +	struct {
> > +		/**
> > +		 * @runnable: Runnable group lists.
> > +		 *
> > +		 * When a group has queues that want to execute something,
> > +		 * its panthor_group::run_node should be inserted here.
> > +		 *
> > +		 * One list per-priority.
> > +		 */
> > +		struct list_head runnable[PANTHOR_CSG_PRIORITY_COUNT];
> > +
> > +		/**
> > +		 * @idle: Idle group lists.
> > +		 *
> > +		 * When all queues of a group are idle (either because they
> > +		 * have nothing to execute, or because they are blocked), the
> > +		 * panthor_group::run_node field should be inserted here.
> > +		 *
> > +		 * One list per-priority.
> > +		 */
> > +		struct list_head idle[PANTHOR_CSG_PRIORITY_COUNT];
> > +
> > +		/**
> > +		 * @waiting: List of groups whose queues are blocked on a
> > +		 * synchronization object.
> > +		 *
> > +		 * Insert panthor_group::wait_node here when a group is waiting
> > +		 * for synchronization objects to be signaled.
> > +		 *
> > +		 * This list is evaluated in the @sync_upd_work work.
> > +		 */
> > +		struct list_head waiting;
> > +	} groups;
> > +
> > +	/**
> > +	 * @csg_slots: FW command stream group slots.
> > +	 */
> > +	struct panthor_csg_slot csg_slots[MAX_CSGS];
> > +
> > +	/** @csg_slot_count: Number of command stream group slots exposed by the FW. */
> > +	u32 csg_slot_count;
> > +
> > +	/** @cs_slot_count: Number of command stream slot per group slot exposed by the FW. */
> > +	u32 cs_slot_count;
> > +
> > +	/** @as_slot_count: Number of address space slots supported by the MMU. */
> > +	u32 as_slot_count;
> > +
> > +	/** @used_csg_slot_count: Number of command stream group slot currently used. */
> > +	u32 used_csg_slot_count;
> > +
> > +	/** @sb_slot_count: Number of scoreboard slots. */
> > +	u32 sb_slot_count;
> > +
> > +	/**
> > +	 * @might_have_idle_groups: True if an active group might have become idle.
> > +	 *
> > +	 * This will force a tick, so other runnable groups can be scheduled if one
> > +	 * or more active groups became idle.
> > +	 */
> > +	bool might_have_idle_groups;
> > +
> > +	/** @pm: Power management related fields. */
> > +	struct {
> > +		/** @has_ref: True if the scheduler owns a runtime PM reference. */
> > +		bool has_ref;
> > +	} pm;
> > +
> > +	/** @reset: Reset related fields. */
> > +	struct {
> > +		/** @lock: Lock protecting the other reset fields. */
> > +		struct mutex lock;
> > +
> > +		/**
> > +		 * @in_progress: True if a reset is in progress.
> > +		 *
> > +		 * Set to true in panthor_sched_pre_reset() and back to false in
> > +		 * panthor_sched_post_reset().
> > +		 */
> > +		atomic_t in_progress;
> > +
> > +		/**
> > +		 * @stopped_groups: List containing all groups that were stopped
> > +		 * before a reset.
> > +		 *
> > +		 * Insert panthor_group::run_node in the pre_reset path.
> > +		 */
> > +		struct list_head stopped_groups;
> > +	} reset;
> > +};  
> 
> <snip>
> 
> > +
> > +static void process_fw_events_work(struct work_struct *work)
> > +{
> > +	struct panthor_scheduler *sched = container_of(work, struct panthor_scheduler,
> > +						      fw_events_work);
> > +	u32 events = atomic_fetch_and(0, &sched->fw_events);  
> 
> I think atomic_xchg() would be clearer here.

Definitely, I'll use atomic_xchg() here.

> 
> > +	struct panthor_device *ptdev = sched->ptdev;
> > +
> > +	mutex_lock(&sched->lock);
> > +
> > +	if (events & JOB_INT_GLOBAL_IF) {
> > +		sched_process_global_irq_locked(ptdev);
> > +		events &= ~JOB_INT_GLOBAL_IF;
> > +	}
> > +
> > +	while (events) {
> > +		u32 csg_id = ffs(events) - 1;
> > +		sched_process_csg_irq_locked(ptdev, csg_id);
> > +		events &= ~BIT(csg_id);
> > +	}
> > +
> > +	mutex_unlock(&sched->lock);
> > +}

next prev parent reply	other threads:[~2024-01-15 13:03 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-04 17:32 [PATCH v3 00/14] drm: Add a driver for CSF-based Mali GPUs Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 01/14] drm/panthor: Add uAPI Boris Brezillon
2023-12-06 16:17   ` Steven Price
2023-12-18 13:20   ` Chris Diamand
2024-01-15 11:18     ` Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 02/14] drm/panthor: Add GPU register definitions Boris Brezillon
2023-12-06 16:23   ` Steven Price
2023-12-04 17:32 ` [PATCH v3 03/14] drm/panthor: Add the device logical block Boris Brezillon
2023-12-06 16:55   ` Steven Price
2023-12-07  8:12     ` Boris Brezillon
2023-12-07  8:56       ` Boris Brezillon
2023-12-07 10:23         ` Steven Price
2023-12-07 10:49           ` Boris Brezillon
2023-12-07 11:11           ` [EXTERNAL] " Donald Robson
2023-12-22 13:26   ` Liviu Dudau
2023-12-22 14:04     ` Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 04/14] drm/panthor: Add the GPU " Boris Brezillon
2023-12-07 16:05   ` Steven Price
2023-12-04 17:32 ` [PATCH v3 05/14] drm/panthor: Add GEM " Boris Brezillon
2023-12-07 16:38   ` Steven Price
2024-01-15 10:29     ` Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 06/14] drm/panthor: Add the devfreq " Boris Brezillon
2023-12-05  9:42   ` Clément Péron
2023-12-04 17:33 ` [PATCH v3 07/14] drm/panthor: Add the MMU/VM " Boris Brezillon
2023-12-08 14:28   ` Steven Price
2024-01-15 11:04     ` Boris Brezillon
2024-01-15 17:31   ` Boris Brezillon
2024-01-15 17:38   ` Boris Brezillon
2024-01-15 17:41   ` Boris Brezillon
2024-01-15 18:09   ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 08/14] drm/panthor: Add the FW " Boris Brezillon
2023-12-08 15:39   ` Steven Price
2023-12-18 21:25   ` Chris Diamand
2024-01-15 11:37     ` Boris Brezillon
2024-01-22 16:34     ` Boris Brezillon
2024-01-22 21:14       ` Chris Diamand
2023-12-20 15:12   ` Liviu Dudau
2024-01-15 12:56     ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 09/14] drm/panthor: Add the heap " Boris Brezillon
2023-12-08 16:27   ` Steven Price
2024-01-15 11:15     ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 10/14] drm/panthor: Add the scheduler " Boris Brezillon
2023-12-11 16:27   ` Steven Price
2024-01-15 13:03     ` Boris Brezillon [this message]
2023-12-19 11:50   ` Ketil Johnsen
2024-01-15 13:05     ` Boris Brezillon
2023-12-20 19:59   ` Ketil Johnsen
2024-01-15 13:11     ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 11/14] drm/panthor: Add the driver frontend block Boris Brezillon
2023-12-13 11:47   ` Steven Price
2023-12-20 16:24   ` Liviu Dudau
2024-01-15 12:59     ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 12/14] drm/panthor: Allow driver compilation Boris Brezillon
2023-12-05  4:39   ` kernel test robot
2023-12-05  8:06     ` Boris Brezillon
2023-12-05 14:38   ` kernel test robot
2023-12-05 23:34   ` kernel test robot
2023-12-13 13:18   ` Steven Price
2023-12-04 17:33 ` [PATCH v3 13/14] dt-bindings: gpu: mali-valhall-csf: Add support for Arm Mali CSF GPUs Boris Brezillon
2023-12-04 17:33   ` Boris Brezillon
2023-12-04 19:29   ` Rob Herring
2023-12-04 19:29     ` Rob Herring
2023-12-05  8:46     ` Boris Brezillon
2023-12-05  8:46       ` Boris Brezillon
2023-12-05  6:24   ` kernel test robot
2023-12-05 20:48   ` Rob Herring
2023-12-05 20:48     ` Rob Herring
2023-12-06 10:59     ` Liviu Dudau
2023-12-06 10:59       ` Liviu Dudau
2024-01-22 16:37       ` Boris Brezillon
2024-01-22 16:37         ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 14/14] drm/panthor: Add an entry to MAINTAINERS Boris Brezillon
2023-12-13 13:51   ` Steven Price
2023-12-04 18:09 ` [PATCH v3 00/14] drm: Add a driver for CSF-based Mali GPUs Clément Péron
2023-12-05  8:04   ` Boris Brezillon
2023-12-05  8:48 ` Boris Brezillon
2023-12-06 15:47   ` Steven Price
2023-12-06 16:28     ` Boris Brezillon
2023-12-10  4:58 ` Tatsuyuki Ishi
2023-12-11  8:52   ` Boris Brezillon
2023-12-11 18:18     ` Faith Ekstrand
2024-01-15 14:18       ` Boris Brezillon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240115140351.687ae919@collabora.com \
    --to=boris.brezillon@collabora.com \
    --cc=Liviu.Dudau@arm.com \
    --cc=daniels@collabora.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=drinkcat@chromium.org \
    --cc=faith.ekstrand@collabora.com \
    --cc=hanetzer@startmail.com \
    --cc=kernel@collabora.com \
    --cc=neil.armstrong@linaro.org \
    --cc=peron.clem@gmail.com \
    --cc=robin.murphy@arm.com \
    --cc=steven.price@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.