Re: [Intel-gfx] [RFC PATCH 04/20] drm/sched: Convert drm scheduler to use a work queue rather than kthread

Intel-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed

From: Boris Brezillon <boris.brezillon@collabora.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [RFC PATCH 04/20] drm/sched: Convert drm scheduler to use a work queue rather than kthread
Date: Mon, 2 Jan 2023 08:30:19 +0100	[thread overview]
Message-ID: <20230102083019.24b99647@collabora.com> (raw)
In-Reply-To: <20221230125508.57af8a14@collabora.com>

On Fri, 30 Dec 2022 12:55:08 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> On Fri, 30 Dec 2022 11:20:42 +0100
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
> > Hello Matthew,
> > 
> > On Thu, 22 Dec 2022 14:21:11 -0800
> > Matthew Brost <matthew.brost@intel.com> wrote:
> >   
> > > In XE, the new Intel GPU driver, a choice has made to have a 1 to 1
> > > mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
> > > seems a bit odd but let us explain the reasoning below.
> > > 
> > > 1. In XE the submission order from multiple drm_sched_entity is not
> > > guaranteed to be the same completion even if targeting the same hardware
> > > engine. This is because in XE we have a firmware scheduler, the GuC,
> > > which allowed to reorder, timeslice, and preempt submissions. If a using
> > > shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
> > > apart as the TDR expects submission order == completion order. Using a
> > > dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.    
> > 
> > Oh, that's interesting. I've been trying to solve the same sort of
> > issues to support Arm's new Mali GPU which is relying on a FW-assisted
> > scheduling scheme (you give the FW N streams to execute, and it does
> > the scheduling between those N command streams, the kernel driver
> > does timeslice scheduling to update the command streams passed to the
> > FW). I must admit I gave up on using drm_sched at some point, mostly
> > because the integration with drm_sched was painful, but also because I
> > felt trying to bend drm_sched to make it interact with a
> > timeslice-oriented scheduling model wasn't really future proof. Giving
> > drm_sched_entity exlusive access to a drm_gpu_scheduler probably might
> > help for a few things (didn't think it through yet), but I feel it's
> > coming short on other aspects we have to deal with on Arm GPUs.  
> 
> Ok, so I just had a quick look at the Xe driver and how it
> instantiates the drm_sched_entity and drm_gpu_scheduler, and I think I
> have a better understanding of how you get away with using drm_sched
> while still controlling how scheduling is really done. Here
> drm_gpu_scheduler is just a dummy abstract that let's you use the
> drm_sched job queuing/dep/tracking mechanism. The whole run-queue
> selection is dumb because there's only one entity ever bound to the
> scheduler (the one that's part of the xe_guc_engine object which also
> contains the drm_gpu_scheduler instance). I guess the main issue we'd
> have on Arm is the fact that the stream doesn't necessarily get
> scheduled when ->run_job() is called, it can be placed in the runnable
> queue and be picked later by the kernel-side scheduler when a FW slot
> gets released. That can probably be sorted out by manually disabling the
> job timer and re-enabling it when the stream gets picked by the
> scheduler. But my main concern remains, we're basically abusing
> drm_sched here.
> 
> For the Arm driver, that means turning the following sequence
> 
> 1. wait for job deps
> 2. queue job to ringbuf and push the stream to the runnable
>    queue (if it wasn't queued already). Wakeup the timeslice scheduler
>    to re-evaluate (if the stream is not on a FW slot already)
> 3. stream gets picked by the timeslice scheduler and sent to the FW for
>    execution
> 
> into
> 
> 1. queue job to entity which takes care of waiting for job deps for
>    us
> 2. schedule a drm_sched_main iteration
> 3. the only available entity is picked, and the first job from this
>    entity is dequeued. ->run_job() is called: the job is queued to the
>    ringbuf and the stream is pushed to the runnable queue (if it wasn't
>    queued already). Wakeup the timeslice scheduler to re-evaluate (if
>    the stream is not on a FW slot already)
> 4. stream gets picked by the timeslice scheduler and sent to the FW for
>    execution
> 
> That's one extra step we don't really need. To sum-up, yes, all the
> job/entity tracking might be interesting to share/re-use, but I wonder
> if we couldn't have that without pulling out the scheduling part of
> drm_sched, or maybe I'm missing something, and there's something in
> drm_gpu_scheduler you really need.

On second thought, that's probably an acceptable overhead (not even
sure the extra step I was mentioning exists in practice, because dep
fence signaled state is checked as part of the drm_sched_main
iteration, so that's basically replacing the worker I schedule to
check job deps), and I like the idea of being able to re-use drm_sched
dep-tracking without resorting to invasive changes to the existing
logic, so I'll probably give it a try.

next prev parent reply	other threads:[~2023-01-02  7:30 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-22 22:21 [Intel-gfx] [RFC PATCH 00/20] Initial Xe driver submission Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 01/20] drm/suballoc: Introduce a generic suballocation manager Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 02/20] drm/amd: Convert amdgpu to use suballocation helper Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 03/20] drm/radeon: Use the drm suballocation manager implementation Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 04/20] drm/sched: Convert drm scheduler to use a work queue rather than kthread Matthew Brost
2022-12-23 17:42   ` Rob Clark
2022-12-28 22:21     ` Matthew Brost
2022-12-30 10:20   ` Boris Brezillon
2022-12-30 11:55     ` Boris Brezillon
2023-01-02  7:30       ` Boris Brezillon [this message]
2023-01-03 13:02         ` Tvrtko Ursulin
2023-01-03 14:21           ` Boris Brezillon
2023-01-05 21:43           ` Matthew Brost
2023-01-06 23:52             ` Matthew Brost
2023-01-09 13:46               ` Tvrtko Ursulin
2023-01-09 17:27                 ` Jason Ekstrand
2023-01-10 11:28                   ` Tvrtko Ursulin
2023-01-10 12:19                     ` Tvrtko Ursulin
2023-01-10 15:55                       ` Matthew Brost
2023-01-10 16:50                         ` Tvrtko Ursulin
2023-01-10 19:01                           ` Matthew Brost
2023-01-11  9:17                             ` Tvrtko Ursulin
2023-01-11 18:07                               ` Matthew Brost
2023-01-11 18:52                                 ` John Harrison
2023-01-11 18:55                                   ` Matthew Brost
2023-01-10 14:08                     ` Jason Ekstrand
2023-01-11  8:50                       ` Tvrtko Ursulin
2023-01-11 19:40                         ` Matthew Brost
2023-01-12 18:43                           ` Tvrtko Ursulin
2023-01-11 22:18                         ` Jason Ekstrand
2023-01-11 22:31                           ` Matthew Brost
2023-01-11 22:56                             ` Jason Ekstrand
2023-01-13  0:39                               ` John Harrison
2023-01-18  3:06                                 ` Matthew Brost
2023-01-10 16:39                     ` Matthew Brost
2023-01-11  1:13                       ` Matthew Brost
2023-01-11  9:09                         ` Tvrtko Ursulin
2023-01-11 17:52                           ` Matthew Brost
2023-01-12 18:21                             ` Tvrtko Ursulin
2023-01-05 19:40         ` Matthew Brost
2023-01-09 15:45           ` Jason Ekstrand
2023-01-09 17:17             ` Boris Brezillon
2023-01-09 20:40               ` Daniel Vetter
2023-01-10  8:46                 ` Boris Brezillon
2023-01-11 21:47                   ` Daniel Vetter
2023-01-12  9:10                     ` Boris Brezillon
2023-01-12  9:32                       ` Daniel Vetter
2023-01-12 10:11                         ` Boris Brezillon
2023-01-12 10:25                           ` Boris Brezillon
2023-01-12 10:42                             ` Daniel Vetter
2023-01-12 12:08                               ` Boris Brezillon
2023-01-12 15:38                                 ` Daniel Vetter
2023-01-12 16:48                                   ` Boris Brezillon
2023-01-12 10:30                           ` Boris Brezillon
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 05/20] drm/sched: Add generic scheduler message interface Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 06/20] drm/sched: Start run wq before TDR in drm_sched_start Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 07/20] drm/sched: Submit job before starting TDR Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 08/20] drm/sched: Add helper to set TDR timeout Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 09/20] drm: Add a gpu page-table walker helper Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 10/20] drm/ttm: Don't print error message if eviction was interrupted Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 11/20] drm/i915: Remove gem and overlay frontbuffer tracking Matthew Brost
2022-12-23 11:13   ` Tvrtko Ursulin
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 12/20] drm/i915/display: Neuter frontbuffer tracking harder Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 13/20] drm/i915/display: Add more macros to remove all direct calls to uncore Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 14/20] drm/i915/display: Remove all uncore mmio accesses in favor of intel_de Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 15/20] drm/i915: Rename find_section to find_bdb_section Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 16/20] drm/i915/regs: Set DISPLAY_MMIO_BASE to 0 for xe Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 17/20] drm/i915/display: Fix a use-after-free when intel_edp_init_connector fails Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 18/20] drm/i915/display: Remaining changes to make xe compile Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 19/20] sound/hda: Allow XE as i915 replacement for sound Matthew Brost
2022-12-22 22:21 ` [Intel-gfx] [RFC PATCH 20/20] mei/hdcp: Also enable for XE Matthew Brost
2022-12-22 22:41 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Initial Xe driver submission Patchwork
2023-01-02  8:14 ` [Intel-gfx] [RFC PATCH 00/20] " Thomas Zimmermann
2023-01-02 11:42   ` Jani Nikula
2023-01-03 13:56     ` Boris Brezillon
2023-01-03 14:41       ` Alyssa Rosenzweig
2023-01-03 12:21 ` Tvrtko Ursulin
2023-01-05 21:27   ` Matthew Brost
2023-01-12  9:54     ` Lucas De Marchi
2023-01-12 17:10       ` Matthew Brost
2023-01-17 16:40         ` Jason Ekstrand
2023-01-10 12:33 ` Boris Brezillon
2023-01-17 16:12 ` Jason Ekstrand
2023-02-17 20:51 ` Daniel Vetter
2023-02-27 12:46   ` Oded Gabbay
2023-03-01 23:00   ` Rodrigo Vivi
2023-03-09 15:10     ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230102083019.24b99647@collabora.com \
    --to=boris.brezillon@collabora.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=matthew.brost@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox