Intel-XE Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: Danilo Krummrich <dakr@redhat.com>,
	Matthew Brost <matthew.brost@intel.com>
Cc: robdclark@chromium.org, sarah.walker@imgtec.com,
	ketil.johnsen@arm.com, lina@asahilina.net, Liviu.Dudau@arm.com,
	dri-devel@lists.freedesktop.org, luben.tuikov@amd.com,
	donald.robson@imgtec.com, boris.brezillon@collabora.com,
	intel-xe@lists.freedesktop.org, faith.ekstrand@collabora.com
Subject: Re: [Intel-xe] [PATCH v2 1/9] drm/sched: Convert drm scheduler to use a work queue rather than kthread
Date: Thu, 17 Aug 2023 07:33:46 +0200	[thread overview]
Message-ID: <0bf839df-db7f-41fa-8b34-59792d2ba8be@amd.com> (raw)
In-Reply-To: <5fdf7d59-3323-24b5-a35a-bd60b06b4ce5@redhat.com>

Am 16.08.23 um 18:33 schrieb Danilo Krummrich:
> On 8/16/23 16:59, Christian König wrote:
>> Am 16.08.23 um 14:30 schrieb Danilo Krummrich:
>>> On 8/16/23 16:05, Christian König wrote:
>>>> Am 16.08.23 um 13:30 schrieb Danilo Krummrich:
>>>>> Hi Matt,
>>>>>
>>>>> On 8/11/23 04:31, Matthew Brost wrote:
>>>>>> In XE, the new Intel GPU driver, a choice has been made to have a
>>>>>> 1-to-1 mapping between a drm_gpu_scheduler and a drm_sched_entity.
>>>>>> At first this seems a bit odd, but let us explain the reasoning
>>>>>> below.
>>>>>>
>>>>>> 1. In XE the submission order from multiple drm_sched_entity is not
>>>>>> guaranteed to match the completion order, even if targeting the
>>>>>> same hardware engine. This is because in XE we have a firmware
>>>>>> scheduler, the GuC, which is allowed to reorder, timeslice, and
>>>>>> preempt submissions. If using a shared drm_gpu_scheduler across
>>>>>> multiple drm_sched_entity, the TDR falls apart as the TDR expects
>>>>>> submission order == completion order. Using a dedicated
>>>>>> drm_gpu_scheduler per drm_sched_entity solves this problem.
>>>>>>
>>>>>> 2. In XE submissions are done via programming a ring buffer
>>>>>> (circular buffer), and a drm_gpu_scheduler provides a limit on the
>>>>>> number of in-flight jobs; if that job limit is set to
>>>>>> RING_SIZE / MAX_SIZE_PER_JOB we get flow control on the ring for
>>>>>> free.
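
As a quick illustration of the flow control described in point 2 (a
made-up sketch, not code from the Xe patch; RING_SIZE and
MAX_SIZE_PER_JOB are arbitrary example values):

	/* Hypothetical sizes, chosen only for illustration. */
	#define RING_SIZE		(16 * 1024)	/* ring buffer size in bytes */
	#define MAX_SIZE_PER_JOB	1024		/* worst-case ring space per job */

	static unsigned int hw_submission_limit(void)
	{
		/*
		 * At most this many worst-case jobs fit on the ring at
		 * once, so capping the scheduler at this count
		 * flow-controls the ring.
		 */
		return RING_SIZE / MAX_SIZE_PER_JOB;	/* 16 in this example */
	}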
>>>>>
>>>>> In XE, where does the limitation of MAX_SIZE_PER_JOB come from?
>>>>>
>>>>> In Nouveau we currently do have such a limitation as well, but it 
>>>>> is derived from the RING_SIZE, hence RING_SIZE / MAX_SIZE_PER_JOB 
>>>>> would always be 1. However, I think most jobs won't actually 
>>>>> utilize the whole ring.
>>>>
>>>> Well, that should probably rather be RING_SIZE / MAX_SIZE_PER_JOB = 
>>>> hw_submission_limit (or even hw_submission_limit - 1 when the hw 
>>>> can't distinguish between a full and an empty ring buffer).
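
As a side note on the "- 1" above, a small sketch of generic ring-buffer
logic (not drm_sched code) showing why one entry may have to stay unused
when head == tail is ambiguous:

	/*
	 * With the common convention that head == tail means "empty", a
	 * completely full ring would look identical to an empty one, so
	 * one slot is sacrificed and "full" is detected one entry early.
	 */
	static bool ring_full(unsigned int head, unsigned int tail,
			      unsigned int num_slots)
	{
		return ((tail + 1) % num_slots) == head;
	}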
>>>
>>> I'm not sure I get you right, so let me try to clarify: I wanted to 
>>> say that in Nouveau MAX_SIZE_PER_JOB isn't really limited by 
>>> anything other than the RING_SIZE, and hence we'd never allow more 
>>> than 1 active job.
>>
>> But that lets the hw run dry between submissions. That is usually a 
>> pretty horrible idea for performance.
>
> Correct, that's the reason why I said it seems to be more efficient to 
> base ring flow control on the actual size of each incoming job rather 
> than the maximum size of a job.
>
>>
>>>
>>> However, it seems to be more efficient to base ring flow control on 
>>> the actual size of each incoming job rather than the worst case, 
>>> namely the maximum size of a job.
>>
>> That doesn't sound like a good idea to me. See, we don't limit the 
>> number of submitted jobs based on the ring size, but rather we 
>> calculate the ring size based on the number of submitted jobs.
>>
>
> My point isn't really about whether we derive the ring size from the 
> job limit or the other way around. It's more about the job size (or 
> its maximum size) being arbitrary.
>
> As mentioned in my reply to Matt:
>
> "In Nouveau, userspace can submit an arbitrary amount of addresses of 
> indirect bufferes containing the ring instructions. The ring on the 
> kernel side takes the addresses of the indirect buffers rather than 
> the instructions themself. Hence, technically there isn't really a 
> limit on the amount of IBs submitted by a job except for the ring size."
>
> So, my point is that I don't really want to limit the job size 
> artificially just to be able to fit multiple jobs into the ring even 
> if they're submitted at their "artificial" maximum size, but rather 
> track how much of the ring the submitted job actually occupies.
>
>> In other words the hw_submission_limit defines the ring size, not the 
>> other way around. And you usually want the hw_submission_limit as low 
>> as possible for good scheduler granularity and to avoid extra overhead.
>
> I don't think you really mean "as low as possible", do you?

No, I do mean as low as possible or in other words as few as possible.

Ideally the scheduler would submit only the minimum amount of work to 
the hardware to keep the hardware busy.

The hardware seems to work mostly the same for all vendors, but you 
seem to think that filling the ring is somehow beneficial, which is 
really not the case as far as I can see.

Regards,
Christian.

> Because one really is the minimum if you want to do work at all, but 
> as you mentioned above a job limit of one can let the ring run dry.
>
> In the end my proposal comes down to tracking the actual size of a job 
> rather than just assuming a pre-defined maximum job size, and hence a 
> dynamic job limit.
>
> I don't think this would hurt the scheduler granularity. In fact, it 
> should serve the goal of not letting the ring run dry even better, 
> especially for sequences of small jobs, where the current 
> implementation might wrongly assume the ring is already full although 
> there would actually still be enough space left.
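>
> (A made-up worked example of that last point: with a 64 KiB ring and a
> worst-case job size of 16 KiB, a static limit of 64 / 16 = 4 stops
> submission after four jobs even if each of them only occupies 1 KiB and
> 60 KiB of the ring are still free; tracking the actual size would keep
> accepting jobs until the ring space is genuinely exhausted.)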
>
>>
>> Christian.
>>
>>>
>>>>
>>>> Otherwise your scheduler might just overwrite the ring buffer by 
>>>> pushing things too fast.
>>>>
>>>> Christian.
>>>>
>>>>>
>>>>> Given that, it seems like it would be better to let the scheduler 
>>>>> keep track of empty ring "slots" instead, such that the scheduler 
>>>>> can decide whether a subsequent job will still fit on the ring 
>>>>> and, if not, re-evaluate once a previous job has finished. Of 
>>>>> course each submitted job would be required to carry the number of 
>>>>> slots it requires on the ring.
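
A rough sketch of what that slot-based decision could look like (all
names are hypothetical; nothing like this exists in drm_sched today,
this is just the proposal's accounting idea):

	#include <linux/list.h>

	struct ring_job {
		struct list_head entry;
		unsigned int num_slots;		/* ring slots this job occupies */
	};

	struct ring_sched {
		struct list_head pending;	/* jobs not yet pushed to the ring */
		unsigned int free_slots;	/* empty slots left on the ring */
	};

	/* Push as many pending jobs as currently fit on the ring. */
	static void ring_sched_try_push(struct ring_sched *s)
	{
		struct ring_job *job, *tmp;

		list_for_each_entry_safe(job, tmp, &s->pending, entry) {
			if (job->num_slots > s->free_slots)
				break;	/* re-evaluated once a job completes */
			s->free_slots -= job->num_slots;
			list_del(&job->entry);
			/* ... hand the job to the hardware ring here ... */
		}
	}

	/* Called from the job's completion path. */
	static void ring_sched_job_done(struct ring_sched *s, struct ring_job *job)
	{
		s->free_slots += job->num_slots;
		ring_sched_try_push(s);		/* a deferred job may now fit */
	}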
>>>>>
>>>>> What do you think of implementing this as an alternative flow 
>>>>> control mechanism? Implementation-wise this could be a union with 
>>>>> the existing hw_submission_limit.
>>>>>
>>>>> - Danilo
>>>>>
>>>>>>
>>>>>> A problem with this design is that currently a drm_gpu_scheduler 
>>>>>> uses a kthread for submission / job cleanup. This doesn't scale 
>>>>>> if a large number of drm_gpu_schedulers are used. To work around 
>>>>>> the scaling issue, use a worker rather than a kthread for 
>>>>>> submission / job cleanup.
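
To make the kthread-to-worker direction concrete, a heavily simplified
sketch (hypothetical struct and function names, not the actual patch):
each scheduler owns a work item instead of a thread, so many schedulers
can share a single workqueue:

	#include <linux/workqueue.h>

	struct toy_sched {
		struct workqueue_struct	*submit_wq;	/* possibly shared */
		struct work_struct	work_run_job;
	};

	static void toy_sched_run_job_work(struct work_struct *w)
	{
		struct toy_sched *sched =
			container_of(w, struct toy_sched, work_run_job);

		/* pick the next ready job and push it to the hardware ... */
		(void)sched;
	}

	static void toy_sched_init(struct toy_sched *sched,
				   struct workqueue_struct *wq)
	{
		sched->submit_wq = wq;
		INIT_WORK(&sched->work_run_job, toy_sched_run_job_work);
	}

	/* Called when a job becomes ready, instead of waking a kthread. */
	static void toy_sched_wakeup(struct toy_sched *sched)
	{
		queue_work(sched->submit_wq, &sched->work_run_job);
	}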
>>>>>>
>>>>>> v2:
>>>>>>    - (Rob Clark) Fix msm build
>>>>>>    - Pass in run work queue
>>>>>> v3:
>>>>>>    - (Boris) don't have loop in worker
>>>>>> v4:
>>>>>>    - (Tvrtko) break out submit ready, stop, start helpers into 
>>>>>> own patch
>>>>>>
>>>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>>>
>>>>
>>>
>>
>



Thread overview: 83+ messages
2023-08-11  2:31 [Intel-xe] [PATCH v2 0/9] DRM scheduler changes for Xe Matthew Brost
2023-08-11  2:31 ` [Intel-xe] [PATCH v2 1/9] drm/sched: Convert drm scheduler to use a work queue rather than kthread Matthew Brost
2023-08-16 11:30   ` Danilo Krummrich
2023-08-16 14:05     ` Christian König
2023-08-16 12:30       ` Danilo Krummrich
2023-08-16 14:38         ` Matthew Brost
2023-08-16 15:40           ` Danilo Krummrich
2023-08-16 14:59         ` Christian König
2023-08-16 16:33           ` Danilo Krummrich
2023-08-17  5:33             ` Christian König [this message]
2023-08-17 11:13               ` Danilo Krummrich
2023-08-17 13:35                 ` Christian König
2023-08-17 12:48                   ` Danilo Krummrich
2023-08-17 16:17                     ` Christian König
2023-08-18 11:58                       ` Danilo Krummrich
2023-08-21 14:07                         ` Christian König
2023-08-21 18:01                           ` Danilo Krummrich
2023-08-21 18:12                             ` Christian König
2023-08-21 19:07                               ` Danilo Krummrich
2023-08-22  9:35                                 ` Christian König
2023-08-21 19:46                               ` Faith Ekstrand
2023-08-22  9:51                                 ` Christian König
2023-08-22 16:55                                   ` Faith Ekstrand
2023-08-24 11:50                                     ` Bas Nieuwenhuizen
2023-08-18  3:08                 ` Matthew Brost
2023-08-18  5:40                   ` Christian König
2023-08-18 12:49                     ` Matthew Brost
2023-08-18 12:06                       ` Danilo Krummrich
2023-09-12 14:28                 ` Boris Brezillon
2023-09-12 14:33                   ` Danilo Krummrich
2023-09-12 14:49                     ` Boris Brezillon
2023-09-12 15:13                       ` Boris Brezillon
2023-09-12 16:58                         ` Danilo Krummrich
2023-09-12 16:52                       ` Danilo Krummrich
2023-08-11  2:31 ` [Intel-xe] [PATCH v2 2/9] drm/sched: Move schedule policy to scheduler / entity Matthew Brost
2023-08-11 21:43   ` Maira Canal
2023-08-12  3:20     ` Matthew Brost
2023-08-11  2:31 ` [Intel-xe] [PATCH v2 3/9] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy Matthew Brost
2023-08-29 17:37   ` Danilo Krummrich
2023-09-05 11:10     ` Danilo Krummrich
2023-09-11 19:44       ` Matthew Brost
2023-08-11  2:31 ` [Intel-xe] [PATCH v2 4/9] drm/sched: Split free_job into own work item Matthew Brost
2023-08-17 13:39   ` Christian König
2023-08-17 17:54     ` Matthew Brost
2023-08-18  5:27       ` Christian König
2023-08-18 13:13         ` Matthew Brost
2023-08-21 13:17           ` Christian König
2023-08-23  3:27             ` Matthew Brost
2023-08-23  7:10               ` Christian König
2023-08-23 15:24                 ` Matthew Brost
2023-08-23 15:41                   ` Alex Deucher
2023-08-23 17:26                     ` Rodrigo Vivi
2023-08-23 23:12                       ` Matthew Brost
2023-08-24 11:44                         ` Christian König
2023-08-24 14:30                           ` Matthew Brost
2023-08-24 23:04   ` Danilo Krummrich
2023-08-25  2:58     ` Matthew Brost
2023-08-25  8:02       ` Christian König
2023-08-25 13:36         ` Matthew Brost
2023-08-25 13:45           ` Christian König
2023-09-12 10:13             ` Boris Brezillon
2023-09-12 10:46               ` Danilo Krummrich
2023-09-12 12:18                 ` Boris Brezillon
2023-09-12 12:56                   ` Danilo Krummrich
2023-09-12 13:52                     ` Boris Brezillon
2023-09-12 14:10                       ` Danilo Krummrich
2023-09-12 13:27             ` Boris Brezillon
2023-09-12 13:34               ` Danilo Krummrich
2023-09-12 13:53                 ` Boris Brezillon
2023-08-28 18:04   ` Danilo Krummrich
2023-08-28 18:41     ` Matthew Brost
2023-08-29  1:20       ` Danilo Krummrich
2023-08-11  2:31 ` [Intel-xe] [PATCH v2 5/9] drm/sched: Add generic scheduler message interface Matthew Brost
2023-08-11  2:31 ` [Intel-xe] [PATCH v2 6/9] drm/sched: Add drm_sched_start_timeout_unlocked helper Matthew Brost
2023-08-11  2:31 ` [Intel-xe] [PATCH v2 7/9] drm/sched: Start run wq before TDR in drm_sched_start Matthew Brost
2023-08-11  2:31 ` [Intel-xe] [PATCH v2 8/9] drm/sched: Submit job before starting TDR Matthew Brost
2023-08-11  2:31 ` [Intel-xe] [PATCH v2 9/9] drm/sched: Add helper to set TDR timeout Matthew Brost
2023-08-11  2:34 ` [Intel-xe] ✗ CI.Patch_applied: failure for DRM scheduler changes for Xe (rev2) Patchwork
2023-08-24  0:08 ` [Intel-xe] [PATCH v2 0/9] DRM scheduler changes for Xe Danilo Krummrich
2023-08-24  3:23   ` Matthew Brost
2023-08-24 14:51     ` Danilo Krummrich
2023-08-25  3:01 ` [Intel-xe] ✗ CI.Patch_applied: failure for DRM scheduler changes for Xe (rev3) Patchwork
2023-09-05 11:13 ` [Intel-xe] ✗ CI.Patch_applied: failure for DRM scheduler changes for Xe (rev4) Patchwork
