From: "Christian König" <christian.koenig@amd.com>
To: Faith Ekstrand <faith@gfxstrand.net>
Cc: robdclark@chromium.org, sarah.walker@imgtec.com,
ketil.johnsen@arm.com, lina@asahilina.net, Liviu.Dudau@arm.com,
dri-devel@lists.freedesktop.org, luben.tuikov@amd.com,
Danilo Krummrich <dakr@redhat.com>,
donald.robson@imgtec.com, boris.brezillon@collabora.com,
intel-xe@lists.freedesktop.org, faith.ekstrand@collabora.com
Subject: Re: [Intel-xe] [PATCH v2 1/9] drm/sched: Convert drm scheduler to use a work queue rather than kthread
Date: Tue, 22 Aug 2023 11:51:13 +0200
Message-ID: <2498b1a3-6597-c112-82cd-58b44ca188f0@amd.com>
In-Reply-To: <CAOFGe94JC8V2GS5L2iCaD9=X-sbbcvrvijck8ivieko=LGBSbg@mail.gmail.com>
On 21.08.23 at 21:46, Faith Ekstrand wrote:
> On Mon, Aug 21, 2023 at 1:13 PM Christian König
> <christian.koenig@amd.com> wrote:
>
> [SNIP]
> > So as long as nobody from userspace comes and says we absolutely
> > need to optimize this use case I would rather not do it.
>
>
> This is a place where nouveau's needs are legitimately different from
> AMD or Intel, I think. NVIDIA's command streamer model is very
> different from AMD and Intel. On AMD and Intel, each EXEC turns into
> a single small packet (on the order of 16B) which kicks off a command
> buffer. There may be a bit of cache management or something around it
> but that's it. From there, it's userspace's job to make one command
> buffer chain to another until it's finally done and then do a
> "return", whatever that looks like.
>
> NVIDIA's model is much more static. Each packet in the HW/FW ring is
> an address and a size; that much data is processed, and then the
> next packet is fetched and processed. The result is that, if we use
> multiple buffers of commands, there's no way to chain them together.
> We just have to pass the whole list of buffers to the kernel.
So far that is actually completely identical to what AMD has.
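(Just to make the comparison concrete: conceptually each entry in the
ring on both sides boils down to something like the struct below.
Purely illustrative, the field names and widths are made up and don't
match the real GPFIFO or IB descriptor encodings.)

    /* Illustrative only -- not the actual NVIDIA GPFIFO or AMD IB
     * descriptor layout. Each ring entry just points the engine at
     * a buffer of commands to fetch and execute. */
    struct ring_entry {
        u64 gpu_addr;   /* GPU VA of the command buffer */
        u32 size;       /* length of the buffer in dwords */
    };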
> A single EXEC ioctl / job may have 500 such addr+size packets
> depending on how big the command buffer is.
And that is what I don't understand. Why would you need hundreds of
such addr+size packets?
This is basically identical to what AMD has (well, on newer hw there is
an extension in the CP packets to JUMP/CALL subsequent IBs, but as far
as I know this isn't widely used).
Previously the limit was something like 4, which we extended because
Bas came up with similar requirements for the AMD side from RADV.
But essentially those approaches with hundreds of IBs don't sound like
a good idea to me.
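(To illustrate what such a JUMP/CALL extension buys you: the tail of
each IB can point at the next one, so userspace can grow the chain
without another kernel submission. Rough sketch only, the packet
encoding below is invented and the emit-side details are omitted.)

    /* Conceptual sketch of IB chaining -- not real PM4 encoding. */
    struct ib {
        u64 gpu_addr;   /* GPU VA of this buffer */
        u32 *cpu_ptr;   /* CPU mapping, for patching the tail */
        u32 len;        /* used dwords; tail slots reserved */
    };

    static void ib_chain_to(struct ib *cur, const struct ib *next)
    {
        u32 *tail = cur->cpu_ptr + cur->len;

        tail[0] = 0xC0DE;  /* hypothetical JUMP opcode */
        tail[1] = lower_32_bits(next->gpu_addr);
        tail[2] = upper_32_bits(next->gpu_addr);
    }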
> It gets worse on pre-Turing hardware where we have to split the batch
> for every single DrawIndirect or DispatchIndirect.
>
> Lest you think NVIDIA is just crazy here, it's a perfectly reasonable
> model if you assume that userspace is feeding the firmware. When
> that's happening, you just have a userspace thread that sits there and
> feeds the ringbuffer with whatever is next and you can marshal as much
> data through as you want. Sure, it'd be nice to have a 2nd level batch
> thing that gets launched from the FW ring and has all the individual
> launch commands but it's not at all necessary.
>
> What does that mean from a gpu_scheduler PoV? Basically, it means a
> variable packet size.
>
> What does this mean for implementation? IDK. One option would be to
> teach the scheduler about actual job sizes. Another would be to
> virtualize it and have another layer underneath the scheduler that
> does the actual feeding of the ring. Another would be to decrease the
> job size somewhat and then have the front-end submit as many jobs as
> it needs to service userspace and only put the out-fences on the last
> job. All the options kinda suck.
Yeah, agree. The job-size approach Danilo suggested is still the least painful.
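(Roughly, the idea would be that each job declares how much ring
space it consumes and the scheduler only picks a job when that much
is free. A minimal sketch, with all names invented for illustration
and credit_count/credit_limit assumed to be new bookkeeping fields,
not an actual interface proposal:)

    /* Hypothetical: jobs carry a "size" in ring units and the
     * scheduler gates on free space instead of a fixed
     * hw_submission count. */
    struct sized_job {
        struct drm_sched_job base;
        u32 credits;    /* ring entries this job needs */
    };

    static bool sched_can_run(struct drm_gpu_scheduler *sched,
                              const struct sized_job *job)
    {
        return atomic_read(&sched->credit_count) + job->credits <=
               sched->credit_limit;
    }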
Christian.
>
> ~Faith