Re: [Intel-xe] [PATCH 1/8] drm/sched: Convert drm scheduler to use a work queue rather than kthread

Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed

From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: "Christian König" <christian.koenig@amd.com>,
	"Matthew Brost" <matthew.brost@intel.com>
Cc: robdclark@chromium.org, sarah.walker@imgtec.com,
	ketil.johnsen@arm.com, lina@asahilina.net, Liviu.Dudau@arm.com,
	dri-devel@lists.freedesktop.org, luben.tuikov@amd.com,
	donald.robson@imgtec.com, boris.brezillon@collabora.com,
	intel-xe@lists.freedesktop.org, faith.ekstrand@collabora.com
Subject: Re: [Intel-xe] [PATCH 1/8] drm/sched: Convert drm scheduler to use a work queue rather than kthread
Date: Thu, 3 Aug 2023 16:19:22 +0100	[thread overview]
Message-ID: <6ce4ae3b-0633-85d4-dc56-091166b24e8d@linux.intel.com> (raw)
In-Reply-To: <42afed03-ab5d-dad7-52ca-a07f238593a2@amd.com>


On 03/08/2023 15:56, Christian König wrote:
> Am 03.08.23 um 16:43 schrieb Matthew Brost:
>> On Thu, Aug 03, 2023 at 11:11:13AM +0100, Tvrtko Ursulin wrote:
>>> On 01/08/2023 21:50, Matthew Brost wrote:
>>> [SNIP]
>>>>        sched->ops = ops;
>>>>        sched->hw_submission_limit = hw_submission;
>>>>        sched->name = name;
>>>> +    sched->run_wq = run_wq ? : system_wq;
>>> I still think it is not nice to implicitly move everyone over to the 
>>> shared
>>> system wq. Maybe even more so with now one at a time execution, since 
>>> effect
>>> on latency can be even greater.
>>>
>> No one that has a stake in this has pushed back that I can recall. Open
>> to feedback stakeholders (maintainers of drivers that use the drm
>> scheduler).
 >
> No objections to using the system_wq here. Drivers can still pass in 
> their own or simply use system_highpri_wq instead.
> 
> Additional to that the system_wq isn't single threaded, it will create 
> as much threads as needed to fully utilize all CPUs.
> 
>>   The i915 doesn't use the DRM scheduler last time I looked.
>> Has that changed?
>>> Have you considered kthread_work as a backend? Maybe it would work to 
>>> have
>>> callers pass in a kthread_worker they create, or provide a drm_sched 
>>> helper
>>> to create one, which would then be passed to drm_sched_init.
>>>
>>> That would enable per driver kthread_worker, or per device, or whatever
>>> granularity each driver would want/need/desire.
>>>
>>> driver init:
>>> struct drm_sched_worker = drm_sched_create_worker(...);
>>>
>>> queue/whatever init:
>>> drm_sched_init(.., worker, ...);
>>>
>> This idea doesn't seem to work for varitey of reasons. Will type it out
>> if needed but not going to spend time on this unless someone with a
>> stake raises this as an issue.
> 
> Agree completely. kthread_work is for real time workers IIRC.

AFAIK it is indicated if one needs to tweak the kthread priority, but 
that is not the only use case.

I am curious to know why the idea does not work for variety of reasons.

>>> You could create one inside drm_sched_init if not passed in, which would
>>> keep the behaviour for existing drivers more similar - they would 
>>> still have
>>> a 1:1 kthread context for their exclusive use.
>>>
>> Part of the idea of a work queue is so a user can't directly create a
>> kthread via an IOCTL (XE_EXEC_QUEUE_CREATE). What you suggesting exposes
>> this issue.
> 
> Yeah, prevent that is indeed a very good idea.

Nope, I wasn't suggesting that at all.

I was suggesting as many kthread_workers (these are threads) as the 
implementation wants. Xe can create one per device. Someone else can 
create one per hw engine, whatever.

One kthread_*work* per entity does not mean one thread per 
XE_EXEC_QUEUE_CREATE. Kthread_work is just a unit of work executed by 
the kthread_worker thread. Same in that conceptual relationship as 
workqueue and workitem.

Difference is it may work better for single-shot re-arming design if 
regression in submission latency concerns any stakeholders.

>>> And I *think* self-re-arming would be less problematic latency wise 
>>> since
>>> kthread_worker consumes everything queued without relinquishing 
>>> control and
>>> execution context would be guaranteed not to be shared with random 
>>> system
>>> stuff.
>>>
>> So this is essentially so we can use a loop? Seems like a lot effort for
>> what is pure speculation. Again if a stakeholder raises an issue we can
>> address then.
> 
> Instead of a loop what you usually do in the worker is to submit one 
> item (if possible) and then re-queue yourself if there is more work to do.
> 
> This way you give others chance to run as well and/or cancel the work 
> etc...

Yeah I was pointing out loop in the worker was bad months ago (or more) 
so it is not about that. Here my point is whether it can be done better 
than silently convert everyone to system_wq.

Hence my proposal is to *keep* closer to the thread semantics for 
everyone and at the same time _allow_ the option of custom 
workqueue/whatever.

Where is the problem there?

Regards,

Tvrtko

next prev parent reply	other threads:[~2023-08-03 15:19 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-01 20:50 [Intel-xe] [PATCH 0/8] DRM scheduler changes for Xe Matthew Brost
2023-08-01 20:50 ` [Intel-xe] [PATCH 1/8] drm/sched: Convert drm scheduler to use a work queue rather than kthread Matthew Brost
2023-08-03 10:11   ` Tvrtko Ursulin
2023-08-03 14:43     ` Matthew Brost
2023-08-03 14:56       ` Christian König
2023-08-03 15:19         ` Tvrtko Ursulin [this message]
2023-08-03 15:39       ` Tvrtko Ursulin
2023-08-01 20:50 ` [Intel-xe] [PATCH 2/8] drm/sched: Move schedule policy to scheduler / entity Matthew Brost
2023-08-01 20:50 ` [Intel-xe] [PATCH 3/8] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy Matthew Brost
2023-08-03  8:50   ` Christian König
2023-08-01 20:50 ` [Intel-xe] [PATCH 4/8] drm/sched: Add generic scheduler message interface Matthew Brost
2023-08-03  8:53   ` Christian König
2023-08-03  8:58     ` Daniel Vetter
2023-08-03  9:35       ` Christian König
2023-08-04  8:50         ` Daniel Vetter
2023-08-04 14:13           ` Matthew Brost
2023-08-07 15:46             ` Christian König
2023-08-08 14:06               ` Matthew Brost
2023-08-08 14:14                 ` Christian König
2023-08-09 14:36                   ` Matthew Brost
2023-08-01 20:51 ` [Intel-xe] [PATCH 5/8] drm/sched: Add drm_sched_start_timeout_unlocked helper Matthew Brost
2023-08-01 20:51 ` [Intel-xe] [PATCH 6/8] drm/sched: Start run wq before TDR in drm_sched_start Matthew Brost
2023-08-01 20:51 ` [Intel-xe] [PATCH 7/8] drm/sched: Submit job before starting TDR Matthew Brost
2023-08-01 20:51 ` [Intel-xe] [PATCH 8/8] drm/sched: Add helper to set TDR timeout Matthew Brost
2023-08-01 20:53 ` [Intel-xe] ✗ CI.Patch_applied: failure for DRM scheduler changes for Xe Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6ce4ae3b-0633-85d4-dc56-091166b24e8d@linux.intel.com \
    --to=tvrtko.ursulin@linux.intel.com \
    --cc=Liviu.Dudau@arm.com \
    --cc=boris.brezillon@collabora.com \
    --cc=christian.koenig@amd.com \
    --cc=donald.robson@imgtec.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=faith.ekstrand@collabora.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=ketil.johnsen@arm.com \
    --cc=lina@asahilina.net \
    --cc=luben.tuikov@amd.com \
    --cc=matthew.brost@intel.com \
    --cc=robdclark@chromium.org \
    --cc=sarah.walker@imgtec.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox