Re: [RFC PATCH] drm/sched: Add new KUnit test suite for concurrent job submission

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Marco Pagani <marco.pagani@linux.dev>
To: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: airlied@gmail.com, ckoenig.leichtzumerken@gmail.com,
	dakr@kernel.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, maarten.lankhorst@linux.intel.com,
	marpagan@redhat.com, matthew.brost@intel.com, mripard@kernel.org,
	phasta@kernel.org, simona@ffwll.ch, tzimmermann@suse.de
Subject: Re: [RFC PATCH] drm/sched: Add new KUnit test suite for concurrent job submission
Date: Fri, 6 Feb 2026 11:36:45 +0100	[thread overview]
Message-ID: <dde160ab-57bd-4727-be7f-d035cd9e7f19@linux.dev> (raw)
In-Reply-To: <07c8c588-c2d6-463d-aabc-d472b8f38be8@ursulin.net>



On 05/02/2026 10:53, Tvrtko Ursulin wrote:
> 
> On 04/02/2026 16:33, Marco Pagani wrote:
> 
> 8><
> 
>>>>>> +	{
>>>>>> +		.description = "Concurrently submit multiple jobs in a single entity",
>>>>>> +		.job_base_us = 1000,
>>>>>> +		.num_jobs = 10,
>>>>>> +		.num_subs = 64,
>>>>>> +	},
>>>>>> +};
>>>>>> +
>>>>>> +static void
>>>>>> +drm_sched_concurrent_desc(const struct drm_sched_concurrent_params *params, char *desc)
>>>>>> +{
>>>>>> +	strscpy(desc, params->description, KUNIT_PARAM_DESC_SIZE);
>>>>>> +}
>>>>>> +
>>>>>> +KUNIT_ARRAY_PARAM(drm_sched_concurrent, drm_sched_concurrent_cases, drm_sched_concurrent_desc);
>>>>>> +
>>>>>> +struct submitter_data {
>>>>>> +	struct work_struct work;
>>>>>> +	struct sched_concurrent_test_context *ctx;
>>>>>> +	struct drm_mock_sched_entity *entity;
>>>>>> +	struct drm_mock_sched_job **jobs;
>>>>>> +	struct kunit *test;
>>>>>> +	unsigned int id;
>>>>>> +	bool timedout;
>>>>>> +};
>>>>>> +
>>>>>> +static void drm_sched_submitter_worker(struct work_struct *work)
>>>>>> +{
>>>>>> +	const struct drm_sched_concurrent_params *params;
>>>>>> +	struct sched_concurrent_test_context *ctx;
>>>>>> +	struct submitter_data *sub_data;
>>>>>> +	unsigned int i, duration_us;
>>>>>> +	unsigned long timeout_jiffies;
>>>>>> +	bool done;
>>>>>> +
>>>>>> +	sub_data = container_of(work, struct submitter_data, work);
>>>>>> +	ctx = sub_data->ctx;
>>>>>> +	params = sub_data->test->param_value;
>>>>>> +
>>>>>> +	wait_for_completion(&ctx->wait_go);
>>>>>> +
>>>>>> +	for (i = 0; i < params->num_jobs; i++) {
>>>>>> +		duration_us = params->job_base_us + (sub_data->id * 10);
>>>>>
>>>>> Why is job duration dependent by the submitter id?
>>>>
>>>> Just a simple way to have a deterministic distribution of durations.
>>>> I can change it if it doesn't fit.
>>>>
>>>>> Would it be feasiable to not use auto-completing jobs and instead
>>>>> advance the timeline manually? Given how the premise of the test seems
>>>>> to be about concurrent submission it sounds plausible that what happens
>>>>> after submission maybe isn't very relevant.
>>>>
>>>> Good idea! I'll run some experiments and see if it works.
>>>
>>> Cool, I will await your findings in v2. :)
>>
>>
>> After fiddling a bit with the code, I came to the conclusion that
>> changing the design to use manual timeline advancement is doable, but
>> not beneficial, since it would require serializing job submission into
>> "batches" using a two-step process, i.e., (i) workers submit jobs, and
>> (ii) the main thread waits for all submissions, advances the timeline,
>> and then releases the workers for the next iteration.
> 
> What do you mean by next iteration?
> 
> In the patch you have each worker submit all jobs in one go.

I mean, if I change the code to use manual timeline advancement, then I
must introduce some synchronization logic that makes the main thread
advance the timeline only after the workers have submitted their jobs.
Since workers submit multiple jobs, I was thinking it would be better to
have the workers submit jobs in batches instead of all in one go.

>> This approach would partially defeat the purpose of a concurrency test
>> as it would not allow job submission (KUnit process context) to overlap
>> with job completion (hrtimer callback interrupt context) that models
>> asynchronous hardware in the mock scheduler, limiting contention on the
>> DRM scheduler internal locking. Moreover, while manual advancement might
>> appear to make the test deterministic, it does not since the order in
>> which the workers are scheduled will still be non-deterministic.
> 
> Ah, so it depends what is the test actually wanting to test. In my view 
> exercising parallel submit is one thing, while interleaving submission 
> with completion is something else.
> 
> In the test as written I think there is very little of the latter. Each 
> worker submits all their jobs in one tight loop. Jobs you set to be 1ms 
> so first job completion is 1ms away from when workers are released. A 
> lot of the jobs can be submitted in 1ms and it would be interesting to 
> see exactly how much, from how many workers.
> 
> If desire is to interleave completion and submission I think that should 
> be made more explicit (less depending on how fast is the underlying 
> machine). For example adding delays into the submit loop to actually 
> guarantee that.

Fair point.

> But I would also recommend parallel submit and parallel submit vs 
> completions are tested in separate test cases. It should be easy to do 
> with some flags and almost no new code. I was suggesting flags for some 
> other thing in the initial review as well. Right, for auto-complete. So 
> flag could be something like:
> 
> +static const struct drm_sched_concurrent_params 
> drm_sched_concurrent_cases[] = {
> +	{
> +		.description = "Concurrently submit a single job in a single entity",
> +		.job_base_us = 1000,
> +		.num_jobs = 1,
> +		.num_subs = 32,
> 		.flags = INTERLEAVE_SUBMIT_AND_COMPLETE,
> +	},
> 
> In the submit loop:
> 
> +	for (i = 0; i < params->num_jobs; i++) {
> +		duration_us = params->job_base_us + (sub_data->id * 10);
> 		if (flags & INTERLEAVE_SUBMIT_AND_COMPLETE) {
> 			drm_mock_sched_job_set_duration_us(sub_data->jobs[i], duration_us);
> 			// Add a delay based on time elapse and job duration to guarantee job 
> completions start arriving
> 		}
> +		drm_mock_sched_job_submit(sub_data->jobs[i]);
> +	}
> 
> 
> And of course handle the job waiting stage appropriately depending on 
> auto-complete or not.
> 
> Anyway, testing as little as possible at a time is a best practice I 
> recommend, but if you insist, there is also nothing fundamentally wrong 
> with the test as you have it so I won't block it.

Agreed. Unit tests should test one functionality at a time and be clear
about which one. I'll follow your suggestions and have two separate test
cases: a basic one for concurrent submissions with manual timeline
advancement (no batches, workers submit all jobs at once) and then a
second one with automatic timeline advancement for testing interleaving
submissions with completions.

At this point though, I think it would be better to move the tests to a
separate source file, as the partial similarity of the first concurrent
test case with drm_sched_basic_submit could create some confusion.

Thanks,
Marco

next prev parent reply	other threads:[~2026-02-06 10:36 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-20 20:52 [RFC PATCH] drm/sched: Add new KUnit test suite for concurrent job submission Marco Pagani
2026-01-22  7:13 ` Philipp Stanner
2026-01-22  9:51 ` Tvrtko Ursulin
2026-01-28 11:39   ` Marco Pagani
2026-01-28 16:17     ` Philipp Stanner
2026-01-29 15:44     ` Tvrtko Ursulin
2026-02-04 16:33       ` Marco Pagani
2026-02-05  9:53         ` Tvrtko Ursulin
2026-02-06 10:36           ` Marco Pagani [this message]
2026-02-06 12:12             ` Tvrtko Ursulin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dde160ab-57bd-4727-be7f-d035cd9e7f19@linux.dev \
    --to=marco.pagani@linux.dev \
    --cc=airlied@gmail.com \
    --cc=ckoenig.leichtzumerken@gmail.com \
    --cc=dakr@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=marpagan@redhat.com \
    --cc=matthew.brost@intel.com \
    --cc=mripard@kernel.org \
    --cc=phasta@kernel.org \
    --cc=simona@ffwll.ch \
    --cc=tursulin@ursulin.net \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.