From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B6072C7EE2D for ; Tue, 2 May 2023 16:50:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 194BA10E0F6; Tue, 2 May 2023 16:50:15 +0000 (UTC) Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by gabe.freedesktop.org (Postfix) with ESMTPS id A415210E22A for ; Tue, 2 May 2023 16:50:12 +0000 (UTC) Received: from localhost (unknown [IPv6:2a01:e0a:2c:6930:5cf4:84a1:2763:fe0d]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by madras.collabora.co.uk (Postfix) with ESMTPSA id F41FB6603219; Tue, 2 May 2023 17:50:08 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1683046209; bh=vFTrOHJyWpt7KeT297iBopls6heX0p4Ispjj/VNUjN8=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=MSfq0XtRHVmVdlWCj8sy+PiN9/Rmncyql76Ax/+zXYCw0DCBUVeK2Hyf4P3EfXWRF WSKNEN3WRlmG8JHm4CX+s8uP/uaU+GO2jvhx9bt8oci3Tegy3QSTwKisAKXveScXWh y02aZ3KwtciEK2l97OZhjmErDaUsnoJChDP5DSJqA2JCDce3JtZ+S+IOvD62StpANd gEHF2Injphdb/VB+U/lSh5e+3O+wliDXjgQmYDLmldRTKgimjWqzHp9RaYOH4Z7nI+ iTnhnScNkLMfvO/u8iGhi+i92z2e+J0MU8AOuRYb4htQmCAu3fkLpPMkJIucaxUwza lRvhzfiMOFbVw== Date: Tue, 2 May 2023 18:50:06 +0200 From: Boris Brezillon To: Christian =?UTF-8?B?S8O2bmln?= Subject: Re: drm/sched: Replacement for drm_sched_resubmit_jobs() is deprecated Message-ID: <20230502185006.4b2e67a7@collabora.com> In-Reply-To: <5c4f4e89-6126-7701-2023-2628db1b7caa@amd.com> References: <20230502131941.5fe5b79f@collabora.com> <5c4f4e89-6126-7701-2023-2628db1b7caa@amd.com> Organization: Collabora X-Mailer: Claws Mail 4.1.1 (GTK 3.24.36; x86_64-redhat-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Matthew Brost , Sarah Walker , "dri-devel@lists.freedesktop.org" , Alex Deucher Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" +Matthew who's been working on the kthread -> wq transition. On Tue, 2 May 2023 13:36:07 +0200 Christian K=C3=B6nig wrote: > Hi Boris, >=20 > Am 02.05.23 um 13:19 schrieb Boris Brezillon: > > Hello Christian, Alex, > > > > As part of our transition to drm_sched for the powervr GPU driver, we > > realized drm_sched_resubmit_jobs(), which is used by all drivers > > relying on drm_sched right except amdgpu, has been deprecated. > > Unfortunately, commit 5efbe6aa7a0e ("drm/scheduler: deprecate > > drm_sched_resubmit_jobs") doesn't describe what drivers should do or use > > as an alternative. > > > > At the very least, for our implementation, we need to restore the > > drm_sched_job::parent pointers that were set to NULL in > > drm_sched_stop(), such that jobs submitted before the GPU recovery are > > considered active when drm_sched_start() is called. That could be done > > with a custom pending_list iteration restoring drm_sched_job::parent's > > pointer, but that seems odd to let the scheduler backend manipulate > > this list directly, and I suspect we need to do other checks, like the > > karma vs hang-limit thing, so we can flag the entity dirty and cancel > > all jobs being queued there if the entity has caused too many hangs. > > > > Now that drm_sched_resubmit_jobs() has been deprecated, that would be > > great if you could help us write a piece of documentation describing > > what should be done between drm_sched_stop() and drm_sched_start(), so > > new drivers don't come up with their own slightly different/broken > > version of the same thing. =20 >=20 > Yeah, really good point! The solution is to not use drm_sched_stop() and= =20 > drm_sched_start() either. >=20 > The general idea Daniel, the other Intel guys and me seem to have agreed= =20 > on is to convert the scheduler thread into a work item. >=20 > This work item for pushing jobs to the hw can then be queued to the same= =20 > workqueue we use for the timeout work item. >=20 > If this workqueue is now configured by your driver as single threaded=20 > you have a guarantee that only either the scheduler or the timeout work=20 > item is running at the same time. That in turn makes starting/stopping=20 > the scheduler for a reset completely superfluous. >=20 > Patches for this has already been floating on the mailing list, but=20 > haven't been committed yet. Since this is all WIP. As I said previously, our work is based on Matthew's patch series converting drm_sched threads to a workqueue based implementation. And I think there are a couple of shortcomings with the current implementation if we do what you suggest (one single-threaded workqueue used to serialize the GPU resets and job dequeuing/submission): - drm_sched_main() has a loop dequeuing jobs until it can't (because a dep is not signaled, or because all slots are taken), not one at a time. There can also be several 1:1 entities waiting for jobs to be dequeued before the reset work. That means the reset operation might be delayed if we don't have a drm_sched_stop() calling cancel_work() and making sure we break out of the loop. - with a workqueue (and that's even worse with a single-threaded one), the entity priorities are not enforced at the de-queuing level. The FW will still take care of execution priority, but you can have a low prio drm_sched_entity getting pushed a lot of jobs, and with the drm_sched_main() loop, you'll only let other entities dequeue their jobs when the ringbuf of this low prio entity is full or a job dep is not signaled. De-queuing one job at a time makes things better, but if you have a lot of 1:1 entities, it can still take sometime until you reach the high prio drm_sched worker. I'm just wondering if you already have solutions to these problems. Regards, Boris