From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A6840C46467 for ; Tue, 10 Jan 2023 12:19:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0D14310E5AB; Tue, 10 Jan 2023 12:19:43 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id D44ED10E5AB; Tue, 10 Jan 2023 12:19:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1673353180; x=1704889180; h=message-id:date:mime-version:subject:from:to:cc: references:in-reply-to:content-transfer-encoding; bh=hMAKru4BErMxJmscpS8U4yfvHGR7TInX+pjKzB6Geb8=; b=iWzgDYhdLNxSGG4L9oVQCVwZ1bEhDjQBBHEE+xXS42x2VonI5yhlebXv e9tMz4p7Z1qi+Rcr1VP83OqRjMGXbY8FLoyInDchxbdasDaHmiI6fMVZ8 vR/4LdBHUw06XmBQIs0Tq4me9HF7+LJaFs3bKYwjEZh/Lu3acBWX/GTPS 4eu3IVwIAZvf4oZNOAdUpwnLQLsyz5FUu8bvLG+ASNeJSueva/9iIzExP pxN/gsbUDiEBM0jHfGR4zVEcwEvdCkNPTKxCB98MXQguXmeIztf/gmVCe 2132+Ev4AJ9LNOC1Xc/ZSFg0qaYK+wF77lUr29jPEeHCwEDXFZI492xxP A==; X-IronPort-AV: E=McAfee;i="6500,9779,10585"; a="323197377" X-IronPort-AV: E=Sophos;i="5.96,315,1665471600"; d="scan'208";a="323197377" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Jan 2023 04:19:39 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10585"; a="657036601" X-IronPort-AV: E=Sophos;i="5.96,315,1665471600"; d="scan'208";a="657036601" Received: from gregoire-mobl3.amr.corp.intel.com (HELO [10.212.88.71]) ([10.212.88.71]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Jan 2023 04:19:37 -0800 Message-ID: <3b1af19f-3500-3be8-9d28-ac7da5b7d728@linux.intel.com> Date: Tue, 10 Jan 2023 12:19:35 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1 Content-Language: en-US From: Tvrtko Ursulin To: Jason Ekstrand References: <20221222222127.34560-1-matthew.brost@intel.com> <20221222222127.34560-5-matthew.brost@intel.com> <20221230112042.2ddd1946@collabora.com> <20221230125508.57af8a14@collabora.com> <20230102083019.24b99647@collabora.com> <7d6df13c-6c2e-d713-edc8-128d6e19f187@linux.intel.com> Organization: Intel Corporation UK Plc In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [Intel-gfx] [RFC PATCH 04/20] drm/sched: Convert drm scheduler to use a work queue rather than kthread X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On 10/01/2023 11:28, Tvrtko Ursulin wrote: > > > On 09/01/2023 17:27, Jason Ekstrand wrote: > > [snip] > >>      >>> AFAICT it proposes to have 1:1 between *userspace* created >>     contexts (per >>      >>> context _and_ engine) and drm_sched. I am not sure avoiding >>     invasive changes >>      >>> to the shared code is in the spirit of the overall idea and >> instead >>      >>> opportunity should be used to look at way to refactor/improve >>     drm_sched. >> >> >> Maybe?  I'm not convinced that what Xe is doing is an abuse at all or >> really needs to drive a re-factor.  (More on that later.)  There's >> only one real issue which is that it fires off potentially a lot of >> kthreads. Even that's not that bad given that kthreads are pretty >> light and you're not likely to have more kthreads than userspace >> threads which are much heavier.  Not ideal, but not the end of the >> world either.  Definitely something we can/should optimize but if we >> went through with Xe without this patch, it would probably be mostly ok. >> >>      >> Yes, it is 1:1 *userspace* engines and drm_sched. >>      >> >>      >> I'm not really prepared to make large changes to DRM scheduler >>     at the >>      >> moment for Xe as they are not really required nor does Boris >>     seem they >>      >> will be required for his work either. I am interested to see >>     what Boris >>      >> comes up with. >>      >> >>      >>> Even on the low level, the idea to replace drm_sched threads >>     with workers >>      >>> has a few problems. >>      >>> >>      >>> To start with, the pattern of: >>      >>> >>      >>>    while (not_stopped) { >>      >>>     keep picking jobs >>      >>>    } >>      >>> >>      >>> Feels fundamentally in disagreement with workers (while >>     obviously fits >>      >>> perfectly with the current kthread design). >>      >> >>      >> The while loop breaks and worker exists if no jobs are ready. >> >> >> I'm not very familiar with workqueues. What are you saying would fit >> better? One scheduling job per work item rather than one big work item >> which handles all available jobs? > > Yes and no, it indeed IMO does not fit to have a work item which is > potentially unbound in runtime. But it is a bit moot conceptual mismatch > because it is a worst case / theoretical, and I think due more > fundamental concerns. > > If we have to go back to the low level side of things, I've picked this > random spot to consolidate what I have already mentioned and perhaps > expand. > > To start with, let me pull out some thoughts from workqueue.rst: > > """ > Generally, work items are not expected to hog a CPU and consume many > cycles. That means maintaining just enough concurrency to prevent work > processing from stalling should be optimal. > """ > > For unbound queues: > """ > The responsibility of regulating concurrency level is on the users. > """ > > Given the unbound queues will be spawned on demand to service all queued > work items (more interesting when mixing up with the system_unbound_wq), > in the proposed design the number of instantiated worker threads does > not correspond to the number of user threads (as you have elsewhere > stated), but pessimistically to the number of active user contexts. That > is the number which drives the maximum number of not-runnable jobs that > can become runnable at once, and hence spawn that many work items, and > in turn unbound worker threads. > > Several problems there. > > It is fundamentally pointless to have potentially that many more threads > than the number of CPU cores - it simply creates a scheduling storm. To make matters worse, if I follow the code correctly, all these per user context worker thread / work items end up contending on the same lock or circular buffer, both are one instance per GPU: guc_engine_run_job -> submit_engine a) wq_item_append -> wq_wait_for_space -> msleep b) xe_guc_ct_send -> guc_ct_send -> mutex_lock(&ct->lock); -> later a potential msleep in h2g_has_room Regards, Tvrtko