From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
intel-gfx <intel-gfx@lists.freedesktop.org>,
dri-devel <dri-devel@lists.freedesktop.org>,
Chris Wilson <chris.p.wilson@linux.intel.com>,
Tvrtko Ursulin <tursulin@ursulin.net>
Subject: Re: [PATCH v3 00/15] CCS static load balance
Date: Wed, 28 Aug 2024 15:47:21 +0200 [thread overview]
Message-ID: <Zs8qaZftGbq7Ls00@phenom.ffwll.local> (raw)
In-Reply-To: <Zs7dv57piSIj3Og4@ashyti-mobl2.lan>
On Wed, Aug 28, 2024 at 10:20:15AM +0200, Andi Shyti wrote:
> Hi Sima,
>
> first of all, thanks for looking into this series.
>
> On Tue, Aug 27, 2024 at 07:31:21PM +0200, Daniel Vetter wrote:
> > On Fri, Aug 23, 2024 at 03:08:40PM +0200, Andi Shyti wrote:
> > > Hi,
> > >
> > > This patch series introduces static load balancing for GPUs with
> > > multiple compute engines. It's a lengthy series, and some
> > > challenging aspects still need to be resolved.
> >
> > Do we have an actual user for this, where just reloading the entire driver
> > (or well-rebinding, if you only want to change the value for a specific
> > device) with a new module option isn't enough?
>
> Yes, we have users for this and this has been already agreed with
> architects and maintainers.
So my understanding is that for upstream, this only applies to dg2,
because the other platforms don't have enough CCS engines to make this a
real issue.
Do we really have upstream demand for this feature on dg2 only?
Also how hard would it be to make these users happy with xe-on-dg2 in
upstream instead?
> Why are you saying that we are reloading/rebinding the driver?
That's the other alternate solution.
> I'm only removing the exposure of user engines, which is
> basically a flag in the engines data structure.
>
> > There's some really gnarly locking and lifetime fun in there, and it needs
> > a corresponding justification.
>
> What locking are you referring about?
>
> I only added one single mutex that has a comment and a
> justification. If you think that's not enough, I can of course
> improve it (please note that the changes have a good amount of
> comments and I tried to be aso more descriptive as I could).
>
> When I change the engines configurations only for the compute
> engines and only for DG2 platforms, I need to make sure that no
> other user is affected by the change. Thus I need to make sure
> that access to some of the strucures are properly serialized.
>
> > Which needs to be enormous for this case,
> > meaning actual customers willing to shout on dri-devel that they really,
> > absolutely need this, or their machines will go up in flames.
> > Otherwise this is a nack from me.
>
> Would you please tell me why are you nacking the patch? So that I
> address your comments for v4?
So for one, this is substantially more flexible than the solution merged
into xe. And the patch set doesn't explain why (the commit messages
actualy describe the design xe has).
That does not inspire confidence at all.
Second, I don't think anyone understands the entire engine/ctx locking
design in i915-gem. And the fix for that was to make as much as absolutely
possible immutable. Yes the implementation looks correct, but when I
looked at the much, much simpler xe implementation I'm pretty sure I've
found an issue there too. Here I can't even tell.
-Sima
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
next prev parent reply other threads:[~2024-08-28 13:47 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-23 13:08 [PATCH v3 00/15] CCS static load balance Andi Shyti
2024-08-23 13:08 ` [PATCH v3 01/15] drm/i915/gt: Avoid using masked workaround for CCS_MODE setting Andi Shyti
2024-08-23 13:08 ` [PATCH v3 02/15] drm/i915/gt: Move the CCS mode variable to a global position Andi Shyti
2024-08-23 13:08 ` [PATCH v3 03/15] drm/i915/gt: Allow the creation of multi-mode CCS masks Andi Shyti
2024-08-23 13:08 ` [PATCH v3 04/15] drm/i915/gt: Refactor uabi engine class/instance list creation Andi Shyti
2024-08-23 13:08 ` [PATCH v3 05/15] drm/i915/gem: Mark and verify UABI engine validity Andi Shyti
2024-08-23 13:08 ` [PATCH v3 06/15] drm/i915/gt: Introduce for_each_enabled_engine() and apply it in selftests Andi Shyti
2024-08-23 13:08 ` [PATCH v3 07/15] drm/i915/gt: Manage CCS engine creation within UABI exposure Andi Shyti
2024-08-23 13:08 ` [PATCH v3 08/15] drm/i915/gt: Remove cslices mask value from the CCS structure Andi Shyti
2024-08-23 13:08 ` [PATCH v3 09/15] drm/i915/gt: Expose the number of total CCS slices Andi Shyti
2024-08-23 13:08 ` [PATCH v3 10/15] drm/i915/gt: Store engine-related sysfs kobjects Andi Shyti
2024-08-23 13:08 ` [PATCH v3 11/15] drm/i915/gt: Store active CCS mask Andi Shyti
2024-08-23 13:08 ` [PATCH v3 12/15] drm/i915: Protect access to the UABI engines list with a mutex Andi Shyti
2024-08-23 13:08 ` [PATCH v3 13/15] drm/i915/gt: Isolate single sysfs engine file creation Andi Shyti
2024-08-23 13:08 ` [PATCH v3 14/15] drm/i915/gt: Implement creation and removal routines for CCS engines Andi Shyti
2024-08-26 22:07 ` kernel test robot
2024-08-23 13:08 ` [PATCH v3 15/15] drm/i915/gt: Allow the user to change the CCS mode through sysfs Andi Shyti
2024-08-27 17:36 ` Daniel Vetter
2024-08-23 14:14 ` ✗ Fi.CI.CHECKPATCH: warning for CCS static load balance Patchwork
2024-08-23 14:14 ` ✗ Fi.CI.SPARSE: " Patchwork
2024-08-23 14:26 ` ✗ Fi.CI.BAT: failure " Patchwork
2024-08-23 15:48 ` ✗ Fi.CI.CHECKPATCH: warning for CCS static load balance (rev2) Patchwork
2024-08-23 15:48 ` ✗ Fi.CI.SPARSE: " Patchwork
2024-08-23 16:04 ` ✓ Fi.CI.BAT: success " Patchwork
2024-08-24 14:44 ` ✗ Fi.CI.IGT: failure " Patchwork
2024-08-27 9:25 ` [PATCH v3 00/15] CCS static load balance Andi Shyti
2024-08-27 17:31 ` Daniel Vetter
2024-08-28 8:20 ` Andi Shyti
2024-08-28 13:47 ` Daniel Vetter [this message]
2024-08-28 15:35 ` Andi Shyti
2024-09-02 11:13 ` Daniel Vetter
-- strict thread matches above, loose matches on Subject: below --
2024-08-23 13:08 Andi Shyti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zs8qaZftGbq7Ls00@phenom.ffwll.local \
--to=daniel.vetter@ffwll.ch \
--cc=andi.shyti@linux.intel.com \
--cc=chris.p.wilson@linux.intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-gfx@lists.freedesktop.org \
--cc=tursulin@ursulin.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox