From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: John Harrison <john.c.harrison@intel.com>
Cc: jeff.mcgee@intel.com, intel-gfx@lists.freedesktop.org,
dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Use non-blocking H2G for waitboost
Date: Wed, 08 Jun 2022 10:39:41 -0700
Message-ID: <87tu8vjbqa.wl-ashutosh.dixit@intel.com>
In-Reply-To: <c9edf8d0-d36c-8b65-d536-f8eee4986662@intel.com>
On Tue, 07 Jun 2022 16:15:19 -0700, John Harrison wrote:
>
> On 6/7/2022 15:29, Dixit, Ashutosh wrote:
> > On Sat, 14 May 2022 23:05:06 -0700, Vinay Belgaumkar wrote:
> >> SLPC min/max frequency updates require H2G calls. We are seeing
> >> timeouts when GuC channel is backed up and it is unable to respond
> >> in a timely fashion causing warnings and affecting CI.
> >>
> >> This is seen when waitboosting happens during a stress test.
> >> This patch updates the waitboost path to use a non-blocking
> >> H2G call instead, which returns as soon as the message is
> >> successfully transmitted.
> > Overall I think this patch is trying to paper over problems in the blocking
> > H2G CT interface (specifically the 1 second timeout in
> > wait_for_ct_request_update()). So I think we should address that problem in
> > the interface directly rather than having each client (SLPC and any future
> > client) work around the problem. Following points:
> >
> > 1. This patch seems to assume that it is 'ok' to ignore the return code
> > from FW for a waitboost request (arguing waitboost is best effort so
> > it's ok to 'fire and forget'). But the return code is still useful
> > e.g. in cases where we see performance issues and want to go back and
> > investigate if FW rejected any waitboost requests.
>
> You still get errors reported in the GuC log. Indeed, some errors (or at
> least error reasons) are only visible in the log not in the return code.

OK, so we at least have this method available for debug.

> > 2. We are already seeing that a 1 second timeout is not sufficient. So why
> > not simply increase that timeout?
> >
> > 3. In fact if we are saying that the CT interface is a "reliable" interface
> > (implying no message loss), to ensure reliability that timeout should
> > not simply be increased, it should be made "infinite" (in quotes).
> >
> > 4. Maybe it would have been best to not have a "blocking" H2G interface at
> > all (with the wait in wait_for_ct_request_update()). Just have an
> > asynchronous interface (which mirrors the actual interface between FW
> > and i915) in which clients register callbacks which are invoked when FW
> > responds. If this is too big a change we can probably continue with the
> > current blocking interface after increasing the timeout as mentioned
> > above.
> >
> > 5. Finally, the waitboost request is just the most likely to get stuck at
> > the back of a full CT queue since it happens during normal
> > operation. Actually any request, say one initiated from sysfs, can also
> > get similarly stuck at the back of a full queue. So any solution should
> > also address that situation (where the return code is needed and
> > similarly for a future client of the "blocking" (REQUEST/RESPONSE)
> > interface).
> The blocking interface is only intended for init time operations, not
> runtime.

In that case we should probably have code to enforce this in i915.
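
As a rough illustration of what such enforcement could look like (a
hypothetical sketch only, not actual i915 code: struct guc_sketch, the
init_done flag and both function names are made up for this example), the
blocking send path could refuse callers once init has completed:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical state; this is NOT the real struct intel_guc. */
struct guc_sketch {
	bool init_done;               /* set once init-time setup has finished */
	unsigned int blocked_rejects; /* runtime blocking sends that were refused */
};

/* Stand-in for the real blocking H2G send; it always "succeeds" here. */
static int sketch_ct_send_blocking(struct guc_sketch *guc, unsigned int action)
{
	(void)guc;
	(void)action;
	return 0;
}

/*
 * Allow the blocking path only before init has completed. Runtime callers
 * get -EPERM and are counted, so misuse is visible rather than silent.
 */
static int sketch_send_blocking_checked(struct guc_sketch *guc,
					unsigned int action)
{
	assert(guc);

	if (guc->init_done) {
		guc->blocked_rejects++;
		return -EPERM;
	}

	return sketch_ct_send_blocking(guc, action);
}
```

In a real driver the rejection would more likely be a WARN than a counter;
the counter is used here only to keep the sketch self-contained.
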
> Stuff where the operation is meant to be synchronous and the KMD
> should not proceed until it has an ack back from the GuC that the update
> has taken place. All runtime operations are expected to be asynchronous. If
> a response is required, then it should be sent via an async
> callback. E.g. context de-registration is a 'fire and forget' H2G call but
> gets a 'deregistration complete' G2H notification when it is safe for the
> KMD to free up the associated storage.

At present, all GuC interactions in intel_guc_slpc.c (in i915) do *not*
follow this. They use the REQUEST/RESPONSE FW interface, which is pushed
through the blocking H2G CT interface in i915. If we are serious about this,
it needs a GuC FW change so that SLPC uses the bi-directional EVENTs of the
asynchronous interface (with corresponding changes in intel_guc_slpc.c).
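
To make the asynchronous model concrete, here is a minimal sketch of
callback-based completion tracking. It is an assumption-laden illustration,
not the i915 CT code or the GuC ABI: every name below (sketch_ct,
sketch_send_async, sketch_handle_response, the fixed-size pending table) is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PENDING 8

/* Callback invoked when the (hypothetical) FW response arrives. */
typedef void (*sketch_g2h_cb)(int status, void *data);

struct sketch_pending {
	unsigned int fence; /* identifies the outstanding request */
	sketch_g2h_cb cb;
	void *data;
	int in_use;
};

struct sketch_ct {
	struct sketch_pending pending[SKETCH_MAX_PENDING];
	unsigned int next_fence;
};

/*
 * Record a request and return immediately, without waiting for FW.
 * Returns the request's fence (non-zero), or 0 if the table is full.
 */
static unsigned int sketch_send_async(struct sketch_ct *ct, sketch_g2h_cb cb,
				      void *data)
{
	assert(ct && cb);

	for (size_t i = 0; i < SKETCH_MAX_PENDING; i++) {
		struct sketch_pending *p = &ct->pending[i];

		if (p->in_use)
			continue;

		if (++ct->next_fence == 0) /* keep 0 reserved for "failed" */
			ct->next_fence = 1;

		p->fence = ct->next_fence;
		p->cb = cb;
		p->data = data;
		p->in_use = 1;
		/* A real implementation would write the H2G message here. */
		return p->fence;
	}

	return 0;
}

/*
 * Called when a response for 'fence' arrives. Invokes the registered
 * callback once. Returns 0 on a match, -1 for an unknown fence.
 */
static int sketch_handle_response(struct sketch_ct *ct, unsigned int fence,
				  int status)
{
	assert(ct);

	for (size_t i = 0; i < SKETCH_MAX_PENDING; i++) {
		struct sketch_pending *p = &ct->pending[i];

		if (p->in_use && p->fence == fence) {
			p->in_use = 0;
			p->cb(status, p->data);
			return 0;
		}
	}

	return -1;
}

/* Example callback: store the FW status where the client can inspect it. */
static void sketch_record_status(int status, void *data)
{
	*(int *)data = status;
}
```

With something along these lines, an SLPC client could pass
sketch_record_status (or its own handler) and still learn the FW return
code without blocking the caller.
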
> There is an 'errors only' H2G mechanism. That will not send an ack back in
> the case of a successful H2G but will send back an error notification in
> the case of a failure. All async H2Gs should really be using that
> mechanism. I think Michal W did post a patch for it and I was meant to be
> reviewing it but it dropped off my radar due to other higher priorities.

These, I believe, are referred to as FAST_REQUESTs in GuC FW. The fact that
success is not communicated back to the KMD might be an issue in cases where
the KMD needs to know whether a particular operation was successful (such as
operations initiated via sysfs).

Thanks.
--
Ashutosh