Intel-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: John Harrison <john.c.harrison@intel.com>
Cc: jeff.mcgee@intel.com, intel-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm/i915/guc/slpc: Use non-blocking H2G for waitboost
Date: Wed, 08 Jun 2022 10:39:41 -0700	[thread overview]
Message-ID: <87tu8vjbqa.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <c9edf8d0-d36c-8b65-d536-f8eee4986662@intel.com>

On Tue, 07 Jun 2022 16:15:19 -0700, John Harrison wrote:
>
> On 6/7/2022 15:29, Dixit, Ashutosh wrote:
> > On Sat, 14 May 2022 23:05:06 -0700, Vinay Belgaumkar wrote:
> >> SLPC min/max frequency updates require H2G calls. We are seeing
> >> timeouts when GuC channel is backed up and it is unable to respond
> >> in a timely fashion causing warnings and affecting CI.
> >>
> >> This is seen when waitboosting happens during a stress test.
> >> this patch updates the waitboost path to use a non-blocking
> >> H2G call instead, which returns as soon as the message is
> >> successfully transmitted.
> > Overall I think this patch is trying to paper over problems in the blocking
> > H2G CT interface (specifically the 1 second timeout in
> > wait_for_ct_request_update()). So I think we should address that problem in
> > the interface directly rather than having each client (SLPC and any future
> > client) work around the problem. Following points:
> >
> > 1. This patch seems to assume that it is 'ok' to ignore the return code
> >     from FW for a waitboost request (arguing waitboost is best effort so
> >     it's ok to 'fire and forget'). But the return code is still useful
> >     e.g. in cases where we see performance issues and want to go back and
> >     investigate if FW rejected any waitboost requests.
>
> You still get errors reported in the GuC log. Indeed, some errors (or at
> least error reasons) are only visible in the log not in the return code.

OK, so we at least have this method for debug available.

> > 2. We are already seeing that a 1 second timeout is not sufficient. So why
> >     not simply increase that timeout?
> >
> > 3. In fact if we are saying that the CT interface is a "reliable" interface
> >     (implying no message loss), to ensure reliability that timeout should
> >     not simply be increased, it should be made "infinite" (in quotes).
> >
> > 4. Maybe it would have been best to not have a "blocking" H2G interface at
> >     all (with the wait in wait_for_ct_request_update()). Just have an
> >     asynchronous interface (which mirrors the actual interface between FW
> >     and i915) in which clients register callbacks which are invoked when FW
> >     responds. If this is too big a change we can probably continue with the
> >     current blocking interface after increasing the timeout as mentioned
> >     above.
> >
> > 5. Finally, the waitboost request is just the most likely to get stuck at
> >     the back of a full CT queue since it happens during normal
> >     operation. Actually any request, say one initiated from sysfs, can also
> >     get similarly stuck at the back of a full queue. So any solution should
> >     also address that situation (where the return code is needed and
> >     similarly for a future client of the "blocking" (REQUEST/RESPONSE)
> >     interface).
> The blocking interface is only intended for init time operations, not
> runtime.

In that case we should probably have code to enforce this in i915.

> Stuff where the operation is meant to be synchronous and the KMD
> should not proceed until it has an ack back from the GuC that the update
> has taken place. All runtime operations are expected to be asynchronous. If
> a response is required, then it should be sent via an async
> callback. E.g. context de-registration is a 'fire and forget' H2G call but
> gets a 'deregistration complete' G2H notification when it is safe for the
> KMD to free up the associated storage.

At present all GuC interactions in intel_guc_slpc.c (in i915) do *not*
follow this. They use the REQUEST/RESPONSE FW interface which is pushed
through the blocking H2G CT interface in i915. If we are serious about this
this needs a GuC FW change to use bi-directional EVENT's used in the
asynchronous interface (with corresponding changes in intel_guc_slpc.c).

> There is an 'errors only' H2G mechanism. That will not send an ack back in
> the case of a successful H2G but will send back an error notification in
> the case of a failure. All async H2Gs should really be using that
> mechanism. I think Michal W did post a patch for it and I was meant to be
> reviewing it but it dropped of my radar due to other higher priorities.

These I believe are referred to as FAST_REQUEST's in GuC FW. That success
is not communicated back to the KMD might be an issue in cases where KMD
needs to know whether a particular operation was successful (such as
for operations initiated via sysfs).

Thanks.
--
Ashutosh

  reply	other threads:[~2022-06-08 17:39 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-15  6:05 [Intel-gfx] [PATCH] drm/i915/guc/slpc: Use non-blocking H2G for waitboost Vinay Belgaumkar
2022-05-15  6:39 ` [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/guc/slpc: Use non-blocking H2G for waitboost (rev2) Patchwork
2022-05-15  7:51 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2022-05-16  7:59 ` [Intel-gfx] [PATCH] drm/i915/guc/slpc: Use non-blocking H2G for waitboost Jani Nikula
2022-05-16  8:00   ` Jani Nikula
2022-06-07 23:02   ` John Harrison
2022-06-07 23:04     ` John Harrison
2022-06-08  7:58       ` Jani Nikula
2022-06-07 22:29 ` Dixit, Ashutosh
2022-06-07 23:15   ` John Harrison
2022-06-08 17:39     ` Dixit, Ashutosh [this message]
2022-06-22  0:26 ` Dixit, Ashutosh
2022-06-22 20:30   ` Belgaumkar, Vinay
2022-06-22 21:28     ` Dixit, Ashutosh
2022-06-23  8:12       ` Tvrtko Ursulin
  -- strict thread matches above, loose matches on Subject: below --
2022-06-23  0:32 Vinay Belgaumkar
2022-06-23  0:53 ` Dixit, Ashutosh
2022-05-05  5:40 Vinay Belgaumkar
2022-05-05 12:13 ` Tvrtko Ursulin
2022-05-05 17:21   ` Belgaumkar, Vinay
2022-05-05 18:36     ` John Harrison
2022-05-06  7:18       ` Tvrtko Ursulin
2022-05-06 16:21         ` Belgaumkar, Vinay
2022-05-06 16:43         ` John Harrison
2022-05-15  5:46           ` Belgaumkar, Vinay

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tu8vjbqa.wl-ashutosh.dixit@intel.com \
    --to=ashutosh.dixit@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jeff.mcgee@intel.com \
    --cc=john.c.harrison@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox