From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC
Date: Thu, 20 Oct 2022 11:33:23 -0700 [thread overview]
Message-ID: <874jvyqr9o.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <20221020002944.4228-1-vinay.belgaumkar@intel.com>
On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote:
>
Hi Vinay,
> Waitboost (when SLPC is enabled) results in a H2G message. This can result
> in thousands of messages during a stress test and fill up an already full
> CTB. There is no need to request for RP0 if GuC is already requesting the
> same.
But how are we sure that the freq will remain at RP0 in the future (when
the waiting request or any requests which are ahead execute)?
In the current waitboost implementation, set_param is sent to GuC ahead of
the waiting request to ensure that the freq would be max when this waiting
request executed on the GPU and the freq is kept at max till this request
retires (considering just one waiting request). How can we ensure this if
we don't send the waitboost set_param to GuC?
I had assumed we'll do this optimization for server parts where min is
already RP0 in which case we can completely disable waitboost. But this
patch is something else.
Thanks.
--
Ashutosh
>
> v2: Add the tracing back, and check requested freq
> in the worker thread (Tvrtko)
> v3: Check requested freq in dec_waiters as well
>
> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++
> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++---
> 2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
> index fc23c562d9b2..18b75cf08d1b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq)
> if (rps_uses_slpc(rps)) {
> slpc = rps_to_slpc(rps);
>
> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
> + rq->fence.context, rq->fence.seqno);
> +
> /* Return if old value is non zero */
> if (!atomic_fetch_inc(&slpc->num_waiters))
> schedule_work(&slpc->boost_work);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> index b7cdeec44bd3..9dbdbab1515a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
> static void slpc_boost_work(struct work_struct *work)
> {
> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work);
> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
> int err;
>
> /*
> * Raise min freq to boost. It's possible that
> * this is greater than current max. But it will
> * certainly be limited by RP0. An error setting
> - * the min param is not fatal.
> + * the min param is not fatal. No need to boost
> + * if we are already requesting it.
> */
> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq)
> + return;
> +
> mutex_lock(&slpc->lock);
> if (atomic_read(&slpc->num_waiters)) {
> err = slpc_force_min_freq(slpc, slpc->boost_freq);
> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val)
>
> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
> {
> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
> /*
> * Return min back to the softlimit.
> * This is called during request retire,
> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
> * set_param fails.
> */
> mutex_lock(&slpc->lock);
> - if (atomic_dec_and_test(&slpc->num_waiters))
> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
> + if (atomic_dec_and_test(&slpc->num_waiters)) {
> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit)
> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
> + }
> mutex_unlock(&slpc->lock);
> }
>
> --
> 2.35.1
>
WARNING: multiple messages have this Message-ID (diff)
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC
Date: Thu, 20 Oct 2022 11:33:23 -0700 [thread overview]
Message-ID: <874jvyqr9o.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <20221020002944.4228-1-vinay.belgaumkar@intel.com>
On Wed, 19 Oct 2022 17:29:44 -0700, Vinay Belgaumkar wrote:
>
Hi Vinay,
> Waitboost (when SLPC is enabled) results in a H2G message. This can result
> in thousands of messages during a stress test and fill up an already full
> CTB. There is no need to request for RP0 if GuC is already requesting the
> same.
But how are we sure that the freq will remain at RP0 in the future (when
the waiting request or any requests which are ahead execute)?
In the current waitboost implementation, set_param is sent to GuC ahead of
the waiting request to ensure that the freq would be max when this waiting
request executed on the GPU and the freq is kept at max till this request
retires (considering just one waiting request). How can we ensure this if
we don't send the waitboost set_param to GuC?
I had assumed we'll do this optimization for server parts where min is
already RP0 in which case we can completely disable waitboost. But this
patch is something else.
Thanks.
--
Ashutosh
>
> v2: Add the tracing back, and check requested freq
> in the worker thread (Tvrtko)
> v3: Check requested freq in dec_waiters as well
>
> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_rps.c | 3 +++
> drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 14 +++++++++++---
> 2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
> index fc23c562d9b2..18b75cf08d1b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -1016,6 +1016,9 @@ void intel_rps_boost(struct i915_request *rq)
> if (rps_uses_slpc(rps)) {
> slpc = rps_to_slpc(rps);
>
> + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
> + rq->fence.context, rq->fence.seqno);
> +
> /* Return if old value is non zero */
> if (!atomic_fetch_inc(&slpc->num_waiters))
> schedule_work(&slpc->boost_work);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> index b7cdeec44bd3..9dbdbab1515a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> @@ -227,14 +227,19 @@ static int slpc_force_min_freq(struct intel_guc_slpc *slpc, u32 freq)
> static void slpc_boost_work(struct work_struct *work)
> {
> struct intel_guc_slpc *slpc = container_of(work, typeof(*slpc), boost_work);
> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
> int err;
>
> /*
> * Raise min freq to boost. It's possible that
> * this is greater than current max. But it will
> * certainly be limited by RP0. An error setting
> - * the min param is not fatal.
> + * the min param is not fatal. No need to boost
> + * if we are already requesting it.
> */
> + if (intel_rps_get_requested_frequency(rps) == slpc->boost_freq)
> + return;
> +
> mutex_lock(&slpc->lock);
> if (atomic_read(&slpc->num_waiters)) {
> err = slpc_force_min_freq(slpc, slpc->boost_freq);
> @@ -728,6 +733,7 @@ int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val)
>
> void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
> {
> + struct intel_rps *rps = &slpc_to_gt(slpc)->rps;
> /*
> * Return min back to the softlimit.
> * This is called during request retire,
> @@ -735,8 +741,10 @@ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc)
> * set_param fails.
> */
> mutex_lock(&slpc->lock);
> - if (atomic_dec_and_test(&slpc->num_waiters))
> - slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
> + if (atomic_dec_and_test(&slpc->num_waiters)) {
> + if (intel_rps_get_requested_frequency(rps) != slpc->min_freq_softlimit)
> + slpc_force_min_freq(slpc, slpc->min_freq_softlimit);
> + }
> mutex_unlock(&slpc->lock);
> }
>
> --
> 2.35.1
>
next prev parent reply other threads:[~2022-10-20 18:35 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-20 0:29 [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC Vinay Belgaumkar
2022-10-20 0:29 ` Vinay Belgaumkar
2022-10-20 1:22 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/slpc: Optmize waitboost for SLPC (rev3) Patchwork
2022-10-20 18:33 ` Dixit, Ashutosh [this message]
2022-10-20 18:33 ` [Intel-gfx] [PATCH v3] drm/i915/slpc: Optmize waitboost for SLPC Dixit, Ashutosh
2022-10-20 20:16 ` Belgaumkar, Vinay
2022-10-20 20:16 ` Belgaumkar, Vinay
2022-10-20 23:36 ` Dixit, Ashutosh
2022-10-20 23:36 ` Dixit, Ashutosh
2022-10-21 18:24 ` Belgaumkar, Vinay
2022-10-21 18:24 ` Belgaumkar, Vinay
2022-10-21 18:40 ` Dixit, Ashutosh
2022-10-21 18:40 ` Dixit, Ashutosh
2022-10-21 22:46 ` Belgaumkar, Vinay
2022-10-21 22:46 ` Belgaumkar, Vinay
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=874jvyqr9o.wl-ashutosh.dixit@intel.com \
--to=ashutosh.dixit@intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-gfx@lists.freedesktop.org \
--cc=vinay.belgaumkar@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.