Linux ARM-MSM sub-architecture
From: Akhil P Oommen <akhilpo@codeaurora.org>
To: Caleb Connolly <caleb.connolly@linaro.org>,
	Rob Clark <robdclark@gmail.com>
Cc: dri-devel <dri-devel@lists.freedesktop.org>,
	freedreno <freedreno@lists.freedesktop.org>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	Rob Clark <robdclark@chromium.org>, Sean Paul <sean@poorly.run>,
	David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>,
	Jordan Crouse <jordan@cosmicpenguin.net>,
	Jonathan Marek <jonathan@marek.ca>,
	Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	Sharat Masetty <smasetty@codeaurora.org>,
	open list <linux-kernel@vger.kernel.org>,
	Stephen Boyd <sboyd@kernel.org>
Subject: Re: [PATCH] drm/msm: Disable frequency clamping on a630
Date: Mon, 9 Aug 2021 22:56:40 +0530
Message-ID: <53d3e5b7-9dc0-a806-70e9-b9b5ff877462@codeaurora.org>
In-Reply-To: <83ecbe74-caf0-6c42-e6f5-4887b3b534c6@linaro.org>

On 8/9/2021 9:48 PM, Caleb Connolly wrote:
> 
> 
> On 09/08/2021 17:12, Rob Clark wrote:
>> On Mon, Aug 9, 2021 at 7:52 AM Akhil P Oommen <akhilpo@codeaurora.org> wrote:
>>>
>>> On 8/8/2021 10:22 PM, Rob Clark wrote:
>>>> On Sun, Aug 8, 2021 at 7:33 AM Caleb Connolly <caleb.connolly@linaro.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 07/08/2021 21:04, Rob Clark wrote:
>>>>>> On Sat, Aug 7, 2021 at 12:21 PM Caleb Connolly
>>>>>> <caleb.connolly@linaro.org> wrote:
>>>>>>>
>>>>>>> Hi Rob, Akhil,
>>>>>>>
>>>>>>> On 29/07/2021 21:53, Rob Clark wrote:
>>>>>>>> On Thu, Jul 29, 2021 at 1:28 PM Caleb Connolly
>>>>>>>> <caleb.connolly@linaro.org> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 29/07/2021 21:24, Rob Clark wrote:
>>>>>>>>>> On Thu, Jul 29, 2021 at 1:06 PM Caleb Connolly
>>>>>>>>>> <caleb.connolly@linaro.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Rob,
>>>>>>>>>>>
>>>>>>>>>>> I've done some more testing! It looks like before that patch
>>>>>>>>>>> ("drm/msm: Devfreq tuning") the GPU would never get above the
>>>>>>>>>>> second frequency in the OPP table (342MHz) (at least, not in
>>>>>>>>>>> glxgears). With the patch applied it would more aggressively
>>>>>>>>>>> jump up to the max frequency, which seems to be unstable at the
>>>>>>>>>>> default regulator voltages.
>>>>>>>>>>
>>>>>>>>>> *ohh*, yeah, ok, that would explain it
>>>>>>>>>>
>>>>>>>>>>> Hacking the pm8005 s1 regulator (which provides VDD_GFX) up 
>>>>>>>>>>> to 0.988v (instead of the stock 0.516v) makes the GPU stable
>>>>>>>>>>> at the higher frequencies.
>>>>>>>>>>>
>>>>>>>>>>> Applying this patch reverts the behaviour, and the GPU never
>>>>>>>>>>> goes above 342MHz, losing ~30% performance in glxgears.
>>>>>>>>>>>
>>>>>>>>>>> I think (?) that enabling CPR support would be the proper 
>>>>>>>>>>> solution to this - that would ensure that the regulators run
>>>>>>>>>>> at the voltage the hardware needs to be stable.
>>>>>>>>>>>
>>>>>>>>>>> Is hacking the voltage higher (although ideally not quite 
>>>>>>>>>>> that high) an acceptable short term solution until we have
>>>>>>>>>>> CPR? Or would it be safer to just not make use of the higher 
>>>>>>>>>>> frequencies on a630 for now?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> tbh, I'm not sure about the regulator stuff and CPR.. Bjorn is
>>>>>>>>>> already on CC and I added sboyd, maybe one of them knows better.
>>>>>>>>>>
>>>>>>>>>> In the short term, removing the higher problematic OPPs from dts
>>>>>>>>>> might be a better option than this patch (which I'm dropping),
>>>>>>>>>> since there is nothing stopping other workloads from hitting
>>>>>>>>>> higher OPPs.
>>>>>>>>> Oh yeah that sounds like a more sensible workaround than mine.
>>>>>>>>>>
>>>>>>>>>> I'm slightly curious why I didn't have problems at higher OPPs
>>>>>>>>>> on my c630 laptop (sdm850)
>>>>>>>>> Perhaps you won the silicon lottery - iirc sdm850 is binned
>>>>>>>>> for higher clocks out of the factory.
>>>>>>>>>
>>>>>>>>> Would it be best to drop the OPPs for all devices? Or just 
>>>>>>>>> those affected? I guess it's possible another c630 might
>>>>>>>>> crash where yours doesn't?
>>>>>>>>
>>>>>>>> I've not heard any reports of similar issues from the handful of
>>>>>>>> other folks with c630's on #aarch64-laptops.. but I can't really
>>>>>>>> say if that is luck or not.
>>>>>>> It looks like this affects at least the OnePlus 6 and PocoPhone F1.
>>>>>>> I've done some more poking and the following diff seems to fix the
>>>>>>> stability issues completely; the delay seems to be required to let
>>>>>>> the update propagate.
>>>>>>>
>>>>>>> This doesn't feel like the right fix, but hopefully it's enough 
>>>>>>> to come up with a better solution than disabling the new
>>>>>>> devfreq behaviour on a630.
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
>>>>>>> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>>>>>>> index d7cec7f0dde0..69e2a5e84dae 100644
>>>>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>>>>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>>>>>>> @@ -139,6 +139,10 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, 
>>>>>>> struct dev_pm_opp *opp)
>>>>>>>                     return;
>>>>>>>             }
>>>>>>>
>>>>>>> +       dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
>>>>>>> +
>>>>>>> +       usleep_range(300, 500);
>>>>>>> +
>>>>>>
>>>
>>> I am a bit confused. We don't define a power domain for gpu in dt,
>>> correct? Then what exactly does set_opp do here? Do you think this
>>> usleep is what is somehow helping to mask the issue?
> The power domains (for cx and gx) are defined in the GMU DT, the OPPs in 
> the GPU DT. For the sake of simplicity I'll refer to the lowest 
> frequency (257000000) and OPP level (RPMH_REGULATOR_LEVEL_LOW_SVS) as 
> the "min" state, and the highest frequency (710000000) and OPP level 
> (RPMH_REGULATOR_LEVEL_TURBO_L1) as the "max" state. These are defined in 
> sdm845.dtsi under the gpu node.
> 
> The new devfreq behaviour unmasks what I think is a driver bug: it
> inadvertently puts much more strain on the GPU regulators than they
> usually get. With the new behaviour the GPU jumps from its min state to
> the max state and back again extremely rapidly under workloads as small
> as refreshing UI. Where previously the GPU would rarely if ever go above
> 342MHz when interacting with the device, it now jumps between min and
> max many times per second.
> 
> If my understanding is correct, the current implementation of the GMU 
> set freq is the following:
>   - Get OPP for frequency to set
>   - Push the frequency to the GMU - immediately updating the core clock
>   - Call dev_pm_opp_set_opp() which triggers a notify chain, this winds 
> up somewhere in power management code and causes the gx regulator level 
> to be updated

Nope. dev_pm_opp_set_opp() sets the bandwidth for the gpu and nothing else.
Earlier we were using a different API, dev_pm_opp_set_bw(), which has
since been deprecated.

> 
> The regulator will then take some time to reach its new voltage level
> and stabilise. I believe that rapid transitions between min and max
> state - in combination with the increased current load from the GPU core
> - lead to the regulator becoming unstable (e.g. when it's requested to
> transition from its lowest to highest levels immediately after
> transitioning down), and the unstable voltage causes the GPU to crash.
>
> Silicon lottery will of course play a role here - this is very much an
> edge case and would definitely be different on a per-device and even
> per-unit basis.
>>
>> Hmm, I thought "opp-level = RPMH_REGULATOR_LEVEL_*" did *something*,
>> but tbh I'm not sure exactly what..
>>
>>> I feel we should just leave the new dcvs feature (shall we call it NAP?)
>>> disabled for a630 (and 10ms devfreq interval), until this is root 
>>> caused.
> I believe this hacky workaround expresses the root cause of the issue 
> quite clearly, by setting the OPP first and allowing the gx regulator to 
> become stable before telling the GPU to change clock speeds, we avoid 
> the edge case and prevent the crashes.
> 
> I took some rough measurements by adding logging to msm_devfreq_idle and
> causing UI updates for ~20 seconds. That function is being called about
> 30 times per second, which means the GPU is transitioning between min
> (idle) state and max (active / boost) state at that frequency and
> causing the issue I described above. It's likely that the usleep is
> helping to mask this behaviour.
> 
> I hope this serves as a slightly better explanation of what I perceive 
> to be the issue, I realise my previous explanations were not very 
> adequate, I apologise for all the noise.
>>
>> I suppose "NAP" is a reasonable name.
>>
>> But I think that reverting to previous behavior would not be enough,
>> there is nothing stopping devfreq from jumping from min to max freq,
>> which AFAIU should be enough to trigger this.  I guess that there just
>> hasn't been enough testing with different game workloads on those
>> phones to trigger this.
> Ack
>>
>> That said, I haven't seen similar issues on my sdm850 laptop, where I
>> defn have triggered min->max freq transitions.. I guess it would be
>> interesting to know if this issue could be reproduced on db845c, or if
>> it really is board specific?
> My db845c arrives this week, I'll definitely try and reproduce this.
>>
>> To workaround, I think we'd need to implement some way to limit that
>> maximum frequency jump (and then use delayed work to continue ramping
>> up the freq over time until we hit the target).. which seems like a
>> lot of work if this is just a board(s) specific workaround and isn't
>> needed once CPR is supported
> Based on my reasoning above, I came up with the following: reducing 
> thrashing by preventing rapid idle/active transitions. The minimum 
> active time of 30ms was just used for testing, I think some number 
> between 2 and 4 frames would be a sensible choice - the higher the safer.
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index d7cec7f0dde0..87f2d1085c3e 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -139,6 +139,8 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct 
> dev_pm_opp *opp)
>                  return;
>          }
> 
> +       dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
> +
>          gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);
> 
>          gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING,
> @@ -158,7 +160,6 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct 
> dev_pm_opp *opp)
>          if (ret)
>                  dev_err(gmu->dev, "GMU set GPU frequency error: %d\n", 
> ret);
> 
> -       dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
>          pm_runtime_put(gmu->dev);
>   }
> 
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index 0e4b45bff2e6..0e2293bcb46d 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -99,8 +99,8 @@ struct msm_gpu_devfreq {
>          /** time: Time of last sampling period. */
>          ktime_t time;
> 
> -       /** idle_time: Time of last transition to idle: */
> -       ktime_t idle_time;
> +       /** transition_time: Time of last transition between 
> idle/active: */
> +       ktime_t transition_time;
> 
>          /**
>           * idle_freq:
> diff --git a/drivers/gpu/drm/msm/msm_gpu_devfreq.c 
> b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
> index 0a1ee20296a2..774a7be33e7a 100644
> --- a/drivers/gpu/drm/msm/msm_gpu_devfreq.c
> +++ b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
> @@ -157,7 +157,7 @@ void msm_devfreq_active(struct msm_gpu *gpu)
>           */
>          mutex_lock(&df->devfreq->lock);
> 
> -       idle_time = ktime_to_ms(ktime_sub(ktime_get(), df->idle_time));
> +       idle_time = ktime_to_ms(ktime_sub(ktime_get(), 
> df->transition_time));
> 
>          /*
>           * If we've been idle for a significant fraction of a polling
> @@ -168,7 +168,7 @@ void msm_devfreq_active(struct msm_gpu *gpu)
>                  target_freq *= 2;
>          }
> 
> -       df->idle_freq = 0;
> +       df->transition_time = ktime_get();
> 
>          msm_devfreq_target(&gpu->pdev->dev, &target_freq, 0);
> 
> @@ -185,6 +185,16 @@ void msm_devfreq_idle(struct msm_gpu *gpu)
>   {
>          struct msm_gpu_devfreq *df = &gpu->devfreq;
>          unsigned long idle_freq, target_freq = 0;
> +       unsigned int active_time;
> +
> +       active_time = ktime_to_ms(ktime_sub(ktime_get(), 
> df->transition_time));
> +       /*
> +        * Don't go back to idle unless we've been active for at least 30ms
> +        * to avoid thrashing.

This basically defeats the purpose of this feature! At least, we should
keep this '30' gpu specific. Would a Kconfig option make sense here? BTW,
if 300us was helping you earlier, why do you want it to be 30ms now?

> +        */
> +       if (active_time < 30) {
> +               return;
> +       }
> 
>          /*
>           * Hold devfreq lock to synchronize with get_dev_status()/
> @@ -196,7 +206,7 @@ void msm_devfreq_idle(struct msm_gpu *gpu)
> 
>          msm_devfreq_target(&gpu->pdev->dev, &target_freq, 0);
> 
> -       df->idle_time = ktime_get();
> +       df->transition_time = ktime_get();
>          df->idle_freq = idle_freq;
> 
>          mutex_unlock(&df->devfreq->lock);
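
As a rough illustration of keeping the threshold gpu specific, here is a
standalone C sketch of the minimum-active-time check from the diff above.
It is not driver code: struct fake_devfreq, min_active_ms, fake_active()
and fake_idle() are hypothetical names made up for this example, and
timestamps are plain milliseconds instead of ktime_t. A threshold of 0
disables the check, so only GPUs that need it would pay the extra latency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the devfreq state; not the real msm_gpu_devfreq. */
struct fake_devfreq {
	int64_t transition_ms;      /* time of last idle/active transition */
	unsigned int min_active_ms; /* assumed per-GPU threshold, 0 = no limit */
	bool active;                /* true while the GPU is in the active state */
};

/* Mark the GPU active and remember when that happened. */
static void fake_active(struct fake_devfreq *df, int64_t now_ms)
{
	assert(df);
	if (!df->active) {
		df->active = true;
		df->transition_ms = now_ms;
	}
}

/*
 * Try to go idle. Returns true if the transition was taken, false if it
 * was refused because the GPU has not been active for min_active_ms yet.
 */
static bool fake_idle(struct fake_devfreq *df, int64_t now_ms)
{
	assert(df);
	if (!df->active)
		return false;

	if (now_ms - df->transition_ms < (int64_t)df->min_active_ms)
		return false;

	df->active = false;
	df->transition_ms = now_ms;
	return true;
}
```

With min_active_ms = 30, a fake_idle() call 10ms after fake_active() is
refused, while a call 30ms after it goes through. With min_active_ms = 0
every idle request is taken immediately.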
>>
>> BR,
>> -R
>>
>>>>>> Hmm, this is going to be in the critical path on idle -> active
>>>>>> transition (ie. think response time to user-input).. so we defn don't
>>>>>> want to do this unconditionally..
>>>>>>
>>>>>> If I understand the problem, we just want to limit how far we jump
>>>>>> the gpu freq in one go.. maybe deleting the lowest (and perhaps
>>>>>> highest)
>>>>>> OPP would accomplish that?  Could that be done in the board(s)'s
>>>>>> toplevel dts files?
>>>>> That would be a workaround, however I'd really like to avoid 
>>>>> limiting performance as a solution if I can help it,
>>>>> especially as the fix might just be "set the opp first, wait for it 
>>>>> to apply, then set the core clock".
>>>>>
>>>>> Is there a sensible way to get a callback from the opp notify 
>>>>> chain? Or from rpmh directly? Or is this solution really
>>>>> not the right way to go?
>>>>
>>>> It does seem a bit strange to me that we are telling GMU to change
>>>> freq before calling dev_pm_opp_set_opp()..  if dev_pm_opp_set_opp() is
>>>> increasing voltage, it seems like you'd want to do that *before*
>>>> increasing freq (but reverse the order when decreasing freq).. But I'm
>>>> not an expert on the ways of the GMU..  maybe Akhil or Jordan knows
>>>> better how this is supposed to work.
>>>
>>> For legacy gmu, we trigger DCVS using DCVS OOB which comes later in this
>>> function. But the order between regulator and clock which you mentioned
>>> is correct.
>>>
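
For reference, the ordering agreed on above (raise the rail before the
clock when scaling up, lower the clock before the rail when scaling down)
can be sketched in standalone C. This only models the invariant; it says
nothing about how the GMU or RPMh actually sequence things. struct
fake_gpu, struct fake_opp and the helpers are hypothetical, and any level
values used with them are arbitrary numbers, not real
RPMH_REGULATOR_LEVEL_* values.

```c
#include <assert.h>

/* Hypothetical operating point: a clock rate and the rail level it needs. */
struct fake_opp {
	unsigned long freq_hz;
	unsigned int level;
};

/* Hypothetical GPU power state used only to check the ordering. */
struct fake_gpu {
	unsigned long freq_hz;   /* current core clock */
	unsigned int freq_level; /* rail level required by freq_hz */
	unsigned int rail_level; /* rail level currently supplied */
	int violations;          /* times the clock outran the rail */
};

/* Count a violation whenever the rail is below what the clock needs. */
static void fake_check(struct fake_gpu *gpu)
{
	if (gpu->rail_level < gpu->freq_level)
		gpu->violations++;
}

static void fake_set_clock(struct fake_gpu *gpu, const struct fake_opp *opp)
{
	gpu->freq_hz = opp->freq_hz;
	gpu->freq_level = opp->level;
	fake_check(gpu);
}

static void fake_set_rail(struct fake_gpu *gpu, const struct fake_opp *opp)
{
	gpu->rail_level = opp->level;
	fake_check(gpu);
}

/* Scale up: rail first, then clock. Scale down: clock first, then rail. */
static void fake_set_opp(struct fake_gpu *gpu, const struct fake_opp *opp)
{
	assert(gpu && opp);
	if (opp->freq_hz > gpu->freq_hz) {
		fake_set_rail(gpu, opp);
		fake_set_clock(gpu, opp);
	} else {
		fake_set_clock(gpu, opp);
		fake_set_rail(gpu, opp);
	}
}
```

Going from a low to a high operating point and back through
fake_set_opp() leaves violations at 0; swapping the two calls in either
branch would make the counter increment.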
>>>>
>>>> But the delay seems like papering something over, and I'm trying to go
>>>> in the other direction and reduce latency between user input and
>>>> pageflip..
>>>>
>>>> BR,
>>>> -R
>>>>
>>>>>>
>>>>>> BR,
>>>>>> -R
>>>>>>
>>>>>>>             gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);
>>>>>>>
>>>>>>>             gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING,
>>>>>>> @@ -158,7 +162,6 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, 
>>>>>>> struct dev_pm_opp *opp)
>>>>>>>             if (ret)
>>>>>>>                     dev_err(gmu->dev, "GMU set GPU frequency 
>>>>>>> error: %d\n", ret);
>>>>>>>
>>>>>>> -       dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
>>>>>>>             pm_runtime_put(gmu->dev);
>>>>>>>      }
>>>>>>>>
>>>>>>>> Maybe just remove it for affected devices?  But I'll defer to 
>>>>>>>> Bjorn.
>>>>>>>>
>>>>>>>> BR,
>>>>>>>> -R
>>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> Kind Regards,
>>>>>>> Caleb (they/them)
>>>>>
>>>>> -- 
>>>>> Kind Regards,
>>>>> Caleb (they/them)
>>>
> 

