linux-pm.vger.kernel.org archive mirror
* Re: [PATCH] Revert "clk: Fix invalid execution of clk_set_rate"
       [not found] ` <3fd004add188460bf2bdd1a718387c7f.sboyd@kernel.org>
@ 2024-12-03  8:25   ` Johan Hovold
  2024-12-03  9:21     ` Manivannan Sadhasivam
  0 siblings, 1 reply; 4+ messages in thread
From: Johan Hovold @ 2024-12-03  8:25 UTC
  To: Stephen Boyd, Viresh Kumar, Manivannan Sadhasivam
  Cc: Johan Hovold, Michael Turquette, linux-clk, linux-kernel,
	regressions, Aishwarya TCV, Chuan Liu, Sudeep Holla, linux-pm

[ +CC: Viresh and Sudeep ]

On Mon, Dec 02, 2024 at 05:20:06PM -0800, Stephen Boyd wrote:
> Quoting Johan Hovold (2024-12-02 02:06:21)
> > This reverts commit 25f1c96a0e841013647d788d4598e364e5c2ebb7.
> > 
> > The offending commit results in errors like
> > 
> >         cpu cpu0: _opp_config_clk_single: failed to set clock rate: -22
> > 
> > spamming the logs on the Lenovo ThinkPad X13s and other Qualcomm
> > machines when cpufreq tries to update the CPUFreq HW Engine clocks.
> > 
> > As mentioned in commit 4370232c727b ("cpufreq: qcom-hw: Add CPU clock
> > provider support"):
> > 
> >         [T]he frequency supplied by the driver is the actual frequency
> >         that comes out of the EPSS/OSM block after the DCVS operation.
> >         This frequency is not same as what the CPUFreq framework has set
> >         but it is the one that gets supplied to the CPUs after
> >         throttling by LMh.
> > 
> > which seems to suggest that the driver relies on the previous behaviour
> > of clk_set_rate().
> 
> I don't understand why a clk provider is needed there. Is anyone looking
> into the real problem?

I mentioned this to Mani yesterday, but I'm not sure if he has had time
to look into it yet. And I forgot to CC Viresh who was involved in
implementing this. There is a comment of his in the thread where this
feature was added:

	Most likely no one will ever do clk_set_rate() on this new
	clock, which is fine, though OPP core will likely do
	clk_get_rate() here.

which may suggest that some underlying assumption has changed. [1]

There are some more details in that thread that should explain why
things were implemented the way they were:

	https://lore.kernel.org/linux-arm-msm/20221117053145.10409-1-manivannan.sadhasivam@linaro.org/

> > Since this affects many Qualcomm machines, let's revert for now.
> > 
> > Fixes: 25f1c96a0e84 ("clk: Fix invalid execution of clk_set_rate")
> > Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
> > Link: https://lore.kernel.org/all/e2d83e57-ad07-411b-99f6-a4fc3c4534fa@arm.com/
> > Cc: Chuan Liu <chuan.liu@amlogic.com>
> > Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> > Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
> > ---
> 
> Applied to clk-fixes

Thanks.

Johan

[1] https://lore.kernel.org/linux-arm-msm/20221118055730.yrzpuih3zfko5c2q@vireshk-i7/


* Re: [PATCH] Revert "clk: Fix invalid execution of clk_set_rate"
  2024-12-03  8:25   ` [PATCH] Revert "clk: Fix invalid execution of clk_set_rate" Johan Hovold
@ 2024-12-03  9:21     ` Manivannan Sadhasivam
  2024-12-03 19:30       ` Stephen Boyd
  0 siblings, 1 reply; 4+ messages in thread
From: Manivannan Sadhasivam @ 2024-12-03  9:21 UTC
  To: Johan Hovold, Stephen Boyd
  Cc: Viresh Kumar, Johan Hovold, Michael Turquette, linux-clk,
	linux-kernel, regressions, Aishwarya TCV, Chuan Liu, Sudeep Holla,
	linux-pm

On Tue, Dec 03, 2024 at 09:25:01AM +0100, Johan Hovold wrote:
> [ +CC: Viresh and Sudeep ]
> 
> On Mon, Dec 02, 2024 at 05:20:06PM -0800, Stephen Boyd wrote:
> > Quoting Johan Hovold (2024-12-02 02:06:21)
> > > This reverts commit 25f1c96a0e841013647d788d4598e364e5c2ebb7.
> > > 
> > > The offending commit results in errors like
> > > 
> > >         cpu cpu0: _opp_config_clk_single: failed to set clock rate: -22
> > > 
> > > spamming the logs on the Lenovo ThinkPad X13s and other Qualcomm
> > > machines when cpufreq tries to update the CPUFreq HW Engine clocks.
> > > 
> > > As mentioned in commit 4370232c727b ("cpufreq: qcom-hw: Add CPU clock
> > > provider support"):
> > > 
> > >         [T]he frequency supplied by the driver is the actual frequency
> > >         that comes out of the EPSS/OSM block after the DCVS operation.
> > >         This frequency is not same as what the CPUFreq framework has set
> > >         but it is the one that gets supplied to the CPUs after
> > >         throttling by LMh.
> > > 
> > > which seems to suggest that the driver relies on the previous behaviour
> > > of clk_set_rate().
> > 
> > I don't understand why a clk provider is needed there. Is anyone looking
> > into the real problem?
> 
> I mentioned this to Mani yesterday, but I'm not sure if he has had time
> to look into it yet. And I forgot to CC Viresh who was involved in
> implementing this. There is a comment of his in the thread where this
> feature was added:
> 
> 	Most likely no one will ever do clk_set_rate() on this new
> 	clock, which is fine, though OPP core will likely do
> 	clk_get_rate() here.
> 
> which may suggest that some underlying assumption has changed. [1]
> 

I just looked into the issue this morning. The commit that triggered the errors
seems to be doing the right thing (although the commit message was a bit hard to
understand), but the problem is this check which gets triggered now:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/clk.c?h=v6.13-rc1#n2319

Since the qcom-cpufreq* clocks don't have parents now (they should've been
defined anyway) and there is no CLK_SET_RATE_PARENT flag set, the check returns
NULL for the 'top' clock. Then clk_core_set_rate_nolock() returns -EINVAL,
causing the reported error.

But I don't quite understand why clk_core_set_rate_nolock() fails if there is no
parent or CLK_SET_RATE_PARENT is not set. The API is supposed to set the rate of
the passed clock irrespective of the parent. Propagating the rate change to
parent is not strictly needed and doesn't make sense if the parent is a fixed
clock like XO.

Stephen, thoughts?

- Mani

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH] Revert "clk: Fix invalid execution of clk_set_rate"
  2024-12-03  9:21     ` Manivannan Sadhasivam
@ 2024-12-03 19:30       ` Stephen Boyd
  2024-12-05 16:52         ` Manivannan Sadhasivam
  0 siblings, 1 reply; 4+ messages in thread
From: Stephen Boyd @ 2024-12-03 19:30 UTC
  To: Johan Hovold, Manivannan Sadhasivam
  Cc: Viresh Kumar, Johan Hovold, Michael Turquette, linux-clk,
	linux-kernel, regressions, Aishwarya TCV, Chuan Liu, Sudeep Holla,
	linux-pm

Quoting Manivannan Sadhasivam (2024-12-03 01:21:51)
> On Tue, Dec 03, 2024 at 09:25:01AM +0100, Johan Hovold wrote:
> > [ +CC: Viresh and Sudeep ]
> > 
> > On Mon, Dec 02, 2024 at 05:20:06PM -0800, Stephen Boyd wrote:
> > > Quoting Johan Hovold (2024-12-02 02:06:21)
> > > > This reverts commit 25f1c96a0e841013647d788d4598e364e5c2ebb7.
> > > > 
> > > > The offending commit results in errors like
> > > > 
> > > >         cpu cpu0: _opp_config_clk_single: failed to set clock rate: -22
> > > > 
> > > > spamming the logs on the Lenovo ThinkPad X13s and other Qualcomm
> > > > machines when cpufreq tries to update the CPUFreq HW Engine clocks.
> > > > 
> > > > As mentioned in commit 4370232c727b ("cpufreq: qcom-hw: Add CPU clock
> > > > provider support"):
> > > > 
> > > >         [T]he frequency supplied by the driver is the actual frequency
> > > >         that comes out of the EPSS/OSM block after the DCVS operation.
> > > >         This frequency is not same as what the CPUFreq framework has set
> > > >         but it is the one that gets supplied to the CPUs after
> > > >         throttling by LMh.
> > > > 
> > > > which seems to suggest that the driver relies on the previous behaviour
> > > > of clk_set_rate().
> > > 
> > > I don't understand why a clk provider is needed there. Is anyone looking
> > > into the real problem?
> > 
> > I mentioned this to Mani yesterday, but I'm not sure if he has had time
> > to look into it yet. And I forgot to CC Viresh who was involved in
> > implementing this. There is a comment of his in the thread where this
> > feature was added:
> > 
> >       Most likely no one will ever do clk_set_rate() on this new
> >       clock, which is fine, though OPP core will likely do
> >       clk_get_rate() here.
> > 
> > which may suggest that some underlying assumption has changed. [1]
> > 

Yikes.

> 
> I just looked into the issue this morning. The commit that triggered the errors
> seems to be doing the right thing (although the commit message was a bit hard to
> understand), but the problem is this check which gets triggered now:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/clk.c?h=v6.13-rc1#n2319
> 
> Since the qcom-cpufreq* clocks don't have parents now (they should've been
> defined anyway) and there is no CLK_SET_RATE_PARENT flag set, the check returns
> NULL for the 'top' clock. Then clk_core_set_rate_nolock() returns -EINVAL,
> causing the reported error.
> 
> But I don't quite understand why clk_core_set_rate_nolock() fails if there is no
> parent or CLK_SET_RATE_PARENT is not set. The API is supposed to set the rate of
> the passed clock irrespective of the parent. Propagating the rate change to
> parent is not strictly needed and doesn't make sense if the parent is a fixed
> clock like XO.

The recalc_rate clk_op is telling the framework that the clk is at a
different rate than is requested by the clk consumer _and_ than what the
framework thinks the clk is currently running at. The clk_set_rate()
call is going to attempt to satisfy that request, and because there
isn't a determine_rate/round_rate clk_op it assumes the clk can't change
rate so it looks to see if there's a parent that can be changed to
satisfy the rate. There isn't a parent either, so the clk_set_rate()
call fails because the rate can't be achieved on this clk.
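
The failure path described above can be sketched as a small userspace
model. This is not the code in drivers/clk/clk.c: every model_* name, the
struct layout and the kHz values are assumptions made for illustration,
and only the order of the decisions follows the explanation in this
thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stands in for CLK_SET_RATE_PARENT; the value is arbitrary in this model. */
#define MODEL_SET_RATE_PARENT 0x1u

/* Hypothetical, heavily simplified stand-in for a clk and its ops. */
struct model_clk {
	unsigned long cached_rate;             /* what the framework believes */
	unsigned long (*recalc_rate)(void);    /* reads back the hardware rate */
	long (*round_rate)(unsigned long req); /* NULL: clk cannot pick a rate */
	struct model_clk *parent;
	unsigned int flags;
};

/* Pretend hardware readback (kHz): the rate left after throttling. */
static unsigned long model_throttled_rate(void)
{
	return 1800000UL;
}

/* Return the topmost clk that would have to change, or NULL if none can. */
static struct model_clk *model_calc_new_rates(struct model_clk *clk,
					      unsigned long req)
{
	assert(clk != NULL);

	if (clk->round_rate)
		return clk; /* the clk can choose a rate itself */

	if (!clk->parent || !(clk->flags & MODEL_SET_RATE_PARENT))
		return NULL; /* pass-through clk without an adjustable parent */

	return model_calc_new_rates(clk->parent, req);
}

static int model_set_rate(struct model_clk *clk, unsigned long req)
{
	unsigned long hw_rate = clk->recalc_rate ? clk->recalc_rate()
						 : clk->cached_rate;

	if (req == hw_rate)
		return 0; /* nothing to do */

	if (!model_calc_new_rates(clk, req))
		return -EINVAL; /* no clk in the chain can satisfy the request */

	clk->cached_rate = req;
	return 0;
}
```

In this model a parentless clk with only recalc_rate rejects any request
that differs from the rate read back from the hardware. EINVAL is 22 on
Linux, which matches the "-22" in the reported log line.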

It may work to have a determine_rate clk_op that is like the recalc_rate
one that says "this rate you requested is going to turn into whatever
the hardware is running at" by simply returning the rate that the clk is
running at.
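
A minimal sketch of that idea, again as a userspace model rather than
driver code. The sketch_* names and both structs are hypothetical
stand-ins (they are not the kernel's struct clk_rate_request or clk_ops),
and the real callback would read the rate from the hardware instead of
from a struct field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request: in = requested rate, out = rate the clk will run at. */
struct sketch_rate_request {
	unsigned long rate;
};

/* Hypothetical CPU clk whose rate is owned by the hardware. */
struct sketch_cpu_clk {
	unsigned long hw_rate; /* rate reported after throttling (kHz) */
};

/* Models a recalc_rate op: report whatever the hardware is running at. */
static unsigned long sketch_recalc_rate(const struct sketch_cpu_clk *clk)
{
	assert(clk != NULL);
	return clk->hw_rate;
}

/*
 * Models the suggested determine_rate op: whatever rate is requested, the
 * answer is the rate the hardware is currently running at.
 */
static int sketch_determine_rate(const struct sketch_cpu_clk *clk,
				 struct sketch_rate_request *req)
{
	assert(req != NULL);
	req->rate = sketch_recalc_rate(clk);
	return 0;
}

/* Models a set_rate call that succeeds once the rounded rate is current. */
static int sketch_set_rate(const struct sketch_cpu_clk *clk,
			   unsigned long requested)
{
	struct sketch_rate_request req = { .rate = requested };

	if (sketch_determine_rate(clk, &req))
		return -1;

	/* The rounded rate always equals the current rate: nothing to change. */
	return req.rate == sketch_recalc_rate(clk) ? 0 : -1;
}
```

Because the rounded rate always equals the current rate in this model, a
set-rate request becomes a successful no-op instead of an error, which is
the behaviour the suggestion aims for.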


* Re: [PATCH] Revert "clk: Fix invalid execution of clk_set_rate"
  2024-12-03 19:30       ` Stephen Boyd
@ 2024-12-05 16:52         ` Manivannan Sadhasivam
  0 siblings, 0 replies; 4+ messages in thread
From: Manivannan Sadhasivam @ 2024-12-05 16:52 UTC
  To: Stephen Boyd
  Cc: Johan Hovold, Viresh Kumar, Johan Hovold, Michael Turquette,
	linux-clk, linux-kernel, regressions, Aishwarya TCV, Chuan Liu,
	Sudeep Holla, linux-pm

On Tue, Dec 03, 2024 at 11:30:07AM -0800, Stephen Boyd wrote:
> Quoting Manivannan Sadhasivam (2024-12-03 01:21:51)
> > On Tue, Dec 03, 2024 at 09:25:01AM +0100, Johan Hovold wrote:
> > > [ +CC: Viresh and Sudeep ]
> > > 
> > > On Mon, Dec 02, 2024 at 05:20:06PM -0800, Stephen Boyd wrote:
> > > > Quoting Johan Hovold (2024-12-02 02:06:21)
> > > > > This reverts commit 25f1c96a0e841013647d788d4598e364e5c2ebb7.
> > > > > 
> > > > > The offending commit results in errors like
> > > > > 
> > > > >         cpu cpu0: _opp_config_clk_single: failed to set clock rate: -22
> > > > > 
> > > > > spamming the logs on the Lenovo ThinkPad X13s and other Qualcomm
> > > > > machines when cpufreq tries to update the CPUFreq HW Engine clocks.
> > > > > 
> > > > > As mentioned in commit 4370232c727b ("cpufreq: qcom-hw: Add CPU clock
> > > > > provider support"):
> > > > > 
> > > > >         [T]he frequency supplied by the driver is the actual frequency
> > > > >         that comes out of the EPSS/OSM block after the DCVS operation.
> > > > >         This frequency is not same as what the CPUFreq framework has set
> > > > >         but it is the one that gets supplied to the CPUs after
> > > > >         throttling by LMh.
> > > > > 
> > > > > which seems to suggest that the driver relies on the previous behaviour
> > > > > of clk_set_rate().
> > > > 
> > > > I don't understand why a clk provider is needed there. Is anyone looking
> > > > into the real problem?
> > > 
> > > I mentioned this to Mani yesterday, but I'm not sure if he has had time
> > > to look into it yet. And I forgot to CC Viresh who was involved in
> > > implementing this. There is a comment of his in the thread where this
> > > feature was added:
> > > 
> > >       Most likely no one will ever do clk_set_rate() on this new
> > >       clock, which is fine, though OPP core will likely do
> > >       clk_get_rate() here.
> > > 
> > > which may suggest that some underlying assumption has changed. [1]
> > > 
> 
> Yikes.
> 
> > 
> > I just looked into the issue this morning. The commit that triggered the errors
> > seems to be doing the right thing (although the commit message was a bit hard to
> > understand), but the problem is this check which gets triggered now:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/clk.c?h=v6.13-rc1#n2319
> > 
> > Since the qcom-cpufreq* clocks don't have parents now (they should've been
> > defined anyway) and there is no CLK_SET_RATE_PARENT flag set, the check returns
> > NULL for the 'top' clock. Then clk_core_set_rate_nolock() returns -EINVAL,
> > causing the reported error.
> > 
> > But I don't quite understand why clk_core_set_rate_nolock() fails if there is no
> > parent or CLK_SET_RATE_PARENT is not set. The API is supposed to set the rate of
> > the passed clock irrespective of the parent. Propagating the rate change to
> > parent is not strictly needed and doesn't make sense if the parent is a fixed
> > clock like XO.
> 
> The recalc_rate clk_op is telling the framework that the clk is at a
> different rate than is requested by the clk consumer _and_ than what the
> framework thinks the clk is currently running at. The clk_set_rate()
> call is going to attempt to satisfy that request, and because there
> isn't a determine_rate/round_rate clk_op it assumes the clk can't change
> rate so it looks to see if there's a parent that can be changed to
> satisfy the rate. There isn't a parent either, so the clk_set_rate()
> call fails because the rate can't be achieved on this clk.
> 
> It may work to have a determine_rate clk_op that is like the recalc_rate
> one that says "this rate you requested is going to turn into whatever
> the hardware is running at" by simply returning the rate that the clk is
> running at.

Sounds reasonable to me. Fix submitted incorporating your suggestion, thanks!

- Mani

-- 
மணிவண்ணன் சதாசிவம்


end of thread, other threads:[~2024-12-05 16:53 UTC | newest]

Thread overview: 4+ messages
     [not found] <20241202100621.29209-1-johan+linaro@kernel.org>
     [not found] ` <3fd004add188460bf2bdd1a718387c7f.sboyd@kernel.org>
2024-12-03  8:25   ` [PATCH] Revert "clk: Fix invalid execution of clk_set_rate" Johan Hovold
2024-12-03  9:21     ` Manivannan Sadhasivam
2024-12-03 19:30       ` Stephen Boyd
2024-12-05 16:52         ` Manivannan Sadhasivam
