Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support

public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
       [not found]           ` <e90a0c77-61a0-49db-86ba-bac253f8ec53@samsung.com>
@ 2025-03-25 22:40             ` Stephen Boyd
  2025-03-26 11:24               ` Ulf Hansson
  0 siblings, 1 reply; 5+ messages in thread
From: Stephen Boyd @ 2025-03-25 22:40 UTC (permalink / raw)
  To: Michal Wilczynski, Philipp Zabel, alex, aou, conor+dt, drew,
	guoren, jszhang, krzk+dt, m.szyprowski, mturquette, palmer,
	paul.walmsley, robh, wefu, Ulf Hansson
  Cc: linux-clk, devicetree, linux-kernel, linux-riscv, linux-pm

Quoting Michal Wilczynski (2025-03-19 02:22:11)
> 
> 
> On 3/13/25 10:25, Philipp Zabel wrote:
> > On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
> >>
> >> On 3/6/25 00:47, Stephen Boyd wrote:
> >>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
> >>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
> >>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
> >>>> configurable. In order for these clock gates to work properly, the
> >>>> CLKGEN reset must be managed in a specific sequence.
> >>>>
> >>>> Move the CLKGEN reset handling to the clock driver since it's
> >>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
> >>>> GPU clocks stay physically enabled without external interference from
> >>>> the reset driver.  The reset is now deasserted only when both core and
> >>>> cfg clocks are enabled, and asserted when either of them is disabled.
> >>>>
> >>>> The mem clock is configured to use nop operations since it cannot be
> >>>> controlled.
> >>>>
> >>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
> >>>>
> >>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
> >>> [...]
> >>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
> >>>> index ea96d007aecd..1dfcde867233 100644
> >>>> --- a/drivers/clk/thead/clk-th1520-ap.c
> >>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
> >>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
> >>> [...]
> >>>>  
> >>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
> >>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
> >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
> >>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
> >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
> >>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
> >>>> +
> >>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
> >>>> +{
> >>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
> >>>> +       unsigned long flags;
> >>>> +
> >>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
> >>>> +
> >>>> +       ccu_disable_helper(&cg->common, cg->enable);
> >>>> +
> >>>> +       if ((cg == &gpu_core_clk &&
> >>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
> >>>> +           (cg == &gpu_cfg_aclk &&
> >>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
> >>>> +               reset_control_assert(gpu_reset);
> >>>
> >>> Why can't the clk consumer control the reset itself? Doing this here is
> >>> not ideal because we hold the clk lock when we try to grab the reset
> >>> lock. These are all spinlocks that should be small in lines of code
> >>> where the lock is held, but we're calling into an entire other framework
> >>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
> >>> lock it will deadlock.
> >>
> >> So in our case the clk consumer is the drm/imagination driver. Here is
> >> the comment from the maintainer for my previous attempt to use a reset
> >> driver to abstract the GPU init sequence [1]:
> >>
> >> "Do you know what this resets? From our side, the GPU only has a single
> >> reset line (which I assume to be GPU_RESET)."
> >>
> >> "I don't love that this procedure appears in the platform reset driver.
> >> I appreciate it may not be clear from the SoC TRM, but this is the
> >> standard reset procedure for all IMG Rogue GPUs. The currently
> >> supported TI SoC handles this in silicon, when power up/down requests
> >> are sent so we never needed to encode it in the driver before.
> >>
> >> Strictly speaking, the 32 cycle delay is required between power and
> >> clocks being enabled and the reset line being deasserted. If nothing
> >> here touches power or clocks (which I don't think it should), the delay
> >> could potentially be lifted to the GPU driver." 
> >>
> >> From the drm/imagination maintainers point of view their hardware has
> >> only one reset, the extra CLKGEN reset is SoC specific.
> > 
> > If I am understanding correctly, the CLKGEN reset doesn't reset
> > anything in the GPU itself, but holds the GPU clock generator block in
> > reset, effectively disabling the three GPU clocks as a workaround for
> > the always-ungated GPU_MEM clock.
> > 
> >> Also the reset driver maintainer didn't like my way of abstracting two
> >> resets ("GPU" and and SoC specific"CLKGEN") into one reset
> > 
> > That is one part of it. The other is that (according to my
> > understanding as laid out above), the combined GPU+CLKGEN reset would
> > effectively disable all three GPU clocks for a while, after the GPU
> > driver has already requested them to be enabled.
> 
> Thank you for your comments Philipp, it seems like we're on the same
> page here. I was wondering whether there is anything I can do to move the
> patches forward.
> 
> Stephen, if the current patch is a no go from your perspective could you
> please advise whether there is a way to solve this in a clock that would
> be acceptable to you.

It looks like the SoC glue makes the interactions between the clk and
reset frameworks complicated because GPU clks don't work if a reset is
asserted. You're trying to find a place to coordinate the clk and reset.
Am I right?

I'd advise managing the clks and resets in a generic power domain that
is attached to the GPU device. In that power domain, coordinate the clk
and reset sequencing so that the reset is deasserted before the clks are
enabled (or whatever the actual requirement is). If the GPU driver
_must_ have a clk and reset pointer to use, implement one that either
does nothing or flag to the GPU driver that the power domain is managing
all this for it so it should just use runtime PM and system PM hooks to
turn on the clks and take the GPU out of reset.

From what I can tell, the GPU driver maintainer doesn't want to think
about the wrapper that likely got placed around the hardware block
shipped by IMG. This wrapper is the SoC glue that needs to go into a
generic power domain so that the different PM resources, reset, clk,
etc. can be coordinated based on the GPU device's power state. It's
either that, or go the dwc3 route and have SoC glue platform drivers
that manage this stuff and create a child device to represent the hard
macro shipped by the vendor like Synopsys/Imagination. Doing the parent
device design isn't as flexible as PM domains because you can only have
one parent device and the child device state can be ignored vs. many PM
domains attached in a graph to a device that are more directly
influenced by the device using runtime PM.

Maybe you'll be heartened to know this problem isn't unique and has
existed for decades :) I don't know what state the graphics driver is in
but they'll likely be interested in solving this problem in a way that
doesn't "pollute" their driver with SoC specific details. It's all a
question of where you put the code. The reset framework wants to focus
on resets, the clk framework wants to focus on clks, and the graphics
driver wants to focus on graphics. BTW, we went through a similar
discussion with regulators and clks years ago and ended up handling that
with OPPs and power domains.

I believe a PM domain is the right place for this kind of stuff, and I
actually presented on this topic at OSSEU[1], but I don't maintain that
code. Ulf does.

[1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-25 22:40             ` [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support Stephen Boyd
@ 2025-03-26 11:24               ` Ulf Hansson
  2025-04-01 14:38                 ` Michal Wilczynski
  0 siblings, 1 reply; 5+ messages in thread
From: Ulf Hansson @ 2025-03-26 11:24 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Michal Wilczynski, Philipp Zabel, alex, aou, conor+dt, drew,
	guoren, jszhang, krzk+dt, m.szyprowski, mturquette, palmer,
	paul.walmsley, robh, wefu, linux-clk, devicetree, linux-kernel,
	linux-riscv, linux-pm

On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@kernel.org> wrote:
>
> Quoting Michal Wilczynski (2025-03-19 02:22:11)
> >
> >
> > On 3/13/25 10:25, Philipp Zabel wrote:
> > > On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
> > >>
> > >> On 3/6/25 00:47, Stephen Boyd wrote:
> > >>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
> > >>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
> > >>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
> > >>>> configurable. In order for these clock gates to work properly, the
> > >>>> CLKGEN reset must be managed in a specific sequence.
> > >>>>
> > >>>> Move the CLKGEN reset handling to the clock driver since it's
> > >>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
> > >>>> GPU clocks stay physically enabled without external interference from
> > >>>> the reset driver.  The reset is now deasserted only when both core and
> > >>>> cfg clocks are enabled, and asserted when either of them is disabled.
> > >>>>
> > >>>> The mem clock is configured to use nop operations since it cannot be
> > >>>> controlled.
> > >>>>
> > >>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
> > >>>>
> > >>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
> > >>> [...]
> > >>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
> > >>>> index ea96d007aecd..1dfcde867233 100644
> > >>>> --- a/drivers/clk/thead/clk-th1520-ap.c
> > >>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
> > >>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
> > >>> [...]
> > >>>>
> > >>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
> > >>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
> > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
> > >>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
> > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
> > >>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
> > >>>> +
> > >>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
> > >>>> +{
> > >>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
> > >>>> +       unsigned long flags;
> > >>>> +
> > >>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
> > >>>> +
> > >>>> +       ccu_disable_helper(&cg->common, cg->enable);
> > >>>> +
> > >>>> +       if ((cg == &gpu_core_clk &&
> > >>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
> > >>>> +           (cg == &gpu_cfg_aclk &&
> > >>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
> > >>>> +               reset_control_assert(gpu_reset);
> > >>>
> > >>> Why can't the clk consumer control the reset itself? Doing this here is
> > >>> not ideal because we hold the clk lock when we try to grab the reset
> > >>> lock. These are all spinlocks that should be small in lines of code
> > >>> where the lock is held, but we're calling into an entire other framework
> > >>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
> > >>> lock it will deadlock.
> > >>
> > >> So in our case the clk consumer is the drm/imagination driver. Here is
> > >> the comment from the maintainer for my previous attempt to use a reset
> > >> driver to abstract the GPU init sequence [1]:
> > >>
> > >> "Do you know what this resets? From our side, the GPU only has a single
> > >> reset line (which I assume to be GPU_RESET)."
> > >>
> > >> "I don't love that this procedure appears in the platform reset driver.
> > >> I appreciate it may not be clear from the SoC TRM, but this is the
> > >> standard reset procedure for all IMG Rogue GPUs. The currently
> > >> supported TI SoC handles this in silicon, when power up/down requests
> > >> are sent so we never needed to encode it in the driver before.
> > >>
> > >> Strictly speaking, the 32 cycle delay is required between power and
> > >> clocks being enabled and the reset line being deasserted. If nothing
> > >> here touches power or clocks (which I don't think it should), the delay
> > >> could potentially be lifted to the GPU driver."
> > >>
> > >> From the drm/imagination maintainers point of view their hardware has
> > >> only one reset, the extra CLKGEN reset is SoC specific.
> > >
> > > If I am understanding correctly, the CLKGEN reset doesn't reset
> > > anything in the GPU itself, but holds the GPU clock generator block in
> > > reset, effectively disabling the three GPU clocks as a workaround for
> > > the always-ungated GPU_MEM clock.
> > >
> > >> Also the reset driver maintainer didn't like my way of abstracting two
> > >> resets ("GPU" and and SoC specific"CLKGEN") into one reset
> > >
> > > That is one part of it. The other is that (according to my
> > > understanding as laid out above), the combined GPU+CLKGEN reset would
> > > effectively disable all three GPU clocks for a while, after the GPU
> > > driver has already requested them to be enabled.
> >
> > Thank you for your comments Philipp, it seems like we're on the same
> > page here. I was wondering whether there is anything I can do to move the
> > patches forward.
> >
> > Stephen, if the current patch is a no go from your perspective could you
> > please advise whether there is a way to solve this in a clock that would
> > be acceptable to you.
>
> It looks like the SoC glue makes the interactions between the clk and
> reset frameworks complicated because GPU clks don't work if a reset is
> asserted. You're trying to find a place to coordinate the clk and reset.
> Am I right?
>
> I'd advise managing the clks and resets in a generic power domain that
> is attached to the GPU device. In that power domain, coordinate the clk
> and reset sequencing so that the reset is deasserted before the clks are
> enabled (or whatever the actual requirement is). If the GPU driver
> _must_ have a clk and reset pointer to use, implement one that either
> does nothing or flag to the GPU driver that the power domain is managing
> all this for it so it should just use runtime PM and system PM hooks to
> turn on the clks and take the GPU out of reset.
>
> From what I can tell, the GPU driver maintainer doesn't want to think
> about the wrapper that likely got placed around the hardware block
> shipped by IMG. This wrapper is the SoC glue that needs to go into a
> generic power domain so that the different PM resources, reset, clk,
> etc. can be coordinated based on the GPU device's power state. It's
> either that, or go the dwc3 route and have SoC glue platform drivers
> that manage this stuff and create a child device to represent the hard
> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
> device design isn't as flexible as PM domains because you can only have
> one parent device and the child device state can be ignored vs. many PM
> domains attached in a graph to a device that are more directly
> influenced by the device using runtime PM.
>
> Maybe you'll be heartened to know this problem isn't unique and has
> existed for decades :) I don't know what state the graphics driver is in
> but they'll likely be interested in solving this problem in a way that
> doesn't "pollute" their driver with SoC specific details. It's all a
> question of where you put the code. The reset framework wants to focus
> on resets, the clk framework wants to focus on clks, and the graphics
> driver wants to focus on graphics. BTW, we went through a similar
> discussion with regulators and clks years ago and ended up handling that
> with OPPs and power domains.

Right, power-domain providers are mostly implementing SoC specific code.

In some cases, power-domain providers also handle per device SoC
specific constraints/sequences, which seems what you are discussing
here. For that, genpd has a couple of callbacks that could be
interesting to have a look at, such as:

genpd->attach|detach_dev() - for probe/remove
genpd.dev_ops->start|stop() - for runtime/system PM

That said, maybe just using the regular genpd->power_on|off() callback
is sufficient here, depending on how you decide to model things.

>
> I believe a PM domain is the right place for this kind of stuff, and I
> actually presented on this topic at OSSEU[1], but I don't maintain that
> code. Ulf does.
>
> [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-03-26 11:24               ` Ulf Hansson
@ 2025-04-01 14:38                 ` Michal Wilczynski
  2025-04-07 17:02                   ` Michal Wilczynski
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Wilczynski @ 2025-04-01 14:38 UTC (permalink / raw)
  To: Ulf Hansson, Stephen Boyd
  Cc: Philipp Zabel, alex, aou, conor+dt, drew, guoren, jszhang,
	krzk+dt, m.szyprowski, mturquette, palmer, paul.walmsley, robh,
	wefu, linux-clk, devicetree, linux-kernel, linux-riscv, linux-pm



On 3/26/25 12:24, Ulf Hansson wrote:
> On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@kernel.org> wrote:
>>
>> Quoting Michal Wilczynski (2025-03-19 02:22:11)
>>>
>>>
>>> On 3/13/25 10:25, Philipp Zabel wrote:
>>>> On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
>>>>>
>>>>> On 3/6/25 00:47, Stephen Boyd wrote:
>>>>>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
>>>>>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
>>>>>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
>>>>>>> configurable. In order for these clock gates to work properly, the
>>>>>>> CLKGEN reset must be managed in a specific sequence.
>>>>>>>
>>>>>>> Move the CLKGEN reset handling to the clock driver since it's
>>>>>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
>>>>>>> GPU clocks stay physically enabled without external interference from
>>>>>>> the reset driver.  The reset is now deasserted only when both core and
>>>>>>> cfg clocks are enabled, and asserted when either of them is disabled.
>>>>>>>
>>>>>>> The mem clock is configured to use nop operations since it cannot be
>>>>>>> controlled.
>>>>>>>
>>>>>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
>>>>>>>
>>>>>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
>>>>>> [...]
>>>>>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>> index ea96d007aecd..1dfcde867233 100644
>>>>>>> --- a/drivers/clk/thead/clk-th1520-ap.c
>>>>>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
>>>>>> [...]
>>>>>>>
>>>>>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
>>>>>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
>>>>>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
>>>>>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
>>>>>>> +
>>>>>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
>>>>>>> +{
>>>>>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
>>>>>>> +       unsigned long flags;
>>>>>>> +
>>>>>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
>>>>>>> +
>>>>>>> +       ccu_disable_helper(&cg->common, cg->enable);
>>>>>>> +
>>>>>>> +       if ((cg == &gpu_core_clk &&
>>>>>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
>>>>>>> +           (cg == &gpu_cfg_aclk &&
>>>>>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
>>>>>>> +               reset_control_assert(gpu_reset);
>>>>>>
>>>>>> Why can't the clk consumer control the reset itself? Doing this here is
>>>>>> not ideal because we hold the clk lock when we try to grab the reset
>>>>>> lock. These are all spinlocks that should be small in lines of code
>>>>>> where the lock is held, but we're calling into an entire other framework
>>>>>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
>>>>>> lock it will deadlock.
>>>>>
>>>>> So in our case the clk consumer is the drm/imagination driver. Here is
>>>>> the comment from the maintainer for my previous attempt to use a reset
>>>>> driver to abstract the GPU init sequence [1]:
>>>>>
>>>>> "Do you know what this resets? From our side, the GPU only has a single
>>>>> reset line (which I assume to be GPU_RESET)."
>>>>>
>>>>> "I don't love that this procedure appears in the platform reset driver.
>>>>> I appreciate it may not be clear from the SoC TRM, but this is the
>>>>> standard reset procedure for all IMG Rogue GPUs. The currently
>>>>> supported TI SoC handles this in silicon, when power up/down requests
>>>>> are sent so we never needed to encode it in the driver before.
>>>>>
>>>>> Strictly speaking, the 32 cycle delay is required between power and
>>>>> clocks being enabled and the reset line being deasserted. If nothing
>>>>> here touches power or clocks (which I don't think it should), the delay
>>>>> could potentially be lifted to the GPU driver."
>>>>>
>>>>> From the drm/imagination maintainers point of view their hardware has
>>>>> only one reset, the extra CLKGEN reset is SoC specific.
>>>>
>>>> If I am understanding correctly, the CLKGEN reset doesn't reset
>>>> anything in the GPU itself, but holds the GPU clock generator block in
>>>> reset, effectively disabling the three GPU clocks as a workaround for
>>>> the always-ungated GPU_MEM clock.
>>>>
>>>>> Also the reset driver maintainer didn't like my way of abstracting two
>>>>> resets ("GPU" and and SoC specific"CLKGEN") into one reset
>>>>
>>>> That is one part of it. The other is that (according to my
>>>> understanding as laid out above), the combined GPU+CLKGEN reset would
>>>> effectively disable all three GPU clocks for a while, after the GPU
>>>> driver has already requested them to be enabled.
>>>
>>> Thank you for your comments Philipp, it seems like we're on the same
>>> page here. I was wondering whether there is anything I can do to move the
>>> patches forward.
>>>
>>> Stephen, if the current patch is a no go from your perspective could you
>>> please advise whether there is a way to solve this in a clock that would
>>> be acceptable to you.
>>
>> It looks like the SoC glue makes the interactions between the clk and
>> reset frameworks complicated because GPU clks don't work if a reset is
>> asserted. You're trying to find a place to coordinate the clk and reset.
>> Am I right?
>>
>> I'd advise managing the clks and resets in a generic power domain that
>> is attached to the GPU device. In that power domain, coordinate the clk
>> and reset sequencing so that the reset is deasserted before the clks are
>> enabled (or whatever the actual requirement is). If the GPU driver
>> _must_ have a clk and reset pointer to use, implement one that either
>> does nothing or flag to the GPU driver that the power domain is managing
>> all this for it so it should just use runtime PM and system PM hooks to
>> turn on the clks and take the GPU out of reset.
>>
>> From what I can tell, the GPU driver maintainer doesn't want to think
>> about the wrapper that likely got placed around the hardware block
>> shipped by IMG. This wrapper is the SoC glue that needs to go into a
>> generic power domain so that the different PM resources, reset, clk,
>> etc. can be coordinated based on the GPU device's power state. It's
>> either that, or go the dwc3 route and have SoC glue platform drivers
>> that manage this stuff and create a child device to represent the hard
>> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
>> device design isn't as flexible as PM domains because you can only have
>> one parent device and the child device state can be ignored vs. many PM
>> domains attached in a graph to a device that are more directly
>> influenced by the device using runtime PM.
>>
>> Maybe you'll be heartened to know this problem isn't unique and has
>> existed for decades :) I don't know what state the graphics driver is in
>> but they'll likely be interested in solving this problem in a way that
>> doesn't "pollute" their driver with SoC specific details. It's all a
>> question of where you put the code. The reset framework wants to focus
>> on resets, the clk framework wants to focus on clks, and the graphics
>> driver wants to focus on graphics. BTW, we went through a similar
>> discussion with regulators and clks years ago and ended up handling that
>> with OPPs and power domains.
> 
> Right, power-domain providers are mostly implementing SoC specific code.
> 
> In some cases, power-domain providers also handle per device SoC
> specific constraints/sequences, which seems what you are discussing
> here. For that, genpd has a couple of callbacks that could be
> interesting to have a look at, such as:
> 
> genpd->attach|detach_dev() - for probe/remove
> genpd.dev_ops->start|stop() - for runtime/system PM
> 
> That said, maybe just using the regular genpd->power_on|off() callback
> is sufficient here, depending on how you decide to model things.


Thanks Stephen, Ulf !

So the way forward I see:

1) The reset driver can be merged as-is, if Philipp is fine with this
code [2].
2) I will cook up the update to the thead power-domain driver which will
handle reset and clock management.
3) I think it would be best to convince the drm/imagination maintainers to
make the clock management in their consumer driver optional. This way if
there is a SoC specific sequence the clocks/resets will be managed from
generic PM driver which is SoC specific. Will talk to them.
4) Will remove the reset management from this series, and re-send.

[2] - https://lore.kernel.org/all/4205b786-fb65-468c-a3d8-bce807dd829a@samsung.com/
> 
>>
>> I believe a PM domain is the right place for this kind of stuff, and I
>> actually presented on this topic at OSSEU[1], but I don't maintain that
>> code. Ulf does.
>>
>> [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google

Thanks ! Watched the presentation, very interesting. Hopefully I'll be
able to attend in person in Amsterdam this year if you're presenting :-)

> 
> Kind regards
> Uffe
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-04-01 14:38                 ` Michal Wilczynski
@ 2025-04-07 17:02                   ` Michal Wilczynski
  2025-04-08 12:27                     ` Ulf Hansson
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Wilczynski @ 2025-04-07 17:02 UTC (permalink / raw)
  To: Ulf Hansson, Stephen Boyd
  Cc: Philipp Zabel, alex, aou, conor+dt, drew, guoren, jszhang,
	krzk+dt, m.szyprowski, mturquette, palmer, paul.walmsley, robh,
	wefu, linux-clk, devicetree, linux-kernel, linux-riscv, linux-pm



On 4/1/25 16:38, Michal Wilczynski wrote:
> 
> 
> On 3/26/25 12:24, Ulf Hansson wrote:
>> On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@kernel.org> wrote:
>>>
>>> Quoting Michal Wilczynski (2025-03-19 02:22:11)
>>>>
>>>>
>>>> On 3/13/25 10:25, Philipp Zabel wrote:
>>>>> On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
>>>>>>
>>>>>> On 3/6/25 00:47, Stephen Boyd wrote:
>>>>>>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
>>>>>>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
>>>>>>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
>>>>>>>> configurable. In order for these clock gates to work properly, the
>>>>>>>> CLKGEN reset must be managed in a specific sequence.
>>>>>>>>
>>>>>>>> Move the CLKGEN reset handling to the clock driver since it's
>>>>>>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
>>>>>>>> GPU clocks stay physically enabled without external interference from
>>>>>>>> the reset driver.  The reset is now deasserted only when both core and
>>>>>>>> cfg clocks are enabled, and asserted when either of them is disabled.
>>>>>>>>
>>>>>>>> The mem clock is configured to use nop operations since it cannot be
>>>>>>>> controlled.
>>>>>>>>
>>>>>>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
>>>>>>>>
>>>>>>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
>>>>>>> [...]
>>>>>>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>>> index ea96d007aecd..1dfcde867233 100644
>>>>>>>> --- a/drivers/clk/thead/clk-th1520-ap.c
>>>>>>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
>>>>>>> [...]
>>>>>>>>
>>>>>>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
>>>>>>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
>>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
>>>>>>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
>>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
>>>>>>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
>>>>>>>> +
>>>>>>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
>>>>>>>> +{
>>>>>>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
>>>>>>>> +       unsigned long flags;
>>>>>>>> +
>>>>>>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
>>>>>>>> +
>>>>>>>> +       ccu_disable_helper(&cg->common, cg->enable);
>>>>>>>> +
>>>>>>>> +       if ((cg == &gpu_core_clk &&
>>>>>>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
>>>>>>>> +           (cg == &gpu_cfg_aclk &&
>>>>>>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
>>>>>>>> +               reset_control_assert(gpu_reset);
>>>>>>>
>>>>>>> Why can't the clk consumer control the reset itself? Doing this here is
>>>>>>> not ideal because we hold the clk lock when we try to grab the reset
>>>>>>> lock. These are all spinlocks that should be small in lines of code
>>>>>>> where the lock is held, but we're calling into an entire other framework
>>>>>>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
>>>>>>> lock it will deadlock.
>>>>>>
>>>>>> So in our case the clk consumer is the drm/imagination driver. Here is
>>>>>> the comment from the maintainer for my previous attempt to use a reset
>>>>>> driver to abstract the GPU init sequence [1]:
>>>>>>
>>>>>> "Do you know what this resets? From our side, the GPU only has a single
>>>>>> reset line (which I assume to be GPU_RESET)."
>>>>>>
>>>>>> "I don't love that this procedure appears in the platform reset driver.
>>>>>> I appreciate it may not be clear from the SoC TRM, but this is the
>>>>>> standard reset procedure for all IMG Rogue GPUs. The currently
>>>>>> supported TI SoC handles this in silicon, when power up/down requests
>>>>>> are sent so we never needed to encode it in the driver before.
>>>>>>
>>>>>> Strictly speaking, the 32 cycle delay is required between power and
>>>>>> clocks being enabled and the reset line being deasserted. If nothing
>>>>>> here touches power or clocks (which I don't think it should), the delay
>>>>>> could potentially be lifted to the GPU driver."
>>>>>>
>>>>>> From the drm/imagination maintainers point of view their hardware has
>>>>>> only one reset, the extra CLKGEN reset is SoC specific.
>>>>>
>>>>> If I am understanding correctly, the CLKGEN reset doesn't reset
>>>>> anything in the GPU itself, but holds the GPU clock generator block in
>>>>> reset, effectively disabling the three GPU clocks as a workaround for
>>>>> the always-ungated GPU_MEM clock.
>>>>>
>>>>>> Also the reset driver maintainer didn't like my way of abstracting two
>>>>>> resets ("GPU" and and SoC specific"CLKGEN") into one reset
>>>>>
>>>>> That is one part of it. The other is that (according to my
>>>>> understanding as laid out above), the combined GPU+CLKGEN reset would
>>>>> effectively disable all three GPU clocks for a while, after the GPU
>>>>> driver has already requested them to be enabled.
>>>>
>>>> Thank you for your comments Philipp, it seems like we're on the same
>>>> page here. I was wondering whether there is anything I can do to move the
>>>> patches forward.
>>>>
>>>> Stephen, if the current patch is a no go from your perspective could you
>>>> please advise whether there is a way to solve this in a clock that would
>>>> be acceptable to you.
>>>
>>> It looks like the SoC glue makes the interactions between the clk and
>>> reset frameworks complicated because GPU clks don't work if a reset is
>>> asserted. You're trying to find a place to coordinate the clk and reset.
>>> Am I right?
>>>
>>> I'd advise managing the clks and resets in a generic power domain that
>>> is attached to the GPU device. In that power domain, coordinate the clk
>>> and reset sequencing so that the reset is deasserted before the clks are
>>> enabled (or whatever the actual requirement is). If the GPU driver
>>> _must_ have a clk and reset pointer to use, implement one that either
>>> does nothing or flag to the GPU driver that the power domain is managing
>>> all this for it so it should just use runtime PM and system PM hooks to
>>> turn on the clks and take the GPU out of reset.
>>>
>>> From what I can tell, the GPU driver maintainer doesn't want to think
>>> about the wrapper that likely got placed around the hardware block
>>> shipped by IMG. This wrapper is the SoC glue that needs to go into a
>>> generic power domain so that the different PM resources, reset, clk,
>>> etc. can be coordinated based on the GPU device's power state. It's
>>> either that, or go the dwc3 route and have SoC glue platform drivers
>>> that manage this stuff and create a child device to represent the hard
>>> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
>>> device design isn't as flexible as PM domains because you can only have
>>> one parent device and the child device state can be ignored vs. many PM
>>> domains attached in a graph to a device that are more directly
>>> influenced by the device using runtime PM.
>>>
>>> Maybe you'll be heartened to know this problem isn't unique and has
>>> existed for decades :) I don't know what state the graphics driver is in
>>> but they'll likely be interested in solving this problem in a way that
>>> doesn't "pollute" their driver with SoC specific details. It's all a
>>> question of where you put the code. The reset framework wants to focus
>>> on resets, the clk framework wants to focus on clks, and the graphics
>>> driver wants to focus on graphics. BTW, we went through a similar
>>> discussion with regulators and clks years ago and ended up handling that
>>> with OPPs and power domains.
>>
>> Right, power-domain providers are mostly implementing SoC specific code.
>>
>> In some cases, power-domain providers also handle per device SoC
>> specific constraints/sequences, which seems what you are discussing
>> here. For that, genpd has a couple of callbacks that could be
>> interesting to have a look at, such as:
>>
>> genpd->attach|detach_dev() - for probe/remove
>> genpd.dev_ops->start|stop() - for runtime/system PM
>>
>> That said, maybe just using the regular genpd->power_on|off() callback
>> is sufficient here, depending on how you decide to model things.
> 
> 
> Thanks Stephen, Ulf !
> 
> So the way forward I see:
> 
> 1) The reset driver can be merged as-is, if Philipp is fine with this
> code [2].
> 2) I will cook up the update to the thead power-domain driver which will
> handle reset and clock management.

Hi Ulf,
I'm working on the series right now and I wanted to ask you how you
prefer versioning to be handled. Would you like me to send a series as a
v1, or treat is as a continuation of this series [1] and send as a v9.
Would like to avoid any confusion. 

Thanks,
Michał

[1] - https://lore.kernel.org/all/20250311171900.1549916-1-m.wilczynski@samsung.com/

> 3) I think it would be best to convince the drm/imagination maintainers to
> make the clock management in their consumer driver optional. This way if
> there is a SoC specific sequence the clocks/resets will be managed from
> generic PM driver which is SoC specific. Will talk to them.
> 4) Will remove the reset management from this series, and re-send.
> 
> [2] - https://lore.kernel.org/all/4205b786-fb65-468c-a3d8-bce807dd829a@samsung.com/
>>
>>>
>>> I believe a PM domain is the right place for this kind of stuff, and I
>>> actually presented on this topic at OSSEU[1], but I don't maintain that
>>> code. Ulf does.
>>>
>>> [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google
> 
> Thanks ! Watched the presentation, very interesting. Hopefully I'll be
> able to attend in person in Amsterdam this year if you're presenting :-)
> 
>>
>> Kind regards
>> Uffe
>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support
  2025-04-07 17:02                   ` Michal Wilczynski
@ 2025-04-08 12:27                     ` Ulf Hansson
  0 siblings, 0 replies; 5+ messages in thread
From: Ulf Hansson @ 2025-04-08 12:27 UTC (permalink / raw)
  To: Michal Wilczynski
  Cc: Stephen Boyd, Philipp Zabel, alex, aou, conor+dt, drew, guoren,
	jszhang, krzk+dt, m.szyprowski, mturquette, palmer, paul.walmsley,
	robh, wefu, linux-clk, devicetree, linux-kernel, linux-riscv,
	linux-pm

[...]

> >>>
> >>> It looks like the SoC glue makes the interactions between the clk and
> >>> reset frameworks complicated because GPU clks don't work if a reset is
> >>> asserted. You're trying to find a place to coordinate the clk and reset.
> >>> Am I right?
> >>>
> >>> I'd advise managing the clks and resets in a generic power domain that
> >>> is attached to the GPU device. In that power domain, coordinate the clk
> >>> and reset sequencing so that the reset is deasserted before the clks are
> >>> enabled (or whatever the actual requirement is). If the GPU driver
> >>> _must_ have a clk and reset pointer to use, implement one that either
> >>> does nothing or flag to the GPU driver that the power domain is managing
> >>> all this for it so it should just use runtime PM and system PM hooks to
> >>> turn on the clks and take the GPU out of reset.
> >>>
> >>> From what I can tell, the GPU driver maintainer doesn't want to think
> >>> about the wrapper that likely got placed around the hardware block
> >>> shipped by IMG. This wrapper is the SoC glue that needs to go into a
> >>> generic power domain so that the different PM resources, reset, clk,
> >>> etc. can be coordinated based on the GPU device's power state. It's
> >>> either that, or go the dwc3 route and have SoC glue platform drivers
> >>> that manage this stuff and create a child device to represent the hard
> >>> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
> >>> device design isn't as flexible as PM domains because you can only have
> >>> one parent device and the child device state can be ignored vs. many PM
> >>> domains attached in a graph to a device that are more directly
> >>> influenced by the device using runtime PM.
> >>>
> >>> Maybe you'll be heartened to know this problem isn't unique and has
> >>> existed for decades :) I don't know what state the graphics driver is in
> >>> but they'll likely be interested in solving this problem in a way that
> >>> doesn't "pollute" their driver with SoC specific details. It's all a
> >>> question of where you put the code. The reset framework wants to focus
> >>> on resets, the clk framework wants to focus on clks, and the graphics
> >>> driver wants to focus on graphics. BTW, we went through a similar
> >>> discussion with regulators and clks years ago and ended up handling that
> >>> with OPPs and power domains.
> >>
> >> Right, power-domain providers are mostly implementing SoC specific code.
> >>
> >> In some cases, power-domain providers also handle per device SoC
> >> specific constraints/sequences, which seems what you are discussing
> >> here. For that, genpd has a couple of callbacks that could be
> >> interesting to have a look at, such as:
> >>
> >> genpd->attach|detach_dev() - for probe/remove
> >> genpd.dev_ops->start|stop() - for runtime/system PM
> >>
> >> That said, maybe just using the regular genpd->power_on|off() callback
> >> is sufficient here, depending on how you decide to model things.
> >
> >
> > Thanks Stephen, Ulf !
> >
> > So the way forward I see:
> >
> > 1) The reset driver can be merged as-is, if Philipp is fine with this
> > code [2].
> > 2) I will cook up the update to the thead power-domain driver which will
> > handle reset and clock management.
>
> Hi Ulf,
> I'm working on the series right now and I wanted to ask you how you
> prefer versioning to be handled. Would you like me to send a series as a
> v1, or treat is as a continuation of this series [1] and send as a v9.
> Would like to avoid any confusion.

I would suggest starting over with v1, but don't forget to refer to
some of the previous attempts/discussion in the cover-letter.  At
least I would be fine by this.

>
> Thanks,
> Michał
>
> [1] - https://lore.kernel.org/all/20250311171900.1549916-1-m.wilczynski@samsung.com/
>

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-04-08 12:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20250303143629.400583-1-m.wilczynski@samsung.com>
     [not found] ` <CGME20250303143637eucas1p1a3abdea520ab88688de1263a5f07bba0@eucas1p1.samsung.com>
     [not found]   ` <20250303143629.400583-5-m.wilczynski@samsung.com>
     [not found]     ` <de50dd55e1285726e8d5ebae73877486.sboyd@kernel.org>
     [not found]       ` <4c035603-4c11-4e71-8ef3-b857a81bf5ef@samsung.com>
     [not found]         ` <aacd03a071dce7b340d7170eae59d662d58f23b1.camel@pengutronix.de>
     [not found]           ` <e90a0c77-61a0-49db-86ba-bac253f8ec53@samsung.com>
2025-03-25 22:40             ` [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support Stephen Boyd
2025-03-26 11:24               ` Ulf Hansson
2025-04-01 14:38                 ` Michal Wilczynski
2025-04-07 17:02                   ` Michal Wilczynski
2025-04-08 12:27                     ` Ulf Hansson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox