From: Marek Vasut <marex@denx.de>
To: Ulf Hansson <ulf.hansson@linaro.org>, Peng Fan <peng.fan@oss.nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>,
linux-pm@vger.kernel.org, Adam Ford <aford173@gmail.com>,
Fabio Estevam <festevam@denx.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Jacky Bai <ping.bai@nxp.com>, Kevin Hilman <khilman@kernel.org>,
Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
Len Brown <len.brown@intel.com>,
Liam Girdwood <lgirdwood@gmail.com>,
Mark Brown <broonie@kernel.org>,
Martin Kepplinger <martink@posteo.de>,
Pavel Machek <pavel@ucw.cz>, Peng Fan <peng.fan@nxp.com>,
Pengutronix Kernel Team <kernel@pengutronix.de>,
Philipp Zabel <p.zabel@pengutronix.de>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Sascha Hauer <s.hauer@pengutronix.de>,
Shawn Guo <shawnguo@kernel.org>,
Shengjiu Wang <shengjiu.wang@nxp.com>,
Stephen Boyd <sboyd@kernel.org>,
linux-clk@vger.kernel.org, linux-imx@nxp.com,
Stephen Boyd <swboyd@chromium.org>
Subject: Re: [PATCH 1/3] [RFC] PM: domains: Introduce .power_pre/post_on/off callbacks
Date: Wed, 18 Jan 2023 14:07:22 +0100 [thread overview]
Message-ID: <f4a3a755-c3cc-f21f-b76d-7d2d7d1b18ae@denx.de> (raw)
In-Reply-To: <CAPDyKFo8ekN3e8kzbtftKuyYY1wv0=gn3L=oz5zBjTebRT84+w@mail.gmail.com>
On 1/18/23 13:55, Ulf Hansson wrote:
> On Wed, 4 Jan 2023 at 09:38, Peng Fan <peng.fan@oss.nxp.com> wrote:
>>
>> Hi Ulf,
>>
>> On 11/17/2022 12:30 AM, Ulf Hansson wrote:
>>> On Wed, 16 Nov 2022 at 14:25, Lucas Stach <l.stach@pengutronix.de> wrote:
>>>>
>>>> Hi Ulf,
>>>>
>>>> Am Mittwoch, dem 16.11.2022 um 13:41 +0100 schrieb Ulf Hansson:
>>>>> + Stephen Boyd
>>>>>
>>>>> On Mon, 14 Nov 2022 at 21:32, Marek Vasut <marex@denx.de> wrote:
>>>>>>
>>>>>> On 11/14/22 20:40, Ulf Hansson wrote:
>>>>>>> On Tue, 8 Nov 2022 at 02:35, Marek Vasut <marex@denx.de> wrote:
>>>>>>>>
>>>>>>>> Currently it is possible that a power domain power on or off would claim
>>>>>>>> the genpd lock first and clock core prepare_lock second, while another
>>>>>>>> thread could do the reverse, and this would trigger lockdep warning.
>>>>>>>
>>>>>>> I am not quite sure I fully understand. In this case is the lockdep
>>>>>>> warning relevant or just something that we want to silence?
>>>>>>
>>>>>> This is a valid problem, see patches 2/3 and 3/3 for details too.
>>>>>>
>>>>>>>> Introduce new callbacks, .power_pre/post_on() and .power_off_pre/post(), which
>>>>>>>> are triggered before the genpd_lock() and after genpd_unlock() respectively in
>>>>>>>> case the domain is powered on and off. Those are meant to let drivers claim
>>>>>>>> clock core prepare_lock via clk_*prepare() call and release the lock via
>>>>>>>> clk_*unprepare() call to always assure that the clock and genpd lock ordering
>>>>>>>> is correct.
>>>>>>>
>>>>>>> To me, this sounds like a problem that may be better fixed by trying
>>>>>>> to model the parent/child-domains in a more strict way, through genpd.
>>>>>>>
>>>>>>> There is a comment in the code in imx8mp_blk_ctrl_probe() that seems
>>>>>>> to be pointing in this direction too.
>>>>>>>
>>>>>>> "* We use runtime PM to trigger power on/off of the upstream GPC
>>>>>>> * domain, as a strict hierarchical parent/child power domain
>>>>>>> * setup doesn't allow us to meet the sequencing requirements......"
>>>>>>>
>>>>>>> I am wondering about what those "sequencing requirements" are - and
>>>>>>> whether it could make better sense to fix these issues instead?
>>>>>>
>>>>>> Here is the lockdep splat:
>>>>>>
>>>>>> https://lore.kernel.org/all/Y1cs++TV2GCuh4tS@pendragon.ideasonboard.com/
>>>>>
>>>>> Yes, that certainly helped!
>>>>>
>>>>>>
>>>>>> It really is a problem between the clock and genpd subsystem locks, they
>>>>>> can be claimed in arbitrary order, see patch 2/3 and 3/3.
>>>>>>
>>>>>> I think that might clarify what I am attempting to solve here.
>>>>>
>>>>> Let me try to put some more words behind this, to make sure I have
>>>>> understood correctly, but also to easier allow more people to chim in.
>>>>>
>>>>> Note that, in your commit messages in patch2 and patch3, you are
>>>>> mentioning clk_disable_unused(), but that's not what the lockdep splat
>>>>> above is pointing at. Although, it seems the clk_disable_unused()
>>>>> thingy, would trigger a similar problem for this configuration for the
>>>>> imx8mp platform.
>>>>>
>>>>> Case #1:
>>>>> Triggered from the workqueue, the genpd_power_off_work_fn() ends up
>>>>> calling clk_bulk_unprepare(), from a genpd's ->power_off() callback(),
>>>>> which has been assigned to imx8mp_blk_ctrl_power_off(). Before genpd's
>>>>> ->power_off() is called, the genpd-lock(s) have been acquired, thus we
>>>>> are trying to acquire the global clk-prepare lock via
>>>>> clk_bulk_unprepare() while holding the genpd-lock(s).
>>>>>
>>>>> Case #0:
>>>>> The "drm driver" calls clk_set_rate(), thus we start by acquiring the
>>>>> global clk-prepare lock. Internally in the clock frameworks, the
>>>>> clk_set_rate() path continues to call clk_pm_runtime_get(). In this
>>>>> case, the corresponding clock provider's struct *device, seems to be
>>>>> attached to a genpd too. This means the call to clk_pm_runtime_get()
>>>>> ends up in genpd_runtime_resume(), which needs to acquire the
>>>>> genpd-lock(s) before it continues to call genpd_power_on() to power-on
>>>>> the PM domain. In other words, we try to acquire genpd-lock(s) while
>>>>> holding the global clk-prepare lock.
>>>>>
>>>>> The solution to fix these problems that you suggest in the $subject
>>>>> patch, isn't the direction I want this to take. The new callbacks are
>>>>> prone to be abused and it would also require genpd provider specific
>>>>> code to fix the problems. Of course, we need things to work, but let's
>>>>> look at a couple of options first. See below.
>>>>>
>>>>> 1)
>>>>> In a way, it looks like we have a circular description in DT of the
>>>>> hierarchy of the clock- and genpd providers, which is a bit awkward in
>>>>> my opinion. I was browsing the imx8mp DTS files to find this, but I
>>>>> couldn't. Can you perhaps point me to the DTS file(s) you are using? I
>>>>> can try to have a look so see if this could be arranged differently.
>>>>
>>>> The dependency chain isn't circular, it just happens to converge in the
>>>> clock framework and its single big hammer lock. The chain looks some
>>>> thing like this:
>>>>
>>>> 1. DRM driver request pixel clock (clk_prepare_enable ->
>>>> clk_prepare_mutex)
>>>> 2. Pixel clock is supplied from the PHY, which is in a power domain, so
>>>> in order to supply the clock it needs to runtime resume
>>>> 3. genpd powers up the PHY blk-ctrl domain, which again is inside a
>>>> GPCv2 power domain
>>>> 4. genpd powers up GPCv2 domain, which needs a specific clock to be
>>>> running in order to power up the domain, so it does a
>>>> clk_prepare_enable, which now happens to hit the clk_prepare_mutex
>>>> taken in step 1.
>>>>
>>>> As the runtime resume/suspend for the PHY may go through a workqueue we
>>>> have two different contexts trying to take the clk_prepare_mutex, which
>>>> is what lockdep complains about.
>>>
>>> I see. Thanks for bringing some more clarity in this.
>>>
>>> So the above is described in some of the in-kernel DTS(i) files too?
>>>
>>>>
>>>>>
>>>>> 2)
>>>>> What we have seen from other use cases [1], is that calling
>>>>> pm_runtime_get|put*(), while holding subsystem specific locks (like
>>>>> the genpd-lock(s) and clk-prepare lock), isn't working very well. So,
>>>>> I am thinking that we could have a look at the runtime PM deployment
>>>>> in the clock framework, to see if we can avoid holding the global
>>>>> clk-prepare lock, while calling into runtime PM. I believe that should
>>>>> fix these problems too.
>>>>
>>>> I don't see any straight forward way to avoid the clock framework calls
>>>> in the chain laid out above. I would be happy if anyone has some good
>>>> suggestions.
>>>
>>> I think you misunderstood what I proposed above. What I was suggesting
>>> is that we could re-work the runtime PM deployment in the clock
>>> framework.
>>>
>>> In this way, we would not be holding the global clk-prepare lock while
>>> trying to power-on the phy and its PM domain. Wouldn't that work?
>>
>>
>> Do you have time to give a look at NXP downstream patch to address the
>> lock issue in genpd side [1]?
>
> Thanks for sharing this! I had a look and the approach seems like it
> could work for you. However, I am still not sure it's the correct
> thing to do.
>
>>
>> Honestly I not have idea on how to rework the clk part to avoid holding
>> the clk-prepare lock when power on the phy and its power domain.
>
> I understand. Let me try to prepare a hackish patch that we can use to
> run some tests.
>
> Unfortunately, I can't make any promises for when, but I will do my best.
Thanks !
next prev parent reply other threads:[~2023-01-18 13:37 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-08 1:35 [PATCH 1/3] [RFC] PM: domains: Introduce .power_pre/post_on/off callbacks Marek Vasut
2022-11-08 1:35 ` [PATCH 2/3] [RFC] soc: imx: gpcv2: Split clock prepare from clock enable in the domain Marek Vasut
2022-11-11 8:27 ` Peng Fan
2022-11-08 1:35 ` [PATCH 3/3] [RFC] soc: imx: imx8m-blk-ctrl: " Marek Vasut
2022-11-09 13:19 ` [PATCH 1/3] [RFC] PM: domains: Introduce .power_pre/post_on/off callbacks Laurent Pinchart
2022-11-09 13:25 ` Marek Vasut
2022-11-14 19:40 ` Ulf Hansson
2022-11-14 20:32 ` Marek Vasut
2022-11-16 12:41 ` Ulf Hansson
2022-11-16 13:25 ` Lucas Stach
2022-11-16 16:30 ` Ulf Hansson
2023-01-04 8:37 ` Peng Fan
2023-01-18 12:55 ` Ulf Hansson
2023-01-18 13:07 ` Marek Vasut [this message]
2023-02-16 1:47 ` Peng Fan
2023-02-16 10:48 ` Ulf Hansson
2023-03-01 0:52 ` Peng Fan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f4a3a755-c3cc-f21f-b76d-7d2d7d1b18ae@denx.de \
--to=marex@denx.de \
--cc=aford173@gmail.com \
--cc=broonie@kernel.org \
--cc=festevam@denx.de \
--cc=gregkh@linuxfoundation.org \
--cc=kernel@pengutronix.de \
--cc=khilman@kernel.org \
--cc=l.stach@pengutronix.de \
--cc=laurent.pinchart@ideasonboard.com \
--cc=len.brown@intel.com \
--cc=lgirdwood@gmail.com \
--cc=linux-clk@vger.kernel.org \
--cc=linux-imx@nxp.com \
--cc=linux-pm@vger.kernel.org \
--cc=martink@posteo.de \
--cc=p.zabel@pengutronix.de \
--cc=pavel@ucw.cz \
--cc=peng.fan@nxp.com \
--cc=peng.fan@oss.nxp.com \
--cc=ping.bai@nxp.com \
--cc=rafael@kernel.org \
--cc=s.hauer@pengutronix.de \
--cc=sboyd@kernel.org \
--cc=shawnguo@kernel.org \
--cc=shengjiu.wang@nxp.com \
--cc=swboyd@chromium.org \
--cc=ulf.hansson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox