All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jerome Brunet <jbrunet@baylibre.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>,
	Stephen Boyd <sboyd@codeaurora.org>
Cc: linux-clk@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Michael Turquette <mturquette@baylibre.com>,
	Shawn Guo <shawnguo@kernel.org>,
	Dong Aisheng <aisheng.dong@nxp.com>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
	 Krzysztof Kozlowski <krzk@kernel.org>
Subject: Re: [PATCH v2] clk: Properly update prepare/enable count on orphan clock reparent
Date: Thu, 15 Feb 2018 12:44:11 +0100	[thread overview]
Message-ID: <1518695051.2883.77.camel@baylibre.com> (raw)
In-Reply-To: <747280a0-77ec-4407-9545-134019b24609@samsung.com>

On Wed, 2018-02-14 at 14:37 +0100, Marek Szyprowski wrote:
> Hi Stephen,
> 
> On 2018-01-30 10:01, Marek Szyprowski wrote:
> > Hi Stephen
> > 
> > On 2018-01-27 01:49, Stephen Boyd wrote:
> > > On 01/26, Marek Szyprowski wrote:
> > > > If orphaned clock has been already prepared/enabled (for example if 
> > > > it or
> > > > one of its children has CLK_IS_CRITICAL flag), then the prepare/enable
> > > > counters of the newly assigned parent are not updated correctly. This
> > > > might later cause warnings during changing clock parents.
> > > 
> > > This doesn't feel right. Perhaps we should delay enabling a clk
> > > if it's CRITICAL until we adopt an orphaned clk. Good news is we
> > > have orphan status tracking now so this should be pretty simple.
> > 
> > Not really. It won't be so simple, because we would need to track the
> > status of the whole clock sub-tree for enabling critical clocks. Even
> > then we would need to delay enabling them until the real top clock is
> > registered.
> > 
> > > Otherwise migrating the count up is complicated and requires us
> > > to call the prepare/enable ops on a critical clk and then keep
> > > doing that each time it gets re-parented. Do you have this case,
> > > where some clk is marked as CRITICAL, and then we need to migrate
> > > that enable/prepare count to the parent?
> > 
> > Yes, this is the case of Exynos5422-based boards, where we have a
> > bunch of critical clocks, which are reparented from internal PLL to
> > external oscillator when respective power domain is suspended.
> > 
> > It took me a while to gather all the needed information from the
> > debug logs. Here is one example of such clock tree:
> > 
> > fin_pll (clk1, top clock, external pll)
> >     fout_dpll (clk2)
> >         mout_sclk_dpll (clk3)
> >             mout_aclk300_disp1 (clk4)
> >                 dout_aclk300_disp1 (clk5)
> >                     mout_sw_aclk300_disp1 (clk6)
> >                         mout_user_aclk300_disp1 (clk7)
> >                             aclk300_disp1 (clk8, critical)
> > 
> > Before turning off power domains, aclk300_disp1 is reparented
> > directly to fin_pll (this is a hardware requirement). When power
> > domain is turned on again, aclk300_disp1 is reparented back to
> > mout_user_aclk300_disp1.
> > 
> > Clocks are registered in the following order:
> > 1, 2, 7, 3, 6, 5, 8, 4.
> > 
> > In this case when critical clock is registered, clk4 is not yet
> > available, so clocks 1-3 won't get proper prepare/enable count.
> > Then this causes a warning during reparenting clk8 to clk1:
> > 
> >  ------------[ cut here ]------------
> >  WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:811 
> > clk_core_disable_lock+0x18/0x24
> >  Modules linked in:
> >  CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted 
> > 4.15.0-next-20180130-dirty #131
> >  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> >  Workqueue: pm genpd_power_off_work_fn
> >  [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14)
> >  [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8)
> >  [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110)
> >  [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48)
> >  [<c0125064>] (warn_slowpath_null) from [<c048e650>] 
> > (clk_core_disable_lock+0x18/0x24)
> >  [<c048e650>] (clk_core_disable_lock) from [<c048f51c>] 
> > (clk_core_disable_unprepare+0xc/0x20)
> >  [<c048f51c>] (clk_core_disable_unprepare) from [<c048faec>] 
> > (__clk_set_parent_after+0x48/0x4c)
> >  [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] 
> > (clk_core_set_parent_nolock+0x238/0x5d0)
> >  [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] 
> > (clk_set_parent+0x38/0x6c)
> >  [<c04902e4>] (clk_set_parent) from [<c049d760>] 
> > (exynos_pd_power+0x74/0x1cc)
> >  [<c049d760>] (exynos_pd_power) from [<c0552064>] 
> > (genpd_power_off+0x164/0x264)
> >  [<c0552064>] (genpd_power_off) from [<c0552430>] 
> > (genpd_power_off_work_fn+0x2c/0x40)
> >  [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] 
> > (process_one_work+0x1d0/0x7bc)
> >  [<c014355c>] (process_one_work) from [<c0143bb4>] 
> > (worker_thread+0x34/0x4dc)
> >  [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164)
> >  [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> >  Exception stack(0xeeb01fb0 to 0xeeb01ff8)
> >  1fa0:                                     00000000 00000000 00000000 
> > 00000000
> >  1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
> > 00000000
> >  1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> >  ---[ end trace 48eea511a44c78ef ]---
> >  ------------[ cut here ]------------
> >  WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:684 
> > clk_core_disable_unprepare+0x18/0x20
> >  Modules linked in:
> >  CPU: 0 PID: 59 Comm: kworker/0:1 Tainted: G        W 
> > 4.15.0-next-20180130-dirty #131
> >  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> >  Workqueue: pm genpd_power_off_work_fn
> >  [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14)
> >  [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8)
> >  [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110)
> >  [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48)
> >  [<c0125064>] (warn_slowpath_null) from [<c048f528>] 
> > (clk_core_disable_unprepare+0x18/0x20)
> >  [<c048f528>] (clk_core_disable_unprepare) from [<c048faec>] 
> > (__clk_set_parent_after+0x48/0x4c)
> >  [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] 
> > (clk_core_set_parent_nolock+0x238/0x5d0)
> >  [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] 
> > (clk_set_parent+0x38/0x6c)
> >  [<c04902e4>] (clk_set_parent) from [<c049d760>] 
> > (exynos_pd_power+0x74/0x1cc)
> >  [<c049d760>] (exynos_pd_power) from [<c0552064>] 
> > (genpd_power_off+0x164/0x264)
> >  [<c0552064>] (genpd_power_off) from [<c0552430>] 
> > (genpd_power_off_work_fn+0x2c/0x40)
> >  [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] 
> > (process_one_work+0x1d0/0x7bc)
> >  [<c014355c>] (process_one_work) from [<c0143bb4>] 
> > (worker_thread+0x34/0x4dc)
> >  [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164)
> >  [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> >  Exception stack(0xeeb01fb0 to 0xeeb01ff8)
> >  1fa0:                                     00000000 00000000 00000000 
> > 00000000
> >  1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
> > 00000000
> >  1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> >  ---[ end trace 48eea511a44c78f0 ]---
> > 
> > Please note that this pattern happens for other critical clocks too.
> > 
> > > Hopefully it isn't the worser case, where the clk is handed out
> > > to some consumer but it's still orphaned at that point, and then
> > > we have little control over the migration of state to the parent.
> > 
> > Well, in my opinion my patch is simplest fix for the regression
> > introduced by commit f8f8f1d04494 ("clk: Don't touch hardware when
> > reparenting during registration"). It also doesn't have any side-effects
> > as it affects only the situation when orphaned clock has been already
> > prepared/enabled, what in practice means it or one of its children has
> > critical flag. Reworking critical clock handling will be much more
> > complicated.
> 
> Stephen, do you have any comments?
> 
> The issue is there for almost 2 months... v4.16-rc1 is out and its
> a good time to get a fix in (this or any other if you propose such).
> 
> Best regards

Hi Marek,

I've sent another patch addressing the same issue (sorry, I did not see your
thread earlier)

While I agree parent adopting orphan need to take care of any existing count,
your solution could be missing a few things, like dealing with
CLK_OPS_PARENT_ENABLE clock types

Fortunately, __clk_set_parent_before() and __clk_set_parent_after() already deal
with those annoying stuff and does count migration well.

Would you mind trying this change [0] and let me know if works for you as well ?

[0] : https://lkml.kernel.org/r/20180214134340.17242-5-jbrunet@baylibre.com

Cheers
Jerome

WARNING: multiple messages have this Message-ID (diff)
From: jbrunet@baylibre.com (Jerome Brunet)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2] clk: Properly update prepare/enable count on orphan clock reparent
Date: Thu, 15 Feb 2018 12:44:11 +0100	[thread overview]
Message-ID: <1518695051.2883.77.camel@baylibre.com> (raw)
In-Reply-To: <747280a0-77ec-4407-9545-134019b24609@samsung.com>

On Wed, 2018-02-14 at 14:37 +0100, Marek Szyprowski wrote:
> Hi Stephen,
> 
> On 2018-01-30 10:01, Marek Szyprowski wrote:
> > Hi Stephen
> > 
> > On 2018-01-27 01:49, Stephen Boyd wrote:
> > > On 01/26, Marek Szyprowski wrote:
> > > > If orphaned clock has been already prepared/enabled (for example if 
> > > > it or
> > > > one of its children has CLK_IS_CRITICAL flag), then the prepare/enable
> > > > counters of the newly assigned parent are not updated correctly. This
> > > > might later cause warnings during changing clock parents.
> > > 
> > > This doesn't feel right. Perhaps we should delay enabling a clk
> > > if it's CRITICAL until we adopt an orphaned clk. Good news is we
> > > have orphan status tracking now so this should be pretty simple.
> > 
> > Not really. It won't be so simple, because we would need to track the
> > status of the whole clock sub-tree for enabling critical clocks. Even
> > then we would need to delay enabling them until the real top clock is
> > registered.
> > 
> > > Otherwise migrating the count up is complicated and requires us
> > > to call the prepare/enable ops on a critical clk and then keep
> > > doing that each time it gets re-parented. Do you have this case,
> > > where some clk is marked as CRITICAL, and then we need to migrate
> > > that enable/prepare count to the parent?
> > 
> > Yes, this is the case of Exynos5422-based boards, where we have a
> > bunch of critical clocks, which are reparented from internal PLL to
> > external oscillator when respective power domain is suspended.
> > 
> > It took me a while to gather all the needed information from the
> > debug logs. Here is one example of such clock tree:
> > 
> > fin_pll (clk1, top clock, external pll)
> >     fout_dpll (clk2)
> >         mout_sclk_dpll (clk3)
> >             mout_aclk300_disp1 (clk4)
> >                 dout_aclk300_disp1 (clk5)
> >                     mout_sw_aclk300_disp1 (clk6)
> >                         mout_user_aclk300_disp1 (clk7)
> >                             aclk300_disp1 (clk8, critical)
> > 
> > Before turning off power domains, aclk300_disp1 is reparented
> > directly to fin_pll (this is a hardware requirement). When power
> > domain is turned on again, aclk300_disp1 is reparented back to
> > mout_user_aclk300_disp1.
> > 
> > Clocks are registered in the following order:
> > 1, 2, 7, 3, 6, 5, 8, 4.
> > 
> > In this case when critical clock is registered, clk4 is not yet
> > available, so clocks 1-3 won't get proper prepare/enable count.
> > Then this causes a warning during reparenting clk8 to clk1:
> > 
> >  ------------[ cut here ]------------
> >  WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:811 
> > clk_core_disable_lock+0x18/0x24
> >  Modules linked in:
> >  CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted 
> > 4.15.0-next-20180130-dirty #131
> >  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> >  Workqueue: pm genpd_power_off_work_fn
> >  [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14)
> >  [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8)
> >  [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110)
> >  [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48)
> >  [<c0125064>] (warn_slowpath_null) from [<c048e650>] 
> > (clk_core_disable_lock+0x18/0x24)
> >  [<c048e650>] (clk_core_disable_lock) from [<c048f51c>] 
> > (clk_core_disable_unprepare+0xc/0x20)
> >  [<c048f51c>] (clk_core_disable_unprepare) from [<c048faec>] 
> > (__clk_set_parent_after+0x48/0x4c)
> >  [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] 
> > (clk_core_set_parent_nolock+0x238/0x5d0)
> >  [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] 
> > (clk_set_parent+0x38/0x6c)
> >  [<c04902e4>] (clk_set_parent) from [<c049d760>] 
> > (exynos_pd_power+0x74/0x1cc)
> >  [<c049d760>] (exynos_pd_power) from [<c0552064>] 
> > (genpd_power_off+0x164/0x264)
> >  [<c0552064>] (genpd_power_off) from [<c0552430>] 
> > (genpd_power_off_work_fn+0x2c/0x40)
> >  [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] 
> > (process_one_work+0x1d0/0x7bc)
> >  [<c014355c>] (process_one_work) from [<c0143bb4>] 
> > (worker_thread+0x34/0x4dc)
> >  [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164)
> >  [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> >  Exception stack(0xeeb01fb0 to 0xeeb01ff8)
> >  1fa0:                                     00000000 00000000 00000000 
> > 00000000
> >  1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
> > 00000000
> >  1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> >  ---[ end trace 48eea511a44c78ef ]---
> >  ------------[ cut here ]------------
> >  WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:684 
> > clk_core_disable_unprepare+0x18/0x20
> >  Modules linked in:
> >  CPU: 0 PID: 59 Comm: kworker/0:1 Tainted: G        W 
> > 4.15.0-next-20180130-dirty #131
> >  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> >  Workqueue: pm genpd_power_off_work_fn
> >  [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14)
> >  [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8)
> >  [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110)
> >  [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48)
> >  [<c0125064>] (warn_slowpath_null) from [<c048f528>] 
> > (clk_core_disable_unprepare+0x18/0x20)
> >  [<c048f528>] (clk_core_disable_unprepare) from [<c048faec>] 
> > (__clk_set_parent_after+0x48/0x4c)
> >  [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] 
> > (clk_core_set_parent_nolock+0x238/0x5d0)
> >  [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] 
> > (clk_set_parent+0x38/0x6c)
> >  [<c04902e4>] (clk_set_parent) from [<c049d760>] 
> > (exynos_pd_power+0x74/0x1cc)
> >  [<c049d760>] (exynos_pd_power) from [<c0552064>] 
> > (genpd_power_off+0x164/0x264)
> >  [<c0552064>] (genpd_power_off) from [<c0552430>] 
> > (genpd_power_off_work_fn+0x2c/0x40)
> >  [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] 
> > (process_one_work+0x1d0/0x7bc)
> >  [<c014355c>] (process_one_work) from [<c0143bb4>] 
> > (worker_thread+0x34/0x4dc)
> >  [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164)
> >  [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> >  Exception stack(0xeeb01fb0 to 0xeeb01ff8)
> >  1fa0:                                     00000000 00000000 00000000 
> > 00000000
> >  1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
> > 00000000
> >  1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> >  ---[ end trace 48eea511a44c78f0 ]---
> > 
> > Please note that this pattern happens for other critical clocks too.
> > 
> > > Hopefully it isn't the worser case, where the clk is handed out
> > > to some consumer but it's still orphaned at that point, and then
> > > we have little control over the migration of state to the parent.
> > 
> > Well, in my opinion my patch is simplest fix for the regression
> > introduced by commit f8f8f1d04494 ("clk: Don't touch hardware when
> > reparenting during registration"). It also doesn't have any side-effects
> > as it affects only the situation when orphaned clock has been already
> > prepared/enabled, what in practice means it or one of its children has
> > critical flag. Reworking critical clock handling will be much more
> > complicated.
> 
> Stephen, do you have any comments?
> 
> The issue is there for almost 2 months... v4.16-rc1 is out and its
> a good time to get a fix in (this or any other if you propose such).
> 
> Best regards

Hi Marek,

I've sent another patch addressing the same issue (sorry, I did not see your
thread earlier)

While I agree parent adopting orphan need to take care of any existing count,
your solution could be missing a few things, like dealing with
CLK_OPS_PARENT_ENABLE clock types

Fortunately, __clk_set_parent_before() and __clk_set_parent_after() already deal
with those annoying stuff and does count migration well.

Would you mind trying this change [0] and let me know if works for you as well ?

[0] : https://lkml.kernel.org/r/20180214134340.17242-5-jbrunet at baylibre.com

Cheers
Jerome

  reply	other threads:[~2018-02-15 11:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20180126091902eucas1p2b3ad9978e3ab09bd7a2bd9123c9eecaa@eucas1p2.samsung.com>
2018-01-26  9:18 ` [PATCH v2] clk: Properly update prepare/enable count on orphan clock reparent Marek Szyprowski
2018-01-26  9:18   ` Marek Szyprowski
2018-01-27  0:49   ` Stephen Boyd
2018-01-27  0:49     ` Stephen Boyd
2018-01-30  9:01     ` Marek Szyprowski
2018-01-30  9:01       ` Marek Szyprowski
2018-02-14 13:37       ` Marek Szyprowski
2018-02-14 13:37         ` Marek Szyprowski
2018-02-15 11:44         ` Jerome Brunet [this message]
2018-02-15 11:44           ` Jerome Brunet
2018-02-15 12:56           ` Marek Szyprowski
2018-02-15 12:56             ` Marek Szyprowski
2018-02-15 12:19     ` Jerome Brunet
2018-02-15 12:19       ` Jerome Brunet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1518695051.2883.77.camel@baylibre.com \
    --to=jbrunet@baylibre.com \
    --cc=aisheng.dong@nxp.com \
    --cc=b.zolnierkie@samsung.com \
    --cc=krzk@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-clk@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mturquette@baylibre.com \
    --cc=sboyd@codeaurora.org \
    --cc=shawnguo@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.