From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Return-Path: Message-ID: <1518695051.2883.77.camel@baylibre.com> Subject: Re: [PATCH v2] clk: Properly update prepare/enable count on orphan clock reparent From: Jerome Brunet To: Marek Szyprowski , Stephen Boyd Cc: linux-clk@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Michael Turquette , Shawn Guo , Dong Aisheng , Bartlomiej Zolnierkiewicz , Krzysztof Kozlowski Date: Thu, 15 Feb 2018 12:44:11 +0100 In-Reply-To: <747280a0-77ec-4407-9545-134019b24609@samsung.com> References: <20180126091845.11902-1-m.szyprowski@samsung.com> <20180127004917.GI28313@codeaurora.org> <7c7810e0-efcb-c837-1b18-328e59c3b2df@samsung.com> <747280a0-77ec-4407-9545-134019b24609@samsung.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 List-ID: On Wed, 2018-02-14 at 14:37 +0100, Marek Szyprowski wrote: > Hi Stephen, > > On 2018-01-30 10:01, Marek Szyprowski wrote: > > Hi Stephen > > > > On 2018-01-27 01:49, Stephen Boyd wrote: > > > On 01/26, Marek Szyprowski wrote: > > > > If orphaned clock has been already prepared/enabled (for example if > > > > it or > > > > one of its children has CLK_IS_CRITICAL flag), then the prepare/enable > > > > counters of the newly assigned parent are not updated correctly. This > > > > might later cause warnings during changing clock parents. > > > > > > This doesn't feel right. Perhaps we should delay enabling a clk > > > if it's CRITICAL until we adopt an orphaned clk. Good news is we > > > have orphan status tracking now so this should be pretty simple. > > > > Not really. It won't be so simple, because we would need to track the > > status of the whole clock sub-tree for enabling critical clocks. Even > > then we would need to delay enabling them until the real top clock is > > registered. > > > > > Otherwise migrating the count up is complicated and requires us > > > to call the prepare/enable ops on a critical clk and then keep > > > doing that each time it gets re-parented. Do you have this case, > > > where some clk is marked as CRITICAL, and then we need to migrate > > > that enable/prepare count to the parent? > > > > Yes, this is the case of Exynos5422-based boards, where we have a > > bunch of critical clocks, which are reparented from internal PLL to > > external oscillator when respective power domain is suspended. > > > > It took me a while to gather all the needed information from the > > debug logs. Here is one example of such clock tree: > > > > fin_pll (clk1, top clock, external pll) > > fout_dpll (clk2) > > mout_sclk_dpll (clk3) > > mout_aclk300_disp1 (clk4) > > dout_aclk300_disp1 (clk5) > > mout_sw_aclk300_disp1 (clk6) > > mout_user_aclk300_disp1 (clk7) > > aclk300_disp1 (clk8, critical) > > > > Before turning off power domains, aclk300_disp1 is reparented > > directly to fin_pll (this is a hardware requirement). When power > > domain is turned on again, aclk300_disp1 is reparented back to > > mout_user_aclk300_disp1. > > > > Clocks are registered in the following order: > > 1, 2, 7, 3, 6, 5, 8, 4. > > > > In this case when critical clock is registered, clk4 is not yet > > available, so clocks 1-3 won't get proper prepare/enable count. > > Then this causes a warning during reparenting clk8 to clk1: > > > > ------------[ cut here ]------------ > > WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:811 > > clk_core_disable_lock+0x18/0x24 > > Modules linked in: > > CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted > > 4.15.0-next-20180130-dirty #131 > > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > > Workqueue: pm genpd_power_off_work_fn > > [] (unwind_backtrace) from [] (show_stack+0x10/0x14) > > [] (show_stack) from [] (dump_stack+0x90/0xc8) > > [] (dump_stack) from [] (__warn+0xe4/0x110) > > [] (__warn) from [] (warn_slowpath_null+0x40/0x48) > > [] (warn_slowpath_null) from [] > > (clk_core_disable_lock+0x18/0x24) > > [] (clk_core_disable_lock) from [] > > (clk_core_disable_unprepare+0xc/0x20) > > [] (clk_core_disable_unprepare) from [] > > (__clk_set_parent_after+0x48/0x4c) > > [] (__clk_set_parent_after) from [] > > (clk_core_set_parent_nolock+0x238/0x5d0) > > [] (clk_core_set_parent_nolock) from [] > > (clk_set_parent+0x38/0x6c) > > [] (clk_set_parent) from [] > > (exynos_pd_power+0x74/0x1cc) > > [] (exynos_pd_power) from [] > > (genpd_power_off+0x164/0x264) > > [] (genpd_power_off) from [] > > (genpd_power_off_work_fn+0x2c/0x40) > > [] (genpd_power_off_work_fn) from [] > > (process_one_work+0x1d0/0x7bc) > > [] (process_one_work) from [] > > (worker_thread+0x34/0x4dc) > > [] (worker_thread) from [] (kthread+0x128/0x164) > > [] (kthread) from [] (ret_from_fork+0x14/0x20) > > Exception stack(0xeeb01fb0 to 0xeeb01ff8) > > 1fa0: 00000000 00000000 00000000 > > 00000000 > > 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > > 00000000 > > 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > ---[ end trace 48eea511a44c78ef ]--- > > ------------[ cut here ]------------ > > WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:684 > > clk_core_disable_unprepare+0x18/0x20 > > Modules linked in: > > CPU: 0 PID: 59 Comm: kworker/0:1 Tainted: G W > > 4.15.0-next-20180130-dirty #131 > > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > > Workqueue: pm genpd_power_off_work_fn > > [] (unwind_backtrace) from [] (show_stack+0x10/0x14) > > [] (show_stack) from [] (dump_stack+0x90/0xc8) > > [] (dump_stack) from [] (__warn+0xe4/0x110) > > [] (__warn) from [] (warn_slowpath_null+0x40/0x48) > > [] (warn_slowpath_null) from [] > > (clk_core_disable_unprepare+0x18/0x20) > > [] (clk_core_disable_unprepare) from [] > > (__clk_set_parent_after+0x48/0x4c) > > [] (__clk_set_parent_after) from [] > > (clk_core_set_parent_nolock+0x238/0x5d0) > > [] (clk_core_set_parent_nolock) from [] > > (clk_set_parent+0x38/0x6c) > > [] (clk_set_parent) from [] > > (exynos_pd_power+0x74/0x1cc) > > [] (exynos_pd_power) from [] > > (genpd_power_off+0x164/0x264) > > [] (genpd_power_off) from [] > > (genpd_power_off_work_fn+0x2c/0x40) > > [] (genpd_power_off_work_fn) from [] > > (process_one_work+0x1d0/0x7bc) > > [] (process_one_work) from [] > > (worker_thread+0x34/0x4dc) > > [] (worker_thread) from [] (kthread+0x128/0x164) > > [] (kthread) from [] (ret_from_fork+0x14/0x20) > > Exception stack(0xeeb01fb0 to 0xeeb01ff8) > > 1fa0: 00000000 00000000 00000000 > > 00000000 > > 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > > 00000000 > > 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > ---[ end trace 48eea511a44c78f0 ]--- > > > > Please note that this pattern happens for other critical clocks too. > > > > > Hopefully it isn't the worser case, where the clk is handed out > > > to some consumer but it's still orphaned at that point, and then > > > we have little control over the migration of state to the parent. > > > > Well, in my opinion my patch is simplest fix for the regression > > introduced by commit f8f8f1d04494 ("clk: Don't touch hardware when > > reparenting during registration"). It also doesn't have any side-effects > > as it affects only the situation when orphaned clock has been already > > prepared/enabled, what in practice means it or one of its children has > > critical flag. Reworking critical clock handling will be much more > > complicated. > > Stephen, do you have any comments? > > The issue is there for almost 2 months... v4.16-rc1 is out and its > a good time to get a fix in (this or any other if you propose such). > > Best regards Hi Marek, I've sent another patch addressing the same issue (sorry, I did not see your thread earlier) While I agree parent adopting orphan need to take care of any existing count, your solution could be missing a few things, like dealing with CLK_OPS_PARENT_ENABLE clock types Fortunately, __clk_set_parent_before() and __clk_set_parent_after() already deal with those annoying stuff and does count migration well. Would you mind trying this change [0] and let me know if works for you as well ? [0] : https://lkml.kernel.org/r/20180214134340.17242-5-jbrunet@baylibre.com Cheers Jerome