* Re: [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default [not found] ` <20210702225145.2643303-2-martin.blumenstingl@googlemail.com> @ 2021-10-14 9:55 ` Alex Bee 2021-10-14 12:11 ` Martin Blumenstingl 0 siblings, 1 reply; 15+ messages in thread From: Alex Bee @ 2021-10-14 9:55 UTC (permalink / raw) To: Martin Blumenstingl, linux-clk, sboyd Cc: linux-arm-kernel, linux-kernel, Heiko Stübner, linux-rockchip Hi, Am 03.07.21 um 00:51 schrieb Martin Blumenstingl: > .determine_rate is meant to replace .round_rate. The former comes with a > benefit which is especially relevant on 32-bit systems: since > .determine_rate uses an "unsigned long" (compared to a "signed long" > which is used by .round_rate) the maximum value on 32-bit systems > increases from 2^31 (or approx. 2.14GHz) to 2^32 (or approx. 4.29GHz). > > Implement .determine_rate in addition to .round_rate so drivers that are > using clk_divider_{ro_,}ops can benefit from this by default. Keep the > .round_rate callback for now since some drivers rely on > clk_divider_ops.round_rate being implemented. > > Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> This commit breaks composite clocks with multiple parents, since it adds a determine_rate callback, which is preferred over clock_round_rate in clk_composite_determine_rate in clk-composite.c and the "best-parent"-determination is only done for clock_round_rate-op there. There is no "best-parent"-determination in determine_rate in clk-divider which clk-compsite seems to expect - nor any multiple parents handling at all. That means that the composite will always stay at the same/initial parent clock (from the mux), without ever changing it (even if necessary). This breaks lot of clocks for Rockchip which intensively uses composites, i.e. those clocks will always stay at the initial parent, which in some cases is the XTAL clock and I strongly guess it is the same for other platforms, which use composite clocks having more than one parent (e.g. mediatek, ti ...) Example (RK3399) clk_sdio is set (initialized) with XTAL (24 MHz) as parent in u-boot. It will always stay at this parent, even if the mmc driver sets a rate of 200 MHz (fails, as the nature of things), which should switch it to any of its possible parent PLLs defined in mux_pll_src_cpll_gpll_npll_ppll_upll_24m_p (see clk-rk3399.c) - which never happens. Reverting this commit makes it work again: Unless there is a quick and obvious fix for that, I guess this should be done for 5.15 - even if the real issue is somewhere else. Alex > --- > drivers/clk/clk-divider.c | 23 +++++++++++++++++++++++ > 1 file changed, 23 insertions(+) > > diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c > index 87ba4966b0e8..f6b2bf558486 100644 > --- a/drivers/clk/clk-divider.c > +++ b/drivers/clk/clk-divider.c > @@ -446,6 +446,27 @@ static long clk_divider_round_rate(struct clk_hw *hw, unsigned long rate, > divider->width, divider->flags); > } > > +static int clk_divider_determine_rate(struct clk_hw *hw, > + struct clk_rate_request *req) > +{ > + struct clk_divider *divider = to_clk_divider(hw); > + > + /* if read only, just return current value */ > + if (divider->flags & CLK_DIVIDER_READ_ONLY) { > + u32 val; > + > + val = clk_div_readl(divider) >> divider->shift; > + val &= clk_div_mask(divider->width); > + > + return divider_ro_determine_rate(hw, req, divider->table, > + divider->width, > + divider->flags, val); > + } > + > + return divider_determine_rate(hw, req, divider->table, divider->width, > + divider->flags); > +} > + > int divider_get_val(unsigned long rate, unsigned long parent_rate, > const struct clk_div_table *table, u8 width, > unsigned long flags) > @@ -501,6 +522,7 @@ static int clk_divider_set_rate(struct clk_hw *hw, unsigned long rate, > const struct clk_ops clk_divider_ops = { > .recalc_rate = clk_divider_recalc_rate, > .round_rate = clk_divider_round_rate, > + .determine_rate = clk_divider_determine_rate, > .set_rate = clk_divider_set_rate, > }; > EXPORT_SYMBOL_GPL(clk_divider_ops); > @@ -508,6 +530,7 @@ EXPORT_SYMBOL_GPL(clk_divider_ops); > const struct clk_ops clk_divider_ro_ops = { > .recalc_rate = clk_divider_recalc_rate, > .round_rate = clk_divider_round_rate, > + .determine_rate = clk_divider_determine_rate, > }; > EXPORT_SYMBOL_GPL(clk_divider_ro_ops); > _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default 2021-10-14 9:55 ` [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default Alex Bee @ 2021-10-14 12:11 ` Martin Blumenstingl 2021-10-14 21:34 ` Martin Blumenstingl 0 siblings, 1 reply; 15+ messages in thread From: Martin Blumenstingl @ 2021-10-14 12:11 UTC (permalink / raw) To: Alex Bee Cc: linux-clk, sboyd, linux-arm-kernel, linux-kernel, Heiko Stübner, linux-rockchip Hi Alex, On Thu, Oct 14, 2021 at 11:55 AM Alex Bee <knaerzche@gmail.com> wrote: [...] > This breaks lot of clocks for Rockchip which intensively uses > composites, i.e. those clocks will always stay at the initial parent, > which in some cases is the XTAL clock and I strongly guess it is the > same for other platforms, which use composite clocks having more than > one parent (e.g. mediatek, ti ...) Sorry for that and thanks for bisecting this! > Example (RK3399) > clk_sdio is set (initialized) with XTAL (24 MHz) as parent in u-boot. > It will always stay at this parent, even if the mmc driver sets a rate > of 200 MHz (fails, as the nature of things), which should switch it to > any of its possible parent PLLs defined in > mux_pll_src_cpll_gpll_npll_ppll_upll_24m_p (see clk-rk3399.c) - which > never happens. My question to Stephen et. al. is: where is the correct place to solve this? What I came up with so far (in no particular order): 1) not using clk-composite from clock drivers and letting CCF take care of re-parenting clocks as needed (and as specified with CLK_SET_RATE_NO_REPARENT) 2) clk-composite.c: extending the logic so "rate" clocks with .determine_rate include the existing logic which only applies to .round_rate (which means clk-composite.c is then responsible for finding the best possible parent clock) 3) clk-divider.c: extending the logic here to account for clk_hws with multiple parents For 3) I am wondering whether this would even work because it seems that clk-composite uses multiple struct clk_hw. Letting the divider handle multiple parents means it would need to know about the information which is only available in mux_hw - whereas clk-composite currently passes rate_hw (struct clk_hw for the divider). I am happy to work on a patch for this if I can get some help with testing (since I don't have any board with Rockchip SoC). > Reverting this commit makes it work again: Unless there is a quick and > obvious fix for that, I guess this should be done for 5.15 - even if the > real issue is somewhere else. Reverting this patch is fine from the Amlogic SoC point of view. The main goal was to clean up / improve the CCF code. Nothing (that I am aware of) is going to break in Amlogic land if we revert this. Best regards, Martin _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default 2021-10-14 12:11 ` Martin Blumenstingl @ 2021-10-14 21:34 ` Martin Blumenstingl 2021-10-14 22:52 ` Stephen Boyd 2021-10-15 19:14 ` [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default Alex Bee 0 siblings, 2 replies; 15+ messages in thread From: Martin Blumenstingl @ 2021-10-14 21:34 UTC (permalink / raw) To: Alex Bee Cc: linux-clk, sboyd, linux-arm-kernel, linux-kernel, Heiko Stübner, linux-rockchip [-- Attachment #1: Type: text/plain, Size: 1338 bytes --] On Thu, Oct 14, 2021 at 2:11 PM Martin Blumenstingl <martin.blumenstingl@googlemail.com> wrote: [...] > > Reverting this commit makes it work again: Unless there is a quick and > > obvious fix for that, I guess this should be done for 5.15 - even if the > > real issue is somewhere else. > Reverting this patch is fine from the Amlogic SoC point of view. > The main goal was to clean up / improve the CCF code. > Nothing (that I am aware of) is going to break in Amlogic land if we > revert this. Unfortunately only now I realized that reverting this patch would also require reverting the other five patches in this series (since they depend on this one). For this reason I propose changing the order of the checks in clk-composite.c - see the attached patch (which I can send as a proper one once agreed that this is the way to go forward) Off-list Alex also suggested that I should use rate_ops.determine_rate if available. While I agree that this makes sense in general my plan is to do this in a follow-up patch. Changing the order of the conditions is needed anyways and it *should* fix the issue reported here (but I have no way of testing that unfortunately). Alex, it would be great if you (or someone with Rockchip boards) could test the attached patch and let me know if it fixes the reported problem. Best regards, Martin [-- Attachment #2: 0001-clk-composite-Also-consider-.determine_rate-for-rate.patch --] [-- Type: text/x-patch, Size: 3151 bytes --] From b34cde2d9e1d8ecb2c775cbcad519933f63b84f9 Mon Sep 17 00:00:00 2001 From: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Date: Thu, 14 Oct 2021 14:16:22 +0200 Subject: [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites Commit 69a00fb3d69706 ("clk: divider: Implement and wire up .determine_rate by default") switches clk_divider_ops to implement .determine_rate by default. This breaks composite clocks with multiple parents because clk-composite.c does not use the special handling for mux + divider combinations anymore (that was restricted to rate clocks which only implement .round_rate, but not .determine_rate). Alex reports: This breaks lot of clocks for Rockchip which intensively uses composites, i.e. those clocks will always stay at the initial parent, which in some cases is the XTAL clock and I strongly guess it is the same for other platforms, which use composite clocks having more than one parent (e.g. mediatek, ti ...) Example (RK3399) clk_sdio is set (initialized) with XTAL (24 MHz) as parent in u-boot. It will always stay at this parent, even if the mmc driver sets a rate of 200 MHz (fails, as the nature of things), which should switch it to any of its possible parent PLLs defined in mux_pll_src_cpll_gpll_npll_ppll_upll_24m_p (see clk-rk3399.c) - which never happens. Restore the original behavior by changing the priority of the conditions inside clk-composite.c. Now the special rate + mux case (with rate_ops having a .round_rate - which is still the case for the default clk_divider_ops) is preferred over rate_ops which have .determine_rate defined (and not further considering the mux). Fixes: 69a00fb3d69706 ("clk: divider: Implement and wire up .determine_rate by default") Reported-by: Alex Bee <knaerzche@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> --- drivers/clk/clk-composite.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c index 0506046a5f4b..510a9965633b 100644 --- a/drivers/clk/clk-composite.c +++ b/drivers/clk/clk-composite.c @@ -58,11 +58,8 @@ static int clk_composite_determine_rate(struct clk_hw *hw, long rate; int i; - if (rate_hw && rate_ops && rate_ops->determine_rate) { - __clk_hw_set_clk(rate_hw, hw); - return rate_ops->determine_rate(rate_hw, req); - } else if (rate_hw && rate_ops && rate_ops->round_rate && - mux_hw && mux_ops && mux_ops->set_parent) { + if (rate_hw && rate_ops && rate_ops->round_rate && + mux_hw && mux_ops && mux_ops->set_parent) { req->best_parent_hw = NULL; if (clk_hw_get_flags(hw) & CLK_SET_RATE_NO_REPARENT) { @@ -107,6 +104,9 @@ static int clk_composite_determine_rate(struct clk_hw *hw, req->rate = best_rate; return 0; + } else if (rate_hw && rate_ops && rate_ops->determine_rate) { + __clk_hw_set_clk(rate_hw, hw); + return rate_ops->determine_rate(rate_hw, req); } else if (mux_hw && mux_ops && mux_ops->determine_rate) { __clk_hw_set_clk(mux_hw, hw); return mux_ops->determine_rate(mux_hw, req); -- 2.33.0 [-- Attachment #3: Type: text/plain, Size: 170 bytes --] _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default 2021-10-14 21:34 ` Martin Blumenstingl @ 2021-10-14 22:52 ` Stephen Boyd 2021-10-15 12:05 ` [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites Martin Blumenstingl 2021-10-15 19:14 ` [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default Alex Bee 1 sibling, 1 reply; 15+ messages in thread From: Stephen Boyd @ 2021-10-14 22:52 UTC (permalink / raw) To: Alex Bee, Martin Blumenstingl Cc: linux-clk, linux-arm-kernel, linux-kernel, Heiko Stübner, linux-rockchip Quoting Martin Blumenstingl (2021-10-14 14:34:54) > On Thu, Oct 14, 2021 at 2:11 PM Martin Blumenstingl > <martin.blumenstingl@googlemail.com> wrote: > [...] > > > Reverting this commit makes it work again: Unless there is a quick and > > > obvious fix for that, I guess this should be done for 5.15 - even if the > > > real issue is somewhere else. > > Reverting this patch is fine from the Amlogic SoC point of view. > > The main goal was to clean up / improve the CCF code. > > Nothing (that I am aware of) is going to break in Amlogic land if we > > revert this. > Unfortunately only now I realized that reverting this patch would also > require reverting the other five patches in this series (since they > depend on this one). > For this reason I propose changing the order of the checks in > clk-composite.c - see the attached patch (which I can send as a proper > one once agreed that this is the way to go forward) > > Off-list Alex also suggested that I should use rate_ops.determine_rate > if available. > While I agree that this makes sense in general my plan is to do this > in a follow-up patch. > Changing the order of the conditions is needed anyways and it *should* > fix the issue reported here (but I have no way of testing that > unfortunately). > > Alex, it would be great if you (or someone with Rockchip boards) could > test the attached patch and let me know if it fixes the reported > problem. > I can't read your attached patch. Please send it inline. _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites 2021-10-14 22:52 ` Stephen Boyd @ 2021-10-15 12:05 ` Martin Blumenstingl 2021-10-15 21:27 ` Stephen Boyd 2021-11-01 20:19 ` Guillaume Tucker 0 siblings, 2 replies; 15+ messages in thread From: Martin Blumenstingl @ 2021-10-15 12:05 UTC (permalink / raw) To: sboyd Cc: heiko, knaerzche, linux-arm-kernel, linux-clk, linux-kernel, linux-rockchip, martin.blumenstingl Commit 69a00fb3d69706 ("clk: divider: Implement and wire up .determine_rate by default") switches clk_divider_ops to implement .determine_rate by default. This breaks composite clocks with multiple parents because clk-composite.c does not use the special handling for mux + divider combinations anymore (that was restricted to rate clocks which only implement .round_rate, but not .determine_rate). Alex reports: This breaks lot of clocks for Rockchip which intensively uses composites, i.e. those clocks will always stay at the initial parent, which in some cases is the XTAL clock and I strongly guess it is the same for other platforms, which use composite clocks having more than one parent (e.g. mediatek, ti ...) Example (RK3399) clk_sdio is set (initialized) with XTAL (24 MHz) as parent in u-boot. It will always stay at this parent, even if the mmc driver sets a rate of 200 MHz (fails, as the nature of things), which should switch it to any of its possible parent PLLs defined in mux_pll_src_cpll_gpll_npll_ppll_upll_24m_p (see clk-rk3399.c) - which never happens. Restore the original behavior by changing the priority of the conditions inside clk-composite.c. Now the special rate + mux case (with rate_ops having a .round_rate - which is still the case for the default clk_divider_ops) is preferred over rate_ops which have .determine_rate defined (and not further considering the mux). Fixes: 69a00fb3d69706 ("clk: divider: Implement and wire up .determine_rate by default") Reported-by: Alex Bee <knaerzche@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> --- Re-sending this as inline patch instead of attaching it. drivers/clk/clk-composite.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c index 0506046a5f4b..510a9965633b 100644 --- a/drivers/clk/clk-composite.c +++ b/drivers/clk/clk-composite.c @@ -58,11 +58,8 @@ static int clk_composite_determine_rate(struct clk_hw *hw, long rate; int i; - if (rate_hw && rate_ops && rate_ops->determine_rate) { - __clk_hw_set_clk(rate_hw, hw); - return rate_ops->determine_rate(rate_hw, req); - } else if (rate_hw && rate_ops && rate_ops->round_rate && - mux_hw && mux_ops && mux_ops->set_parent) { + if (rate_hw && rate_ops && rate_ops->round_rate && + mux_hw && mux_ops && mux_ops->set_parent) { req->best_parent_hw = NULL; if (clk_hw_get_flags(hw) & CLK_SET_RATE_NO_REPARENT) { @@ -107,6 +104,9 @@ static int clk_composite_determine_rate(struct clk_hw *hw, req->rate = best_rate; return 0; + } else if (rate_hw && rate_ops && rate_ops->determine_rate) { + __clk_hw_set_clk(rate_hw, hw); + return rate_ops->determine_rate(rate_hw, req); } else if (mux_hw && mux_ops && mux_ops->determine_rate) { __clk_hw_set_clk(mux_hw, hw); return mux_ops->determine_rate(mux_hw, req); -- 2.33.0 _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites 2021-10-15 12:05 ` [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites Martin Blumenstingl @ 2021-10-15 21:27 ` Stephen Boyd 2021-11-01 20:19 ` Guillaume Tucker 1 sibling, 0 replies; 15+ messages in thread From: Stephen Boyd @ 2021-10-15 21:27 UTC (permalink / raw) To: Martin Blumenstingl Cc: heiko, knaerzche, linux-arm-kernel, linux-clk, linux-kernel, linux-rockchip, martin.blumenstingl Quoting Martin Blumenstingl (2021-10-15 05:05:59) > Commit 69a00fb3d69706 ("clk: divider: Implement and wire up > .determine_rate by default") switches clk_divider_ops to implement > .determine_rate by default. This breaks composite clocks with multiple > parents because clk-composite.c does not use the special handling for > mux + divider combinations anymore (that was restricted to rate clocks > which only implement .round_rate, but not .determine_rate). > > Alex reports: > This breaks lot of clocks for Rockchip which intensively uses > composites, i.e. those clocks will always stay at the initial parent, > which in some cases is the XTAL clock and I strongly guess it is the > same for other platforms, which use composite clocks having more than > one parent (e.g. mediatek, ti ...) > > Example (RK3399) > clk_sdio is set (initialized) with XTAL (24 MHz) as parent in u-boot. > It will always stay at this parent, even if the mmc driver sets a rate > of 200 MHz (fails, as the nature of things), which should switch it > to any of its possible parent PLLs defined in > mux_pll_src_cpll_gpll_npll_ppll_upll_24m_p (see clk-rk3399.c) - which > never happens. > > Restore the original behavior by changing the priority of the conditions > inside clk-composite.c. Now the special rate + mux case (with rate_ops > having a .round_rate - which is still the case for the default > clk_divider_ops) is preferred over rate_ops which have .determine_rate > defined (and not further considering the mux). > > Fixes: 69a00fb3d69706 ("clk: divider: Implement and wire up .determine_rate by default") > Reported-by: Alex Bee <knaerzche@gmail.com> > Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> > --- > Re-sending this as inline patch instead of attaching it. > > drivers/clk/clk-composite.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c > index 0506046a5f4b..510a9965633b 100644 > --- a/drivers/clk/clk-composite.c > +++ b/drivers/clk/clk-composite.c > @@ -58,11 +58,8 @@ static int clk_composite_determine_rate(struct clk_hw *hw, > long rate; > int i; > > - if (rate_hw && rate_ops && rate_ops->determine_rate) { > - __clk_hw_set_clk(rate_hw, hw); > - return rate_ops->determine_rate(rate_hw, req); > - } else if (rate_hw && rate_ops && rate_ops->round_rate && > - mux_hw && mux_ops && mux_ops->set_parent) { > + if (rate_hw && rate_ops && rate_ops->round_rate && > + mux_hw && mux_ops && mux_ops->set_parent) { What do we do if rate_ops and mux_ops only implement determine_rate? It will fail right? We can't mesh them together in function. We should probably fail to let the composite clk be registered if that happens. > req->best_parent_hw = NULL; > > if (clk_hw_get_flags(hw) & CLK_SET_RATE_NO_REPARENT) { > @@ -107,6 +104,9 @@ static int clk_composite_determine_rate(struct clk_hw *hw, > > req->rate = best_rate; > return 0; > + } else if (rate_hw && rate_ops && rate_ops->determine_rate) { > + __clk_hw_set_clk(rate_hw, hw); > + return rate_ops->determine_rate(rate_hw, req); > } else if (mux_hw && mux_ops && mux_ops->determine_rate) { > __clk_hw_set_clk(mux_hw, hw); > return mux_ops->determine_rate(mux_hw, req); > -- > 2.33.0 > _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites 2021-10-15 12:05 ` [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites Martin Blumenstingl 2021-10-15 21:27 ` Stephen Boyd @ 2021-11-01 20:19 ` Guillaume Tucker 2021-11-01 20:58 ` Martin Blumenstingl 1 sibling, 1 reply; 15+ messages in thread From: Guillaume Tucker @ 2021-11-01 20:19 UTC (permalink / raw) To: Martin Blumenstingl, sboyd Cc: heiko, knaerzche, linux-arm-kernel, linux-clk, linux-kernel, linux-rockchip, kernelci@groups.io, Collabora Kernel ML Hi Martin, Please see the bisection report below about a boot failure on rk3328-rock64. Reports aren't automatically sent to the public while we're trialing new bisection features on kernelci.org but this one looks valid. Some more details can be found here: https://linux.kernelci.org/test/case/id/617f11f5c157b666fb3358e6/ Here's what appears to be the cause of the problem: [ 0.033465] CPU: CPUs started in inconsistent modes [ 0.033557] Unexpected kernel BRK exception at EL1 [ 0.034432] Internal error: BRK handler: f2000800 [#1] PREEMPT SMP There doesn't appear to be any other platform in KernelCI showing the same issue. Please let us know if you need help debugging the issue or if you have a fix to try. Best wishes, Guillaume GitHub: https://github.com/kernelci/kernelci-project/issues/69 ------------------------------------------------------------------------------- * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * This automated bisection report was sent to you on the basis * * that you may be involved with the breaking commit it has * * found. No manual investigation has been done to verify it, * * and the root cause of the problem may be somewhere else. * * * * If you do send a fix, please include this trailer: * * Reported-by: "kernelci.org bot" <bot@kernelci.org> * * * * Hope this helps! * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * mainline/master bisection: baseline.login on rk3328-rock64 Summary: Start: 75fcbd38608c3 Merge tag 'perf-tools-fixes-for-v5.15-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Plain log: https://storage.kernelci.org/mainline/master/v5.15-rc7-240-g75fcbd38608c/arm64/defconfig/gcc-10/lab-baylibre/baseline-rk3328-rock64.txt HTML log: https://storage.kernelci.org/mainline/master/v5.15-rc7-240-g75fcbd38608c/arm64/defconfig/gcc-10/lab-baylibre/baseline-rk3328-rock64.html Result: 675c496d0f92b clk: composite: Also consider .determine_rate for rate + mux composites Checks: revert: PASS verify: PASS Parameters: Tree: mainline URL: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Branch: master Target: rk3328-rock64 CPU arch: arm64 Lab: lab-baylibre Compiler: gcc-10 Config: defconfig Test case: baseline.login ------------------------------------------------------------------------------- Git bisection log: git bisect start # good: [119c85055d867b9588263bca59794c872ef2a30e] Merge tag 'powerpc-5.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux git bisect good 119c85055d867b9588263bca59794c872ef2a30e # bad: [75fcbd38608c3ce9f4dc784f2ac8916add64c9a8] Merge tag 'perf-tools-fixes-for-v5.15-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux git bisect bad 75fcbd38608c3ce9f4dc784f2ac8916add64c9a8 # bad: [095729484efc4aa4604c474792b059bd940addce] perf build: Suppress 'rm dlfilter' build message git bisect bad 095729484efc4aa4604c474792b059bd940addce # bad: [3a4347d82efdfcc5465b3ed37616426989182915] Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux git bisect bad 3a4347d82efdfcc5465b3ed37616426989182915 # good: [54c5639d8f507ebefa814f574cb6f763033a72a5] riscv: Fix asan-stack clang build git bisect good 54c5639d8f507ebefa814f574cb6f763033a72a5 # bad: [675c496d0f92b481ebe4abf4fb06eadad7789de6] clk: composite: Also consider .determine_rate for rate + mux composites git bisect bad 675c496d0f92b481ebe4abf4fb06eadad7789de6 # first bad commit: [675c496d0f92b481ebe4abf4fb06eadad7789de6] clk: composite: Also consider .determine_rate for rate + mux composites ------------------------------------------------------------------------------- Breaking commit found: commit 675c496d0f92b481ebe4abf4fb06eadad7789de6 Author: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Date: Sat Oct 16 12:50:21 2021 +0200 On 15/10/2021 13:05, Martin Blumenstingl wrote: > Commit 69a00fb3d69706 ("clk: divider: Implement and wire up > .determine_rate by default") switches clk_divider_ops to implement > .determine_rate by default. This breaks composite clocks with multiple > parents because clk-composite.c does not use the special handling for > mux + divider combinations anymore (that was restricted to rate clocks > which only implement .round_rate, but not .determine_rate). > > Alex reports: > This breaks lot of clocks for Rockchip which intensively uses > composites, i.e. those clocks will always stay at the initial parent, > which in some cases is the XTAL clock and I strongly guess it is the > same for other platforms, which use composite clocks having more than > one parent (e.g. mediatek, ti ...) > > Example (RK3399) > clk_sdio is set (initialized) with XTAL (24 MHz) as parent in u-boot. > It will always stay at this parent, even if the mmc driver sets a rate > of 200 MHz (fails, as the nature of things), which should switch it > to any of its possible parent PLLs defined in > mux_pll_src_cpll_gpll_npll_ppll_upll_24m_p (see clk-rk3399.c) - which > never happens. > > Restore the original behavior by changing the priority of the conditions > inside clk-composite.c. Now the special rate + mux case (with rate_ops > having a .round_rate - which is still the case for the default > clk_divider_ops) is preferred over rate_ops which have .determine_rate > defined (and not further considering the mux). > > Fixes: 69a00fb3d69706 ("clk: divider: Implement and wire up .determine_rate by default") > Reported-by: Alex Bee <knaerzche@gmail.com> > Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> > --- > Re-sending this as inline patch instead of attaching it. > > drivers/clk/clk-composite.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c > index 0506046a5f4b..510a9965633b 100644 > --- a/drivers/clk/clk-composite.c > +++ b/drivers/clk/clk-composite.c > @@ -58,11 +58,8 @@ static int clk_composite_determine_rate(struct clk_hw *hw, > long rate; > int i; > > - if (rate_hw && rate_ops && rate_ops->determine_rate) { > - __clk_hw_set_clk(rate_hw, hw); > - return rate_ops->determine_rate(rate_hw, req); > - } else if (rate_hw && rate_ops && rate_ops->round_rate && > - mux_hw && mux_ops && mux_ops->set_parent) { > + if (rate_hw && rate_ops && rate_ops->round_rate && > + mux_hw && mux_ops && mux_ops->set_parent) { > req->best_parent_hw = NULL; > > if (clk_hw_get_flags(hw) & CLK_SET_RATE_NO_REPARENT) { > @@ -107,6 +104,9 @@ static int clk_composite_determine_rate(struct clk_hw *hw, > > req->rate = best_rate; > return 0; > + } else if (rate_hw && rate_ops && rate_ops->determine_rate) { > + __clk_hw_set_clk(rate_hw, hw); > + return rate_ops->determine_rate(rate_hw, req); > } else if (mux_hw && mux_ops && mux_ops->determine_rate) { > __clk_hw_set_clk(mux_hw, hw); > return mux_ops->determine_rate(mux_hw, req); > _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites 2021-11-01 20:19 ` Guillaume Tucker @ 2021-11-01 20:58 ` Martin Blumenstingl 2021-11-01 21:59 ` Robin Murphy 0 siblings, 1 reply; 15+ messages in thread From: Martin Blumenstingl @ 2021-11-01 20:58 UTC (permalink / raw) To: Guillaume Tucker Cc: sboyd, heiko, knaerzche, linux-arm-kernel, linux-clk, linux-kernel, linux-rockchip, kernelci@groups.io, Collabora Kernel ML, Chen-Yu Tsai Hi Guillaume, On Mon, Nov 1, 2021 at 9:19 PM Guillaume Tucker <guillaume.tucker@collabora.com> wrote: > > Hi Martin, > > Please see the bisection report below about a boot failure on > rk3328-rock64. > > Reports aren't automatically sent to the public while we're > trialing new bisection features on kernelci.org but this one > looks valid. > > Some more details can be found here: > > https://linux.kernelci.org/test/case/id/617f11f5c157b666fb3358e6/ > > Here's what appears to be the cause of the problem: > > [ 0.033465] CPU: CPUs started in inconsistent modes > [ 0.033557] Unexpected kernel BRK exception at EL1 > [ 0.034432] Internal error: BRK handler: f2000800 [#1] PREEMPT SMP > > There doesn't appear to be any other platform in KernelCI showing > the same issue. That's a strange error for the changes from my patch. At first glance I don't see any relation to clk-composite code: - the call trace doesn't have any references to CCF or rockchip clock drivers - clk-rk3328.c uses drivers/clk/rockchip/clk-cpu.c to register the CPU clock which does not use clk-composite Chen-Yu has tested this patch (plus [0]) on RK3399 and didn't observe any problems. So maybe this is a RK3328 specific issue? Anyways, I am interested in fixing this issue because reverting is becoming more and more complex (since I think we're at eight commits which would need to be reverted in total). > Please let us know if you need help debugging the issue or if you > have a fix to try. Could you please try [0] which is the second patch in the series which finally made it upstream. This second patch is not in 5.15 because I believed that it's only something to make the code in clk-composite.c more future-proof. It's not a condition that I am aware of. I don't have any Rockchip boards myself. So I am thankful for any help I can get. Best regards, Martin [0] https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git/commit/?h=clk-next&id=6594988fd625ff0d9a8f90f1788e16185358a3e6 _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites 2021-11-01 20:58 ` Martin Blumenstingl @ 2021-11-01 21:59 ` Robin Murphy 2021-11-01 22:11 ` Robin Murphy 0 siblings, 1 reply; 15+ messages in thread From: Robin Murphy @ 2021-11-01 21:59 UTC (permalink / raw) To: Martin Blumenstingl, Guillaume Tucker Cc: sboyd, heiko, knaerzche, linux-arm-kernel, linux-clk, linux-kernel, linux-rockchip, kernelci@groups.io, Collabora Kernel ML, Chen-Yu Tsai On 2021-11-01 20:58, Martin Blumenstingl wrote: > Hi Guillaume, > > On Mon, Nov 1, 2021 at 9:19 PM Guillaume Tucker > <guillaume.tucker@collabora.com> wrote: >> >> Hi Martin, >> >> Please see the bisection report below about a boot failure on >> rk3328-rock64. >> >> Reports aren't automatically sent to the public while we're >> trialing new bisection features on kernelci.org but this one >> looks valid. >> >> Some more details can be found here: >> >> https://linux.kernelci.org/test/case/id/617f11f5c157b666fb3358e6/ >> >> Here's what appears to be the cause of the problem: >> >> [ 0.033465] CPU: CPUs started in inconsistent modes >> [ 0.033557] Unexpected kernel BRK exception at EL1 >> [ 0.034432] Internal error: BRK handler: f2000800 [#1] PREEMPT SMP What's weird is that that's really just the same WARN that's also present in 'successful' logs, except for some reason it's behaving as if the break handler hasn't been registered, despite that having happened long before we got to smp_init(). At this point we're also still some way off getting as far as initcalls, so I'm not sure that the clock driver would be in the picture at all yet. Is the bisection repeatable, or is this just random flakiness misleading things? I'd also note that you need pretty horrifically broken firmware to hit that warning in the first place, which might cast a bit of doubt over the trustworthiness of that board altogether. Robin. >> There doesn't appear to be any other platform in KernelCI showing >> the same issue. > That's a strange error for the changes from my patch. > At first glance I don't see any relation to clk-composite code: > - the call trace doesn't have any references to CCF or rockchip clock drivers > - clk-rk3328.c uses drivers/clk/rockchip/clk-cpu.c to register the CPU > clock which does not use clk-composite > > Chen-Yu has tested this patch (plus [0]) on RK3399 and didn't observe > any problems. > So maybe this is a RK3328 specific issue? > Anyways, I am interested in fixing this issue because reverting is > becoming more and more complex (since I think we're at eight commits > which would need to be reverted in total). > >> Please let us know if you need help debugging the issue or if you >> have a fix to try. > Could you please try [0] which is the second patch in the series which > finally made it upstream. > This second patch is not in 5.15 because I believed that it's only > something to make the code in clk-composite.c more future-proof. It's > not a condition that I am aware of. > > I don't have any Rockchip boards myself. > So I am thankful for any help I can get. > > > Best regards, > Martin > > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git/commit/?h=clk-next&id=6594988fd625ff0d9a8f90f1788e16185358a3e6 > > _______________________________________________ > Linux-rockchip mailing list > Linux-rockchip@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-rockchip > _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites 2021-11-01 21:59 ` Robin Murphy @ 2021-11-01 22:11 ` Robin Murphy 2021-11-01 22:41 ` Alex Bee 0 siblings, 1 reply; 15+ messages in thread From: Robin Murphy @ 2021-11-01 22:11 UTC (permalink / raw) To: Martin Blumenstingl, Guillaume Tucker Cc: sboyd, heiko, knaerzche, linux-arm-kernel, linux-clk, linux-kernel, linux-rockchip, kernelci@groups.io, Collabora Kernel ML, Chen-Yu Tsai On 2021-11-01 21:59, Robin Murphy wrote: > On 2021-11-01 20:58, Martin Blumenstingl wrote: >> Hi Guillaume, >> >> On Mon, Nov 1, 2021 at 9:19 PM Guillaume Tucker >> <guillaume.tucker@collabora.com> wrote: >>> >>> Hi Martin, >>> >>> Please see the bisection report below about a boot failure on >>> rk3328-rock64. >>> >>> Reports aren't automatically sent to the public while we're >>> trialing new bisection features on kernelci.org but this one >>> looks valid. >>> >>> Some more details can be found here: >>> >>> https://linux.kernelci.org/test/case/id/617f11f5c157b666fb3358e6/ >>> >>> Here's what appears to be the cause of the problem: >>> >>> [ 0.033465] CPU: CPUs started in inconsistent modes >>> [ 0.033557] Unexpected kernel BRK exception at EL1 >>> [ 0.034432] Internal error: BRK handler: f2000800 [#1] PREEMPT SMP > > What's weird is that that's really just the same WARN that's also > present in 'successful' logs, except for some reason it's behaving as if > the break handler hasn't been registered, despite that having happened > long before we got to smp_init(). At this point we're also still some > way off getting as far as initcalls, so I'm not sure that the clock > driver would be in the picture at all yet. > > Is the bisection repeatable, or is this just random flakiness misleading > things? I'd also note that you need pretty horrifically broken firmware > to hit that warning in the first place, which might cast a bit of doubt > over the trustworthiness of that board altogether. Ah, on closer inspection it might be entirely repeatable for a given kernel build, but with the behaviour being very sensitive to code/data segment layout changes... ... 23:44:24.457917 Filename '1007060/tftp-deploy-dvdnydcw/kernel/Image'. 23:44:24.460178 Load address: 0x2000000 ... 23:44:27.180962 Bytes transferred = 33681920 (201f200 hex) ... 23:44:27.288135 Filename '1007060/tftp-deploy-dvdnydcw/ramdisk/ramdisk.cpio.gz.uboot'. 23:44:27.288465 Load address: 0x4000000 ... Yeah, that'll be a problem ;) Cheers, Robin. >>> There doesn't appear to be any other platform in KernelCI showing >>> the same issue. >> That's a strange error for the changes from my patch. >> At first glance I don't see any relation to clk-composite code: >> - the call trace doesn't have any references to CCF or rockchip clock >> drivers >> - clk-rk3328.c uses drivers/clk/rockchip/clk-cpu.c to register the CPU >> clock which does not use clk-composite >> >> Chen-Yu has tested this patch (plus [0]) on RK3399 and didn't observe >> any problems. >> So maybe this is a RK3328 specific issue? >> Anyways, I am interested in fixing this issue because reverting is >> becoming more and more complex (since I think we're at eight commits >> which would need to be reverted in total). >> >>> Please let us know if you need help debugging the issue or if you >>> have a fix to try. >> Could you please try [0] which is the second patch in the series which >> finally made it upstream. >> This second patch is not in 5.15 because I believed that it's only >> something to make the code in clk-composite.c more future-proof. It's >> not a condition that I am aware of. >> >> I don't have any Rockchip boards myself. >> So I am thankful for any help I can get. >> >> >> Best regards, >> Martin >> >> >> [0] >> https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git/commit/?h=clk-next&id=6594988fd625ff0d9a8f90f1788e16185358a3e6 >> >> >> _______________________________________________ >> Linux-rockchip mailing list >> Linux-rockchip@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-rockchip >> > > _______________________________________________ > Linux-rockchip mailing list > Linux-rockchip@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-rockchip _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites 2021-11-01 22:11 ` Robin Murphy @ 2021-11-01 22:41 ` Alex Bee 2021-11-02 7:58 ` Guillaume Tucker 0 siblings, 1 reply; 15+ messages in thread From: Alex Bee @ 2021-11-01 22:41 UTC (permalink / raw) To: Robin Murphy, Martin Blumenstingl, Guillaume Tucker Cc: sboyd, heiko, linux-arm-kernel, linux-clk, linux-kernel, linux-rockchip, kernelci@groups.io, Collabora Kernel ML, Chen-Yu Tsai Hi Guillaume, Am 01.11.21 um 23:11 schrieb Robin Murphy: > On 2021-11-01 21:59, Robin Murphy wrote: >> On 2021-11-01 20:58, Martin Blumenstingl wrote: >>> Hi Guillaume, >>> >>> On Mon, Nov 1, 2021 at 9:19 PM Guillaume Tucker >>> <guillaume.tucker@collabora.com> wrote: >>>> >>>> Hi Martin, >>>> >>>> Please see the bisection report below about a boot failure on >>>> rk3328-rock64. >>>> >>>> Reports aren't automatically sent to the public while we're >>>> trialing new bisection features on kernelci.org but this one >>>> looks valid. >>>> >>>> Some more details can be found here: >>>> >>>> https://linux.kernelci.org/test/case/id/617f11f5c157b666fb3358e6/ >>>> >>>> Here's what appears to be the cause of the problem: >>>> >>>> [ 0.033465] CPU: CPUs started in inconsistent modes >>>> [ 0.033557] Unexpected kernel BRK exception at EL1 >>>> [ 0.034432] Internal error: BRK handler: f2000800 [#1] PREEMPT SMP >> >> What's weird is that that's really just the same WARN that's also >> present in 'successful' logs, except for some reason it's behaving as >> if the break handler hasn't been registered, despite that having >> happened long before we got to smp_init(). At this point we're also >> still some way off getting as far as initcalls, so I'm not sure that >> the clock driver would be in the picture at all yet. >> >> Is the bisection repeatable, or is this just random flakiness >> misleading things? I'd also note that you need pretty horrifically >> broken firmware to hit that warning in the first place, which might >> cast a bit of doubt over the trustworthiness of that board altogether. > > Ah, on closer inspection it might be entirely repeatable for a given > kernel build, but with the behaviour being very sensitive to code/data > segment layout changes... > > ... > 23:44:24.457917 Filename '1007060/tftp-deploy-dvdnydcw/kernel/Image'. > 23:44:24.460178 Load address: 0x2000000 > ... > 23:44:27.180962 Bytes transferred = 33681920 (201f200 hex) > ... > 23:44:27.288135 Filename > '1007060/tftp-deploy-dvdnydcw/ramdisk/ramdisk.cpio.gz.uboot'. > 23:44:27.288465 Load address: 0x4000000 > ... could you try updating u-boot to more recent version: the ramdisk address has been moved [1] to 0x06000000 in v2020.01-rc5. I couldn't reproduce this issue with the very same board. [1] https://github.com/u-boot/u-boot/commit/b2e373d16b0345d3c3f4beefdf0889e83faf173d Alex > > Yeah, that'll be a problem ;) > > Cheers, > Robin. > >>>> There doesn't appear to be any other platform in KernelCI showing >>>> the same issue. >>> That's a strange error for the changes from my patch. >>> At first glance I don't see any relation to clk-composite code: >>> - the call trace doesn't have any references to CCF or rockchip clock >>> drivers >>> - clk-rk3328.c uses drivers/clk/rockchip/clk-cpu.c to register the CPU >>> clock which does not use clk-composite >>> >>> Chen-Yu has tested this patch (plus [0]) on RK3399 and didn't observe >>> any problems. >>> So maybe this is a RK3328 specific issue? >>> Anyways, I am interested in fixing this issue because reverting is >>> becoming more and more complex (since I think we're at eight commits >>> which would need to be reverted in total). >>> >>>> Please let us know if you need help debugging the issue or if you >>>> have a fix to try. >>> Could you please try [0] which is the second patch in the series which >>> finally made it upstream. >>> This second patch is not in 5.15 because I believed that it's only >>> something to make the code in clk-composite.c more future-proof. It's >>> not a condition that I am aware of. >>> >>> I don't have any Rockchip boards myself. >>> So I am thankful for any help I can get. >>> >>> >>> Best regards, >>> Martin >>> >>> >>> [0] >>> https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git/commit/?h=clk-next&id=6594988fd625ff0d9a8f90f1788e16185358a3e6 >>> >>> >>> _______________________________________________ >>> Linux-rockchip mailing list >>> Linux-rockchip@lists.infradead.org >>> http://lists.infradead.org/mailman/listinfo/linux-rockchip >>> >> >> _______________________________________________ >> Linux-rockchip mailing list >> Linux-rockchip@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-rockchip _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites 2021-11-01 22:41 ` Alex Bee @ 2021-11-02 7:58 ` Guillaume Tucker 2021-11-02 21:40 ` LABBE Corentin 0 siblings, 1 reply; 15+ messages in thread From: Guillaume Tucker @ 2021-11-02 7:58 UTC (permalink / raw) To: Alex Bee, Robin Murphy, Martin Blumenstingl, Corentin Labbe, Kevin Hilman Cc: sboyd, heiko, linux-arm-kernel, linux-clk, linux-kernel, linux-rockchip, kernelci@groups.io, Collabora Kernel ML, Chen-Yu Tsai +Kevin +Corentin On 01/11/2021 22:41, Alex Bee wrote: > Hi Guillaume, > > Am 01.11.21 um 23:11 schrieb Robin Murphy: >> On 2021-11-01 21:59, Robin Murphy wrote: >>> On 2021-11-01 20:58, Martin Blumenstingl wrote: >>>> Hi Guillaume, >>>> >>>> On Mon, Nov 1, 2021 at 9:19 PM Guillaume Tucker >>>> <guillaume.tucker@collabora.com> wrote: >>>>> >>>>> Hi Martin, >>>>> >>>>> Please see the bisection report below about a boot failure on >>>>> rk3328-rock64. >>>>> >>>>> Reports aren't automatically sent to the public while we're >>>>> trialing new bisection features on kernelci.org but this one >>>>> looks valid. >>>>> >>>>> Some more details can be found here: >>>>> >>>>> https://linux.kernelci.org/test/case/id/617f11f5c157b666fb3358e6/ >>>>> >>>>> Here's what appears to be the cause of the problem: >>>>> >>>>> [ 0.033465] CPU: CPUs started in inconsistent modes >>>>> [ 0.033557] Unexpected kernel BRK exception at EL1 >>>>> [ 0.034432] Internal error: BRK handler: f2000800 [#1] PREEMPT SMP >>> >>> What's weird is that that's really just the same WARN that's also >>> present in 'successful' logs, except for some reason it's behaving as >>> if the break handler hasn't been registered, despite that having >>> happened long before we got to smp_init(). At this point we're also >>> still some way off getting as far as initcalls, so I'm not sure that >>> the clock driver would be in the picture at all yet. >>> >>> Is the bisection repeatable, or is this just random flakiness >>> misleading things? I'd also note that you need pretty horrifically >>> broken firmware to hit that warning in the first place, which might >>> cast a bit of doubt over the trustworthiness of that board altogether. The bisection has checks to avoid false positives, so tests that produce flaky results won't normally lead to a report like this. Then they're manually triaged, and there were 2 separate bisections that landed on this same commit. >> Ah, on closer inspection it might be entirely repeatable for a given >> kernel build, but with the behaviour being very sensitive to code/data >> segment layout changes... >> >> ... >> 23:44:24.457917 Filename '1007060/tftp-deploy-dvdnydcw/kernel/Image'. >> 23:44:24.460178 Load address: 0x2000000 >> ... >> 23:44:27.180962 Bytes transferred = 33681920 (201f200 hex) >> ... >> 23:44:27.288135 Filename >> '1007060/tftp-deploy-dvdnydcw/ramdisk/ramdisk.cpio.gz.uboot'. >> 23:44:27.288465 Load address: 0x4000000 >> ... That is indeed where the remaining false positives are still likely to be coming from, when the infrastructure consistently causes test failures following particular kernel revisions. I don't think there's an easy way to rule those out, but we can try to address them one by one at least. In the case of colliding address ranges in the bootloader, we could add a check with the "good" revision and extra data in the kernel image to make it at least as big as the "bad" revision... > could you try updating u-boot to more recent version: the ramdisk > address has been moved [1] to 0x06000000 in v2020.01-rc5. Thanks for investigating this. The board is in BayLibre's lab. Corentin, Kevin, could you please take a look? Thanks, Guillaume > I couldn't reproduce this issue with the very same board. > > [1] > https://github.com/u-boot/u-boot/commit/b2e373d16b0345d3c3f4beefdf0889e83faf173d > > Alex > >> >> Yeah, that'll be a problem ;) >> >> Cheers, >> Robin. >> >>>>> There doesn't appear to be any other platform in KernelCI showing >>>>> the same issue. >>>> That's a strange error for the changes from my patch. >>>> At first glance I don't see any relation to clk-composite code: >>>> - the call trace doesn't have any references to CCF or rockchip clock >>>> drivers >>>> - clk-rk3328.c uses drivers/clk/rockchip/clk-cpu.c to register the CPU >>>> clock which does not use clk-composite >>>> >>>> Chen-Yu has tested this patch (plus [0]) on RK3399 and didn't observe >>>> any problems. >>>> So maybe this is a RK3328 specific issue? >>>> Anyways, I am interested in fixing this issue because reverting is >>>> becoming more and more complex (since I think we're at eight commits >>>> which would need to be reverted in total). >>>> >>>>> Please let us know if you need help debugging the issue or if you >>>>> have a fix to try. >>>> Could you please try [0] which is the second patch in the series which >>>> finally made it upstream. >>>> This second patch is not in 5.15 because I believed that it's only >>>> something to make the code in clk-composite.c more future-proof. It's >>>> not a condition that I am aware of. >>>> >>>> I don't have any Rockchip boards myself. >>>> So I am thankful for any help I can get. >>>> >>>> >>>> Best regards, >>>> Martin >>>> >>>> >>>> [0] >>>> https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git/commit/?h=clk-next&id=6594988fd625ff0d9a8f90f1788e16185358a3e6 >>>> >>>> >>>> _______________________________________________ >>>> Linux-rockchip mailing list >>>> Linux-rockchip@lists.infradead.org >>>> http://lists.infradead.org/mailman/listinfo/linux-rockchip >>>> >>> >>> _______________________________________________ >>> Linux-rockchip mailing list >>> Linux-rockchip@lists.infradead.org >>> http://lists.infradead.org/mailman/listinfo/linux-rockchip _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites 2021-11-02 7:58 ` Guillaume Tucker @ 2021-11-02 21:40 ` LABBE Corentin 0 siblings, 0 replies; 15+ messages in thread From: LABBE Corentin @ 2021-11-02 21:40 UTC (permalink / raw) To: Guillaume Tucker Cc: Alex Bee, Robin Murphy, Martin Blumenstingl, Kevin Hilman, sboyd, heiko, linux-arm-kernel, linux-clk, linux-kernel, linux-rockchip, kernelci@groups.io, Collabora Kernel ML, Chen-Yu Tsai Le Tue, Nov 02, 2021 at 07:58:42AM +0000, Guillaume Tucker a écrit : > +Kevin +Corentin > > On 01/11/2021 22:41, Alex Bee wrote: > > Hi Guillaume, > > > > Am 01.11.21 um 23:11 schrieb Robin Murphy: > >> On 2021-11-01 21:59, Robin Murphy wrote: > >>> On 2021-11-01 20:58, Martin Blumenstingl wrote: > >>>> Hi Guillaume, > >>>> > >>>> On Mon, Nov 1, 2021 at 9:19 PM Guillaume Tucker > >>>> <guillaume.tucker@collabora.com> wrote: > >>>>> > >>>>> Hi Martin, > >>>>> > >>>>> Please see the bisection report below about a boot failure on > >>>>> rk3328-rock64. > >>>>> > >>>>> Reports aren't automatically sent to the public while we're > >>>>> trialing new bisection features on kernelci.org but this one > >>>>> looks valid. > >>>>> > >>>>> Some more details can be found here: > >>>>> > >>>>> https://linux.kernelci.org/test/case/id/617f11f5c157b666fb3358e6/ > >>>>> > >>>>> Here's what appears to be the cause of the problem: > >>>>> > >>>>> [ 0.033465] CPU: CPUs started in inconsistent modes > >>>>> [ 0.033557] Unexpected kernel BRK exception at EL1 > >>>>> [ 0.034432] Internal error: BRK handler: f2000800 [#1] PREEMPT SMP > >>> > >>> What's weird is that that's really just the same WARN that's also > >>> present in 'successful' logs, except for some reason it's behaving as > >>> if the break handler hasn't been registered, despite that having > >>> happened long before we got to smp_init(). At this point we're also > >>> still some way off getting as far as initcalls, so I'm not sure that > >>> the clock driver would be in the picture at all yet. > >>> > >>> Is the bisection repeatable, or is this just random flakiness > >>> misleading things? I'd also note that you need pretty horrifically > >>> broken firmware to hit that warning in the first place, which might > >>> cast a bit of doubt over the trustworthiness of that board altogether. > > The bisection has checks to avoid false positives, so tests that > produce flaky results won't normally lead to a report like this. > Then they're manually triaged, and there were 2 separate > bisections that landed on this same commit. > > >> Ah, on closer inspection it might be entirely repeatable for a given > >> kernel build, but with the behaviour being very sensitive to code/data > >> segment layout changes... > >> > >> ... > >> 23:44:24.457917 Filename '1007060/tftp-deploy-dvdnydcw/kernel/Image'. > >> 23:44:24.460178 Load address: 0x2000000 > >> ... > >> 23:44:27.180962 Bytes transferred = 33681920 (201f200 hex) > >> ... > >> 23:44:27.288135 Filename > >> '1007060/tftp-deploy-dvdnydcw/ramdisk/ramdisk.cpio.gz.uboot'. > >> 23:44:27.288465 Load address: 0x4000000 > >> ... > > That is indeed where the remaining false positives are still > likely to be coming from, when the infrastructure consistently > causes test failures following particular kernel revisions. I > don't think there's an easy way to rule those out, but we can try > to address them one by one at least. > > In the case of colliding address ranges in the bootloader, we > could add a check with the "good" revision and extra data in the > kernel image to make it at least as big as the "bad" revision... > > > could you try updating u-boot to more recent version: the ramdisk > > address has been moved [1] to 0x06000000 in v2020.01-rc5. > > Thanks for investigating this. The board is in BayLibre's lab. > > Corentin, Kevin, could you please take a look? > Hello I tried to update uboot on it but failed for today. I found only how to flash sdcard (doiing it remotly), but the board boots SPI first (and I saw no documentation on how to flash SPI). I need to have physical access to change this. So probably later this week. Regards _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default 2021-10-14 21:34 ` Martin Blumenstingl 2021-10-14 22:52 ` Stephen Boyd @ 2021-10-15 19:14 ` Alex Bee 2021-10-15 21:31 ` Stephen Boyd 1 sibling, 1 reply; 15+ messages in thread From: Alex Bee @ 2021-10-15 19:14 UTC (permalink / raw) To: Martin Blumenstingl Cc: linux-clk, sboyd, linux-arm-kernel, linux-kernel, Heiko Stübner, linux-rockchip Am 14.10.21 um 23:34 schrieb Martin Blumenstingl: > On Thu, Oct 14, 2021 at 2:11 PM Martin Blumenstingl > <martin.blumenstingl@googlemail.com> wrote: > [...] >>> Reverting this commit makes it work again: Unless there is a quick and >>> obvious fix for that, I guess this should be done for 5.15 - even if the >>> real issue is somewhere else. >> Reverting this patch is fine from the Amlogic SoC point of view. >> The main goal was to clean up / improve the CCF code. >> Nothing (that I am aware of) is going to break in Amlogic land if we >> revert this. > Unfortunately only now I realized that reverting this patch would also > require reverting the other five patches in this series (since they > depend on this one). Indeed, that whole series would need have to reverted, which is clear (to me) when looking at the patches. > For this reason I propose changing the order of the checks in > clk-composite.c - see the attached patch (which I can send as a proper > one once agreed that this is the way to go forward) Yes, your patch papers over the actual issue (no best parent-selection in case determine_rate is used) and fixes it for now - as expected. But it is not a long-term solution, as we will be at the same point, as soon as round_rate gets removed from clk-divider. For me, who is semi-knowledged in clock-framework, it was relatively hard to figure out what was going on. "I'll do this later" has often been heard here. Though, I don't fully follow why moving the priority of determine_rate lower would have been necessary anyways: from what I understand determine_rate is expected to be the future-replacement of round_rate everywhere and should be preferred. I guess it's up to the maintainers on how to proceed. Alex > Off-list Alex also suggested that I should use rate_ops.determine_rate > if available. > While I agree that this makes sense in general my plan is to do this > in a follow-up patch. > Changing the order of the conditions is needed anyways and it *should* > fix the issue reported here (but I have no way of testing that > unfortunately). > > Alex, it would be great if you (or someone with Rockchip boards) could > test the attached patch and let me know if it fixes the reported > problem. > > > Best regards, > Martin _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default 2021-10-15 19:14 ` [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default Alex Bee @ 2021-10-15 21:31 ` Stephen Boyd 0 siblings, 0 replies; 15+ messages in thread From: Stephen Boyd @ 2021-10-15 21:31 UTC (permalink / raw) To: Alex Bee, Martin Blumenstingl Cc: linux-clk, linux-arm-kernel, linux-kernel, Heiko Stübner, linux-rockchip Quoting Alex Bee (2021-10-15 12:14:36) > > Am 14.10.21 um 23:34 schrieb Martin Blumenstingl: > > On Thu, Oct 14, 2021 at 2:11 PM Martin Blumenstingl > > <martin.blumenstingl@googlemail.com> wrote: > > [...] > >>> Reverting this commit makes it work again: Unless there is a quick and > >>> obvious fix for that, I guess this should be done for 5.15 - even if the > >>> real issue is somewhere else. > >> Reverting this patch is fine from the Amlogic SoC point of view. > >> The main goal was to clean up / improve the CCF code. > >> Nothing (that I am aware of) is going to break in Amlogic land if we > >> revert this. > > Unfortunately only now I realized that reverting this patch would also > > require reverting the other five patches in this series (since they > > depend on this one). > Indeed, that whole series would need have to reverted, which is clear > (to me) when looking at the patches. > > For this reason I propose changing the order of the checks in > > clk-composite.c - see the attached patch (which I can send as a proper > > one once agreed that this is the way to go forward) > > Yes, your patch papers over the actual issue (no best parent-selection > in case determine_rate is used) and fixes it for now - as expected. > > But it is not a long-term solution, as we will be at the same point, as > soon as round_rate gets removed from clk-divider. For me, who is > semi-knowledged in clock-framework, it was relatively hard to figure out > what was going on. "I'll do this later" has often been heard here. > > Though, I don't fully follow why moving the priority of determine_rate > lower would have been necessary anyways: from what I understand > determine_rate is expected to be the future-replacement of round_rate > everywhere and should be preferred. This is the only place I can see where we would have to keep round_rate around, to mesh together rate rounding with parent selection. We can probably stop using round_rate in the framework and only use it in the composite code. It's not a big deal but it's sort of annoying that we won't be able to fully remove round_rate from the clk_ops without resolving this. Long term it may make sense to get rid of the composite clk code entirely. It's pretty confusing code to read and encourages use of the "basic clk types" which I'd like to see be used via direct function calls instead of through clk_ops structures. _______________________________________________ Linux-rockchip mailing list Linux-rockchip@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-rockchip ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2021-11-02 21:40 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20210702225145.2643303-1-martin.blumenstingl@googlemail.com>
[not found] ` <20210702225145.2643303-2-martin.blumenstingl@googlemail.com>
2021-10-14 9:55 ` [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default Alex Bee
2021-10-14 12:11 ` Martin Blumenstingl
2021-10-14 21:34 ` Martin Blumenstingl
2021-10-14 22:52 ` Stephen Boyd
2021-10-15 12:05 ` [PATCH] clk: composite: Also consider .determine_rate for rate + mux composites Martin Blumenstingl
2021-10-15 21:27 ` Stephen Boyd
2021-11-01 20:19 ` Guillaume Tucker
2021-11-01 20:58 ` Martin Blumenstingl
2021-11-01 21:59 ` Robin Murphy
2021-11-01 22:11 ` Robin Murphy
2021-11-01 22:41 ` Alex Bee
2021-11-02 7:58 ` Guillaume Tucker
2021-11-02 21:40 ` LABBE Corentin
2021-10-15 19:14 ` [PATCH v1 1/6] clk: divider: Implement and wire up .determine_rate by default Alex Bee
2021-10-15 21:31 ` Stephen Boyd
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox