From mboxrd@z Thu Jan  1 00:00:00 1970
From: heiko@sntech.de (Heiko Stübner)
Date: Fri, 05 Jun 2015 10:45:50 +0200
Subject: [PATCH 1/3] ARM: rockchip: fix the CPU soft reset
In-Reply-To: <1433479677-18086-2-git-send-email-wxt@rock-chips.com>
References: <1433479677-18086-1-git-send-email-wxt@rock-chips.com>
 <1433479677-18086-2-git-send-email-wxt@rock-chips.com>
Message-ID: <1604172.FydsUUUXNY@diego>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

Hi Caesar,

thanks for investigating this.

On Friday, 5 June 2015, 12:47:55, Caesar Wang wrote:
> In general, the correct flow is:
>
> cpu off:
>     reset_control_assert
>     regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
>
> cpu on:
>     regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
>     reset_control_deassert
>
> You can reproduce it by bringing the CPUs up and down repeatedly,
> e.g. with this test script:
>
> cd /sys/devices/system/cpu/
> for i in $(seq 1000); do
>     echo "================= $i ============"
>     for j in $(seq 100); do
>         while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "000" ]]; do
>             echo 0 > cpu1/online
>             echo 0 > cpu2/online
>             echo 0 > cpu3/online
>         done
>         while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "111" ]]; do
>             echo 1 > cpu1/online
>             echo 1 > cpu2/online
>             echo 1 > cpu3/online
>         done
>     done
> done
>
> The following log shows the failure:
> [34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
> [34466.186824] Disabling non-boot CPUs ...
> [34466.187509] CPU1: shutdown
> [34466.188672] CPU2: shutdown
> [34473.736627] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
> [34473.736646] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.14.0 #1
> [34473.736687] [] (unwind_backtrace) from [] (show_stack+0x20/0x24)
> [34473.736711] [] (show_stack) from [] (dump_stack+0x70/0x8c)
> [34473.736731] [] (dump_stack) from [] (panic+0xa8/0x1fc)
> [34473.736754] [] (panic) from [] (watchdog_timer_fn+0x234/0x26c)
> [34473.736777] [] (watchdog_timer_fn) from [] (__run_hrtimer+0x118/0x1e0)
> [34473.736797] [] (__run_hrtimer) from [] (hrtimer_interrupt+0x148/0x2a0)
> [34473.736820] [] (hrtimer_interrupt) from [] (arch_timer_handler_phys+0x38/0x48)
> [34473.736844] [] (arch_timer_handler_phys) from [] (handle_percpu_devid_irq+0xb8/0x124)
> [34473.736867] [] (handle_percpu_devid_irq) from [] (generic_handle_irq+0x30/0x40)
> [34473.736887] [] (generic_handle_irq) from [] (__handle_domain_irq+0x8c/0xb0)
> [34473.736905] [] (__handle_domain_irq) from [] (gic_handle_irq+0x48/0x6c)
> [34473.736922] [] (gic_handle_irq) from [] (__irq_svc+0x40/0x50)
> [34473.736936] Exception stack(0xee127f70 to 0xee127fb8)
> [34473.736948] 7f60:                                     ffffffed 00000000 2dd6d000 00000000
> [34473.736964] 7f80: ee126000 00000015 c0b46bac c0b46bac 0000406a 410fc0d1 00000000 ee127fc4
> [34473.736979] 7fa0: ee127fb8 ee127fb8 c0107038 c010703c 600f0013 ffffffff
> [34473.736995] [] (__irq_svc) from [] (arch_cpu_idle+0x40/0x48)
> [34473.737013] [] (arch_cpu_idle) from [] (cpu_startup_entry+0x170/0x1d0)
> [34473.737031] [] (cpu_startup_entry) from [] (secondary_start_kernel+0x138/0x160)
> [34473.737059] [] (secondary_start_kernel) from [<00100464>] (0x100464)
> [34474.903740] SMP: failed to stop secondary CPUs
> [34476.099964] SMP: failed to stop secondary CPUs
> ...
>
> Signed-off-by: Caesar Wang
> ---
>
>  arch/arm/mach-rockchip/platsmp.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
> index 5b4ca3c..1230d3d 100644
> --- a/arch/arm/mach-rockchip/platsmp.c
> +++ b/arch/arm/mach-rockchip/platsmp.c
> @@ -88,20 +88,17 @@ static int pmu_set_power_domain(int pd, bool on)
>  			return PTR_ERR(rstc);
>  		}
>
> -		if (on)
> +		if (on) {
> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
>  			reset_control_deassert(rstc);
> -		else
> +		} else {
>  			reset_control_assert(rstc);
> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> +		}

You're losing the return value of regmap_update_bits here; I guess it
should look like below?

		if (on) {
			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
			if (ret < 0) {
				pr_err("%s: could not update power domain\n",
				       __func__);
				reset_control_put(rstc);
				return ret;
			}
			reset_control_deassert(rstc);
		} else {
			reset_control_assert(rstc);
			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
			if (ret < 0) {
				pr_err("%s: could not update power domain\n",
				       __func__);
				reset_control_put(rstc);
				return ret;
			}
		}

>
>  		reset_control_put(rstc);
>  	}
>
> -	ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> -	if (ret < 0) {
> -		pr_err("%s: could not update power domain\n", __func__);
> -		return ret;
> -	}
> -
>  	ret = -1;
>  	while (ret != on) {
>  		ret = pmu_power_domain_is_on(pd);

Second question: with this patch, what actually happens is

cpu off:
    reset_control_assert
    regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
    wait_for_power_domain_to_turn_off

cpu on:
    regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
    reset_control_deassert
    wait_for_power_domain_to_turn_on

So shouldn't the deassertion of the reset happen after the power domain
has successfully turned on? Like

cpu on:
    regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
    wait_for_power_domain_to_turn_on
    reset_control_deassert
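
To make the ordering in that last sequence concrete, here is a small user-space model of it. This is only a sketch and not kernel code: struct mock_domain, the mock_* helpers and sketch_set_power_domain() are hypothetical stand-ins for regmap_update_bits(), pmu_power_domain_is_on() and the reset_control_*() calls, and the "domain switches after a few status polls" behaviour is an assumption made purely for illustration. The bad_release flag records whether the reset was ever released while the domain still reported "off", which is the case the second question is worried about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock of one CPU power domain; not a kernel structure. */
struct mock_domain {
	bool pwrdn_req;     /* models BIT(pd) in PMU_PWRDN_CON: true = power down */
	bool powered;       /* models what pmu_power_domain_is_on() would report */
	bool in_reset;      /* models the per-core reset line */
	int  settle_polls;  /* assumed: status polls before the domain switches */
	int  polls_pending; /* countdown for the transition in flight */
	int  regmap_err;    /* negative value: injected register-write error */
	bool bad_release;   /* set if reset was released while unpowered */
};

/* Stand-in for regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val). */
static int mock_update_pwrdn(struct mock_domain *d, bool power_down)
{
	if (d->regmap_err < 0)
		return d->regmap_err;
	d->pwrdn_req = power_down;
	d->polls_pending = d->settle_polls;
	return 0;
}

/* Stand-in for pmu_power_domain_is_on(): state flips after a few polls. */
static bool mock_domain_is_on(struct mock_domain *d)
{
	if (d->polls_pending > 0)
		d->polls_pending--;
	else
		d->powered = !d->pwrdn_req;
	return d->powered;
}

static void mock_reset_assert(struct mock_domain *d)
{
	d->in_reset = true;
}

static void mock_reset_deassert(struct mock_domain *d)
{
	if (!d->powered)
		d->bad_release = true; /* core released before power is up */
	d->in_reset = false;
}

/*
 * Ordering from the second question:
 *   off: assert reset -> request power down -> wait for "off"
 *   on:  request power up -> wait for "on" -> deassert reset
 * The register-write error is propagated instead of being dropped.
 */
static int sketch_set_power_domain(struct mock_domain *d, bool on)
{
	int ret;

	assert(d != NULL);

	if (!on)
		mock_reset_assert(d);

	ret = mock_update_pwrdn(d, !on);
	if (ret < 0)
		return ret;

	while (mock_domain_is_on(d) != on)
		; /* the mock always settles, so this loop terminates */

	if (on)
		mock_reset_deassert(d);

	return 0;
}
```

In this model, moving the mock_reset_deassert() call in front of the wait loop sets bad_release on every power-up, because the domain still reports "off" at that point. Whether the real hardware misbehaves in that window is of course something only the SoC documentation or testing can tell.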