* [PATCH 0/2] arm64: Fix issues with CPU hotplug and RCU
@ 2020-11-06 10:36 Will Deacon
  2020-11-06 10:36 ` [PATCH 1/2] arm64: psci: Avoid printing in cpu_psci_cpu_die() Will Deacon
  2020-11-06 10:36 ` [PATCH 2/2] arm64: smp: Tell RCU about CPUs that fail to come online Will Deacon
  0 siblings, 2 replies; 4+ messages in thread
From: Will Deacon @ 2020-11-06 10:36 UTC
  To: linux-arm-kernel
  Cc: Paul E. McKenney, Qian Cai, kernel-team, linux-kernel,
	Catalin Marinas, Will Deacon

Hi folks,

Here are a couple of patches following on from:

  https://lore.kernel.org/r/20201105222242.GA8842@willie-the-truck

which address issues when CPU onlining fails but RCU is left none the
wiser. Tested under QEMU.

If Paul is happy with the second patch, then I can take both of these
via arm64 as fixes for 5.10.

Cheers,

Will

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Qian Cai <cai@redhat.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>

--->8

Will Deacon (2):
  arm64: psci: Avoid printing in cpu_psci_cpu_die()
  arm64: smp: Tell RCU about CPUs that fail to come online

 arch/arm64/kernel/psci.c | 2 --
 arch/arm64/kernel/smp.c  | 1 +
 kernel/rcu/tree.c        | 2 +-
 3 files changed, 2 insertions(+), 3 deletions(-)

-- 
2.29.1.341.ge80a0c044ae-goog

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

* [PATCH 1/2] arm64: psci: Avoid printing in cpu_psci_cpu_die()
  2020-11-06 10:36 [PATCH 0/2] arm64: Fix issues with CPU hotplug and RCU Will Deacon
@ 2020-11-06 10:36 ` Will Deacon
  2020-11-06 10:36 ` [PATCH 2/2] arm64: smp: Tell RCU about CPUs that fail to come online Will Deacon
  1 sibling, 0 replies; 4+ messages in thread
From: Will Deacon @ 2020-11-06 10:36 UTC
  To: linux-arm-kernel
  Cc: Paul E. McKenney, Qian Cai, kernel-team, linux-kernel,
	Catalin Marinas, Will Deacon

cpu_psci_cpu_die() is called in the context of the dying CPU, which
will no longer be online or tracked by RCU. It is therefore not generally
safe to call printk() if the PSCI "cpu off" request fails, so remove the
pr_crit() invocation.

Cc: Qian Cai <cai@redhat.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/psci.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index 43ae4e0c968f..6a4f3e37c3b4 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -75,8 +75,6 @@ static void cpu_psci_cpu_die(unsigned int cpu)
 		    PSCI_0_2_POWER_STATE_TYPE_SHIFT;
 
 	ret = psci_ops.cpu_off(state);
-
-	pr_crit("unable to power off CPU%u (%d)\n", cpu, ret);
 }
 
 static int cpu_psci_cpu_kill(unsigned int cpu)
-- 
2.29.1.341.ge80a0c044ae-goog

* [PATCH 2/2] arm64: smp: Tell RCU about CPUs that fail to come online
  2020-11-06 10:36 [PATCH 0/2] arm64: Fix issues with CPU hotplug and RCU Will Deacon
  2020-11-06 10:36 ` [PATCH 1/2] arm64: psci: Avoid printing in cpu_psci_cpu_die() Will Deacon
@ 2020-11-06 10:36 ` Will Deacon
  2020-11-06 14:43   ` Paul E. McKenney
  1 sibling, 1 reply; 4+ messages in thread
From: Will Deacon @ 2020-11-06 10:36 UTC
  To: linux-arm-kernel
  Cc: Paul E. McKenney, Qian Cai, kernel-team, linux-kernel,
	Catalin Marinas, Will Deacon

Commit ce3d31ad3cac ("arm64/smp: Move rcu_cpu_starting() earlier") ensured
that RCU is informed early about incoming CPUs that might end up calling
into printk() before they are online. However, if such a CPU fails the
early CPU feature compatibility checks in check_local_cpu_capabilities(),
then it will be powered off or parked without informing RCU, leading to
an endless stream of stalls:

  | rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  | rcu: 2-O...: (0 ticks this GP) idle=002/1/0x4000000000000000 softirq=0/0 fqs=2593
  | (detected by 0, t=5252 jiffies, g=9317, q=136)
  | Task dump for CPU 2:
  | task:swapper/2 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x00000028
  | Call trace:
  | ret_from_fork+0x0/0x30

Ensure that the dying CPU invokes rcu_report_dead() prior to being powered
off or parked.

Cc: Qian Cai <cai@redhat.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Suggested-by: Qian Cai <cai@redhat.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/smp.c | 1 +
 kernel/rcu/tree.c       | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 09c96f57818c..18e9727d3f64 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -413,6 +413,7 @@ void cpu_die_early(void)
 
 	/* Mark this CPU absent */
 	set_cpu_present(cpu, 0);
+	rcu_report_dead(cpu);
 
 	if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
 		update_cpu_boot_status(CPU_KILL_ME);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2a52f42f64b6..bd04b09b84b3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4077,7 +4077,6 @@ void rcu_cpu_starting(unsigned int cpu)
 	smp_mb(); /* Ensure RCU read-side usage follows above initialization. */
 }
 
-#ifdef CONFIG_HOTPLUG_CPU
 /*
  * The outgoing function has no further need of RCU, so remove it from
  * the rcu_node tree's ->qsmaskinitnext bit masks.
@@ -4117,6 +4116,7 @@ void rcu_report_dead(unsigned int cpu)
 	rdp->cpu_started = false;
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
 /*
  * The outgoing CPU has just passed through the dying-idle state, and we
  * are being invoked from the CPU that was IPIed to continue the offline
-- 
2.29.1.341.ge80a0c044ae-goog

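For readers who want to see why an unreported CPU produces the stalls described in the commit message above, here is a minimal userspace sketch. It is a toy model only: every toy_* name is hypothetical, the bitmask bookkeeping is a drastic simplification, and none of this is the kernel's RCU implementation or API. It just illustrates the idea that a grace period waits for every tracked CPU, so a CPU that was registered early and then parked must be removed from the tracked set or the grace period can never finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: hypothetical names, not the kernel's RCU data structures. */
struct toy_rcu {
	uint32_t tracked;	/* CPUs a grace period (GP) must wait for */
	uint32_t pending;	/* CPUs that still owe a quiescent state */
};

/* Loosely analogous to rcu_cpu_starting(): start tracking a CPU. */
static void toy_cpu_starting(struct toy_rcu *r, unsigned int cpu)
{
	assert(cpu < 32);
	r->tracked |= UINT32_C(1) << cpu;
}

/* Loosely analogous to rcu_report_dead(): stop tracking a CPU. */
static void toy_report_dead(struct toy_rcu *r, unsigned int cpu)
{
	uint32_t bit;

	assert(cpu < 32);
	bit = UINT32_C(1) << cpu;
	r->tracked &= ~bit;
	r->pending &= ~bit;	/* a dead CPU no longer blocks the GP */
}

/* Begin a grace period: every tracked CPU owes a quiescent state. */
static void toy_gp_start(struct toy_rcu *r)
{
	r->pending = r->tracked;
}

/* A running CPU reports a quiescent state. */
static void toy_report_qs(struct toy_rcu *r, unsigned int cpu)
{
	assert(cpu < 32);
	r->pending &= ~(UINT32_C(1) << cpu);
}

static bool toy_gp_done(const struct toy_rcu *r)
{
	return r->pending == 0;
}

/*
 * CPU 2 is registered early, fails its (imaginary) feature checks and is
 * parked. Returns true if a later grace period is able to complete.
 */
static bool toy_scenario(bool report_dead)
{
	struct toy_rcu r = { 0, 0 };

	toy_cpu_starting(&r, 0);
	toy_cpu_starting(&r, 2);
	if (report_dead)
		toy_report_dead(&r, 2);

	/* CPU 2 never runs again; only CPU 0 can report. */
	toy_gp_start(&r);
	toy_report_qs(&r, 0);
	return toy_gp_done(&r);
}
```

In this model, toy_scenario(false) leaves CPU 2's bit set forever, which is the toy equivalent of the repeated stall warnings on CPU 2, while toy_scenario(true) lets the grace period complete.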
* Re: [PATCH 2/2] arm64: smp: Tell RCU about CPUs that fail to come online
  2020-11-06 10:36 ` [PATCH 2/2] arm64: smp: Tell RCU about CPUs that fail to come online Will Deacon
@ 2020-11-06 14:43   ` Paul E. McKenney
  0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2020-11-06 14:43 UTC
  To: Will Deacon
  Cc: Qian Cai, Catalin Marinas, kernel-team, linux-kernel, linux-arm-kernel

On Fri, Nov 06, 2020 at 10:36:02AM +0000, Will Deacon wrote:
> Commit ce3d31ad3cac ("arm64/smp: Move rcu_cpu_starting() earlier") ensured
> that RCU is informed early about incoming CPUs that might end up calling
> into printk() before they are online. However, if such a CPU fails the
> early CPU feature compatibility checks in check_local_cpu_capabilities(),
> then it will be powered off or parked without informing RCU, leading to
> an endless stream of stalls:
> 
>   | rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>   | rcu: 2-O...: (0 ticks this GP) idle=002/1/0x4000000000000000 softirq=0/0 fqs=2593
>   | (detected by 0, t=5252 jiffies, g=9317, q=136)
>   | Task dump for CPU 2:
>   | task:swapper/2 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x00000028
>   | Call trace:
>   | ret_from_fork+0x0/0x30
> 
> Ensure that the dying CPU invokes rcu_report_dead() prior to being powered
> off or parked.
> 
> Cc: Qian Cai <cai@redhat.com>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Suggested-by: Qian Cai <cai@redhat.com>
> Signed-off-by: Will Deacon <will@kernel.org>

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  arch/arm64/kernel/smp.c | 1 +
>  kernel/rcu/tree.c       | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 09c96f57818c..18e9727d3f64 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -413,6 +413,7 @@ void cpu_die_early(void)
>  
>  	/* Mark this CPU absent */
>  	set_cpu_present(cpu, 0);
> +	rcu_report_dead(cpu);
>  
>  	if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
>  		update_cpu_boot_status(CPU_KILL_ME);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 2a52f42f64b6..bd04b09b84b3 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4077,7 +4077,6 @@ void rcu_cpu_starting(unsigned int cpu)
>  	smp_mb(); /* Ensure RCU read-side usage follows above initialization. */
>  }
>  
> -#ifdef CONFIG_HOTPLUG_CPU
>  /*
>   * The outgoing function has no further need of RCU, so remove it from
>   * the rcu_node tree's ->qsmaskinitnext bit masks.
> @@ -4117,6 +4116,7 @@ void rcu_report_dead(unsigned int cpu)
>  	rdp->cpu_started = false;
>  }
>  
> +#ifdef CONFIG_HOTPLUG_CPU
>  /*
>   * The outgoing CPU has just passed through the dying-idle state, and we
>   * are being invoked from the CPU that was IPIed to continue the offline
> -- 
> 2.29.1.341.ge80a0c044ae-goog
> 