From: Suzuki.Poulose@arm.com (Suzuki K Poulose)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel
Date: Fri, 17 Jun 2016 12:16:59 +0100
Message-ID: <5763DC2B.5030705@arm.com>
In-Reply-To: <20160617102713.GA14524@leverpostej>
On 17/06/16 11:27, Mark Rutland wrote:
> On Fri, Jun 17, 2016 at 10:34:56AM +0100, James Morse wrote:
>> kernel/smp.c has a fancy counter that keeps track of the number of CPUs
>> it marked as not-present and left in cpu_park_loop(). If there are any
>> CPUs spinning in here, features like kexec or hibernate may release them
>> by overwriting this memory.
>>
>> This problem also occurs on machines using spin-tables to release
>> secondary cores.
>> After commit 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N")
>> we bring all known cpus into the secondary holding pen, but may not bring
>> them up depending on 'maxcpus'. This memory can't be re-used by kexec
>> or hibernate.
>>
>> Add a function cpus_are_stuck_in_kernel() to determine if either of these
>> cases has occurred.
>>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>> arch/arm64/include/asm/smp.h | 20 ++++++++++++++++++++
>> arch/arm64/kernel/smp.c | 13 +++++++++++++
>> 2 files changed, 33 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
>> index 433e50405274..4be755bcc07a 100644
>> --- a/arch/arm64/include/asm/smp.h
>> +++ b/arch/arm64/include/asm/smp.h
>> @@ -124,6 +124,26 @@ static inline void cpu_panic_kernel(void)
>> cpu_park_loop();
>> }
>>
>> +/*
>> + * Kernel features such as hibernate and kexec depend on cpu hotplug to know
>> + * they can replace any kernel memory they are not using themselves.
>> + *
>> + * There are two corner cases:
>> + * If a secondary CPU fails to come online, (e.g. due to mismatched features),
>> + * it will try to call cpu_die(). If this fails, it increases the counter
>> + * cpus_stuck_in_kernel and sits in cpu_park_loop(). The memory containing
>> + * this function must not be re-used for anything else as the 'stuck' core
>> + * is executing it.
>
> It might also be stuck in __no_granule_support, if it never made it to C
> code. In that case, the CPU in charge of bringing up that new CPU will
> increment the counter in __cpu_up.
Just to clarify: *in all cases*, it is the CPU in charge of the bring-up that
updates cpus_stuck_in_kernel.
>
> There might be other reasons we do something like that in future, so it
> might be better to be a little less specific and say something like:
>
> If a secondary CPU enters the kernel but fails to come online,
> (e.g. due to mismatched features), and cannot exit the kernel,
> we increment cpus_stuck_in_kernel and leave the CPU in a
> quiescent loop within the kernel text. The memory containing
> this loop must not be re-used for anything else as the 'stuck'
> core is executing it.
Agree.
>> +bool cpus_are_stuck_in_kernel(void);
>> +
>> #endif /* ifndef __ASSEMBLY__ */
>>
>> #endif /* ifndef __ASM_SMP_H */
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index 678e0842cb3b..e197502f94fd 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -909,3 +909,16 @@ int setup_profiling_timer(unsigned int multiplier)
>> {
>> return -EINVAL;
>> }
>> +
>> +bool cpus_are_stuck_in_kernel(void)
>> +{
>> +	bool ret = !!cpus_stuck_in_kernel;
>> +#ifdef CONFIG_HOTPLUG_CPU
>> +	int any_cpu = raw_smp_processor_id();
>> +
>> +	if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die)
>> +		ret = true;
>> +#endif
Minor nit: moving the cpu_die check into a static inline function with
an obvious name might make the code easier to read. Perhaps something like:

	return !!cpus_stuck_in_kernel || !have_cpu_die();
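
To illustrate the nit, here is a rough userspace mock, not the kernel code.
have_cpu_die() is the helper name suggested above; the cut-down
struct cpu_operations, the mock_* names and the fixed CPU number are
assumptions made up so that the sketch builds outside the kernel. It keeps
the num_possible_cpus() check from the patch in the caller.

```c
/* Userspace mock for illustration only; this is not the arm64 kernel code. */
#include <stdbool.h>
#include <stddef.h>

#define CONFIG_HOTPLUG_CPU 1	/* assumed enabled for this mock */
#define NR_CPUS 4		/* hypothetical CPU count */

/* Cut-down stand-in for struct cpu_operations: only the hook checked here. */
struct cpu_operations {
	void (*cpu_die)(unsigned int cpu);
};

static void mock_cpu_die(unsigned int cpu) { (void)cpu; }

/* Hypothetical enable methods: one can take a CPU out of the kernel, one cannot. */
const struct cpu_operations mock_ops_with_die = { .cpu_die = mock_cpu_die };
const struct cpu_operations mock_ops_without_die = { .cpu_die = NULL };

/* Mock state standing in for the kernel's globals and helpers. */
const struct cpu_operations *cpu_ops[NR_CPUS];
int cpus_stuck_in_kernel;
unsigned int mock_possible_cpus = NR_CPUS;

static unsigned int num_possible_cpus(void) { return mock_possible_cpus; }
static int raw_smp_processor_id(void) { return 0; }

/* True if the current CPU's enable method provides a cpu_die hook. */
static inline bool have_cpu_die(void)
{
#ifdef CONFIG_HOTPLUG_CPU
	int any_cpu = raw_smp_processor_id();

	/* The NULL check on cpu_ops[] is only needed because the mock array may be empty. */
	if (cpu_ops[any_cpu] && cpu_ops[any_cpu]->cpu_die)
		return true;
#endif
	return false;
}

bool cpus_are_stuck_in_kernel(void)
{
	/* With CONFIG_HOTPLUG_CPU set, this is the patch's condition via the helper. */
	bool no_die = num_possible_cpus() > 1 && !have_cpu_die();

	return !!cpus_stuck_in_kernel || no_die;
}
```

One thing to watch: without CONFIG_HOTPLUG_CPU this helper returns false, so
any multi-CPU system would be reported as stuck, whereas the patch as posted
skips the check in that configuration.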
Either way,
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cheers
Suzuki
>> +
>> +	return ret;
>> +}
>> --
>> 2.8.0.rc3
>>
>