public inbox for linux-arm-kernel@lists.infradead.org
From: mark.rutland@arm.com (Mark Rutland)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel
Date: Fri, 17 Jun 2016 11:27:13 +0100
Message-ID: <20160617102713.GA14524@leverpostej>
In-Reply-To: <1466156097-20028-2-git-send-email-james.morse@arm.com>

On Fri, Jun 17, 2016 at 10:34:56AM +0100, James Morse wrote:
> kernel/smp.c has a fancy counter that keeps track of the number of CPUs
> it marked as not-present and left in cpu_park_loop(). If there are any
> CPUs spinning in here, features like kexec or hibernate may release them
> by overwriting this memory.
> 
> This problem also occurs on machines using spin-tables to release
> secondary cores.
> After commit 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N")
> we bring all known cpus into the secondary holding pen, but may not bring
> them up depending on 'maxcpus'. This memory can't be re-used by kexec
> or hibernate.
> 
> Add a function cpus_are_stuck_in_kernel() to determine if either of these
> cases have occurred.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/include/asm/smp.h | 20 ++++++++++++++++++++
>  arch/arm64/kernel/smp.c      | 13 +++++++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
> index 433e50405274..4be755bcc07a 100644
> --- a/arch/arm64/include/asm/smp.h
> +++ b/arch/arm64/include/asm/smp.h
> @@ -124,6 +124,26 @@ static inline void cpu_panic_kernel(void)
>  	cpu_park_loop();
>  }
>  
> +/*
> + * Kernel features such as hibernate and kexec depend on cpu hotplug to know
> + * they can replace any kernel memory they are not using themselves.
> + *
> + * There are two corner cases:
> + * If a secondary CPU fails to come online, (e.g. due to mismatched features),
> + * it will try to call cpu_die(). If this fails, it increases the counter
> + * cpus_stuck_in_kernel and sits in cpu_park_loop(). The memory containing
> + * this function must not be re-used for anything else as the 'stuck' core
> + * is executing it.

It might also be stuck in __no_granule_support, if it never made it to C
code. In that case, the CPU in charge of bringing up that new CPU will
increment the counter in __cpu_up.

There might be other reasons we do something like that in future, so it
might be better to be a little less specific and say something like:

	If a secondary CPU enters the kernel but fails to come online,
	(e.g. due to mismatched features), and cannot exit the kernel,
	we increment cpus_stuck_in_kernel and leave the CPU in a
	quiescent loop within the kernel text. The memory containing
	this loop must not be re-used for anything else as the 'stuck'
	core is executing it.

Otherwise, this looks good. FWIW, either way:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.

> + *
> + * CPUs are also considered stuck in the kernel if we have multiple CPUs
> + * and no way to offline secondary CPUs. This happens when secondaries
> + * are released via spin-table, these CPUs are moved into the kernel's
> + * secondary_holding_pen, which must not be overwritten.
> + *
> + * This function is used to inhibit features like kexec and hibernate.
> + */
> +bool cpus_are_stuck_in_kernel(void);
> +
>  #endif /* ifndef __ASSEMBLY__ */
>  
>  #endif /* ifndef __ASM_SMP_H */
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 678e0842cb3b..e197502f94fd 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -909,3 +909,16 @@ int setup_profiling_timer(unsigned int multiplier)
>  {
>  	return -EINVAL;
>  }
> +
> +bool cpus_are_stuck_in_kernel(void)
> +{
> +	bool ret = !!cpus_stuck_in_kernel;
> +#ifdef CONFIG_HOTPLUG_CPU
> +	int any_cpu = raw_smp_processor_id();
> +
> +	if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die)
> +		ret = true;
> +#endif
> +
> +	return ret;
> +}
> -- 
> 2.8.0.rc3
> 
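For illustration only, the decision made by the quoted
cpus_are_stuck_in_kernel() can be modelled in user space. This is a
minimal sketch and not part of the patch: the model_* structs and
functions are hypothetical stand-ins for the kernel's
cpus_stuck_in_kernel counter, num_possible_cpus(), cpu_ops[]->cpu_die
and CONFIG_HOTPLUG_CPU, and the -1 return from model_hibernate() is an
arbitrary placeholder for "refuse".

```c
/*
 * User-space model of the check in the quoted patch. NOT kernel code:
 * every name below is a hypothetical stand-in chosen for illustration.
 */
#include <stdbool.h>
#include <stddef.h>

struct model_cpu_ops {
	/* NULL models an enable method with no way to offline a CPU */
	void (*cpu_die)(unsigned int cpu);
};

struct model_system {
	int cpus_stuck;			/* models cpus_stuck_in_kernel */
	unsigned int possible_cpus;	/* models num_possible_cpus() */
	const struct model_cpu_ops *ops;/* models cpu_ops[] of the caller */
	bool hotplug_enabled;		/* models CONFIG_HOTPLUG_CPU */
};

/* Placeholder cpu_die implementation; it only needs to be non-NULL. */
static void model_cpu_die_ok(unsigned int cpu)
{
	(void)cpu;
}

/*
 * Mirrors the quoted logic: stuck if the counter is non-zero, or if
 * hotplug is enabled, there is more than one possible CPU, and the
 * calling CPU's ops provide no cpu_die.
 */
static bool model_cpus_are_stuck(const struct model_system *sys)
{
	bool ret = sys->cpus_stuck != 0;

	if (sys->hotplug_enabled &&
	    sys->possible_cpus > 1 && !sys->ops->cpu_die)
		ret = true;

	return ret;
}

/* Hypothetical caller: refuse to proceed while any CPU is stuck. */
static int model_hibernate(const struct model_system *sys)
{
	return model_cpus_are_stuck(sys) ? -1 : 0;
}
```

In this model, a four-CPU system whose ops lack cpu_die is reported as
stuck even when the counter is zero, which corresponds to the
spin-table case described in the quoted comment.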
