From: mark.rutland@arm.com (Mark Rutland)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel
Date: Fri, 17 Jun 2016 11:27:13 +0100
Message-ID: <20160617102713.GA14524@leverpostej>
In-Reply-To: <1466156097-20028-2-git-send-email-james.morse@arm.com>

On Fri, Jun 17, 2016 at 10:34:56AM +0100, James Morse wrote:
> kernel/smp.c has a fancy counter that keeps track of the number of CPUs
> it marked as not-present and left in cpu_park_loop(). If there are any
> CPUs spinning in here, features like kexec or hibernate may release them
> by overwriting this memory.
> 
> This problem also occurs on machines using spin-tables to release
> secondary cores.
> After commit 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N")
> we bring all known cpus into the secondary holding pen, but may not bring
> them up depending on 'maxcpus'. This memory can't be re-used by kexec
> or hibernate.
> 
> Add a function cpus_are_stuck_in_kernel() to determine if either of these
> cases have occurred.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/include/asm/smp.h | 20 ++++++++++++++++++++
>  arch/arm64/kernel/smp.c      | 13 +++++++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
> index 433e50405274..4be755bcc07a 100644
> --- a/arch/arm64/include/asm/smp.h
> +++ b/arch/arm64/include/asm/smp.h
> @@ -124,6 +124,26 @@ static inline void cpu_panic_kernel(void)
>  	cpu_park_loop();
>  }
>  
> +/*
> + * Kernel features such as hibernate and kexec depend on cpu hotplug to know
> + * they can replace any kernel memory they are not using themselves.
> + *
> + * There are two corner cases:
> + * If a secondary CPU fails to come online, (e.g. due to mismatched features),
> + * it will try to call cpu_die(). If this fails, it increases the counter
> + * cpus_stuck_in_kernel and sits in cpu_park_loop(). The memory containing
> + * this function must not be re-used for anything else as the 'stuck' core
> + * is executing it.

It might also be stuck in __no_granule_support, if it never made it to C
code. In that case, the CPU in charge of bringing up that new CPU will
increment the counter in __cpu_up.

There might be other reasons we do something like that in future, so it
might be better to be a little less specific and say something like:

	If a secondary CPU enters the kernel but fails to come online,
	(e.g. due to mismatched features), and cannot exit the kernel,
	we increment cpus_stuck_in_kernel and leave the CPU in a
	quiescent loop within the kernel text. The memory containing
	this loop must not be re-used for anything else as the 'stuck'
	core is executing it.
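
To make the reason for the looser wording concrete, here is a minimal
userspace sketch of the accounting described above. It is not the kernel
code: enum stuck_stage, record_stuck_cpu() and any_cpu_stuck() are
hypothetical names, and only the counter name comes from the patch. The
point is that a caller only ever looks at the counter, so it does not
matter which CPU incremented it or where the stuck CPU is spinning.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's cpus_stuck_in_kernel counter. */
static int cpus_stuck_in_kernel;

/* Hypothetical: how far a secondary CPU got before it gave up. */
enum stuck_stage {
	STUCK_BEFORE_C,	/* never reached C code, spins in early assembly */
	STUCK_IN_C,	/* reached C code but could not exit the kernel */
};

/*
 * Hypothetical helper: whichever CPU notices the failure (the stuck CPU
 * itself, or the CPU that was bringing it up) accounts for it in the
 * same counter.
 */
static void record_stuck_cpu(enum stuck_stage stage)
{
	assert(stage == STUCK_BEFORE_C || stage == STUCK_IN_C);
	cpus_stuck_in_kernel++;
}

/* Callers only care whether any CPU is still executing kernel text. */
static bool any_cpu_stuck(void)
{
	return cpus_stuck_in_kernel != 0;
}
```

With this shape, adding a third way of getting stuck later would only
need another record_stuck_cpu() call, and the comment would stay correct.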

Otherwise, this looks good. FWIW, either way:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.

> + *
> + * CPUs are also considered stuck in the kernel if we have multiple CPUs
> + * and no way to offline secondary CPUs. This happens when secondaries
> + * are released via spin-table, these CPUs are moved into the kernel's
> + * secondary_holding_pen, which must not be overwritten.
> + *
> + * This function is used to inhibit features like kexec and hibernate.
> + */
> +bool cpus_are_stuck_in_kernel(void);
> +
>  #endif /* ifndef __ASSEMBLY__ */
>  
>  #endif /* ifndef __ASM_SMP_H */
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 678e0842cb3b..e197502f94fd 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -909,3 +909,16 @@ int setup_profiling_timer(unsigned int multiplier)
>  {
>  	return -EINVAL;
>  }
> +
> +bool cpus_are_stuck_in_kernel(void)
> +{
> +	bool ret = !!cpus_stuck_in_kernel;
> +#ifdef CONFIG_HOTPLUG_CPU
> +	int any_cpu = raw_smp_processor_id();
> +
> +	if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die)
> +		ret = true;
> +#endif
> +
> +	return ret;
> +}
> -- 
> 2.8.0.rc3
> 
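
For readers who want to poke at the logic outside the kernel, the check
in the quoted hunk can be modelled in plain C roughly as below. This is a
sketch under stated assumptions, not the patch itself: struct
cpu_ops_model, struct system_model, stub_cpu_die() and
try_hibernate_model() are hypothetical, the CONFIG_HOTPLUG_CPU guard is
left out, and the -EBUSY return in the caller is an assumption about how
a user such as hibernate might react.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one entry of the cpu_ops table. */
struct cpu_ops_model {
	const char *name;
	int (*cpu_die)(unsigned int cpu);	/* NULL: CPU cannot be offlined */
};

/* Hypothetical snapshot of the state the real function consults. */
struct system_model {
	int cpus_stuck_in_kernel;		/* CPUs parked after failed bring-up */
	unsigned int nr_possible_cpus;
	const struct cpu_ops_model *ops;	/* ops of any one CPU */
};

/* Stub used to model an enable method that can offline CPUs. */
static int stub_cpu_die(unsigned int cpu)
{
	(void)cpu;
	return 0;
}

/* Model of the predicate: true if kernel memory may still be in use. */
static bool cpus_are_stuck_model(const struct system_model *sys)
{
	assert(sys && sys->ops && sys->nr_possible_cpus > 0);

	/* Case 1: a secondary failed to come online and was parked. */
	if (sys->cpus_stuck_in_kernel)
		return true;

	/* Case 2: secondaries exist but there is no way to offline them. */
	return sys->nr_possible_cpus > 1 && !sys->ops->cpu_die;
}

/* Hypothetical caller: refuse the operation while CPUs are stuck. */
static int try_hibernate_model(const struct system_model *sys)
{
	return cpus_are_stuck_model(sys) ? -EBUSY : 0;
}
```

A single-CPU system with no cpu_die() is not reported as stuck, since
there is no secondary left sitting in the holding pen.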

Thread overview: 11+ messages
2016-06-17  9:34 [PATCH 0/2] Fix hibernate on SMP spin-table systems James Morse
2016-06-17  9:34 ` [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel James Morse
2016-06-17 10:27   ` Mark Rutland [this message]
2016-06-17 11:16     ` Suzuki K Poulose
2016-06-17 16:48       ` James Morse
2016-06-21 18:07         ` Will Deacon
2016-06-17 16:37   ` Geoff Levand
2016-06-17 16:39     ` Suzuki K Poulose
2016-06-17 16:42     ` Mark Rutland
2016-06-17  9:34 ` [PATCH 2/2] arm64: hibernate: Don't hibernate on systems with stuck CPUs James Morse
2016-06-17 10:28   ` Mark Rutland
