From: james.morse@arm.com (James Morse)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCHv2] arm64:kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores
Date: Fri, 11 Aug 2017 18:02:54 +0100
Message-ID: <598DE33E.6050606@arm.com>
In-Reply-To: <1502082623-23952-1-git-send-email-hoeun.ryu@gmail.com>

Hi Hoeun,

On 07/08/17 06:09, Hoeun Ryu wrote:
>  Commit 0ee5941 : (x86/panic: replace smp_send_stop() with kdump friendly
> version in panic path) introduced crash_smp_send_stop() which is a weak
> function and can be overriden by architecture codes to fix the side effect

(overridden)


> caused by commit f06e515 : (kernel/panic.c: add "crash_kexec_post_
> notifiers" option).
> 
>  ARM64 architecture uses the weak version function and the problem is that
> the weak function simply calls smp_send_stop() which makes other CPUs
> offline and takes away the chance to save crash information for nonpanic
> CPUs in machine_crash_shutdown() when crash_kexec_post_notifiers kernel
> option is enabled.
> 
>  Calling smp_send_crash_stop() in machine_crash_shutdown() is useless
> because all nonpanic CPUs are already offline by smp_send_stop() in this
> case and smp_send_crash_stop() only works against online CPUs.


>  The result is that /proc/vmcore is not available with the error messages;
> "Warning: Zero PT_NOTE entries found", "Kdump: vmcore not initialized".

When I tried this I got one of these warnings for each secondary CPU, but the
vmcore file was still available. When I ran 'crash' on the vmcore it reported:
> CPUS: 6 [OFFLINE: 5]

Did I miss a step to reproduce this? If not, can we change this paragraph to
say something like:
> The result is that the secondary CPUs' registers are not saved by
> crash_save_cpu() and the vmcore file misreports these CPUs as being offline.


>  crash_smp_send_stop() is implemented to fix this problem by replacing the
> exising smp_send_crash_stop() and adding a check for multiple calling to

(existing)


> the function. The function (strong symbol version) saves crash information
> for nonpanic CPUs and machine_crash_shutdown() tries to save crash
> information for nonpanic CPUs only when crash_kexec_post_notifiers kernel
> option is disabled.
> 
> * crash_kexec_post_notifiers : false
> 
>   panic()
>     __crash_kexec()
>       machine_crash_shutdown()
>         crash_smp_send_stop()    <= save crash dump for nonpanic cores
> 
> * crash_kexec_post_notifiers : true
> 
>   panic()
>     crash_smp_send_stop()        <= save crash dump for nonpanic cores
>     __crash_kexec()
>       machine_crash_shutdown()
>         crash_smp_send_stop()    <= just return.
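
To make the two call flows above concrete, here is a minimal user-space sketch
of them. It is not the kernel code: the *_sketch names and the saves_done
counter are hypothetical stand-ins, and the real function sends IPIs and saves
registers rather than bumping a counter. It only illustrates that the nonpanic
cores are stopped and saved exactly once whichever way the option is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter standing in for the real per-CPU register saving. */
static int saves_done;
/* Mirrors the static cpus_stopped flag added by the patch. */
static int cpus_stopped;

/* Stand-in for the arm64 crash_smp_send_stop(): do the work only once. */
static void crash_smp_send_stop_sketch(void)
{
	if (cpus_stopped)
		return;		/* second call in the panic path: just return */

	cpus_stopped = 1;
	saves_done++;		/* "stop the other cores and save their state" */
}

/* Stand-in for machine_crash_shutdown(), reached via __crash_kexec(). */
static void machine_crash_shutdown_sketch(void)
{
	crash_smp_send_stop_sketch();
}

/* Returns how many times the nonpanic cores were stopped and saved. */
static int panic_sketch(bool crash_kexec_post_notifiers)
{
	saves_done = 0;
	cpus_stopped = 0;

	if (crash_kexec_post_notifiers)
		crash_smp_send_stop_sketch();	/* first call does the work */

	machine_crash_shutdown_sketch();	/* __crash_kexec() path */

	/* The guard must make any second call a no-op. */
	assert(saves_done == 1);
	return saves_done;
}
```

In this sketch panic_sketch(false) and panic_sketch(true) both return 1: the
save happens once, either from machine_crash_shutdown() or from the earlier
call in panic().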


> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index dc66e6e..73d8f5e 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -977,11 +977,21 @@ void smp_send_stop(void)
>  }
>  
>  #ifdef CONFIG_KEXEC_CORE
> -void smp_send_crash_stop(void)
> +void crash_smp_send_stop(void)
>  {
> +	static int cpus_stopped;
>  	cpumask_t mask;
>  	unsigned long timeout;
>  
> +	/*
> +	 * This function can be called twice in panic path, but obviously
> +	 * we execute this only once.
> +	 */
> +	if (cpus_stopped)
> +		return;
> +
> +	cpus_stopped = 1;
> +

This 'cpus_stopped = 1' can't race between CPUs: any second call is guaranteed
to be on the same CPU, because both calls are behind panic()'s
'atomic_cmpxchg()'.
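
For readers unfamiliar with that gate, the following is a rough user-space
sketch using C11 atomics. It is an illustration under assumptions, not the
code in kernel/panic.c: panic_gate_sketch() is a hypothetical helper, and the
CPU that loses the race simply gets 'false' here, whereas in the kernel it
would stop itself and never reach crash_smp_send_stop().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Sentinel meaning "no CPU has panicked yet" (assumed value for the sketch). */
#define PANIC_CPU_INVALID	(-1)

static atomic_int panic_cpu = PANIC_CPU_INVALID;

/*
 * Hypothetical helper: returns true if 'cpu' may continue down the panic
 * path, false if another CPU already owns it.
 */
static bool panic_gate_sketch(int cpu)
{
	int old = PANIC_CPU_INVALID;

	/* The first CPU to get here claims the panic path. */
	if (atomic_compare_exchange_strong(&panic_cpu, &old, cpu))
		return true;

	/* On failure 'old' holds the owner; a nested panic on it continues. */
	return old == cpu;
}
```

Only one CPU can ever get past the gate, so a plain (non-atomic) static flag
is enough for the once-only check in crash_smp_send_stop().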


Other than my '/proc/vmcore is not available' question above, this looks fine to me:
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>


Thanks!

James
