public inbox for linux-kernel@vger.kernel.org
From: Kees Cook <keescook@chromium.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: linux-kernel@vger.kernel.org, dianders@chromium.org,
	sumit.garg@linaro.org, swboyd@chromium.org
Subject: Re: [PATCH] lkdtm/bugs: add test for panic() with stuck secondary CPUs
Date: Thu, 31 Aug 2023 12:15:52 -0700
Message-ID: <202308311215.BA352518C@keescook>
In-Reply-To: <20230831101026.3122590-1-mark.rutland@arm.com>

On Thu, Aug 31, 2023 at 11:10:26AM +0100, Mark Rutland wrote:
> Upon a panic() the kernel will use either smp_send_stop() or
> crash_smp_send_stop() to attempt to stop secondary CPUs via an IPI,
> which may or may not be an NMI. Generally it's preferable that this is an
> NMI so that CPUs can be stopped in as many situations as possible, but
> it's not always possible to provide an NMI, and there are cases where
> CPUs may be unable to handle the NMI regardless.
> 
> This patch adds a test for panic() where all other CPUs are stuck with
> interrupts disabled, which can be used to check whether the kernel
> gracefully handles CPUs failing to respond to a stop, and whether NMI stops
> work.
> 
> For example, on arm64 *without* an NMI, this results in:
> 
> | # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT
> | lkdtm: Performing direct entry PANIC_STOP_IRQOFF
> | Kernel panic - not syncing: panic stop irqoff test
> | CPU: 2 PID: 24 Comm: migration/2 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4
> | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> | Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4
> | Call trace:
> |  dump_backtrace+0x94/0xec
> |  show_stack+0x18/0x24
> |  dump_stack_lvl+0x74/0xc0
> |  dump_stack+0x18/0x24
> |  panic+0x358/0x3e8
> |  lkdtm_PANIC+0x0/0x18
> |  multi_cpu_stop+0x9c/0x1a0
> |  cpu_stopper_thread+0x84/0x118
> |  smpboot_thread_fn+0x224/0x248
> |  kthread+0x114/0x118
> |  ret_from_fork+0x10/0x20
> | SMP: stopping secondary CPUs
> | SMP: failed to stop secondary CPUs 0-3
> | Kernel Offset: 0x401cf3490000 from 0xffff800080000000
> | PHYS_OFFSET: 0x40000000
> | CPU features: 0x00000000,68c167a1,cce6773f
> | Memory Limit: none
> | ---[ end Kernel panic - not syncing: panic stop irqoff test ]---
> 
> On arm64 *with* an NMI, this results in:
> 
> | # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT
> | lkdtm: Performing direct entry PANIC_STOP_IRQOFF
> | Kernel panic - not syncing: panic stop irqoff test
> | CPU: 1 PID: 19 Comm: migration/1 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4
> | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> | Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4
> | Call trace:
> |  dump_backtrace+0x94/0xec
> |  show_stack+0x18/0x24
> |  dump_stack_lvl+0x74/0xc0
> |  dump_stack+0x18/0x24
> |  panic+0x358/0x3e8
> |  lkdtm_PANIC+0x0/0x18
> |  multi_cpu_stop+0x9c/0x1a0
> |  cpu_stopper_thread+0x84/0x118
> |  smpboot_thread_fn+0x224/0x248
> |  kthread+0x114/0x118
> |  ret_from_fork+0x10/0x20
> | SMP: stopping secondary CPUs
> | Kernel Offset: 0x55a9c0bc0000 from 0xffff800080000000
> | PHYS_OFFSET: 0x40000000
> | CPU features: 0x00000000,68c167a1,fce6773f
> | Memory Limit: none
> | ---[ end Kernel panic - not syncing: panic stop irqoff test ]---
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Douglas Anderson <dianders@chromium.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Stephen Boyd <swboyd@chromium.org>
> Cc: Sumit Garg <sumit.garg@linaro.org>
> ---
>  drivers/misc/lkdtm/bugs.c | 29 ++++++++++++++++++++++++++++-
>  1 file changed, 28 insertions(+), 1 deletion(-)
> 
> I've tested this with the arm64 NMI IPI patches:
> 
>   https://lore.kernel.org/linux-arm-kernel/20230830191314.1618136-1-dianders@chromium.org/
> 
> Specifically, with the patch that uses an NMI for IPI_CPU_STOP and
> IPI_CPU_CRASH_STOP:
> 
>   https://lore.kernel.org/linux-arm-kernel/20230830121115.v12.5.Ifadbfd45b22c52edcb499034dd4783d096343260@changeid/
> 
> Mark.
> 
> diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
> index 3c95600ab2f71..368da8b83cd1c 100644
> --- a/drivers/misc/lkdtm/bugs.c
> +++ b/drivers/misc/lkdtm/bugs.c
> @@ -6,12 +6,14 @@
>   * test source files.
>   */
>  #include "lkdtm.h"
> +#include <linux/cpu.h>
>  #include <linux/list.h>
>  #include <linux/sched.h>
>  #include <linux/sched/signal.h>
>  #include <linux/sched/task_stack.h>
> -#include <linux/uaccess.h>
>  #include <linux/slab.h>
> +#include <linux/stop_machine.h>
> +#include <linux/uaccess.h>
>  
>  #if IS_ENABLED(CONFIG_X86_32) && !IS_ENABLED(CONFIG_UML)
>  #include <asm/desc.h>
> @@ -73,6 +75,30 @@ static void lkdtm_PANIC(void)
>  	panic("dumptest");
>  }
>  
> +static int panic_stop_irqoff_fn(void *arg)
> +{
> +	atomic_t *v = arg;
> +
> +	/*
> +	 * Trigger the panic after all other CPUs have entered this function,
> +	 * so that they are guaranteed to have IRQs disabled.
> +	 */
> +	if (atomic_inc_return(v) == num_online_cpus())
> +		panic("panic stop irqoff test");
> +
> +	for (;;)
> +		cpu_relax();
> +}
> +
> +static void lkdtm_PANIC_STOP_IRQOFF(void)
> +{
> +	atomic_t v = ATOMIC_INIT(0);
> +
> +	cpus_read_lock();
> +	stop_machine(panic_stop_irqoff_fn, &v, cpu_online_mask);
> +	cpus_read_unlock();
> +}
> +
>  static void lkdtm_BUG(void)
>  {
>  	BUG();
> @@ -598,6 +624,7 @@ static noinline void lkdtm_CORRUPT_PAC(void)
>  
>  static struct crashtype crashtypes[] = {
>  	CRASHTYPE(PANIC),
> +	CRASHTYPE(PANIC_STOP_IRQOFF),
>  	CRASHTYPE(BUG),
>  	CRASHTYPE(WARNING),
>  	CRASHTYPE(WARNING_MESSAGE),

Modulo the other comments in the thread, this looks good to me. :)
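
For anyone following along, the "last CPU to arrive triggers the panic"
logic in panic_stop_irqoff_fn() can be approximated in user space. This is
only a rough sketch under stated assumptions, not the kernel code: it uses
pthreads and C11 atomics in place of stop_machine() and atomic_t, nothing
here masks interrupts, and the last arrival releases the spinning threads
instead of panicking so that the demo terminates. The names struct
rendezvous, rendezvous_fn() and run_rendezvous() are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 64

/* Hypothetical user-space stand-in for the shared atomic_t in the patch. */
struct rendezvous {
	atomic_int arrived;   /* threads that have entered the callback */
	atomic_int triggered; /* threads that saw themselves as the last arrival */
	atomic_int release;   /* set by the last arrival so the demo can exit */
	int total;            /* plays the role of num_online_cpus() */
};

static void *rendezvous_fn(void *arg)
{
	struct rendezvous *r = arg;

	/*
	 * atomic_fetch_add() returns the old value, so adding 1 mirrors
	 * atomic_inc_return(). Only the final arrival sees r->total.
	 */
	if (atomic_fetch_add(&r->arrived, 1) + 1 == r->total) {
		/* The kernel test calls panic() here; we just record it. */
		atomic_fetch_add(&r->triggered, 1);
		atomic_store(&r->release, 1);
		return NULL;
	}

	/* The kernel test spins forever; we spin until released. */
	while (!atomic_load(&r->release))
		sched_yield();
	return NULL;
}

/* Returns how many threads acted as the last arrival, or -1 on error. */
static int run_rendezvous(int nthreads)
{
	pthread_t tids[MAX_THREADS];
	struct rendezvous r;
	int created = 0;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return -1;

	atomic_init(&r.arrived, 0);
	atomic_init(&r.triggered, 0);
	atomic_init(&r.release, 0);
	r.total = nthreads;

	while (created < nthreads) {
		if (pthread_create(&tids[created], NULL, rendezvous_fn, &r) != 0)
			break;
		created++;
	}

	/* If thread creation failed, release the spinners so joins complete. */
	if (created < nthreads)
		atomic_store(&r.release, 1);

	for (int i = 0; i < created; i++)
		pthread_join(tids[i], NULL);

	if (created < nthreads)
		return -1;

	assert(atomic_load(&r.arrived) == nthreads);
	return atomic_load(&r.triggered);
}
```

Calling run_rendezvous(4) should report exactly one triggering thread,
whichever one happened to increment the counter last. That is the property
the test relies on: by the time the panic fires, every other CPU is already
inside the callback.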

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

Thread overview: 7+ messages
2023-08-31 10:10 [PATCH] lkdtm/bugs: add test for panic() with stuck secondary CPUs Mark Rutland
2023-08-31 12:45 ` Sumit Garg
2023-08-31 13:07   ` Mark Rutland
2023-08-31 13:16     ` Sumit Garg
2023-08-31 16:16 ` Doug Anderson
2023-09-21 16:03   ` Mark Rutland
2023-08-31 19:15 ` Kees Cook [this message]