* [PATCH] panic: disable optimistic spin after halting CPUs
@ 2022-01-13 0:54 Stephen Brennan
2022-01-13 7:03 ` kernel test robot
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Stephen Brennan @ 2022-01-13 0:54 UTC (permalink / raw)
To: Greg Kroah-Hartman, Petr Mladek, Sergey Senozhatsky,
Steven Rostedt, John Ogness
Cc: Stephen Brennan, Sergey Senozhatsky, linux-kernel
A CPU executing with console lock spinning enabled might be halted
during a panic. Before console_flush_on_panic(), it's possible for
console_trylock() to attempt optimistic spinning, deadlocking the panic
CPU:
CPU 0 (panic CPU) CPU 1
----------------- ------
printk() {
vprintk_func() {
vprintk_default() {
vprintk_emit() {
console_unlock() {
console_lock_spinning_enable();
... printing to console ...
panic() {
crash_smp_send_stop() {
NMI -------------------> HALT
}
atomic_notifier_call_chain() {
printk() {
...
console_trylock_spinnning() {
// optimistic spin infinitely
This hang during panic can be induced when a kdump kernel is loaded, and
crash_kexec_post_notifiers=1 is present on the kernel command line. The
following script which concurrently writes to /dev/kmsg, and triggers a
panic, can result in this hang:
#!/bin/bash
date
# 991 chars (based on log buffer size):
chars="$(printf 'a%.0s' {1..991})"
while :; do
echo $chars > /dev/kmsg
done &
echo c > /proc/sysrq-trigger &
date
exit
Below are the hang rates for the above test case, based on v5.16-rc8
before and after this patch:
before: 197 hangs / 340 trials - 57.9%
after : 0 hangs / 245 trials - 0.0%
Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes")
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
---
include/linux/console.h | 1 +
kernel/panic.c | 7 +++++++
kernel/printk/printk.c | 17 +++++++++++++++++
3 files changed, 25 insertions(+)
diff --git a/include/linux/console.h b/include/linux/console.h
index a97f277cfdfa..4eeb46927d96 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -179,6 +179,7 @@ extern void console_unlock(void);
extern void console_conditional_schedule(void);
extern void console_unblank(void);
extern void console_flush_on_panic(enum con_flush_mode mode);
+extern void console_lock_spinning_disable_on_panic(void);
extern struct tty_driver *console_device(int *);
extern void console_stop(struct console *);
extern void console_start(struct console *);
diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..a9340e580b20 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -265,6 +265,13 @@ void panic(const char *fmt, ...)
crash_smp_send_stop();
}
+ /*
+ * Now that we've halted other CPUs, disable optimistic spinning in
+ * printk(). This avoids deadlocking in console_trylock(), until we call
+ * console_flush_on_panic().
+ */
+ console_lock_spinning_disable_on_panic();
+
/*
* Run any panic handlers, including those that might need to
* add information to the kmsg dump output.
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 57b132b658e1..d824fdf7d3fd 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1823,6 +1823,23 @@ static int console_lock_spinning_disable_and_check(void)
return 1;
}
+/**
+ * console_lock_spinning_disable_on_panic - disable spinning so that
+ * a panic CPU does not enter an infinite loop
+ *
+ * This is called once all CPUs are halted. A CPU halted during a section which
+ * allowed spinning, could trigger an infinite loop in console_trylock. To avoid
+ * this, mark console_owner as NULL.
+ */
+void console_lock_spinning_disable_on_panic(void)
+{
+ WRITE_ONCE(console_owner, NULL);
+ if (raw_spin_is_locked(&console_owner_lock)) {
+ debug_locks_off();
+ raw_spin_lock_init(&console_owner_lock);
+ }
+}
+
/**
* console_trylock_spinning - try to get console_lock by busy waiting
*
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH] panic: disable optimistic spin after halting CPUs 2022-01-13 0:54 [PATCH] panic: disable optimistic spin after halting CPUs Stephen Brennan @ 2022-01-13 7:03 ` kernel test robot 2022-01-13 8:15 ` kernel test robot 2022-01-13 17:39 ` Petr Mladek 2 siblings, 0 replies; 8+ messages in thread From: kernel test robot @ 2022-01-13 7:03 UTC (permalink / raw) To: kbuild-all [-- Attachment #1: Type: text/plain, Size: 1908 bytes --] Hi Stephen, Thank you for the patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v5.16 next-20220112] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Stephen-Brennan/panic-disable-optimistic-spin-after-halting-CPUs/20220113-085656 base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 362f533a2a1098fe95020cb59340023e9b11d062 config: parisc-randconfig-r002-20220113 (https://download.01.org/0day-ci/archive/20220113/202201131446.crnr4Sxn-lkp(a)intel.com/config) compiler: hppa-linux-gcc (GCC) 11.2.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/3c321d1696e15907efda52875980b82c4bbf64bc git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Stephen-Brennan/panic-disable-optimistic-spin-after-halting-CPUs/20220113-085656 git checkout 3c321d1696e15907efda52875980b82c4bbf64bc # save the config file to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=parisc SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): hppa-linux-ld: kernel/panic.o: in function `panic': >> (.text.unlikely+0xe4): undefined reference to `console_lock_spinning_disable_on_panic' --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] panic: disable optimistic spin after halting CPUs 2022-01-13 0:54 [PATCH] panic: disable optimistic spin after halting CPUs Stephen Brennan 2022-01-13 7:03 ` kernel test robot @ 2022-01-13 8:15 ` kernel test robot 2022-01-13 17:39 ` Petr Mladek 2 siblings, 0 replies; 8+ messages in thread From: kernel test robot @ 2022-01-13 8:15 UTC (permalink / raw) To: kbuild-all [-- Attachment #1: Type: text/plain, Size: 2079 bytes --] Hi Stephen, Thank you for the patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v5.16 next-20220113] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Stephen-Brennan/panic-disable-optimistic-spin-after-halting-CPUs/20220113-085656 base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 362f533a2a1098fe95020cb59340023e9b11d062 config: xtensa-randconfig-r006-20220113 (https://download.01.org/0day-ci/archive/20220113/202201131622.yTxLGbSw-lkp(a)intel.com/config) compiler: xtensa-linux-gcc (GCC) 11.2.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/3c321d1696e15907efda52875980b82c4bbf64bc git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Stephen-Brennan/panic-disable-optimistic-spin-after-halting-CPUs/20220113-085656 git checkout 3c321d1696e15907efda52875980b82c4bbf64bc # save the config file to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=xtensa SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): xtensa-linux-ld: kernel/panic.o: in function `strcmp': >> panic.c:(.text.unlikely+0x68): undefined reference to `console_lock_spinning_disable_on_panic' xtensa-linux-ld: kernel/panic.o: in function `panic': panic.c:(.text.unlikely+0x145): undefined reference to `console_lock_spinning_disable_on_panic' --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] panic: disable optimistic spin after halting CPUs 2022-01-13 0:54 [PATCH] panic: disable optimistic spin after halting CPUs Stephen Brennan 2022-01-13 7:03 ` kernel test robot 2022-01-13 8:15 ` kernel test robot @ 2022-01-13 17:39 ` Petr Mladek 2022-01-13 19:36 ` Stephen Brennan 2 siblings, 1 reply; 8+ messages in thread From: Petr Mladek @ 2022-01-13 17:39 UTC (permalink / raw) To: Stephen Brennan Cc: Greg Kroah-Hartman, Sergey Senozhatsky, Steven Rostedt, John Ogness, Sergey Senozhatsky, linux-kernel On Wed 2022-01-12 16:54:25, Stephen Brennan wrote: > A CPU executing with console lock spinning enabled might be halted > during a panic. Before console_flush_on_panic(), it's possible for > console_trylock() to attempt optimistic spinning, deadlocking the panic > CPU: > > CPU 0 (panic CPU) CPU 1 > ----------------- ------ > printk() { > vprintk_func() { > vprintk_default() { > vprintk_emit() { > console_unlock() { > console_lock_spinning_enable(); > ... printing to console ... > panic() { > crash_smp_send_stop() { > NMI -------------------> HALT > } > atomic_notifier_call_chain() { > printk() { > ... > console_trylock_spinnning() { > // optimistic spin infinitely I see. > This hang during panic can be induced when a kdump kernel is loaded, and > crash_kexec_post_notifiers=1 is present on the kernel command line. The > following script which concurrently writes to /dev/kmsg, and triggers a > panic, can result in this hang: > > #!/bin/bash > date > # 991 chars (based on log buffer size): > chars="$(printf 'a%.0s' {1..991})" > while :; do > echo $chars > /dev/kmsg > done & > echo c > /proc/sysrq-trigger & > date > exit > > Below are the hang rates for the above test case, based on v5.16-rc8 > before and after this patch: > > before: 197 hangs / 340 trials - 57.9% > after : 0 hangs / 245 trials - 0.0% > > Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes") > > Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> > Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> > --- > include/linux/console.h | 1 + > kernel/panic.c | 7 +++++++ > kernel/printk/printk.c | 17 +++++++++++++++++ > 3 files changed, 25 insertions(+) > > diff --git a/include/linux/console.h b/include/linux/console.h > index a97f277cfdfa..4eeb46927d96 100644 > --- a/include/linux/console.h > +++ b/include/linux/console.h > @@ -179,6 +179,7 @@ extern void console_unlock(void); > extern void console_conditional_schedule(void); > extern void console_unblank(void); > extern void console_flush_on_panic(enum con_flush_mode mode); > +extern void console_lock_spinning_disable_on_panic(void); > extern struct tty_driver *console_device(int *); > extern void console_stop(struct console *); > extern void console_start(struct console *); > diff --git a/kernel/panic.c b/kernel/panic.c > index cefd7d82366f..a9340e580b20 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -265,6 +265,13 @@ void panic(const char *fmt, ...) > crash_smp_send_stop(); > } > > + /* > + * Now that we've halted other CPUs, disable optimistic spinning in > + * printk(). This avoids deadlocking in console_trylock(), until we call > + * console_flush_on_panic(). > + */ > + console_lock_spinning_disable_on_panic(); The proposed implementation causes that the panicing CPU takes over console_lock even when the current owner is in the middle of call_console_drivers(). It means that calling console drivers from another CPU is not completely safe. This is why console_flush_on_panic() is called at end when the console is the only way to see the messages. We should not do this before kmsg_dump() and __crash_kexec() that allow to see the messages a "more safe" way. I suggest to disable the spinning when panic is in progress. I mean something like: --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1842,6 +1842,10 @@ static int console_trylock_spinning(void) if (console_trylock()) return 1; + /* Spinning is not safe when the system is stopped */ + if (read_atomic(&panic_cpu) != PANIC_CPU_INVALID) + return 0; + printk_safe_enter_irqsave(flags); raw_spin_lock(&console_owner_lock); It would be also great when the current owner releases console_lock so that the panicing CPU could take over it. I think about the following. Well, I am not sure if it would help in the real life because other CPUs are stopped quite early in panic(). --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2646,13 +2650,18 @@ void console_unlock(void) for (;;) { size_t ext_len = 0; - int handover; + int handover, pcpu; size_t len; skip: if (!prb_read_valid(prb, console_seq, &r)) break; + /* Allow the panic_cpu to take over consoles a safe way. */ + pcpu = read_atomic(&panic_cpu); + if (pcpu != PANIC_CPU_INVALID && pcpu != smp_processor_id()) + break; + if (console_seq != r.info->seq) { console_dropped += r.info->seq - console_seq; console_seq = r.info->seq; Note that the above code is not even compile tested. panic_cpu is local variable in panic.c. If we go this way than I would define some helpers: bool panic_in_progress(void) { return read_atomic(&panic_cpu) != PANIC_CPU_INVALID; } bool panic_in_progress_on_other_cpu(void) { int pcpu = read_atomic(&panic_cpu); if (pcpu == PANIC_CPU_INVALID) return false; if (pcpu == smp_processor_id()) return false; return true; } Best Regards, Petr ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] panic: disable optimistic spin after halting CPUs 2022-01-13 17:39 ` Petr Mladek @ 2022-01-13 19:36 ` Stephen Brennan 2022-01-14 10:24 ` Petr Mladek 0 siblings, 1 reply; 8+ messages in thread From: Stephen Brennan @ 2022-01-13 19:36 UTC (permalink / raw) To: Petr Mladek Cc: Greg Kroah-Hartman, Sergey Senozhatsky, Steven Rostedt, John Ogness, Sergey Senozhatsky, linux-kernel Hi Petr, Thanks for taking a look! Petr Mladek <pmladek@suse.com> writes: > On Wed 2022-01-12 16:54:25, Stephen Brennan wrote: >> A CPU executing with console lock spinning enabled might be halted >> during a panic. Before console_flush_on_panic(), it's possible for >> console_trylock() to attempt optimistic spinning, deadlocking the panic >> CPU: >> >> CPU 0 (panic CPU) CPU 1 >> ----------------- ------ >> printk() { >> vprintk_func() { >> vprintk_default() { >> vprintk_emit() { >> console_unlock() { >> console_lock_spinning_enable(); >> ... printing to console ... >> panic() { >> crash_smp_send_stop() { >> NMI -------------------> HALT >> } >> atomic_notifier_call_chain() { >> printk() { >> ... >> console_trylock_spinnning() { >> // optimistic spin infinitely > > I see. > >> This hang during panic can be induced when a kdump kernel is loaded, and >> crash_kexec_post_notifiers=1 is present on the kernel command line. The >> following script which concurrently writes to /dev/kmsg, and triggers a >> panic, can result in this hang: >> >> #!/bin/bash >> date >> # 991 chars (based on log buffer size): >> chars="$(printf 'a%.0s' {1..991})" >> while :; do >> echo $chars > /dev/kmsg >> done & >> echo c > /proc/sysrq-trigger & >> date >> exit >> >> Below are the hang rates for the above test case, based on v5.16-rc8 >> before and after this patch: >> >> before: 197 hangs / 340 trials - 57.9% >> after : 0 hangs / 245 trials - 0.0% >> >> Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes") >> >> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> >> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> >> --- >> include/linux/console.h | 1 + >> kernel/panic.c | 7 +++++++ >> kernel/printk/printk.c | 17 +++++++++++++++++ >> 3 files changed, 25 insertions(+) >> >> diff --git a/include/linux/console.h b/include/linux/console.h >> index a97f277cfdfa..4eeb46927d96 100644 >> --- a/include/linux/console.h >> +++ b/include/linux/console.h >> @@ -179,6 +179,7 @@ extern void console_unlock(void); >> extern void console_conditional_schedule(void); >> extern void console_unblank(void); >> extern void console_flush_on_panic(enum con_flush_mode mode); >> +extern void console_lock_spinning_disable_on_panic(void); >> extern struct tty_driver *console_device(int *); >> extern void console_stop(struct console *); >> extern void console_start(struct console *); >> diff --git a/kernel/panic.c b/kernel/panic.c >> index cefd7d82366f..a9340e580b20 100644 >> --- a/kernel/panic.c >> +++ b/kernel/panic.c >> @@ -265,6 +265,13 @@ void panic(const char *fmt, ...) >> crash_smp_send_stop(); >> } >> >> + /* >> + * Now that we've halted other CPUs, disable optimistic spinning in >> + * printk(). This avoids deadlocking in console_trylock(), until we call >> + * console_flush_on_panic(). >> + */ >> + console_lock_spinning_disable_on_panic(); > > The proposed implementation causes that the panicing CPU takes over > console_lock even when the current owner is in the middle of > call_console_drivers(). I don't think this is true. All that console_lock_spinning_disable_on_panic() does is clear out the console_owner and console_owner_lock, signalling that the panicking CPU should not attempt to spin and wait for the lock. The only time this matters is when another task already holds console_sem and is executing presently in call_console_drivers(), as you said. With console_owner set to NULL, console_trylock_spinning() will get spin=false, and thus it will bail out, returning 0. Thus, the panicking CPU would fail to acquire console_sem, and so in vprintk_emit it will not end up going into console_unlock(), or doing call_console_drivers(). It would not be until console_flush_on_panic() down below (when there is no other option than to try to use the console) that the panicking CPU actually attempts to call the drivers. > > It means that calling console drivers from another CPU is not > completely safe. This is why console_flush_on_panic() is called > at end when the console is the only way to see the messages. > > We should not do this before kmsg_dump() and __crash_kexec() > that allow to see the messages a "more safe" way. > > I suggest to disable the spinning when panic is in progress. I mean > something like: > > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -1842,6 +1842,10 @@ static int console_trylock_spinning(void) > if (console_trylock()) > return 1; > > + /* Spinning is not safe when the system is stopped */ > + if (read_atomic(&panic_cpu) != PANIC_CPU_INVALID) > + return 0; > + > printk_safe_enter_irqsave(flags); > > raw_spin_lock(&console_owner_lock); I think this is pretty much equivalent to my fix, since it also results in the panicking CPU failing to acquire console_sem during console_trylock_spinning(). I do think this is better than my fix :) since it doesn't clutter up panic() even more, nor introduce an additional function. It even avoids resetting the console_owner_lock spinlock. I'd be happy to do this as a v2, if you'd prefer? > > It would be also great when the current owner releases console_lock > so that the panicing CPU could take over it. > > I think about the following. Well, I am not sure if it would help > in the real life because other CPUs are stopped quite early in panic(). > > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -2646,13 +2650,18 @@ void console_unlock(void) > > for (;;) { > size_t ext_len = 0; > - int handover; > + int handover, pcpu; > size_t len; > > skip: > if (!prb_read_valid(prb, console_seq, &r)) > break; > > + /* Allow the panic_cpu to take over consoles a safe way. */ > + pcpu = read_atomic(&panic_cpu); > + if (pcpu != PANIC_CPU_INVALID && pcpu != smp_processor_id()) > + break; > + > if (console_seq != r.info->seq) { > console_dropped += r.info->seq - console_seq; > console_seq = r.info->seq; > I suppose this could help reduce the odds of a CPU getting interrupted during call_console_drivers(), and it might reduce the odds of console_sem being held by a halted CPU during panic(). However, it occurs to me that there are two cases: 1) a kdump kernel is loaded. In this case, crash_smp_send_stop() is used, which (at least on x86_64) sends NMIs. In this case, __crash_kexec() will not return (barring any errors) and we won't ever really need to get the console_sem. 2) no kdump kernel is loaded. In this case, smp_send_stop() is used, which (again on x86_64) sends regular IPIs. In this case, we know we can at least avoid interrupting call_console_drivers(), since that executes with local IRQ disabled. So I'm also unsure how much this would help in the real world. But it's a small change and so it doesn't need to help much in order to be a net positive :) Thanks, Stephen > > > Note that the above code is not even compile tested. panic_cpu is > local variable in panic.c. If we go this way than I would define > some helpers: > > bool panic_in_progress(void) > { > return read_atomic(&panic_cpu) != PANIC_CPU_INVALID; > } > > bool panic_in_progress_on_other_cpu(void) > { > int pcpu = read_atomic(&panic_cpu); > > if (pcpu == PANIC_CPU_INVALID) > return false; > > if (pcpu == smp_processor_id()) > return false; > > return true; > } > > Best Regards, > Petr ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] panic: disable optimistic spin after halting CPUs 2022-01-13 19:36 ` Stephen Brennan @ 2022-01-14 10:24 ` Petr Mladek 2022-01-18 23:13 ` Stephen Brennan 0 siblings, 1 reply; 8+ messages in thread From: Petr Mladek @ 2022-01-14 10:24 UTC (permalink / raw) To: Stephen Brennan Cc: Greg Kroah-Hartman, Sergey Senozhatsky, Steven Rostedt, John Ogness, Sergey Senozhatsky, linux-kernel On Thu 2022-01-13 11:36:35, Stephen Brennan wrote: > Hi Petr, > > Thanks for taking a look! > > Petr Mladek <pmladek@suse.com> writes: > > On Wed 2022-01-12 16:54:25, Stephen Brennan wrote: > >> A CPU executing with console lock spinning enabled might be halted > >> during a panic. Before console_flush_on_panic(), it's possible for > >> console_trylock() to attempt optimistic spinning, deadlocking the panic > >> CPU: > >> > >> CPU 0 (panic CPU) CPU 1 > >> ----------------- ------ > >> printk() { > >> vprintk_func() { > >> vprintk_default() { > >> vprintk_emit() { > >> console_unlock() { > >> console_lock_spinning_enable(); > >> ... printing to console ... > >> panic() { > >> crash_smp_send_stop() { > >> NMI -------------------> HALT > >> } > >> atomic_notifier_call_chain() { > >> printk() { > >> ... > >> console_trylock_spinnning() { > >> // optimistic spin infinitely > > > > I see. > > > >> This hang during panic can be induced when a kdump kernel is loaded, and > >> crash_kexec_post_notifiers=1 is present on the kernel command line. The > >> following script which concurrently writes to /dev/kmsg, and triggers a > >> panic, can result in this hang: > >> > >> #!/bin/bash > >> date > >> # 991 chars (based on log buffer size): > >> chars="$(printf 'a%.0s' {1..991})" > >> while :; do > >> echo $chars > /dev/kmsg > >> done & > >> echo c > /proc/sysrq-trigger & > >> date > >> exit > >> > >> Below are the hang rates for the above test case, based on v5.16-rc8 > >> before and after this patch: > >> > >> before: 197 hangs / 340 trials - 57.9% > >> after : 0 hangs / 245 trials - 0.0% > >> > >> Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes") > >> > >> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> > >> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> > >> --- > >> include/linux/console.h | 1 + > >> kernel/panic.c | 7 +++++++ > >> kernel/printk/printk.c | 17 +++++++++++++++++ > >> 3 files changed, 25 insertions(+) > >> > >> diff --git a/include/linux/console.h b/include/linux/console.h > >> index a97f277cfdfa..4eeb46927d96 100644 > >> --- a/include/linux/console.h > >> +++ b/include/linux/console.h > >> @@ -179,6 +179,7 @@ extern void console_unlock(void); > >> extern void console_conditional_schedule(void); > >> extern void console_unblank(void); > >> extern void console_flush_on_panic(enum con_flush_mode mode); > >> +extern void console_lock_spinning_disable_on_panic(void); > >> extern struct tty_driver *console_device(int *); > >> extern void console_stop(struct console *); > >> extern void console_start(struct console *); > >> diff --git a/kernel/panic.c b/kernel/panic.c > >> index cefd7d82366f..a9340e580b20 100644 > >> --- a/kernel/panic.c > >> +++ b/kernel/panic.c > >> @@ -265,6 +265,13 @@ void panic(const char *fmt, ...) > >> crash_smp_send_stop(); > >> } > >> > >> + /* > >> + * Now that we've halted other CPUs, disable optimistic spinning in > >> + * printk(). This avoids deadlocking in console_trylock(), until we call > >> + * console_flush_on_panic(). > >> + */ > >> + console_lock_spinning_disable_on_panic(); > > > > The proposed implementation causes that the panicing CPU takes over > > console_lock even when the current owner is in the middle of > > call_console_drivers(). > > I don't think this is true. All that > console_lock_spinning_disable_on_panic() does is clear out the > console_owner and console_owner_lock, signalling that the panicking CPU > should not attempt to spin and wait for the lock. > > The only time this matters is when another task already holds > console_sem and is executing presently in call_console_drivers(), as you > said. > > With console_owner set to NULL, console_trylock_spinning() will get > spin=false, and thus it will bail out, returning 0. Thus, the panicking > CPU would fail to acquire console_sem, and so in vprintk_emit it will > not end up going into console_unlock(), or doing call_console_drivers(). > > It would not be until console_flush_on_panic() down below (when there is > no other option than to try to use the console) that the panicking CPU > actually attempts to call the drivers. You are right. I should have used drawing board. > > It means that calling console drivers from another CPU is not > > completely safe. This is why console_flush_on_panic() is called > > at end when the console is the only way to see the messages. > > > > We should not do this before kmsg_dump() and __crash_kexec() > > that allow to see the messages a "more safe" way. > > > > I suggest to disable the spinning when panic is in progress. I mean > > something like: > > > > --- a/kernel/printk/printk.c > > +++ b/kernel/printk/printk.c > > @@ -1842,6 +1842,10 @@ static int console_trylock_spinning(void) > > if (console_trylock()) > > return 1; > > > > + /* Spinning is not safe when the system is stopped */ > > + if (read_atomic(&panic_cpu) != PANIC_CPU_INVALID) > > + return 0; > > + > > printk_safe_enter_irqsave(flags); > > > > raw_spin_lock(&console_owner_lock); > > I think this is pretty much equivalent to my fix, since it also results > in the panicking CPU failing to acquire console_sem during > console_trylock_spinning(). Yes. > I do think this is better than my fix :) since it doesn't clutter up > panic() even more, nor introduce an additional function. It even avoids > resetting the console_owner_lock spinlock. Yes. I agree. It also looks slightly more straightforward to me. It might be matter of taste. It is just that I misunderstood the effect of your variant yesterday ;-) > I'd be happy to do this as a v2, if you'd prefer? Yes, please. Go for it. > > It would be also great when the current owner releases console_lock > > so that the panicing CPU could take over it. > > > > I think about the following. Well, I am not sure if it would help > > in the real life because other CPUs are stopped quite early in panic(). > > > > --- a/kernel/printk/printk.c > > +++ b/kernel/printk/printk.c > > @@ -2646,13 +2650,18 @@ void console_unlock(void) > > > > for (;;) { > > size_t ext_len = 0; > > - int handover; > > + int handover, pcpu; > > size_t len; > > > > skip: > > if (!prb_read_valid(prb, console_seq, &r)) > > break; > > > > + /* Allow the panic_cpu to take over consoles a safe way. */ > > + pcpu = read_atomic(&panic_cpu); > > + if (pcpu != PANIC_CPU_INVALID && pcpu != smp_processor_id()) > > + break; > > + > > if (console_seq != r.info->seq) { > > console_dropped += r.info->seq - console_seq; > > console_seq = r.info->seq; > > > > I suppose this could help reduce the odds of a CPU getting interrupted > during call_console_drivers(), and it might reduce the odds of > console_sem being held by a halted CPU during panic(). > > However, it occurs to me that there are two cases: > > 1) a kdump kernel is loaded. In this case, crash_smp_send_stop() is > used, which (at least on x86_64) sends NMIs. In this case, > __crash_kexec() will not return (barring any errors) and we won't ever > really need to get the console_sem. This is likely the more safe case. NMI will make sure that the current owner will not mess with the console drivers any longer. > 2) no kdump kernel is loaded. In this case, smp_send_stop() is used, > which (again on x86_64) sends regular IPIs. In this case, we know we can > at least avoid interrupting call_console_drivers(), since that executes > with local IRQ disabled. This is probably more dangerous case. Regular IPIs will not stop the current owner when it is running with IRQ disabled. It means the it could still manipulate consoles and race with console_flush_on_panic(). Note that printk() can be called in IRQ disabled context. Also note that some architectures do not have NMI. They use normal IRQ even in the 1st case. > So I'm also unsure how much this would help in the real world. But it's > a small change and so it doesn't need to help much in order to be a net > positive :) I agree that it is a corner case. But I think that it is worth it. Well, we could do this change separately. I could send the patch for this part if you would prefer it. But feel free to send it yourself. Best Regards, Petr ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] panic: disable optimistic spin after halting CPUs 2022-01-14 10:24 ` Petr Mladek @ 2022-01-18 23:13 ` Stephen Brennan 2022-01-20 16:48 ` Petr Mladek 0 siblings, 1 reply; 8+ messages in thread From: Stephen Brennan @ 2022-01-18 23:13 UTC (permalink / raw) To: Petr Mladek Cc: Greg Kroah-Hartman, Sergey Senozhatsky, Steven Rostedt, John Ogness, Sergey Senozhatsky, linux-kernel Petr Mladek <pmladek@suse.com> writes: > On Thu 2022-01-13 11:36:35, Stephen Brennan wrote: >> Petr Mladek <pmladek@suse.com> writes: >> > On Wed 2022-01-12 16:54:25, Stephen Brennan wrote: [snip] >> > I suggest to disable the spinning when panic is in progress. I mean >> > something like: >> > >> > --- a/kernel/printk/printk.c >> > +++ b/kernel/printk/printk.c >> > @@ -1842,6 +1842,10 @@ static int console_trylock_spinning(void) >> > if (console_trylock()) >> > return 1; >> > >> > + /* Spinning is not safe when the system is stopped */ >> > + if (read_atomic(&panic_cpu) != PANIC_CPU_INVALID) >> > + return 0; >> > + >> > printk_safe_enter_irqsave(flags); >> > >> > raw_spin_lock(&console_owner_lock); >> >> I think this is pretty much equivalent to my fix, since it also results >> in the panicking CPU failing to acquire console_sem during >> console_trylock_spinning(). > > Yes. > >> I do think this is better than my fix :) since it doesn't clutter up >> panic() even more, nor introduce an additional function. It even avoids >> resetting the console_owner_lock spinlock. > > Yes. I agree. > > It also looks slightly more straightforward to me. It might be matter > of taste. It is just that I misunderstood the effect of your variant > yesterday ;-) > >> I'd be happy to do this as a v2, if you'd prefer? > > Yes, please. Go for it. Hi Petr, Sorry for the delayed response due to the US holiday weekend. I switched to your approach, and began running it through my test script (from the commit message) on QEMU. However, 14% of the time (200/1409) I found that the system fell into an interesting race condition / loop. 178 void panic(const char *fmt, ...) 179 { ... 187 /* 188 * Disable local interrupts. This will prevent panic_smp_self_stop 189 * from deadlocking the first cpu that invokes the panic, since 190 * there is nothing to prevent an interrupt handler (that runs 191 * after setting panic_cpu) from invoking panic() again. 192 */ 193 local_irq_disable(); 194 preempt_disable_notrace(); ... 211 this_cpu = raw_smp_processor_id(); 212 old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu); 213 214 if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu) 215 panic_smp_self_stop(); ... 226 pr_emerg("Kernel panic - not syncing: %s\n", buf); ... 250 if (!_crash_kexec_post_notifiers) { // <- HALT CPUs ... We disable IRQ and preemption at the beginning of panic(), then set panic_cpu. This opens a window (between lines 212 and 250) where printk() spinning is disabled. Then, we go ahead and we use printk() during this window, at line 226. If another CPU is adding to the log buffer at the right time (concurrent with the pr_emerg in line 226), then they can successfully prevent the panic CPU from making progress. Since spinning is disabled, the other CPU in vprintk_emit() will never get the console_sem, and will just leave its buffered contents in the log buffer. And if the other CPU can add to the log buffer faster than the panic CPU can drain it... then the panic CPU is stuck forever writing into the console. And I do mean forever. If a watchdog triggers, it will find panic_cpu already set, and so it won't be able to do anything! Thus I'm now testing the following modification: // in console_trylock_spinning() if (atomic_read(&panic_cpu) == this_cpu) return 0; This would ensure that the panic CPU won't fall into the spin loop, but also ensures that other CPUs can't flood the panic CPU with buffered messages. I'll test this method and ensure there are no hangs, then send the patch. > >> > It would be also great when the current owner releases console_lock >> > so that the panicing CPU could take over it. >> > >> > I think about the following. Well, I am not sure if it would help >> > in the real life because other CPUs are stopped quite early in panic(). >> > >> > --- a/kernel/printk/printk.c >> > +++ b/kernel/printk/printk.c >> > @@ -2646,13 +2650,18 @@ void console_unlock(void) >> > >> > for (;;) { >> > size_t ext_len = 0; >> > - int handover; >> > + int handover, pcpu; >> > size_t len; >> > >> > skip: >> > if (!prb_read_valid(prb, console_seq, &r)) >> > break; >> > >> > + /* Allow the panic_cpu to take over consoles a safe way. */ >> > + pcpu = read_atomic(&panic_cpu); >> > + if (pcpu != PANIC_CPU_INVALID && pcpu != smp_processor_id()) >> > + break; >> > + >> > if (console_seq != r.info->seq) { >> > console_dropped += r.info->seq - console_seq; >> > console_seq = r.info->seq; >> > >> >> I suppose this could help reduce the odds of a CPU getting interrupted >> during call_console_drivers(), and it might reduce the odds of >> console_sem being held by a halted CPU during panic(). >> >> However, it occurs to me that there are two cases: >> >> 1) a kdump kernel is loaded. In this case, crash_smp_send_stop() is >> used, which (at least on x86_64) sends NMIs. In this case, >> __crash_kexec() will not return (barring any errors) and we won't ever >> really need to get the console_sem. > > This is likely the more safe case. NMI will make sure that > the current owner will not mess with the console drivers any longer. > > >> 2) no kdump kernel is loaded. In this case, smp_send_stop() is used, >> which (again on x86_64) sends regular IPIs. In this case, we know we can >> at least avoid interrupting call_console_drivers(), since that executes >> with local IRQ disabled. > > This is probably more dangerous case. Regular IPIs will not stop the > current owner when it is running with IRQ disabled. It means > the it could still manipulate consoles and race with > console_flush_on_panic(). On x86_64, smp_send_stop() will send an IPI and try to wait a grace period for CPUs to stop, and after that grace period, use NMI, so there is thankfully some protection. > > Note that printk() can be called in IRQ disabled context. Also note > that some architectures do not have NMI. They use normal IRQ even > in the 1st case. Yes, this is a very important consideration. > > >> So I'm also unsure how much this would help in the real world. But it's >> a small change and so it doesn't need to help much in order to be a net >> positive :) > > I agree that it is a corner case. But I think that it is worth it. > > Well, we could do this change separately. I could send the patch > for this part if you would prefer it. But feel free to send it > yourself. I'd be glad to write this patch, put it through my VM stress testing, and then send it. I've already got it setup so it might as well get put to good use :) Thanks, Stephen > > Best Regards, > Petr ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] panic: disable optimistic spin after halting CPUs 2022-01-18 23:13 ` Stephen Brennan @ 2022-01-20 16:48 ` Petr Mladek 0 siblings, 0 replies; 8+ messages in thread From: Petr Mladek @ 2022-01-20 16:48 UTC (permalink / raw) To: Stephen Brennan Cc: Greg Kroah-Hartman, Sergey Senozhatsky, Steven Rostedt, John Ogness, Sergey Senozhatsky, linux-kernel On Tue 2022-01-18 15:13:51, Stephen Brennan wrote: > Petr Mladek <pmladek@suse.com> writes: > > On Thu 2022-01-13 11:36:35, Stephen Brennan wrote: > Hi Petr, > > Sorry for the delayed response due to the US holiday weekend. No problem at all :-) > I switched to your approach, and began running it through my test script > (from the commit message) on QEMU. However, 14% of the time (200/1409) I > found that the system fell into an interesting race condition / loop. > > 178 void panic(const char *fmt, ...) > 179 { > ... > 187 /* > 188 * Disable local interrupts. This will prevent panic_smp_self_stop > 189 * from deadlocking the first cpu that invokes the panic, since > 190 * there is nothing to prevent an interrupt handler (that runs > 191 * after setting panic_cpu) from invoking panic() again. > 192 */ > 193 local_irq_disable(); > 194 preempt_disable_notrace(); > ... > 211 this_cpu = raw_smp_processor_id(); > 212 old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu); > 213 > 214 if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu) > 215 panic_smp_self_stop(); > ... > 226 pr_emerg("Kernel panic - not syncing: %s\n", buf); > ... > 250 if (!_crash_kexec_post_notifiers) { // <- HALT CPUs > ... > > We disable IRQ and preemption at the beginning of panic(), then set > panic_cpu. This opens a window (between lines 212 and 250) where > printk() spinning is disabled. Then, we go ahead and we use printk() > during this window, at line 226. > > If another CPU is adding to the log buffer at the right time (concurrent > with the pr_emerg in line 226), then they can successfully prevent the > panic CPU from making progress. Since spinning is disabled, the other > CPU in vprintk_emit() will never get the console_sem, and will just > leave its buffered contents in the log buffer. And if the other CPU can > add to the log buffer faster than the panic CPU can drain it... then the > panic CPU is stuck forever writing into the console. And I do mean > forever. Yeah. The console waiter logic helps a bit to avoid this livelock. > If a watchdog triggers, it will find panic_cpu already set, and > so it won't be able to do anything! > > Thus I'm now testing the following modification: > > // in console_trylock_spinning() > if (atomic_read(&panic_cpu) == this_cpu) > return 0; > > This would ensure that the panic CPU won't fall into the spin loop, but > also ensures that other CPUs can't flood the panic CPU with buffered > messages. Hmm, I am not sure if we really want it: The other CPU will likely become the console_sem owner. If the other CPU is running, the current console_sem owner is likely running as well and will be able to pass console_sem. Yes, it will throttle the other CPU. It will become busy with pushing its own messages to the console. So far so good. But panic() will stop the other CPU at some stage. And the panic CPU will not be able to push the messages to the console because @console_sem is owned by the other (stopped) CPU. panic CPU will try to get the messages out later in console_flush_on_panic(). But it is prone to deadlocks. We want panic CPU to become console_sem owner and be able to flush the messages to the consoles a clean way. This is why I proposed the 2nd patch where the non-panic owner releases console_sem. I wonder if we could prevent the livelock another way. The livelock happens when the messages added faster into the log buffer than they can reach the console. The result must be that some messages are lost, see in console_unlock(): if (console_seq != r.info->seq) { console_dropped += r.info->seq - console_seq; console_seq = r.info->seq; } A solution might be to disable printk() on non-panic CPUs when some messages are repeatedly dropped during panic. I mean something like: void console_unlock(void) { [...] + static panic_console_dropped; [...] if (console_seq != r.info->seq) { console_dropped += r.info->seq - console_seq; console_seq = r.info->seq; + + if (read_atomic(&panic_cpu) != PANIC_CPU_INVALID) { + if (panic_console_dropped++ > 10) + suppress_non_panic_printk = 1; + } } , where "suppress_non_panic_printk" will have similar effect as "supress_printk" except that it will supress only printk on non-panic CPUs. I am not completely sure that it is the right approach. But it is not completely bad: + allows the panic() CPU to become clean console_sem owner + blocks printk() only when many messages are lost anyway > > Well, we could do this change separately. I could send the patch > > for this part if you would prefer it. But feel free to send it > > yourself. > > I'd be glad to write this patch, put it through my VM stress testing, > and then send it. I've already got it setup so it might as well get put > to good use :) Great, thanks. Best Regards, Petr ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-01-20 16:49 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-01-13 0:54 [PATCH] panic: disable optimistic spin after halting CPUs Stephen Brennan 2022-01-13 7:03 ` kernel test robot 2022-01-13 8:15 ` kernel test robot 2022-01-13 17:39 ` Petr Mladek 2022-01-13 19:36 ` Stephen Brennan 2022-01-14 10:24 ` Petr Mladek 2022-01-18 23:13 ` Stephen Brennan 2022-01-20 16:48 ` Petr Mladek
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.