* [PATCH V4] panic: Move panic_print before kmsg dumpers
@ 2022-01-24 20:31 Guilherme G. Piccoli
From: Guilherme G. Piccoli @ 2022-01-24 20:31 UTC
To: kexec

The panic_print setting allows users to collect more information in a
panic event, like memory stats, tasks, CPUs backtraces, etc.
This is an interesting debug mechanism, but currently the print event
happens *after* kmsg_dump(), meaning that pstore, for example, cannot
collect a dmesg with the panic_print extra information.

This patch changes that in 2 ways:

(a) The panic_print setting allows to replay the existing kernel log
buffer to the console (bit 5), besides the extra information dump.
This functionality makes sense only at the end of the panic() function.
So, we hereby allow to distinguish the two situations by a new boolean
parameter in the function panic_print_sys_info().

(b) With the above change, we can safely call panic_print_sys_info()
before kmsg_dump(), allowing to dump the extra information when using
pstore or other kmsg dumpers.

The additional messages from panic_print could overwrite the oldest
messages when the buffer is full. The only reasonable solution is to
use a large enough log buffer, hence we added an advice into the kernel
parameters documentation about that.

Finally, some panic notifiers might reset watchdogs, like RCU or hung
task detector. Due to that, it's optimal to dump the extra information
from panic_print after the notifiers, when possible. Sometimes it's not
possible though - for example, when users have kdump set but don't pass
"crash_kexec_post_notifiers" in the kernel command-line. For this
reason, we kept 2 calls for panic_print_sys_info().

Cc: Baoquan He <bhe@redhat.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Petr Mladek <pmladek@suse.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
---

V4:
* Addressed feedback from Petr (thanks!), by refactoring the commit
message to be more direct plus his suggestions to improve code path
and the comment.

* Addressed feedback from Baoquan (thanks!) about the new boolean
parameter name on panic_print_sys_info().

V3: https://lore.kernel.org/lkml/20220114183046.428796-1-gpiccoli@igalia.com
V2: https://lore.kernel.org/lkml/20220106212835.119409-1-gpiccoli@igalia.com
V1: https://lore.kernel.org/lkml/20211230161828.121858-1-gpiccoli@igalia.com

 .../admin-guide/kernel-parameters.txt         |  4 +++
 kernel/panic.c                                | 34 ++++++++++++-------
 2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a069d8fe2fee..0f5cbe141bfd 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3727,6 +3727,10 @@
 			bit 4: print ftrace buffer
 			bit 5: print all printk messages in buffer
 			bit 6: print all CPUs backtrace (if available in the arch)
+			*Be aware* that this option may print a _lot_ of lines,
+			so there are risks of losing older messages in the log.
+			Use this option carefully, maybe worth to setup a
+			bigger log buffer with "log_buf_len" along with this.
 
 	panic_on_taint=	Bitmask for conditionally calling panic() in add_taint()
 			Format: <hex>[,nousertaint]
diff --git a/kernel/panic.c b/kernel/panic.c
index 41ecf9ab824a..b274e6c241d9 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -148,10 +148,13 @@ void nmi_panic(struct pt_regs *regs, const char *msg)
 }
 EXPORT_SYMBOL(nmi_panic);
 
-static void panic_print_sys_info(void)
+static void panic_print_sys_info(bool console_flush)
 {
-	if (panic_print & PANIC_PRINT_ALL_PRINTK_MSG)
-		console_flush_on_panic(CONSOLE_REPLAY_ALL);
+	if (console_flush) {
+		if (panic_print & PANIC_PRINT_ALL_PRINTK_MSG)
+			console_flush_on_panic(CONSOLE_REPLAY_ALL);
+		return;
+	}
 
 	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
 		trigger_all_cpu_backtrace();
@@ -244,22 +247,20 @@ void panic(const char *fmt, ...)
 	 */
 	kgdb_panic(buf);
 
-	/*
-	 * If we have a kdump kernel loaded, give a chance to panic_print
-	 * show some extra information on kernel log if it was set...
-	 */
-	if (kexec_crash_loaded())
-		panic_print_sys_info();
-
 	/*
 	 * If we have crashed and we have a crash kernel loaded let it handle
-	 * everything else.
+	 * everything else. Also, give a chance to panic_print show some extra
+	 * information on kernel log if it was set...
+	 *
 	 * If we want to run this after calling panic_notifiers, pass
 	 * the "crash_kexec_post_notifiers" option to the kernel.
 	 *
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
 	if (!_crash_kexec_post_notifiers) {
+		if (kexec_crash_loaded())
+			panic_print_sys_info(false);
+
 		__crash_kexec(NULL);
 
 		/*
@@ -283,6 +284,15 @@ void panic(const char *fmt, ...)
 	 */
 	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
 
+	/*
+	 * If a crash kernel is not loaded (or if it's loaded but we still
+	 * want to allow the panic notifiers), then we dump panic_print after
+	 * the notifiers - some notifiers disable watchdogs, for example, so
+	 * we reduce the risk of lockups/hangs or garbled output this way.
+	 */
+	if (_crash_kexec_post_notifiers || !kexec_crash_loaded())
+		panic_print_sys_info(false);
+
 	kmsg_dump(KMSG_DUMP_PANIC);
 
 	/*
@@ -313,7 +323,7 @@ void panic(const char *fmt, ...)
 	debug_locks_off();
 	console_flush_on_panic(CONSOLE_FLUSH_PENDING);
 
-	panic_print_sys_info();
+	panic_print_sys_info(true);
 
 	if (!panic_blink)
 		panic_blink = no_blink;
-- 
2.34.1
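
For readers following the thread, the effect of the new boolean parameter can be sketched in plain userspace C. This is only a model under stated assumptions, not kernel code: the two PANIC_PRINT_* values follow the bit 5 / bit 6 positions listed in kernel-parameters.txt, while model_panic_print() and the ACT_* flags are hypothetical names invented for this illustration. The lower bits (bit 4 and below) are lumped into a single "other info" action, whereas the real function checks each bit separately.

```c
#include <stdbool.h>

/* Bit positions as documented for panic_print in kernel-parameters.txt. */
#define PANIC_PRINT_ALL_PRINTK_MSG	(1u << 5)
#define PANIC_PRINT_ALL_CPU_BT		(1u << 6)

/* Hypothetical action flags, used only by this model. */
#define ACT_REPLAY_LOG	(1u << 0)	/* stands in for console_flush_on_panic(CONSOLE_REPLAY_ALL) */
#define ACT_CPU_BT	(1u << 1)	/* stands in for trigger_all_cpu_backtrace() */
#define ACT_OTHER_INFO	(1u << 2)	/* stands in for the dumps selected by bits 0..4 */

/*
 * Model of panic_print_sys_info(bool console_flush) after the patch:
 * returns which actions would run for a given panic_print mask.
 */
unsigned int model_panic_print(unsigned int panic_print, bool console_flush)
{
	unsigned int actions = 0;

	if (console_flush) {
		/* End-of-panic call: only the log replay (bit 5) is considered. */
		if (panic_print & PANIC_PRINT_ALL_PRINTK_MSG)
			actions |= ACT_REPLAY_LOG;
		return actions;
	}

	/* Early call (before kmsg_dump): everything except the log replay. */
	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
		actions |= ACT_CPU_BT;
	if (panic_print & 0x1fu)
		actions |= ACT_OTHER_INFO;

	return actions;
}
```

With this model, a mask of 0x20 produces the log replay only when console_flush is true, and a mask of 0x40 produces the backtrace only when console_flush is false, which is the separation described in item (a) of the commit message.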
* [PATCH V4] panic: Move panic_print before kmsg dumpers
  2022-01-24 20:31 Guilherme G. Piccoli
@ 2022-01-26  5:22 Baoquan He
From: Baoquan He @ 2022-01-26  5:22 UTC
To: kexec

On 01/24/22 at 05:31pm, Guilherme G. Piccoli wrote:
Format: <hex>[,nousertaint]
...snip...
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 41ecf9ab824a..b274e6c241d9 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -148,10 +148,13 @@ void nmi_panic(struct pt_regs *regs, const char *msg)
>  }
>  EXPORT_SYMBOL(nmi_panic);
>  
> -static void panic_print_sys_info(void)
> +static void panic_print_sys_info(bool console_flush)
>  {
> -	if (panic_print & PANIC_PRINT_ALL_PRINTK_MSG)
> -		console_flush_on_panic(CONSOLE_REPLAY_ALL);
> +	if (console_flush) {
> +		if (panic_print & PANIC_PRINT_ALL_PRINTK_MSG)
> +			console_flush_on_panic(CONSOLE_REPLAY_ALL);
> +		return;
> +	}
>  
>  	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
>  		trigger_all_cpu_backtrace();
> @@ -244,22 +247,20 @@ void panic(const char *fmt, ...)
>  	 */
>  	kgdb_panic(buf);
>  
> -	/*
> -	 * If we have a kdump kernel loaded, give a chance to panic_print
> -	 * show some extra information on kernel log if it was set...
> -	 */
> -	if (kexec_crash_loaded())
> -		panic_print_sys_info();
> -
>  	/*
>  	 * If we have crashed and we have a crash kernel loaded let it handle
> -	 * everything else.
> +	 * everything else. Also, give a chance to panic_print show some extra
> +	 * information on kernel log if it was set...
> +	 *
>  	 * If we want to run this after calling panic_notifiers, pass
>  	 * the "crash_kexec_post_notifiers" option to the kernel.
>  	 *
>  	 * Bypass the panic_cpu check and call __crash_kexec directly.
>  	 */
>  	if (!_crash_kexec_post_notifiers) {
> +		if (kexec_crash_loaded())
> +			panic_print_sys_info(false);
> +

Please reconsider this change. As I said in another thread, it's not
suggested when adding any action before kdump switching and the action
doesn't benefit kdump switching.

We don't oppose execute handling before kdump switching as long as
it's executed conditionally. For those conditional extra handling and
the following crash dumping's stability, it's not under kdump's care.

>  		__crash_kexec(NULL);
>  
>  		/*
> @@ -283,6 +284,15 @@ void panic(const char *fmt, ...)
>  	 */
>  	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
>  
> +	/*
> +	 * If a crash kernel is not loaded (or if it's loaded but we still
> +	 * want to allow the panic notifiers), then we dump panic_print after
> +	 * the notifiers - some notifiers disable watchdogs, for example, so
> +	 * we reduce the risk of lockups/hangs or garbled output this way.
> +	 */
> +	if (_crash_kexec_post_notifiers || !kexec_crash_loaded())
> +		panic_print_sys_info(false);
> +
>  	kmsg_dump(KMSG_DUMP_PANIC);
>  
>  	/*
> @@ -313,7 +323,7 @@ void panic(const char *fmt, ...)
>  	debug_locks_off();
>  	console_flush_on_panic(CONSOLE_FLUSH_PENDING);
>  
> -	panic_print_sys_info();
> +	panic_print_sys_info(true);
>  
>  	if (!panic_blink)
>  		panic_blink = no_blink;
> -- 
> 2.34.1
> 
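
To make the path under discussion easier to see, the V4 ordering in panic() can be walked through with a small userspace model. This is a rough sketch and not kernel code: struct panic_trace and model_panic_flow() are hypothetical names made up for illustration, and the model assumes that __crash_kexec(NULL) does not return when a crash kernel is loaded, which is a simplification.

```c
#include <stdbool.h>

/* Hypothetical record of what the modeled panic() path did. */
struct panic_trace {
	int print_before_kexec;		/* panic_print_sys_info(false) calls before __crash_kexec() */
	int print_after_notifiers;	/* panic_print_sys_info(false) calls after the notifiers */
	bool reached_kmsg_dump;		/* whether kmsg_dump(KMSG_DUMP_PANIC) would run */
};

/*
 * Walks the V4 ordering for one configuration:
 *   post_notifiers - models _crash_kexec_post_notifiers
 *   crash_loaded   - models kexec_crash_loaded()
 */
struct panic_trace model_panic_flow(bool post_notifiers, bool crash_loaded)
{
	struct panic_trace t = { 0, 0, false };

	if (!post_notifiers) {
		if (crash_loaded)
			t.print_before_kexec++;

		/* Assumption: with a crash kernel loaded, __crash_kexec() never returns. */
		if (crash_loaded)
			return t;
	}

	/* The panic notifier chain would run here. */

	if (post_notifiers || !crash_loaded)
		t.print_after_notifiers++;

	t.reached_kmsg_dump = true;
	return t;
}
```

Under these assumptions every configuration dumps the extra information exactly once. Only the case with a crash kernel loaded and no "crash_kexec_post_notifiers" does so before the switch to the kdump kernel, and that is the call the review comment above asks to reconsider.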
* [PATCH V4] panic: Move panic_print before kmsg dumpers
  2022-01-26  5:22 Baoquan He
@ 2022-01-27 16:47 Guilherme G. Piccoli
From: Guilherme G. Piccoli @ 2022-01-27 16:47 UTC
To: kexec

On 26/01/2022 02:22, Baoquan He wrote:
> [...]
>>  	if (!_crash_kexec_post_notifiers) {
>> +		if (kexec_crash_loaded())
>> +			panic_print_sys_info(false);
>> +
>
> Please reconsider this change. As I said in another thread, it's not
> suggested when adding any action before kdump switching and the action
> doesn't benefit kdump switching.
>
> We don't oppose execute handling before kdump switching as long as
> it's executed conditionally. For those conditional extra handling and
> the following crash dumping's stability, it's not under kdump's care.
>

Hi Baoquan, thanks for your review - I understand your concern, so let's
reconsider the change, as you suggest.

The only thing is that the specific bit that concerns you was not really
added by this patch; it came from another patch I submitted, which has
reached linux-next.

So, Andrew: can I ask you to please remove the following patch from
linux-next?

"panic: allow printing extra panic information on kdump"
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=56439cb78293

(I'll also send this request in the original thread of the patch, for
completeness).

Baoquan: once it's removed from linux-next, I'll rework this proposed
patch and send a V5, hopefully a version that you consider safer =)

Cheers,


Guilherme