public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] printk: CPU backtrace not printing on panic
@ 2024-08-12  7:21 takakura
  2024-08-12  7:27 ` [PATCH v3 1/2] Allow cpu backtraces to be written into ringbuffer during panic takakura
  2024-08-12  7:29 ` [PATCH v3 2/2] Handle flushing of CPU backtraces " takakura
  0 siblings, 2 replies; 11+ messages in thread
From: takakura @ 2024-08-12  7:21 UTC (permalink / raw)
  To: pmladek, rostedt, john.ogness, senozhatsky, akpm, bhe, lukas,
	wangkefeng.wang, ubizjak, feng.tang, j.granados,
	stephen.s.brennan
  Cc: linux-kernel, nishimura, taka, Ryo Takakura

From: Ryo Takakura <takakura@valinux.co.jp>

Hi! 

This patchset fixes 2 issues on CPU backtrace during panic. The second 
issue was pointed out by John [1].

(1) CPU backtrace triggered during panic has stopped working since the 
commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer") 
as it disabled non-panicked cpus writing into ringbuffer after panic.

(2) The usual(non-panic context) flushing of backtraces written into 
ringbuffer does not work during panic as non-panicked CPUs can't do the 
flushing themselves.

Reviewed-by: John Ogness <john.ogness@linutronix.de>

Sincerely,
Ryo Takakura

---

V1:
[1] https://lore.kernel.org/all/20240729114601.176047-1-takakura@valinux.co.jp/T/

Changes since V2:
[2] https://lore.kernel.org/lkml/20240803081649.224627-1-takakura@valinux.co.jp/T/

- Thanks Petr and John for the feedbacks!
- Add reviewed-by. Petr on (1) and John on the series.
- Fix the order of the patches based on their dependency as suggested by John.
- For (2), instead of using console_flush_on_panic() for flushing backtraces, 
introduce a function which can safely do the flushing in panic while non-panic 
CPU's are still online as suggested by Petr.  Note that the new function calls 
is_printk_legacy_deferred() which is not yet in mainline [3].
[3] https://lore.kernel.org/all/20240804005138.3722656-24-john.ogness@linutronix.de/

---

Ryo Takakura (2):
  (1) Allow cpu backtraces to be written into ringbuffer during panic
  (2) Handle flushing of CPU backtraces during panic

 include/linux/console.h |  1 +
 include/linux/panic.h   |  1 +
 kernel/panic.c          |  9 ++++++++-
 kernel/printk/printk.c  | 16 +++++++++++++++-
 4 files changed, 25 insertions(+), 2 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/2] Allow cpu backtraces to be written into ringbuffer during panic
  2024-08-12  7:21 [PATCH v3 0/2] printk: CPU backtrace not printing on panic takakura
@ 2024-08-12  7:27 ` takakura
  2024-08-13 13:48   ` Petr Mladek
  2024-08-12  7:29 ` [PATCH v3 2/2] Handle flushing of CPU backtraces " takakura
  1 sibling, 1 reply; 11+ messages in thread
From: takakura @ 2024-08-12  7:27 UTC (permalink / raw)
  To: pmladek, rostedt, john.ogness, senozhatsky, akpm, bhe, lukas,
	wangkefeng.wang, ubizjak, feng.tang, j.granados,
	stephen.s.brennan
  Cc: linux-kernel, nishimura, taka, Ryo Takakura

From: Ryo Takakura <takakura@valinux.co.jp>

commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing
to ringbuffer") disabled non-panic CPUs to further write messages to
ringbuffer after panicked.

Since the commit, non-panicked CPU's are not allowed to write to
ring buffer after panicked and CPU backtrace which is triggered
after panicked to sample non-panicked CPUs' backtrace no longer
serves its function as it has nothing to print.

Fix the issue by allowing non-panicked CPUs to write into ringbuffer
while CPU backtrace is in flight.

Fixes: 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer")
Signed-off-by: Ryo Takakura <takakura@valinux.co.jp>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/panic.h  | 1 +
 kernel/panic.c         | 8 +++++++-
 kernel/printk/printk.c | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/linux/panic.h b/include/linux/panic.h
index 3130e0b51..54d90b6c5 100644
--- a/include/linux/panic.h
+++ b/include/linux/panic.h
@@ -16,6 +16,7 @@ extern void oops_enter(void);
 extern void oops_exit(void);
 extern bool oops_may_print(void);
 
+extern bool panic_triggering_all_cpu_backtrace;
 extern int panic_timeout;
 extern unsigned long panic_print;
 extern int panic_on_oops;
diff --git a/kernel/panic.c b/kernel/panic.c
index f861bedc1..2a0449144 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -64,6 +64,8 @@ unsigned long panic_on_taint;
 bool panic_on_taint_nousertaint = false;
 static unsigned int warn_limit __read_mostly;
 
+bool panic_triggering_all_cpu_backtrace;
+
 int panic_timeout = CONFIG_PANIC_TIMEOUT;
 EXPORT_SYMBOL_GPL(panic_timeout);
 
@@ -253,8 +255,12 @@ void check_panic_on_warn(const char *origin)
  */
 static void panic_other_cpus_shutdown(bool crash_kexec)
 {
-	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
+	if (panic_print & PANIC_PRINT_ALL_CPU_BT) {
+		/* Temporary allow non-panic CPUs to write their backtraces. */
+		panic_triggering_all_cpu_backtrace = true;
 		trigger_all_cpu_backtrace();
+		panic_triggering_all_cpu_backtrace = false;
+	}
 
 	/*
 	 * Note that smp_send_stop() is the usual SMP shutdown function,
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 054c0e778..c22b07049 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2316,7 +2316,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * non-panic CPUs are generating any messages, they will be
 	 * silently dropped.
 	 */
-	if (other_cpu_in_panic())
+	if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
 		return 0;
 
 	if (level == LOGLEVEL_SCHED) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/2] Handle flushing of CPU backtraces during panic
  2024-08-12  7:21 [PATCH v3 0/2] printk: CPU backtrace not printing on panic takakura
  2024-08-12  7:27 ` [PATCH v3 1/2] Allow cpu backtraces to be written into ringbuffer during panic takakura
@ 2024-08-12  7:29 ` takakura
  2024-08-13 13:45   ` Petr Mladek
  1 sibling, 1 reply; 11+ messages in thread
From: takakura @ 2024-08-12  7:29 UTC (permalink / raw)
  To: pmladek, rostedt, john.ogness, senozhatsky, akpm, bhe, lukas,
	wangkefeng.wang, ubizjak, feng.tang, j.granados,
	stephen.s.brennan
  Cc: linux-kernel, nishimura, taka, Ryo Takakura

From: Ryo Takakura <takakura@valinux.co.jp>

After panic, non-panicked CPU's has been unable to flush ringbuffer
while they can still write into it. This can affect CPU backtrace
triggered in panic only able to write into ringbuffer incapable of
flushing them.

Fix the issue by letting the panicked CPU handle the flushing of
ringbuffer right after non-panicked CPUs finished writing their
backtraces.

Signed-off-by: Ryo Takakura <takakura@valinux.co.jp>
---
 include/linux/console.h |  1 +
 kernel/panic.c          |  1 +
 kernel/printk/printk.c  | 14 ++++++++++++++
 3 files changed, 16 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index 31a8f5b85..c7eb6f785 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -504,6 +504,7 @@ extern void console_unlock(void);
 extern void console_conditional_schedule(void);
 extern void console_unblank(void);
 extern void console_flush_on_panic(enum con_flush_mode mode);
+extern void console_try_flush(void);
 extern struct tty_driver *console_device(int *);
 extern void console_stop(struct console *);
 extern void console_start(struct console *);
diff --git a/kernel/panic.c b/kernel/panic.c
index 2a0449144..6519cc6bd 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -260,6 +260,7 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
 		panic_triggering_all_cpu_backtrace = true;
 		trigger_all_cpu_backtrace();
 		panic_triggering_all_cpu_backtrace = false;
+		console_try_flush();
 	}
 
 	/*
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index c22b07049..517b8b02e 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3284,6 +3284,20 @@ void console_flush_on_panic(enum con_flush_mode mode)
 	console_flush_all(false, &next_seq, &handover);
 }
 
+/**
+ * console_try_flush - try to flush consoles when safe
+ *
+ * Context: Any, except for NMI.
+ */
+void console_try_flush(void)
+{
+	if (is_printk_legacy_deferred())
+		return;
+
+	if (console_trylock())
+		console_unlock();
+}
+
 /*
  * Return the console tty driver structure and its associated index
  */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] Handle flushing of CPU backtraces during panic
  2024-08-12  7:29 ` [PATCH v3 2/2] Handle flushing of CPU backtraces " takakura
@ 2024-08-13 13:45   ` Petr Mladek
  2024-08-15 10:43     ` takakura
  2024-08-19 19:23     ` John Ogness
  0 siblings, 2 replies; 11+ messages in thread
From: Petr Mladek @ 2024-08-13 13:45 UTC (permalink / raw)
  To: takakura
  Cc: rostedt, john.ogness, senozhatsky, akpm, bhe, lukas,
	wangkefeng.wang, ubizjak, feng.tang, j.granados,
	stephen.s.brennan, linux-kernel, nishimura, taka

On Mon 2024-08-12 16:29:31, takakura@valinux.co.jp wrote:
> From: Ryo Takakura <takakura@valinux.co.jp>
> 
> After panic, non-panicked CPU's has been unable to flush ringbuffer
> while they can still write into it. This can affect CPU backtrace
> triggered in panic only able to write into ringbuffer incapable of
> flushing them.

I still think about this. The motivation is basically the same
as in the commit 5d5e4522a7f404d1a96fd ("printk: restore flushing
of NMI buffers on remote CPUs after NMI backtraces").

It would make sense to replace printk_trigger_flush() with
console_try_flush(). And the function should queue the irq
work when it can't be flushed directly, see below.

> Fix the issue by letting the panicked CPU handle the flushing of
> ringbuffer right after non-panicked CPUs finished writing their
> backtraces.
> 
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -260,6 +260,7 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
>  		panic_triggering_all_cpu_backtrace = true;
>  		trigger_all_cpu_backtrace();
>  		panic_triggering_all_cpu_backtrace = false;
> +		console_try_flush();
>  	}
>  
>  	/*
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3284,6 +3284,20 @@ void console_flush_on_panic(enum con_flush_mode mode)
>  	console_flush_all(false, &next_seq, &handover);
>  }
>  
> +/**
> + * console_try_flush - try to flush consoles when safe
> + *
> + * Context: Any, except for NMI.

It is safe even in NMI. is_printk_legacy_deferred() would return true
in this case.

> + */
> +void console_try_flush(void)
> +{
> +	if (is_printk_legacy_deferred())
> +		return;
> +
> +	if (console_trylock())
> +		console_unlock();
> +}

I would do something like:

/**
 * console_try_or_trigger_flush - try to flush consoles directly when
 *	safe or the trigger deferred flush.
 *
 * Context: Any
 */
void console_try_or_trigger_flush(void)
{
	if (!is_printk_legacy_deferred() && console_trylock())
		console_unlock();
	else
		defer_console_output();
}

and use it instead of printk_trigger_flush() in
nmi_trigger_cpumask_backtrace().


Well, I would postpone this patch after we finalize the patchset
adding con->write_atomic() callback. This patch depends on it anyway
via is_printk_legacy_deferred(). The patchset might also add
other wrappers for flushing consoles and we have to choose some
reasonable names. Or John could integrate this patch into the patchset.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] Allow cpu backtraces to be written into ringbuffer during panic
  2024-08-12  7:27 ` [PATCH v3 1/2] Allow cpu backtraces to be written into ringbuffer during panic takakura
@ 2024-08-13 13:48   ` Petr Mladek
  2024-08-15 10:41     ` takakura
  0 siblings, 1 reply; 11+ messages in thread
From: Petr Mladek @ 2024-08-13 13:48 UTC (permalink / raw)
  To: takakura
  Cc: rostedt, john.ogness, senozhatsky, akpm, bhe, lukas,
	wangkefeng.wang, ubizjak, feng.tang, j.granados,
	stephen.s.brennan, linux-kernel, nishimura, taka

On Mon 2024-08-12 16:27:03, takakura@valinux.co.jp wrote:
> From: Ryo Takakura <takakura@valinux.co.jp>
> 
> commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing
> to ringbuffer") disabled non-panic CPUs to further write messages to
> ringbuffer after panicked.
> 
> Since the commit, non-panicked CPU's are not allowed to write to
> ring buffer after panicked and CPU backtrace which is triggered
> after panicked to sample non-panicked CPUs' backtrace no longer
> serves its function as it has nothing to print.
> 
> Fix the issue by allowing non-panicked CPUs to write into ringbuffer
> while CPU backtrace is in flight.
> 
> Fixes: 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer")
> Signed-off-by: Ryo Takakura <takakura@valinux.co.jp>
> Reviewed-by: Petr Mladek <pmladek@suse.com>

JFYI, I have pushed this patch into printk/linux.git, branch for-6.11-fixup.
I am going to create pull request after it spends at least one or two
days in linux-next.

The 2nd patch is more complicated. It depends on another patchset
integrating con->write_atomic() callback. 

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] Allow cpu backtraces to be written into ringbuffer during panic
  2024-08-13 13:48   ` Petr Mladek
@ 2024-08-15 10:41     ` takakura
  0 siblings, 0 replies; 11+ messages in thread
From: takakura @ 2024-08-15 10:41 UTC (permalink / raw)
  To: pmladek, john.ogness
  Cc: akpm, bhe, feng.tang, j.granados, linux-kernel, lukas, nishimura,
	rostedt, senozhatsky, stephen.s.brennan, taka, takakura, ubizjak,
	wangkefeng.wang

Hi Petr and John,

On Tue 2024-08-13 13:48, Petr Mladek wrote:
>On Mon 2024-08-12 16:27:03, takakura@valinux.co.jp wrote:
>> From: Ryo Takakura <takakura@valinux.co.jp>
>> 
>> commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing
>> to ringbuffer") disabled non-panic CPUs to further write messages to
>> ringbuffer after panicked.
>> 
>> Since the commit, non-panicked CPU's are not allowed to write to
>> ring buffer after panicked and CPU backtrace which is triggered
>> after panicked to sample non-panicked CPUs' backtrace no longer
>> serves its function as it has nothing to print.
>> 
>> Fix the issue by allowing non-panicked CPUs to write into ringbuffer
>> while CPU backtrace is in flight.
>> 
>> Fixes: 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer")
>> Signed-off-by: Ryo Takakura <takakura@valinux.co.jp>
>> Reviewed-by: Petr Mladek <pmladek@suse.com>
>
>JFYI, I have pushed this patch into printk/linux.git, branch for-6.11-fixup.
>I am going to create pull request after it spends at least one or two
>days in linux-next.

Thanks!!

>The 2nd patch is more complicated. It depends on another patchset
>integrating con->write_atomic() callback. 
>
>Best Regards,
>Petr

Sincerely,
Ryo Takakura

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] Handle flushing of CPU backtraces during panic
  2024-08-13 13:45   ` Petr Mladek
@ 2024-08-15 10:43     ` takakura
  2024-08-19 19:23     ` John Ogness
  1 sibling, 0 replies; 11+ messages in thread
From: takakura @ 2024-08-15 10:43 UTC (permalink / raw)
  To: pmladek, john.ogness
  Cc: akpm, bhe, feng.tang, j.granados, linux-kernel, lukas, nishimura,
	rostedt, senozhatsky, stephen.s.brennan, taka, takakura, ubizjak,
	wangkefeng.wang

Hi Petr and John,

On Tue 2024-08-13 13:45, Petr Mladek wrote:
>On Mon 2024-08-12 16:29:31, takakura@valinux.co.jp wrote:
>> From: Ryo Takakura <takakura@valinux.co.jp>
>> 
>> After panic, non-panicked CPU's has been unable to flush ringbuffer
>> while they can still write into it. This can affect CPU backtrace
>> triggered in panic only able to write into ringbuffer incapable of
>> flushing them.
>
>I still think about this. The motivation is basically the same
>as in the commit 5d5e4522a7f404d1a96fd ("printk: restore flushing
>of NMI buffers on remote CPUs after NMI backtraces").
>
>It would make sense to replace printk_trigger_flush() with
>console_try_flush(). And the function should queue the irq
>work when it can't be flushed directly, see below.

Yes, I agree.

>> Fix the issue by letting the panicked CPU handle the flushing of
>> ringbuffer right after non-panicked CPUs finished writing their
>> backtraces.
>> 
>> --- a/kernel/panic.c
>> +++ b/kernel/panic.c
>> @@ -260,6 +260,7 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
>>  		panic_triggering_all_cpu_backtrace = true;
>>  		trigger_all_cpu_backtrace();
>>  		panic_triggering_all_cpu_backtrace = false;
>> +		console_try_flush();
>>  	}
>> 
>>  	/*
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -3284,6 +3284,20 @@ void console_flush_on_panic(enum con_flush_mode mode)
>>  	console_flush_all(false, &next_seq, &handover);
>>  }
>> 
>> +/**
>> + * console_try_flush - try to flush consoles when safe
>> + *
>> + * Context: Any, except for NMI.
>
>It is safe even in NMI. is_printk_legacy_deferred() would return true
>in this case.

I'm sorry... I didn't notice this. Thanks.

>> + */
>> +void console_try_flush(void)
>> +{
>> +	if (is_printk_legacy_deferred())
>> +		return;
>> +
>> +	if (console_trylock())
>> +		console_unlock();
>> +}
>
>I would do something like:
>
>/**
> * console_try_or_trigger_flush - try to flush consoles directly when
> *	safe or the trigger deferred flush.
> *
> * Context: Any
> */
>void console_try_or_trigger_flush(void)
>{
>	if (!is_printk_legacy_deferred() && console_trylock())
>		console_unlock();
>	else
>		defer_console_output();
>}
>
>and use it instead of printk_trigger_flush() in
>nmi_trigger_cpumask_backtrace().

Yes, that sounds better to me as well!

>Well, I would postpone this patch after we finalize the patchset
>adding con->write_atomic() callback. This patch depends on it anyway
>via is_printk_legacy_deferred(). The patchset might also add
>other wrappers for flushing consoles and we have to choose some
>reasonable names. Or John could integrate this patch into the patchset.

Sure, I can resend one once the patchset is finalized!
Or would it be better if I just leave it to John so that it can be part of 
the patchset?

>Best Regards,
>Petr

Sincerely,
Ryo Takakura

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] Handle flushing of CPU backtraces during panic
  2024-08-13 13:45   ` Petr Mladek
  2024-08-15 10:43     ` takakura
@ 2024-08-19 19:23     ` John Ogness
  2024-08-21  5:02       ` takakura
  1 sibling, 1 reply; 11+ messages in thread
From: John Ogness @ 2024-08-19 19:23 UTC (permalink / raw)
  To: Petr Mladek, takakura
  Cc: rostedt, senozhatsky, akpm, bhe, lukas, wangkefeng.wang, ubizjak,
	feng.tang, j.granados, stephen.s.brennan, linux-kernel, nishimura,
	taka

On 2024-08-13, Petr Mladek <pmladek@suse.com> wrote:
> I would do something like:
>
> /**
>  * console_try_or_trigger_flush - try to flush consoles directly when
>  *	safe or the trigger deferred flush.
>  *
>  * Context: Any
>  */
> void console_try_or_trigger_flush(void)
> {
> 	if (!is_printk_legacy_deferred() && console_trylock())
> 		console_unlock();
> 	else
> 		defer_console_output();
> }
>
> and use it instead of printk_trigger_flush() in
> nmi_trigger_cpumask_backtrace().

Just to be clear, you are talking about removing printk_trigger_flush()
entirely and instead provide the new console_try_or_trigger_flush()?
Which then also involves updating the call sites:

lib/nmi_backtrace.c:nmi_trigger_cpumask_backtrace()
arch/powerpc/kernel/watchdog.c:watchdog_timer_interrupt()

> Well, I would postpone this patch after we finalize the patchset
> adding con->write_atomic() callback. This patch depends on it anyway
> via is_printk_legacy_deferred(). The patchset might also add
> other wrappers for flushing consoles and we have to choose some
> reasonable names.

I agree. Let's finish up the atomic series and then we can worry about
this.

John

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] Handle flushing of CPU backtraces during panic
  2024-08-19 19:23     ` John Ogness
@ 2024-08-21  5:02       ` takakura
  2024-08-26 15:46         ` John Ogness
  0 siblings, 1 reply; 11+ messages in thread
From: takakura @ 2024-08-21  5:02 UTC (permalink / raw)
  To: pmladek, john.ogness
  Cc: akpm, bhe, feng.tang, j.granados, linux-kernel, lukas, nishimura,
	rostedt, senozhatsky, stephen.s.brennan, taka, takakura, ubizjak,
	wangkefeng.wang

Hi Petr and John,

On 2024-08-19, John Ogness wrote:
>On 2024-08-13, Petr Mladek <pmladek@suse.com> wrote:
>> I would do something like:
>>
>> /**
>>  * console_try_or_trigger_flush - try to flush consoles directly when
>>  *   safe or the trigger deferred flush.
>>  *
>>  * Context: Any
>>  */
>> void console_try_or_trigger_flush(void)
>> {
>>      if (!is_printk_legacy_deferred() && console_trylock())
>>              console_unlock();
>>      else
>>              defer_console_output();
>> }
>>
>> and use it instead of printk_trigger_flush() in
>> nmi_trigger_cpumask_backtrace().
>
>Just to be clear, you are talking about removing printk_trigger_flush()
>entirely and instead provide the new console_try_or_trigger_flush()?
>Which then also involves updating the call sites:
>
>lib/nmi_backtrace.c:nmi_trigger_cpumask_backtrace()
>arch/powerpc/kernel/watchdog.c:watchdog_timer_interrupt()
>

Taking a look at [0], in addition to the mentioned call sites, 
nbcon_device_release() will also be calling printk_trigger_flush()?
For nbcon_device_release(), I thought its better not to be replaced as 
it calles for @legacy_off, in which case printk_trigger_flush() seems 
more suitable as it always defers printing.

Also taking a look at the [1], for nmi_trigger_cpumask_backtrace(), 
I thought that it will not comply with the syncing of 
legacy_allow_panic_sync. I believe it will allow flushing of legacy consoles 
before printk_legacy_allow_panic_sync() which is out of sync.

>> Well, I would postpone this patch after we finalize the patchset
>> adding con->write_atomic() callback. This patch depends on it anyway
>> via is_printk_legacy_deferred(). The patchset might also add
>> other wrappers for flushing consoles and we have to choose some
>> reasonable names.
>
>I agree. Let's finish up the atomic series and then we can worry about
>this.
>

Ok! I see that it can be better discussed after the atomic series.

>John

Sincerely,
Ryo Takakura

[0] https://lore.kernel.org/all/20240820063001.36405-31-john.ogness@linutronix.de/
[1] https://lore.kernel.org/all/20240820063001.36405-30-john.ogness@linutronix.de/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] Handle flushing of CPU backtraces during panic
  2024-08-21  5:02       ` takakura
@ 2024-08-26 15:46         ` John Ogness
  2024-08-31  8:20           ` takakura
  0 siblings, 1 reply; 11+ messages in thread
From: John Ogness @ 2024-08-26 15:46 UTC (permalink / raw)
  To: takakura, pmladek
  Cc: akpm, bhe, feng.tang, j.granados, linux-kernel, lukas, nishimura,
	rostedt, senozhatsky, stephen.s.brennan, taka, takakura, ubizjak,
	wangkefeng.wang

On 2024-08-21, takakura@valinux.co.jp wrote:
>>> /**
>>>  * console_try_or_trigger_flush - try to flush consoles directly when
>>>  *   safe or the trigger deferred flush.
>>>  *
>>>  * Context: Any
>>>  */
>>> void console_try_or_trigger_flush(void)
>>> {
>>>      if (!is_printk_legacy_deferred() && console_trylock())
>>>              console_unlock();
>>>      else
>>>              defer_console_output();
>>> }
>>>
>>> and use it instead of printk_trigger_flush() in
>>> nmi_trigger_cpumask_backtrace().
>>
>> Just to be clear, you are talking about removing
>> printk_trigger_flush() entirely and instead provide the new
>> console_try_or_trigger_flush()?  Which then also involves updating
>> the call sites:
>>
>> lib/nmi_backtrace.c:nmi_trigger_cpumask_backtrace()
>> arch/powerpc/kernel/watchdog.c:watchdog_timer_interrupt()
>>
>
> Taking a look at [0], in addition to the mentioned call sites, 
> nbcon_device_release() will also be calling printk_trigger_flush()?
> For nbcon_device_release(), I thought its better not to be replaced as 
> it calles for @legacy_off, in which case printk_trigger_flush() seems 
> more suitable as it always defers printing.

The @legacy_off logic would be within console_try_or_trigger_flush(),
so the result would be the same.

> Also taking a look at the [1], for nmi_trigger_cpumask_backtrace(), 
> I thought that it will not comply with the syncing of 
> legacy_allow_panic_sync. I believe it will allow flushing of legacy consoles 
> before printk_legacy_allow_panic_sync() which is out of sync.

But isn't your patch also causing that violation?

printk_legacy_allow_panic_sync() performs a trylock/unlock. Isn't that
good enough?

John

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] Handle flushing of CPU backtraces during panic
  2024-08-26 15:46         ` John Ogness
@ 2024-08-31  8:20           ` takakura
  0 siblings, 0 replies; 11+ messages in thread
From: takakura @ 2024-08-31  8:20 UTC (permalink / raw)
  To: john.ogness
  Cc: pmladek, akpm, bhe, feng.tang, j.granados, linux-kernel, lukas,
	nishimura, rostedt, senozhatsky, stephen.s.brennan, taka,
	takakura, ubizjak, wangkefeng.wang

Hi John,

Thanks for checking into what I pointed out!

On 2024-08-26, John Ogness wrote:
>On 2024-08-21, takakura@valinux.co.jp wrote:
>>>> /**
>>>>  * console_try_or_trigger_flush - try to flush consoles directly when
>>>>  *   safe or the trigger deferred flush.
>>>>  *
>>>>  * Context: Any
>>>>  */
>>>> void console_try_or_trigger_flush(void)
>>>> {
>>>>      if (!is_printk_legacy_deferred() && console_trylock())
>>>>              console_unlock();
>>>>      else
>>>>              defer_console_output();
>>>> }
>>>>
>>>> and use it instead of printk_trigger_flush() in
>>>> nmi_trigger_cpumask_backtrace().
>>>
>>> Just to be clear, you are talking about removing
>>> printk_trigger_flush() entirely and instead provide the new
>>> console_try_or_trigger_flush()?  Which then also involves updating
>>> the call sites:
>>>
>>> lib/nmi_backtrace.c:nmi_trigger_cpumask_backtrace()
>>> arch/powerpc/kernel/watchdog.c:watchdog_timer_interrupt()
>>>
>>
>> Taking a look at [0], in addition to the mentioned call sites, 
>> nbcon_device_release() will also be calling printk_trigger_flush()?
>> For nbcon_device_release(), I thought its better not to be replaced as 
>> it calles for @legacy_off, in which case printk_trigger_flush() seems 
>> more suitable as it always defers printing.
>
>The @legacy_off logic would be within console_try_or_trigger_flush(),
>so the result would be the same.

I see.

>> Also taking a look at the [1], for nmi_trigger_cpumask_backtrace(), 
>> I thought that it will not comply with the syncing of 
>> legacy_allow_panic_sync. I believe it will allow flushing of legacy consoles 
>> before printk_legacy_allow_panic_sync() which is out of sync.
>
>But isn't your patch also causing that violation?

Yes, the patch I sent would be causing the violation...
Sorry I should have asked this earlier before sending patch. The question came up
after sending the patch.

>printk_legacy_allow_panic_sync() performs a trylock/unlock. Isn't that
>good enough?

Also sorry for being unclear here. I also think that 
printk_legacy_allow_panic_sync()'s trylock/unlock is good enough.
My question was that if we were to call console_try_or_trigger_flush() in
panic, wouldn't we need to check @legacy_direct to avoid flushing 
of legacy consoles.

Something like:
/**
 * console_try_or_trigger_flush - try to flush consoles directly when
 *	safe or the trigger deferred flush.
 *
 * Context: Any
 */
void console_try_or_trigger_flush(void)
{
        struct console_flush_type ft;
        printk_get_console_flush_type(&ft);
        if (ft.legacy_direct) {
                if (console_trylock())
                        console_unlock();
        } else {
                defer_console_output();
        }
}

>John

Sincerely,
Ryo Takakura

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-08-31  8:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-12  7:21 [PATCH v3 0/2] printk: CPU backtrace not printing on panic takakura
2024-08-12  7:27 ` [PATCH v3 1/2] Allow cpu backtraces to be written into ringbuffer during panic takakura
2024-08-13 13:48   ` Petr Mladek
2024-08-15 10:41     ` takakura
2024-08-12  7:29 ` [PATCH v3 2/2] Handle flushing of CPU backtraces " takakura
2024-08-13 13:45   ` Petr Mladek
2024-08-15 10:43     ` takakura
2024-08-19 19:23     ` John Ogness
2024-08-21  5:02       ` takakura
2024-08-26 15:46         ` John Ogness
2024-08-31  8:20           ` takakura

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox