* [PATCH printk v2 0/2] Fix reported suspend failures
@ 2025-11-13 16:03 John Ogness
2025-11-13 16:03 ` [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types John Ogness
` (5 more replies)
0 siblings, 6 replies; 17+ messages in thread
From: John Ogness @ 2025-11-13 16:03 UTC (permalink / raw)
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Sherry Sun, Jacky Bai,
Jon Hunter, Thierry Reding, Derek Barbosa, linux-kernel, stable
This is v2 of a series to address multiple reports [0][1]
(+ 2 offlist) of suspend failing when NBCON console drivers are
in use. With the help of NXP and NVIDIA we were able to isolate
the problem and verify the fix.
v1 is here [2].
The first NBCON drivers appeared in 6.13, so currently there is
no LTS kernel that requires this series. But it should go into
6.17.x and 6.18.
The changes since v1:
- For printk_trigger_flush() add support for all flush types
that are available. This will prevent printk_trigger_flush()
from trying to inappropriately queue irq_work after this
series is applied.
- Add WARN_ON_ONCE() to the printk irq_work queueing functions
in case they are called when irq_work is blocked. There
should never be (and currently are no) such callers, but
these functions are externally available.
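For readers unfamiliar with the guard pattern mentioned in the second
item, here is a minimal userspace sketch of it. It is not the kernel
code: warn_on_once(), model_queue_irq_work() and the counters are
hypothetical stand-ins, and irqwork_blocked only plays the role of the
flag added in patch 2. The point is that a call made while queueing is
blocked is refused every time, but reported only once.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the console_irqwork_blocked flag. */
static bool irqwork_blocked;

/* Counters used only to observe the behaviour of this model. */
static int warnings_emitted;
static int work_queued;

/*
 * Userspace approximation of a warn-once check: report the first
 * hit only, but always return the condition so the caller can bail.
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_emitted++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Models an externally callable function that would queue irq_work. */
static void model_queue_irq_work(void)
{
	static bool warned;

	if (warn_on_once(irqwork_blocked, &warned,
			 "irq_work queued while blocked"))
		return;

	work_queued++;
}
```

Calling model_queue_irq_work() twice with irqwork_blocked set leaves
work_queued unchanged and bumps warnings_emitted only once.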
John Ogness
[0] https://lore.kernel.org/lkml/80b020fc-c18a-4da4-b222-16da1cab2f4c@nvidia.com
[1] https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com
[2] https://lore.kernel.org/lkml/20251111144328.887159-1-john.ogness@linutronix.de
John Ogness (2):
printk: Allow printk_trigger_flush() to flush all types
printk: Avoid scheduling irq_work on suspend
kernel/printk/internal.h | 8 ++--
kernel/printk/nbcon.c | 9 ++++-
kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++--------
3 files changed, 78 insertions(+), 20 deletions(-)
base-commit: e9a6fb0bcdd7609be6969112f3fbfcce3b1d4a7c
--
2.47.3
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types
2025-11-13 16:03 [PATCH printk v2 0/2] Fix reported suspend failures John Ogness
@ 2025-11-13 16:03 ` John Ogness
2025-11-13 16:20 ` kernel test robot
2025-11-14 13:42 ` Petr Mladek
2025-11-13 16:03 ` [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend John Ogness
` (4 subsequent siblings)
5 siblings, 2 replies; 17+ messages in thread
From: John Ogness @ 2025-11-13 16:03 UTC (permalink / raw)
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Sherry Sun, Jacky Bai,
Jon Hunter, Thierry Reding, Derek Barbosa, linux-kernel, stable
Currently printk_trigger_flush() only triggers legacy offloaded
flushing, even if that may not be the appropriate method to flush
for currently registered consoles. (The function predates the
NBCON consoles.)
Since commit 6690d6b52726 ("printk: Add helper for flush type
logic") there is printk_get_console_flush_type(), which also
considers NBCON consoles and reports all the methods of flushing
appropriate based on the system state and consoles available.
Update printk_trigger_flush() to use
printk_get_console_flush_type() to appropriately flush registered
consoles.
Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
kernel/printk/nbcon.c | 2 +-
kernel/printk/printk.c | 23 ++++++++++++++++++++++-
2 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
index 558ef31779760..73f315fd97a3e 100644
--- a/kernel/printk/nbcon.c
+++ b/kernel/printk/nbcon.c
@@ -1849,7 +1849,7 @@ void nbcon_device_release(struct console *con)
if (console_trylock())
console_unlock();
} else if (ft.legacy_offload) {
- printk_trigger_flush();
+ defer_console_output();
}
}
console_srcu_read_unlock(cookie);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 5aee9ffb16b9a..dc89239cf1b58 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -4567,9 +4567,30 @@ void defer_console_output(void)
__wake_up_klogd(PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT);
}
+/**
+ * printk_trigger_flush - Attempt to flush printk buffer to consoles.
+ *
+ * If possible, flush the printk buffer to all consoles in the caller's
+ * context. If offloading is available, trigger deferred printing.
+ *
+ * This is best effort. Depending on the system state, console states,
+ * and caller context, no actual flushing may result from this call.
+ */
void printk_trigger_flush(void)
{
- defer_console_output();
+ struct console_flush_type ft;
+
+ printk_get_console_flush_type(&ft);
+ if (ft.nbcon_atomic)
+ nbcon_atomic_flush_pending();
+ if (ft.nbcon_offload)
+ nbcon_kthreads_wake();
+ if (ft.legacy_direct) {
+ if (console_trylock())
+ console_unlock();
+ }
+ if (ft.legacy_offload)
+ defer_console_output();
}
int vprintk_deferred(const char *fmt, va_list args)
--
2.47.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend
2025-11-13 16:03 [PATCH printk v2 0/2] Fix reported suspend failures John Ogness
2025-11-13 16:03 ` [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types John Ogness
@ 2025-11-13 16:03 ` John Ogness
2025-11-13 16:38 ` Derek Barbosa
2025-11-14 14:55 ` Petr Mladek
2025-11-14 14:57 ` [PATCH printk v2 0/2] Fix reported suspend failures Petr Mladek
` (3 subsequent siblings)
5 siblings, 2 replies; 17+ messages in thread
From: John Ogness @ 2025-11-13 16:03 UTC (permalink / raw)
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Sherry Sun, Jacky Bai,
Jon Hunter, Thierry Reding, Derek Barbosa, linux-kernel, stable
Allowing irq_work to be scheduled while trying to suspend has been shown
to cause problems as some architectures interpret the pending
interrupts as a reason to not suspend. This became a problem for
printk() with the introduction of NBCON consoles. With every
printk() call, NBCON console printing kthreads are woken by queueing
irq_work. This means that irq_work continues to be queued due to
printk() calls late in the suspend procedure.
Avoid this problem by preventing printk() from queueing irq_work
once console suspending has begun. This applies to triggering NBCON
and legacy deferred printing as well as klogd waiters.
Since triggering of NBCON threaded printing relies on irq_work, the
pr_flush() within console_suspend_all() is used to perform the final
flushing before suspending consoles and blocking irq_work queueing.
NBCON consoles that are not suspended (due to the usage of the
"no_console_suspend" boot argument) transition to atomic flushing.
Introduce a new global variable @console_irqwork_blocked to flag
when irq_work queueing is to be avoided. The flag is used by
printk_get_console_flush_type() to avoid allowing deferred printing
and switch NBCON consoles to atomic flushing. It is also used by
vprintk_emit() to avoid klogd waking.
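To illustrate the effect of the flag on flush type selection, here is
a minimal userspace sketch, assuming only the NBCON_PRIO_NORMAL branch
shown in the internal.h hunk below. struct model_state and
model_get_flush_type() are hypothetical names for this model; the
kernel reads global variables and calls is_printk_legacy_deferred()
instead of taking a state argument.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical bundle of the state the kernel keeps in globals. */
struct model_state {
	bool have_nbcon_console;
	bool have_legacy_console;
	bool have_boot_console;
	bool printk_kthreads_running;
	bool legacy_deferred;		/* models is_printk_legacy_deferred() */
	bool console_irqwork_blocked;
};

/* Same field names as the struct in kernel/printk/internal.h. */
struct console_flush_type {
	bool nbcon_atomic;
	bool nbcon_offload;
	bool legacy_direct;
	bool legacy_offload;
};

/* Models the NBCON_PRIO_NORMAL case only. */
static void model_get_flush_type(const struct model_state *s,
				 struct console_flush_type *ft)
{
	memset(ft, 0, sizeof(*ft));

	if (s->have_nbcon_console && !s->have_boot_console) {
		/* Waking the kthreads needs irq_work; fall back to atomic. */
		if (s->printk_kthreads_running && !s->console_irqwork_blocked)
			ft->nbcon_offload = true;
		else
			ft->nbcon_atomic = true;
	}

	if (s->have_legacy_console || s->have_boot_console) {
		if (!s->legacy_deferred)
			ft->legacy_direct = true;
		else if (!s->console_irqwork_blocked)
			ft->legacy_offload = true;	/* needs irq_work */
	}
}
```

In this model, with the flag set, an NBCON console reports
nbcon_atomic rather than nbcon_offload, and a legacy console in a
deferred context reports no flush method at all. Its backlog stays
pending until the flag is cleared again.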
Add WARN_ON_ONCE(console_irqwork_blocked) to the irq_work queuing
functions to catch any code that attempts to queue printk irq_work
during the suspending/resuming procedure.
Cc: <stable@vger.kernel.org> # 6.13.x because no drivers in 6.12.x
Fixes: 6b93bb41f6ea ("printk: Add non-BKL (nbcon) console basic infrastructure")
Closes: https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com
Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
@sherry.sun: This patch is essentially the same as v1, but since
two WARN_ON_ONCE() were added, I decided not to use your
Tested-by. It would be great if you could test again with this
series.
kernel/printk/internal.h | 8 +++---
kernel/printk/nbcon.c | 7 +++++
kernel/printk/printk.c | 58 +++++++++++++++++++++++++++++-----------
3 files changed, 55 insertions(+), 18 deletions(-)
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index f72bbfa266d6c..b20929b7d71f5 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -230,6 +230,8 @@ struct console_flush_type {
bool legacy_offload;
};
+extern bool console_irqwork_blocked;
+
/*
* Identify which console flushing methods should be used in the context of
* the caller.
@@ -241,7 +243,7 @@ static inline void printk_get_console_flush_type(struct console_flush_type *ft)
switch (nbcon_get_default_prio()) {
case NBCON_PRIO_NORMAL:
if (have_nbcon_console && !have_boot_console) {
- if (printk_kthreads_running)
+ if (printk_kthreads_running && !console_irqwork_blocked)
ft->nbcon_offload = true;
else
ft->nbcon_atomic = true;
@@ -251,7 +253,7 @@ static inline void printk_get_console_flush_type(struct console_flush_type *ft)
if (have_legacy_console || have_boot_console) {
if (!is_printk_legacy_deferred())
ft->legacy_direct = true;
- else
+ else if (!console_irqwork_blocked)
ft->legacy_offload = true;
}
break;
@@ -264,7 +266,7 @@ static inline void printk_get_console_flush_type(struct console_flush_type *ft)
if (have_legacy_console || have_boot_console) {
if (!is_printk_legacy_deferred())
ft->legacy_direct = true;
- else
+ else if (!console_irqwork_blocked)
ft->legacy_offload = true;
}
break;
diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
index 73f315fd97a3e..730d14f6cbc58 100644
--- a/kernel/printk/nbcon.c
+++ b/kernel/printk/nbcon.c
@@ -1276,6 +1276,13 @@ void nbcon_kthreads_wake(void)
if (!printk_kthreads_running)
return;
+ /*
+ * It is not allowed to call this function when console irq_work
+ * is blocked.
+ */
+ if (WARN_ON_ONCE(console_irqwork_blocked))
+ return;
+
cookie = console_srcu_read_lock();
for_each_console_srcu(con) {
if (!(console_srcu_read_flags(con) & CON_NBCON))
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index dc89239cf1b58..b1c0d35cf3caa 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -462,6 +462,9 @@ bool have_boot_console;
/* See printk_legacy_allow_panic_sync() for details. */
bool legacy_allow_panic_sync;
+/* Avoid using irq_work when suspending. */
+bool console_irqwork_blocked;
+
#ifdef CONFIG_PRINTK
DECLARE_WAIT_QUEUE_HEAD(log_wait);
static DECLARE_WAIT_QUEUE_HEAD(legacy_wait);
@@ -2426,7 +2429,7 @@ asmlinkage int vprintk_emit(int facility, int level,
if (ft.legacy_offload)
defer_console_output();
- else
+ else if (!console_irqwork_blocked)
wake_up_klogd();
return printed_len;
@@ -2730,10 +2733,20 @@ void console_suspend_all(void)
{
struct console *con;
+ if (console_suspend_enabled)
+ pr_info("Suspending console(s) (use no_console_suspend to debug)\n");
+
+ /*
+ * Flush any console backlog and then avoid queueing irq_work until
+ * console_resume_all(). Until then deferred printing is no longer
+ * triggered, NBCON consoles transition to atomic flushing, and
+ * any klogd waiters are not triggered.
+ */
+ pr_flush(1000, true);
+ console_irqwork_blocked = true;
+
if (!console_suspend_enabled)
return;
- pr_info("Suspending console(s) (use no_console_suspend to debug)\n");
- pr_flush(1000, true);
console_list_lock();
for_each_console(con)
@@ -2754,26 +2767,34 @@ void console_resume_all(void)
struct console_flush_type ft;
struct console *con;
- if (!console_suspend_enabled)
- return;
-
- console_list_lock();
- for_each_console(con)
- console_srcu_write_flags(con, con->flags & ~CON_SUSPENDED);
- console_list_unlock();
-
/*
- * Ensure that all SRCU list walks have completed. All printing
- * contexts must be able to see they are no longer suspended so
- * that they are guaranteed to wake up and resume printing.
+ * Allow queueing irq_work. After restoring console state, deferred
+ * printing and any klogd waiters need to be triggered in case there
+ * is now a console backlog.
*/
- synchronize_srcu(&console_srcu);
+ console_irqwork_blocked = false;
+
+ if (console_suspend_enabled) {
+ console_list_lock();
+ for_each_console(con)
+ console_srcu_write_flags(con, con->flags & ~CON_SUSPENDED);
+ console_list_unlock();
+
+ /*
+ * Ensure that all SRCU list walks have completed. All printing
+ * contexts must be able to see they are no longer suspended so
+ * that they are guaranteed to wake up and resume printing.
+ */
+ synchronize_srcu(&console_srcu);
+ }
printk_get_console_flush_type(&ft);
if (ft.nbcon_offload)
nbcon_kthreads_wake();
if (ft.legacy_offload)
defer_console_output();
+ else
+ wake_up_klogd();
pr_flush(1000, true);
}
@@ -4511,6 +4532,13 @@ static void __wake_up_klogd(int val)
if (!printk_percpu_data_ready())
return;
+ /*
+ * It is not allowed to call this function when console irq_work
+ * is blocked.
+ */
+ if (WARN_ON_ONCE(console_irqwork_blocked))
+ return;
+
preempt_disable();
/*
* Guarantee any new records can be seen by tasks preparing to wait
--
2.47.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types
2025-11-13 16:03 ` [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types John Ogness
@ 2025-11-13 16:20 ` kernel test robot
2025-11-14 13:42 ` Petr Mladek
1 sibling, 0 replies; 17+ messages in thread
From: kernel test robot @ 2025-11-13 16:20 UTC (permalink / raw)
To: John Ogness; +Cc: stable, oe-kbuild-all
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#option-1
Rule: add the tag "Cc: stable@vger.kernel.org" in the sign-off area to have the patch automatically included in the stable tree.
Subject: [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types
Link: https://lore.kernel.org/stable/20251113160351.113031-2-john.ogness%40linutronix.de
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend
2025-11-13 16:03 ` [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend John Ogness
@ 2025-11-13 16:38 ` Derek Barbosa
2025-11-13 17:06 ` John Ogness
2025-11-14 14:55 ` Petr Mladek
1 sibling, 1 reply; 17+ messages in thread
From: Derek Barbosa @ 2025-11-13 16:38 UTC (permalink / raw)
To: John Ogness
Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Sherry Sun,
Jacky Bai, Jon Hunter, Thierry Reding, linux-kernel, stable
On Thu, Nov 13, 2025 at 05:09:48PM +0106, John Ogness wrote:
> ---
> @sherry.sun: This patch is essentially the same as v1, but since
> two WARN_ON_ONCE() were added, I decided not to use your
> Tested-by. It would be great if you could test again with this
> series.
>
> kernel/printk/internal.h | 8 +++---
> kernel/printk/nbcon.c | 7 +++++
> kernel/printk/printk.c | 58 +++++++++++++++++++++++++++++-----------
> 3 files changed, 55 insertions(+), 18 deletions(-)
>
> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> index 73f315fd97a3e..730d14f6cbc58 100644
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1276,6 +1276,13 @@ void nbcon_kthreads_wake(void)
> if (!printk_kthreads_running)
> return;
>
> + /*
> + * It is not allowed to call this function when console irq_work
> + * is blocked.
> + */
> + if (WARN_ON_ONCE(console_irqwork_blocked))
> + return;
> +
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index dc89239cf1b58..b1c0d35cf3caa 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -462,6 +462,9 @@ bool have_boot_console;
> /* See printk_legacy_allow_panic_sync() for details. */
> bool legacy_allow_panic_sync;
>
> +/* Avoid using irq_work when suspending. */
> +bool console_irqwork_blocked;
> +
> #ifdef CONFIG_PRINTK
> DECLARE_WAIT_QUEUE_HEAD(log_wait);
> static DECLARE_WAIT_QUEUE_HEAD(legacy_wait);
> @@ -2426,7 +2429,7 @@ asmlinkage int vprintk_emit(int facility, int level,
>
> if (ft.legacy_offload)
> defer_console_output();
> - else
> + else if (!console_irqwork_blocked)
> wake_up_klogd();
>
> return printed_len;
> @@ -2730,10 +2733,20 @@ void console_suspend_all(void)
> {
> struct console *con;
>
> + if (console_suspend_enabled)
> + pr_info("Suspending console(s) (use no_console_suspend to debug)\n");
> +
> + /*
> + * Flush any console backlog and then avoid queueing irq_work until
> + * console_resume_all(). Until then deferred printing is no longer
> + * triggered, NBCON consoles transition to atomic flushing, and
> + * any klogd waiters are not triggered.
> + */
> + pr_flush(1000, true);
> + console_irqwork_blocked = true;
> +
Thanks for this. I have recently been seeing the same issue with a large-CPU
workstation system in which the serial console has been locking up on entry/exit
of the S4 Hibernation sleep state at different intervals.
I am still running tests on the V1 of the series to determine reproducibility,
but I will try to get this version tested in a timely manner as well.
I did, however, test the proto-patch at [0]. The original issue was reproducible
with this patch applied. Avoiding klogd waking in vprintk_emit() and the
addition of the check in nbcon.c (new in this series) as opposed to aborting
callers outright seems more airtight.
[0] https://github.com/Linutronix/linux/commit/ae173249d9028ef159fba040bdab260d80dda43f
--
Derek <debarbos@redhat.com>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend
2025-11-13 16:38 ` Derek Barbosa
@ 2025-11-13 17:06 ` John Ogness
2025-11-13 19:15 ` Derek Barbosa
0 siblings, 1 reply; 17+ messages in thread
From: John Ogness @ 2025-11-13 17:06 UTC (permalink / raw)
To: debarbos
Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Sherry Sun,
Jacky Bai, Jon Hunter, Thierry Reding, linux-kernel, stable
Hi Derek,
On 2025-11-13, Derek Barbosa <debarbos@redhat.com> wrote:
> Thanks for this. I have recently been seeing the same issue with a large-CPU
> workstation system in which the serial console has been locking up on entry/exit
> of the S4 Hibernation sleep state at different intervals.
>
> I am still running tests on the V1 of the series to determine reproducibility,
> but I will try to get this version tested in a timely manner as well.
>
> I did, however, test the proto-patch at [0]. The original issue was reproducible
> with this patch applied. Avoiding klogd waking in vprintk_emit() and the
> addition of the check in nbcon.c (new in this series) as opposed to aborting
> callers outright seems more airtight.
I assume the problem you are seeing is with the PREEMPT_RT patches
applied (i.e. with the 8250-NBCON included). If that is the case, note
that recent versions of the 8250 driver introduce its own irq_work that
is also problematic. I am currently reworking the 8250-NBCON series so
that it does not introduce irq_work.
Since you probably are not doing anything related to modem control,
maybe you could test with the following hack (assuming you are using a
v6.14 or later PREEMPT_RT patched kernel).
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index 96d32db9f8872..2ad0f91ad467a 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -3459,7 +3459,7 @@ void serial8250_console_write(struct uart_8250_port *up,
* may be a context that does not permit waking up tasks.
*/
if (is_atomic)
- irq_work_queue(&up->modem_status_work);
+ ;//irq_work_queue(&up->modem_status_work);
else
serial8250_modem_status(up);
}
> [0] https://github.com/Linutronix/linux/commit/ae173249d9028ef159fba040bdab260d80dda43f
John
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend
2025-11-13 17:06 ` John Ogness
@ 2025-11-13 19:15 ` Derek Barbosa
2025-11-25 19:24 ` Derek Barbosa
0 siblings, 1 reply; 17+ messages in thread
From: Derek Barbosa @ 2025-11-13 19:15 UTC (permalink / raw)
To: John Ogness
Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Sherry Sun,
Jacky Bai, Jon Hunter, Thierry Reding, linux-kernel, stable
Hi John,
On Thu, Nov 13, 2025 at 06:12:57PM +0106, John Ogness wrote:
>
> I assume the problem you are seeing is with the PREEMPT_RT patches
> applied (i.e. with the 8250-NBCON included). If that is the case, note
> that recent versions of the 8250 driver introduce its own irq_work that
> is also problematic. I am currently reworking the 8250-NBCON series so
> that it does not introduce irq_work.
>
IIRC the aforementioned scenario was just recently tested with an rc5 kernel
from Torvalds' tree. Sorry for any confusion
> Since you probably are not doing anything related to modem control,
> maybe you could test with the following hack (assuming you are using a
> v6.14 or later PREEMPT_RT patched kernel).
I'll give this a shot as a follow up, thanks for the suggestion
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index 96d32db9f8872..2ad0f91ad467a 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -3459,7 +3459,7 @@ void serial8250_console_write(struct uart_8250_port *up,
> * may be a context that does not permit waking up tasks.
> */
> if (is_atomic)
> - irq_work_queue(&up->modem_status_work);
> + ;//irq_work_queue(&up->modem_status_work);
> else
> serial8250_modem_status(up);
> }
>
> > [0] https://github.com/Linutronix/linux/commit/ae173249d9028ef159fba040bdab260d80dda43f
>
> John
>
--
Derek <debarbos@redhat.com>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types
2025-11-13 16:03 ` [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types John Ogness
2025-11-13 16:20 ` kernel test robot
@ 2025-11-14 13:42 ` Petr Mladek
1 sibling, 0 replies; 17+ messages in thread
From: Petr Mladek @ 2025-11-14 13:42 UTC (permalink / raw)
To: John Ogness
Cc: Sergey Senozhatsky, Steven Rostedt, Sherry Sun, Jacky Bai,
Jon Hunter, Thierry Reding, Derek Barbosa, linux-kernel, stable
On Thu 2025-11-13 17:09:47, John Ogness wrote:
> Currently printk_trigger_flush() only triggers legacy offloaded
> flushing, even if that may not be the appropriate method to flush
> for currently registered consoles. (The function predates the
> NBCON consoles.)
>
> Since commit 6690d6b52726 ("printk: Add helper for flush type
> logic") there is printk_get_console_flush_type(), which also
> considers NBCON consoles and reports all the methods of flushing
> appropriate based on the system state and consoles available.
>
> Update printk_trigger_flush() to use
> printk_get_console_flush_type() to appropriately flush registered
> consoles.
>
> Suggested-by: Petr Mladek <pmladek@suse.com>
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
Looks good to me:
Reviewed-by: Petr Mladek <pmladek@suse.com>
Best Regards,
Petr
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend
2025-11-13 16:03 ` [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend John Ogness
2025-11-13 16:38 ` Derek Barbosa
@ 2025-11-14 14:55 ` Petr Mladek
1 sibling, 0 replies; 17+ messages in thread
From: Petr Mladek @ 2025-11-14 14:55 UTC (permalink / raw)
To: John Ogness
Cc: Sergey Senozhatsky, Steven Rostedt, Sherry Sun, Jacky Bai,
Jon Hunter, Thierry Reding, Derek Barbosa, linux-kernel, stable
On Thu 2025-11-13 17:09:48, John Ogness wrote:
> Allowing irq_work to be scheduled while trying to suspend has been shown
> to cause problems as some architectures interpret the pending
> interrupts as a reason to not suspend. This became a problem for
> printk() with the introduction of NBCON consoles. With every
> printk() call, NBCON console printing kthreads are woken by queueing
> irq_work. This means that irq_work continues to be queued due to
> printk() calls late in the suspend procedure.
>
> Avoid this problem by preventing printk() from queueing irq_work
> once console suspending has begun. This applies to triggering NBCON
> and legacy deferred printing as well as klogd waiters.
>
> Since triggering of NBCON threaded printing relies on irq_work, the
> pr_flush() within console_suspend_all() is used to perform the final
> flushing before suspending consoles and blocking irq_work queueing.
> NBCON consoles that are not suspended (due to the usage of the
> "no_console_suspend" boot argument) transition to atomic flushing.
>
> Introduce a new global variable @console_irqwork_blocked to flag
> when irq_work queueing is to be avoided. The flag is used by
> printk_get_console_flush_type() to avoid allowing deferred printing
> and switch NBCON consoles to atomic flushing. It is also used by
> vprintk_emit() to avoid klogd waking.
>
> Add WARN_ON_ONCE(console_irqwork_blocked) to the irq_work queuing
> functions to catch any code that attempts to queue printk irq_work
> during the suspending/resuming procedure.
>
> Cc: <stable@vger.kernel.org> # 6.13.x because no drivers in 6.12.x
> Fixes: 6b93bb41f6ea ("printk: Add non-BKL (nbcon) console basic infrastructure")
> Closes: https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
The changes look good to me:
Reviewed-by: Petr Mladek <pmladek@suse.com>
Best Regards,
Petr
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 0/2] Fix reported suspend failures
2025-11-13 16:03 [PATCH printk v2 0/2] Fix reported suspend failures John Ogness
2025-11-13 16:03 ` [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types John Ogness
2025-11-13 16:03 ` [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend John Ogness
@ 2025-11-14 14:57 ` Petr Mladek
2025-11-16 12:14 ` Sherry Sun
` (2 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Petr Mladek @ 2025-11-14 14:57 UTC (permalink / raw)
To: John Ogness
Cc: Sergey Senozhatsky, Steven Rostedt, Sherry Sun, Jacky Bai,
Jon Hunter, Thierry Reding, Derek Barbosa, linux-kernel, stable
On Thu 2025-11-13 17:09:46, John Ogness wrote:
> This is v2 of a series to address multiple reports [0][1]
> (+ 2 offlist) of suspend failing when NBCON console drivers are
> in use. With the help of NXP and NVIDIA we were able to isolate
> the problem and verify the fix.
>
> v1 is here [2].
>
> The first NBCON drivers appeared in 6.13, so currently there is
> no LTS kernel that requires this series. But it should go into
> 6.17.x and 6.18.
>
> The changes since v1:
>
> - For printk_trigger_flush() add support for all flush types
> that are available. This will prevent printk_trigger_flush()
> from trying to inappropriately queue irq_work after this
> series is applied.
>
> - Add WARN_ON_ONCE() to the printk irq_work queueing functions
> in case they are called when irq_work is blocked. There
> should never be (and currently are no) such callers, but
> these functions are externally available.
>
> John Ogness
>
> [0] https://lore.kernel.org/lkml/80b020fc-c18a-4da4-b222-16da1cab2f4c@nvidia.com
> [1] https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com
> [2] https://lore.kernel.org/lkml/20251111144328.887159-1-john.ogness@linutronix.de
>
> John Ogness (2):
> printk: Allow printk_trigger_flush() to flush all types
> printk: Avoid scheduling irq_work on suspend
>
> kernel/printk/internal.h | 8 ++--
> kernel/printk/nbcon.c | 9 ++++-
> kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++--------
> 3 files changed, 78 insertions(+), 20 deletions(-)
The patchset seems to be ready for linux-next from my POV. I am going
to wait a few more days for potential feedback. I'll push it later the
following week unless anyone complains.
Best Regards,
Petr
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH printk v2 0/2] Fix reported suspend failures
2025-11-13 16:03 [PATCH printk v2 0/2] Fix reported suspend failures John Ogness
` (2 preceding siblings ...)
2025-11-14 14:57 ` [PATCH printk v2 0/2] Fix reported suspend failures Petr Mladek
@ 2025-11-16 12:14 ` Sherry Sun
2025-11-19 15:30 ` Petr Mladek
2025-11-20 13:33 ` Thierry Reding
5 siblings, 0 replies; 17+ messages in thread
From: Sherry Sun @ 2025-11-16 12:14 UTC (permalink / raw)
To: John Ogness, Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Jacky Bai, Jon Hunter,
Thierry Reding, Derek Barbosa, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
> -----Original Message-----
> From: John Ogness <john.ogness@linutronix.de>
> Sent: Friday, November 14, 2025 12:04 AM
> To: Petr Mladek <pmladek@suse.com>
> Cc: Sergey Senozhatsky <senozhatsky@chromium.org>; Steven Rostedt
> <rostedt@goodmis.org>; Sherry Sun <sherry.sun@nxp.com>; Jacky Bai
> <ping.bai@nxp.com>; Jon Hunter <jonathanh@nvidia.com>; Thierry Reding
> <thierry.reding@gmail.com>; Derek Barbosa <debarbos@redhat.com>; linux-
> kernel@vger.kernel.org; stable@vger.kernel.org
> Subject: [PATCH printk v2 0/2] Fix reported suspend failures
>
> This is v2 of a series to address multiple reports [0][1] (+ 2 offlist) of suspend
> failing when NBCON console drivers are in use. With the help of NXP and
> NVIDIA we were able to isolate the problem and verify the fix.
>
> v1 is here [2].
>
> The first NBCON drivers appeared in 6.13, so currently there is no LTS kernel
> that requires this series. But it should go into 6.17.x and 6.18.
>
> The changes since v1:
>
> - For printk_trigger_flush() add support for all flush types
> that are available. This will prevent printk_trigger_flush()
> from trying to inappropriately queue irq_work after this
> series is applied.
>
> - Add WARN_ON_ONCE() to the printk irq_work queueing functions
> in case they are called when irq_work is blocked. There
> should never be (and currently are no) such callers, but
> these functions are externally available.
>
For this V2 patch set, I have verified it works on i.MX platforms, thanks for the fix.
Tested-by: Sherry Sun <sherry.sun@nxp.com>
Best Regards
Sherry
> John Ogness
>
> [0] https://lore.kernel.org/lkml/80b020fc-c18a-4da4-b222-16da1cab2f4c@nvidia.com
> [1] https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com
> [2] https://lore.kernel.org/lkml/20251111144328.887159-1-john.ogness@linutronix.de
>
> John Ogness (2):
> printk: Allow printk_trigger_flush() to flush all types
> printk: Avoid scheduling irq_work on suspend
>
> kernel/printk/internal.h | 8 ++--
> kernel/printk/nbcon.c | 9 ++++-
> kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++--------
> 3 files changed, 78 insertions(+), 20 deletions(-)
>
>
> base-commit: e9a6fb0bcdd7609be6969112f3fbfcce3b1d4a7c
> --
> 2.47.3
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 0/2] Fix reported suspend failures
2025-11-13 16:03 [PATCH printk v2 0/2] Fix reported suspend failures John Ogness
` (3 preceding siblings ...)
2025-11-16 12:14 ` Sherry Sun
@ 2025-11-19 15:30 ` Petr Mladek
2025-11-20 11:03 ` John Ogness
2025-11-20 13:33 ` Thierry Reding
5 siblings, 1 reply; 17+ messages in thread
From: Petr Mladek @ 2025-11-19 15:30 UTC (permalink / raw)
To: John Ogness
Cc: Sergey Senozhatsky, Steven Rostedt, Sherry Sun, Jacky Bai,
Jon Hunter, Thierry Reding, Derek Barbosa, linux-kernel, stable
On Thu 2025-11-13 17:09:46, John Ogness wrote:
> This is v2 of a series to address multiple reports [0][1]
> (+ 2 offlist) of suspend failing when NBCON console drivers are
> in use. With the help of NXP and NVIDIA we were able to isolate
> the problem and verify the fix.
>
> The first NBCON drivers appeared in 6.13, so currently there is
> no LTS kernel that requires this series. But it should go into
> 6.17.x and 6.18.
>
> John Ogness (2):
> printk: Allow printk_trigger_flush() to flush all types
> printk: Avoid scheduling irq_work on suspend
>
> kernel/printk/internal.h | 8 ++--
> kernel/printk/nbcon.c | 9 ++++-
> kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++--------
> 3 files changed, 78 insertions(+), 20 deletions(-)
JFYI, the patchset has been committed into printk/linux.git,
branch rework/suspend-fixes.
Best Regards,
Petr
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 0/2] Fix reported suspend failures
2025-11-19 15:30 ` Petr Mladek
@ 2025-11-20 11:03 ` John Ogness
2025-11-21 9:55 ` Petr Mladek
0 siblings, 1 reply; 17+ messages in thread
From: John Ogness @ 2025-11-20 11:03 UTC (permalink / raw)
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Sherry Sun, Jacky Bai,
Jon Hunter, Thierry Reding, Derek Barbosa, linux-kernel, stable
Hi Petr,
On 2025-11-19, Petr Mladek <pmladek@suse.com> wrote:
> JFYI, the patchset has been committed into printk/linux.git,
> branch rework/suspend-fixes.
While doing more testing I hit the new WARN_ON_ONCE() in
__wake_up_klogd():
[ 125.306075][ T92] Timekeeping suspended for 9.749 seconds
[ 125.306093][ T92] ------------[ cut here ]------------
[ 125.306108][ T92] WARNING: CPU: 0 PID: 92 at kernel/printk/printk.c:4539 vprintk_emit+0x134/0x2e8
[ 125.306151][ T92] Modules linked in: pm33xx ti_emif_sram wkup_m3_ipc wkup_m3_rproc omap_mailbox rtc_omap
[ 125.306249][ T92] CPU: 0 UID: 0 PID: 92 Comm: rtcwake Not tainted 6.18.0-rc5-00005-g3d7d27fc1b14 #162 PREEMPT
[ 125.306276][ T92] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 125.306290][ T92] Call trace:
[ 125.306308][ T92] unwind_backtrace from show_stack+0x18/0x1c
[ 125.306356][ T92] show_stack from dump_stack_lvl+0x50/0x64
[ 125.306398][ T92] dump_stack_lvl from __warn+0x7c/0x160
[ 125.306433][ T92] __warn from warn_slowpath_fmt+0x158/0x1f0
[ 125.306459][ T92] warn_slowpath_fmt from vprintk_emit+0x134/0x2e8
[ 125.306487][ T92] vprintk_emit from _printk_deferred+0x44/0x84
[ 125.306520][ T92] _printk_deferred from tk_debug_account_sleep_time+0x78/0x88
[ 125.306574][ T92] tk_debug_account_sleep_time from timekeeping_inject_sleeptime64+0x3c/0x6c
[ 125.306624][ T92] timekeeping_inject_sleeptime64 from rtc_resume.part.0+0x158/0x178
[ 125.306666][ T92] rtc_resume.part.0 from rtc_resume+0x54/0x64
[ 125.306705][ T92] rtc_resume from dpm_run_callback+0x68/0x1d4
[ 125.306747][ T92] dpm_run_callback from device_resume+0xc8/0x200
[ 125.306779][ T92] device_resume from dpm_resume+0x208/0x304
[ 125.306813][ T92] dpm_resume from dpm_resume_end+0x14/0x24
[ 125.306846][ T92] dpm_resume_end from suspend_devices_and_enter+0x1e8/0x8a4
[ 125.306892][ T92] suspend_devices_and_enter from pm_suspend+0x328/0x3c0
[ 125.306924][ T92] pm_suspend from state_store+0x70/0xd0
[ 125.306955][ T92] state_store from kernfs_fop_write_iter+0x124/0x1e4
[ 125.307001][ T92] kernfs_fop_write_iter from vfs_write+0x1f0/0x2bc
[ 125.307049][ T92] vfs_write from ksys_write+0x68/0xe8
[ 125.307085][ T92] ksys_write from ret_fast_syscall+0x0/0x58
[ 125.307113][ T92] Exception stack(0xd025dfa8 to 0xd025dff0)
[ 125.307137][ T92] dfa0: 00000004 bed09f71 00000004 bed09f71 00000003 00000001
[ 125.307157][ T92] dfc0: 00000004 bed09f71 00000003 00000004 00510bd4 00000000 00000000 0050e634
[ 125.307172][ T92] dfe0: 00000004 bed09bd8 b6ebc20b b6e35616
[ 125.307185][ T92] ---[ end trace 0000000000000000 ]---
It is due to a use of printk_deferred(). This goes through the special
case of "level == LOGLEVEL_SCHED" in vprintk_emit(). Originally I had
patched this code as well, but then later removed it thinking that it
was not needed. But it is needed. :-/ Something like:
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index b1c0d35cf3ca..c27fc7fc64eb 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2393,7 +2393,7 @@ asmlinkage int vprintk_emit(int facility, int level,
/* If called from the scheduler, we can not call up(). */
if (level == LOGLEVEL_SCHED) {
level = LOGLEVEL_DEFAULT;
- ft.legacy_offload |= ft.legacy_direct;
+ ft.legacy_offload |= ft.legacy_direct && !console_irqwork_blocked;
ft.legacy_direct = false;
}
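To illustrate the effect of the one-line change above, here is a minimal userspace sketch (not kernel code). The struct flush_sketch type and the adjust_for_deferred() helper are hypothetical names made up for this illustration; only the two flag names and the boolean expression are taken from the diff. The blocked state is passed as a parameter here, whereas the diff reads console_irqwork_blocked directly.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the flush-type flags used in vprintk_emit().
 * Only the two fields touched by the diff are modelled.
 */
struct flush_sketch {
	bool legacy_direct;   /* flush legacy consoles from the caller's context */
	bool legacy_offload;  /* request a deferred flush via irq_work */
};

/*
 * Mirrors the LOGLEVEL_SCHED branch after the proposed change:
 * a direct flush is converted into an offload request only when
 * irq_work is not blocked. Direct flushing is always disabled.
 */
struct flush_sketch adjust_for_deferred(struct flush_sketch ft,
					bool irqwork_blocked)
{
	ft.legacy_offload |= ft.legacy_direct && !irqwork_blocked;
	ft.legacy_direct = false;
	return ft;
}
```

With irqwork_blocked false, the sketch behaves like the old code (direct becomes offload). With irqwork_blocked true, neither flag ends up set, so in this model no irq_work would be requested for a deferred message, which is the path that triggered the warning in the trace above.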
Is this solution ok for you? Do you prefer a follow-up patch or a v3?
John
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 0/2] Fix reported suspend failures
2025-11-13 16:03 [PATCH printk v2 0/2] Fix reported suspend failures John Ogness
` (4 preceding siblings ...)
2025-11-19 15:30 ` Petr Mladek
@ 2025-11-20 13:33 ` Thierry Reding
5 siblings, 0 replies; 17+ messages in thread
From: Thierry Reding @ 2025-11-20 13:33 UTC (permalink / raw)
To: John Ogness
Cc: Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Sherry Sun,
Jacky Bai, Jon Hunter, Derek Barbosa, linux-kernel, stable
On Thu, Nov 13, 2025 at 05:09:46PM +0106, John Ogness wrote:
> This is v2 of a series to address multiple reports [0][1]
> (+ 2 offlist) of suspend failing when NBCON console drivers are
> in use. With the help of NXP and NVIDIA we were able to isolate
> the problem and verify the fix.
>
> v1 is here [2].
>
> The first NBCON drivers appeared in 6.13, so currently there is
> no LTS kernel that requires this series. But it should go into
> 6.17.x and 6.18.
>
> The changes since v1:
>
> - For printk_trigger_flush() add support for all flush types
> that are available. This will prevent printk_trigger_flush()
> from trying to inappropriately queue irq_work after this
> series is applied.
>
> - Add WARN_ON_ONCE() to the printk irq_work queueing functions
> in case they are called when irq_work is blocked. There
> should never be (and currently are no) such callers, but
> these functions are externally available.
>
> John Ogness
>
> [0] https://lore.kernel.org/lkml/80b020fc-c18a-4da4-b222-16da1cab2f4c@nvidia.com
> [1] https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com
> [2] https://lore.kernel.org/lkml/20251111144328.887159-1-john.ogness@linutronix.de
>
> John Ogness (2):
> printk: Allow printk_trigger_flush() to flush all types
> printk: Avoid scheduling irq_work on suspend
>
> kernel/printk/internal.h | 8 ++--
> kernel/printk/nbcon.c | 9 ++++-
> kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++--------
> 3 files changed, 78 insertions(+), 20 deletions(-)
Sorry, I'm a bit late, but this seems to solve all the issues we were
seeing on Tegra boards, so for the record:
Tested-by: Thierry Reding <treding@nvidia.com>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 0/2] Fix reported suspend failures
2025-11-20 11:03 ` John Ogness
@ 2025-11-21 9:55 ` Petr Mladek
0 siblings, 0 replies; 17+ messages in thread
From: Petr Mladek @ 2025-11-21 9:55 UTC (permalink / raw)
To: John Ogness
Cc: Sergey Senozhatsky, Steven Rostedt, Sherry Sun, Jacky Bai,
Jon Hunter, Thierry Reding, Derek Barbosa, linux-kernel, stable
On Thu 2025-11-20 12:09:43, John Ogness wrote:
> Hi Petr,
>
> On 2025-11-19, Petr Mladek <pmladek@suse.com> wrote:
> > JFYI, the patchset has been committed into printk/linux.git,
> > branch rework/suspend-fixes.
>
> While doing more testing I hit the new WARN_ON_ONCE() in
> __wake_up_klogd():
>
> [ 125.306075][ T92] Timekeeping suspended for 9.749 seconds
> [ 125.306093][ T92] ------------[ cut here ]------------
> [ 125.306108][ T92] WARNING: CPU: 0 PID: 92 at kernel/printk/printk.c:4539 vprintk_emit+0x134/0x2e8
> [ 125.306151][ T92] Modules linked in: pm33xx ti_emif_sram wkup_m3_ipc wkup_m3_rproc omap_mailbox rtc_omap
> [ 125.306249][ T92] CPU: 0 UID: 0 PID: 92 Comm: rtcwake Not tainted 6.18.0-rc5-00005-g3d7d27fc1b14 #162 PREEMPT
> [ 125.306276][ T92] Hardware name: Generic AM33XX (Flattened Device Tree)
> [ 125.306290][ T92] Call trace:
> [ 125.306308][ T92] unwind_backtrace from show_stack+0x18/0x1c
> [ 125.306356][ T92] show_stack from dump_stack_lvl+0x50/0x64
> [ 125.306398][ T92] dump_stack_lvl from __warn+0x7c/0x160
> [ 125.306433][ T92] __warn from warn_slowpath_fmt+0x158/0x1f0
> [ 125.306459][ T92] warn_slowpath_fmt from vprintk_emit+0x134/0x2e8
> [ 125.306487][ T92] vprintk_emit from _printk_deferred+0x44/0x84
> [ 125.306520][ T92] _printk_deferred from tk_debug_account_sleep_time+0x78/0x88
> [ 125.306574][ T92] tk_debug_account_sleep_time from timekeeping_inject_sleeptime64+0x3c/0x6c
> [ 125.306624][ T92] timekeeping_inject_sleeptime64 from rtc_resume.part.0+0x158/0x178
> [ 125.306666][ T92] rtc_resume.part.0 from rtc_resume+0x54/0x64
> [ 125.306705][ T92] rtc_resume from dpm_run_callback+0x68/0x1d4
> [ 125.306747][ T92] dpm_run_callback from device_resume+0xc8/0x200
> [ 125.306779][ T92] device_resume from dpm_resume+0x208/0x304
> [ 125.306813][ T92] dpm_resume from dpm_resume_end+0x14/0x24
> [ 125.306846][ T92] dpm_resume_end from suspend_devices_and_enter+0x1e8/0x8a4
> [ 125.306892][ T92] suspend_devices_and_enter from pm_suspend+0x328/0x3c0
> [ 125.306924][ T92] pm_suspend from state_store+0x70/0xd0
> [ 125.306955][ T92] state_store from kernfs_fop_write_iter+0x124/0x1e4
> [ 125.307001][ T92] kernfs_fop_write_iter from vfs_write+0x1f0/0x2bc
> [ 125.307049][ T92] vfs_write from ksys_write+0x68/0xe8
> [ 125.307085][ T92] ksys_write from ret_fast_syscall+0x0/0x58
> [ 125.307113][ T92] Exception stack(0xd025dfa8 to 0xd025dff0)
> [ 125.307137][ T92] dfa0: 00000004 bed09f71 00000004 bed09f71 00000003 00000001
> [ 125.307157][ T92] dfc0: 00000004 bed09f71 00000003 00000004 00510bd4 00000000 00000000 0050e634
> [ 125.307172][ T92] dfe0: 00000004 bed09bd8 b6ebc20b b6e35616
> [ 125.307185][ T92] ---[ end trace 0000000000000000 ]---
>
> It is due to a use of printk_deferred(). This goes through the special
> case of "level == LOGLEVEL_SCHED" in vprintk_emit(). Originally I had
> patched this code as well, but then later removed it thinking that it
> was not needed. But it is needed. :-/ Something like:
Great catch!
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index b1c0d35cf3ca..c27fc7fc64eb 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2393,7 +2393,7 @@ asmlinkage int vprintk_emit(int facility, int level,
> /* If called from the scheduler, we can not call up(). */
> if (level == LOGLEVEL_SCHED) {
> level = LOGLEVEL_DEFAULT;
> - ft.legacy_offload |= ft.legacy_direct;
> + ft.legacy_offload |= ft.legacy_direct && !console_irqwork_blocked;
> ft.legacy_direct = false;
> }
>
> Is this solution ok for you? Do you prefer a follow-up patch or a v3?
Nothing better comes to my mind ;-) A follow-up patch would be
lovely. Please, go ahead.
Best Regards,
Petr
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend
2025-11-13 19:15 ` Derek Barbosa
@ 2025-11-25 19:24 ` Derek Barbosa
2025-11-26 9:22 ` Petr Mladek
0 siblings, 1 reply; 17+ messages in thread
From: Derek Barbosa @ 2025-11-25 19:24 UTC (permalink / raw)
To: John Ogness, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
Sherry Sun, Jacky Bai, Jon Hunter, Thierry Reding, linux-kernel,
stable
On Thu, Nov 13, 2025 at 02:15:09PM -0500, Derek Barbosa wrote:
> Hi John,
>
> On Thu, Nov 13, 2025 at 06:12:57PM +0106, John Ogness wrote:
> >
> > I assume the problem you are seeing is with the PREEMPT_RT patches
> > applied (i.e. with the 8250-NBCON included). If that is the case, note
> > that recent versions of the 8250 driver introduce its own irq_work that
> > is also problematic. I am currently reworking the 8250-NBCON series so
> > that it does not introduce irq_work.
> >
Hi John,
Apologies for the late reply here. Just now got some results in.
Testing this patch series on top of Linus' tree resolves the suspend issue seen on
these large CPU workstation systems.
I see this has already landed in the maintainers tree at printk/linux.git.
Cheers,
--
Derek <debarbos@redhat.com>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend
2025-11-25 19:24 ` Derek Barbosa
@ 2025-11-26 9:22 ` Petr Mladek
0 siblings, 0 replies; 17+ messages in thread
From: Petr Mladek @ 2025-11-26 9:22 UTC (permalink / raw)
To: Derek Barbosa
Cc: John Ogness, Sergey Senozhatsky, Steven Rostedt, Sherry Sun,
Jacky Bai, Jon Hunter, Thierry Reding, linux-kernel, stable
On Tue 2025-11-25 14:24:55, Derek Barbosa wrote:
> On Thu, Nov 13, 2025 at 02:15:09PM -0500, Derek Barbosa wrote:
> > Hi John,
> >
> > On Thu, Nov 13, 2025 at 06:12:57PM +0106, John Ogness wrote:
> > >
> > > I assume the problem you are seeing is with the PREEMPT_RT patches
> > > applied (i.e. with the 8250-NBCON included). If that is the case, note
> > > that recent versions of the 8250 driver introduce its own irq_work that
> > > is also problematic. I am currently reworking the 8250-NBCON series so
> > > that it does not introduce irq_work.
> > >
>
>
> Hi John,
>
> Apologies for the late reply here. Just now got some results in.
No problem at all.
> Testing this patch series on top of Linus' tree resolves the suspend issue seen on
> these large CPU workstation systems.
Thanks a lot for checking the patches. It is great to know that it
resolved the problem.
> I see this has already landed in the maintainers tree at printk/linux.git.
Yes, I wanted to have it in linux-next in time before the merge window opens
for 6.19 (likely next week).
Best Regards,
Petr
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2025-11-26 9:22 UTC | newest]
Thread overview: 17+ messages
2025-11-13 16:03 [PATCH printk v2 0/2] Fix reported suspend failures John Ogness
2025-11-13 16:03 ` [PATCH printk v2 1/2] printk: Allow printk_trigger_flush() to flush all types John Ogness
2025-11-13 16:20 ` kernel test robot
2025-11-14 13:42 ` Petr Mladek
2025-11-13 16:03 ` [PATCH printk v2 2/2] printk: Avoid scheduling irq_work on suspend John Ogness
2025-11-13 16:38 ` Derek Barbosa
2025-11-13 17:06 ` John Ogness
2025-11-13 19:15 ` Derek Barbosa
2025-11-25 19:24 ` Derek Barbosa
2025-11-26 9:22 ` Petr Mladek
2025-11-14 14:55 ` Petr Mladek
2025-11-14 14:57 ` [PATCH printk v2 0/2] Fix reported suspend failures Petr Mladek
2025-11-16 12:14 ` Sherry Sun
2025-11-19 15:30 ` Petr Mladek
2025-11-20 11:03 ` John Ogness
2025-11-21 9:55 ` Petr Mladek
2025-11-20 13:33 ` Thierry Reding