From: Petr Mladek <pmladek@suse.com>
To: ysard <ysard_git@gmx.fr>
Cc: John Ogness <john.ogness@linutronix.de>,
linux-kernel@vger.kernel.org, senozhatsky@chromium.org
Subject: Re: Regression: system freeze on resume from suspend introduced by printk per-console suspended state
Date: Wed, 28 Jan 2026 15:00:40 +0100
Message-ID: <aXoWiJhcOaGGlcmk@pathway.suse.cz>
In-Reply-To: <trinity-1a192a62-60e6-4c78-a0fb-2705ba7f6832-1769217761219@3c-app-mailcom-bs16>
On Sat 2026-01-24 02:22:41, ysard wrote:
> On Fri 2026-01-23 13:19:34 +0100, Petr Mladek wrote:
> > Also I would expect that the userspace waits until the services
> > finish the job before suspending the kernel.
>
> It does:
>
> janv. 24 00:33:41 systemd[1]: Reached target sleep.target - Sleep.
> janv. 24 00:33:41 systemd[1]: Starting nvidia-suspend.service - NVIDIA system suspend actions...
> janv. 24 00:33:41 suspend[51525]: nvidia-suspend.service
> janv. 24 00:33:41 logger[51525]: <13>Jan 24 00:33:41 suspend: nvidia-suspend.service
> janv. 24 00:33:42 kernel: audit: type=1400 audit(1769211222.373:2351): apparmor="ALLOWED" operation="open" class="file" profile="Xorg" name="/dev/nvidiactl" pid=1441 comm="Xorg" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
> janv. 24 00:33:42 kernel: audit: type=1400 audit(1769211222.969:2352): apparmor="ALLOWED" operation="open" class="file" profile="Xorg" name="/dev/nvidiactl" pid=1441 comm="Xorg" requested_mask="wr" denied_mask="wr" fsuid=0 ouid=0
> janv. 24 00:33:45 systemd[1]: nvidia-suspend.service: Deactivated successfully.
> janv. 24 00:33:45 systemd[1]: Finished nvidia-suspend.service - NVIDIA system suspend actions.
> janv. 24 00:33:45 systemd[1]: Starting systemd-suspend.service - System Suspend...
> janv. 24 00:33:45 systemd[1]: session-1.scope: Unit now frozen-by-parent.
> janv. 24 00:33:45 systemd[1]: user@1000.service: Unit now frozen-by-parent.
> janv. 24 00:33:45 systemd[1]: user-1000.slice: Unit now frozen-by-parent.
> janv. 24 00:33:45 systemd[1]: user.slice: Unit now frozen.
> janv. 24 00:33:45 systemd-sleep[51562]: Successfully froze unit 'user.slice'.
> janv. 24 00:33:45 systemd-sleep[51562]: Performing sleep operation 'suspend'...
> janv. 24 00:33:45 kernel: PM: suspend entry (deep)
OK.
> Yes I have a reproducible pattern here. With the service disabled.
> The service `nvidia-resume.service` (which basically calls the script
> with the 'resume' argument) is expected to start if the resume is
> completed, but the system does not reach this stage during the freeze.
>
> No freeze:
> $ sudo sh -c "
> mkdir -p /var/run/nvidia-sleep \
> && echo 2 > /var/run/nvidia-sleep/Xorg.vt_number \
> && chvt 63 \
> && systemctl suspend"
>
> Freeze:
> $ sudo sh -c "
> mkdir -p /var/run/nvidia-sleep \
> && echo 2 > /var/run/nvidia-sleep/Xorg.vt_number \
> && chvt 63 \
> && echo suspend >/proc/driver/nvidia/suspend \
> && systemctl suspend"
>
> So the problem is related to this command:
> $ echo suspend >/proc/driver/nvidia/suspend
>
> Note that without the systemctl order this command suspends and wakes up the gpu correctly:
> $ sudo sh -c "
> chvt 63 \
> && echo suspend >/proc/driver/nvidia/suspend; \
> sleep 4; \
> echo resume >/proc/driver/nvidia/suspend; \
> chvt 2"
Interesting. It looks like the nvidia suspend does something which
breaks the system suspend. But the driver is able to revert it...
To be honest, I do not have any theory which could explain this.
But I have found a bug in John's debug patch from
https://lore.kernel.org/all/877bts1ltv.fsf@jogness.linutronix.de/
The patch tried to restore the original behavior on current mainline.
But the console_suspend()/console_resume() functions were recently renamed
to console_suspend_all()/console_resume_all(). The original
names were used for console-specific suspend/resume variants,
see
https://lore.kernel.org/all/20250226-printk-renaming-v1-0-0b878577f2e6@suse.com/
Also the debug patch did not revert synchronize_srcu(). I guess that
this was intentional. But I would rather revert it as well because
it is a potentially blocking operation.
Could you please test it with this fixed version of the debug patch?
If the patch helps, by chance, then please try to uncomment
the synchronize_srcu() calls and check if it still works.
I wonder if they make any difference.
From a36b57cbcb239e7e5af4fb8278690cd4965d6fc0 Mon Sep 17 00:00:00 2001
From: John Ogness <john.ogness@linutronix.de>
Date: Thu, 8 Jan 2026 10:49:24 +0106
Subject: [DEBUG v2] printk: Debug new vs. old suspend/resume behavior
This is just for debugging. It should restore the old console_lock
behavior for suspend/resume and also add some debugging information.
Please compile with CONFIG_PRINTK_CALLER=y so that we can see which
tasks are locking/unlocking the console during suspend/resume.
Changes against v1:
- Set/Clear the global "console_suspended" variable in
console_suspend_all()/console_resume_all() instead of
console_suspend()/console_resume().
The functions have been renamed recently, see
https://lore.kernel.org/all/20250226-printk-renaming-v1-0-0b878577f2e6@suse.com/
- Do not call synchronize_srcu() in the suspend/resume functions.
They are another potentially blocking operation added by
the problematic commit 9e70a5e109a4a2336 ("printk: Add per-console
suspended state").
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
kernel/printk/printk.c | 64 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 62 insertions(+), 2 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1d765ad242b8..23fddc4006d3 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -356,6 +356,22 @@ static void __up_console_sem(unsigned long ip)
*/
static int console_locked;
+static int console_suspended;
+
+int vprintk_store(int facility, int level,
+ const struct dev_printk_info *dev_info,
+ const char *fmt, va_list args);
+
+/* Helper function to store-only. */
+static void printk_store(const char *fmt, ...)
+{
+ va_list args;
+
+ va_start(args, fmt);
+ vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args);
+ va_end(args);
+}
+
/*
* Array of consoles built from command line options (console=)
*/
@@ -2748,6 +2764,12 @@ void console_suspend_all(void)
if (!console_suspend_enabled)
return;
+ console_lock();
+ console_suspended = 1;
+ printk_store(KERN_INFO "printk: %s\n", __func__);
+ /* Unlock directly (i.e. without clearing @console_locked). */
+ up_console_sem();
+
console_list_lock();
for_each_console(con)
console_srcu_write_flags(con, con->flags | CON_SUSPENDED);
@@ -2759,7 +2781,7 @@ void console_suspend_all(void)
* is guaranteed that all printing has stopped when this function
* completes.
*/
- synchronize_srcu(&console_srcu);
+// synchronize_srcu(&console_srcu);
}
void console_resume_all(void)
@@ -2785,7 +2807,17 @@ void console_resume_all(void)
* contexts must be able to see they are no longer suspended so
* that they are guaranteed to wake up and resume printing.
*/
- synchronize_srcu(&console_srcu);
+// synchronize_srcu(&console_srcu);
+
+ down_console_sem();
+ printk_store(KERN_INFO "printk: %s\n", __func__);
+ console_suspended = 0;
+ /*
+ * Perform a regular unlock.
+ * Here console_locked=1 and console_may_schedule=1.
+ * @console_locked will be cleared.
+ */
+ console_unlock();
}
printk_get_console_flush_type(&ft);
@@ -2841,6 +2873,15 @@ void console_lock(void)
msleep(1000);
down_console_sem();
+ if (console_suspended) {
+ printk_store(KERN_INFO "printk: %s\n", __func__);
+ /*
+ * Keep console locked, but do not touch
+ * @console_locked or @console_may_schedule.
+ * (Although they will both be 1 here anyway.)
+ */
+ return;
+ }
console_locked = 1;
console_may_schedule = 1;
}
@@ -2861,6 +2902,15 @@ int console_trylock(void)
return 0;
if (down_trylock_console_sem())
return 0;
+ if (console_suspended) {
+ printk_store(KERN_INFO "printk: %s\n", __func__);
+ /*
+ * The lock was acquired, but unlock directly and report
+ * failure. Here console_locked=1 and console_may_schedule=1.
+ */
+ up_console_sem();
+ return 0;
+ }
console_locked = 1;
console_may_schedule = 0;
return 1;
@@ -3354,6 +3404,16 @@ void console_unlock(void)
{
struct console_flush_type ft;
+ if (console_suspended) {
+ printk_store(KERN_INFO "printk: %s\n", __func__);
+ /*
+ * Simply unlock directly.
+ * Here console_locked=1 and console_may_schedule=1.
+ */
+ up_console_sem();
+ return;
+ }
+
printk_get_console_flush_type(&ft);
if (ft.legacy_direct)
__console_flush_and_unlock();
--
2.52.0
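
For anyone who wants to reason about the patch without booting it, here
is a rough userspace sketch of the lock handling it is meant to restore.
It is not kernel code and only an approximation: sem_count stands in for
console_sem, all model_*() names are made up for illustration, the
printk_store() calls and console_may_schedule are left out, and a real
down() would sleep where the model simply asserts that the semaphore is
free.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel state (assumption: single-threaded model). */
static int sem_count = 1;	/* models console_sem as a binary semaphore */
static int console_locked;
static int console_suspended;

/* Non-blocking "down": returns true when the semaphore was taken. */
static bool sem_trydown(void)
{
	if (sem_count > 0) {
		sem_count--;
		return true;
	}
	return false;
}

static void sem_up(void)
{
	sem_count++;
}

static void model_console_lock(void)
{
	bool ok = sem_trydown();	/* the kernel would sleep here instead */

	assert(ok);
	(void)ok;
	if (console_suspended)
		return;			/* keep the sem, do not touch console_locked */
	console_locked = 1;
}

static int model_console_trylock(void)
{
	if (!sem_trydown())
		return 0;
	if (console_suspended) {
		sem_up();		/* acquired, but report failure */
		return 0;
	}
	console_locked = 1;
	return 1;
}

static void model_console_unlock(void)
{
	if (console_suspended) {
		sem_up();		/* unlock directly, no flushing */
		return;
	}
	console_locked = 0;		/* regular unlock path (flushing omitted) */
	sem_up();
}

static void model_console_suspend_all(void)
{
	model_console_lock();
	console_suspended = 1;
	sem_up();			/* console_locked intentionally stays 1 */
}

static void model_console_resume_all(void)
{
	bool ok = sem_trydown();

	assert(ok);
	(void)ok;
	console_suspended = 0;
	model_console_unlock();		/* regular unlock clears console_locked */
}
```

In this model, console_locked stays set for the whole time between
model_console_suspend_all() and model_console_resume_all(), every
model_console_trylock() in that window fails, and
model_console_lock()/model_console_unlock() only take and release the
semaphore. That is the old behavior the debug patch tries to bring back.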