public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Michael Nazzareno Trimarchi <michael@amarulasolutions.com>
Cc: John Stultz <jstultz@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Stephen Boyd <sboyd@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Michael <michael@mipisi.de>,
	kernel-team@android.com, Peter Zijlstra <peterz@infradead.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>
Subject: Re: [RFC][PATCH 2/2] time: alarmtimer: Use TASK_FREEZABLE to cleanup freezer handling
Date: Mon, 20 Feb 2023 22:18:57 +0100	[thread overview]
Message-ID: <87sff0ox1a.ffs@tglx> (raw)
In-Reply-To: <CAOf5uw=a2RYYFj+4_WOX+KaF_FCXSsUgfM+T2m2XjVuqKMdooA@mail.gmail.com>

Michael!

On Mon, Feb 20 2023 at 19:11, Michael Nazzareno Trimarchi wrote:
> On Mon, Feb 20, 2023 at 6:48 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> [   27.349352] alarmtimer_enqueue()
>>
>> U: Before SUSPEND
>>
>> [   31.353228] PM: suspend entry (s2idle)
>> [   31.388857] Filesystems sync: 0.033 seconds
>> [   31.418427] Freezing user space processes
>> [   31.422406] Freezing user space processes completed (elapsed 0.002 seconds)
>> [   31.425435] OOM killer disabled.
>> [   31.426833] Freezing remaining freezable tasks
>> [   31.429838] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
>> [   31.432922] printk: Suspending console(s) (use no_console_suspend to debug)
>> [   31.435912] alarmtimer alarmtimer.0.auto: PM: dpm_run_callback(): platform_pm_suspend+0x0/0x50 returns -16
>> [   31.435954] alarmtimer alarmtimer.0.auto: PM: failed to suspend: error -16
>>
>> That means the RTC interrupt was raised before the system was able to suspend.
>
> if (ktime_to_ns(min) < 2 * NSEC_PER_SEC) {
>     pm_wakeup_event(dev, 2 * MSEC_PER_SEC);
>     return -EBUSY;
> }
>
> I think that above happens to you. So it means that you are too close
> to this limit, can be?

Maybe.

> Yes but the alarm for me was set to be fired just before freezing. Is
> this a valid scenario?

Sure why not? 

> Let's say that I set an alarm to be fired just before the userspace
> freeze happens. If I'm close to it then then process will not process
> the signal to enquene again the alarm in the list and then during
> alarm suspend the list will be empty for the above.

Halfways, but slowly your explanations start to make sense. Here is the
dmesg output you provided:

> [   89.674127] PM: suspend entry (deep)
> [   89.714916] Filesystems sync: 0.037 seconds
> [   89.733594] Freezing user space processes
> [   89.740680] Freezing user space processes completed (elapsed 0.002 seconds)

User space tasks are frozen now.

> [   89.748593] OOM killer disabled.
> [   89.752257] Freezing remaining freezable tasks
> [   89.756807] alarmtimer_fired: called
> [   89.756831] alarmtimer_dequeue: called <---- HERE

Here fires the underlying hrtimer before devices are suspended, so the
sig_sendqueue() cannot wake up the task because task->state ==
TASK_FROZEN, which means the signal wont be handled and the timer wont
be rearmed until the task is thawed.

And as you correctly observed the alarmtimer_suspend() path won't see a
pending timer anymore because it is dequeued.

So precisely the time between freeze(alarmtask) and alarmtimer_suspend()
is a gaping hole which guarantees lost wakeups.

That's completely unrelated to Johns patches. That hole has been there
forever.

I assume that this horrible freezer hackery was supposed to plug that
hole, but that gem is not solving anything as far as I understand what
it is doing. I'm still failing to understand what it is supposed to
solve, but that's a different story.

Aside of that I can clearly reproduce the issue, now that I understand
what you are trying to tell, on plain 6.2 without and with John's
cleanup.

Something like the below plugs the hole reliably.

Thanks,

        tglx
---
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -26,6 +26,7 @@
 #include <linux/freezer.h>
 #include <linux/compat.h>
 #include <linux/module.h>
+#include <linux/suspend.h>
 #include <linux/time_namespace.h>
 
 #include "posix-timers.h"
@@ -176,6 +177,7 @@ static void alarmtimer_dequeue(struct al
 	alarm->state &= ~ALARMTIMER_STATE_ENQUEUED;
 }
 
+static atomic_t alarmtimer_wakeup;
 
 /**
  * alarmtimer_fired - Handles alarm hrtimer being fired.
@@ -194,6 +196,8 @@ static enum hrtimer_restart alarmtimer_f
 	int ret = HRTIMER_NORESTART;
 	int restart = ALARMTIMER_NORESTART;
 
+	atomic_inc(&alarmtimer_wakeup);
+
 	spin_lock_irqsave(&base->lock, flags);
 	alarmtimer_dequeue(base, alarm);
 	spin_unlock_irqrestore(&base->lock, flags);
@@ -244,6 +248,16 @@ static int alarmtimer_suspend(struct dev
 	if (!rtc)
 		return 0;
 
+	/*
+	 * Handle wakeups which happened between the start of suspend and
+	 * now as those wakeups might have tried to wake up a frozen task
+	 * which means they are not longer in the alarm timer list.
+	 */
+	if (atomic_read(&alarmtimer_wakeup)) {
+		pm_wakeup_event(dev, 0);
+		return -EBUSY;
+	}
+
 	/* Find the soonest timer to expire*/
 	for (i = 0; i < ALARM_NUMTYPE; i++) {
 		struct alarm_base *base = &alarm_bases[i];
@@ -296,6 +310,31 @@ static int alarmtimer_resume(struct devi
 	return 0;
 }
 
+static int alarmtimer_pm_notifier_fn(struct notifier_block *bl, unsigned long state,
+				     void *unused)
+{
+	switch (state) {
+	case PM_HIBERNATION_PREPARE:
+	case PM_POST_HIBERNATION:
+		atomic_set(&alarmtimer_wakeup, 0);
+		break;
+	}
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block alarmtimer_pm_notifier = {
+	.notifier_call = alarmtimer_pm_notifier_fn,
+};
+
+static inline int alarmtimer_register_pm_notifier(void)
+{
+	return register_pm_notifier(&alarmtimer_pm_notifier);
+}
+
+static inline void alarmtimer_unregister_pm_notifier(void)
+{
+	unregister_pm_notifier(&alarmtimer_pm_notifier);
+}
 #else
 static int alarmtimer_suspend(struct device *dev)
 {
@@ -306,6 +345,15 @@ static int alarmtimer_resume(struct devi
 {
 	return 0;
 }
+
+static inline int alarmtimer_register_pm_notifier(void)
+{
+	return 0;
+}
+
+static inline void alarmtimer_unregister_pm_notifier(void)
+{
+}
 #endif
 
 static void
@@ -904,11 +952,17 @@ static int __init alarmtimer_init(void)
 	if (error)
 		return error;
 
-	error = platform_driver_register(&alarmtimer_driver);
+	error = alarmtimer_register_pm_notifier();
 	if (error)
 		goto out_if;
 
+	error = platform_driver_register(&alarmtimer_driver);
+	if (error)
+		goto out_pm;
+
 	return 0;
+out_pm:
+	alarmtimer_unregister_pm_notifier();
 out_if:
 	alarmtimer_rtc_interface_remove();
 	return error;

        







  reply	other threads:[~2023-02-20 21:19 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-11  6:45 [RFC][PATCH 1/2] time: alarmtimer: Fix erroneous case of using 0 as an "invalid" initialization value John Stultz
2023-02-11  6:45 ` [RFC][PATCH 2/2] time: alarmtimer: Use TASK_FREEZABLE to cleanup freezer handling John Stultz
2023-02-15  8:22   ` Michael Nazzareno Trimarchi
2023-02-15 13:52     ` Thomas Gleixner
2023-02-18 14:56   ` Michael Nazzareno Trimarchi
2023-02-20  7:23     ` Thomas Gleixner
2023-02-20  8:23       ` Michael Nazzareno Trimarchi
2023-02-20 11:47         ` Michael Nazzareno Trimarchi
2023-02-20 17:48           ` Thomas Gleixner
2023-02-20 18:11             ` Michael Nazzareno Trimarchi
2023-02-20 21:18               ` Thomas Gleixner [this message]
2023-02-20 21:32                 ` Michael Nazzareno Trimarchi
2023-02-21  0:12                   ` Thomas Gleixner
2023-02-21  7:10                     ` Michael Nazzareno Trimarchi
2023-02-24 10:02                       ` Michael Nazzareno Trimarchi
2023-02-24 10:32                         ` Michael Nazzareno Trimarchi
2023-02-24 11:57                           ` Michael Nazzareno Trimarchi
2023-02-28  0:03                 ` John Stultz
2023-02-28  4:06                   ` John Stultz
2023-03-01 22:11                     ` Thomas Gleixner
2023-03-02  0:47                       ` John Stultz
2023-03-02  9:34                         ` Michael Nazzareno Trimarchi
2023-03-02 15:00                         ` Rafael J. Wysocki
2023-03-15 20:12                           ` Michael Nazzareno Trimarchi
2023-03-02 14:54                       ` Rafael J. Wysocki
2023-03-02 14:56                         ` Rafael J. Wysocki
2023-03-02 14:32                 ` Rafael J. Wysocki
2023-03-02 22:21                   ` Thomas Gleixner
2023-03-02 22:58                     ` Thomas Gleixner
2024-06-27  7:46                       ` Michael Nazzareno Trimarchi
2024-07-13 10:47                         ` Thomas Gleixner
2024-07-15  8:20                           ` Michael Nazzareno Trimarchi
2023-02-21 11:50   ` Michael Nazzareno Trimarchi
2024-01-11  8:28   ` Michael Nazzareno Trimarchi
2023-02-18 14:51 ` [RFC][PATCH 1/2] time: alarmtimer: Fix erroneous case of using 0 as an "invalid" initialization value Michael Nazzareno Trimarchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sff0ox1a.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=arnd@arndb.de \
    --cc=jstultz@google.com \
    --cc=kernel-team@android.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael@amarulasolutions.com \
    --cc=michael@mipisi.de \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=sboyd@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox