* [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work()
@ 2026-05-25 13:08 Jinjie Ruan
2026-06-11 1:40 ` Jinjie Ruan
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Jinjie Ruan @ 2026-05-25 13:08 UTC (permalink / raw)
To: alexandre.belloni, linux-rtc, linux-kernel; +Cc: ruanjinjie
In virtualization environments like QEMU [1], or during hardware
clocksource anomalies, an extreme time-warp event can occur. When
the system time abruptly jumps forward, the rtc_timer_do_work() handler
falls into a prolonged processing loop to clear accumulated historical
timers via timerqueue_getnext(). Running this loop indefinitely under
the rtc->ops_lock mutex triggers a kernel softlockup, stalling
the system.
Introduce an adaptive telemetry and loop guard mechanism to enhance debug
visibility and prevent softlockups:
1. Record `start_jiffies` upon entry and leverage `time_after()` to
check if the loop has monopolized the CPU for more than 1s (HZ). If so,
the handler prints a telemetry warning, triggers a WARN stack dump, and
breaks the loop to safely yield the CPU.
2. Track the execution via a `loop_count` metric. Printing this counter
in the warning log provides vital diagnostics to distinguish
an aggressive time-warp storm (high count) from a bogged-down callback
bug (low count).
3. Utilize the kernel format specifier `%ptR` to convert the raw ktime
into a human-readable timestamp (YYYY-MM-DD HH:MM:SS), allowing
developers to instantly pinpoint the exact boundary of the time
jump in dmesg.
This non-destructive telemetry guard provides precise hardware/emulator
diagnostic visibility while ensuring core kernel availability.
[1]: https://lore.kernel.org/all/20260114013257.3500578-1-ruanjinjie@huawei.com/
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
drivers/rtc/interface.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index 1906f4884a83..f6c5fd16cc4e 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -927,10 +927,12 @@ static void rtc_timer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
*/
void rtc_timer_do_work(struct work_struct *work)
{
- struct rtc_timer *timer;
+ unsigned long start_jiffies = jiffies;
struct timerqueue_node *next;
- ktime_t now;
+ struct rtc_timer *timer;
struct rtc_time tm;
+ int loop_count = 0;
+ ktime_t now;
int err;
struct rtc_device *rtc =
@@ -945,6 +947,15 @@ void rtc_timer_do_work(struct work_struct *work)
}
now = rtc_tm_to_ktime(tm);
while ((next = timerqueue_getnext(&rtc->timerqueue))) {
+ loop_count++;
+
+ if (unlikely(time_after(jiffies, start_jiffies + HZ))) {
+ dev_warn(&rtc->dev, "RTC time jump (loop: %d) to %ptR.\n",
+ loop_count, &tm);
+ WARN_ON_ONCE(1);
+ break;
+ }
+
if (next->expires > now)
break;
--
2.34.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work()
2026-05-25 13:08 [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work() Jinjie Ruan
@ 2026-06-11 1:40 ` Jinjie Ruan
2026-06-11 2:01 ` Jinjie Ruan
2026-06-15 15:22 ` Alexandre Belloni
2 siblings, 0 replies; 4+ messages in thread
From: Jinjie Ruan @ 2026-06-11 1:40 UTC (permalink / raw)
To: alexandre.belloni, linux-rtc, linux-kernel
Gentle ping.
On 5/25/2026 9:08 PM, Jinjie Ruan wrote:
> In virtualization environments like QEMU [1], or during hardware
> clocksource anomalies, an extreme time-warp event can occur. When
> the system time abruptly jumps forward, the rtc_timer_do_work() handler
> falls into a prolonged processing loop to clear accumulated historical
> timers via timerqueue_getnext(). Running this loop indefinitely under
> the rtc->ops_lock mutex triggers a kernel softlockup, stalling
> the system.
>
> Introduce an adaptive telemetry and loop guard mechanism to enhance debug
> visibility and prevent softlockups:
>
> 1. Record `start_jiffies` upon entry and leverage `time_after()` to
> check if the loop has monopolized the CPU for more than 1s (HZ). If so,
> the handler prints a telemetry warning, triggers a WARN stack dump, and
> breaks the loop to safely yield the CPU.
>
> 2. Track the execution via a `loop_count` metric. Printing this counter
> in the warning log provides vital diagnostics to distinguish
> an aggressive time-warp storm (high count) from a bogged-down callback
> bug (low count).
>
> 3. Utilize the kernel format specifier `%ptR` to convert the raw ktime
> into a human-readable timestamp (YYYY-MM-DD HH:MM:SS), allowing
> developers to instantly pinpoint the exact boundary of the time
> jump in dmesg.
>
> This non-destructive telemetry guard provides precise hardware/emulator
> diagnostic visibility while ensuring core kernel availability.
>
> [1]: https://lore.kernel.org/all/20260114013257.3500578-1-ruanjinjie@huawei.com/
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> drivers/rtc/interface.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
> index 1906f4884a83..f6c5fd16cc4e 100644
> --- a/drivers/rtc/interface.c
> +++ b/drivers/rtc/interface.c
> @@ -927,10 +927,12 @@ static void rtc_timer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
> */
> void rtc_timer_do_work(struct work_struct *work)
> {
> - struct rtc_timer *timer;
> + unsigned long start_jiffies = jiffies;
> struct timerqueue_node *next;
> - ktime_t now;
> + struct rtc_timer *timer;
> struct rtc_time tm;
> + int loop_count = 0;
> + ktime_t now;
> int err;
>
> struct rtc_device *rtc =
> @@ -945,6 +947,15 @@ void rtc_timer_do_work(struct work_struct *work)
> }
> now = rtc_tm_to_ktime(tm);
> while ((next = timerqueue_getnext(&rtc->timerqueue))) {
> + loop_count++;
> +
> + if (unlikely(time_after(jiffies, start_jiffies + HZ))) {
> + dev_warn(&rtc->dev, "RTC time jump (loop: %d) to %ptR.\n",
> + loop_count, &tm);
> + WARN_ON_ONCE(1);
> + break;
> + }
> +
> if (next->expires > now)
> break;
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work()
2026-05-25 13:08 [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work() Jinjie Ruan
2026-06-11 1:40 ` Jinjie Ruan
@ 2026-06-11 2:01 ` Jinjie Ruan
2026-06-15 15:22 ` Alexandre Belloni
2 siblings, 0 replies; 4+ messages in thread
From: Jinjie Ruan @ 2026-06-11 2:01 UTC (permalink / raw)
To: alexandre.belloni, linux-rtc, linux-kernel, tglx
+cc Thomas Gleixner
On 5/25/2026 9:08 PM, Jinjie Ruan wrote:
> In virtualization environments like QEMU [1], or during hardware
> clocksource anomalies, an extreme time-warp event can occur. When
> the system time abruptly jumps forward, the rtc_timer_do_work() handler
> falls into a prolonged processing loop to clear accumulated historical
> timers via timerqueue_getnext(). Running this loop indefinitely under
> the rtc->ops_lock mutex triggers a kernel softlockup, stalling
> the system.
>
> Introduce an adaptive telemetry and loop guard mechanism to enhance debug
> visibility and prevent softlockups:
>
> 1. Record `start_jiffies` upon entry and leverage `time_after()` to
> check if the loop has monopolized the CPU for more than 1s (HZ). If so,
> the handler prints a telemetry warning, triggers a WARN stack dump, and
> breaks the loop to safely yield the CPU.
>
> 2. Track the execution via a `loop_count` metric. Printing this counter
> in the warning log provides vital diagnostics to distinguish
> an aggressive time-warp storm (high count) from a bogged-down callback
> bug (low count).
>
> 3. Utilize the kernel format specifier `%ptR` to convert the raw ktime
> into a human-readable timestamp (YYYY-MM-DD HH:MM:SS), allowing
> developers to instantly pinpoint the exact boundary of the time
> jump in dmesg.
>
> This non-destructive telemetry guard provides precise hardware/emulator
> diagnostic visibility while ensuring core kernel availability.
>
> [1]: https://lore.kernel.org/all/20260114013257.3500578-1-ruanjinjie@huawei.com/
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> drivers/rtc/interface.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
> index 1906f4884a83..f6c5fd16cc4e 100644
> --- a/drivers/rtc/interface.c
> +++ b/drivers/rtc/interface.c
> @@ -927,10 +927,12 @@ static void rtc_timer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
> */
> void rtc_timer_do_work(struct work_struct *work)
> {
> - struct rtc_timer *timer;
> + unsigned long start_jiffies = jiffies;
> struct timerqueue_node *next;
> - ktime_t now;
> + struct rtc_timer *timer;
> struct rtc_time tm;
> + int loop_count = 0;
> + ktime_t now;
> int err;
>
> struct rtc_device *rtc =
> @@ -945,6 +947,15 @@ void rtc_timer_do_work(struct work_struct *work)
> }
> now = rtc_tm_to_ktime(tm);
> while ((next = timerqueue_getnext(&rtc->timerqueue))) {
> + loop_count++;
> +
> + if (unlikely(time_after(jiffies, start_jiffies + HZ))) {
> + dev_warn(&rtc->dev, "RTC time jump (loop: %d) to %ptR.\n",
> + loop_count, &tm);
> + WARN_ON_ONCE(1);
> + break;
> + }
> +
> if (next->expires > now)
> break;
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work()
2026-05-25 13:08 [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work() Jinjie Ruan
2026-06-11 1:40 ` Jinjie Ruan
2026-06-11 2:01 ` Jinjie Ruan
@ 2026-06-15 15:22 ` Alexandre Belloni
2 siblings, 0 replies; 4+ messages in thread
From: Alexandre Belloni @ 2026-06-15 15:22 UTC (permalink / raw)
To: Jinjie Ruan; +Cc: linux-rtc, linux-kernel
Hello,
On 25/05/2026 21:08:25+0800, Jinjie Ruan wrote:
> In virtualization environments like QEMU [1], or during hardware
> clocksource anomalies, an extreme time-warp event can occur. When
> the system time abruptly jumps forward, the rtc_timer_do_work() handler
> falls into a prolonged processing loop to clear accumulated historical
> timers via timerqueue_getnext(). Running this loop indefinitely under
> the rtc->ops_lock mutex triggers a kernel softlockup, stalling
> the system.
>
> Introduce an adaptive telemetry and loop guard mechanism to enhance debug
> visibility and prevent softlockups:
>
> 1. Record `start_jiffies` upon entry and leverage `time_after()` to
> check if the loop has monopolized the CPU for more than 1s (HZ). If so,
> the handler prints a telemetry warning, triggers a WARN stack dump, and
> breaks the loop to safely yield the CPU.
>
> 2. Track the execution via a `loop_count` metric. Printing this counter
> in the warning log provides vital diagnostics to distinguish
> an aggressive time-warp storm (high count) from a bogged-down callback
> bug (low count).
>
> 3. Utilize the kernel format specifier `%ptR` to convert the raw ktime
> into a human-readable timestamp (YYYY-MM-DD HH:MM:SS), allowing
> developers to instantly pinpoint the exact boundary of the time
> jump in dmesg.
>
> This non-destructive telemetry guard provides precise hardware/emulator
> diagnostic visibility while ensuring core kernel availability.
>
> [1]: https://lore.kernel.org/all/20260114013257.3500578-1-ruanjinjie@huawei.com/
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> drivers/rtc/interface.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
> index 1906f4884a83..f6c5fd16cc4e 100644
> --- a/drivers/rtc/interface.c
> +++ b/drivers/rtc/interface.c
> @@ -927,10 +927,12 @@ static void rtc_timer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
> */
> void rtc_timer_do_work(struct work_struct *work)
> {
> - struct rtc_timer *timer;
> + unsigned long start_jiffies = jiffies;
> struct timerqueue_node *next;
> - ktime_t now;
> + struct rtc_timer *timer;
> struct rtc_time tm;
> + int loop_count = 0;
> + ktime_t now;
> int err;
>
> struct rtc_device *rtc =
> @@ -945,6 +947,15 @@ void rtc_timer_do_work(struct work_struct *work)
> }
> now = rtc_tm_to_ktime(tm);
> while ((next = timerqueue_getnext(&rtc->timerqueue))) {
> + loop_count++;
> +
> + if (unlikely(time_after(jiffies, start_jiffies + HZ))) {
> + dev_warn(&rtc->dev, "RTC time jump (loop: %d) to %ptR.\n",
> + loop_count, &tm);
> + WARN_ON_ONCE(1);
So, your issue is that it is too slow so you make it even slower? There
are already plenty of tracepoints that allow proper debugging in this
loop, I'm pretty sure we don't want to bloat the kernel with more
messages.
--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-06-15 15:22 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-25 13:08 [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work() Jinjie Ruan
2026-06-11 1:40 ` Jinjie Ruan
2026-06-11 2:01 ` Jinjie Ruan
2026-06-15 15:22 ` Alexandre Belloni
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.