Linux RTC
 help / color / mirror / Atom feed
* [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work()
@ 2026-05-25 13:08 Jinjie Ruan
  2026-06-11  1:40 ` Jinjie Ruan
  2026-06-11  2:01 ` Jinjie Ruan
  0 siblings, 2 replies; 3+ messages in thread
From: Jinjie Ruan @ 2026-05-25 13:08 UTC (permalink / raw)
  To: alexandre.belloni, linux-rtc, linux-kernel; +Cc: ruanjinjie

In virtualization environments like QEMU [1], or during hardware
clocksource anomalies, an extreme time-warp event can occur. When
the system time abruptly jumps forward, the rtc_timer_do_work() handler
falls into a prolonged processing loop to clear accumulated historical
timers via timerqueue_getnext(). Running this loop indefinitely under
the rtc->ops_lock mutex triggers a kernel softlockup, stalling
the system.

Introduce an adaptive telemetry and loop guard mechanism to enhance debug
visibility and prevent softlockups:

1. Record `start_jiffies` upon entry and leverage `time_after()` to
   check if the loop has monopolized the CPU for more than 1s (HZ). If so,
   the handler prints a telemetry warning, triggers a WARN stack dump, and
   breaks the loop to safely yield the CPU.

2. Track the execution via a `loop_count` metric. Printing this counter
   in the warning log provides vital diagnostics to distinguish
   an aggressive time-warp storm (high count) from a bogged-down callback
   bug (low count).

3. Utilize the kernel format specifier `%ptR` to convert the raw ktime
   into a human-readable timestamp (YYYY-MM-DD HH:MM:SS), allowing
   developers to instantly pinpoint the exact boundary of the time
   jump in dmesg.

This non-destructive telemetry guard provides precise hardware/emulator
diagnostic visibility while ensuring core kernel availability.

[1]: https://lore.kernel.org/all/20260114013257.3500578-1-ruanjinjie@huawei.com/
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
 drivers/rtc/interface.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index 1906f4884a83..f6c5fd16cc4e 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -927,10 +927,12 @@ static void rtc_timer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
  */
 void rtc_timer_do_work(struct work_struct *work)
 {
-	struct rtc_timer *timer;
+	unsigned long start_jiffies = jiffies;
 	struct timerqueue_node *next;
-	ktime_t now;
+	struct rtc_timer *timer;
 	struct rtc_time tm;
+	int loop_count = 0;
+	ktime_t now;
 	int err;
 
 	struct rtc_device *rtc =
@@ -945,6 +947,15 @@ void rtc_timer_do_work(struct work_struct *work)
 	}
 	now = rtc_tm_to_ktime(tm);
 	while ((next = timerqueue_getnext(&rtc->timerqueue))) {
+		loop_count++;
+
+		if (unlikely(time_after(jiffies, start_jiffies + HZ))) {
+			dev_warn(&rtc->dev, "RTC time jump (loop: %d) to %ptR.\n",
+				 loop_count, &tm);
+			WARN_ON_ONCE(1);
+			break;
+		}
+
 		if (next->expires > now)
 			break;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work()
  2026-05-25 13:08 [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work() Jinjie Ruan
@ 2026-06-11  1:40 ` Jinjie Ruan
  2026-06-11  2:01 ` Jinjie Ruan
  1 sibling, 0 replies; 3+ messages in thread
From: Jinjie Ruan @ 2026-06-11  1:40 UTC (permalink / raw)
  To: alexandre.belloni, linux-rtc, linux-kernel

Gentle ping.

On 5/25/2026 9:08 PM, Jinjie Ruan wrote:
> In virtualization environments like QEMU [1], or during hardware
> clocksource anomalies, an extreme time-warp event can occur. When
> the system time abruptly jumps forward, the rtc_timer_do_work() handler
> falls into a prolonged processing loop to clear accumulated historical
> timers via timerqueue_getnext(). Running this loop indefinitely under
> the rtc->ops_lock mutex triggers a kernel softlockup, stalling
> the system.
> 
> Introduce an adaptive telemetry and loop guard mechanism to enhance debug
> visibility and prevent softlockups:
> 
> 1. Record `start_jiffies` upon entry and leverage `time_after()` to
>    check if the loop has monopolized the CPU for more than 1s (HZ). If so,
>    the handler prints a telemetry warning, triggers a WARN stack dump, and
>    breaks the loop to safely yield the CPU.
> 
> 2. Track the execution via a `loop_count` metric. Printing this counter
>    in the warning log provides vital diagnostics to distinguish
>    an aggressive time-warp storm (high count) from a bogged-down callback
>    bug (low count).
> 
> 3. Utilize the kernel format specifier `%ptR` to convert the raw ktime
>    into a human-readable timestamp (YYYY-MM-DD HH:MM:SS), allowing
>    developers to instantly pinpoint the exact boundary of the time
>    jump in dmesg.
> 
> This non-destructive telemetry guard provides precise hardware/emulator
> diagnostic visibility while ensuring core kernel availability.
> 
> [1]: https://lore.kernel.org/all/20260114013257.3500578-1-ruanjinjie@huawei.com/
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
>  drivers/rtc/interface.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
> index 1906f4884a83..f6c5fd16cc4e 100644
> --- a/drivers/rtc/interface.c
> +++ b/drivers/rtc/interface.c
> @@ -927,10 +927,12 @@ static void rtc_timer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
>   */
>  void rtc_timer_do_work(struct work_struct *work)
>  {
> -	struct rtc_timer *timer;
> +	unsigned long start_jiffies = jiffies;
>  	struct timerqueue_node *next;
> -	ktime_t now;
> +	struct rtc_timer *timer;
>  	struct rtc_time tm;
> +	int loop_count = 0;
> +	ktime_t now;
>  	int err;
>  
>  	struct rtc_device *rtc =
> @@ -945,6 +947,15 @@ void rtc_timer_do_work(struct work_struct *work)
>  	}
>  	now = rtc_tm_to_ktime(tm);
>  	while ((next = timerqueue_getnext(&rtc->timerqueue))) {
> +		loop_count++;
> +
> +		if (unlikely(time_after(jiffies, start_jiffies + HZ))) {
> +			dev_warn(&rtc->dev, "RTC time jump (loop: %d) to %ptR.\n",
> +				 loop_count, &tm);
> +			WARN_ON_ONCE(1);
> +			break;
> +		}
> +
>  		if (next->expires > now)
>  			break;
>  


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work()
  2026-05-25 13:08 [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work() Jinjie Ruan
  2026-06-11  1:40 ` Jinjie Ruan
@ 2026-06-11  2:01 ` Jinjie Ruan
  1 sibling, 0 replies; 3+ messages in thread
From: Jinjie Ruan @ 2026-06-11  2:01 UTC (permalink / raw)
  To: alexandre.belloni, linux-rtc, linux-kernel, tglx

+cc Thomas Gleixner

On 5/25/2026 9:08 PM, Jinjie Ruan wrote:
> In virtualization environments like QEMU [1], or during hardware
> clocksource anomalies, an extreme time-warp event can occur. When
> the system time abruptly jumps forward, the rtc_timer_do_work() handler
> falls into a prolonged processing loop to clear accumulated historical
> timers via timerqueue_getnext(). Running this loop indefinitely under
> the rtc->ops_lock mutex triggers a kernel softlockup, stalling
> the system.
> 
> Introduce an adaptive telemetry and loop guard mechanism to enhance debug
> visibility and prevent softlockups:
> 
> 1. Record `start_jiffies` upon entry and leverage `time_after()` to
>    check if the loop has monopolized the CPU for more than 1s (HZ). If so,
>    the handler prints a telemetry warning, triggers a WARN stack dump, and
>    breaks the loop to safely yield the CPU.
> 
> 2. Track the execution via a `loop_count` metric. Printing this counter
>    in the warning log provides vital diagnostics to distinguish
>    an aggressive time-warp storm (high count) from a bogged-down callback
>    bug (low count).
> 
> 3. Utilize the kernel format specifier `%ptR` to convert the raw ktime
>    into a human-readable timestamp (YYYY-MM-DD HH:MM:SS), allowing
>    developers to instantly pinpoint the exact boundary of the time
>    jump in dmesg.
> 
> This non-destructive telemetry guard provides precise hardware/emulator
> diagnostic visibility while ensuring core kernel availability.
> 
> [1]: https://lore.kernel.org/all/20260114013257.3500578-1-ruanjinjie@huawei.com/
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
>  drivers/rtc/interface.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
> index 1906f4884a83..f6c5fd16cc4e 100644
> --- a/drivers/rtc/interface.c
> +++ b/drivers/rtc/interface.c
> @@ -927,10 +927,12 @@ static void rtc_timer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
>   */
>  void rtc_timer_do_work(struct work_struct *work)
>  {
> -	struct rtc_timer *timer;
> +	unsigned long start_jiffies = jiffies;
>  	struct timerqueue_node *next;
> -	ktime_t now;
> +	struct rtc_timer *timer;
>  	struct rtc_time tm;
> +	int loop_count = 0;
> +	ktime_t now;
>  	int err;
>  
>  	struct rtc_device *rtc =
> @@ -945,6 +947,15 @@ void rtc_timer_do_work(struct work_struct *work)
>  	}
>  	now = rtc_tm_to_ktime(tm);
>  	while ((next = timerqueue_getnext(&rtc->timerqueue))) {
> +		loop_count++;
> +
> +		if (unlikely(time_after(jiffies, start_jiffies + HZ))) {
> +			dev_warn(&rtc->dev, "RTC time jump (loop: %d) to %ptR.\n",
> +				 loop_count, &tm);
> +			WARN_ON_ONCE(1);
> +			break;
> +		}
> +
>  		if (next->expires > now)
>  			break;
>  


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-06-11  2:01 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-25 13:08 [PATCH] rtc: interface: Add rtc time jump debug in rtc_timer_do_work() Jinjie Ruan
2026-06-11  1:40 ` Jinjie Ruan
2026-06-11  2:01 ` Jinjie Ruan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox