Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH] rtc: armada38x: do not advertise update interrupt (UIE) support
From: Ioannis Fountzoulas @ 2026-07-04 21:07 UTC
  To: Alexandre Belloni, Andrew Lunn, Gregory Clement,
	Sebastian Hesselbarth
  Cc: linux-arm-kernel, linux-rtc, linux-kernel, Ioannis Fountzoulas

Problem:
chrony enables RTC update interrupts via the RTC_UIE_ON ioctl to track
RTC drift. On the armada38x driver this request is served by the RTC
core's native path, which arms a 1 second periodic timer that is
re-programmed on the alarm and serviced by rtc_timer_do_work().
If the RTC time is then stepped forward by a large amount while this
timer is active, its scheduled expiry ends up far in the past compared
to the freshly read time.

Why the CPU hangs:
When rtc_timer_do_work() runs, it expires every timer whose expiry is
not in the future, advancing periodic timers by one period each pass:
	while ((next = timerqueue_getnext(&rtc->timerqueue))) {
		if (next->expires > now)
			break;
		...
		timer->node.expires += timer->period;   /* += 1s */
		timerqueue_add(&rtc->timerqueue, &timer->node);
	}
With a large forward step (seen when the RTC starts far in the past and
chrony corrects it after the first NTP sync), the periodic UIE timer is
overdue by the size of the jump, so this loop must run one iteration per
elapsed second before it can exit.
It never yields in that time, so the workqueue worker pins the CPU and
the watchdog reports a soft lockup / RCU stall, after which the
board reboots:
	watchdog: BUG: soft lockup - CPU#1 stuck for 48s! [kworker/1:3:432]
	Kernel panic - not syncing: softlockup: hung tasks
	Workqueue: events rtc_timer_do_work
	 rtc_handle_legacy_irq from rtc_timer_do_work
	 rtc_timer_do_work from process_one_work
	 process_one_work from worker_thread
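
To make the cost of that catch-up concrete, here is a minimal user-space
sketch of the loop's arithmetic. It is not the kernel code: struct
sim_timer and sim_catch_up() are hypothetical names used only for
illustration, times are plain nanosecond counters, and the sketch assumes
the periodic timer's expiry really is left behind by the step.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one periodic RTC timer; times are in ns. */
struct sim_timer {
	int64_t expires;	/* next scheduled expiry */
	int64_t period;		/* re-arm interval, 1 s for UIE */
};

/*
 * Same shape as the quoted loop: as long as the timer is not in the
 * future, advance its expiry by exactly one period. Returns the number
 * of passes needed before the expiry is ahead of 'now' again.
 */
static uint64_t sim_catch_up(struct sim_timer *t, int64_t now)
{
	uint64_t passes = 0;

	assert(t->period > 0);	/* a zero period would never terminate */
	while (t->expires <= now) {
		t->expires += t->period;
		passes++;
	}
	return passes;
}
```

With a 1 s period, a timer that is overdue by N seconds needs about N
passes, so under this model a one-year step costs roughly 31.5 million
iterations in a single work item.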

Fix:
Clear RTC_FEATURE_UPDATE_INTERRUPT at probe time so the driver stops
advertising native UIE. RTC_UIE_ON is then served by the poll-based UIE
emulation in rtc-dev (CONFIG_RTC_INTF_DEV_UIE_EMUL), which delivers the
1 Hz update notifications chrony needs without ever queuing the runaway
periodic timer.

Testing:
Tested on a Marvell Armada 38x board (ARM Cortex-A9). Before the change,
stepping the clock forward while UIE was enabled reproduced the soft
lockup (about 1 in 4 boots). After the change, rtc_timer_do_work() runs
zero loop iterations and returns immediately, chrony still receives
update interrupts via emulation, and no lockups occur over repeated
reboot cycles.

Signed-off-by: Ioannis Fountzoulas <ioannis.fountzoulas@nokia.com>
---
 drivers/rtc/rtc-armada38x.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/rtc/rtc-armada38x.c b/drivers/rtc/rtc-armada38x.c
index 245290ae1a8d..da036d819649 100644
--- a/drivers/rtc/rtc-armada38x.c
+++ b/drivers/rtc/rtc-armada38x.c
@@ -526,6 +526,14 @@ static __init int armada38x_rtc_probe(struct platform_device *pdev)
 	else
 		clear_bit(RTC_FEATURE_ALARM, rtc->rtc_dev->features);
 
+	/*
+	 * A large forward step of the RTC time makes
+	 * rtc_timer_do_work() replay one period per elapsed second and can
+	 * loop long enough to trigger a soft lockup. Do not advertise
+	 * native UIE; RTC_UIE_ON then uses the poll-based emulation.
+	 */
+	clear_bit(RTC_FEATURE_UPDATE_INTERRUPT, rtc->rtc_dev->features);
+
 	/* Update RTC-MBUS bridge timing parameters */
 	rtc->data->update_mbus_timing(rtc);
 
-- 
2.34.1




* Re: [PATCH] rtc: armada38x: do not advertise update interrupt (UIE) support
From: Alexandre Belloni @ 2026-07-04 22:13 UTC
  To: Ioannis Fountzoulas
  Cc: Andrew Lunn, Gregory Clement, Sebastian Hesselbarth,
	linux-arm-kernel, linux-rtc, linux-kernel

On 04/07/2026 17:07:10-0400, Ioannis Fountzoulas wrote:
> Problem:
> chrony enables RTC update interrupts via the RTC_UIE_ON ioctl to track
> RTC drift. On the armada38x driver this request is served by the RTC
> core's native path, which arms a 1 second periodic timer that is
> re-programmed on the alarm and serviced by rtc_timer_do_work().
> If the RTC time is then stepped forward by a large amount while this
> timer is active, its scheduled expiry ends up far in the past compared
> to the freshly read time.
> 
> Why the CPU hangs:
> When rtc_timer_do_work() runs, it expires every timer whose expiry is
> not in the future, advancing periodic timers by one period each pass:
> 	while ((next = timerqueue_getnext(&rtc->timerqueue))) {
> 		if (next->expires > now)
> 			break;
> 		...
> 		timer->node.expires += timer->period;   /* += 1s */
> 		timerqueue_add(&rtc->timerqueue, &timer->node);
> 	}
> With a large forward step (seen when the RTC starts far in the past and
> chrony corrects it after the first NTP sync), the periodic UIE timer is
> overdue by the size of the jump, so this loop must run one iteration per
> elapsed second before it can exit.

This is not correct because the core ensures that UIE is disabled before
changing the RTC time and enables it afterwards, so the jump doesn't have
any impact on UIE. So what you need to explain is why UIE is not synced
with the RTC time after it has been set. My guess is RES-3124064 is the
culprit.

> It never yields in that time, so the workqueue worker pins the CPU and
> the watchdog reports a soft lockup / RCU stall, after which the
> board reboots:
> 	watchdog: BUG: soft lockup - CPU#1 stuck for 48s! [kworker/1:3:432]
> 	Kernel panic - not syncing: softlockup: hung tasks
> 	Workqueue: events rtc_timer_do_work
> 	 rtc_handle_legacy_irq from rtc_timer_do_work
> 	 rtc_timer_do_work from process_one_work
> 	 process_one_work from worker_thread
> 
> Fix:
> Clear RTC_FEATURE_UPDATE_INTERRUPT at probe time so the driver stops
> advertising native UIE. RTC_UIE_ON is then served by the poll-based UIE
> emulation in rtc-dev (CONFIG_RTC_INTF_DEV_UIE_EMUL), which delivers the
> 1 Hz update notifications chrony needs without ever queuing the runaway
> periodic timer.
> 
> Testing:
> Tested on a Marvell Armada 38x board (ARM Cortex-A9). Before the change,
> stepping the clock forward while UIE was enabled reproduced the soft
> lockup (about 1 in 4 boots). After the change, rtc_timer_do_work() runs
> zero loop iterations and returns immediately, chrony still receives
> update interrupts via emulation, and no lockups occur over repeated
> reboot cycles.
> 
> Signed-off-by: Ioannis Fountzoulas <ioannis.fountzoulas@nokia.com>
> ---
>  drivers/rtc/rtc-armada38x.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/rtc/rtc-armada38x.c b/drivers/rtc/rtc-armada38x.c
> index 245290ae1a8d..da036d819649 100644
> --- a/drivers/rtc/rtc-armada38x.c
> +++ b/drivers/rtc/rtc-armada38x.c
> @@ -526,6 +526,14 @@ static __init int armada38x_rtc_probe(struct platform_device *pdev)
>  	else
>  		clear_bit(RTC_FEATURE_ALARM, rtc->rtc_dev->features);
>  
> +	/*
> +	 * A large forward step of the RTC time makes
> +	 * rtc_timer_do_work() replay one period per elapsed second and can
> +	 * loop long enough to trigger a soft lockup. Do not advertise
> +	 * native UIE; RTC_UIE_ON then uses the poll-based emulation.
> +	 */
> +	clear_bit(RTC_FEATURE_UPDATE_INTERRUPT, rtc->rtc_dev->features);
> +
>  	/* Update RTC-MBUS bridge timing parameters */
>  	rtc->data->update_mbus_timing(rtc);
>  
> -- 
> 2.34.1
> 

-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


