* [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements
@ 2025-08-04 8:03 Markus Stockhausen
2025-08-04 8:03 ` [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers Markus Stockhausen
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Markus Stockhausen @ 2025-08-04 8:03 UTC (permalink / raw)
To: markus.stockhausen, daniel.lezcano, tglx, linux-kernel, howels,
bjorn
This series fixes some shortcomings of the Realtek Otto timer driver.
These became evident after switching to longterm kernel 6.12. Devices
were randomly rebooted by the watchdog.
* [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers
2025-08-04 8:03 [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements Markus Stockhausen
@ 2025-08-04 8:03 ` Markus Stockhausen
2025-09-10 9:02 ` Daniel Lezcano
2025-08-04 8:03 ` [PATCH 2/4] clocksource/drivers/timer-rtl-otto: drop set_counter function Markus Stockhausen
` (3 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Markus Stockhausen @ 2025-08-04 8:03 UTC (permalink / raw)
To: markus.stockhausen, daniel.lezcano, tglx, linux-kernel, howels,
bjorn
The OpenWrt distribution has switched from longterm kernel 6.6 to
6.12. Reports show that devices with the Realtek Otto switch platform
die during operation and are rebooted by the watchdog. After ruling
out other possible causes, the Otto timer is to blame. The platform
currently consists of 4 targets with different hardware revisions.
It is not 100% clear which devices and revisions are affected.
Analysis shows:
A more aggressive sched/deadline handling leads to more timer starts
with small intervals. This increases the chance of hitting the bug. See
https://marc.info/?l=linux-kernel&m=175276556023276&w=2
Focusing on the real issue, a hardware limitation on some devices was
found. There is a small chance that a timer ends without firing an
interrupt if it is reprogrammed within the 5us before its expiration
time. Work around this issue by introducing a bounce function,
rttm_bounce_timer(). It restarts the timer directly before the normal
restart functions as follows:
- Stop timer
- Restart timer with a slow frequency.
- Target time will be >5us
- The subsequent normal restart is outside the critical window
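The arithmetic behind the last two steps can be sketched in a few lines
of C. This is only an illustration: the helper names are hypothetical,
the 200 MHz and 175 MHz LXB rates are the ones named in patch 4/4, the
divisor 65535 is GENMASK(15, 0) and the minimum target of 8 ticks is
RTTM_MIN_DELTA. It assumes the timer counts at LXB rate / divisor.

```c
#include <assert.h>
#include <stdint.h>

#define BOUNCE_DIVISOR   65535ULL /* GENMASK(15, 0), i.e. RTTM_MAX_DIVISOR */
#define MIN_TARGET_TICKS 8ULL     /* RTTM_MIN_DELTA */

/* Counting frequency in Hz while the bounce divisor is active (assumed model). */
static inline uint64_t bounce_freq_hz(uint64_t lxb_hz)
{
	return lxb_hz / BOUNCE_DIVISOR;
}

/* Nanoseconds until a bounced timer with the given target would expire. */
static inline uint64_t bounce_expiry_ns(uint64_t lxb_hz, uint64_t ticks)
{
	assert(lxb_hz != 0);
	/* One tick lasts BOUNCE_DIVISOR / lxb_hz seconds. */
	return ticks * BOUNCE_DIVISOR * 1000000000ULL / lxb_hz;
}
```

With a 200 MHz LXB clock, bounce_freq_hz() gives roughly 3 kHz, and
bounce_expiry_ns(200000000, MIN_TARGET_TICKS) is about 2.6 ms. That is
several hundred times longer than the 5us critical window, so the
subsequent normal restart cannot fall into it.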
Downstream has already tested and confirmed a patch. See
https://github.com/openwrt/openwrt/pull/19468
https://forum.openwrt.org/t/support-for-rtl838x-based-managed-switches/57875/3788
Tested-by: Stephen Howell <howels@allthatwemight.be>
Tested-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
---
drivers/clocksource/timer-rtl-otto.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/drivers/clocksource/timer-rtl-otto.c b/drivers/clocksource/timer-rtl-otto.c
index 8a3068b36e75..8be45a11fb8b 100644
--- a/drivers/clocksource/timer-rtl-otto.c
+++ b/drivers/clocksource/timer-rtl-otto.c
@@ -38,6 +38,7 @@
#define RTTM_BIT_COUNT 28
#define RTTM_MIN_DELTA 8
#define RTTM_MAX_DELTA CLOCKSOURCE_MASK(28)
+#define RTTM_MAX_DIVISOR GENMASK(15, 0)
/*
* Timers are derived from the LXB clock frequency. Usually this is a fixed
@@ -112,6 +113,22 @@ static irqreturn_t rttm_timer_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
}
+static void rttm_bounce_timer(void __iomem *base, u32 mode)
+{
+ /*
+ * When a running timer has less than ~5us left, a stop/start sequence
+ * might fail. While the details are unknown the most evident effect is
+ * that the subsequent interrupt will not be fired.
+ *
+ * As a workaround issue an intermediate restart with a very slow
+ * frequency of ~3kHz keeping the target counter (>=8). So the follow
+ * up restart will always be issued outside the critical window.
+ */
+
+ rttm_disable_timer(base);
+ rttm_enable_timer(base, mode, RTTM_MAX_DIVISOR);
+}
+
static void rttm_stop_timer(void __iomem *base)
{
rttm_disable_timer(base);
@@ -129,6 +146,7 @@ static int rttm_next_event(unsigned long delta, struct clock_event_device *clkev
struct timer_of *to = to_timer_of(clkevt);
RTTM_DEBUG(to->of_base.base);
+ rttm_bounce_timer(to->of_base.base, RTTM_CTRL_COUNTER);
rttm_stop_timer(to->of_base.base);
rttm_set_period(to->of_base.base, delta);
rttm_start_timer(to, RTTM_CTRL_COUNTER);
@@ -141,6 +159,7 @@ static int rttm_state_oneshot(struct clock_event_device *clkevt)
struct timer_of *to = to_timer_of(clkevt);
RTTM_DEBUG(to->of_base.base);
+ rttm_bounce_timer(to->of_base.base, RTTM_CTRL_COUNTER);
rttm_stop_timer(to->of_base.base);
rttm_set_period(to->of_base.base, RTTM_TICKS_PER_SEC / HZ);
rttm_start_timer(to, RTTM_CTRL_COUNTER);
@@ -153,6 +172,7 @@ static int rttm_state_periodic(struct clock_event_device *clkevt)
struct timer_of *to = to_timer_of(clkevt);
RTTM_DEBUG(to->of_base.base);
+ rttm_bounce_timer(to->of_base.base, RTTM_CTRL_TIMER);
rttm_stop_timer(to->of_base.base);
rttm_set_period(to->of_base.base, RTTM_TICKS_PER_SEC / HZ);
rttm_start_timer(to, RTTM_CTRL_TIMER);
--
2.47.0
* [PATCH 2/4] clocksource/drivers/timer-rtl-otto: drop set_counter function
2025-08-04 8:03 [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements Markus Stockhausen
2025-08-04 8:03 ` [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers Markus Stockhausen
@ 2025-08-04 8:03 ` Markus Stockhausen
2025-08-04 8:03 ` [PATCH 3/4] clocksource/drivers/timer-rtl-otto: do not interfere with interrupts Markus Stockhausen
` (2 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Markus Stockhausen @ 2025-08-04 8:03 UTC (permalink / raw)
To: markus.stockhausen, daniel.lezcano, tglx, linux-kernel, howels,
bjorn
The current counter value is a read-only register. It is reset
when a new target timer value is written with rttm_set_period().
rttm_set_counter() is essentially a no-op. Drop it.
While this makes rttm_start_timer() and rttm_enable_timer()
effectively the same, keep both to preserve the established
abstraction layers for register and control functions.
Downstream has already tested and confirmed a patch. See
https://github.com/openwrt/openwrt/pull/19468
https://forum.openwrt.org/t/support-for-rtl838x-based-managed-switches/57875/3788
Tested-by: Stephen Howell <howels@allthatwemight.be>
Tested-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
---
drivers/clocksource/timer-rtl-otto.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/clocksource/timer-rtl-otto.c b/drivers/clocksource/timer-rtl-otto.c
index 8be45a11fb8b..48ba1164f3fb 100644
--- a/drivers/clocksource/timer-rtl-otto.c
+++ b/drivers/clocksource/timer-rtl-otto.c
@@ -56,11 +56,6 @@ struct rttm_cs {
};
/* Simple internal register functions */
-static inline void rttm_set_counter(void __iomem *base, unsigned int counter)
-{
- iowrite32(counter, base + RTTM_CNT);
-}
-
static inline unsigned int rttm_get_counter(void __iomem *base)
{
return ioread32(base + RTTM_CNT);
@@ -137,7 +132,6 @@ static void rttm_stop_timer(void __iomem *base)
static void rttm_start_timer(struct timer_of *to, u32 mode)
{
- rttm_set_counter(to->of_base.base, 0);
rttm_enable_timer(to->of_base.base, mode, to->of_clk.rate / RTTM_TICKS_PER_SEC);
}
--
2.47.0
* [PATCH 3/4] clocksource/drivers/timer-rtl-otto: do not interfere with interrupts
2025-08-04 8:03 [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements Markus Stockhausen
2025-08-04 8:03 ` [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers Markus Stockhausen
2025-08-04 8:03 ` [PATCH 2/4] clocksource/drivers/timer-rtl-otto: drop set_counter function Markus Stockhausen
@ 2025-08-04 8:03 ` Markus Stockhausen
2025-08-04 8:03 ` [PATCH 4/4] clocksource/drivers/timer-rtl-otto: simplify documentation Markus Stockhausen
2025-09-10 20:54 ` [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements Daniel Lezcano
4 siblings, 0 replies; 12+ messages in thread
From: Markus Stockhausen @ 2025-08-04 8:03 UTC (permalink / raw)
To: markus.stockhausen, daniel.lezcano, tglx, linux-kernel, howels,
bjorn
During normal operation the timers are reprogrammed, including an
interrupt acknowledgement. This has no effect as the whole timer
is set up from scratch afterwards. In interrupt context in particular,
this has already been done by rttm_timer_interrupt().
Change the behaviour as follows:
- Use rttm_disable_timer() during reprogramming
- Keep rttm_stop_timer() for all other use cases.
Downstream has already tested and confirmed a patch. See
https://github.com/openwrt/openwrt/pull/19468
https://forum.openwrt.org/t/support-for-rtl838x-based-managed-switches/57875/3788
Tested-by: Stephen Howell <howels@allthatwemight.be>
Tested-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
---
drivers/clocksource/timer-rtl-otto.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/clocksource/timer-rtl-otto.c b/drivers/clocksource/timer-rtl-otto.c
index 48ba1164f3fb..42f702aca689 100644
--- a/drivers/clocksource/timer-rtl-otto.c
+++ b/drivers/clocksource/timer-rtl-otto.c
@@ -141,7 +141,7 @@ static int rttm_next_event(unsigned long delta, struct clock_event_device *clkev
RTTM_DEBUG(to->of_base.base);
rttm_bounce_timer(to->of_base.base, RTTM_CTRL_COUNTER);
- rttm_stop_timer(to->of_base.base);
+ rttm_disable_timer(to->of_base.base);
rttm_set_period(to->of_base.base, delta);
rttm_start_timer(to, RTTM_CTRL_COUNTER);
@@ -154,7 +154,7 @@ static int rttm_state_oneshot(struct clock_event_device *clkevt)
RTTM_DEBUG(to->of_base.base);
rttm_bounce_timer(to->of_base.base, RTTM_CTRL_COUNTER);
- rttm_stop_timer(to->of_base.base);
+ rttm_disable_timer(to->of_base.base);
rttm_set_period(to->of_base.base, RTTM_TICKS_PER_SEC / HZ);
rttm_start_timer(to, RTTM_CTRL_COUNTER);
@@ -167,7 +167,7 @@ static int rttm_state_periodic(struct clock_event_device *clkevt)
RTTM_DEBUG(to->of_base.base);
rttm_bounce_timer(to->of_base.base, RTTM_CTRL_TIMER);
- rttm_stop_timer(to->of_base.base);
+ rttm_disable_timer(to->of_base.base);
rttm_set_period(to->of_base.base, RTTM_TICKS_PER_SEC / HZ);
rttm_start_timer(to, RTTM_CTRL_TIMER);
--
2.47.0
* [PATCH 4/4] clocksource/drivers/timer-rtl-otto: simplify documentation
2025-08-04 8:03 [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements Markus Stockhausen
` (2 preceding siblings ...)
2025-08-04 8:03 ` [PATCH 3/4] clocksource/drivers/timer-rtl-otto: do not interfere with interrupts Markus Stockhausen
@ 2025-08-04 8:03 ` Markus Stockhausen
2025-09-10 20:54 ` [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements Daniel Lezcano
4 siblings, 0 replies; 12+ messages in thread
From: Markus Stockhausen @ 2025-08-04 8:03 UTC (permalink / raw)
To: markus.stockhausen, daniel.lezcano, tglx, linux-kernel, howels,
bjorn
While the main SoC PLL is responsible for the Lexra bus frequency,
it has no implications for the timer divisor. Update the
comments accordingly.
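As a small worked example of what the new comment describes: the
divisor is simply the LXB rate divided by the 3.125 MHz reference, as
in the "to->of_clk.rate / RTTM_TICKS_PER_SEC" expression used by
rttm_start_timer(). The helper name below is hypothetical and the
sketch is standalone userspace C, not driver code.

```c
#include <assert.h>
#include <stdint.h>

#define RTTM_TICKS_PER_SEC 3125000u

/* Divisor that scales an LXB clock down to the 3.125 MHz timer reference. */
static inline uint32_t rttm_divisor_for(uint32_t lxb_hz)
{
	/* Both rates named in the comment divide evenly by the reference. */
	assert(lxb_hz % RTTM_TICKS_PER_SEC == 0);
	return lxb_hz / RTTM_TICKS_PER_SEC;
}
```

For the two rates mentioned in the comment this yields 64 at 200 MHz
and 56 at 175 MHz (RTL930x), so all platforms tick at the same rate.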
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
---
drivers/clocksource/timer-rtl-otto.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/clocksource/timer-rtl-otto.c b/drivers/clocksource/timer-rtl-otto.c
index 42f702aca689..6113d2fdd4de 100644
--- a/drivers/clocksource/timer-rtl-otto.c
+++ b/drivers/clocksource/timer-rtl-otto.c
@@ -41,12 +41,10 @@
#define RTTM_MAX_DIVISOR GENMASK(15, 0)
/*
- * Timers are derived from the LXB clock frequency. Usually this is a fixed
- * multiple of the 25 MHz oscillator. The 930X SOC is an exception from that.
- * Its LXB clock has only dividers and uses the switch PLL of 2.45 GHz as its
- * base. The only meaningful frequencies we can achieve from that are 175.000
- * MHz and 153.125 MHz. The greatest common divisor of all explained possible
- * speeds is 3125000. Pin the timers to this 3.125 MHz reference frequency.
+ * Timers are derived from the lexra bus (LXB) clock frequency. This is 175 MHz
+ * on RTL930x and 200 MHz on the other platforms. With 3.125 MHz choose a common
+ * divisor to have enough range and detail. This provides comparability between
+ * the different platforms.
*/
#define RTTM_TICKS_PER_SEC 3125000
--
2.47.0
* Re: [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers
2025-08-04 8:03 ` [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers Markus Stockhausen
@ 2025-09-10 9:02 ` Daniel Lezcano
2025-09-10 10:16 ` AW: " markus.stockhausen
0 siblings, 1 reply; 12+ messages in thread
From: Daniel Lezcano @ 2025-09-10 9:02 UTC (permalink / raw)
To: Markus Stockhausen, tglx, linux-kernel, howels, bjorn
On 04/08/2025 10:03, Markus Stockhausen wrote:
> The OpenWrt distribution has switched from kernel longterm 6.6 to
> 6.12. Reports show that devices with the Realtek Otto switch platform
> die during operation and are rebooted by the watchdog. Sorting out
> other possible reasons the Otto timer is to blame. The platform
> currently consists of 4 targets with different hardware revisions.
> It is not 100% clear which devices and revisions are affected.
>
> Analysis shows:
>
> A more aggressive sched/deadline handling leads to more timer starts
> with small intervals. This increases the bug chances. See
> https://marc.info/?l=linux-kernel&m=175276556023276&w=2
>
> Focusing on the real issue a hardware limitation on some devices was
> found. There is a minimal chance that a timer ends without firing an
> interrupt if it is reprogrammed within the 5us before its expiration
> time.
Is it possible that the timer IRQ flag is reset when setting the new
counter value?
While in the code path with interrupts disabled, the timer expires
within these 5us and the IRQ flag is raised; then the driver sets a new
value and the flag is reset automatically, thus losing the current
timer expiration?
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
* AW: [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers
2025-09-10 9:02 ` Daniel Lezcano
@ 2025-09-10 10:16 ` markus.stockhausen
2025-09-10 16:39 ` Daniel Lezcano
0 siblings, 1 reply; 12+ messages in thread
From: markus.stockhausen @ 2025-09-10 10:16 UTC (permalink / raw)
To: 'Daniel Lezcano', tglx, linux-kernel, howels, bjorn
> From: Daniel Lezcano <daniel.lezcano@linaro.org>
> Sent: Wednesday, 10 September 2025 11:03
>
> On 04/08/2025 10:03, Markus Stockhausen wrote:
> > The OpenWrt distribution has switched from kernel longterm 6.6 to
> > 6.12. Reports show that devices with the Realtek Otto switch platform
> > die during operation and are rebooted by the watchdog. Sorting out
> > other possible reasons the Otto timer is to blame. The platform
> > currently consists of 4 targets with different hardware revisions.
> > It is not 100% clear which devices and revisions are affected.
> >
> > Analysis shows:
> >
> > A more aggressive sched/deadline handling leads to more timer starts
> > with small intervals. This increases the bug chances. See
> > https://marc.info/?l=linux-kernel&m=175276556023276&w=2
> >
> > Focusing on the real issue a hardware limitation on some devices was
> > found. There is a minimal chance that a timer ends without firing an
> > interrupt if it is reprogrammed within the 5us before its expiration
> > time.
>
> Is it possible the timer IRQ flag is reset when setting the new counter
> value ?
>
> While in the code path with the interrupt disabled, the timer expires in
> these 5us, the IRQ flag is raised, then the driver sets a new value and
> this flag is reset automatically, thus losing the current timer expiration ?
Something like this ...
During my analysis I tried a lot of things to identify the situation that
leads to this error, especially just before the reprogramming command:
static inline void rttm_enable_timer(void __iomem *base, u32 mode, u32 divisor)
{
	iowrite32(RTTM_CTRL_ENABLE | mode | divisor, base + RTTM_CTRL);
}
What I tried:
1. Read out the current (remaining) timer value: In the error cases
this can give any value between 1 (=320ns) and 15 (=4800ns).
2. Check if IRQ flag is already set and IRQ might trigger next. This was
never the case.
3. Reorder reprogramming sequence (as far as possible). Only the
double reprogramming helped here.
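The tick values in point 1 follow from the 3.125 MHz timer reference
(RTTM_TICKS_PER_SEC in the driver): one tick is 320 ns, so 15 ticks are
the last full tick count below 5us. A minimal standalone sketch of that
conversion, with hypothetical helper names and the 5us window taken
from the patch description:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RTTM_TICKS_PER_SEC 3125000ULL
#define CRITICAL_WINDOW_NS 5000ULL /* ~5us, per the patch description */

/* Convert remaining timer ticks to nanoseconds at the 3.125 MHz reference. */
static inline uint64_t rttm_ticks_to_ns(uint64_t ticks)
{
	return ticks * 1000000000ULL / RTTM_TICKS_PER_SEC;
}

/* True if a running timer with this many ticks left is inside the window. */
static inline bool rttm_in_critical_window(uint64_t ticks_left)
{
	assert(ticks_left > 0);
	return rttm_ticks_to_ns(ticks_left) < CRITICAL_WINDOW_NS;
}
```

Under these assumptions, remaining values 1 to 15 fall inside the
window while 16 ticks (5120 ns) is already outside it, which matches
the observed range.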
So there is nothing we can do to actively identify and work around the
buggy situation. There is some hardware limitation between expiring
timers and reprogramming. In the absence of an erratum, the current
bugfix is the only (and best) solution I have.
Markus
* Re: AW: [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers
2025-09-10 10:16 ` AW: " markus.stockhausen
@ 2025-09-10 16:39 ` Daniel Lezcano
2025-09-10 18:16 ` AW: " markus.stockhausen
0 siblings, 1 reply; 12+ messages in thread
From: Daniel Lezcano @ 2025-09-10 16:39 UTC (permalink / raw)
To: markus.stockhausen, tglx, linux-kernel, howels, bjorn
On 10/09/2025 12:16, markus.stockhausen@gmx.de wrote:
>> From: Daniel Lezcano <daniel.lezcano@linaro.org>
>> Sent: Wednesday, 10 September 2025 11:03
>>
>> On 04/08/2025 10:03, Markus Stockhausen wrote:
>>> The OpenWrt distribution has switched from kernel longterm 6.6 to
>>> 6.12. Reports show that devices with the Realtek Otto switch platform
>>> die during operation and are rebooted by the watchdog. Sorting out
>>> other possible reasons the Otto timer is to blame. The platform
>>> currently consists of 4 targets with different hardware revisions.
>>> It is not 100% clear which devices and revisions are affected.
>>>
>>> Analysis shows:
>>>
>>> A more aggressive sched/deadline handling leads to more timer starts
>>> with small intervals. This increases the bug chances. See
>>> https://marc.info/?l=linux-kernel&m=175276556023276&w=2
>>>
>>> Focusing on the real issue a hardware limitation on some devices was
>>> found. There is a minimal chance that a timer ends without firing an
>>> interrupt if it is reprogrammed within the 5us before its expiration
>>> time.
>>
>> Is it possible the timer IRQ flag is reset when setting the new counter
>> value ?
>>
>> While in the code path with the interrupt disabled, the timer expires in
>> these 5us, the IRQ flag is raised, then the driver sets a new value and
>> this flag is reset automatically, thus losing the current timer expiration ?
>
> Something like this ...
>
> During my analysis I tried a lot of things to identify the situation that
> leads to this error. Especially just before the reprogramming command
>
> static inline void rttm_enable_timer(void __iomem *base, u32 mode, u32 divisor)
> {
> iowrite32(RTTM_CTRL_ENABLE | mode | divisor, base + RTTM_CTRL);
> }
>
> What I tried:
>
> 1. Read out the current (remaining) timer value: In the error cases
> this can give any value between 1 (=320ns) and 15 (=4800ns).
>
> 2. Check if IRQ flag is already set and IRQ might trigger next. This was
> never the case.
It would have been interesting to check whether we are in the buggy
time range and, if so, wait with a delay (5us), check the IRQ flag as
the current timer should have expired, then set the counter and recheck
the IRQ flag.
> 3. Reorder reprogramming sequence (as far as possible). Only the
> double reprogramming helped here.
>
> So nothing we can do to actively identify and work around the buggy
> situation. There is some hardware limitation between expiring timers
> and reprogramming. Due to missing erratum the current bugfix is the
> only (and best) solution I have.
>
> Markus
>
* AW: AW: [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers
2025-09-10 16:39 ` Daniel Lezcano
@ 2025-09-10 18:16 ` markus.stockhausen
2025-09-10 20:53 ` Daniel Lezcano
0 siblings, 1 reply; 12+ messages in thread
From: markus.stockhausen @ 2025-09-10 18:16 UTC (permalink / raw)
To: 'Daniel Lezcano', tglx; +Cc: howels, bjorn, linux-kernel
> From: Daniel Lezcano <daniel.lezcano@linaro.org>
> Sent: Wednesday, 10 September 2025 18:39
>
> > What I tried:
> >
> > 1. Read out the current (remaining) timer value: In the error cases
> > this can give any value between 1 (=320ns) and 15 (=4800ns).
> >
> > 2. Check if IRQ flag is already set and IRQ might trigger next. This was
> > never the case.
>
> It would have been interesting to check if we are in the time bug range
> to wait with a delay (5us), check the IRQ flag as the current timer
> should have expired, then set the counter and recheck the IRQ flag.
It has been 2 months since I dived deep into this case. Finding a
reproducer, adding lightweight logging and arriving at a solution by
trial and error was really hard. In the end I was happy to have a fix
that was intensively tested.
For some notes see
https://github.com/openwrt/openwrt/pull/19468#issuecomment-3095570297
From what I remember:
- I started on a multithreading SoC and went over to a single
core SoC to reduce side effects during analysis.
- The timer never died when it was reprogrammed from
an interrupt of a just finished timer. The reason was always
a reprogramming from outside the interrupt->reprogram
call sequence.
- Reprogramming always worked fine. A timer with <5us left was
restarted with a timer >5us. The new timer started to count.
No interrupt flag seemed to be magically toggled during this
process. There was no active IRQ notification directly after the
reprogramming. That was what I expected.
- But in rare cases the new timer did not trigger the subsequent
interrupt. I was totally confused that the future interrupt of
a newly started timer did not work.
Graphically:
- timer run ---+-------------------->|
| issue stop & start
| timer run ------------------>|
| no IRQ here
My conclusion was: if we "kill" a running timer and restart it,
and it does not fire an interrupt after the newly set time,
then something must be broken. The ending timer and
the stop/start sequence (which consists of two register writes)
interfere with each other somehow, whatever the mechanism might be.
Markus
* Re: AW: AW: [PATCH 1/4] clocksource/drivers/timer-rtl-otto: work around dying timers
2025-09-10 18:16 ` AW: " markus.stockhausen
@ 2025-09-10 20:53 ` Daniel Lezcano
0 siblings, 0 replies; 12+ messages in thread
From: Daniel Lezcano @ 2025-09-10 20:53 UTC (permalink / raw)
To: markus.stockhausen, tglx; +Cc: howels, bjorn, linux-kernel
On 10/09/2025 20:16, markus.stockhausen@gmx.de wrote:
>> From: Daniel Lezcano <daniel.lezcano@linaro.org>
>> Sent: Wednesday, 10 September 2025 18:39
>>
>>> What I tried:
>>>
>>> 1. Read out the current (remaining) timer value: In the error cases
>>> this can give any value between 1 (=320ns) and 15 (=4800ns).
>>>
>>> 2. Check if IRQ flag is already set and IRQ might trigger next. This was
>>> never the case.
>>
>> It would have been interesting to check if we are in the time bug range
>> to wait with a delay (5us), check the IRQ flag as the current timer
>> should have expired, then set the counter and recheck the IRQ flag.
>
> It's been 2 months that I dived deep into this case. Finding a
> reproducer, adding lightweight logging and try&error a solution
> was really hard. In the end I was happy to have a fix that was
> intensively tested.
I understand. No worries, I applied the series; it is in the
compilation batch.
> For some notes see
> https://github.com/openwrt/openwrt/pull/19468#issuecomment-3095570297
>
> From what I remember:
>
> - I started on a multithreading SoC and went over to a single
> core SoC to reduce side effects during analysis.
>
> - The timer never died when it was reprogrammed from
> an interrupt of a just finished timer. The reason was always
> a reprogramming from outside the interrupt->reprogram
> call sequence.
>
> - Reprogramming always worked fine. A timer with <5us left, was
> restarted with a timer >5us. The new timer started to count.
> No interrupt flag seemed to be magically toggled during this
> process. There was no active IRQ notification directly after the
> reprogramming. That was how I expected it.
>
> - But in rare cases the new timer did not trigger the subsequent
> interrupt. I was totally confused that the future interrupt of
> a newly started timer did not work.
>
> Graphically:
>
> - timer run ---+-------------------->|
> | issue stop & start
> | timer run ------------------>|
> | no IRQ here
>
> Conclusion was for me: If we "kill" a running timer and restart
> it and it will not fire an interrupt after the newly set time,
> then something must be somehow broken. The ending timer and
> the stop/start sequence (that consists of two register writes)
> have some interference. Whatever it might be.
Hmm, I think I initially misunderstood the problem. Thanks for clarifying.
* Re: [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements
2025-08-04 8:03 [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements Markus Stockhausen
` (3 preceding siblings ...)
2025-08-04 8:03 ` [PATCH 4/4] clocksource/drivers/timer-rtl-otto: simplify documentation Markus Stockhausen
@ 2025-09-10 20:54 ` Daniel Lezcano
2025-09-10 21:15 ` AW: " markus.stockhausen
4 siblings, 1 reply; 12+ messages in thread
From: Daniel Lezcano @ 2025-09-10 20:54 UTC (permalink / raw)
To: Markus Stockhausen, tglx, linux-kernel, howels, bjorn
On 04/08/2025 10:03, Markus Stockhausen wrote:
> This series fixes some shortcomings of the Realtek Otto timer driver.
> These became evident after switching to longterm kernel 6.12. Devices
> were randomly rebooted by the watchdog.
Applied, thanks
* AW: [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements
2025-09-10 20:54 ` [PATCH 0/4] clocksource/drivers/timer-rtl-otto: enhancements Daniel Lezcano
@ 2025-09-10 21:15 ` markus.stockhausen
0 siblings, 0 replies; 12+ messages in thread
From: markus.stockhausen @ 2025-09-10 21:15 UTC (permalink / raw)
To: 'Daniel Lezcano', tglx, linux-kernel, howels, bjorn
> From: Daniel Lezcano <daniel.lezcano@linaro.org>
> Sent: Wednesday, 10 September 2025 22:54
>
> > This series fixes some shortcomings of the Realtek Otto timer driver.
> > These became evident after switching to longterm kernel 6.12. Devices
> > were randomly rebooted by the watchdog.
>
> Applied, thanks
Thanks a lot.