From: Tony Rodriguez <unixpro1970@gmail.com>
To: Thomas Gleixner <tglx@kernel.org>
Cc: Linux kernel regressions list <regressions@lists.linux.dev>,
LKML <linux-kernel@vger.kernel.org>,
sparclinux@vger.kernel.org,
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>,
Thorsten Leemhuis <regressions@leemhuis.info>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: the stuttering regression in 7.0: should I have done something different
Date: Thu, 14 May 2026 00:24:46 -0700 [thread overview]
Message-ID: <e45438e7-8501-4c10-95ee-07f118de8a51@gmail.com> (raw)
In-Reply-To: <87tssb6olo.ffs@tglx>
Hi Thomas,
Cheers!
Initial validation of the test patches for v7.0.6 and 7.1-rc3 on the
S7-2 looks promising: I have not observed panics, timer delays, or other
timer-related issues so far. I will pause broader validation on the S7-2
and T7-1 until I receive your recommendation or any requested revisions
(see inline comments below).
Note: I did see an intermittent error on the S7-2 running 7.1-rc3,
usually when the system is under heavy load during a kernel build. I’m
not sure whether it is a separate problem?
"[676.464681] BUG: Bad rss-counter state mm:000000008d9f1cf2
type:MM_FILEPAGES val:-4096 Comm:cc1 Pid:78165".
On 5/13/26 1:28 PM, Thomas Gleixner wrote:
> Just to be clear: I never saw the VHDL code of that CPU, but that
> pattern is way too familiar.
>
> Those equal comparators, which were designed by AI (Absence of
> Intelligence) before AI got popular, generally work this way:
>
> The comparator is only evaluated on the clock edge which increments
> the counter, but not when the comparator value is written. So a write
> of the same value does not result in an interrupt.
>
> That's an "optimization" which spares quite a few gates and is obviously
> nowhere documented. So software has to deal with the consequences by
> using a crystal ball, which is trivial to get wrong and can go unnoticed
> for a long time until it roars it's ugly head at some point for whatever
> reasons.
>
> I'm willing to bet a round of beers at the next conference that this is
> the problem and that it will magically disappear when you change that
> condition to:
>
> return (read_cnt() - exp) >= 0 ? -ETIME : 0;
Attempted to locate "return (read_cnt() - exp) >= 0 ? -ETIME : 0;" but
could not find an exact match. After additional inspection I updated the
following functions "tick_add_compare()" and "stick_add_compare()" in
arch/sparc/kernel/time_64.c to from "> 0L" to ">= 0L". This appears to
have resolved the lost-timer behavior.
--- time_64.c.orig
+++ time_64.c
@@ -146,7 +146,7 @@
: "=r" (new_tick));
new_tick &= ~TICKCMP_IRQ_BIT;
- return ((long)(new_tick - (orig_tick+adj))) > 0L;
+ return ((long)(new_tick - (orig_tick+adj))) >= 0L;
}
static unsigned long tick_add_tick(unsigned long adj)
@@ -277,7 +277,7 @@
: "=r" (new_tick));
new_tick &= ~TICKCMP_IRQ_BIT;
- return ((long)(new_tick - (orig_tick+adj))) > 0L;
+ return ((long)(new_tick - (orig_tick+adj))) >= 0L;
}
static unsigned long stick_get_frequency(void)
>
> unless they managed to add some extra propagation delay to that
> comparator write like the HPET folks did at some point without telling
> anyone. I doubt the SPARC janitor who implemented it did so because
> that would have made the failure way more likely.
>
> I have truly no idea why the original code did not expose this problem,
> though it might have been just papered over by sheer luck and timing.
>
> Thanks,
>
> tglx
> ---
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -381,6 +381,8 @@ int clockevents_program_event(struct clo
> if (dev->set_next_event(dev->min_delta_ticks, dev)) {
> if (!force || clockevents_program_min_delta(dev))
> return -ETIME;
> + } else if (delta <= 0) {
> + dev->next_event = ktime_add_ns(ktime_get(), dev->min_delta_ns);
> }
> dev->next_event_forced = 1;
> return 0;
>
You mentioned this kernel/time/clockevents.c patch is optional, but I
propose revising clockevents_program_event(). If the requested event
time is already at or before now, record a sane next_event (now +
min_delta) so core code sees a future expected time and can behave
correctly. Does this seem reasonable?
--- clockevents.c.orig
+++ clockevents.c
@@ -347,6 +347,11 @@
if (dev->set_next_event(dev->min_delta_ticks, dev)) {
if (!force || clockevents_program_min_delta(dev))
return -ETIME;
+ } else {
+ ktime_t now = ktime_get();
+ s64 delta_ns = ktime_to_ns(ktime_sub(expires, now));
+ if (delta_ns <= 0)
+ dev->next_event = ktime_add_ns(now,
dev->min_delta_ns);
}
dev->next_event_forced = 1;
return 0;
next prev parent reply other threads:[~2026-05-14 7:24 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <4dd98a32-d1d6-43de-910c-7e487503177e@leemhuis.info>
2026-05-08 5:51 ` the stuttering regression in 7.0: should I have done something different? John Paul Adrian Glaubitz
2026-05-08 6:33 ` Thorsten Leemhuis
[not found] ` <D5D19776-C809-4284-9417-F9A860877B98@gmail.com>
2026-05-08 7:50 ` Thorsten Leemhuis
2026-05-08 20:15 ` Tony Rodriguez
2026-05-08 20:21 ` Tony Rodriguez
2026-05-10 21:29 ` Thomas Gleixner
2026-05-11 3:13 ` Tony Rodriguez
2026-05-12 5:03 ` the stuttering regression in 7.0: should I have done something different Tony Rodriguez
2026-05-12 8:17 ` Thomas Gleixner
2026-05-12 21:43 ` Tony Rodriguez
2026-05-13 20:28 ` Thomas Gleixner
2026-05-14 7:24 ` Tony Rodriguez [this message]
2026-05-14 10:24 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e45438e7-8501-4c10-95ee-07f118de8a51@gmail.com \
--to=unixpro1970@gmail.com \
--cc=glaubitz@physik.fu-berlin.de \
--cc=linux-kernel@vger.kernel.org \
--cc=regressions@leemhuis.info \
--cc=regressions@lists.linux.dev \
--cc=sparclinux@vger.kernel.org \
--cc=tglx@kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox