From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Anna-Maria Behnsen <anna-maria@linutronix.de>,
Ben Segall <bsegall@google.com>,
Boqun Feng <boqun.feng@gmail.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Ingo Molnar <mingo@kernel.org>, Ingo Molnar <mingo@redhat.com>,
Jan Kiszka <jan.kiszka@siemens.com>,
Joel Fernandes <joelagnelf@nvidia.com>,
Juri Lelli <juri.lelli@redhat.com>,
Kieran Bingham <kbingham@kernel.org>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Mel Gorman <mgorman@suse.de>,
Michael Ellerman <mpe@ellerman.id.au>,
Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
Nicholas Piggin <npiggin@gmail.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Sashiko, Shrikanth Hegde <sshegde@linux.ibm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Sven Schnelle <svens@linux.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Uladzislau Rezki <urezki@gmail.com>,
Valentin Schneider <vschneid@redhat.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Viresh Kumar <viresh.kumar@linaro.org>,
Xin Zhao <jackzxcui1989@163.com>,
linux-pm@vger.kernel.org, linux-s390@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org
Subject: [PATCH 01/15] tick/sched: Fix TOCTOU in nohz idle time fetch
Date: Fri, 8 May 2026 15:16:33 +0200
Message-ID: <20260508131647.43868-2-frederic@kernel.org>
In-Reply-To: <20260508131647.43868-1-frederic@kernel.org>
When the nohz idle time is fetched, the current clock timestamp is taken
outside the seqcount read section, which can result in the following
race, as reported by Sashiko:

    get_cpu_sleep_time_us()                            tick_nohz_start_idle()
    -----------------------                            ---------------------
    now = ktime_get()
                                                       write_seqcount_begin(&ts->idle_sleeptime_seq);
                                                       idle_entrytime = ktime_get()
                                                       tick_sched_flag_set(ts, TS_FLAG_IDLE_ACTIVE);
                                                       write_seqcount_end(&ts->idle_sleeptime_seq);
    read_seqcount_begin(&ts->idle_sleeptime_seq)
    delta = now - idle_entrytime;
    //!! But now < idle_entrytime
    idle = *sleeptime + delta;
    read_seqcount_retry(&ts->idle_sleeptime_seq, seq)

Here the read side fetches its timestamp before the write side records
the idle entry time. As a result, the time delta computed on the read
side is negative (ktime_t is signed), which breaks the cputime
monotonicity guarantee.
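
To make the arithmetic concrete, here is a minimal user-space sketch of
the pre-fix computation. The function name idle_time_unclamped() and the
sample values are illustrative assumptions, not kernel code; int64_t
stands in for the signed ktime_t.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the pre-fix read side.
 * int64_t stands in for the kernel's signed ktime_t.
 */
static int64_t idle_time_unclamped(int64_t sleeptime, int64_t idle_entrytime,
				   int64_t now)
{
	/* Negative when 'now' was sampled before idle_entrytime was set. */
	int64_t delta = now - idle_entrytime;

	return sleeptime + delta;
}
```

With sleeptime = 1000, idle_entrytime = 105 and now = 100, this returns
995, which is lower than the 1000 already accumulated and possibly
already reported to a previous reader.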
This could be fixed by reading the current clock timestamp inside the
seqcount read section, but that might increase the reader overhead.
Simply checking that the current timestamp is later than the idle entry
time is enough to prevent this kind of issue.
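
The check can be sketched in user space as follows. This is only a model
under stated assumptions: idle_time_clamped() is a hypothetical name,
int64_t stands in for ktime_t, and the seqcount retry loop is left out.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the fixed read side.
 * int64_t stands in for the kernel's signed ktime_t.
 */
static int64_t idle_time_clamped(int64_t sleeptime, int64_t idle_entrytime,
				 int64_t now)
{
	int64_t delta = 0;

	/* Account the in-progress idle period only if 'now' is past its start. */
	if (now > idle_entrytime)
		delta = now - idle_entrytime;

	return sleeptime + delta;
}
```

When the reader's timestamp predates the idle entry time, the delta
stays at zero and the result never drops below the accumulated sleep
time.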
Reported-by: Sashiko
Fixes: 620a30fa0bd1 ("timers/nohz: Protect idle/iowait sleep time under seqcount")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
kernel/time/tick-sched.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index cbbb87a0c6e7..171393367b5c 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -797,15 +797,16 @@ static u64 get_cpu_sleep_time_us(struct tick_sched *ts, ktime_t *sleeptime,
 		*last_update_time = ktime_to_us(now);
 
 	do {
+		ktime_t delta = 0;
+
 		seq = read_seqcount_begin(&ts->idle_sleeptime_seq);
 
 		if (tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE) && compute_delta) {
-			ktime_t delta = ktime_sub(now, ts->idle_entrytime);
-
-			idle = ktime_add(*sleeptime, delta);
-		} else {
-			idle = *sleeptime;
+			if (now > ts->idle_entrytime)
+				delta = ktime_sub(now, ts->idle_entrytime);
 		}
+
+		idle = ktime_add(*sleeptime, delta);
 	} while (read_seqcount_retry(&ts->idle_sleeptime_seq, seq));
 
 	return ktime_to_us(idle);
--
2.53.0
Thread overview: 16+ messages
2026-05-08 13:16 [PATCH 00/15 v4] tick/sched: Refactor idle cputime accounting Frederic Weisbecker
2026-05-08 13:16 ` Frederic Weisbecker [this message]
2026-05-08 13:16 ` [PATCH 02/15] sched/idle: Handle offlining first in idle loop Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 03/15] sched/cputime: Remove superfluous and error prone kcpustat_field() parameter Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 04/15] sched/cputime: Correctly support generic vtime idle time Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 05/15] powerpc/time: Prepare to stop elapsing in dynticks-idle Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 06/15] s390/time: " Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 07/15] tick/sched: Unify idle cputime accounting Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 08/15] tick/sched: Remove nohz disabled special case in cputime fetch Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 09/15] tick/sched: Move dyntick-idle cputime accounting to cputime code Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 10/15] tick/sched: Remove unused fields Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 11/15] tick/sched: Account tickless idle cputime only when tick is stopped Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 12/15] tick/sched: Consolidate idle time fetching APIs Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 13/15] sched/cputime: Provide get_cpu_[idle|iowait]_time_us() off-case Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 14/15] sched/cputime: Handle idle irqtime gracefully Frederic Weisbecker
2026-05-08 13:16 ` [PATCH 15/15] sched/cputime: Handle dyntick-idle steal time correctly Frederic Weisbecker