From: Heiko Carstens <hca@linux.ibm.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Anna-Maria Behnsen <anna-maria@linutronix.de>,
Ben Segall <bsegall@google.com>,
Boqun Feng <boqun.feng@gmail.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Ingo Molnar <mingo@redhat.com>,
Jan Kiszka <jan.kiszka@siemens.com>,
Joel Fernandes <joelagnelf@nvidia.com>,
Juri Lelli <juri.lelli@redhat.com>,
Kieran Bingham <kbingham@kernel.org>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Mel Gorman <mgorman@suse.de>,
Michael Ellerman <mpe@ellerman.id.au>,
Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
Nicholas Piggin <npiggin@gmail.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Steven Rostedt <rostedt@goodmis.org>,
Sven Schnelle <svens@linux.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Uladzislau Rezki <urezki@gmail.com>,
Valentin Schneider <vschneid@redhat.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Viresh Kumar <viresh.kumar@linaro.org>,
Xin Zhao <jackzxcui1989@163.com>,
linux-pm@vger.kernel.org, linux-s390@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 05/15] s390/time: Prepare to stop elapsing in dynticks-idle
Date: Thu, 22 Jan 2026 15:40:45 +0100
Message-ID: <20260122144045.38254A3e-hca@linux.ibm.com>
In-Reply-To: <aXEVM-04lj0lntMr@localhost.localdomain>
On Wed, Jan 21, 2026 at 07:04:35PM +0100, Frederic Weisbecker wrote:
> BTW here is a question for you, does the timer (as in get_cpu_timer()) still
> decrement while in idle? I would assume not, given how lc->system_timer
> is updated in account_idle_time_irq().
It is not decremented while in idle (or when the hypervisor schedules
the virtual cpu away). Steal time is calculated from exactly that
difference: the cpu timer stops while the virtual cpu is not running,
whereas the real time-of-day clock keeps advancing.
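To illustrate the idea only, here is a minimal sketch and not the actual s390 code. The struct and function names are made up for this example. It also ignores idle time, which stops the cpu timer as well and therefore has to be accounted for separately.

```c
#include <stdint.h>

/*
 * Hypothetical snapshot of the two clocks. The names are illustrative
 * and are not taken from the s390 code.
 */
struct clock_sample {
	uint64_t tod;       /* time-of-day clock: always advances */
	uint64_t cpu_timer; /* cpu timer: counts down only while the vcpu runs */
};

/* Steal time between two samples: elapsed wall time minus time the vcpu ran. */
static uint64_t steal_between(struct clock_sample prev, struct clock_sample now)
{
	uint64_t wall = now.tod - prev.tod;
	uint64_t ran = prev.cpu_timer - now.cpu_timer; /* the timer decrements */

	return wall > ran ? wall - ran : 0;
}
```

With both clocks in the same unit, a wall-clock delta of 100 and a cpu timer delta of 60 would leave 40 units during which the virtual cpu was not running.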
> Another question about this same function is this:
>
> lc->steal_timer += idle->clock_idle_enter - lc->last_update_clock;
>
> clock_idle_enter is updated right before halting the CPU. But when was
> last_update_clock updated last? Could be either task switch to idle, or
> a previous idle tick interrupt or a previous idle IRQ entry. In any case
> I'm not sure the difference is meaningful as steal time.
>
> I must be missing something.
"It has been like that forever" :) However, I do agree that this doesn't seem
to make any sense. At least with the current implementation I cannot see how
it could, since what gets added is the difference of two time stamps, neither
of which includes any steal time.
Maybe it was broken by one of the many changes over the years, or it was always
wrong, or I am missing something too.
Will investigate and address it if required. Thank you for bringing this up!
> > Not sure what to do with this. I thought about removing those sysfs files
> > already in the past, since they are of very limited use; and most likely
> > nothing in user space would miss them.
>
> Perhaps but this file is a good comparison point against /proc/stat because
> s390 vtime is much closer to measuring the actual CPU halted time than what
> the generic nohz accounting does (which includes more idle code execution).
Yes, while comparing those files I also see an unexpected difference of
several seconds after two days of uptime; that is before your changes.
In theory the sum of idle and iowait in /proc/stat should match the value in the
per-cpu idle_time_us sysfs file. But there is a difference, which shouldn't be
there as far as I can tell. Yet another thing to look into.
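For reference, a rough user space sketch of that comparison. The helper names are hypothetical. It assumes the usual /proc/stat per-cpu line layout ("cpuN user nice system idle iowait ...") with values in USER_HZ ticks, so the result cannot be finer than one tick.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one per-cpu line of /proc/stat and return idle + iowait in
 * microseconds, or -1 if the line is not a per-cpu line.
 * user_hz is the tick rate of the values, e.g. sysconf(_SC_CLK_TCK).
 */
static int64_t stat_idle_us(const char *line, long user_hz)
{
	unsigned long long user, nice, sys, idle, iowait;
	int cpu;

	if (user_hz <= 0)
		return -1;
	/* Reject the aggregate "cpu " line and anything else. */
	if (strncmp(line, "cpu", 3) != 0 || !isdigit((unsigned char)line[3]))
		return -1;
	if (sscanf(line, "cpu%d %llu %llu %llu %llu %llu",
		   &cpu, &user, &nice, &sys, &idle, &iowait) != 6)
		return -1;
	return (int64_t)((idle + iowait) * 1000000ULL / (unsigned long long)user_hz);
}

/* Signed difference between the sysfs value and the /proc/stat sum, in us. */
static int64_t idle_delta_us(int64_t sysfs_idle_us, int64_t stat_us)
{
	return sysfs_idle_us - stat_us;
}
```

Feeding it the cpuN line and the content of the matching idle_time_us file, read at about the same time, gives the per-cpu difference in microseconds.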
> > Guess I need to spend some more time on accounting and see what it would take
> > to convert to VIRT_CPU_ACCOUNTING_GEN, while keeping the current precision and
> > functionality.
>
> I would expect more overhead with VIRT_CPU_ACCOUNTING_GEN, though that has yet
> to be measured. In any case you'll lose some idle cputime precision (but
> you need to read that through s390 sysfs files) if what we want to measure
> here is the actual halted time.
>
> Perhaps we could enhance VIRT_CPU_ACCOUNTING_GEN and nohz idle cputime
> accounting to match s390 precision. Though I expect some cost
> accessing the clock inevitably more often on some machines.
Let me experiment with that, but first I want to understand the oddities
pointed out above.