* [RFC PATCH 0/3] s390: Idle time accounting improvements
@ 2026-02-25 14:51 Heiko Carstens
2026-02-25 14:51 ` [RFC PATCH 1/3] fixup! s390/time: Prepare to stop elapsing in dynticks-idle Heiko Carstens
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Heiko Carstens @ 2026-02-25 14:51 UTC (permalink / raw)
To: Frederic Weisbecker, Alexander Gordeev, Sven Schnelle,
Vasily Gorbik, Christian Borntraeger
Cc: linux-kernel, linux-s390
This series is on top of Frederic Weisbecker's idle cputime accounting
refactor series [1].
The first patch is a fix and should be merged into the corresponding patch of
the series.
The second patch is supposed to improve s390 idle time accounting and bring
it back to the state it was in before arch_cpu_idle_time() was removed [2].
As a result, all cpu time accounting is done by the s390 architecture backend
again, instead of having a mix of architecture specific and common code
accounting (common code: idle, s390 architecture: everything else).
The code doesn't look too nice and, as usual, might contain bugs. Therefore
this is an RFC. The outcome may also be to drop this and stay with
Frederic's code as the s390 backend.
Thanks,
Heiko
[1] https://lore.kernel.org/all/20260206142245.58987-1-frederic@kernel.org/
[2] commit be76ea614460 ("s390/idle: remove arch_cpu_idle_time() and corresponding code")
Heiko Carstens (3):
fixup! s390/time: Prepare to stop elapsing in dynticks-idle
s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait()
s390/idle: Remove idle time and count sysfs files
arch/s390/include/asm/idle.h | 11 ++--
arch/s390/include/asm/lowcore.h | 9 +--
arch/s390/include/asm/timex.h | 20 +-----
arch/s390/include/asm/tod_types.h | 30 +++++++++
arch/s390/kernel/asm-offsets.c | 5 ++
arch/s390/kernel/entry.S | 7 +-
arch/s390/kernel/idle.c | 105 +++++++++++++++++++++---------
arch/s390/kernel/irq.c | 2 +-
arch/s390/kernel/setup.c | 1 +
arch/s390/kernel/smp.c | 33 +---------
arch/s390/kernel/vtime.c | 37 -----------
drivers/s390/cio/qdio_main.c | 2 +-
drivers/s390/cio/qdio_thinint.c | 2 +-
include/linux/kernel_stat.h | 27 ++++++++
include/linux/vtime.h | 6 ++
kernel/sched/cputime.c | 4 +-
16 files changed, 166 insertions(+), 135 deletions(-)
create mode 100644 arch/s390/include/asm/tod_types.h
--
2.51.0
* [RFC PATCH 1/3] fixup! s390/time: Prepare to stop elapsing in dynticks-idle
2026-02-25 14:51 [RFC PATCH 0/3] s390: Idle time accounting improvements Heiko Carstens
@ 2026-02-25 14:51 ` Heiko Carstens
2026-03-11 15:14 ` Frederic Weisbecker
2026-02-25 14:51 ` [RFC PATCH 2/3] s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait() Heiko Carstens
2026-02-25 14:51 ` [RFC PATCH 3/3] s390/idle: Remove idle time and count sysfs files Heiko Carstens
2 siblings, 1 reply; 8+ messages in thread
From: Heiko Carstens @ 2026-02-25 14:51 UTC (permalink / raw)
To: Frederic Weisbecker, Alexander Gordeev, Sven Schnelle,
Vasily Gorbik, Christian Borntraeger
Cc: linux-kernel, linux-s390
This should be merged with "s390/time: Prepare to stop elapsing in
dynticks-idle".
It makes sure that idle->clock_idle_enter is always set when loading the
idle psw. Otherwise the idle_time calculation in account_idle_time_irq()
would be incorrect.
Also "revert" some unneeded code movements and whitespace changes to keep
the diff minimal.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/include/asm/idle.h | 12 ++++++------
arch/s390/kernel/idle.c | 9 +++------
2 files changed, 9 insertions(+), 12 deletions(-)
diff --git a/arch/s390/include/asm/idle.h b/arch/s390/include/asm/idle.h
index 285b3da318d6..7f2a1240b6ff 100644
--- a/arch/s390/include/asm/idle.h
+++ b/arch/s390/include/asm/idle.h
@@ -13,12 +13,12 @@
#include <linux/device.h>
struct s390_idle_data {
- bool idle_dyntick;
- unsigned long idle_count;
- unsigned long idle_time;
- unsigned long clock_idle_enter;
- unsigned long timer_idle_enter;
- unsigned long mt_cycles_enter[8];
+ bool idle_dyntick;
+ unsigned long idle_count;
+ unsigned long idle_time;
+ unsigned long clock_idle_enter;
+ unsigned long timer_idle_enter;
+ unsigned long mt_cycles_enter[8];
};
DECLARE_PER_CPU(struct s390_idle_data, s390_idle);
diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
index 614db5ea6ea3..fb4f431342f5 100644
--- a/arch/s390/kernel/idle.c
+++ b/arch/s390/kernel/idle.c
@@ -35,11 +35,10 @@ void account_idle_time_irq(void)
this_cpu_add(mt_cycles[i], cycles_new[i] - idle->mt_cycles_enter[i]);
}
- WRITE_ONCE(idle->idle_count, READ_ONCE(idle->idle_count) + 1);
-
/* Account time spent with enabled wait psw loaded as idle time. */
idle_time = lc->int_clock - idle->clock_idle_enter;
WRITE_ONCE(idle->idle_time, READ_ONCE(idle->idle_time) + idle_time);
+ WRITE_ONCE(idle->idle_count, READ_ONCE(idle->idle_count) + 1);
/* Dyntick idle time accounted by nohz/scheduler */
if (idle->idle_dyntick)
@@ -66,10 +65,8 @@ void noinstr arch_cpu_idle(void)
set_cpu_flag(CIF_ENABLED_WAIT);
if (smp_cpu_mtid)
stcctm(MT_DIAG, smp_cpu_mtid, (u64 *)&idle->mt_cycles_enter);
- if (!idle->idle_dyntick) {
- idle->clock_idle_enter = get_tod_clock_fast();
- idle->timer_idle_enter = get_cpu_timer();
- }
+ idle->clock_idle_enter = get_tod_clock_fast();
+ idle->timer_idle_enter = get_cpu_timer();
bpon();
__load_psw_mask(psw_mask);
}
--
2.51.0
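The effect of the fixup can be illustrated with a minimal userspace sketch.
It is not kernel code: struct idle_data and the three helpers below are
hypothetical stand-ins, and the timestamps are plain integers instead of TOD
clock values. The sketch assumes a CPU that enters idle in dyntick mode with a
stale clock_idle_enter left over from an earlier idle period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU idle data, not the kernel struct. */
struct idle_data {
	bool idle_dyntick;
	uint64_t idle_count;
	uint64_t idle_time;
	uint64_t clock_idle_enter;
};

/* Idle entry before the fixup: no time stamp is taken in dyntick mode. */
static void idle_enter_conditional(struct idle_data *idle, uint64_t now)
{
	if (!idle->idle_dyntick)
		idle->clock_idle_enter = now;
}

/* Idle entry with the fixup: the time stamp is always taken. */
static void idle_enter_always(struct idle_data *idle, uint64_t now)
{
	idle->clock_idle_enter = now;
}

/* Interrupt entry: account the time spent waiting as idle time. */
static uint64_t idle_exit(struct idle_data *idle, uint64_t int_clock)
{
	uint64_t idle_time;

	assert(int_clock >= idle->clock_idle_enter);
	idle_time = int_clock - idle->clock_idle_enter;
	idle->idle_time += idle_time;
	idle->idle_count++;
	return idle_time;
}
```

With a stale clock_idle_enter of 100, an idle entry at 1000 and an interrupt
at 1250, the conditional variant accounts 1150 units, while the unconditional
variant accounts the expected 250.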
* [RFC PATCH 2/3] s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait()
2026-02-25 14:51 [RFC PATCH 0/3] s390: Idle time accounting improvements Heiko Carstens
2026-02-25 14:51 ` [RFC PATCH 1/3] fixup! s390/time: Prepare to stop elapsing in dynticks-idle Heiko Carstens
@ 2026-02-25 14:51 ` Heiko Carstens
2026-03-11 16:13 ` Frederic Weisbecker
2026-02-25 14:51 ` [RFC PATCH 3/3] s390/idle: Remove idle time and count sysfs files Heiko Carstens
2 siblings, 1 reply; 8+ messages in thread
From: Heiko Carstens @ 2026-02-25 14:51 UTC (permalink / raw)
To: Frederic Weisbecker, Alexander Gordeev, Sven Schnelle,
Vasily Gorbik, Christian Borntraeger
Cc: linux-kernel, linux-s390
With commit be76ea614460 ("s390/idle: remove arch_cpu_idle_time() and
corresponding code") the s390 specific arch_cpu_idle_time() was
removed. The reason was that the implementation was racy and the reported
idle time could go backwards (see the referenced commit for details).
With Frederic Weisbecker's idle cputime accounting refactoring,
kernel_cpustat got a sequence counter. Use this to implement s390 specific
variants of kcpustat_field_idle() and kcpustat_field_iowait(). This is
basically a revert of the referenced commit and at the same time addresses
all the outlined races.
For comparing cross cpu time stamps it is necessary to use the stcke
instead of the stckf instruction in the irq entry path. Furthermore, this
open-codes a sequence lock in assembler and C code, which is required to
copy the irq entry time stamp to the per-cpu idle_data structure in a
race-free manner.
With all of this the s390 idle time accounting should be very precise
again, while also preventing reported idle time from going backwards.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/include/asm/idle.h | 15 +++---
arch/s390/include/asm/lowcore.h | 9 ++--
arch/s390/include/asm/timex.h | 20 +-------
arch/s390/include/asm/tod_types.h | 30 ++++++++++++
arch/s390/kernel/asm-offsets.c | 5 ++
arch/s390/kernel/entry.S | 7 ++-
arch/s390/kernel/idle.c | 78 +++++++++++++++++++++++++++----
arch/s390/kernel/irq.c | 2 +-
arch/s390/kernel/setup.c | 1 +
arch/s390/kernel/smp.c | 1 +
arch/s390/kernel/vtime.c | 37 ---------------
drivers/s390/cio/qdio_main.c | 2 +-
drivers/s390/cio/qdio_thinint.c | 2 +-
include/linux/kernel_stat.h | 27 +++++++++++
include/linux/vtime.h | 6 +++
kernel/sched/cputime.c | 4 +-
16 files changed, 166 insertions(+), 80 deletions(-)
create mode 100644 arch/s390/include/asm/tod_types.h
diff --git a/arch/s390/include/asm/idle.h b/arch/s390/include/asm/idle.h
index 7f2a1240b6ff..dc04d63b6187 100644
--- a/arch/s390/include/asm/idle.h
+++ b/arch/s390/include/asm/idle.h
@@ -11,14 +11,17 @@
#include <linux/percpu-defs.h>
#include <linux/types.h>
#include <linux/device.h>
+#include <asm/tod_types.h>
struct s390_idle_data {
- bool idle_dyntick;
- unsigned long idle_count;
- unsigned long idle_time;
- unsigned long clock_idle_enter;
- unsigned long timer_idle_enter;
- unsigned long mt_cycles_enter[8];
+ bool in_idle;
+ unsigned int sequence;
+ unsigned long idle_count;
+ unsigned long idle_time;
+ union tod_clock clock_idle_enter;
+ union tod_clock clock_idle_exit;
+ unsigned long timer_idle_enter;
+ unsigned long mt_cycles_enter[8];
};
DECLARE_PER_CPU(struct s390_idle_data, s390_idle);
diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h
index 50ffe75adeb4..64b74ee3560c 100644
--- a/arch/s390/include/asm/lowcore.h
+++ b/arch/s390/include/asm/lowcore.h
@@ -10,6 +10,7 @@
#define _ASM_S390_LOWCORE_H
#include <linux/types.h>
+#include <asm/tod_types.h>
#include <asm/machine.h>
#include <asm/ptrace.h>
#include <asm/ctlreg.h>
@@ -125,10 +126,10 @@ struct lowcore {
__u64 avg_steal_timer; /* 0x0300 */
__u64 last_update_timer; /* 0x0308 */
__u64 last_update_clock; /* 0x0310 */
- __u64 int_clock; /* 0x0318 */
- __u8 pad_0x0320[0x0328-0x0320]; /* 0x0320 */
- __u64 clock_comparator; /* 0x0328 */
- __u8 pad_0x0330[0x0340-0x0330]; /* 0x0330 */
+ __u64 idle_data; /* 0x0318 */
+ union tod_clock int_clock; /* 0x0320 */
+ __u64 clock_comparator; /* 0x0330 */
+ __u8 pad_0x0330[0x0340-0x0338]; /* 0x0338 */
/* Current process. */
__u64 current_task; /* 0x0340 */
diff --git a/arch/s390/include/asm/timex.h b/arch/s390/include/asm/timex.h
index 49447b40f038..ac3ab6c29912 100644
--- a/arch/s390/include/asm/timex.h
+++ b/arch/s390/include/asm/timex.h
@@ -12,6 +12,7 @@
#include <linux/preempt.h>
#include <linux/time64.h>
+#include <asm/tod_types.h>
#include <asm/lowcore.h>
#include <asm/machine.h>
#include <asm/asm.h>
@@ -21,25 +22,6 @@
extern u64 clock_comparator_max;
-union tod_clock {
- __uint128_t val;
- struct {
- __uint128_t ei : 8; /* epoch index */
- __uint128_t tod : 64; /* bits 0-63 of tod clock */
- __uint128_t : 40;
- __uint128_t pf : 16; /* programmable field */
- };
- struct {
- __uint128_t eitod : 72; /* epoch index + bits 0-63 tod clock */
- __uint128_t : 56;
- };
- struct {
- __uint128_t us : 60; /* micro-seconds */
- __uint128_t sus : 12; /* sub-microseconds */
- __uint128_t : 56;
- };
-} __packed;
-
/* Inline functions for clock register access. */
static inline int set_tod_clock(__u64 time)
{
diff --git a/arch/s390/include/asm/tod_types.h b/arch/s390/include/asm/tod_types.h
new file mode 100644
index 000000000000..976fa0a1e895
--- /dev/null
+++ b/arch/s390/include/asm/tod_types.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_S390_TOD_TYPES_H
+#define _ASM_S390_TOD_TYPES_H
+
+#include <linux/types.h>
+
+#ifndef __ASSEMBLER__
+
+union tod_clock {
+ __uint128_t val;
+ struct {
+ __uint128_t ei : 8; /* epoch index */
+ __uint128_t tod : 64; /* bits 0-63 of tod clock */
+ __uint128_t : 40;
+ __uint128_t pf : 16; /* programmable field */
+ };
+ struct {
+ __uint128_t eitod : 72; /* epoch index + bits 0-63 tod clock */
+ __uint128_t : 56;
+ };
+ struct {
+ __uint128_t us : 60; /* micro-seconds */
+ __uint128_t sus : 12; /* sub-microseconds */
+ __uint128_t : 56;
+ };
+} __packed;
+
+#endif
+#endif
diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c
index e1a5b5b54e4f..c580a5ec265c 100644
--- a/arch/s390/kernel/asm-offsets.c
+++ b/arch/s390/kernel/asm-offsets.c
@@ -14,6 +14,7 @@
#include <asm/kvm_host_types.h>
#include <asm/stacktrace.h>
#include <asm/ptrace.h>
+#include <asm/idle.h>
int main(void)
{
@@ -126,6 +127,7 @@ int main(void)
OFFSET(__LC_EXIT_TIMER, lowcore, exit_timer);
OFFSET(__LC_LAST_UPDATE_TIMER, lowcore, last_update_timer);
OFFSET(__LC_LAST_UPDATE_CLOCK, lowcore, last_update_clock);
+ OFFSET(__LC_IDLE_DATA, lowcore, idle_data);
OFFSET(__LC_INT_CLOCK, lowcore, int_clock);
OFFSET(__LC_CURRENT, lowcore, current_task);
OFFSET(__LC_KERNEL_STACK, lowcore, kernel_stack);
@@ -180,6 +182,9 @@ int main(void)
DEFINE(OLDMEM_SIZE, PARMAREA + offsetof(struct parmarea, oldmem_size));
DEFINE(COMMAND_LINE, PARMAREA + offsetof(struct parmarea, command_line));
DEFINE(MAX_COMMAND_LINE_SIZE, PARMAREA + offsetof(struct parmarea, max_command_line_size));
+ /* idle data offsets */
+ OFFSET(__IDLE_CLOCK_EXIT, s390_idle_data, clock_idle_exit);
+ OFFSET(__IDLE_SEQUENCE, s390_idle_data, sequence);
OFFSET(__FTRACE_REGS_PT_REGS, __arch_ftrace_regs, regs);
DEFINE(__FTRACE_REGS_SIZE, sizeof(struct __arch_ftrace_regs));
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 4873fe9d891b..19fd1541a0f3 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -378,8 +378,13 @@ SYM_CODE_END(pgm_check_handler)
SYM_CODE_START(\name)
STMG_LC %r8,%r15,__LC_SAVE_AREA
GET_LC %r13
- stckf __LC_INT_CLOCK(%r13)
+ lg %r12,__LC_IDLE_DATA(%r13)
+ asi __IDLE_SEQUENCE(%r12),1
+ stcke __LC_INT_CLOCK(%r13)
stpt __LC_SYS_ENTER_TIMER(%r13)
+ mvc __IDLE_CLOCK_EXIT(16,%r12),__LC_INT_CLOCK(%r13)
+ ALTERNATIVE "bcr 15,0", "bcr 14,0", ALT_FACILITY(45)
+ asi __IDLE_SEQUENCE(%r12),1
STBEAR __LC_LAST_BREAK(%r13)
BPOFF
lmg %r8,%r9,\lc_old_psw(%r13)
diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
index fb4f431342f5..ceb95c0d22eb 100644
--- a/arch/s390/kernel/idle.c
+++ b/arch/s390/kernel/idle.c
@@ -9,12 +9,14 @@
#include <linux/kernel.h>
#include <linux/kernel_stat.h>
+#include <linux/sched/stat.h>
#include <linux/notifier.h>
#include <linux/init.h>
#include <linux/cpu.h>
#include <trace/events/power.h>
#include <asm/cpu_mf.h>
#include <asm/cputime.h>
+#include <asm/lowcore.h>
#include <asm/nmi.h>
#include <asm/smp.h>
#include "entry.h"
@@ -24,6 +26,7 @@ DEFINE_PER_CPU(struct s390_idle_data, s390_idle);
void account_idle_time_irq(void)
{
struct s390_idle_data *idle = this_cpu_ptr(&s390_idle);
+ struct kernel_cpustat *kc = kcpustat_this_cpu;
struct lowcore *lc = get_lowcore();
unsigned long idle_time;
u64 cycles_new[8];
@@ -35,27 +38,82 @@ void account_idle_time_irq(void)
this_cpu_add(mt_cycles[i], cycles_new[i] - idle->mt_cycles_enter[i]);
}
+ write_seqcount_begin(&kc->idle_sleeptime_seq);
/* Account time spent with enabled wait psw loaded as idle time. */
- idle_time = lc->int_clock - idle->clock_idle_enter;
+ idle_time = lc->int_clock.tod - idle->clock_idle_enter.tod;
WRITE_ONCE(idle->idle_time, READ_ONCE(idle->idle_time) + idle_time);
WRITE_ONCE(idle->idle_count, READ_ONCE(idle->idle_count) + 1);
- /* Dyntick idle time accounted by nohz/scheduler */
- if (idle->idle_dyntick)
- return;
-
- lc->steal_timer += idle->clock_idle_enter - lc->last_update_clock;
- lc->last_update_clock = lc->int_clock;
+ lc->steal_timer += idle->clock_idle_enter.tod - lc->last_update_clock;
+ lc->last_update_clock = lc->int_clock.tod;
lc->system_timer += lc->last_update_timer - idle->timer_idle_enter;
lc->last_update_timer = lc->sys_enter_timer;
+ idle->in_idle = false;
account_idle_time(cputime_to_nsecs(idle_time));
+ write_seqcount_end(&kc->idle_sleeptime_seq);
+}
+
+static u64 arch_cpu_in_idle_time(int cpu)
+{
+ struct s390_idle_data *idle = &per_cpu(s390_idle, cpu);
+ union tod_clock now;
+ unsigned int seq;
+ u64 idle_time;
+
+ if (!idle->in_idle)
+ return 0;
+ /*
+ * Open-coded read seqlock which pairs with assembler write seqlock
+ * in entry.S. Its purpose is to prevent that reported idle time
+ * goes backwards.
+ */
+ do {
+ do {
+ seq = __atomic_read(&idle->sequence);
+ smp_mb();
+ } while (seq & 1);
+ store_tod_clock_ext(&now);
+ if (tod_after(idle->clock_idle_exit.tod, idle->clock_idle_enter.tod))
+ idle_time = idle->clock_idle_exit.tod - idle->clock_idle_enter.tod;
+ else
+ idle_time = now.tod - idle->clock_idle_enter.tod;
+ smp_mb();
+ } while (__atomic_read(&idle->sequence) != seq);
+ return tod_to_ns(idle_time);
+}
+
+static u64 arch_cpu_idle_time(int cpu, enum cpu_usage_stat idx, bool compute_delta)
+{
+ struct kernel_cpustat *kc = &kcpustat_cpu(cpu);
+ u64 *cpustat = kc->cpustat;
+ unsigned int seq;
+ u64 idle_time;
+
+ do {
+ seq = read_seqcount_begin(&kc->idle_sleeptime_seq);
+ idle_time = cpustat[idx];
+ if (compute_delta)
+ idle_time += arch_cpu_in_idle_time(cpu);
+ } while (read_seqcount_retry(&kc->idle_sleeptime_seq, seq));
+ return idle_time;
+}
+
+u64 arch_kcpustat_field_idle(int cpu)
+{
+ return arch_cpu_idle_time(cpu, CPUTIME_IDLE, !nr_iowait_cpu(cpu));
+}
+
+u64 arch_kcpustat_field_iowait(int cpu)
+{
+ return arch_cpu_idle_time(cpu, CPUTIME_IOWAIT, nr_iowait_cpu(cpu));
}
void noinstr arch_cpu_idle(void)
{
struct s390_idle_data *idle = this_cpu_ptr(&s390_idle);
+ struct kernel_cpustat *kc = kcpustat_this_cpu;
unsigned long psw_mask;
/* Wait for external, I/O or machine check interrupt. */
@@ -65,8 +123,12 @@ void noinstr arch_cpu_idle(void)
set_cpu_flag(CIF_ENABLED_WAIT);
if (smp_cpu_mtid)
stcctm(MT_DIAG, smp_cpu_mtid, (u64 *)&idle->mt_cycles_enter);
- idle->clock_idle_enter = get_tod_clock_fast();
+ raw_write_seqcount_begin(&kc->idle_sleeptime_seq);
+ idle->in_idle = true;
+ store_tod_clock_ext(&idle->clock_idle_enter);
idle->timer_idle_enter = get_cpu_timer();
+ idle->clock_idle_exit = idle->clock_idle_enter;
+ raw_write_seqcount_end(&kc->idle_sleeptime_seq);
bpon();
__load_psw_mask(psw_mask);
}
diff --git a/arch/s390/kernel/irq.c b/arch/s390/kernel/irq.c
index bdf9c7cb5685..8e3667bff225 100644
--- a/arch/s390/kernel/irq.c
+++ b/arch/s390/kernel/irq.c
@@ -103,7 +103,7 @@ static const struct irq_class irqclass_sub_desc[] = {
static void do_IRQ(struct pt_regs *regs, int irq)
{
- if (tod_after_eq(get_lowcore()->int_clock,
+ if (tod_after_eq(get_lowcore()->int_clock.tod,
get_lowcore()->clock_comparator))
/* Serve timer interrupts first. */
clock_comparator_work();
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index c1fe0b53c5ac..2b1af286ef94 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -421,6 +421,7 @@ static void __init setup_lowcore(void)
lc->steal_timer = get_lowcore()->steal_timer;
lc->last_update_timer = get_lowcore()->last_update_timer;
lc->last_update_clock = get_lowcore()->last_update_clock;
+ lc->idle_data = (unsigned long)per_cpu_ptr(&s390_idle, 0);
/*
* Allocate the global restart stack which is the same for
* all CPUs in case *one* of them does a PSW restart.
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index b7429f30afc1..439eab2fb67a 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -256,6 +256,7 @@ static void pcpu_prepare_secondary(struct pcpu *pcpu, int cpu)
lc->user_asce = s390_invalid_asce;
lc->user_timer = lc->system_timer =
lc->steal_timer = lc->avg_steal_timer = 0;
+ lc->idle_data = (unsigned long)per_cpu_ptr(&s390_idle, cpu);
abs_lc = get_abs_lowcore();
memcpy(lc->cregs_save_area, abs_lc->cregs_save_area, sizeof(lc->cregs_save_area));
put_abs_lowcore(abs_lc);
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index c19528eb4ee3..93255f442359 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -269,43 +269,6 @@ void vtime_account_hardirq(struct task_struct *tsk)
virt_timer_forward(delta);
}
-#ifdef CONFIG_NO_HZ_COMMON
-/**
- * vtime_reset - Fast forward vtime entry clocks
- *
- * Called from dynticks idle IRQ entry to fast-forward the clocks to current time
- * so that the IRQ time is still accounted by vtime while nohz cputime is paused.
- */
-void vtime_reset(void)
-{
- vtime_reset_last_update(get_lowcore());
-}
-
-/**
- * vtime_dyntick_start - Inform vtime about entry to idle-dynticks
- *
- * Called when idle enters in dyntick mode. The idle cputime that elapsed so far
- * is flushed and the tick subsystem takes over the idle cputime accounting.
- */
-void vtime_dyntick_start(void)
-{
- __this_cpu_write(s390_idle.idle_dyntick, true);
- vtime_flush(current);
-}
-
-/**
- * vtime_dyntick_stop - Inform vtime about exit from idle-dynticks
- *
- * Called when idle exits from dyntick mode. The vtime entry clocks are
- * fast-forward to current time and idle accounting resumes.
- */
-void vtime_dyntick_stop(void)
-{
- vtime_reset_last_update(get_lowcore());
- __this_cpu_write(s390_idle.idle_dyntick, false);
-}
-#endif /* CONFIG_NO_HZ_COMMON */
-
/*
* Sorted add to a list. List is linear searched until first bigger
* element is found.
diff --git a/drivers/s390/cio/qdio_main.c b/drivers/s390/cio/qdio_main.c
index 7dd967165025..ffcb416f3f58 100644
--- a/drivers/s390/cio/qdio_main.c
+++ b/drivers/s390/cio/qdio_main.c
@@ -695,7 +695,7 @@ static void qdio_int_handler_pci(struct qdio_irq *irq_ptr)
return;
qdio_deliver_irq(irq_ptr);
- irq_ptr->last_data_irq_time = get_lowcore()->int_clock;
+ irq_ptr->last_data_irq_time = get_lowcore()->int_clock.tod;
}
static void qdio_handle_activate_check(struct qdio_irq *irq_ptr,
diff --git a/drivers/s390/cio/qdio_thinint.c b/drivers/s390/cio/qdio_thinint.c
index f931954910c5..ca0d2e4b673f 100644
--- a/drivers/s390/cio/qdio_thinint.c
+++ b/drivers/s390/cio/qdio_thinint.c
@@ -99,7 +99,7 @@ static inline u32 clear_shared_ind(void)
static void tiqdio_thinint_handler(struct airq_struct *airq,
struct tpi_info *tpi_info)
{
- u64 irq_time = get_lowcore()->int_clock;
+ u64 irq_time = get_lowcore()->int_clock.tod;
u32 si_used = clear_shared_ind();
struct qdio_irq *irq;
diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 24a54a6151ba..4ea9681a5f5c 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -107,6 +107,30 @@ static inline unsigned long kstat_cpu_irqs_sum(unsigned int cpu)
}
#ifdef CONFIG_NO_HZ_COMMON
+
+#ifdef CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE
+
+static inline void kcpustat_dyntick_start(u64 now) { }
+static inline void kcpustat_dyntick_stop(u64 now) { }
+static inline void kcpustat_irq_enter(u64 now) { }
+static inline void kcpustat_irq_exit(u64 now) { }
+static inline bool kcpustat_idle_dyntick(void) { return false; }
+
+extern u64 arch_kcpustat_field_idle(int cpu);
+extern u64 arch_kcpustat_field_iowait(int cpu);
+
+static inline u64 kcpustat_field_idle(int cpu)
+{
+ return arch_kcpustat_field_idle(cpu);
+}
+
+static inline u64 kcpustat_field_iowait(int cpu)
+{
+ return arch_kcpustat_field_iowait(cpu);
+}
+
+#else /* !CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE */
+
extern void kcpustat_dyntick_start(u64 now);
extern void kcpustat_dyntick_stop(u64 now);
extern void kcpustat_irq_enter(u64 now);
@@ -118,6 +142,9 @@ static inline bool kcpustat_idle_dyntick(void)
{
return __this_cpu_read(kernel_cpustat.idle_dyntick);
}
+
+#endif /* !CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE */
+
#else
static inline u64 kcpustat_field_idle(int cpu)
{
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index a4506336002d..07c862fd0951 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -42,9 +42,15 @@ extern void vtime_account_irq(struct task_struct *tsk, unsigned int offset);
extern void vtime_account_softirq(struct task_struct *tsk);
extern void vtime_account_hardirq(struct task_struct *tsk);
extern void vtime_flush(struct task_struct *tsk);
+#ifdef CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE
+static inline void vtime_reset(void) { }
+static inline void vtime_dyntick_start(void) { }
+static inline void vtime_dyntick_stop(void) { }
+#else
extern void vtime_reset(void);
extern void vtime_dyntick_start(void);
extern void vtime_dyntick_stop(void);
+#endif
#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
static inline void vtime_account_irq(struct task_struct *tsk, unsigned int offset) { }
static inline void vtime_account_softirq(struct task_struct *tsk) { }
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index e0492e18cc5f..85a0f728e28e 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -420,7 +420,7 @@ static inline void irqtime_account_process_tick(struct task_struct *p, int user_
int nr_ticks) { }
#endif /* !CONFIG_IRQ_TIME_ACCOUNTING */
-#ifdef CONFIG_NO_HZ_COMMON
+#if defined(CONFIG_NO_HZ_COMMON) && !defined(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE)
static void kcpustat_idle_stop(struct kernel_cpustat *kc, u64 now)
{
u64 *cpustat = kc->cpustat;
@@ -542,7 +542,7 @@ static u64 kcpustat_field_dyntick(int cpu, enum cpu_usage_stat idx,
{
return kcpustat_cpu(cpu).cpustat[idx];
}
-#endif /* CONFIG_NO_HZ_COMMON */
+#endif /* CONFIG_NO_HZ_COMMON && !CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE */
static u64 get_cpu_sleep_time_us(int cpu, enum cpu_usage_stat idx,
bool compute_delta, u64 *last_update_time)
--
2.51.0
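The open-coded sequence lock used by this patch can be sketched in portable
C11. This is an illustration only, under several assumptions: struct
idle_stamp and all stamp_*() helpers are hypothetical names, time stamps are
plain 64-bit integers rather than 16-byte stcke values, a plain comparison
replaces tod_after(), and idle entry uses the same counter here although the
patch protects that step with kc->idle_sleeptime_seq. A single writer per CPU
is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the time stamps guarded by the sequence count. */
struct idle_stamp {
	atomic_uint sequence;		/* odd while a write is in progress */
	_Atomic uint64_t clock_enter;
	_Atomic uint64_t clock_exit;
};

static void stamp_init(struct idle_stamp *s)
{
	atomic_init(&s->sequence, 0);
	atomic_init(&s->clock_enter, 0);
	atomic_init(&s->clock_exit, 0);
}

/* Writer: make the count odd before touching the data. */
static void write_begin(struct idle_stamp *s)
{
	unsigned int seq;

	seq = atomic_fetch_add_explicit(&s->sequence, 1, memory_order_relaxed);
	assert(!(seq & 1));	/* single writer: no write may be in flight */
	atomic_thread_fence(memory_order_release);
}

/* Writer: make the count even again once the data is complete. */
static void write_end(struct idle_stamp *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_release);
}

/* Idle entry: enter and exit stamps start out equal. */
static void stamp_idle_enter(struct idle_stamp *s, uint64_t now)
{
	write_begin(s);
	atomic_store_explicit(&s->clock_enter, now, memory_order_relaxed);
	atomic_store_explicit(&s->clock_exit, now, memory_order_relaxed);
	write_end(s);
}

/* Interrupt entry: publish the exit stamp (the asi/mvc/asi step in entry.S). */
static void stamp_idle_exit(struct idle_stamp *s, uint64_t now)
{
	write_begin(s);
	atomic_store_explicit(&s->clock_exit, now, memory_order_relaxed);
	write_end(s);
}

/* Reader: retry until a consistent snapshot of both stamps was seen. */
static uint64_t stamp_read_idle(struct idle_stamp *s, uint64_t now)
{
	uint64_t enter, leave, idle;
	unsigned int seq;

	do {
		do {
			seq = atomic_load_explicit(&s->sequence, memory_order_acquire);
		} while (seq & 1);
		enter = atomic_load_explicit(&s->clock_enter, memory_order_relaxed);
		leave = atomic_load_explicit(&s->clock_exit, memory_order_relaxed);
		/* Exit stamp newer than enter stamp: the idle period is over. */
		idle = leave > enter ? leave - enter : now - enter;
		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load_explicit(&s->sequence, memory_order_relaxed) != seq);
	return idle;
}
```

While the CPU is still idle the reader reports the time since idle entry;
once the exit stamp has been published it reports the closed interval, so a
later read cannot return a smaller value for the same idle period.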
* [RFC PATCH 3/3] s390/idle: Remove idle time and count sysfs files
2026-02-25 14:51 [RFC PATCH 0/3] s390: Idle time accounting improvements Heiko Carstens
2026-02-25 14:51 ` [RFC PATCH 1/3] fixup! s390/time: Prepare to stop elapsing in dynticks-idle Heiko Carstens
2026-02-25 14:51 ` [RFC PATCH 2/3] s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait() Heiko Carstens
@ 2026-02-25 14:51 ` Heiko Carstens
2026-03-11 16:14 ` Frederic Weisbecker
2 siblings, 1 reply; 8+ messages in thread
From: Heiko Carstens @ 2026-02-25 14:51 UTC (permalink / raw)
To: Frederic Weisbecker, Alexander Gordeev, Sven Schnelle,
Vasily Gorbik, Christian Borntraeger
Cc: linux-kernel, linux-s390
Remove the s390 specific idle_time_us and idle_count per-cpu sysfs
files. They do not provide any additional value. The risk that there are
existing applications which rely on these architecture specific files
should be very low.
However, if it turns out such applications exist, this can be easily
reverted.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/include/asm/idle.h | 4 ----
arch/s390/kernel/idle.c | 20 +-------------------
arch/s390/kernel/smp.c | 32 +-------------------------------
3 files changed, 2 insertions(+), 54 deletions(-)
diff --git a/arch/s390/include/asm/idle.h b/arch/s390/include/asm/idle.h
index dc04d63b6187..f0f3d38ef648 100644
--- a/arch/s390/include/asm/idle.h
+++ b/arch/s390/include/asm/idle.h
@@ -10,7 +10,6 @@
#include <linux/percpu-defs.h>
#include <linux/types.h>
-#include <linux/device.h>
#include <asm/tod_types.h>
struct s390_idle_data {
@@ -26,9 +25,6 @@ struct s390_idle_data {
DECLARE_PER_CPU(struct s390_idle_data, s390_idle);
-extern struct device_attribute dev_attr_idle_count;
-extern struct device_attribute dev_attr_idle_time_us;
-
void psw_idle(struct s390_idle_data *data, unsigned long psw_mask);
#endif /* _S390_IDLE_H */
diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
index ceb95c0d22eb..db120ef810ac 100644
--- a/arch/s390/kernel/idle.c
+++ b/arch/s390/kernel/idle.c
@@ -81,7 +81,7 @@ static u64 arch_cpu_in_idle_time(int cpu)
idle_time = now.tod - idle->clock_idle_enter.tod;
smp_mb();
} while (__atomic_read(&idle->sequence) != seq);
- return tod_to_ns(idle_time);
+ return cputime_to_nsecs(idle_time);
}
static u64 arch_cpu_idle_time(int cpu, enum cpu_usage_stat idx, bool compute_delta)
@@ -133,24 +133,6 @@ void noinstr arch_cpu_idle(void)
__load_psw_mask(psw_mask);
}
-static ssize_t show_idle_count(struct device *dev,
- struct device_attribute *attr, char *buf)
-{
- struct s390_idle_data *idle = &per_cpu(s390_idle, dev->id);
-
- return sysfs_emit(buf, "%lu\n", READ_ONCE(idle->idle_count));
-}
-DEVICE_ATTR(idle_count, 0444, show_idle_count, NULL);
-
-static ssize_t show_idle_time(struct device *dev,
- struct device_attribute *attr, char *buf)
-{
- struct s390_idle_data *idle = &per_cpu(s390_idle, dev->id);
-
- return sysfs_emit(buf, "%lu\n", READ_ONCE(idle->idle_time) >> 12);
-}
-DEVICE_ATTR(idle_time_us, 0444, show_idle_time, NULL);
-
void arch_cpu_idle_enter(void)
{
}
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 439eab2fb67a..64f0a5617e86 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -1086,31 +1086,6 @@ static struct attribute_group cpu_common_attr_group = {
.attrs = cpu_common_attrs,
};
-static struct attribute *cpu_online_attrs[] = {
- &dev_attr_idle_count.attr,
- &dev_attr_idle_time_us.attr,
- NULL,
-};
-
-static struct attribute_group cpu_online_attr_group = {
- .attrs = cpu_online_attrs,
-};
-
-static int smp_cpu_online(unsigned int cpu)
-{
- struct cpu *c = per_cpu_ptr(&cpu_devices, cpu);
-
- return sysfs_create_group(&c->dev.kobj, &cpu_online_attr_group);
-}
-
-static int smp_cpu_pre_down(unsigned int cpu)
-{
- struct cpu *c = per_cpu_ptr(&cpu_devices, cpu);
-
- sysfs_remove_group(&c->dev.kobj, &cpu_online_attr_group);
- return 0;
-}
-
bool arch_cpu_is_hotpluggable(int cpu)
{
return !!cpu;
@@ -1176,18 +1151,13 @@ static DEVICE_ATTR_WO(rescan);
static int __init s390_smp_init(void)
{
struct device *dev_root;
- int rc;
+ int rc = 0;
dev_root = bus_get_dev_root(&cpu_subsys);
if (dev_root) {
rc = device_create_file(dev_root, &dev_attr_rescan);
put_device(dev_root);
- if (rc)
- return rc;
}
- rc = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "s390/smp:online",
- smp_cpu_online, smp_cpu_pre_down);
- rc = rc <= 0 ? rc : 0;
return rc;
}
subsys_initcall(s390_smp_init);
--
2.51.0
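For reference, the removed idle_time_us file reported idle_time >> 12. The
shift follows from the TOD clock format, where the low 12 bits are
sub-microseconds (see the us/sus fields of union tod_clock). A minimal sketch
of the unit conversions, with hypothetical helper names that do not claim to
match the kernel's own tod_to_ns():

```c
#include <assert.h>
#include <stdint.h>

/* One microsecond equals 4096 TOD units, so drop the 12 low-order bits. */
static uint64_t tod_delta_to_us(uint64_t tod)
{
	return tod >> 12;
}

/*
 * ns = tod * 1000 / 4096 = tod * 125 / 512. The value is split into
 * high and low parts so that the multiplication cannot overflow 64 bits.
 */
static uint64_t tod_delta_to_ns(uint64_t tod)
{
	return ((tod >> 9) * 125) + (((tod & 0x1ff) * 125) >> 9);
}

/* Small self-check showing the expected results for round values. */
void tod_selfcheck(void)
{
	assert(tod_delta_to_us(4096) == 1);
	assert(tod_delta_to_ns(4096) == 1000);
	assert(tod_delta_to_ns(512) == 125);
}
```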
* Re: [RFC PATCH 1/3] fixup! s390/time: Prepare to stop elapsing in dynticks-idle
2026-02-25 14:51 ` [RFC PATCH 1/3] fixup! s390/time: Prepare to stop elapsing in dynticks-idle Heiko Carstens
@ 2026-03-11 15:14 ` Frederic Weisbecker
0 siblings, 0 replies; 8+ messages in thread
From: Frederic Weisbecker @ 2026-03-11 15:14 UTC (permalink / raw)
To: Heiko Carstens
Cc: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, linux-kernel, linux-s390
On Wed, Feb 25, 2026 at 03:51:44PM +0100, Heiko Carstens wrote:
> This should be merged with "s390/time: Prepare to stop elapsing in
> dynticks-idle".
>
> It makes sure that idle->clock_idle_enter is always set when loading the
> idle psw. Otherwise the idle_time calculation in account_idle_time_irq()
> would be incorrect.
>
> Also "revert" some not needed code movements and whitespace changes to keep
> the diff minimal.
>
> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Thank you, I'm folding this into the patch.
--
Frederic Weisbecker
SUSE Labs
* Re: [RFC PATCH 2/3] s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait()
2026-02-25 14:51 ` [RFC PATCH 2/3] s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait() Heiko Carstens
@ 2026-03-11 16:13 ` Frederic Weisbecker
2026-03-20 12:30 ` Heiko Carstens
0 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2026-03-11 16:13 UTC (permalink / raw)
To: Heiko Carstens
Cc: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, linux-kernel, linux-s390
On Wed, Feb 25, 2026 at 03:51:45PM +0100, Heiko Carstens wrote:
> diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
> index 4873fe9d891b..19fd1541a0f3 100644
> --- a/arch/s390/kernel/entry.S
> +++ b/arch/s390/kernel/entry.S
> @@ -378,8 +378,13 @@ SYM_CODE_END(pgm_check_handler)
> SYM_CODE_START(\name)
> STMG_LC %r8,%r15,__LC_SAVE_AREA
> GET_LC %r13
> - stckf __LC_INT_CLOCK(%r13)
> + lg %r12,__LC_IDLE_DATA(%r13)
> + asi __IDLE_SEQUENCE(%r12),1
> + stcke __LC_INT_CLOCK(%r13)
> stpt __LC_SYS_ENTER_TIMER(%r13)
> + mvc __IDLE_CLOCK_EXIT(16,%r12),__LC_INT_CLOCK(%r13)
> + ALTERNATIVE "bcr 15,0", "bcr 14,0", ALT_FACILITY(45)
> + asi __IDLE_SEQUENCE(%r12),1
Would it be possible to instead do that with &kc->idle_sleeptime_seq ?
This should sum up to a simple increment as well. This way you don't need
those nested seqcounts.
Thanks.
--
Frederic Weisbecker
SUSE Labs
* Re: [RFC PATCH 3/3] s390/idle: Remove idle time and count sysfs files
2026-02-25 14:51 ` [RFC PATCH 3/3] s390/idle: Remove idle time and count sysfs files Heiko Carstens
@ 2026-03-11 16:14 ` Frederic Weisbecker
0 siblings, 0 replies; 8+ messages in thread
From: Frederic Weisbecker @ 2026-03-11 16:14 UTC (permalink / raw)
To: Heiko Carstens
Cc: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, linux-kernel, linux-s390
On Wed, Feb 25, 2026 at 03:51:46PM +0100, Heiko Carstens wrote:
> Remove the s390 specific idle_time_us and idle_count per-cpu sysfs
> files. They do not provide an additional value. The risk that there are
> existing applications which rely on these architecture specific files
> should be very low.
>
> However if it turns out such applications exist, this can be easily
> reverted.
>
> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
--
Frederic Weisbecker
SUSE Labs
* Re: [RFC PATCH 2/3] s390/idle: Provide arch specific kcpustat_field_idle()/kcpustat_field_iowait()
2026-03-11 16:13 ` Frederic Weisbecker
@ 2026-03-20 12:30 ` Heiko Carstens
0 siblings, 0 replies; 8+ messages in thread
From: Heiko Carstens @ 2026-03-20 12:30 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Alexander Gordeev, Sven Schnelle, Vasily Gorbik,
Christian Borntraeger, linux-kernel, linux-s390
On Wed, Mar 11, 2026 at 05:13:59PM +0100, Frederic Weisbecker wrote:
> Le Wed, Feb 25, 2026 at 03:51:45PM +0100, Heiko Carstens a écrit :
> > diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
> > index 4873fe9d891b..19fd1541a0f3 100644
> > --- a/arch/s390/kernel/entry.S
> > +++ b/arch/s390/kernel/entry.S
> > @@ -378,8 +378,13 @@ SYM_CODE_END(pgm_check_handler)
> > SYM_CODE_START(\name)
> > STMG_LC %r8,%r15,__LC_SAVE_AREA
> > GET_LC %r13
> > - stckf __LC_INT_CLOCK(%r13)
> > + lg %r12,__LC_IDLE_DATA(%r13)
> > + asi __IDLE_SEQUENCE(%r12),1
> > + stcke __LC_INT_CLOCK(%r13)
> > stpt __LC_SYS_ENTER_TIMER(%r13)
> > + mvc __IDLE_CLOCK_EXIT(16,%r12),__LC_INT_CLOCK(%r13)
> > + ALTERNATIVE "bcr 15,0", "bcr 14,0", ALT_FACILITY(45)
> > + asi __IDLE_SEQUENCE(%r12),1
>
> Would it be possible to instead do that with &kc->idle_sleeptime_seq ?
> This should sum up to a simple increment as well. This way you don't need
> those nested seqcounts.
Yes, that was my first thought when implementing this. But I decided
against it since I thought it wouldn't be such a good idea to directly
access members of common code locking structures from asm
code. However, since you mention this too, I'll reconsider :)
Please move forward with your current patch set and ignore this patch
for now. I'll provide an updated version later; however, I will be on
vacation for now.