From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 97E8D36605D for ; Tue, 24 Feb 2026 08:32:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771921974; cv=none; b=PBQ+BavDkPZRUH7Z9GoxeCiaMN64S5pFDyoKoUF73yMSVeyFdPdR/IYVALxI1yLAdyY1X+QVHOOFDaCawYpGXVNgN7gOPwge6iBvAiON0MpK6H1GOK3V/pV/EV/Fv8v0FqAkGgLmVCGNapYxvFgJd4QGjrDcL/g6dK48NkbA2Rw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771921974; c=relaxed/simple; bh=nCZP3O1jlpF+ZnxgoXds4SGgmSUGSU70yB6njTsd0Yo=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version:Content-Type; b=UxapzEcrJQD2F6eDjUxZlojhnaQwAhdmcAPHrOasW2fBo7AlijvF4LcZjFMmYp+JDZ263SqvrIJltxU9HKoZjGrrFXiOS58BqxCP1v/jufYM77gQo+Wh1xYxLcFMhKP+aaYDiMCD43/MGZ1jpXa45lMoJa6EUK6tgAA6rHD/ZMA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=eXh44PyN; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="eXh44PyN" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5344AC116D0; Tue, 24 Feb 2026 08:32:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1771921974; bh=nCZP3O1jlpF+ZnxgoXds4SGgmSUGSU70yB6njTsd0Yo=; h=From:To:Cc:Subject:Date:From; b=eXh44PyNk/IaeWu3JL1l0mJO2lCQIHCHTPJ1SlU24qGD6sKkYzWCkwfCOBRzAlSYh QUyGCa8YJ3h78AXYshcuqeqUdsCp1cwCNf+CBVamYXKEfDhqtLvtTF047VJDiV/0lO rW/f725NmtyQ/1srLfVnyUPUfQhecwaxw5OSeuKFalBxQHQUCWITdaqUiWiZ3yU27w /ld2t8b72S/h/eX1sZPnbsVDw4wIt+/FQHHQg31wW+2cAsa4figV0JLbB7aDeZfqxq ISvdVSETciTnhCgBZm2ulYa0BvPoc5PNQAbipa3GQIIz+Tm1vKTy5Q51V2S8c3yO+R c+44g8OkRUSvA== From: Thomas Gleixner To: LKML Cc: Peter Zijlstra , "Rafael J. Wysocki" , Frederic Weisbecker Subject: [PATCH RFC] tick/sched: Prevent pointless NOHZ transitions Date: Tue, 24 Feb 2026 09:32:50 +0100 Message-ID: <875x7mv8wd.ffs@tglx> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain During a hackbench run with a fully loaded machine CPUs go briefly idle when they run out of tasks, which is expected. What's not expected are pointless NOHZ transitions like this: hackbench-1915 [001] d..2. 84.086755: sched_switch: prev_comm=hackbench prev_pid=1915 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 1) -0 [001] dn.2. 84.086757: hrtimer_start: hrtimer=00000000db1ede74 function=tick_nohz_handler expires=305340000000 softexpires=305340000000 mode=ABS|PINNED|HARD was_armed=1 -0 [001] dn.2. 84.086757: hrtimer_rearm: next_event=83885523974 deferred=0 2) -0 [001] dN.2. 84.086761: hrtimer_start: hrtimer=00000000db1ede74 function=tick_nohz_handler expires=82950000000 softexpires=82950000000 mode=ABS|PINNED|HARD was_armed=1 -0 [001] dN.2. 84.086761: hrtimer_rearm: next_event=82950000000 deferred=0 -0 [001] d..2. 84.086767: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=hackbench next_pid=2138 next_prio=120 hackbench-2138 [001] d..2. 84.086779: sched_switch: prev_comm=hackbench prev_pid=2138 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 #1 switches to NOHZ mode targeting the next expiring timer and #2 switches back to tick mode a whopping 4us later. This happens with both TEO and MENU governors in a VM guest. That's not only pointless it's also a performance issue as each rearm of the timer implies a VM exit. Keep track of the idle time with a moving average and check it for being larger than TICK_NSEC in can_stop_idle_tick(). That cures this behaviour while still allowing the system to go into long idle sleeps once the work load stopped. Signed-off-by: Thomas Gleixner --- kernel/time/tick-sched.c | 20 +++++++++++++++++--- kernel/time/tick-sched.h | 9 +++++++++ 2 files changed, 26 insertions(+), 3 deletions(-) --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -751,6 +751,16 @@ static void tick_nohz_update_jiffies(kti touch_softlockup_watchdog_sched(); } +static void tick_nohz_update_idle_duration(struct tick_sched *ts, ktime_t now) +{ + ktime_t delta = now - ts->idle_dur_entry; + unsigned int idx = ts->idle_dur_idx; + + ts->idle_dur_sum += delta - ts->idle_dur[idx]; + ts->idle_dur[idx] = delta; + ts->idle_dur_idx = (idx + 1) & IDLE_DUR_MASK; +} + static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now) { ktime_t delta; @@ -760,6 +770,8 @@ static void tick_nohz_stop_idle(struct t delta = ktime_sub(now, ts->idle_entrytime); + tick_nohz_update_idle_duration(ts, now); + write_seqcount_begin(&ts->idle_sleeptime_seq); if (nr_iowait_cpu(smp_processor_id()) > 0) ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta); @@ -1224,7 +1236,7 @@ static bool can_stop_idle_tick(int cpu, return false; } - return true; + return ts->idle_dur_sum > TICK_NSEC * IDLE_DUR_ENTRIES; } /** @@ -1292,6 +1304,7 @@ void tick_nohz_idle_enter(void) tick_sched_flag_set(ts, TS_FLAG_INIDLE); tick_nohz_start_idle(ts); + ts->idle_dur_entry = ts->idle_entrytime; local_irq_enable(); } @@ -1490,11 +1503,12 @@ void tick_nohz_idle_exit(void) idle_active = tick_sched_flag_test(ts, TS_FLAG_IDLE_ACTIVE); tick_stopped = tick_sched_flag_test(ts, TS_FLAG_STOPPED); - if (idle_active || tick_stopped) - now = ktime_get(); + now = ktime_get(); if (idle_active) tick_nohz_stop_idle(ts, now); + else + tick_nohz_update_idle_duration(ts, now); if (tick_stopped) tick_nohz_idle_update_tick(ts, now); --- a/kernel/time/tick-sched.h +++ b/kernel/time/tick-sched.h @@ -30,6 +30,9 @@ struct tick_device { /* High resolution tick mode */ #define TS_FLAG_HIGHRES BIT(5) +#define IDLE_DUR_ENTRIES 8 +#define IDLE_DUR_MASK (IDLE_DUR_ENTRIES - 1) + /** * struct tick_sched - sched tick emulation and no idle tick control/stats * @@ -95,6 +98,12 @@ struct tick_sched { ktime_t idle_sleeptime; ktime_t iowait_sleeptime; + /* Idle duration */ + ktime_t idle_dur[IDLE_DUR_ENTRIES]; + ktime_t idle_dur_entry; + ktime_t idle_dur_sum; + unsigned int idle_dur_idx; + /* Full dynticks handling */ atomic_t tick_dep_mask;