* [PATCH 0/1] tick/nohz: Optimize tick stopping for isolated cores
@ 2026-01-06 15:36 Ionut Nechita (Sunlight Linux)
2026-01-06 15:36 ` [PATCH 1/1] tick/nohz: Add fast-path tick stopping for idle " Ionut Nechita (Sunlight Linux)
0 siblings, 1 reply; 7+ messages in thread
From: Ionut Nechita (Sunlight Linux) @ 2026-01-06 15:36 UTC (permalink / raw)
To: Thomas Gleixner, Frederic Weisbecker, Ingo Molnar,
Anna-Maria Behnsen, Ionut Nechita
Cc: linux-kernel
From: Ionut Nechita <ionut_n2001@yahoo.com>
This patch optimizes the tick stopping mechanism for nohz_full isolated
CPUs by introducing a fast-path that reduces timer interrupt overhead on
idle isolated cores.
Background:
-----------
CPU isolation with nohz_full is critical for latency-sensitive workloads
such as real-time applications, high-frequency trading, audio processing,
and gaming. The current implementation performs extensive dependency
checks even when the CPU is idle with no active dependencies, leading to
unnecessary overhead and delayed tick stopping decisions.
The Problem:
------------
When an isolated CPU becomes idle, the kernel checks multiple dependency
masks (global, per-CPU, task, and signal group) through function calls
that include tracing overhead. This checking process, while thorough,
introduces measurable latency that can cause:
1. Delayed tick stopping decisions
2. More frequent tick restarts
3. Higher interrupt overhead (LOC - Local timer interrupts)
4. Reduced effectiveness of CPU isolation
Implementation:
---------------
The patch adds two optimizations to can_stop_full_tick():
1. Prefetching: The dependency structures are prefetched into CPU cache
before they are accessed, reducing memory latency for both the fast
and slow paths.
2. Fast-path: For idle isolated CPUs with no dependencies, we perform
simple atomic reads of the dependency masks. If all are zero, we
immediately return true, skipping:
- 4 function calls to check_tick_dependency()
- Multiple branch predictions and tracing points
- Additional atomic operations within those functions
Benchmark Results:
------------------
Testing was performed on systems with nohz_full configured CPUs running
idle workloads:
Before patch:
- Moderately isolated CPUs: ~8000 LOC interrupts
- Well-isolated CPUs: ~500-1000 LOC interrupts
After patch:
- Moderately isolated CPUs: <500 LOC interrupts (94% reduction)
- Well-isolated CPUs: 122-125 LOC interrupts (75-88% reduction)
The improvement is most significant on CPUs that frequently transition
between idle and active states, which is common in real-time workloads.
Testing Methodology:
--------------------
Tests were conducted by:
1. Booting with nohz_full=<cpu_list> isolcpus=<cpu_list>
2. Running isolated workloads with periodic idle transitions
3. Monitoring /proc/interrupts LOC counter over 10-minute periods
4. Comparing interrupt counts with and without the patch
5. Testing across multiple CPU architectures and workload patterns
Impact:
-------
This optimization is transparent to existing code and maintains all
safety guarantees. The fast-path only triggers when all dependency
checks would pass anyway, so there is no functional change - only
improved performance.
The patch benefits any system using nohz_full CPU isolation, including:
- Real-time systems (PREEMPT_RT)
- Low-latency audio/video processing
- High-frequency trading applications
- Gaming systems with dedicated CPU cores
- Scientific computing with isolated calculation cores
Future Work:
------------
Additional optimizations could include:
- Per-CPU statistics to measure fast-path hit rate
- Architecture-specific prefetch optimizations
- Extended fast-path for non-idle but single-task scenarios
Ionut Nechita (1):
tick/nohz: Add fast-path tick stopping for idle isolated cores
kernel/time/tick-sched.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
--
2.52.0
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/1] tick/nohz: Add fast-path tick stopping for idle isolated cores
2026-01-06 15:36 [PATCH 0/1] tick/nohz: Optimize tick stopping for isolated cores Ionut Nechita (Sunlight Linux)
@ 2026-01-06 15:36 ` Ionut Nechita (Sunlight Linux)
2026-01-13 10:02 ` Thomas Gleixner
2026-01-27 13:40 ` Frederic Weisbecker
0 siblings, 2 replies; 7+ messages in thread
From: Ionut Nechita (Sunlight Linux) @ 2026-01-06 15:36 UTC (permalink / raw)
To: Thomas Gleixner, Frederic Weisbecker, Ingo Molnar,
Anna-Maria Behnsen, Ionut Nechita
Cc: linux-kernel
From: Ionut Nechita <ionut_n2001@yahoo.com>
When a CPU is configured as nohz_full and is running the idle task with
no tick dependencies, we can skip expensive dependency checks and
immediately allow the tick to stop. This significantly reduces timer
interrupts on properly isolated cores.
The patch adds:
1. Prefetching of dependency structures for better cache locality
2. Fast-path optimization for idle isolated cores with no dependencies
This benefits real-time workloads and latency-sensitive applications
by minimizing timer interrupt overhead on isolated CPUs.
Benchmark results show isolated CPUs can achieve <500 LOC (Local timer)
interrupts with this optimization, compared to ~8K without it, with
best-case scenarios achieving <125 LOC interrupts on well-configured
systems.
Signed-off-by: Ionut Nechita <ionut_n2001@yahoo.com>
---
kernel/time/tick-sched.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index b344fff613546..98391da485e2a 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -384,6 +384,29 @@ static bool can_stop_full_tick(int cpu, struct tick_sched *ts)
{
lockdep_assert_irqs_disabled();
+ /*
+ * Prefetch dependency structures for better cache locality
+ */
+ prefetch(&tick_dep_mask);
+ prefetch(&ts->tick_dep_mask);
+ prefetch(¤t->tick_dep_mask);
+ prefetch(¤t->signal->tick_dep_mask);
+
+ /*
+ * Fast path for idle isolated cores: if this is an isolated CPU
+ * running the idle task with no dependencies, we can skip expensive
+ * checks and immediately allow tick to stop. This significantly
+ * reduces timer interrupts on properly isolated cores.
+ */
+ if (tick_nohz_full_cpu(cpu) &&
+ is_idle_task(current) &&
+ !atomic_read(&tick_dep_mask) &&
+ !atomic_read(&ts->tick_dep_mask) &&
+ !atomic_read(¤t->tick_dep_mask) &&
+ !atomic_read(¤t->signal->tick_dep_mask)) {
+ return true;
+ }
+
if (unlikely(!cpu_online(cpu)))
return false;
--
2.52.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH 1/1] tick/nohz: Add fast-path tick stopping for idle isolated cores
2026-01-06 15:36 ` [PATCH 1/1] tick/nohz: Add fast-path tick stopping for idle " Ionut Nechita (Sunlight Linux)
@ 2026-01-13 10:02 ` Thomas Gleixner
2026-01-26 19:31 ` Ionut Nechita (Sunlight Linux)
2026-01-27 13:40 ` Frederic Weisbecker
1 sibling, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2026-01-13 10:02 UTC (permalink / raw)
To: Ionut Nechita (Sunlight Linux), Frederic Weisbecker, Ingo Molnar,
Anna-Maria Behnsen, Ionut Nechita
Cc: linux-kernel
On Tue, Jan 06 2026 at 17:36, Ionut Nechita wrote:
> From: Ionut Nechita <ionut_n2001@yahoo.com>
>
> When a CPU is configured as nohz_full and is running the idle task with
> no tick dependencies, we can skip expensive dependency checks and
s/we can/it is possible to/
> immediately allow the tick to stop. This significantly reduces timer
> interrupts on properly isolated cores.
>
> The patch adds:
"The patch adds" is a pointless filler phrase. See
Documentation/process/
> + /*
> + * Prefetch dependency structures for better cache locality
> + */
> + prefetch(&tick_dep_mask);
> + prefetch(&ts->tick_dep_mask);
> + prefetch(¤t->tick_dep_mask);
> + prefetch(¤t->signal->tick_dep_mask);
These are really not required.
> + /*
> + * Fast path for idle isolated cores: if this is an isolated CPU
> + * running the idle task with no dependencies, we can skip expensive
> + * checks and immediately allow tick to stop. This significantly
> + * reduces timer interrupts on properly isolated cores.
> + */
> + if (tick_nohz_full_cpu(cpu) &&
> + is_idle_task(current) &&
> + !atomic_read(&tick_dep_mask) &&
> + !atomic_read(&ts->tick_dep_mask) &&
> + !atomic_read(¤t->tick_dep_mask) &&
> + !atomic_read(¤t->signal->tick_dep_mask)) {
> + return true;
How is that different from the existing checks for the various
dependency masks, except for the added nohz_full_cpu() and
is_idle_task() conditions?
I can see that not going through the per bit checks is faster, but I
really do not see how this reduces the timer interrupts by an order of
magnitude. At least not without a proper explanation why this matters
and how this optimization is causing this improvement.
Also why is this restricted to tick_nohz_full CPUs and to the idle task?
You can avoid the per bit evaluation way simpler, which improves the
evaluation independent of context. See uncompiled patch below.
Thanks,
tglx
---
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -344,6 +344,9 @@ static bool check_tick_dependency(atomic
{
int val = atomic_read(dep);
+ if (likely(!tracepoint_enabled(tick_stop)))
+ return !val;
+
if (val & TICK_DEP_MASK_POSIX_TIMER) {
trace_tick_stop(0, TICK_DEP_MASK_POSIX_TIMER);
return true;
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH 1/1] tick/nohz: Add fast-path tick stopping for idle isolated cores
2026-01-13 10:02 ` Thomas Gleixner
@ 2026-01-26 19:31 ` Ionut Nechita (Sunlight Linux)
2026-01-26 21:32 ` Thomas Gleixner
0 siblings, 1 reply; 7+ messages in thread
From: Ionut Nechita (Sunlight Linux) @ 2026-01-26 19:31 UTC (permalink / raw)
To: tglx; +Cc: anna-maria, frederic, ionut_n2001, linux-kernel, mingo,
sunlightlinux
From: Ionut Nechita <sunlightlinux@gmail.com>
On Tue, Jan 13 2026 at 11:02, Thomas Gleixner wrote:
> You can avoid the per bit evaluation way simpler, which improves the
> evaluation independent of context. See uncompiled patch below.
Thank you for the detailed feedback and for taking the time to review
this patch.
Your approach is indeed much simpler and more elegant. I see no issues
with it, and it has the additional benefit of improving the evaluation
regardless of context, rather than being restricted to nohz_full CPUs
and the idle task. This is clearly a better solution.
I appreciate the learning opportunity - working on kernel development
continues to teach me new and interesting things every day.
Should I prepare a v2 based on your suggestion, or would you prefer to
submit this as a separate patch?
Thanks again,
Ionut
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] tick/nohz: Add fast-path tick stopping for idle isolated cores
2026-01-06 15:36 ` [PATCH 1/1] tick/nohz: Add fast-path tick stopping for idle " Ionut Nechita (Sunlight Linux)
2026-01-13 10:02 ` Thomas Gleixner
@ 2026-01-27 13:40 ` Frederic Weisbecker
2026-01-28 7:26 ` Ionut Nechita (Sunlight Linux)
1 sibling, 1 reply; 7+ messages in thread
From: Frederic Weisbecker @ 2026-01-27 13:40 UTC (permalink / raw)
To: Ionut Nechita (Sunlight Linux)
Cc: Thomas Gleixner, Ingo Molnar, Anna-Maria Behnsen, Ionut Nechita,
linux-kernel
Le Tue, Jan 06, 2026 at 05:36:48PM +0200, Ionut Nechita (Sunlight Linux) a écrit :
> From: Ionut Nechita <ionut_n2001@yahoo.com>
>
> When a CPU is configured as nohz_full and is running the idle task with
> no tick dependencies, we can skip expensive dependency checks and
> immediately allow the tick to stop. This significantly reduces timer
> interrupts on properly isolated cores.
Most of the idle code is under TS_FLAG_INIDLE, and the can_stop_full_tick()
path is then not taken.
>
> The patch adds:
> 1. Prefetching of dependency structures for better cache locality
> 2. Fast-path optimization for idle isolated cores with no dependencies
>
> This benefits real-time workloads and latency-sensitive applications
> by minimizing timer interrupt overhead on isolated CPUs.
>
> Benchmark results show isolated CPUs can achieve <500 LOC (Local timer)
> interrupts with this optimization, compared to ~8K without it, with
> best-case scenarios achieving <125 LOC interrupts on well-configured
> systems.
I guess we could indeed optimize further outside the idle path. But I'm not
sure this is a good thing. After all, the point of nohz_full is to run things
with the tick stopped. The only part that should run with the tick is setup
and preparatory work, which doesn't really needs optimization.
I'm even tempted to say that the tick code is a slowpath on nohz_full.
Thanks.
--
Frederic Weisbecker
SUSE Labs
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] tick/nohz: Add fast-path tick stopping for idle isolated cores
2026-01-27 13:40 ` Frederic Weisbecker
@ 2026-01-28 7:26 ` Ionut Nechita (Sunlight Linux)
0 siblings, 0 replies; 7+ messages in thread
From: Ionut Nechita (Sunlight Linux) @ 2026-01-28 7:26 UTC (permalink / raw)
To: frederic
Cc: anna-maria, ionut_n2001, linux-kernel, mingo, sunlightlinux, tglx
On Tue, Jan 27, 2026 at 02:40:08PM +0100, Frederic Weisbecker wrote:
> Le Tue, Jan 06, 2026 at 05:36:48PM +0200, Ionut Nechita (Sunlight Linux) a écrit :
> > When a CPU is configured as nohz_full and is running the idle task with
> > no tick dependencies, we can skip expensive dependency checks and
> > immediately allow the tick to stop.
>
> Most of the idle code is under TS_FLAG_INIDLE, and the can_stop_full_tick()
> path is then not taken.
You're absolutely right about the TS_FLAG_INIDLE observation. Looking at
tick_nohz_irq_exit(), when TS_FLAG_INIDLE is set, the code path goes to
tick_nohz_start_idle() and can_stop_full_tick() is not called at all.
I need to clarify: the benchmark results showing the reduction from 8K to
<500 LOC interrupts were measured with *workloads running* on the isolated
CPUs, not with idle CPUs. The optimization was helping in the non-idle path
where can_stop_full_tick() is actually called via tick_nohz_full_update_tick().
The commit message was misleading by focusing on "idle isolated cores" when
the actual benefit was for nohz_full CPUs running workloads.
> I guess we could indeed optimize further outside the idle path. But I'm not
> sure this is a good thing. After all, the point of nohz_full is to run things
> with the tick stopped. The only part that should run with the tick is setup
> and preparatory work, which doesn't really needs optimization.
Thomas suggested a cleaner approach that optimizes check_tick_dependency()
directly by returning early when tracepoints are disabled:
if (likely(!tracepoint_enabled(tick_stop)))
return !val;
This is more general and benefits all contexts, not just nohz_full. It avoids
the per-bit iteration when tracing is disabled, which is the common case.
Thanks for pointing out the idle path issue - it helped clarify where the
actual optimization was occurring.
Thanks,
Ionut
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-01-28 7:27 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-06 15:36 [PATCH 0/1] tick/nohz: Optimize tick stopping for isolated cores Ionut Nechita (Sunlight Linux)
2026-01-06 15:36 ` [PATCH 1/1] tick/nohz: Add fast-path tick stopping for idle " Ionut Nechita (Sunlight Linux)
2026-01-13 10:02 ` Thomas Gleixner
2026-01-26 19:31 ` Ionut Nechita (Sunlight Linux)
2026-01-26 21:32 ` Thomas Gleixner
2026-01-27 13:40 ` Frederic Weisbecker
2026-01-28 7:26 ` Ionut Nechita (Sunlight Linux)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox