From: Marc Gonzalez <marc.w.gonzalez@free.fr>
To: linux-rt-users@vger.kernel.org
Cc: Daniel Wagner <dwagner@suse.de>,
Leon Woestenberg <leon@sidebranch.com>,
John Ogness <john.ogness@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Clark Williams <williams@redhat.com>,
Pavel Machek <pavel@denx.de>,
Luis Goncalves <lgoncalv@redhat.com>,
John McCalpin <john@mccalpin.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Ingo Molnar <mingo@kernel.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
"Ahmed S. Darwish" <darwi@linutronix.de>,
agner@agner.org, Dirk Beyer <dirk.beyer@lmu.de>,
Philipp Wendler <philipp.wendler@lmu.de>
Subject: Re: Unexplained variance in run-time of simple program (part 2)
Date: Thu, 26 Mar 2026 20:09:12 +0100
Message-ID: <5397d0cd-9266-44ae-97f2-75164d89bf48@free.fr>
In-Reply-To: <199905cb-04b3-4d3e-aeb3-da2b2d6428eb@free.fr>
[ Add Daniel Wagner + use different address for John McCalpin ]
On 26/03/2026 16:24, Marc Gonzalez wrote:
> Hello (again) everyone,
>
> Past discussion:
> Large(ish) variance induced by SCHED_FIFO / Unexplained variance in run-time of trivial program
> https://lore.kernel.org/linux-rt-users/0d87e3c3-8de1-4d98-802e-a292f63f1bf1@free.fr/
>
> SYNOPSIS:
> I have a simple(*) program.
> I want to know how long the program runs.
>
> (*) By simple, I mean:
> - no system calls, no library calls, just simple bit twiddling
> - tiny code, small(ish) dataset
> (the main function uses ~900 bytes of stack & recurses 40-60 times)
>
> GOAL: Run the program 25,000 times. Get the SAME(ish) cycle count 25,000 times.
>
> Running kernel v6.8 on Haswell i5-4590 3.3 GHz
>
> I have removed "all" sources of noise / jitter / variance in the system:
>
> A) kernel boots with:
> threadirqs irqaffinity=0-2 nohz=on nohz_full=3 isolcpus=3 rcu_nocbs=3 nosmt mitigations=off single
> i.e.
> - Expose ISRs as regular processes
> - No ISRs on CPU3
> - No timer interrupt on CPU3
> - No RCU callbacks on CPU3
> - 1 thread per core
> - No side-channel mitigations
> - Single user mode, no GUI, only 1 terminal
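
A cheap sanity check for step A is to confirm that the running kernel really received these parameters, since /proc/cmdline holds the command line the kernel was booted with. A minimal sketch, assuming a hypothetical helper named cmdline_has() (not part of the benchmark) that matches whole whitespace-separated tokens only:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Return true if `param` appears in `cmdline` as a complete
 * whitespace-separated token (so "nohz" does not match "nohz=on").
 */
bool cmdline_has(const char *cmdline, const char *param)
{
	size_t n = strlen(param);

	assert(n > 0);
	for (const char *p = cmdline; (p = strstr(p, param)) != NULL; p += n) {
		bool starts = (p == cmdline) || p[-1] == ' ' || p[-1] == '\n';
		bool ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';

		if (starts && ends)
			return true;
	}
	return false;
}
```

In practice one would read /proc/cmdline into a buffer with fopen()/fgets() and call cmdline_has(buf, "nohz_full=3") and so on for each parameter. This only shows that a parameter was passed, not that the kernel honoured it.
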
>
> B) before program runs:
> echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> for I in 0 1 2 3; do echo userspace > /sys/devices/system/cpu/cpu$I/cpufreq/scaling_governor; done
> for I in 0 1 2 3; do echo 2000000 > /sys/devices/system/cpu/cpu$I/cpufreq/scaling_setspeed; done
> sleep 0.5
> i.e.
> - Let SCHED_FIFO program monopolize a CPU
> - Pin CPU frequency to 2 GHz to avoid thermal throttling & disable turbo-boost
> - Give these settings time to settle
>
> C) start the benchmark:
> for I in $(seq 1 25000); do chrt -f 99 taskset -c 3 ./bench; done
> i.e.
> - Run as SCHED_FIFO 99 = no other user task can preempt the benchmark (hardware interrupts, SMIs and the per-CPU stopper thread still can)
> - Run the program on isolated CPU 3 where nothing else is running
> $ ps -eo psr,cls,pri,cmd --sort psr,pri
> 3 FF 139 [migration/3]
> 3 FF 90 [idle_inject/3]
> 3 TS 19 [cpuhp/3]
> 3 TS 19 [ksoftirqd/3]
> 3 TS 19 [kworker/3:0-events]
> 3 TS 19 [kworker/3:1]
>
> D) prepare to run the timed code:
> u64 v[1+4];
> int main_fd = open_event(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES, -1);
> open_event(PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, main_fd);
> open_event(PERF_TYPE_RAW, UOPS_EXECUTED, main_fd);
> open_event(PERF_TYPE_RAW, EXEC_STALLS, main_fd);
>
> void *ctx = init_ctx();
> solve_grid(ctx); // warm up all types of caches
>
> ioctl(main_fd, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
> solve_grid(ctx);
> if (read(main_fd, v, sizeof v) != (ssize_t)sizeof v) return 2; // read() returns ssize_t: compare signed, so that -1 is caught
>
> printf("%lu %lu %lu %lu\n", v[1], v[2], v[3], v[4]);
>
> - PERF_EVENT_IOC_RESET resets all counters to 0, so we're only measuring the actual program, not any setup/teardown system code.
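
The open_event() helper is not shown in the message. The following is only one plausible shape for it, built on the perf_event_open(2) system call; the split into fill_attr() plus open_event() and every attribute choice below are assumptions, not the code that produced the numbers. With PERF_FORMAT_GROUP, a read() on the group leader returns a u64 event count followed by one u64 value per event, which would match the u64 v[1+4] buffer and the printing of v[1]..v[4].

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Assumed attribute setup: the counter starts enabled (disabled = 0)
 * because the quoted snippet never issues PERF_EVENT_IOC_ENABLE.
 * Whether kernel-mode events are excluded is not stated in the
 * message, so exclude_kernel is left at its default of 0 here.
 */
void fill_attr(struct perf_event_attr *attr, uint32_t type, uint64_t config)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof *attr);
	attr->size = sizeof *attr;
	attr->type = type;
	attr->config = config;
	attr->read_format = PERF_FORMAT_GROUP; /* read() yields { nr, values[nr] } */
}

/*
 * Hypothetical reconstruction of open_event(): count on the calling
 * thread (pid = 0), on whichever CPU it runs (cpu = -1).
 * group_fd = -1 creates a new group leader.
 * Returns the new file descriptor, or -1 with errno set.
 */
int open_event(uint32_t type, uint64_t config, int group_fd)
{
	struct perf_event_attr attr;

	fill_attr(&attr, type, config);
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0);
}
```

glibc provides no wrapper for perf_event_open, hence the direct syscall(). The call can fail with EACCES or EPERM depending on /proc/sys/kernel/perf_event_paranoid, so the returned descriptor should be checked before use.
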
>
> The results are unexpected, disappointing, frustrating...
>
>
>     AA     BB     CC    DD
> $ head -5 sorted.RES.5
> 108018 186124 256147 23195
> 108412 186124 257228 23275
> 108637 186124 258963 23245
> 109103 186124 258598 23507
> 109167 186124 259715 23425
>
> $ tail -5 sorted.RES.5
> 123824 186124 266546 30949
> 124755 186122 266494 31749
> 124773 186124 264435 30966
> 126273 186122 267967 32376
> 130967 186124 284301 33597
>
> AA = PERF_COUNT_HW_CPU_CYCLES
> BB = PERF_COUNT_HW_INSTRUCTIONS
> CC = UOPS_EXECUTED
> DD = EXEC_STALLS
>
> It seems the program runs in ~108k cycles, but unexplained perturbations can delay
> the program by up to 23k cycles = 21% (108k + 23k = 131k in the worst observed case)
>
> BEST CASE vs WORST CASE
> 108018 186124 256147 23195
> 130967 186124 284301 33597
>
> Run-time: +21%
> I_count: identical
> uop_count: +11%
> exec_stalls: +45%
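
The percentages above follow directly from the best-case and worst-case rows. A minimal sketch of the arithmetic, using a hypothetical helper pct_increase() (not part of the benchmark) that rounds to the nearest whole percent:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * Relative increase of `worst` over `best`, in percent,
 * rounded to the nearest integer using integer arithmetic.
 */
unsigned pct_increase(uint64_t best, uint64_t worst)
{
	assert(best != 0 && worst >= best);
	return (unsigned)(((worst - best) * 100 + best / 2) / best);
}
```

For example, pct_increase(108018, 130967) gives 21 for the cycle count, pct_increase(256147, 284301) gives 11 for uops, and pct_increase(23195, 33597) gives 45 for stalls. The instruction count is the same in both rows, so its increase is 0.
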
>
> I don't see these wild deviations when I test toy programs that don't touch memory
> or only touch 1 word on the stack. So this seems to be memory-related?
> But everything fits in L1...
> Could there be some activity on other CPUs that forces cache-coherence shenanigans?
> I'm stumped :(
>
> Would appreciate any insight.
> Will re-read the previous thread for anything I might have missed.
>
> Regards