public inbox for linux-rt-users@vger.kernel.org
From: Marc Gonzalez <marc.w.gonzalez@free.fr>
To: linux-rt-users@vger.kernel.org
Cc: Leon Woestenberg <leon@sidebranch.com>,
	John Ogness <john.ogness@linutronix.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Clark Williams <williams@redhat.com>,
	Pavel Machek <pavel@denx.de>,
	Luis Goncalves <lgoncalv@redhat.com>,
	John McCalpin <mccalpin@tacc.utexas.edu>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	"Ahmed S. Darwish" <darwi@linutronix.de>,
	agner@agner.org, Dirk Beyer <dirk.beyer@lmu.de>,
	Philipp Wendler <philipp.wendler@lmu.de>
Subject: Unexplained variance in run-time of simple program (part 2)
Date: Thu, 26 Mar 2026 16:24:16 +0100
Message-ID: <199905cb-04b3-4d3e-aeb3-da2b2d6428eb@free.fr>

Hello (again) everyone,

Past discussion:
Large(ish) variance induced by SCHED_FIFO / Unexplained variance in run-time of trivial program
https://lore.kernel.org/linux-rt-users/0d87e3c3-8de1-4d98-802e-a292f63f1bf1@free.fr/

SYNOPSIS:
I have a simple(*) program.
I want to know how long the program runs.

(*) By simple, I mean:
- no system calls, no library calls, just simple bit twiddling
- tiny code, small(ish) dataset
(the main function uses ~900 bytes of stack & recurses 40-60 times)

GOAL: Run the program 25,000 times. Get the SAME(ish) cycle count 25,000 times.

Running kernel v6.8 on Haswell i5-4590 3.3 GHz

I have removed "all" sources of noise / jitter / variance in the system:

A) kernel boots with:
threadirqs irqaffinity=0-2 nohz=on nohz_full=3 isolcpus=3 rcu_nocbs=3 nosmt mitigations=off single
i.e.
- Expose ISRs as regular processes
- No ISRs on CPU3
- No timer interrupt on CPU3
- No RCU callbacks on CPU3
- 1 thread per core
- No side-channel mitigations
- Single user mode, no GUI, only 1 terminal

B) before program runs:
echo -1 > /proc/sys/kernel/sched_rt_runtime_us
for I in 0 1 2 3; do echo userspace > /sys/devices/system/cpu/cpu$I/cpufreq/scaling_governor; done
for I in 0 1 2 3; do echo   2000000 > /sys/devices/system/cpu/cpu$I/cpufreq/scaling_setspeed; done
sleep 0.5
i.e.
- Let SCHED_FIFO program monopolize a CPU
- Pin CPU frequency to 2 GHz to avoid thermal throttling & disable turbo-boost
- Give these settings time to settle

C) start the benchmark:
for I in $(seq 1 25000); do chrt -f 99 taskset -c 3 ./bench; done
i.e.
- Run as SCHED_FIFO 99 = no ordinary task should be able to preempt the benchmark
- Run the program on isolated CPU 3 where nothing else is running
$ ps -eo psr,cls,pri,cmd --sort psr,pri
  3  FF 139 [migration/3]
  3  FF  90 [idle_inject/3]
  3  TS  19 [cpuhp/3]
  3  TS  19 [ksoftirqd/3]
  3  TS  19 [kworker/3:0-events]
  3  TS  19 [kworker/3:1]
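
A minimal sketch (not part of the original benchmark) of how ./bench could check at startup that the chrt / taskset settings from step C actually took effect. The helper names mask_is_only_cpu() and check_isolation() are hypothetical; sched_getscheduler(2) and sched_getaffinity(2) are the standard glibc wrappers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: true if the mask holds exactly one CPU, namely `cpu`. */
int mask_is_only_cpu(const cpu_set_t *set, int cpu)
{
	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	return CPU_COUNT(set) == 1 && CPU_ISSET(cpu, set);
}

/*
 * Hypothetical helper: returns 1 if the calling thread runs under
 * SCHED_FIFO and is allowed on `cpu` only, 0 otherwise.
 */
int check_isolation(int cpu)
{
	cpu_set_t set;

	if (sched_getscheduler(0) != SCHED_FIFO)
		return 0;
	if (sched_getaffinity(0, sizeof set, &set) != 0)
		return 0;
	return mask_is_only_cpu(&set, cpu);
}
```

Calling `if (!check_isolation(3)) return 1;` at the top of main() would flag any run that was started without the chrt / taskset wrapper.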

D) prepare to run the timed code:
	u64 v[1+4];
	int main_fd = open_event(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES, -1);
	open_event(PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, main_fd);
	open_event(PERF_TYPE_RAW, UOPS_EXECUTED, main_fd);
	open_event(PERF_TYPE_RAW, EXEC_STALLS, main_fd);

	void *ctx = init_ctx();
	solve_grid(ctx); // warm up all types of caches

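	ioctl(main_fd, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
	solve_grid(ctx);
	// read() returns ssize_t: compare as signed so that a -1 error is caught
	if (read(main_fd, v, sizeof v) != (ssize_t)sizeof v) return 2;

	// group read layout: v[0] = number of events, v[1..4] = counter values
	printf("%lu %lu %lu %lu\n", v[1], v[2], v[3], v[4]);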

- PERF_EVENT_IOC_RESET resets all counters to 0, so we're only measuring the actual program, not any setup/teardown system code.
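
The open_event() helper is not shown above. A minimal sketch of what such a helper might look like follows, assuming a plain perf_event_open(2) call with PERF_FORMAT_GROUP, so that a single read() on the leader returns { nr, value[0..nr-1] }, which matches the u64 v[1+4] buffer. The make_attr() name and the choice to leave every exclude_* bit at 0 are assumptions, not taken from the real program.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical: build the attribute block for one counter of the group. */
struct perf_event_attr make_attr(uint32_t type, uint64_t config)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof attr);
	attr.size = sizeof attr;
	attr.type = type;       /* e.g. PERF_TYPE_HARDWARE or PERF_TYPE_RAW */
	attr.config = config;   /* e.g. PERF_COUNT_HW_CPU_CYCLES */
	/* Group read: one read() on the leader returns all counters. */
	attr.read_format = PERF_FORMAT_GROUP;
	/* Assumption: exclude_kernel / exclude_hv stay 0 (count everything). */
	return attr;
}

/* Hypothetical: group_fd == -1 creates a leader, otherwise joins its group. */
int open_event(uint32_t type, uint64_t config, int group_fd)
{
	struct perf_event_attr attr = make_attr(type, config);

	assert(group_fd >= -1);
	/* pid = 0: calling thread, cpu = -1: any CPU, flags = 0 */
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0);
}
```

glibc provides no wrapper for perf_event_open, hence the raw syscall(). The return value is a file descriptor, or -1 with errno set.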

The results are unexpected, disappointing, frustrating...


  AA     BB     CC    DD
$ head -5 sorted.RES.5
108018 186124 256147 23195
108412 186124 257228 23275
108637 186124 258963 23245
109103 186124 258598 23507
109167 186124 259715 23425

$ tail -5 sorted.RES.5
123824 186124 266546 30949
124755 186122 266494 31749
124773 186124 264435 30966
126273 186122 267967 32376
130967 186124 284301 33597

AA = PERF_COUNT_HW_CPU_CYCLES
BB = PERF_COUNT_HW_INSTRUCTIONS
CC = UOPS_EXECUTED
DD = EXEC_STALLS

It seems the program runs in ~108k cycles, but unexplained perturbations can delay
the program by up to 23k cycles = 21% (108k + 23k = 131k in the worst observed case)

BEST CASE vs WORST CASE
108018 186124 256147 23195
130967 186124 284301 33597

Run-time: +21%
I_count: identical
uop_count: +11%
exec_stalls: +45%
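
For reference, the percentages above are plain relative increases over the best case. A minimal sketch of the arithmetic (the function name pct_increase() is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Relative increase of `worst` over `best`, in percent. */
double pct_increase(uint64_t best, uint64_t worst)
{
	assert(best > 0);
	return 100.0 * ((double)worst - (double)best) / (double)best;
}
```

Fed with the best and worst rows, pct_increase(108018, 130967), pct_increase(256147, 284301) and pct_increase(23195, 33597) come out at roughly 21%, 11% and 45% respectively.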

I don't see these wild deviations when I test toy programs that don't touch memory
or only touch 1 word on the stack. So this seems to be memory-related?
But everything fits in L1...
Could there be some activity on other CPUs that force cache-coherence shenanigans?
I'm stumped :(

Would appreciate any insight.
Will re-read the previous thread for anything I might have missed.

Regards
