public inbox for linux-rt-users@vger.kernel.org
From: Marc Gonzalez <marc.w.gonzalez@free.fr>
To: linux-rt-users@vger.kernel.org
Cc: Daniel Wagner <dwagner@suse.de>,
	Leon Woestenberg <leon@sidebranch.com>,
	John Ogness <john.ogness@linutronix.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Clark Williams <williams@redhat.com>,
	Pavel Machek <pavel@denx.de>,
	Luis Goncalves <lgoncalv@redhat.com>,
	John McCalpin <john@mccalpin.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	"Ahmed S. Darwish" <darwi@linutronix.de>,
	agner@agner.org, Dirk Beyer <dirk.beyer@lmu.de>,
	Philipp Wendler <philipp.wendler@lmu.de>
Subject: Re: Unexplained variance in run-time of simple program (part 2)
Date: Thu, 26 Mar 2026 20:09:12 +0100
Message-ID: <5397d0cd-9266-44ae-97f2-75164d89bf48@free.fr>
In-Reply-To: <199905cb-04b3-4d3e-aeb3-da2b2d6428eb@free.fr>

[ Add Daniel Wagner + use different address for John McCalpin ]

On 26/03/2026 16:24, Marc Gonzalez wrote:

> Hello (again) everyone,
> 
> Past discussion:
> Large(ish) variance induced by SCHED_FIFO / Unexplained variance in run-time of trivial program
> https://lore.kernel.org/linux-rt-users/0d87e3c3-8de1-4d98-802e-a292f63f1bf1@free.fr/
> 
> SYNOPSIS:
> I have a simple(*) program.
> I want to know how long the program runs.
> 
> (*) By simple, I mean:
> - no system calls, no library calls, just simple bit twiddling
> - tiny code, small(ish) dataset
> (the main function uses ~900 bytes of stack & recurses 40-60 times)
> 
> GOAL: Run the program 25,000 times. Get the SAME(ish) cycle count 25,000 times.
> 
> Running kernel v6.8 on Haswell i5-4590 3.3 GHz
> 
> I have removed "all" sources of noise / jitter / variance in the system:
> 
> A) kernel boots with:
> threadirqs irqaffinity=0-2 nohz=on nohz_full=3 isolcpus=3 rcu_nocbs=3 nosmt mitigations=off single
> i.e.
> - Expose ISRs as regular processes
> - No ISRs on CPU3
> - No timer interrupt on CPU3
> - No RCU callbacks on CPU3
> - 1 thread per core
> - No side-channel mitigations
> - Single user mode, no GUI, only 1 terminal
> 
> B) before program runs:
> echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> for I in 0 1 2 3; do echo userspace > /sys/devices/system/cpu/cpu$I/cpufreq/scaling_governor; done
> for I in 0 1 2 3; do echo   2000000 > /sys/devices/system/cpu/cpu$I/cpufreq/scaling_setspeed; done
> sleep 0.5
> i.e.
> - Let SCHED_FIFO program monopolize a CPU
> - Pin CPU frequency to 2 GHz to avoid thermal throttling & disable turbo-boost
> - Give these settings time to settle
> 
> C) start the benchmark:
> for I in $(seq 1 25000); do chrt -f 99 taskset -c 3 ./bench; done
> i.e.
> - Run as SCHED_FIFO 99 = no lower-priority task can preempt the benchmark
> - Run the program on isolated CPU 3 where nothing else is running
> $ ps -eo psr,cls,pri,cmd --sort psr,pri
>   3  FF 139 [migration/3]
>   3  FF  90 [idle_inject/3]
>   3  TS  19 [cpuhp/3]
>   3  TS  19 [ksoftirqd/3]
>   3  TS  19 [kworker/3:0-events]
>   3  TS  19 [kworker/3:1]
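
For reference, the chrt + taskset wrappers in step C could also be done from
inside the benchmark itself. The following is only a sketch, not what ./bench
currently does: the helper names single_cpu_mask() and pin_and_make_fifo() are
made up for illustration, and SCHED_FIFO normally needs root or CAP_SYS_NICE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Build a CPU mask that contains only `cpu` (hypothetical helper). */
static void single_cpu_mask(int cpu, cpu_set_t *set)
{
	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(set);
	CPU_SET(cpu, set);
}

/*
 * Pin the calling thread to `cpu`, then switch it to SCHED_FIFO at `prio`.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
static int pin_and_make_fifo(int cpu, int prio)
{
	cpu_set_t set;
	struct sched_param sp = { .sched_priority = prio };

	single_cpu_mask(cpu, &set);
	if (sched_setaffinity(0, sizeof set, &set) != 0)
		return -1;
	return sched_setscheduler(0, SCHED_FIFO, &sp);
}
```

Calling pin_and_make_fifo(3, 99) at the top of main() would correspond to
"chrt -f 99 taskset -c 3 ./bench".
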
> 
> D) prepare to run the timed code:
> 	u64 v[1+4];
> 	int main_fd = open_event(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES, -1);
> 	open_event(PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, main_fd);
> 	open_event(PERF_TYPE_RAW, UOPS_EXECUTED, main_fd);
> 	open_event(PERF_TYPE_RAW, EXEC_STALLS, main_fd);
> 
> 	void *ctx = init_ctx();
> 	solve_grid(ctx); // warm up all types of caches
> 
> 	ioctl(main_fd, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
> 	solve_grid(ctx);
> 	if (read(main_fd, v, sizeof v) != (ssize_t)sizeof v) return 2; // also catches read() returning -1
> 
> 	printf("%lu %lu %lu %lu\n", v[1], v[2], v[3], v[4]);
> 
> - PERF_EVENT_IOC_RESET resets all counters to 0, so we're only measuring the actual program, not any setup/teardown system code.
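
Since open_event() is not shown above, here is roughly what such a helper can
look like. This is a sketch under assumptions, not the exact code from the
benchmark: fill_attr() is a made-up name, and the exclude_kernel / exclude_hv
settings are my assumption (the snippet above does not say which privilege
levels are counted). PERF_FORMAT_GROUP is what makes a single read() on the
leader return { nr, value[0], ..., value[nr-1] }, hence the u64 v[1+4].

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill in the attributes for one member of a counter group. */
static void fill_attr(struct perf_event_attr *pe, __u32 type, __u64 config)
{
	assert(pe != NULL);
	memset(pe, 0, sizeof *pe);
	pe->size = sizeof *pe;
	pe->type = type;
	pe->config = config;
	pe->read_format = PERF_FORMAT_GROUP; /* group read: nr + one value per event */
	pe->exclude_kernel = 1;              /* assumption: count user space only */
	pe->exclude_hv = 1;                  /* assumption: ignore hypervisor */
}

/*
 * Open one event for the calling thread on whichever CPU it runs on.
 * group_fd == -1 creates a new group leader; otherwise the event joins
 * the group led by group_fd. Returns the new fd, or -1 on error.
 */
static int open_event(__u32 type, __u64 config, int group_fd)
{
	struct perf_event_attr pe;

	fill_attr(&pe, type, config);
	return (int)syscall(SYS_perf_event_open, &pe, 0, -1, group_fd, 0UL);
}
```

The counters start running as soon as they are opened here (attr.disabled is
left at 0), which matches the approach above of only resetting them right
before the timed call.
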
> 
> The results are unexpected, disappointing, frustrating...
> 
> 
>   AA     BB     CC    DD
> $ head -5 sorted.RES.5
> 108018 186124 256147 23195
> 108412 186124 257228 23275
> 108637 186124 258963 23245
> 109103 186124 258598 23507
> 109167 186124 259715 23425
> 
> $ tail -5 sorted.RES.5
> 123824 186124 266546 30949
> 124755 186122 266494 31749
> 124773 186124 264435 30966
> 126273 186122 267967 32376
> 130967 186124 284301 33597
> 
> AA = PERF_COUNT_HW_CPU_CYCLES
> BB = PERF_COUNT_HW_INSTRUCTIONS
> CC = UOPS_EXECUTED
> DD = EXEC_STALLS
> 
> It seems the program runs in ~108k cycles, but unexplained perturbations can delay
> the program by up to 23k cycles = 21% (108k + 23k = 131k in the worst observed case)
> 
> BEST CASE vs WORST CASE
> 108018 186124 256147 23195
> 130967 186124 284301 33597
> 
> Run-time: +21%
> I_count: identical
> uop_count: +11%
> exec_stalls: +45%
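
The percentages above are the relative increase of the worst case over the
best case, rounded to the nearest integer. A small sketch of that arithmetic
(pct_increase() is just an illustrative name):

```c
#include <assert.h>
#include <stdint.h>

/* Relative increase of `worst` over `best`, in percent, rounded to nearest. */
static uint64_t pct_increase(uint64_t best, uint64_t worst)
{
	assert(best > 0 && worst >= best);
	return (100 * (worst - best) + best / 2) / best;
}
```

For example, pct_increase(108018, 130967) gives 21 for the cycle count, and
pct_increase(23195, 33597) gives 45 for EXEC_STALLS.
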
> 
> I don't see these wild deviations when I test toy programs that don't touch memory
> or only touch 1 word on the stack. So this seems to be memory-related?
> But everything fits in L1...
> Could there be some activity on other CPUs that forces cache-coherence shenanigans?
> I'm stumped :(
> 
> Would appreciate any insight.
> Will re-read the previous thread for anything I might have missed.
> 
> Regards
