From: Ian Rogers <irogers@google.com>
To: irogers@google.com, acme@kernel.org, james.clark@linaro.org,
namhyung@kernel.org
Cc: 9erthalion6@gmail.com, adrian.hunter@intel.com, alex@ghiti.fr,
alexandre.chartre@oracle.com, andrii@kernel.org,
ankur.a.arora@oracle.com, aou@eecs.berkeley.edu,
bpf@vger.kernel.org, collin.funk1@gmail.com,
costa.shul@redhat.com, daniel@iogearbox.net,
dapeng1.mi@linux.intel.com, dsterba@suse.com, eddyz87@gmail.com,
howardchu95@gmail.com, jolsa@kernel.org, leo.yan@arm.com,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
martin.lau@linux.dev, memxor@gmail.com, mingo@redhat.com,
mmayer@broadcom.com, nathan@kernel.org, palmer@dabbelt.com,
peterz@infradead.org, pjw@kernel.org, qmo@kernel.org,
ricky.ringler@proton.me, song@kernel.org,
swapnil.sapkal@amd.com, terrelln@fb.com, tglozar@redhat.com,
thomas.falcon@intel.com, yonghong.song@linux.dev
Subject: [PATCH v3 00/17] perf build: Reduce build time by nearly half
Date: Thu, 14 May 2026 09:33:52 -0700
Message-ID: <20260514163409.927816-1-irogers@google.com>
In-Reply-To: <20260512174638.120445-1-irogers@google.com>
This patch series refactors Kbuild internals, BPF skeleton generation,
Python AST pre-computation, and foundational tooling dependencies across
the perf tool build system. By eliminating umbrella target synchronization
barriers, decoupling static library prerequisites, parallelizing single-core
script generators, and removing redundant feature checks, this series
substantially increases multi-core concurrency during Kbuild startup.
On a 28-core build workstation (make -j28 all from scratch), clean build
latency improves by over 49%:
Before:
real 0m29.006s
user 2m46.019s
sys 0m30.610s
After:
real 0m14.782s
user 2m39.527s
sys 0m22.938s
This saves 14.2 seconds per clean build. Furthermore, incremental builds
with nothing to rebuild are nearly 7x faster:
Before:
real 0m11.528s
user 0m9.633s
sys 0m6.965s
After:
real 0m1.729s
user 0m1.600s
sys 0m0.884s
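Both headline figures follow from the wall-clock ("real") times quoted
above. A quick check in Python, with helper names that are purely
illustrative:

```python
def percent_reduction(before_s, after_s):
    """Percentage by which the elapsed time dropped."""
    return (before_s - after_s) / before_s * 100.0


def speedup(before_s, after_s):
    """How many times faster the 'after' run is."""
    return before_s / after_s


# "real" times from the measurements above, in seconds.
clean_before, clean_after = 29.006, 14.782
noop_before, noop_after = 11.528, 1.729

print(f"clean build: {clean_before - clean_after:.3f}s saved, "
      f"{percent_reduction(clean_before, clean_after):.1f}% lower")
print(f"nothing-to-build: {speedup(noop_before, noop_after):.2f}x faster")
```

The user and sys times are left out on purpose: they sum CPU time across
all cores, so they barely move when the gain comes from better overlap
rather than less work.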
Summary of Patches:
1-3: Foundational Tooling & Fast-Path Feature Detection
- Exempts bpftool bootstrap from non-essential feature tests (LLVM, libbfd,
libcap), saving 1.1s of sub-make fork overhead during Kbuild startup.
- Integrates libdebuginfod directly into test-all.c, allowing Make to skip
individual feature check sub-make forks during AST parsing on fully
configured workstations. Escapes $(shell ...) macro expansion to prevent
unconditional sub-make forks.
- Fixes test-clang-bpf-co-re.bin feature check to correctly generate its
target file on disk via atomic move (> $@.tmp && mv $@.tmp $@), allowing
Kbuild to perfectly cache the detection result and avoid continuous sub-make
re-evaluations.
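The write-to-temporary-then-rename idiom from the third bullet
(> $@.tmp && mv $@.tmp $@) can be sketched in Python as below. This is
only an analogue of the shell recipe, not code from the series; the
function name, file name and pattern are hypothetical.

```python
import os
import re
import tempfile


def publish_matches(target, lines, pattern):
    """Write lines matching pattern to target, but only on success.

    Output goes to target + ".tmp" first; the rename happens only when at
    least one line matched, so a failed probe never leaves a 0-byte target.
    """
    tmp = target + ".tmp"
    matched = False
    with open(tmp, "w", encoding="utf-8") as out:
        for line in lines:
            if re.search(pattern, line):
                out.write(line + "\n")
                matched = True
    if matched:
        os.replace(tmp, target)  # atomic on POSIX when both paths share a filesystem
    else:
        os.remove(tmp)  # this sketch tidies up; the target stays absent
    return matched


with tempfile.TemporaryDirectory() as build_dir:
    target = os.path.join(build_dir, "test-feature.bin")  # hypothetical name
    # Failed probe: nothing is published.
    print(publish_matches(target, ["no marker here"], r"MARKER"), os.path.exists(target))
    # Successful probe: the target appears in one step.
    print(publish_matches(target, ["has MARKER"], r"MARKER"), os.path.exists(target))
```

Because the target only ever appears complete, a timestamp-based build
tool can treat its existence as a valid cached result.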
4-6: Flattening Umbrella Prepare Barriers
- builtin-trace embedded inclusions and pmu-events generation are
decoupled from the sequential "prepare" umbrella target, eliminating Make
AST double-parsing overhead and removing a barrier to parallel compilation.
7-10: Decoupling & Pre-generating BPF Skeletons
- BPF skeleton rules are extracted out of Makefile.perf into bpf_skel.mak.
- Decouples bpftool bootstrap from top-level static libbpf dependencies,
attaching bpf-skel-prepare directly to the umbrella prepare target. This
allows Make to pre-compile bpftool and dump vmlinux.h in the background at
build startup, removing the 7-second serialization bottleneck before BPF
object compilation.
- Ensures benchmark skeleton intermediate .bpf.o files are cleanly removed
during make clean, and adds bpf-skel-prepare to .PHONY.
11-12: Foundational Linkage Optimization
- Eliminates redundant libbpf sub-make feature checks during static builds.
- Moves static libsymbol and libbpf library prerequisites out of the
prepare step, ensuring libbpf headers are installed before
compiling BPF-dependent tests.
13-14: jevents.py Concurrency & Deduplication
- Splits the massive 2.8 MB big_c_string literal out of pmu-events.c
into a dedicated pmu-events-string.c compilation unit. This slices
C compilation latency in half by compiling string and struct
tables simultaneously across separate CPU cores while preserving
zero dynamic ELF relocations. Adds pmu-events-string.c to
.gitignore and uses Make 4.0 compatible dependency chaining.
- Pre-populates jevents.py JSON ASTs and metric formulas in parallel across
all available CPU cores using ProcessPoolExecutor (accelerating Python
execution by 11x, from 3.3s down to ~290ms). Moves _init_worker to top-level
scope to ensure clean pickling under spawn multiprocessing start methods.
15: Out-of-Tree Incremental Rebuild Fix
- Prefixes SCRIPTS (perf-archive, perf-iostat) with $(OUTPUT) to prevent
Make from continuously re-executing script installation rules on already
built out-of-tree builds.
16-17: AST Parsing Optimization & Shell Fork Eradication
- Converts ZENS, ARMS, and INTELS in pmu-events/Build from recursive
assignment (=) to simply expanded assignment (:=) and replaces
model_name/vendor_name with pure GNU Make string functions. This
guarantees Make executes directory probing shell forks exactly
once during AST parsing and evaluates path macros purely in
memory, completely eradicating over 7,800 redundant sub-processes
during out-of-tree build evaluation.
- Converts llvm-config shell queries in Makefile.config from
recursive assignment (=) to simply expanded assignment (:=). This
eliminates ~185 redundant sub-processes that were previously
executed across object compilation dependency checks.
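The difference between the two Make assignment flavors can be modelled
in a few lines of Python. This is an analogy under stated assumptions,
not Make itself: RecursiveVar, SimpleVar and CountingShell are
hypothetical names, and the returned string is a placeholder for real
llvm-config output.

```python
class CountingShell:
    """Stands in for $(shell ...): counts how often it is 'forked'."""

    def __init__(self, output):
        self.output = output
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.output


class RecursiveVar:
    """Models `VAR = $(shell ...)`: the right-hand side runs on every expansion."""

    def __init__(self, command):
        self._command = command

    def expand(self):
        return self._command()


class SimpleVar:
    """Models `VAR := $(shell ...)`: the right-hand side runs once, at assignment."""

    def __init__(self, command):
        self._value = command()

    def expand(self):
        return self._value


recursive_shell = CountingShell("-I/placeholder/include")
simple_shell = CountingShell("-I/placeholder/include")
recursive_var = RecursiveVar(recursive_shell)
simple_var = SimpleVar(simple_shell)

# Expand each variable once per hypothetical object file.
for _ in range(185):
    recursive_var.expand()
    simple_var.expand()

print(recursive_shell.calls)  # 185
print(simple_shell.calls)     # 1
```

Both variables expand to the same text every time; only the number of
simulated forks differs, which is where the savings in patches 16-17
come from.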
Changes since v2:
- Dropped Patch 4 (tools scripts: Short-circuit CC_NO_CLANG compiler
probe in Makefile.include) to prevent potential cross-compilation
regressions when CC and HOSTCC use different compilers.
- tools build (Patch 2): Escaped $(shell ...) macro expansion as
$$(shell ...) inside define feature_check_code to safely defer
sub-make execution until after eval parses the ifeq guard.
- tools build (Patch 3): Refactored test-clang-bpf-co-re.bin feature
check recipe to redirect grep output to a temporary file and
atomically move it upon success (> $@.tmp && mv $@.tmp $@),
preventing Kbuild from permanently caching failed detections due to
0-byte files.
- perf trace beauty (Patch 4): Updated commit description to accurately
reflect the unconditional top-level recursive kbuild hook
(perf-util-y += trace/beauty/).
- perf build (Patch 7): Added $(OUTPUT)bench/bpf_skel/.tmp to
bpf-skel-clean in Makefile.perf to ensure intermediate benchmark
skeleton .bpf.o artifacts are cleanly removed during make clean.
Removed unused bpf_skel_deps variable from bpf_skel.mak.
- perf build (Patch 9): Added $(LIBBPF) as an explicit prerequisite to
$(LIBPERF_TEST_IN) in Makefile.perf to guarantee libbpf headers are
fully installed before compiling sigtrap.c or other BPF-dependent
tests during parallel builds.
- perf build (Patch 10): Added bpf-skel-prepare to the .PHONY target
list in Makefile.perf to ensure Make never incorrectly skips the
target if a file or directory named bpf-skel-prepare accidentally
exists in the build tree.
- perf pmu-events (Patch 13): Added pmu-events/pmu-events-string.c to
tools/perf/.gitignore. Replaced grouped targets (&:), which require GNU
Make 4.3 or newer, with Make 4.0 compatible dependency chaining so that
older Make versions (such as 4.2.1) keep working and parallel builds do
not spawn multiple concurrent jevents.py processes.
- perf pmu-events (Patch 14): Moved _init_worker from local main()
scope to the top-level module scope in jevents.py to ensure it can be
cleanly pickled when ProcessPoolExecutor uses the spawn
multiprocessing start method (avoiding AttributeError crashes).
Ian Rogers (17):
  bpftool build: Restrict feature tests during bootstrap compilation
  tools build: Integrate libdebuginfod into test-all fast path
  tools build: Fix test-clang-bpf-co-re.bin to generate target file
  perf trace beauty: Make beauty generated C code standalone .o files
  perf build: Decouple pmu-events from prepare umbrella target
  perf build: Remove empty archheaders target
  perf build: Move BPF skeleton generation out of Makefile.perf
  perf build: Encapsulate vmlinux.h and bpftool in bpf_skel.mak
  perf build: Move static libbpf dependency out of prepare step
  perf build: Pre-generate BPF skeleton tooling during umbrella prepare
    phase
  perf build: Move libsymbol dependency out of prepare step
  perf build: Remove redundant libbpf feature check for static builds
  perf pmu-events: Split big_c_string storage into standalone
    compilation unit
  perf pmu-events: Parallelize JSON and metric pre-computation in
    jevents.py
  perf build: Prefix SCRIPTS with output directory to fix continuous
    rebuilds
  perf pmu-events: Convert recursive shell assignments and macros to
    Make built-ins
  perf build: Convert llvm-config shell queries to simply expanded
    variables
tools/bpf/bpftool/Makefile | 5 +
tools/build/Makefile.feature | 6 +-
tools/build/feature/Makefile | 4 +-
tools/build/feature/test-all.c | 5 +
tools/perf/.gitignore | 1 +
tools/perf/Build | 2 +
tools/perf/Makefile.config | 19 +-
tools/perf/Makefile.perf | 431 ++----------------
tools/perf/bench/Build | 6 +
.../bpf_skel/bench_uprobe.bpf.c | 0
tools/perf/bench/uprobe.c | 2 +-
tools/perf/bpf_skel.mak | 109 +++++
tools/perf/builtin-trace.c | 30 +-
tools/perf/pmu-events/Build | 26 +-
tools/perf/pmu-events/jevents.py | 56 ++-
tools/perf/trace/beauty/Build | 280 ++++++++++++
tools/perf/trace/beauty/arch_errno_names.c | 2 +
tools/perf/trace/beauty/arch_errno_names.sh | 2 +-
tools/perf/trace/beauty/beauty.h | 60 +++
tools/perf/trace/beauty/eventfd.c | 6 +-
tools/perf/trace/beauty/fsconfig.c | 5 +
tools/perf/trace/beauty/futex_op.c | 6 +-
tools/perf/trace/beauty/futex_val3.c | 6 +-
tools/perf/trace/beauty/mmap.c | 24 +-
tools/perf/trace/beauty/mode_t.c | 6 +-
tools/perf/trace/beauty/msg_flags.c | 8 +-
tools/perf/trace/beauty/open_flags.c | 1 +
tools/perf/trace/beauty/perf_event_open.c | 22 +-
tools/perf/trace/beauty/pid.c | 5 +-
tools/perf/trace/beauty/sched_policy.c | 8 +-
tools/perf/trace/beauty/seccomp.c | 12 +-
tools/perf/trace/beauty/signum.c | 6 +-
tools/perf/trace/beauty/socket_type.c | 6 +-
.../perf/{util => trace/beauty}/syscalltbl.c | 0
.../perf/{util => trace/beauty}/syscalltbl.h | 0
tools/perf/trace/beauty/tracepoints/Build | 22 +
tools/perf/trace/beauty/waitid_options.c | 8 +-
tools/perf/util/Build | 17 +-
tools/perf/util/bpf-trace-summary.c | 2 +-
tools/perf/util/env.c | 4 +-
tools/perf/util/env.h | 1 +
41 files changed, 717 insertions(+), 504 deletions(-)
rename tools/perf/{util => bench}/bpf_skel/bench_uprobe.bpf.c (100%)
create mode 100644 tools/perf/bpf_skel.mak
create mode 100644 tools/perf/trace/beauty/fsconfig.c
rename tools/perf/{util => trace/beauty}/syscalltbl.c (100%)
rename tools/perf/{util => trace/beauty}/syscalltbl.h (100%)
--
2.54.0.563.g4f69b47b94-goog