Linux Perf Users
From: Ian Rogers <irogers@google.com>
To: irogers@google.com, acme@kernel.org, james.clark@linaro.org,
    namhyung@kernel.org
Cc: 9erthalion6@gmail.com, adrian.hunter@intel.com, alex@ghiti.fr,
    alexandre.chartre@oracle.com, andrii@kernel.org,
    ankur.a.arora@oracle.com, aou@eecs.berkeley.edu,
    bpf@vger.kernel.org, collin.funk1@gmail.com,
    costa.shul@redhat.com, daniel@iogearbox.net,
    dapeng1.mi@linux.intel.com, dsterba@suse.com, eddyz87@gmail.com,
    howardchu95@gmail.com, jolsa@kernel.org, leo.yan@arm.com,
    linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
    martin.lau@linux.dev, memxor@gmail.com, mingo@redhat.com,
    mmayer@broadcom.com, nathan@kernel.org, palmer@dabbelt.com,
    peterz@infradead.org, pjw@kernel.org, qmo@kernel.org,
    ricky.ringler@proton.me, song@kernel.org,
    swapnil.sapkal@amd.com, terrelln@fb.com, tglozar@redhat.com,
    thomas.falcon@intel.com, yonghong.song@linux.dev
Subject: [PATCH v2 00/18] perf build: Reduce build time by nearly half
Date: Tue, 12 May 2026 10:46:20 -0700
Message-ID: <20260512174638.120445-1-irogers@google.com>
In-Reply-To: <20260512053539.3410189-15-irogers@google.com>

This patch series refactors Kbuild internals, BPF skeleton generation,
Python AST pre-computation, and foundational tooling dependencies across
the perf tool build system. It eliminates umbrella target synchronization
barriers, decouples static library prerequisites, parallelizes single-core
script generators, and removes redundant feature checks, so that far more
of the build can run concurrently across cores from Kbuild startup onward.

On a 28-core build workstation (make -j28 all from scratch), clean build
latency improves by over 46%:

  Before:
    real    0m29.006s
    user    2m46.019s
    sys     0m30.610s

  After:
    real    0m15.655s
    user    2m43.051s
    sys     0m26.437s

This saves 13.3 seconds of wall-clock time per clean build. Furthermore,
incremental builds with nothing to rebuild are nearly 7x faster:

  Before:
    real    0m11.528s
    user    0m9.633s
    sys     0m6.965s

  After:
    real    0m1.665s
    user    0m1.501s
    sys     0m0.841s
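
The percentages quoted above follow directly from the "real" times. A
quick check, using only the wall-clock figures reported in this cover
letter (the variable names are just for illustration):

```python
# Wall-clock ("real") times in seconds, copied from the measurements above.
clean_before, clean_after = 29.006, 15.655
noop_before, noop_after = 11.528, 1.665

# Absolute and relative saving for a clean build.
clean_saved = clean_before - clean_after
clean_pct = 100.0 * clean_saved / clean_before

# Speedup factor for a build with nothing to rebuild.
noop_speedup = noop_before / noop_after

print(f"clean build: {clean_saved:.2f}s saved ({clean_pct:.1f}% less wall time)")
print(f"no-op build: {noop_speedup:.1f}x faster")
```

Note that user CPU time barely changes (2m46s vs 2m43s), which is
consistent with the gain coming from better parallelism rather than from
doing less compilation work.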

Summary of Patches:

1-4: Foundational Tooling & Fast-Path Feature Detection
  - Exempts bpftool bootstrap from non-essential feature tests (LLVM, libbfd,
    libcap), saving 1.1s of sub-make fork overhead during Kbuild startup.
  - Integrates libdebuginfod directly into test-all.c, allowing Make to skip
    individual feature check sub-make forks during AST parsing on fully
    configured workstations.
  - Fixes test-clang-bpf-co-re.bin feature check to correctly generate its
    target file on disk, allowing Kbuild to perfectly cache the detection result
    and avoid continuous sub-make re-evaluations.
  - Short-circuits CC_NO_CLANG compiler inspection probe in Makefile.include by
    exporting the cached result, eliminating 40+ redundant compiler forks across
    the sub-make hierarchy.

5-7: Flattening Umbrella Prepare Barriers
  - builtin-trace embedded inclusions and pmu-events generation are completely
    decoupled from the sequential "prepare" umbrella target, eliminating Make
    AST double-parsing overhead and removing a barrier that stalled parallel
    compilation.
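
To illustrate why an umbrella target hurts parallel builds, here is a
minimal sketch that computes the best-case wall time (the longest
dependency chain) of a tiny build graph, first with every object waiting
on every generator, then with each object waiting only on what it
consumes. The task names and durations are hypothetical and are not
measurements from this series.

```python
from functools import lru_cache

def makespan(durations, deps):
    """Length of the longest dependency chain, i.e. the best-case wall
    time when there are enough cores to run every ready task at once."""
    @lru_cache(maxsize=None)
    def finish(task):
        # A task starts once all of its prerequisites have finished.
        start = max((finish(d) for d in deps.get(task, ())), default=0.0)
        return start + durations[task]
    return max(finish(t) for t in durations)

# Hypothetical durations in seconds (illustrative only).
durations = {"pmu-events": 3.0, "beauty-tables": 1.0,
             "util.o": 4.0, "builtin-trace.o": 2.0, "pmu-events.o": 1.0}

# Umbrella target: every object waits for every generator to finish.
prepare = ("pmu-events", "beauty-tables")
with_barrier = {"util.o": prepare,
                "builtin-trace.o": prepare,
                "pmu-events.o": prepare}

# Decoupled: each object waits only for the generator it actually needs.
decoupled = {"builtin-trace.o": ("beauty-tables",),
             "pmu-events.o": ("pmu-events",)}

print("with barrier:", makespan(durations, with_barrier))
print("decoupled:   ", makespan(durations, decoupled))
```

In this toy graph the barrier forces util.o to wait for the slowest
generator even though it uses neither output; removing the edge lets it
start immediately.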

8-11: Decoupling & Pre-generating BPF Skeletons
  - BPF skeleton rules are extracted out of Makefile.perf into bpf_skel.mak.
  - Decouples bpftool bootstrap from top-level static libbpf dependencies,
    attaching bpf-skel-prepare directly to the umbrella prepare target. This
    allows Make to pre-compile bpftool and dump vmlinux.h in the background at
    build startup, removing the 7-second serialization bottleneck before BPF
    object compilation.

12-13: Foundational Linkage Optimization
  - Eliminates redundant libbpf sub-make feature checks during static builds.
  - Moves static libsymbol and libbpf library prerequisites out of the prepare step.

14-15: jevents.py Concurrency & Deduplication
  - Splits the massive 2.8 MB big_c_string literal out of pmu-events.c into a
    dedicated pmu-events-string.c compilation unit. This slices C compilation
    latency in half by compiling string and struct tables simultaneously across
    separate CPU cores while preserving zero dynamic ELF relocations.
  - Pre-populates jevents.py JSON ASTs and metric formulas in parallel across
    all available CPU cores using ProcessPoolExecutor (accelerating Python
    execution by 11x, from 3.3s down to ~290ms).
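
As a rough illustration of the ProcessPoolExecutor approach described in
the second item, the sketch below parses a set of JSON files in worker
processes. It is not the code from jevents.py: load_events, load_all and
their parameters are hypothetical names chosen for this example.

```python
import json
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def load_events(path):
    """Parse one JSON file. Defined at module level so that worker
    processes can locate it when the call is pickled."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def load_all(paths, workers=None, start_method=None):
    """Parse every file in `paths` in parallel; returns {path: parsed JSON}.

    workers=None leaves the pool size at the library default (based on
    the CPU count); start_method=None uses the platform's default
    process start method.
    """
    paths = list(paths)
    ctx = multiprocessing.get_context(start_method)
    with ProcessPoolExecutor(max_workers=workers, mp_context=ctx) as pool:
        # map() yields results in input order, so they line up with paths.
        return dict(zip(paths, pool.map(load_events, paths)))
```

When calling load_all() from a script, the call should sit under an
`if __name__ == "__main__":` guard, since some start methods re-import
the main module in each worker. The parsed results are pickled back to
the parent, so this pays off only when parsing costs more than the
transfer.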

16: Out-of-Tree Incremental Rebuild Fix
  - Prefixes SCRIPTS (perf-archive, perf-iostat) with $(OUTPUT) to prevent
    Make from continuously re-executing script installation rules on already
    built out-of-tree builds.
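
One plausible reading of this fix, stated here as an assumption rather
than taken from the patch: without the $(OUTPUT) prefix, Make checks for
the target in the source tree while the recipe writes it into the output
directory, so the target always looks missing. The sketch below models a
simplified version of Make's staleness check to show the effect; the
helper names are hypothetical.

```python
import os
import tempfile

def needs_rebuild(target, prereq):
    """Simplified Make rule check: rebuild when the target is missing or
    older than its prerequisite (real Make handles many more cases)."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(target) < os.path.getmtime(prereq)

def demo():
    with tempfile.TemporaryDirectory() as src, \
         tempfile.TemporaryDirectory() as out:
        # Source script in the source tree.
        script = os.path.join(src, "perf-archive.sh")
        open(script, "w").close()
        # The recipe installs the script into the output directory.
        installed = os.path.join(out, "perf-archive")
        open(installed, "w").close()
        # Unprefixed target: looked up in the source tree, never found.
        unprefixed = needs_rebuild(os.path.join(src, "perf-archive"), script)
        # Prefixed target: looked up where the recipe wrote the file.
        prefixed = needs_rebuild(installed, script)
        return unprefixed, prefixed

print(demo())
```

The first value stays true on every invocation, which corresponds to the
rule being re-executed on each "nothing to build" run.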

17-18: AST Parsing Optimization & Shell Fork Eradication
  - Converts ZENS, ARMS, and INTELS in pmu-events/Build from recursive assignment
    (=) to simply expanded assignment (:=) and replaces model_name/vendor_name
    with pure GNU Make string functions. This guarantees Make executes directory
    probing shell forks exactly once during AST parsing and evaluates path macros
    purely in memory, completely eradicating over 7,800 redundant sub-processes
    during out-of-tree build evaluation.
  - Converts llvm-config shell queries in Makefile.config from recursive assignment
    (=) to simply expanded assignment (:=). This eliminates ~185 redundant sub-processes
    that were previously executed across object compilation dependency checks.
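
The difference between the two assignment flavors can be modeled outside
Make. The following is a Python analogy, not Make code: a callable
stands in for $(shell ...), and the counter shows how often the "shell"
would be forked under each flavor. The directory names and the reference
count of 100 are hypothetical.

```python
class ProbeCounter:
    """Stands in for $(shell ...): counts how often it is invoked."""
    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return ["amdzen1", "amdzen2"]  # hypothetical directory listing

# Recursive assignment (VAR = $(shell ...)): the right-hand side is stored
# unexpanded and is re-evaluated on every reference to $(VAR).
recursive_probe = ProbeCounter()
for _ in range(100):            # 100 references to $(VAR)
    dirs = recursive_probe()

# Simply expanded assignment (VAR := $(shell ...)): evaluated once, at the
# point of assignment; later references reuse the stored text.
simple_probe = ProbeCounter()
cached_dirs = simple_probe()
for _ in range(100):
    dirs = cached_dirs

print("recursive:", recursive_probe.calls, "simple:", simple_probe.calls)
```

The trade-off is that a simply expanded variable captures the values
visible at the point of assignment, so anything it references must
already be defined there.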

Changes since v1:
  - Reorganized commit order so foundational build system and script infrastructure
    patches precede perf tool refactoring.
  - Added Tested-by tag from James Clark on v1 patches.
  - Eliminated redundant llvm-config shell forks and converted PMU directory
    probing variables to simply expanded assignment, removing over 7,800
    redundant sub-processes during AST parsing.
  - Fixed test-clang-bpf-co-re.bin feature check caching and short-circuited CC_NO_CLANG
    compiler probes across sub-makes.

Ian Rogers (18):
  bpftool build: Restrict feature tests during bootstrap compilation
  tools build: Integrate libdebuginfod into test-all fast path
  tools build: Fix test-clang-bpf-co-re.bin to generate target file
  tools scripts: Short-circuit CC_NO_CLANG compiler probe in
    Makefile.include
  perf trace beauty: Make beauty generated C code standalone .o files
  perf build: Decouple pmu-events from prepare umbrella target
  perf build: Remove empty archheaders target
  perf build: Move BPF skeleton generation out of Makefile.perf
  perf build: Encapsulate vmlinux.h and bpftool in bpf_skel.mak
  perf build: Move static libbpf dependency out of prepare step
  perf build: Pre-generate BPF skeleton tooling during umbrella prepare
    phase
  perf build: Move libsymbol dependency out of prepare step
  perf build: Remove redundant libbpf feature check for static builds
  perf pmu-events: Split big_c_string storage into standalone
    compilation unit
  perf pmu-events: Parallelize JSON and metric pre-computation in
    jevents.py
  perf build: Prefix SCRIPTS with output directory to fix continuous
    rebuilds
  perf pmu-events: Convert recursive shell assignments and macros to
    Make built-ins
  perf build: Convert llvm-config shell queries to simply expanded
    variables

 tools/bpf/bpftool/Makefile                    |   5 +
 tools/build/Makefile.feature                  |   6 +-
 tools/build/feature/Makefile                  |   4 +-
 tools/build/feature/test-all.c                |   5 +
 tools/perf/Build                              |   2 +
 tools/perf/Makefile.config                    |  19 +-
 tools/perf/Makefile.perf                      | 427 +-----------------
 tools/perf/bench/Build                        |   6 +
 .../bpf_skel/bench_uprobe.bpf.c               |   0
 tools/perf/bench/uprobe.c                     |   2 +-
 tools/perf/bpf_skel.mak                       | 110 +++++
 tools/perf/builtin-trace.c                    |  30 +-
 tools/perf/pmu-events/Build                   |  25 +-
 tools/perf/pmu-events/jevents.py              |  56 ++-
 tools/perf/trace/beauty/Build                 | 280 ++++++++++++
 tools/perf/trace/beauty/arch_errno_names.c    |   2 +
 tools/perf/trace/beauty/arch_errno_names.sh   |   2 +-
 tools/perf/trace/beauty/beauty.h              |  60 +++
 tools/perf/trace/beauty/eventfd.c             |   6 +-
 tools/perf/trace/beauty/fsconfig.c            |   5 +
 tools/perf/trace/beauty/futex_op.c            |   6 +-
 tools/perf/trace/beauty/futex_val3.c          |   6 +-
 tools/perf/trace/beauty/mmap.c                |  24 +-
 tools/perf/trace/beauty/mode_t.c              |   6 +-
 tools/perf/trace/beauty/msg_flags.c           |   8 +-
 tools/perf/trace/beauty/open_flags.c          |   1 +
 tools/perf/trace/beauty/perf_event_open.c     |  22 +-
 tools/perf/trace/beauty/pid.c                 |   5 +-
 tools/perf/trace/beauty/sched_policy.c        |   8 +-
 tools/perf/trace/beauty/seccomp.c             |  12 +-
 tools/perf/trace/beauty/signum.c              |   6 +-
 tools/perf/trace/beauty/socket_type.c         |   6 +-
 .../perf/{util => trace/beauty}/syscalltbl.c  |   0
 .../perf/{util => trace/beauty}/syscalltbl.h  |   0
 tools/perf/trace/beauty/tracepoints/Build     |  22 +
 tools/perf/trace/beauty/waitid_options.c      |   8 +-
 tools/perf/util/Build                         |  17 +-
 tools/perf/util/bpf-trace-summary.c           |   2 +-
 tools/perf/util/env.c                         |   4 +-
 tools/perf/util/env.h                         |   1 +
 tools/scripts/Makefile.include                |   3 +
 41 files changed, 716 insertions(+), 503 deletions(-)
 rename tools/perf/{util => bench}/bpf_skel/bench_uprobe.bpf.c (100%)
 create mode 100644 tools/perf/bpf_skel.mak
 create mode 100644 tools/perf/trace/beauty/fsconfig.c
 rename tools/perf/{util => trace/beauty}/syscalltbl.c (100%)
 rename tools/perf/{util => trace/beauty}/syscalltbl.h (100%)

-- 
2.54.0.563.g4f69b47b94-goog


Thread overview: 35+ messages
2026-05-12  5:35 [PATCH v1 00/14] perf build: Reduce build time by one third Ian Rogers
2026-05-12  5:35 ` [PATCH v1 01/14] bpftool build: Restrict feature tests during bootstrap compilation Ian Rogers
2026-05-12  5:35 ` [PATCH v1 02/14] perf trace beauty: Make beauty generated C code standalone .o files Ian Rogers
2026-05-12  5:35 ` [PATCH v1 03/14] perf build: Decouple pmu-events from prepare umbrella target Ian Rogers
2026-05-12  5:35 ` [PATCH v1 04/14] perf build: Remove empty archheaders target Ian Rogers
2026-05-12  5:35 ` [PATCH v1 05/14] perf build: Move BPF skeleton generation out of Makefile.perf Ian Rogers
2026-05-12  5:35 ` [PATCH v1 06/14] perf build: Encapsulate vmlinux.h and bpftool in bpf_skel.mak Ian Rogers
2026-05-12  5:35 ` [PATCH v1 07/14] perf build: Move static libbpf dependency out of prepare step Ian Rogers
2026-05-12  5:35 ` [PATCH v1 08/14] perf build: Pre-generate BPF skeletons during umbrella prepare phase Ian Rogers
2026-05-12  5:35 ` [PATCH v1 09/14] perf build: Move libsymbol dependency out of prepare step Ian Rogers
2026-05-12  5:35 ` [PATCH v1 10/14] perf build: Remove redundant libbpf feature check for static builds Ian Rogers
2026-05-12  5:35 ` [PATCH v1 11/14] tools build: Integrate libdebuginfod into test-all fast path Ian Rogers
2026-05-12  5:35 ` [PATCH v1 12/14] perf pmu-events: Split big_c_string storage into standalone compilation unit Ian Rogers
2026-05-12  5:35 ` [PATCH v1 13/14] perf pmu-events: Parallelize JSON and metric pre-computation in jevents.py Ian Rogers
2026-05-12  5:35 ` [PATCH v1 14/14] perf build: Prefix SCRIPTS with output directory to fix continuous rebuilds Ian Rogers
2026-05-12 17:46   ` Ian Rogers [this message]
2026-05-12 17:46     ` [PATCH v2 01/18] bpftool build: Restrict feature tests during bootstrap compilation Ian Rogers
2026-05-12 17:46     ` [PATCH v2 02/18] tools build: Integrate libdebuginfod into test-all fast path Ian Rogers
2026-05-12 17:46     ` [PATCH v2 03/18] tools build: Fix test-clang-bpf-co-re.bin to generate target file Ian Rogers
2026-05-12 17:46     ` [PATCH v2 04/18] tools scripts: Short-circuit CC_NO_CLANG compiler probe in Makefile.include Ian Rogers
2026-05-12 17:46     ` [PATCH v2 05/18] perf trace beauty: Make beauty generated C code standalone .o files Ian Rogers
2026-05-12 17:46     ` [PATCH v2 06/18] perf build: Decouple pmu-events from prepare umbrella target Ian Rogers
2026-05-12 17:46     ` [PATCH v2 07/18] perf build: Remove empty archheaders target Ian Rogers
2026-05-12 17:46     ` [PATCH v2 08/18] perf build: Move BPF skeleton generation out of Makefile.perf Ian Rogers
2026-05-12 17:46     ` [PATCH v2 09/18] perf build: Encapsulate vmlinux.h and bpftool in bpf_skel.mak Ian Rogers
2026-05-12 17:46     ` [PATCH v2 10/18] perf build: Move static libbpf dependency out of prepare step Ian Rogers
2026-05-12 17:46     ` [PATCH v2 11/18] perf build: Pre-generate BPF skeleton tooling during umbrella prepare phase Ian Rogers
2026-05-12 17:46     ` [PATCH v2 12/18] perf build: Move libsymbol dependency out of prepare step Ian Rogers
2026-05-12 17:46     ` [PATCH v2 13/18] perf build: Remove redundant libbpf feature check for static builds Ian Rogers
2026-05-12 17:46     ` [PATCH v2 14/18] perf pmu-events: Split big_c_string storage into standalone compilation unit Ian Rogers
2026-05-12 17:46     ` [PATCH v2 15/18] perf pmu-events: Parallelize JSON and metric pre-computation in jevents.py Ian Rogers
2026-05-12 17:46     ` [PATCH v2 16/18] perf build: Prefix SCRIPTS with output directory to fix continuous rebuilds Ian Rogers
2026-05-12 17:46     ` [PATCH v2 17/18] perf pmu-events: Convert recursive shell assignments and macros to Make built-ins Ian Rogers
2026-05-12 17:46     ` [PATCH v2 18/18] perf build: Convert llvm-config shell queries to simply expanded variables Ian Rogers
2026-05-12  9:36 ` [PATCH v1 00/14] perf build: Reduce build time by one third James Clark