Linux Perf Users
* [RFC PATCH v1 0/5] perf annotate: Add ARM64 data type profiling support
From: Shuai Xue @ 2026-06-23 13:02 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim
  Cc: Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
	Adrian Hunter, James Clark, Zecheng Li, linux-perf-users,
	linux-kernel

`perf test -v "perf data type profiling tests"` fails on ARM64:

    Basic Rust perf annotate test
    perf mem record -o /tmp/perf.data perf test -w code_with_type
    perf annotate --code-with-type -i /tmp/perf.data --stdio --percent-limit 1
    Basic annotate [Failed: missing target data type]

The root cause is that ARM64 lacks the instruction parsing infrastructure
required for data type profiling. Specifically:

  1. annotate_get_insn_location() cannot extract register numbers and
     memory offsets from ARM64 load/store instructions, because ARM64
     does not set objdump.register_char or objdump.memory_ref_char
     (unlike x86 which uses '%' and '(').

  2. arch_supports_insn_tracking() does not include ARM64, so
     find_data_type_block() cannot perform instruction-level type state
     tracking.

  3. init_type_state() has no ARM64 branch, leaving stack_reg as 0 (x0)
     after memset, which causes x0-based memory accesses to be
     misidentified as stack accesses.
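
Point 3 can be illustrated with a toy model. This is only a sketch: the
struct, helper names, and the `arm64_aware` flag are hypothetical and are
not perf's real definitions. It assumes the AArch64 DWARF numbering, where
sp is register 31, and does not claim that the series picks this exact
value for stack_reg.

```c
#include <stdbool.h>
#include <string.h>

/* AArch64 DWARF numbering: x0-x30 are 0-30, sp is 31. */
#define ARM64_DWARF_SP 31

/* Hypothetical stand-in for the type state; only stack_reg matters here. */
struct toy_type_state {
	int stack_reg;
};

/* A memory access counts as a stack access when its base is stack_reg. */
static bool toy_is_stack_access(const struct toy_type_state *st, int base_reg)
{
	return base_reg == st->stack_reg;
}

/*
 * Without an architecture-specific branch, memset() leaves stack_reg at 0,
 * which is the DWARF number of x0 on ARM64.
 */
static void toy_init_type_state(struct toy_type_state *st, bool arm64_aware)
{
	memset(st, 0, sizeof(*st));
	if (arm64_aware)
		st->stack_reg = ARM64_DWARF_SP;	/* assumption for this sketch */
}
```

With `arm64_aware` false, an access such as `ldr x1, [x0, #8]` is
classified as a stack access; with it true, only sp-based accesses are.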

As a result, perf annotate --code-with-type silently produces no type
annotations on ARM64, and the test's grep for "# data-type: struct Buf"
finds no match.

This series adds ARM64 data type profiling support following the PowerPC
model: decode raw 32-bit instruction words rather than parsing objdump
text. ARM64's fixed-width encoding and trivial DWARF register mapping
(x0-x30 = DWARF 0-30) make this approach clean and robust.
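
As a rough illustration of why raw decoding is simple on ARM64, the
register fields sit at fixed bit positions in every 32-bit word. The
helper names below are hypothetical and are not taken from the patches;
the field positions (Rd in bits 4:0, Rn in bits 9:5, Rm in bits 20:16)
follow the A64 encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] of a 32-bit instruction word (field width < 32). */
static inline uint32_t insn_bits(uint32_t insn, unsigned int hi, unsigned int lo)
{
	assert(hi >= lo && hi - lo < 31);
	return (insn >> lo) & ((1u << (hi - lo + 1)) - 1);
}

static inline unsigned int insn_rd(uint32_t insn) { return insn_bits(insn, 4, 0); }
static inline unsigned int insn_rn(uint32_t insn) { return insn_bits(insn, 9, 5); }
static inline unsigned int insn_rm(uint32_t insn) { return insn_bits(insn, 20, 16); }

/*
 * x0-x30 map to DWARF 0-30 unchanged. Field value 31 means sp or xzr
 * depending on the instruction, so this sketch reports it as -1 and
 * leaves the interpretation to the caller.
 */
static inline int arm64_gpr_to_dwarf(unsigned int reg)
{
	return reg <= 30 ? (int)reg : -1;
}
```

For example, `mov x19, x0` encodes as 0xaa0003f3: Rd is 19, Rm is 0, and
Rn is 31 (xzr, since MOV (register) is an alias of ORR with the zero
register).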

Three classes of instructions are tracked for register state propagation:
  - ADRP: compute PC-relative page address for global variable resolution
  - ADD (immediate): combine with ADRP result to form full variable address
  - MOV (register): propagate type state between registers

This covers the common `adrp + add + ldr/str` pattern that ARM64
compilers emit for global variable access.
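
The three instruction classes above can be sketched as a small state
tracker. This is a minimal sketch under stated assumptions, not the code
in patch 5/5: `reg_kind`, `reg_state`, `cpu_state`, and `track_insn()` are
hypothetical names, only the 64-bit forms are matched, and any other
instruction that writes a register is ignored here (a real tracker would
have to invalidate the destination).

```c
#include <stdbool.h>
#include <stdint.h>

enum reg_kind { REG_UNKNOWN = 0, REG_CONST };	/* hypothetical kinds */

struct reg_state {
	enum reg_kind kind;
	uint64_t imm_value;	/* 64-bit: an ADRP result does not fit in u32 */
};

struct cpu_state {
	struct reg_state regs[31];	/* x0-x30 */
};

static bool is_adrp(uint32_t insn)      { return (insn & 0x9f000000u) == 0x90000000u; }
static bool is_add_imm64(uint32_t insn) { return (insn & 0xff800000u) == 0x91000000u; }
static bool is_mov_reg64(uint32_t insn) { return (insn & 0xffe0ffe0u) == 0xaa0003e0u; }

static void track_insn(struct cpu_state *st, uint64_t pc, uint32_t insn)
{
	unsigned int rd = insn & 0x1f;
	unsigned int rn = (insn >> 5) & 0x1f;
	unsigned int rm = (insn >> 16) & 0x1f;

	if (rd > 30)	/* destination is sp or xzr: nothing to track */
		return;

	if (is_adrp(insn)) {
		/* 21-bit signed page count: immhi (bits 23:5) : immlo (bits 30:29) */
		int64_t imm = (int64_t)((((insn >> 5) & 0x7ffff) << 2) |
					((insn >> 29) & 0x3));

		if (imm & (1 << 20))
			imm -= (1 << 21);
		st->regs[rd].kind = REG_CONST;
		st->regs[rd].imm_value = (pc & ~0xfffULL) + (uint64_t)imm * 4096;
	} else if (is_add_imm64(insn)) {
		uint64_t imm = (insn >> 10) & 0xfff;

		if (insn & (1u << 22))	/* sh bit: immediate is shifted left by 12 */
			imm <<= 12;
		if (rn <= 30 && st->regs[rn].kind == REG_CONST) {
			uint64_t base = st->regs[rn].imm_value;

			st->regs[rd].kind = REG_CONST;
			st->regs[rd].imm_value = base + imm;
		} else {
			st->regs[rd].kind = REG_UNKNOWN;
		}
	} else if (is_mov_reg64(insn)) {
		if (rm <= 30)
			st->regs[rd] = st->regs[rm];
		else
			st->regs[rd].kind = REG_UNKNOWN;	/* mov xd, xzr: not modelled */
	}
}
```

Feeding it `adrp x0, <page>` (0xd0000080 at pc 0x400abc), then
`add x0, x0, #0x340` (0x910d0000), then `mov x1, x0` (0xaa0003e1) leaves
both x0 and x1 as constants holding 0x412340, which is the address a
later `ldr`/`str` through either register would be matched against.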

Known limitations:
  - The `adrp + ldr` pattern (with :lo12: folded into the load offset,
    without an intermediate ADD) is not yet handled. This requires
    extending check_matching_type() to resolve TSR_KIND_CONST with the
    load offset, which can be added incrementally.
  - Pointer chain tracking (load-from-memory propagating type to the
    destination register) is not implemented, matching PowerPC's current
    scope.
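
To make the first limitation concrete, here is a sketch of decoding the
unsigned-offset form of LDR/STR, which is where the folded :lo12: value
ends up. The struct and function names are hypothetical, only plain
STR (opc 00) and LDR (opc 01) are accepted, and this is not presented as
the decoder from patch 3/5.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
struct ldst_uimm {
	unsigned int rt;	/* transfer register */
	unsigned int rn;	/* base register, 31 means sp */
	bool is_load;
	uint32_t offset;	/* byte offset, already scaled */
};

/*
 * Load/store register (unsigned immediate), integer registers only:
 * bits [29:27] = 111, V (bit 26) = 0, bits [25:24] = 01.
 * size is bits [31:30], opc is bits [23:22], imm12 is bits [21:10].
 */
static bool decode_ldst_uimm(uint32_t insn, struct ldst_uimm *out)
{
	unsigned int size, opc;

	if ((insn & 0x3f000000u) != 0x39000000u)
		return false;

	size = insn >> 30;
	opc = (insn >> 22) & 0x3;
	if (opc > 1)	/* sign-extending loads and PRFM: skipped in this sketch */
		return false;

	out->rt = insn & 0x1f;
	out->rn = (insn >> 5) & 0x1f;
	out->is_load = (opc == 1);
	/* imm12 is scaled by the access size: 1, 2, 4, or 8 bytes */
	out->offset = ((insn >> 10) & 0xfff) << size;
	return true;
}
```

For `adrp x0, <page>` followed by `ldr x1, [x0, #0x340]` (0xf941a001),
the decoder yields base x0 and offset 0x340, so the accessed address
would be the ADRP page value plus 0x340. Resolving that sum against a
global variable is the step the first limitation describes as not yet
handled.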

Testing:
  All four sub-tests in `perf test "perf data type profiling tests"`
  pass reliably on ARM64 (AArch64, SPE-capable hardware):
    - Basic/Pipe Rust: struct Buf (code_with_type workload)
    - Basic/Pipe C: struct buf (datasym workload, global variable)

Patch breakdown:
  1/5  Widen type_state_reg::imm_value from u32 to u64 (prerequisite
       for storing 64-bit addresses from ADRP)
  2/5  Add arch__is_arm64() detection, raw instruction parsing from
       objdump output, and enable show_asm_raw for ARM64
  3/5  Add get_arm64_regs() to extract registers and memory offsets
       from load/store instruction encodings (4 addressing modes)
  4/5  Wire up ARM64 in annotate_get_insn_location(),
       arch_supports_insn_tracking(), and init_type_state()
  5/5  Main patch: instruction classification, ADRP/ADD/MOV register
       state tracking, and architecture initialization

Shuai Xue (5):
  perf annotate-data: Widen type_state_reg::imm_value to u64
  perf disasm: Add ARM64 architecture detection and raw instruction
    parsing
  perf dwarf-regs: Add ARM64 register and offset extraction from raw
    instructions
  perf annotate: Wire up ARM64 data type profiling infrastructure
  perf annotate-arch: Add ARM64 data type profiling support

 .../perf/util/annotate-arch/annotate-arm64.c  | 333 ++++++++++++++++++
 tools/perf/util/annotate-arch/annotate-x86.c  |   2 +-
 tools/perf/util/annotate-data.c               |  18 +-
 tools/perf/util/annotate-data.h               |   2 +-
 tools/perf/util/annotate.c                    |  12 +-
 tools/perf/util/disasm.c                      |  64 ++++
 tools/perf/util/disasm.h                      |   2 +
 .../util/dwarf-regs-arch/dwarf-regs-arm64.c   | 125 +++++++
 tools/perf/util/include/dwarf-regs.h          |   7 +
 9 files changed, 558 insertions(+), 7 deletions(-)

-- 
2.51.2.612.gdc70283dfc

Thread overview: 9+ messages
2026-06-23 13:02 [RFC PATCH v1 0/5] perf annotate: Add ARM64 data type profiling support Shuai Xue
2026-06-23 13:02 ` [RFC PATCH v1 1/5] perf annotate-data: Widen type_state_reg::imm_value to u64 Shuai Xue
2026-06-23 13:02 ` [RFC PATCH v1 2/5] perf disasm: Add ARM64 architecture detection and raw instruction parsing Shuai Xue
2026-06-23 13:19   ` sashiko-bot
2026-06-23 13:02 ` [RFC PATCH v1 3/5] perf dwarf-regs: Add ARM64 register and offset extraction from raw instructions Shuai Xue
2026-06-23 13:02 ` [RFC PATCH v1 4/5] perf annotate: Wire up ARM64 data type profiling infrastructure Shuai Xue
2026-06-23 13:02 ` [RFC PATCH v1 5/5] perf annotate-arch: Add ARM64 data type profiling support Shuai Xue
2026-06-23 13:32   ` sashiko-bot
2026-06-23 16:56 ` [RFC PATCH v1 0/5] perf annotate: " Namhyung Kim
