From: Namhyung Kim <namhyung@kernel.org>
To: Tengda Wu <wutengda@huaweicloud.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
leo.yan@linux.dev, Li Huafei <lihuafei1@huawei.com>,
Ian Rogers <irogers@google.com>,
Kim Phillips <kim.phillips@arm.com>,
Mark Rutland <mark.rutland@arm.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Ingo Molnar <mingo@redhat.com>, Bill Wendling <morbo@google.com>,
Nick Desaulniers <nick.desaulniers+lkml@gmail.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Zecheng Li <zli94@ncsu.edu>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
llvm@lists.linux.dev
Subject: Re: [PATCH v2 00/16] perf arm64: Support data type profiling
Date: Mon, 6 Apr 2026 23:31:00 -0700
Message-ID: <adSkpI9AoTNgISU5@z2>
In-Reply-To: <20260403094800.1418825-1-wutengda@huaweicloud.com>
Hello,

On Fri, Apr 03, 2026 at 09:47:44AM +0000, Tengda Wu wrote:
> This patch series implements data type profiling support for arm64,
> building upon the foundational work previously contributed by Huafei [1].
> While the initial version laid the groundwork for arm64 data type analysis,
> this series iterates on that work by refining instruction parsing and
> extending support for core architectural features.

Thanks for working on this! I'm happy to see that the changes are well
organized and that each commit explains the issues clearly.

>
> The series is organized as follows:
>
> 1. Fix disassembly mismatches (Patches 01-02)
> Current perf annotate supports three disassembly backends: llvm,
> capstone, and objdump. On arm64, inconsistencies between the output
> of these backends (specifically llvm/capstone vs. objdump) often
> prevent the tracker from correctly identifying registers and offsets.
> These patches resolve these mismatches, ensuring consistent instruction
> parsing across all supported backends.
>
> 2. Infrastructure for arm64 operand parsing (Patches 03-07)
> These patches establish the necessary infrastructure for arm64-specific
> operand handling. This includes implementing new callbacks and data
> structures to manage arm64's unique addressing modes and register sets.
> This foundation is essential for the subsequent type-tracking logic.

I've only reviewed up to this point so far and will send replies to those
patches soon. I'll continue reviewing later this week.

>
> 3. Core instruction tracking (Patches 08-16)
> These patches implement the core logic for type tracking on arm64,
> covering a wide range of instructions including:
>
> * Memory Access: ldr/str variants (including stack-based access).
> * Arithmetic & Data Processing: mov, add, and adrp.
> * Special Access: System register access (mrs) and per-cpu variable
> tracking.
>
> The implementation draws inspiration from the existing x86 logic while
> adapting it to the nuances of the AArch64 ISA [2][3]. With these changes,
> perf annotate can successfully resolve memory locations and register
> types, enabling comprehensive data type profiling on arm64 platforms.
>
> Example Result
> ==============
>
> # perf mem record -a -K -- sleep 1
> # perf annotate --data-type --type-stat --stdio
> Annotate data type stats:
> total 6204, ok 5091 (82.1%), bad 1113 (17.9%)
I'm impressed that the success rate is quite high. But I think you need
to confirm that the findings are correct by taking a close look at each
result. You can try `perf annotate --code-with-type`.
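
A minimal sketch of that check, reusing the record command from the cover
letter. The `review_types` wrapper name is only illustrative, and this assumes
a perf build that has `--code-with-type` and a kernel with DWARF debug info
available:

```shell
# Hypothetical helper; the name is illustrative and not part of perf.
review_types() {
    # Collect memory access samples system-wide for one second
    # (same command as in the cover letter).
    perf mem record -a -K -- sleep 1 &&
    # Print the annotated code with the resolved data type shown for
    # memory instructions, reading ./perf.data by default.
    perf annotate --code-with-type --stdio
}
```

Comparing that output against the source for a few hot instructions should
show whether the reported types are actually right.
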
Thanks,
Namhyung
> -----------------------------------------------------------
> 29 : no_sym
> 196 : no_var
> 806 : no_typeinfo
> 82 : bad_offset
> 1370 : insn_track
>
> Annotate type: 'struct page' in [kernel.kallsyms] (59208 samples):
> ============================================================================
> Percent offset size field
> 100.00 0 0x40 struct page {
> 9.95 0 0x8 long unsigned int flags;
> 52.83 0x8 0x28 union {
> 52.83 0x8 0x28 struct {
> 37.21 0x8 0x10 union {
> 37.21 0x8 0x10 struct list_head lru {
> 37.21 0x8 0x8 struct list_head* next;
> 0.00 0x10 0x8 struct list_head* prev;
> };
> 37.21 0x8 0x10 struct {
> 37.21 0x8 0x8 void* __filler;
> 0.00 0x10 0x4 unsigned int mlock_count;
> ...
>
> Changes since v1: (reworked from Huafei's series):
>
> - Fix inconsistencies in arm64 instruction output across llvm, capstone,
> and objdump disassembly backends.
> - Support arm64-specific addressing modes and operand formats. (Leo Yan)
> - Extend instruction tracking to support mov and add instructions,
> along with per-cpu and stack variables.
> - Include real-world examples in commit messages to demonstrate
> practical effects. (Namhyung Kim)
> - Improve type-tracking success rate (type stat) from 64.2% to 82.1%.
> https://lore.kernel.org/all/20250314162137.528204-1-lihuafei1@huawei.com/
>
> Please let me know if you have any feedback.
>
> Thanks,
> Tengda
>
> [1] https://lore.kernel.org/all/20250314162137.528204-1-lihuafei1@huawei.com/
> [2] https://developer.arm.com/documentation/102374/0103
> [3] https://github.com/flynd/asmsheets/releases/tag/v8
>
> ---
>
> Tengda Wu (16):
> perf llvm: Fix arm64 adrp instruction disassembly mismatch with
> objdump
> perf capstone: Fix arm64 jump/adrp disassembly mismatch with objdump
> perf annotate-arm64: Generalize arm64_mov__parse to support standard
> operands
> perf annotate-arm64: Handle load and store instructions
> perf annotate: Introduce extract_op_location callback for
> arch-specific parsing
> perf dwarf-regs: Adapt get_dwarf_regnum() for arm64
> perf annotate-arm64: Implement extract_op_location() callback
> perf annotate-arm64: Enable instruction tracking support
> perf annotate-arm64: Support load instruction tracking
> perf annotate-arm64: Support store instruction tracking
> perf annotate-arm64: Support stack variable tracking
> perf annotate-arm64: Support 'mov' instruction tracking
> perf annotate-arm64: Support 'add' instruction tracking
> perf annotate-arm64: Support 'adrp' instruction to track global
> variables
> perf annotate-arm64: Support per-cpu variable access tracking
> perf annotate-arm64: Support 'mrs' instruction to track 'current'
> pointer
>
> .../perf/util/annotate-arch/annotate-arm64.c | 642 +++++++++++++++++-
> .../util/annotate-arch/annotate-powerpc.c | 10 +
> tools/perf/util/annotate-arch/annotate-x86.c | 88 ++-
> tools/perf/util/annotate-data.c | 72 +-
> tools/perf/util/annotate-data.h | 7 +-
> tools/perf/util/annotate.c | 108 +--
> tools/perf/util/annotate.h | 12 +
> tools/perf/util/capstone.c | 107 ++-
> tools/perf/util/disasm.c | 5 +
> tools/perf/util/disasm.h | 5 +
> .../util/dwarf-regs-arch/dwarf-regs-arm64.c | 20 +
> tools/perf/util/dwarf-regs.c | 2 +-
> tools/perf/util/include/dwarf-regs.h | 1 +
> tools/perf/util/llvm.c | 50 ++
> 14 files changed, 984 insertions(+), 145 deletions(-)
>
>
> base-commit: cf7c3c02fdd0dfccf4d6611714273dcb538af2cb
> --
> 2.34.1
>