* [PATCH V7 00/18] Add data type profiling support for powerpc
From: Athira Rajeev @ 2024-07-13 16:55 UTC
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
The patchset from Namhyung added support for data type profiling
in the perf tool. This made it possible to associate PMU samples with
the data types they refer to, using DWARF debug information. With
upstream perf, it is currently possible to run perf report or perf
annotate to view the data type information on x86.
The initial patchset posted here had the changes needed to enable data
type profiling support for powerpc.
https://lore.kernel.org/all/6e09dc28-4a2e-49d8-a2b5-ffb3396a9952@csgroup.eu/T/
Main changes were:
1. A powerpc instruction mnemonic table to associate load/store
instructions with move_ops, which is used to identify whether an
instruction is a memory access.
2. To get the register number and access offset from the given
instruction, the code uses fields from "struct arch" -> objdump.
Added an entry for powerpc here.
3. A get_arch_regnum to return the register number from the
register name string.
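As an illustration of item 3, a register-name lookup could look roughly like the sketch below. The helper name demo_regname_to_num is hypothetical and is not the get_arch_regnum implementation from the patchset; it only assumes that powerpc GPRs are written as "r0".."r31", optionally with a leading "%".

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not the patchset's get_arch_regnum): map a powerpc
 * GPR name such as "r10" or "%r10" to its number. Returns -1 when the
 * string is not a GPR name in the range r0..r31.
 */
static int demo_regname_to_num(const char *name)
{
	char *end;
	long n;

	if (name == NULL)
		return -1;
	if (*name == '%')
		name++;
	if (*name != 'r')
		return -1;
	name++;
	/* Require at least one digit right after the 'r' */
	if (*name < '0' || *name > '9')
		return -1;
	n = strtol(name, &end, 10);
	if (*end != '\0' || n > 31)
		return -1;
	return (int)n;
}
```

With this sketch, "r10" maps to 10, while names such as "lr" or "r32" are rejected with -1.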
But the approach used in the initial patchset relied on parsing the
disassembled code, which is what the current perf tool implementation does.
Example: lwz r10,0(r9)
This line "lwz r10,0(r9)" is parsed to extract the instruction name,
register names and offset. Also, to find whether there is a memory
reference in the operands, the "memory_ref_char" field of objdump is used.
For x86, "(" is used as memory_ref_char to tackle instructions of the
form "mov (%rax), %rcx".
In case of powerpc, instructions using "(" are not the only memory
instructions. For example, the above instruction can also be of extended
form (X form) "lwzx r10,0,r19". In order to easily identify the instruction
category and extract the source/target registers, the second patchset added
support to use the raw instruction. With the raw instruction, macros are
added to extract the opcode and register fields.
Link to second patchset:
https://lore.kernel.org/all/20240506121906.76639-1-atrajeev@linux.vnet.ibm.com/
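The limitation of the text based check described above can be seen with a small sketch. The function demo_has_mem_ref_char is a hypothetical stand-in for a memory_ref_char style test and is not the perf code itself; it only looks for the reference character in the operand string.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical text-based check: treat an operand string as a memory
 * reference only if it contains the given reference character.
 */
static bool demo_has_mem_ref_char(const char *operands, char ref_char)
{
	return operands != NULL && strchr(operands, ref_char) != NULL;
}
```

For the operands of "lwz r10,0(r9)" the check succeeds, but for "lwzx r10,0,r19" it fails even though lwzx is also a load, which is why the raw instruction is a more reliable source on powerpc.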
Example representation using --show-raw-insn in objdump gives result:
38 01 81 e8 ld r4,312(r1)
Here "38 01 81 e8" is the raw instruction representation. In powerpc,
this translates to instruction form: "ld RT,DS(RA)" and binary code
as:
_____________________________________
|  58  |  RT  |  RA  |    DS     |  |
-------------------------------------
0      6      11     16          30 31
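A minimal sketch of how the fields in the layout above can be pulled out of the raw word follows. The names demo_ds_form, demo_insn_from_le_bytes and demo_decode_ds_form are hypothetical and are not the macros added by the patchset. The sketch assumes the four bytes printed by objdump are in little-endian memory order, and it relies on the usual two's complement behaviour when narrowing to int16_t.

```c
#include <stdint.h>

/* Hypothetical container for the decoded DS-form fields */
struct demo_ds_form {
	unsigned int opcode;	/* bits 0-5   */
	unsigned int rt;	/* bits 6-10  */
	unsigned int ra;	/* bits 11-15 */
	int disp;		/* DS field (bits 16-29) shifted left by 2 */
};

/* Assemble the 32-bit instruction word from little-endian bytes */
static uint32_t demo_insn_from_le_bytes(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Extract opcode, RT, RA and the signed byte displacement */
static struct demo_ds_form demo_decode_ds_form(uint32_t insn)
{
	struct demo_ds_form f;

	f.opcode = (insn >> 26) & 0x3f;
	f.rt = (insn >> 21) & 0x1f;
	f.ra = (insn >> 16) & 0x1f;
	/* Low two bits are the extended opcode, so mask them off */
	f.disp = (int16_t)(insn & 0xfffc);
	return f;
}
```

Feeding the bytes 38 01 81 e8 through these helpers gives the word 0xe8810138, which decodes to opcode 58, RT 4, RA 1 and displacement 312, matching "ld r4,312(r1)".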
The second patchset used "objdump" again to read the raw instruction.
But since there is no need to disassemble and the binary code can be read
directly from the DSO, the third patchset (i.e. this patchset) uses the
approach below. The approach preferred in powerpc to parse a sample for
data type profiling in the V3 patchset is:
- Read directly from DSO using dso__data_read_offset
- If that fails for any case, fallback to using libcapstone
- If libcapstone is not supported, approach will use objdump
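The ordering above is a plain fallback chain. The sketch below shows the control flow only; demo_disasm_fn, the stub backends and demo_disassemble_with_fallback are all hypothetical names, and the real patchset wires this into the perf disassembly code rather than into a function-pointer table.

```c
#include <stddef.h>

/* Hypothetical backend signature: returns 0 on success, non-zero on failure */
typedef int (*demo_disasm_fn)(void *ctx);

/* Stub backends used only to exercise the chain */
static int demo_backend_fail(void *ctx) { (void)ctx; return -1; }
static int demo_backend_ok(void *ctx) { (void)ctx; return 0; }

/*
 * Try each backend in order (for example: raw DSO read, libcapstone,
 * objdump) and return the index of the first one that succeeds,
 * or -1 if all of them fail.
 */
static int demo_disassemble_with_fallback(const demo_disasm_fn *backends,
					  size_t nr, void *ctx)
{
	for (size_t i = 0; i < nr; i++) {
		if (backends[i] != NULL && backends[i](ctx) == 0)
			return (int)i;
	}
	return -1;
}
```

A NULL entry stands for a backend that is not compiled in (for instance when libcapstone support is missing), so the chain simply moves on to the next one.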
The patchset adds support to pick the opcode and reg fields from this
raw/binary instruction code. This approach came from review comments
by Segher Boessenkool and Christophe for the initial patchset.
Apart from that, instruction tracking is enabled for powerpc and a
support function is added to find variables defined as registers.
For example, in powerpc, the two registers below are
defined to represent variables:
1. r13: represents local_paca
register struct paca_struct *local_paca asm("r13");
2. r1: represents stack_pointer
register void *__stack_pointer asm("r1");
These are handled in this patchset.
- Patch 1 rearranges the register state type structures into a header file
so that they can be referenced from other arch specific files
- Patch 2 makes instruction tracking a callback in "struct arch"
so that it can be implemented by other archs easily and defined in arch
specific files
- Patch 3 handles the state type regs array size for x86 and powerpc
- Patch 4 adds support to capture and parse raw instruction in powerpc
using dso__data_read_offset utility
- Patch 4 also adds logic to support using objdump when doing default "perf
report" or "perf annotate" since that needs the disassembled instruction.
- Patch 5 adds disasm_line__parse to parse raw instruction for powerpc
- Patch 6 updates parameters for reg extract functions to use raw
instruction on powerpc
- Patch 7 updates ins__find to carry raw_insn and also adds parse
callback for memory instructions for powerpc
- Patch 8 adds support to identify memory instructions of opcode 31 in
powerpc
- Patch 9 adds more instructions to support instruction tracking in powerpc
- Patches 10 and 11 handle instruction tracking for powerpc.
- Patches 12, 13 and 14 add support to use libcapstone in powerpc
- Patches 15 and 16 add support to find global register variables
- Patch 17 updates data type compare functions data_type_cmp and
sort__typeoff_sort to include var_name along with type_name in
comparison.
- Patch 18 handles insn-stat option for perf annotate
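For the opcode 31 handling mentioned in the list above, instructions share the primary opcode and are told apart by the extended opcode field. The sketch below is an assumption-labelled illustration, not the code from the patch: the macro and function names are hypothetical, and only the load mnemonics that appear in the instruction stats of this cover letter are listed, with extended opcode values taken from the Power ISA X-form encodings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical field extractors for X-form instructions */
#define DEMO_PRIMARY_OPCODE(insn)	(((insn) >> 26) & 0x3f)
#define DEMO_EXT_OPCODE(insn)		(((insn) >> 1) & 0x3ff)

/*
 * Return true if the instruction has primary opcode 31 and its extended
 * opcode is one of a small set of load instructions. Anything else,
 * including arithmetic instructions such as "add", returns false.
 */
static bool demo_op31_is_load(uint32_t insn)
{
	if (DEMO_PRIMARY_OPCODE(insn) != 31)
		return false;

	switch (DEMO_EXT_OPCODE(insn)) {
	case 20:	/* lwarx */
	case 21:	/* ldx   */
	case 23:	/* lwzx  */
	case 84:	/* ldarx */
	case 87:	/* lbzx  */
	case 279:	/* lhzx  */
	case 341:	/* lwax  */
		return true;
	default:
		return false;
	}
}
```

As a usage example, "lwzx r10,0,r19" encodes to 0x7d40982e and is classified as a load, while "add r3,r4,r5" (0x7c642a14) has the same primary opcode 31 but is not.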
Note:
- There are remaining unknowns (26.8% bad in the run below) as seen in the
annotate Instruction stats below.
- This patchset is not tested on powerpc32. As the next step of enhancements,
along with handling the remaining unknowns, the plan is to cover powerpc32
changes based on how testing goes.
With the current patchset:
./perf record -a -e mem-loads sleep 1
./perf report -s type,typeoff --hierarchy --group --stdio
./perf annotate --data-type --insn-stat
perf annotate logs:
==================
Annotate Instruction stats
total 609, ok 446 (73.2%), bad 163 (26.8%)
Name/opcode          :  Good   Bad
-----------------------------------------------------------
58                   :   323    80
32                   :    49    43
34                   :    33    11
OP_31_XOP_LDX        :     8    20
40                   :    23     0
OP_31_XOP_LWARX      :     5     1
OP_31_XOP_LWZX       :     2     3
OP_31_XOP_LDARX      :     3     0
33                   :     0     2
OP_31_XOP_LBZX       :     0     1
OP_31_XOP_LWAX       :     0     1
OP_31_XOP_LHZX       :     0     1
perf report logs:
=================
Total Lost Samples: 0
Samples: 1K of event 'mem-loads'
Event count (approx.): 937238
Overhead  Data Type                  Data Type Offset
........  .........                  ................
  48.60%  (unknown)                  (unknown) +0 (no field)
  11.42%  long unsigned int          long unsigned int +0 (current_stack_pointer)
   4.68%  struct paca_struct         struct paca_struct +2312 (__current)
   4.57%  struct paca_struct         struct paca_struct +2354 (irq_soft_mask)
   2.69%  struct paca_struct         struct paca_struct +2808 (canary)
   2.68%  struct paca_struct         struct paca_struct +8 (paca_index)
   2.24%  struct paca_struct         struct paca_struct +48 (data_offset)
   1.43%  long unsigned int          long unsigned int +0 (no field)
   1.41%  struct vm_fault            struct vm_fault +0 (vma)
   1.29%  struct task_struct         struct task_struct +276 (flags)
   1.03%  struct pt_regs             struct pt_regs +264 (user_regs.msr)
   0.90%  struct security_hook_list  struct security_hook_list +0 (list.next)
   0.76%  struct irq_desc            struct irq_desc +304 (irq_data.chip)
   0.76%  struct rq                  struct rq +2856 (cpu)
   0.72%  long long unsigned int     long long unsigned int +0 (no field)
Thanks
Athira Rajeev
Changelog:
From v6 -> v7:
- Addressed review comments from Namhyung
Changed format string space to %-20s while printing
instruction stats in patch 18.
Use cmp_null in patch 17 while comparing var_name to
properly sort with correct order.
From v5 -> v6:
- Addressed review comments from Namhyung
Conditionally define TYPE_STATE_MAX_REGS based on arch.
Added macro for defining width of the raw codes and spaces
in disasm_line__parse_powerpc.
Call disasm_line__parse from disasm_line__parse_powerpc
for generic code.
Renamed symbol__disassemble_dso to symbol__disassemble_raw.
Fixed find_data_type_global_reg to correctly free var_types
and change indent level.
Fixed data_type_cmp and sort__typeoff_sort to include var_name
in comparing data type entries.
From v4 -> v5:
- Addressed review comments from Namhyung
Handle max number of type state regs as 16 for x86 and 32 for
powerpc.
Added generic support for objdump patch first and DSO read
optimisation next
Combined patch 3 and patch 4 in patch series V4 into one patch
Changed reference for "raw_insn" to use "u32"
Split "parse" callback patch changes and "ins__find" patch
changes into two
Instead of making weak function, added get_powerpc_regs to
extract register and offset fields for powerpc
- Addressed compilation failure when "dwarf.h" is not present, i.e. when
elfutils devel is not present. Used #ifdef HAVE_DWARF_SUPPORT
around functions that use Dwarf references. Also
conditionally include some of the header files.
From v3->v4:
- Addressed review comments from Ian by using capstone_init from
"util/print_insn.c" instead of "open_capstone_handle".
- Addressed review comment from Namhyung by moving "opcode"
field from "struct ins" to "struct disasm_line"
From v2->v3:
- Addressed review comments from Christophe and Namhyung for V2
- Changed the approach in powerpc to parse sample for data
type profiling as:
Read directly from DSO using dso__data_read_offset
If that fails for any case, fallback to using libcapstone
If libcapstone is not supported, approach will use objdump
- Include instructions with opcode as 31 and correctly categorize
them as memory or arithmetic instructions.
- Include more instructions for instruction tracking in powerpc
From v1->v2:
- Addressed suggestion from Christophe Leroy and Segher Boessenkool
to use the binary code (raw insn) to fetch opcode, register and
offset fields.
- Added support for instruction tracking in powerpc
- Find the register defined variables (r13 and r1 which points to
local_paca and current_stack_pointer in powerpc)
Athira Rajeev (18):
tools/perf: Move the data structures related to register type to
header file
tools/perf: Add "update_insn_state" callback function to handle arch
specific instruction tracking
tools/perf: Update TYPE_STATE_MAX_REGS to include max of regs in
powerpc
tools/perf: Add disasm_line__parse to parse raw instruction for
powerpc
tools/perf: Add support to capture and parse raw instruction in
powerpc using dso__data_read_offset utility
tools/perf: Update parameters for reg extract functions to use raw
instruction on powerpc
tools/perf: Add parse function for memory instructions in powerpc
tools/perf: Add support to identify memory instructions of opcode 31
in powerpc
tools/perf: Add some of the arithmetic instructions to support
instruction tracking in powerpc
tools/perf: Add more instructions for instruction tracking
tools/perf: Update instruction tracking for powerpc
tools/perf: Make capstone_init non-static so that it can be used
during symbol disassemble
tools/perf: Use capstone_init and remove open_capstone_handle from
disasm.c
tools/perf: Add support to use libcapstone in powerpc
tools/perf: Add support to find global register variables using
find_data_type_global_reg
tools/perf: Add support for global_die to capture name of variable in
case of register defined variable
tools/perf: Update data_type_cmp and sort__typeoff_sort function to
include var_name in comparison
tools/perf: Set instruction name to be used with insn-stat when using
raw instruction
tools/include/linux/string.h | 2 +
tools/lib/string.c | 13 +
tools/perf/arch/arm64/annotate/instructions.c | 3 +-
.../arch/loongarch/annotate/instructions.c | 6 +-
.../perf/arch/powerpc/annotate/instructions.c | 254 ++++++++
tools/perf/arch/powerpc/util/dwarf-regs.c | 53 ++
tools/perf/arch/s390/annotate/instructions.c | 5 +-
tools/perf/arch/x86/annotate/instructions.c | 377 ++++++++++++
tools/perf/builtin-annotate.c | 4 +-
tools/perf/util/annotate-data.c | 544 ++++--------------
tools/perf/util/annotate-data.h | 83 +++
tools/perf/util/annotate.c | 29 +-
tools/perf/util/annotate.h | 6 +-
tools/perf/util/disasm.c | 468 +++++++++++++--
tools/perf/util/disasm.h | 19 +-
tools/perf/util/dwarf-aux.c | 1 +
tools/perf/util/dwarf-aux.h | 1 +
tools/perf/util/include/dwarf-regs.h | 12 +
tools/perf/util/print_insn.c | 15 +-
tools/perf/util/print_insn.h | 5 +
tools/perf/util/sort.c | 25 +-
tools/perf/util/sort.h | 1 +
22 files changed, 1421 insertions(+), 505 deletions(-)
--
2.43.0
* [PATCH V7 01/18] tools/perf: Move the data structures related to register type to header file
From: Athira Rajeev @ 2024-07-13 16:55 UTC
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Data type profiling uses instruction tracking by checking each
instruction and updating the register type state in some data
structures. This is useful to find the data type in cases when the
register state gets transferred from one reg to another. Examples are
the "mov" instruction on x86 and the "mr" instruction on powerpc.
Currently these structures are defined in annotate-data.c and instruction
tracking is implemented only for x86. Move these data structures to the
"annotate-data.h" header file so that other arch implementations can use
them in arch specific files as well.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/util/annotate-data.c | 53 +------------------------------
tools/perf/util/annotate-data.h | 56 +++++++++++++++++++++++++++++++++
2 files changed, 57 insertions(+), 52 deletions(-)
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 965da6c0b542..a4c7f98a75e3 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -31,15 +31,6 @@
static void delete_var_types(struct die_var_type *var_types);
-enum type_state_kind {
- TSR_KIND_INVALID = 0,
- TSR_KIND_TYPE,
- TSR_KIND_PERCPU_BASE,
- TSR_KIND_CONST,
- TSR_KIND_POINTER,
- TSR_KIND_CANARY,
-};
-
#define pr_debug_dtp(fmt, ...) \
do { \
if (debug_type_profile) \
@@ -140,49 +131,7 @@ static void pr_debug_location(Dwarf_Die *die, u64 pc, int reg)
}
}
-/*
- * Type information in a register, valid when @ok is true.
- * The @caller_saved registers are invalidated after a function call.
- */
-struct type_state_reg {
- Dwarf_Die type;
- u32 imm_value;
- bool ok;
- bool caller_saved;
- u8 kind;
-};
-
-/* Type information in a stack location, dynamically allocated */
-struct type_state_stack {
- struct list_head list;
- Dwarf_Die type;
- int offset;
- int size;
- bool compound;
- u8 kind;
-};
-
-/* FIXME: This should be arch-dependent */
-#define TYPE_STATE_MAX_REGS 16
-
-/*
- * State table to maintain type info in each register and stack location.
- * It'll be updated when new variable is allocated or type info is moved
- * to a new location (register or stack). As it'd be used with the
- * shortest path of basic blocks, it only maintains a single table.
- */
-struct type_state {
- /* state of general purpose registers */
- struct type_state_reg regs[TYPE_STATE_MAX_REGS];
- /* state of stack location */
- struct list_head stack_vars;
- /* return value register */
- int ret_reg;
- /* stack pointer register */
- int stack_reg;
-};
-
-static bool has_reg_type(struct type_state *state, int reg)
+bool has_reg_type(struct type_state *state, int reg)
{
return (unsigned)reg < ARRAY_SIZE(state->regs);
}
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 0a57d9f5ee78..cdb5cd8960bb 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -6,6 +6,11 @@
#include <linux/compiler.h>
#include <linux/rbtree.h>
#include <linux/types.h>
+#include "annotate.h"
+
+#ifdef HAVE_DWARF_SUPPORT
+#include "debuginfo.h"
+#endif
struct annotated_op_loc;
struct debuginfo;
@@ -15,6 +20,15 @@ struct hist_entry;
struct map_symbol;
struct thread;
+enum type_state_kind {
+ TSR_KIND_INVALID = 0,
+ TSR_KIND_TYPE,
+ TSR_KIND_PERCPU_BASE,
+ TSR_KIND_CONST,
+ TSR_KIND_POINTER,
+ TSR_KIND_CANARY,
+};
+
/**
* struct annotated_member - Type of member field
* @node: List entry in the parent list
@@ -143,6 +157,47 @@ struct annotated_data_stat {
extern struct annotated_data_stat ann_data_stat;
#ifdef HAVE_DWARF_SUPPORT
+/*
+ * Type information in a register, valid when @ok is true.
+ * The @caller_saved registers are invalidated after a function call.
+ */
+struct type_state_reg {
+ Dwarf_Die type;
+ u32 imm_value;
+ bool ok;
+ bool caller_saved;
+ u8 kind;
+};
+
+/* Type information in a stack location, dynamically allocated */
+struct type_state_stack {
+ struct list_head list;
+ Dwarf_Die type;
+ int offset;
+ int size;
+ bool compound;
+ u8 kind;
+};
+
+/* FIXME: This should be arch-dependent */
+#define TYPE_STATE_MAX_REGS 16
+
+/*
+ * State table to maintain type info in each register and stack location.
+ * It'll be updated when new variable is allocated or type info is moved
+ * to a new location (register or stack). As it'd be used with the
+ * shortest path of basic blocks, it only maintains a single table.
+ */
+struct type_state {
+ /* state of general purpose registers */
+ struct type_state_reg regs[TYPE_STATE_MAX_REGS];
+ /* state of stack location */
+ struct list_head stack_vars;
+ /* return value register */
+ int ret_reg;
+ /* stack pointer register */
+ int stack_reg;
+};
/* Returns data type at the location (ip, reg, offset) */
struct annotated_data_type *find_data_type(struct data_loc_info *dloc);
@@ -160,6 +215,7 @@ void global_var_type__tree_delete(struct rb_root *root);
int hist_entry__annotate_data_tty(struct hist_entry *he, struct evsel *evsel);
+bool has_reg_type(struct type_state *state, int reg);
#else /* HAVE_DWARF_SUPPORT */
static inline struct annotated_data_type *
--
2.43.0
* [PATCH V7 02/18] tools/perf: Add "update_insn_state" callback function to handle arch specific instruction tracking
From: Athira Rajeev @ 2024-07-13 16:55 UTC
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Add "update_insn_state" callback to "struct arch" to handle instruction
tracking. Currently updating instruction state is handled by the static
function "update_insn_state_x86", which is defined in "annotate-data.c".
Make this a callback for the specific arch and move it to the arch specific
file "arch/x86/annotate/instructions.c". This will help to add helper
functions for other platforms in the file
"arch/<platform>/annotate/instructions.c" and make changes/updates
easier.
Define the callback "update_insn_state" as part of "struct arch", and also
make some of the debug functions non-static so that they can be referenced
from other places.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/arch/x86/annotate/instructions.c | 377 +++++++++++++++++++
tools/perf/util/annotate-data.c | 391 +-------------------
tools/perf/util/annotate-data.h | 23 ++
tools/perf/util/disasm.c | 4 +
tools/perf/util/disasm.h | 12 +
5 files changed, 424 insertions(+), 383 deletions(-)
diff --git a/tools/perf/arch/x86/annotate/instructions.c b/tools/perf/arch/x86/annotate/instructions.c
index 5cdf457f5cbe..7b7d462c6c6b 100644
--- a/tools/perf/arch/x86/annotate/instructions.c
+++ b/tools/perf/arch/x86/annotate/instructions.c
@@ -206,3 +206,380 @@ static int x86__annotate_init(struct arch *arch, char *cpuid)
arch->initialized = true;
return err;
}
+
+#ifdef HAVE_DWARF_SUPPORT
+static void update_insn_state_x86(struct type_state *state,
+ struct data_loc_info *dloc, Dwarf_Die *cu_die,
+ struct disasm_line *dl)
+{
+ struct annotated_insn_loc loc;
+ struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
+ struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
+ struct type_state_reg *tsr;
+ Dwarf_Die type_die;
+ u32 insn_offset = dl->al.offset;
+ int fbreg = dloc->fbreg;
+ int fboff = 0;
+
+ if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
+ return;
+
+ if (ins__is_call(&dl->ins)) {
+ struct symbol *func = dl->ops.target.sym;
+
+ if (func == NULL)
+ return;
+
+ /* __fentry__ will preserve all registers */
+ if (!strcmp(func->name, "__fentry__"))
+ return;
+
+ pr_debug_dtp("call [%x] %s\n", insn_offset, func->name);
+
+ /* Otherwise invalidate caller-saved registers after call */
+ for (unsigned i = 0; i < ARRAY_SIZE(state->regs); i++) {
+ if (state->regs[i].caller_saved)
+ state->regs[i].ok = false;
+ }
+
+ /* Update register with the return type (if any) */
+ if (die_find_func_rettype(cu_die, func->name, &type_die)) {
+ tsr = &state->regs[state->ret_reg];
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("call [%x] return -> reg%d",
+ insn_offset, state->ret_reg);
+ pr_debug_type_name(&type_die, tsr->kind);
+ }
+ return;
+ }
+
+ if (!strncmp(dl->ins.name, "add", 3)) {
+ u64 imm_value = -1ULL;
+ int offset;
+ const char *var_name = NULL;
+ struct map_symbol *ms = dloc->ms;
+ u64 ip = ms->sym->start + dl->al.offset;
+
+ if (!has_reg_type(state, dst->reg1))
+ return;
+
+ tsr = &state->regs[dst->reg1];
+
+ if (src->imm)
+ imm_value = src->offset;
+ else if (has_reg_type(state, src->reg1) &&
+ state->regs[src->reg1].kind == TSR_KIND_CONST)
+ imm_value = state->regs[src->reg1].imm_value;
+ else if (src->reg1 == DWARF_REG_PC) {
+ u64 var_addr = annotate_calc_pcrel(dloc->ms, ip,
+ src->offset, dl);
+
+ if (get_global_var_info(dloc, var_addr,
+ &var_name, &offset) &&
+ !strcmp(var_name, "this_cpu_off") &&
+ tsr->kind == TSR_KIND_CONST) {
+ tsr->kind = TSR_KIND_PERCPU_BASE;
+ imm_value = tsr->imm_value;
+ }
+ }
+ else
+ return;
+
+ if (tsr->kind != TSR_KIND_PERCPU_BASE)
+ return;
+
+ if (get_global_var_type(cu_die, dloc, ip, imm_value, &offset,
+ &type_die) && offset == 0) {
+ /*
+ * This is not a pointer type, but it should be treated
+ * as a pointer.
+ */
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_POINTER;
+ tsr->ok = true;
+
+ pr_debug_dtp("add [%x] percpu %#"PRIx64" -> reg%d",
+ insn_offset, imm_value, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ return;
+ }
+
+ if (strncmp(dl->ins.name, "mov", 3))
+ return;
+
+ if (dloc->fb_cfa) {
+ u64 ip = dloc->ms->sym->start + dl->al.offset;
+ u64 pc = map__rip_2objdump(dloc->ms->map, ip);
+
+ if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0)
+ fbreg = -1;
+ }
+
+ /* Case 1. register to register or segment:offset to register transfers */
+ if (!src->mem_ref && !dst->mem_ref) {
+ if (!has_reg_type(state, dst->reg1))
+ return;
+
+ tsr = &state->regs[dst->reg1];
+ if (dso__kernel(map__dso(dloc->ms->map)) &&
+ src->segment == INSN_SEG_X86_GS && src->imm) {
+ u64 ip = dloc->ms->sym->start + dl->al.offset;
+ u64 var_addr;
+ int offset;
+
+ /*
+ * In kernel, %gs points to a per-cpu region for the
+ * current CPU. Access with a constant offset should
+ * be treated as a global variable access.
+ */
+ var_addr = src->offset;
+
+ if (var_addr == 40) {
+ tsr->kind = TSR_KIND_CANARY;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] stack canary -> reg%d\n",
+ insn_offset, dst->reg1);
+ return;
+ }
+
+ if (!get_global_var_type(cu_die, dloc, ip, var_addr,
+ &offset, &type_die) ||
+ !die_get_member_type(&type_die, offset, &type_die)) {
+ tsr->ok = false;
+ return;
+ }
+
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] this-cpu addr=%#"PRIx64" -> reg%d",
+ insn_offset, var_addr, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ return;
+ }
+
+ if (src->imm) {
+ tsr->kind = TSR_KIND_CONST;
+ tsr->imm_value = src->offset;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] imm=%#x -> reg%d\n",
+ insn_offset, tsr->imm_value, dst->reg1);
+ return;
+ }
+
+ if (!has_reg_type(state, src->reg1) ||
+ !state->regs[src->reg1].ok) {
+ tsr->ok = false;
+ return;
+ }
+
+ tsr->type = state->regs[src->reg1].type;
+ tsr->kind = state->regs[src->reg1].kind;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] reg%d -> reg%d",
+ insn_offset, src->reg1, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /* Case 2. memory to register transers */
+ if (src->mem_ref && !dst->mem_ref) {
+ int sreg = src->reg1;
+
+ if (!has_reg_type(state, dst->reg1))
+ return;
+
+ tsr = &state->regs[dst->reg1];
+
+retry:
+ /* Check stack variables with offset */
+ if (sreg == fbreg) {
+ struct type_state_stack *stack;
+ int offset = src->offset - fboff;
+
+ stack = find_stack_state(state, offset);
+ if (stack == NULL) {
+ tsr->ok = false;
+ return;
+ } else if (!stack->compound) {
+ tsr->type = stack->type;
+ tsr->kind = stack->kind;
+ tsr->ok = true;
+ } else if (die_get_member_type(&stack->type,
+ offset - stack->offset,
+ &type_die)) {
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+ } else {
+ tsr->ok = false;
+ return;
+ }
+
+ pr_debug_dtp("mov [%x] -%#x(stack) -> reg%d",
+ insn_offset, -offset, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /* And then dereference the pointer if it has one */
+ else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
+ state->regs[sreg].kind == TSR_KIND_TYPE &&
+ die_deref_ptr_type(&state->regs[sreg].type,
+ src->offset, &type_die)) {
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] %#x(reg%d) -> reg%d",
+ insn_offset, src->offset, sreg, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /* Or check if it's a global variable */
+ else if (sreg == DWARF_REG_PC) {
+ struct map_symbol *ms = dloc->ms;
+ u64 ip = ms->sym->start + dl->al.offset;
+ u64 addr;
+ int offset;
+
+ addr = annotate_calc_pcrel(ms, ip, src->offset, dl);
+
+ if (!get_global_var_type(cu_die, dloc, ip, addr, &offset,
+ &type_die) ||
+ !die_get_member_type(&type_die, offset, &type_die)) {
+ tsr->ok = false;
+ return;
+ }
+
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] global addr=%"PRIx64" -> reg%d",
+ insn_offset, addr, dst->reg1);
+ pr_debug_type_name(&type_die, tsr->kind);
+ }
+ /* And check percpu access with base register */
+ else if (has_reg_type(state, sreg) &&
+ state->regs[sreg].kind == TSR_KIND_PERCPU_BASE) {
+ u64 ip = dloc->ms->sym->start + dl->al.offset;
+ u64 var_addr = src->offset;
+ int offset;
+
+ if (src->multi_regs) {
+ int reg2 = (sreg == src->reg1) ? src->reg2 : src->reg1;
+
+ if (has_reg_type(state, reg2) && state->regs[reg2].ok &&
+ state->regs[reg2].kind == TSR_KIND_CONST)
+ var_addr += state->regs[reg2].imm_value;
+ }
+
+ /*
+ * In kernel, %gs points to a per-cpu region for the
+ * current CPU. Access with a constant offset should
+ * be treated as a global variable access.
+ */
+ if (get_global_var_type(cu_die, dloc, ip, var_addr,
+ &offset, &type_die) &&
+ die_get_member_type(&type_die, offset, &type_die)) {
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ if (src->multi_regs) {
+ pr_debug_dtp("mov [%x] percpu %#x(reg%d,reg%d) -> reg%d",
+ insn_offset, src->offset, src->reg1,
+ src->reg2, dst->reg1);
+ } else {
+ pr_debug_dtp("mov [%x] percpu %#x(reg%d) -> reg%d",
+ insn_offset, src->offset, sreg, dst->reg1);
+ }
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ } else {
+ tsr->ok = false;
+ }
+ }
+ /* And then dereference the calculated pointer if it has one */
+ else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
+ state->regs[sreg].kind == TSR_KIND_POINTER &&
+ die_get_member_type(&state->regs[sreg].type,
+ src->offset, &type_die)) {
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] pointer %#x(reg%d) -> reg%d",
+ insn_offset, src->offset, sreg, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /* Or try another register if any */
+ else if (src->multi_regs && sreg == src->reg1 &&
+ src->reg1 != src->reg2) {
+ sreg = src->reg2;
+ goto retry;
+ }
+ else {
+ int offset;
+ const char *var_name = NULL;
+
+ /* it might be per-cpu variable (in kernel) access */
+ if (src->offset < 0) {
+ if (get_global_var_info(dloc, (s64)src->offset,
+ &var_name, &offset) &&
+ !strcmp(var_name, "__per_cpu_offset")) {
+ tsr->kind = TSR_KIND_PERCPU_BASE;
+
+ pr_debug_dtp("mov [%x] percpu base reg%d\n",
+ insn_offset, dst->reg1);
+ }
+ }
+
+ tsr->ok = false;
+ }
+ }
+ /* Case 3. register to memory transfers */
+ if (!src->mem_ref && dst->mem_ref) {
+ if (!has_reg_type(state, src->reg1) ||
+ !state->regs[src->reg1].ok)
+ return;
+
+ /* Check stack variables with offset */
+ if (dst->reg1 == fbreg) {
+ struct type_state_stack *stack;
+ int offset = dst->offset - fboff;
+
+ tsr = &state->regs[src->reg1];
+
+ stack = find_stack_state(state, offset);
+ if (stack) {
+ /*
+ * The source register is likely to hold a type
+ * of member if it's a compound type. Do not
+ * update the stack variable type since we can
+ * get the member type later by using the
+ * die_get_member_type().
+ */
+ if (!stack->compound)
+ set_stack_state(stack, offset, tsr->kind,
+ &tsr->type);
+ } else {
+ findnew_stack_state(state, offset, tsr->kind,
+ &tsr->type);
+ }
+
+ pr_debug_dtp("mov [%x] reg%d -> -%#x(stack)",
+ insn_offset, src->reg1, -offset);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /*
+ * Ignore other transfers since it'd set a value in a struct
+ * and won't change the type.
+ */
+ }
+ /* Case 4. memory to memory transfers (not handled for now) */
+}
+#endif
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index a4c7f98a75e3..7a48c3d72b89 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -39,7 +39,7 @@ do { \
pr_debug3(fmt, ##__VA_ARGS__); \
} while (0)
-static void pr_debug_type_name(Dwarf_Die *die, enum type_state_kind kind)
+void pr_debug_type_name(Dwarf_Die *die, enum type_state_kind kind)
{
struct strbuf sb;
char *str;
@@ -390,7 +390,7 @@ static int check_variable(struct data_loc_info *dloc, Dwarf_Die *var_die,
return 0;
}
-static struct type_state_stack *find_stack_state(struct type_state *state,
+struct type_state_stack *find_stack_state(struct type_state *state,
int offset)
{
struct type_state_stack *stack;
@@ -406,7 +406,7 @@ static struct type_state_stack *find_stack_state(struct type_state *state,
return NULL;
}
-static void set_stack_state(struct type_state_stack *stack, int offset, u8 kind,
+void set_stack_state(struct type_state_stack *stack, int offset, u8 kind,
Dwarf_Die *type_die)
{
int tag;
@@ -433,7 +433,7 @@ static void set_stack_state(struct type_state_stack *stack, int offset, u8 kind,
}
}
-static struct type_state_stack *findnew_stack_state(struct type_state *state,
+struct type_state_stack *findnew_stack_state(struct type_state *state,
int offset, u8 kind,
Dwarf_Die *type_die)
{
@@ -537,7 +537,7 @@ void global_var_type__tree_delete(struct rb_root *root)
}
}
-static bool get_global_var_info(struct data_loc_info *dloc, u64 addr,
+bool get_global_var_info(struct data_loc_info *dloc, u64 addr,
const char **var_name, int *var_offset)
{
struct addr_location al;
@@ -611,7 +611,7 @@ static void global_var__collect(struct data_loc_info *dloc)
}
}
-static bool get_global_var_type(Dwarf_Die *cu_die, struct data_loc_info *dloc,
+bool get_global_var_type(Dwarf_Die *cu_die, struct data_loc_info *dloc,
u64 ip, u64 var_addr, int *var_offset,
Dwarf_Die *type_die)
{
@@ -722,381 +722,6 @@ static void update_var_state(struct type_state *state, struct data_loc_info *dlo
}
}
-static void update_insn_state_x86(struct type_state *state,
- struct data_loc_info *dloc, Dwarf_Die *cu_die,
- struct disasm_line *dl)
-{
- struct annotated_insn_loc loc;
- struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
- struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
- struct type_state_reg *tsr;
- Dwarf_Die type_die;
- u32 insn_offset = dl->al.offset;
- int fbreg = dloc->fbreg;
- int fboff = 0;
-
- if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
- return;
-
- if (ins__is_call(&dl->ins)) {
- struct symbol *func = dl->ops.target.sym;
-
- if (func == NULL)
- return;
-
- /* __fentry__ will preserve all registers */
- if (!strcmp(func->name, "__fentry__"))
- return;
-
- pr_debug_dtp("call [%x] %s\n", insn_offset, func->name);
-
- /* Otherwise invalidate caller-saved registers after call */
- for (unsigned i = 0; i < ARRAY_SIZE(state->regs); i++) {
- if (state->regs[i].caller_saved)
- state->regs[i].ok = false;
- }
-
- /* Update register with the return type (if any) */
- if (die_find_func_rettype(cu_die, func->name, &type_die)) {
- tsr = &state->regs[state->ret_reg];
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("call [%x] return -> reg%d",
- insn_offset, state->ret_reg);
- pr_debug_type_name(&type_die, tsr->kind);
- }
- return;
- }
-
- if (!strncmp(dl->ins.name, "add", 3)) {
- u64 imm_value = -1ULL;
- int offset;
- const char *var_name = NULL;
- struct map_symbol *ms = dloc->ms;
- u64 ip = ms->sym->start + dl->al.offset;
-
- if (!has_reg_type(state, dst->reg1))
- return;
-
- tsr = &state->regs[dst->reg1];
-
- if (src->imm)
- imm_value = src->offset;
- else if (has_reg_type(state, src->reg1) &&
- state->regs[src->reg1].kind == TSR_KIND_CONST)
- imm_value = state->regs[src->reg1].imm_value;
- else if (src->reg1 == DWARF_REG_PC) {
- u64 var_addr = annotate_calc_pcrel(dloc->ms, ip,
- src->offset, dl);
-
- if (get_global_var_info(dloc, var_addr,
- &var_name, &offset) &&
- !strcmp(var_name, "this_cpu_off") &&
- tsr->kind == TSR_KIND_CONST) {
- tsr->kind = TSR_KIND_PERCPU_BASE;
- imm_value = tsr->imm_value;
- }
- }
- else
- return;
-
- if (tsr->kind != TSR_KIND_PERCPU_BASE)
- return;
-
- if (get_global_var_type(cu_die, dloc, ip, imm_value, &offset,
- &type_die) && offset == 0) {
- /*
- * This is not a pointer type, but it should be treated
- * as a pointer.
- */
- tsr->type = type_die;
- tsr->kind = TSR_KIND_POINTER;
- tsr->ok = true;
-
- pr_debug_dtp("add [%x] percpu %#"PRIx64" -> reg%d",
- insn_offset, imm_value, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- return;
- }
-
- if (strncmp(dl->ins.name, "mov", 3))
- return;
-
- if (dloc->fb_cfa) {
- u64 ip = dloc->ms->sym->start + dl->al.offset;
- u64 pc = map__rip_2objdump(dloc->ms->map, ip);
-
- if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0)
- fbreg = -1;
- }
-
- /* Case 1. register to register or segment:offset to register transfers */
- if (!src->mem_ref && !dst->mem_ref) {
- if (!has_reg_type(state, dst->reg1))
- return;
-
- tsr = &state->regs[dst->reg1];
- if (dso__kernel(map__dso(dloc->ms->map)) &&
- src->segment == INSN_SEG_X86_GS && src->imm) {
- u64 ip = dloc->ms->sym->start + dl->al.offset;
- u64 var_addr;
- int offset;
-
- /*
- * In kernel, %gs points to a per-cpu region for the
- * current CPU. Access with a constant offset should
- * be treated as a global variable access.
- */
- var_addr = src->offset;
-
- if (var_addr == 40) {
- tsr->kind = TSR_KIND_CANARY;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] stack canary -> reg%d\n",
- insn_offset, dst->reg1);
- return;
- }
-
- if (!get_global_var_type(cu_die, dloc, ip, var_addr,
- &offset, &type_die) ||
- !die_get_member_type(&type_die, offset, &type_die)) {
- tsr->ok = false;
- return;
- }
-
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] this-cpu addr=%#"PRIx64" -> reg%d",
- insn_offset, var_addr, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- return;
- }
-
- if (src->imm) {
- tsr->kind = TSR_KIND_CONST;
- tsr->imm_value = src->offset;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] imm=%#x -> reg%d\n",
- insn_offset, tsr->imm_value, dst->reg1);
- return;
- }
-
- if (!has_reg_type(state, src->reg1) ||
- !state->regs[src->reg1].ok) {
- tsr->ok = false;
- return;
- }
-
- tsr->type = state->regs[src->reg1].type;
- tsr->kind = state->regs[src->reg1].kind;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] reg%d -> reg%d",
- insn_offset, src->reg1, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /* Case 2. memory to register transers */
- if (src->mem_ref && !dst->mem_ref) {
- int sreg = src->reg1;
-
- if (!has_reg_type(state, dst->reg1))
- return;
-
- tsr = &state->regs[dst->reg1];
-
-retry:
- /* Check stack variables with offset */
- if (sreg == fbreg) {
- struct type_state_stack *stack;
- int offset = src->offset - fboff;
-
- stack = find_stack_state(state, offset);
- if (stack == NULL) {
- tsr->ok = false;
- return;
- } else if (!stack->compound) {
- tsr->type = stack->type;
- tsr->kind = stack->kind;
- tsr->ok = true;
- } else if (die_get_member_type(&stack->type,
- offset - stack->offset,
- &type_die)) {
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
- } else {
- tsr->ok = false;
- return;
- }
-
- pr_debug_dtp("mov [%x] -%#x(stack) -> reg%d",
- insn_offset, -offset, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /* And then dereference the pointer if it has one */
- else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
- state->regs[sreg].kind == TSR_KIND_TYPE &&
- die_deref_ptr_type(&state->regs[sreg].type,
- src->offset, &type_die)) {
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] %#x(reg%d) -> reg%d",
- insn_offset, src->offset, sreg, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /* Or check if it's a global variable */
- else if (sreg == DWARF_REG_PC) {
- struct map_symbol *ms = dloc->ms;
- u64 ip = ms->sym->start + dl->al.offset;
- u64 addr;
- int offset;
-
- addr = annotate_calc_pcrel(ms, ip, src->offset, dl);
-
- if (!get_global_var_type(cu_die, dloc, ip, addr, &offset,
- &type_die) ||
- !die_get_member_type(&type_die, offset, &type_die)) {
- tsr->ok = false;
- return;
- }
-
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] global addr=%"PRIx64" -> reg%d",
- insn_offset, addr, dst->reg1);
- pr_debug_type_name(&type_die, tsr->kind);
- }
- /* And check percpu access with base register */
- else if (has_reg_type(state, sreg) &&
- state->regs[sreg].kind == TSR_KIND_PERCPU_BASE) {
- u64 ip = dloc->ms->sym->start + dl->al.offset;
- u64 var_addr = src->offset;
- int offset;
-
- if (src->multi_regs) {
- int reg2 = (sreg == src->reg1) ? src->reg2 : src->reg1;
-
- if (has_reg_type(state, reg2) && state->regs[reg2].ok &&
- state->regs[reg2].kind == TSR_KIND_CONST)
- var_addr += state->regs[reg2].imm_value;
- }
-
- /*
- * In kernel, %gs points to a per-cpu region for the
- * current CPU. Access with a constant offset should
- * be treated as a global variable access.
- */
- if (get_global_var_type(cu_die, dloc, ip, var_addr,
- &offset, &type_die) &&
- die_get_member_type(&type_die, offset, &type_die)) {
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- if (src->multi_regs) {
- pr_debug_dtp("mov [%x] percpu %#x(reg%d,reg%d) -> reg%d",
- insn_offset, src->offset, src->reg1,
- src->reg2, dst->reg1);
- } else {
- pr_debug_dtp("mov [%x] percpu %#x(reg%d) -> reg%d",
- insn_offset, src->offset, sreg, dst->reg1);
- }
- pr_debug_type_name(&tsr->type, tsr->kind);
- } else {
- tsr->ok = false;
- }
- }
- /* And then dereference the calculated pointer if it has one */
- else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
- state->regs[sreg].kind == TSR_KIND_POINTER &&
- die_get_member_type(&state->regs[sreg].type,
- src->offset, &type_die)) {
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] pointer %#x(reg%d) -> reg%d",
- insn_offset, src->offset, sreg, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /* Or try another register if any */
- else if (src->multi_regs && sreg == src->reg1 &&
- src->reg1 != src->reg2) {
- sreg = src->reg2;
- goto retry;
- }
- else {
- int offset;
- const char *var_name = NULL;
-
- /* it might be per-cpu variable (in kernel) access */
- if (src->offset < 0) {
- if (get_global_var_info(dloc, (s64)src->offset,
- &var_name, &offset) &&
- !strcmp(var_name, "__per_cpu_offset")) {
- tsr->kind = TSR_KIND_PERCPU_BASE;
-
- pr_debug_dtp("mov [%x] percpu base reg%d\n",
- insn_offset, dst->reg1);
- }
- }
-
- tsr->ok = false;
- }
- }
- /* Case 3. register to memory transfers */
- if (!src->mem_ref && dst->mem_ref) {
- if (!has_reg_type(state, src->reg1) ||
- !state->regs[src->reg1].ok)
- return;
-
- /* Check stack variables with offset */
- if (dst->reg1 == fbreg) {
- struct type_state_stack *stack;
- int offset = dst->offset - fboff;
-
- tsr = &state->regs[src->reg1];
-
- stack = find_stack_state(state, offset);
- if (stack) {
- /*
- * The source register is likely to hold a type
- * of member if it's a compound type. Do not
- * update the stack variable type since we can
- * get the member type later by using the
- * die_get_member_type().
- */
- if (!stack->compound)
- set_stack_state(stack, offset, tsr->kind,
- &tsr->type);
- } else {
- findnew_stack_state(state, offset, tsr->kind,
- &tsr->type);
- }
-
- pr_debug_dtp("mov [%x] reg%d -> -%#x(stack)",
- insn_offset, src->reg1, -offset);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /*
- * Ignore other transfers since it'd set a value in a struct
- * and won't change the type.
- */
- }
- /* Case 4. memory to memory transfers (not handled for now) */
-}
-
/**
* update_insn_state - Update type state for an instruction
* @state: type state table
@@ -1115,8 +740,8 @@ static void update_insn_state_x86(struct type_state *state,
static void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
Dwarf_Die *cu_die, struct disasm_line *dl)
{
- if (arch__is(dloc->arch, "x86"))
- update_insn_state_x86(state, dloc, cu_die, dl);
+ if (dloc->arch->update_insn_state)
+ dloc->arch->update_insn_state(state, dloc, cu_die, dl);
}
/*
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index cdb5cd8960bb..6fe8ee8b8410 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -6,6 +6,7 @@
#include <linux/compiler.h>
#include <linux/rbtree.h>
#include <linux/types.h>
+#include "dwarf-regs.h"
#include "annotate.h"
#ifdef HAVE_DWARF_SUPPORT
@@ -20,6 +21,14 @@ struct hist_entry;
struct map_symbol;
struct thread;
+#define pr_debug_dtp(fmt, ...) \
+do { \
+ if (debug_type_profile) \
+ pr_info(fmt, ##__VA_ARGS__); \
+ else \
+ pr_debug3(fmt, ##__VA_ARGS__); \
+} while (0)
+
enum type_state_kind {
TSR_KIND_INVALID = 0,
TSR_KIND_TYPE,
@@ -216,6 +225,20 @@ void global_var_type__tree_delete(struct rb_root *root);
int hist_entry__annotate_data_tty(struct hist_entry *he, struct evsel *evsel);
bool has_reg_type(struct type_state *state, int reg);
+struct type_state_stack *findnew_stack_state(struct type_state *state,
+ int offset, u8 kind,
+ Dwarf_Die *type_die);
+void set_stack_state(struct type_state_stack *stack, int offset, u8 kind,
+ Dwarf_Die *type_die);
+struct type_state_stack *find_stack_state(struct type_state *state,
+ int offset);
+bool get_global_var_type(Dwarf_Die *cu_die, struct data_loc_info *dloc,
+ u64 ip, u64 var_addr, int *var_offset,
+ Dwarf_Die *type_die);
+bool get_global_var_info(struct data_loc_info *dloc, u64 addr,
+ const char **var_name, int *var_offset);
+void pr_debug_type_name(Dwarf_Die *die, enum type_state_kind kind);
+
#else /* HAVE_DWARF_SUPPORT */
static inline struct annotated_data_type *
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 72aec8f61b94..d2723ba024bf 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -12,6 +12,7 @@
#include <subcmd/run-command.h>
#include "annotate.h"
+#include "annotate-data.h"
#include "build-id.h"
#include "debug.h"
#include "disasm.h"
@@ -145,6 +146,9 @@ static struct arch architectures[] = {
.memory_ref_char = '(',
.imm_char = '$',
},
+#ifdef HAVE_DWARF_SUPPORT
+ .update_insn_state = update_insn_state_x86,
+#endif
},
{
.name = "powerpc",
diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
index 3d381a043520..c835759c8e2b 100644
--- a/tools/perf/util/disasm.h
+++ b/tools/perf/util/disasm.h
@@ -4,11 +4,18 @@
#include "map_symbol.h"
+#ifdef HAVE_DWARF_SUPPORT
+#include "dwarf-aux.h"
+#endif
+
struct annotation_options;
struct disasm_line;
struct ins;
struct evsel;
struct symbol;
+struct data_loc_info;
+struct type_state;
+struct disasm_line;
struct arch {
const char *name;
@@ -32,6 +39,11 @@ struct arch {
char memory_ref_char;
char imm_char;
} objdump;
+#ifdef HAVE_DWARF_SUPPORT
+ void (*update_insn_state)(struct type_state *state,
+ struct data_loc_info *dloc, Dwarf_Die *cu_die,
+ struct disasm_line *dl);
+#endif
};
struct ins {
--
2.43.0
* [PATCH V7 03/18] tools/perf: Update TYPE_STATE_MAX_REGS to include max of regs in powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 01/18] tools/perf: Move the data structures related to register type to header file Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 02/18] tools/perf: Add "update_insn_state" callback function to handle arch specific instruction tracking Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 04/18] tools/perf: Add disasm_line__parse to parse raw instruction for powerpc Athira Rajeev
` (16 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
TYPE_STATE_MAX_REGS is arch-dependent and is currently defined as 16.
has_reg_type() uses TYPE_STATE_MAX_REGS as the upper bound when checking
whether a register number is valid. powerpc has 32 general purpose
registers, so define TYPE_STATE_MAX_REGS as 32 for powerpc.
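The bound check described above can be sketched roughly as follows. This is
a minimal illustration, not the perf implementation: the "demo_" names and
the trimmed-down state table are hypothetical, and only the conditional
value of the limit (32 on powerpc, 16 otherwise) is taken from this patch.

```c
#include <stdbool.h>

/* Assumption: same conditional value as the define touched by this patch. */
#ifdef __powerpc__
#define DEMO_TYPE_STATE_MAX_REGS 32
#else
#define DEMO_TYPE_STATE_MAX_REGS 16
#endif

/* Hypothetical, trimmed-down state table; the real one carries type info. */
struct demo_type_state {
	bool reg_ok[DEMO_TYPE_STATE_MAX_REGS];
};

/* A register number is usable only if it indexes into the state table. */
static bool demo_has_reg_type(const struct demo_type_state *state, int reg)
{
	(void)state;	/* only the table size matters for this check */
	return reg >= 0 && reg < DEMO_TYPE_STATE_MAX_REGS;
}
```

With a limit of 16, a powerpc register such as r31 would be rejected by this
kind of check, which is why the limit has to grow for powerpc.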
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/util/annotate-data.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 6fe8ee8b8410..992b7ce4bd11 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -189,7 +189,11 @@ struct type_state_stack {
};
/* FIXME: This should be arch-dependent */
+#ifdef __powerpc__
+#define TYPE_STATE_MAX_REGS 32
+#else
#define TYPE_STATE_MAX_REGS 16
+#endif
/*
* State table to maintain type info in each register and stack location.
--
2.43.0
* [PATCH V7 04/18] tools/perf: Add disasm_line__parse to parse raw instruction for powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (2 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 03/18] tools/perf: Update TYPE_STATE_MAX_REGS to include max of regs in powerpc Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 05/18] tools/perf: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility Athira Rajeev
` (15 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Currently, the perf tool infrastructure uses the disasm_line__parse function
to parse a disassembled line.
Example snippet from objdump:
objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>
c0000000010224b4: lwz r10,0(r9)
This line "lwz r10,0(r9)" is parsed to extract the instruction name,
register names and offset. On powerpc, data type profiling uses the raw
instruction instead of the objdump result to identify the instruction
category and extract the source/target registers.
Example: 38 01 81 e8 ld r4,312(r1)
Here "38 01 81 e8" is the raw instruction representation. Add function
"disasm_line__parse_powerpc" to handle parsing of the raw instruction.
Also update "struct disasm_line" to save the binary code.
With this change, the function captures:
line -> "38 01 81 e8 ld r4,312(r1)"
raw instruction "38 01 81 e8"
The raw instruction is used later to extract the reg/offset fields. Macros
are added to extract the opcode and register fields. "struct disasm_line"
is updated with a 32-bit union "raw", with members "bytes" and "raw_insn",
to carry the raw code. Function "disasm_line__parse_powerpc" fills in the
raw instruction hex value, and the macros can then be used to get the
opcode. There are no changes to the existing code paths that parse the
disassembled code.
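As an illustration of the capture step above, the sketch below splits a
line such as "38 01 81 e8 ld r4,312(r1)" into the instruction word and the
mnemonic text. It is not the code from this patch: "demo_parse_raw_line" is
a hypothetical helper, and it assumes the four bytes are space separated and
listed in memory order of a little-endian object (first byte least
significant).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_RAW_BYTES 11	/* strlen("38 01 81 e8") */

/*
 * Hypothetical helper: parse the leading "xx xx xx xx" byte dump of @line.
 * On success, *insn holds the instruction word and *mnemonic points at the
 * text following the byte dump (an empty string if there is none).
 */
static int demo_parse_raw_line(const char *line, uint32_t *insn,
			       const char **mnemonic)
{
	unsigned int b[4];
	size_t len;

	while (*line == ' ' || *line == '\t')
		line++;

	if (sscanf(line, "%2x %2x %2x %2x", &b[0], &b[1], &b[2], &b[3]) != 4)
		return -1;

	/* Assumption: little-endian byte order, b[0] is the lowest byte. */
	*insn = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
		((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);

	/* Skip the byte dump and any blanks before the mnemonic. */
	len = strlen(line);
	line += len > DEMO_RAW_BYTES ? DEMO_RAW_BYTES : len;
	while (*line == ' ' || *line == '\t')
		line++;
	*mnemonic = line;

	return 0;
}
```

For the example line this yields the word 0xe8810138 and the mnemonic text
"ld r4,312(r1)"; a line without a byte dump is rejected.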
The size of a raw instruction depends on the architecture. On powerpc,
parsing the disasm line needs to handle reading binary code directly from
the DSO as well as parsing the objdump result. Hence the logic is added in
a separate function instead of updating "disasm_line__parse". Architectures
that use the instruction name with the present approach are not altered.
Since this approach targets powerpc, the macro implementation is added
only for powerpc for now.
Since disasm_line__parse is also used in other cases (perf annotate) and
not only for data type profiling, the powerpc callback includes changes to
work with binary code as well as the mnemonic representation. Also, if the
DSO read fails and libcapstone is not supported, the approach falls back
to objdump. Hence the patch also has changes to ensure that the objdump
path works well.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/include/linux/string.h | 2 +
tools/lib/string.c | 13 +++++
.../perf/arch/powerpc/annotate/instructions.c | 1 +
tools/perf/arch/powerpc/util/dwarf-regs.c | 9 ++++
tools/perf/util/annotate.h | 5 +-
tools/perf/util/disasm.c | 48 ++++++++++++++++++-
6 files changed, 76 insertions(+), 2 deletions(-)
diff --git a/tools/include/linux/string.h b/tools/include/linux/string.h
index db5c99318c79..0acb1fc14e19 100644
--- a/tools/include/linux/string.h
+++ b/tools/include/linux/string.h
@@ -46,5 +46,7 @@ extern char * __must_check skip_spaces(const char *);
extern char *strim(char *);
+extern void remove_spaces(char *s);
+
extern void *memchr_inv(const void *start, int c, size_t bytes);
#endif /* _TOOLS_LINUX_STRING_H_ */
diff --git a/tools/lib/string.c b/tools/lib/string.c
index 8b6892f959ab..3126d2cff716 100644
--- a/tools/lib/string.c
+++ b/tools/lib/string.c
@@ -153,6 +153,19 @@ char *strim(char *s)
return skip_spaces(s);
}
+/*
+ * remove_spaces - Removes whitespaces from @s
+ */
+void remove_spaces(char *s)
+{
+ char *d = s;
+
+ do {
+ while (*d == ' ')
+ ++d;
+ } while ((*s++ = *d++));
+}
+
/**
* strreplace - Replace all occurrences of character in string.
* @s: The string to operate on.
diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index a3f423c27cae..d57fd023ef9c 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -55,6 +55,7 @@ static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
arch->initialized = true;
arch->associate_instruction_ops = powerpc__associate_instruction_ops;
arch->objdump.comment_char = '#';
+ annotate_opts.show_asm_raw = true;
}
return 0;
diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
index 0c4f4caf53ac..430623ca5612 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -98,3 +98,12 @@ int regs_query_register_offset(const char *name)
return roff->ptregs_offset;
return -EINVAL;
}
+
+#define PPC_OP(op) (((op) >> 26) & 0x3F)
+#define PPC_RA(a) (((a) >> 16) & 0x1f)
+#define PPC_RT(t) (((t) >> 21) & 0x1f)
+#define PPC_RB(b) (((b) >> 11) & 0x1f)
+#define PPC_D(D) ((D) & 0xfffe)
+#define PPC_DS(DS) ((DS) & 0xfffc)
+#define OP_LD 58
+#define OP_STD 62
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index d5c821c22f79..9ba772f46270 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -113,7 +113,10 @@ struct annotation_line {
struct disasm_line {
struct ins ins;
struct ins_operands ops;
-
+ union {
+ u8 bytes[4];
+ u32 raw_insn;
+ } raw;
/* This needs to be at the end. */
struct annotation_line al;
};
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index d2723ba024bf..a53591a6111e 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -44,6 +44,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size,
static void ins__sort(struct arch *arch);
static int disasm_line__parse(char *line, const char **namep, char **rawp);
+static int disasm_line__parse_powerpc(struct disasm_line *dl);
static __attribute__((constructor)) void symbol__init_regexpr(void)
{
@@ -845,6 +846,48 @@ static int disasm_line__parse(char *line, const char **namep, char **rawp)
return -1;
}
+/*
+ * Parses the result captured from symbol__disassemble_*
+ * Example, line read from DSO file in powerpc:
+ * line: 38 01 81 e8
+ * opcode: fetched from arch specific get_opcode_insn
+ * rawp_insn: e8810138
+ *
+ * rawp_insn is used later to extract the reg/offset fields
+ */
+#define PPC_OP(op) (((op) >> 26) & 0x3F)
+#define RAW_BYTES 11
+
+static int disasm_line__parse_powerpc(struct disasm_line *dl)
+{
+ char *line = dl->al.line;
+ const char **namep = &dl->ins.name;
+ char **rawp = &dl->ops.raw;
+ char *tmp_raw_insn, *name_raw_insn = skip_spaces(line);
+ char *name = skip_spaces(name_raw_insn + RAW_BYTES);
+ int objdump = 0;
+
+ if (strlen(line) > RAW_BYTES)
+ objdump = 1;
+
+ if (name_raw_insn[0] == '\0')
+ return -1;
+
+ if (objdump) {
+ disasm_line__parse(name, namep, rawp);
+ } else
+ *namep = "";
+
+ tmp_raw_insn = strndup(name_raw_insn, 11);
+ remove_spaces(tmp_raw_insn);
+
+ sscanf(tmp_raw_insn, "%x", &dl->raw.raw_insn);
+ if (objdump)
+ dl->raw.raw_insn = be32_to_cpu(dl->raw.raw_insn);
+
+ return 0;
+}
+
static void annotation_line__init(struct annotation_line *al,
struct annotate_args *args,
int nr)
@@ -898,7 +941,10 @@ struct disasm_line *disasm_line__new(struct annotate_args *args)
goto out_delete;
if (args->offset != -1) {
- if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
+ if (arch__is(args->arch, "powerpc")) {
+ if (disasm_line__parse_powerpc(dl) < 0)
+ goto out_free_line;
+ } else if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
goto out_free_line;
disasm_line__init_ins(dl, args->arch, &args->ms);
--
2.43.0
* [PATCH V7 05/18] tools/perf: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (3 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 04/18] tools/perf: Add disasm_line__parse to parse raw instruction for powerpc Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 06/18] tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc Athira Rajeev
` (14 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Add support to capture and parse the raw instruction on powerpc.
Currently, the perf tool infrastructure has two ways to disassemble
and understand an instruction. One is objdump and the other is
libcapstone.
Currently, the perf tool infrastructure uses the "--no-show-raw-insn" option
with "objdump" while disassembling. An example from powerpc with this option
for an instruction address is:
Snippet from:
objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>
c0000000010224b4: lwz r10,0(r9)
This line "lwz r10,0(r9)" is parsed to extract the instruction name,
register names and offset. Also, to find whether there is a memory
reference in the operands, the "memory_ref_char" field of objdump is used.
For x86, "(" is used as memory_ref_char to handle instructions of the
form "mov (%rax), %rcx".
On powerpc, instructions using "(" are not the only memory
instructions. For example, the above instruction can also be of the extended
form (X form) "lwzx r10,0,r19". In order to easily identify the instruction
category and extract the source/target registers, this patch adds support
for using the raw instruction on powerpc. The approach is to read the raw
instruction directly from the DSO file using the "dso__data_read_offset"
utility, which is already implemented in the perf infrastructure in
"util/dso.c".
Example:
38 01 81 e8 ld r4,312(r1)
Here "38 01 81 e8" is the raw instruction representation. In powerpc,
this translates to instruction form: "ld RT,DS(RA)" and binary code
as:
| 58 | RT  | RA  |      DS      |    |
--------------------------------------
0    6     11    16             30   31
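The DS-form layout above can be decoded with shifts and masks. The sketch
below is only an illustration: the "DEMO_" macros have the same shape as the
PPC_* macros added in this series, while "struct demo_ds_form",
"demo_decode_ds" and the sign extension of the displacement are assumptions
made for the example.

```c
#include <stdint.h>

/* Field extractors, bit 0 being the most significant bit of the word. */
#define DEMO_PPC_OP(insn)	(((insn) >> 26) & 0x3f)	/* bits 0-5   */
#define DEMO_PPC_RT(insn)	(((insn) >> 21) & 0x1f)	/* bits 6-10  */
#define DEMO_PPC_RA(insn)	(((insn) >> 16) & 0x1f)	/* bits 11-15 */
#define DEMO_PPC_DS(insn)	((insn) & 0xfffc)	/* bits 16-29 */

/* Hypothetical container for the decoded fields. */
struct demo_ds_form {
	unsigned int op;
	unsigned int rt;
	unsigned int ra;
	int offset;
};

static struct demo_ds_form demo_decode_ds(uint32_t insn)
{
	struct demo_ds_form f = {
		.op = DEMO_PPC_OP(insn),
		.rt = DEMO_PPC_RT(insn),
		.ra = DEMO_PPC_RA(insn),
		/* Assumption: treat the displacement as a signed 16-bit value. */
		.offset = (int16_t)DEMO_PPC_DS(insn),
	};

	return f;
}
```

For the word 0xe8810138 ("ld r4,312(r1)") this gives opcode 58, RT 4, RA 1
and offset 312.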
Function "symbol__disassemble_raw" is added to read the raw instruction
directly from the DSO using the dso__data_read_offset utility. In case of
the above example, this captures:
line: 38 01 81 e8
The above works well when perf report is invoked with only the data type
sort keys, i.e. type and typeoff, because no instruction level
annotation is needed if only data type information is requested. For
annotating a sample, the "sym" sort key is also needed along with the type
and typeoff sort keys. And by default, invoking just "perf report" uses the
sort key "sym", which displays the symbol information.
With the powerpc approach of first reading the DSO for the raw
instruction, "perf annotate" and "perf report" + "a" key break, since
no instruction level disassembly is done.
Snippet of result from perf report:
Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
do_work /usr/bin/pmlogger [Percent: local period]
Percent│ ea230010
│ 3a550010
│ 3a600000
│ 38f60001
│ 39490008
│ 42400438
51.44 │ 81290008
│ 7d485378
Here, the raw instruction is displayed in the output instead of the human
readable annotated form.
One way to get the appropriate data is to specify "--objdump path", with
which code annotation will be done. But that changes the default
behaviour. To fix this breakage, check if the "sym" sort key is set. If so,
fall back to the libcapstone/objdump way of disassembling the sample.
With the changes, "perf report" shows:
Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
do_work /usr/bin/pmlogger [Percent: local period]
Percent│ ld r17,16(r3)
│ addi r18,r21,16
│ li r19,0
│ 8b0: rldicl r10,r10,63,33
│ addi r10,r10,1
│ mtctr r10
│ ↓ b 8e4
│ 8c0: addi r7,r22,1
│ addi r10,r9,8
│ ↓ bdz d00
51.44 │ lwz r9,8(r9)
│ mr r8,r10
│ cmpw r20,r9
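The sort key check described above can be sketched as a small predicate.
This is a simplified illustration, not the code from the patch:
"demo_use_raw_disasm" is a hypothetical name, and the sketch leaves out the
separate case where an explicit objdump path makes the raw reader bail out.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical predicate: read raw instructions from the DSO only on
 * powerpc, and only when the sort order is set and does not mention "sym".
 * In every other case the caller falls back to libcapstone/objdump.
 */
static bool demo_use_raw_disasm(const char *arch_name, const char *sort_order)
{
	if (strcmp(arch_name, "powerpc") != 0)
		return false;

	return sort_order != NULL && strstr(sort_order, "sym") == NULL;
}
```

With "type,typeoff" as the sort order the raw path is taken. With no explicit
sort order, or with one that includes "sym", the mnemonic disassembly is
used.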
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/util/disasm.c | 101 +++++++++++++++++++++++++++++++++++++++
1 file changed, 101 insertions(+)
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index a53591a6111e..646290b043b2 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -25,6 +25,7 @@
#include "srcline.h"
#include "symbol.h"
#include "util.h"
+#include "sort.h"
static regex_t file_lineno;
@@ -1634,6 +1635,91 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
}
#endif
+static int symbol__disassemble_raw(char *filename, struct symbol *sym,
+ struct annotate_args *args)
+{
+ struct annotation *notes = symbol__annotation(sym);
+ struct map *map = args->ms.map;
+ struct dso *dso = map__dso(map);
+ u64 start = map__rip_2objdump(map, sym->start);
+ u64 end = map__rip_2objdump(map, sym->end);
+ u64 len = end - start;
+ u64 offset;
+ int i, count;
+ u8 *buf = NULL;
+ char disasm_buf[512];
+ struct disasm_line *dl;
+ u32 *line;
+
+ /* Return if objdump is specified explicitly */
+ if (args->options->objdump_path)
+ return -1;
+
+ pr_debug("Reading raw instruction from : %s using dso__data_read_offset\n", filename);
+
+ buf = malloc(len);
+ if (buf == NULL)
+ goto err;
+
+ count = dso__data_read_offset(dso, NULL, sym->start, buf, len);
+
+ line = (u32 *)buf;
+
+ if ((u64)count != len)
+ goto err;
+
+ /* add the function address and name */
+ scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
+ start, sym->name);
+
+ args->offset = -1;
+ args->line = disasm_buf;
+ args->line_nr = 0;
+ args->fileloc = NULL;
+ args->ms.sym = sym;
+
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+
+ /* Each raw instruction is 4 byte */
+ count = len/4;
+
+ for (i = 0, offset = 0; i < count; i++) {
+ args->offset = offset;
+ sprintf(args->line, "%x", line[i]);
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+ offset += 4;
+ }
+
+ /* It failed in the middle */
+ if (offset != len) {
+ struct list_head *list = &notes->src->source;
+
+ /* Discard all lines and fallback to objdump */
+ while (!list_empty(list)) {
+ dl = list_first_entry(list, struct disasm_line, al.node);
+
+ list_del_init(&dl->al.node);
+ disasm_line__free(dl);
+ }
+ count = -1;
+ }
+
+out:
+ free(buf);
+ return count < 0 ? count : 0;
+
+err:
+ count = -1;
+ goto out;
+}
/*
* Possibly create a new version of line with tabs expanded. Returns the
* existing or new line, storage is updated if a new line is allocated. If
@@ -1758,6 +1844,21 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
strcpy(symfs_filename, tmp);
}
+ /*
+ * For powerpc data type profiling, use the dso__data_read_offset
+ * to read raw instruction directly and interpret the binary code
+ * to understand instructions and register fields. For sort keys as
+ * type and typeoff, disassemble to mnemonic notation is
+ * not required in case of powerpc.
+ */
+ if (arch__is(args->arch, "powerpc")) {
+ if (sort_order && !strstr(sort_order, "sym")) {
+ err = symbol__disassemble_raw(symfs_filename, sym, args);
+ if (err == 0)
+ goto out_remove_tmp;
+ }
+ }
+
#ifdef HAVE_LIBCAPSTONE_SUPPORT
err = symbol__disassemble_capstone(symfs_filename, sym, args);
if (err == 0)
--
2.43.0
* [PATCH V7 06/18] tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (4 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 05/18] tools/perf: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 07/18] tools/perf: Add parse function for memory instructions in powerpc Athira Rajeev
` (13 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Use the raw instruction code and macros to identify memory instructions,
extract register fields and also the offset. The implementation addresses
the D-form, X-form and DS-form instructions. Add a "mem_ref" field to check
whether the source/target has a memory reference. Add function
"get_powerpc_regs", which sets these fields: reg1, reg2, offset,
depending on whether it is the source or target ops.
Update the "parse" callback of "struct ins_ops" to also pass "struct
disasm_line" as an argument. This is needed in parse functions where the
opcode is used to determine whether to set multi_regs and other fields.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
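For illustration only, the field extraction described in the commit message can be sketched as a standalone snippet. The EX_* macros and ex_mem_offset() are hypothetical names made up for this example, not the helpers used in the patch. Field positions follow the D/DS-form layout (primary opcode in bits 0-5, RT in bits 6-10, RA in bits 11-15, displacement in bits 16-31, with bit 0 as the most significant bit). Sign-extending the displacement is a choice of this sketch.

```c
#include <stdint.h>

/* Hypothetical helpers for this example only (bit 0 is the MSB). */
#define EX_OP(insn)	(((insn) >> 26) & 0x3f)	/* bits 0-5: primary opcode */
#define EX_RT(insn)	(((insn) >> 21) & 0x1f)	/* bits 6-10: target register */
#define EX_RA(insn)	(((insn) >> 16) & 0x1f)	/* bits 11-15: base register */
#define EX_RB(insn)	(((insn) >> 11) & 0x1f)	/* bits 16-20: index register (X-form) */

#define EX_OP_LD	58	/* DS-form load doubleword */
#define EX_OP_STD	62	/* DS-form store doubleword */

/* Return the signed displacement of a D-form or DS-form instruction. */
static int ex_mem_offset(uint32_t insn)
{
	uint32_t field = insn & 0xffff;

	/* DS-form: the low two bits select the instruction variant, not the offset. */
	if (EX_OP(insn) == EX_OP_LD || EX_OP(insn) == EX_OP_STD)
		field &= 0xfffc;

	/* Sign-extend the 16-bit field without relying on narrowing casts. */
	return (int)field - ((field & 0x8000) ? 0x10000 : 0);
}
```

With these definitions, "lwz r10,0(r9)" (0x81490000) gives RT = 10, RA = 9 and offset 0, and "ld r10,264(r3)" (0xE9430108) gives RT = 10, RA = 3 and offset 264.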
tools/perf/arch/arm64/annotate/instructions.c | 3 +-
.../arch/loongarch/annotate/instructions.c | 6 ++-
tools/perf/arch/powerpc/util/dwarf-regs.c | 44 +++++++++++++++++++
tools/perf/arch/s390/annotate/instructions.c | 5 ++-
tools/perf/util/annotate.c | 19 ++++++--
tools/perf/util/disasm.c | 19 +++++---
tools/perf/util/disasm.h | 5 ++-
tools/perf/util/include/dwarf-regs.h | 11 +++++
8 files changed, 96 insertions(+), 16 deletions(-)
diff --git a/tools/perf/arch/arm64/annotate/instructions.c b/tools/perf/arch/arm64/annotate/instructions.c
index 4af0c3a0f86e..f86d9f4798bd 100644
--- a/tools/perf/arch/arm64/annotate/instructions.c
+++ b/tools/perf/arch/arm64/annotate/instructions.c
@@ -11,7 +11,8 @@ struct arm64_annotate {
static int arm64_mov__parse(struct arch *arch __maybe_unused,
struct ins_operands *ops,
- struct map_symbol *ms __maybe_unused)
+ struct map_symbol *ms __maybe_unused,
+ struct disasm_line *dl __maybe_unused)
{
char *s = strchr(ops->raw, ','), *target, *endptr;
diff --git a/tools/perf/arch/loongarch/annotate/instructions.c b/tools/perf/arch/loongarch/annotate/instructions.c
index 21cc7e4149f7..ab43b1ab51e3 100644
--- a/tools/perf/arch/loongarch/annotate/instructions.c
+++ b/tools/perf/arch/loongarch/annotate/instructions.c
@@ -5,7 +5,8 @@
* Copyright (C) 2020-2023 Loongson Technology Corporation Limited
*/
-static int loongarch_call__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms)
+static int loongarch_call__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms,
+ struct disasm_line *dl __maybe_unused)
{
char *c, *endptr, *tok, *name;
struct map *map = ms->map;
@@ -51,7 +52,8 @@ static struct ins_ops loongarch_call_ops = {
.scnprintf = call__scnprintf,
};
-static int loongarch_jump__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms)
+static int loongarch_jump__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms,
+ struct disasm_line *dl __maybe_unused)
{
struct map *map = ms->map;
struct symbol *sym = ms->sym;
diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
index 430623ca5612..104c7ae5c433 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -107,3 +107,47 @@ int regs_query_register_offset(const char *name)
#define PPC_DS(DS) ((DS) & 0xfffc)
#define OP_LD 58
#define OP_STD 62
+
+static int get_source_reg(u32 raw_insn)
+{
+ return PPC_RA(raw_insn);
+}
+
+static int get_target_reg(u32 raw_insn)
+{
+ return PPC_RT(raw_insn);
+}
+
+static int get_offset_opcode(u32 raw_insn)
+{
+ int opcode = PPC_OP(raw_insn);
+
+ /* DS- form */
+ if ((opcode == OP_LD) || (opcode == OP_STD))
+ return PPC_DS(raw_insn);
+ else
+ return PPC_D(raw_insn);
+}
+
+/*
+ * Fills the required fields for op_loc depending on if it
+ * is a source or target.
+ * D form: ins RT,D(RA) -> src_reg1 = RA, offset = D, dst_reg1 = RT
+ * DS form: ins RT,DS(RA) -> src_reg1 = RA, offset = DS, dst_reg1 = RT
+ * X form: ins RT,RA,RB -> src_reg1 = RA, src_reg2 = RB, dst_reg1 = RT
+ */
+void get_powerpc_regs(u32 raw_insn, int is_source,
+ struct annotated_op_loc *op_loc)
+{
+ if (is_source)
+ op_loc->reg1 = get_source_reg(raw_insn);
+ else
+ op_loc->reg1 = get_target_reg(raw_insn);
+
+ if (op_loc->multi_regs)
+ op_loc->reg2 = PPC_RB(raw_insn);
+
+ /* TODO: Implement offset handling for X Form */
+ if ((op_loc->mem_ref) && (PPC_OP(raw_insn) != 31))
+ op_loc->offset = get_offset_opcode(raw_insn);
+}
diff --git a/tools/perf/arch/s390/annotate/instructions.c b/tools/perf/arch/s390/annotate/instructions.c
index da5aa3e1f04c..eeac25cca699 100644
--- a/tools/perf/arch/s390/annotate/instructions.c
+++ b/tools/perf/arch/s390/annotate/instructions.c
@@ -2,7 +2,7 @@
#include <linux/compiler.h>
static int s390_call__parse(struct arch *arch, struct ins_operands *ops,
- struct map_symbol *ms)
+ struct map_symbol *ms, struct disasm_line *dl __maybe_unused)
{
char *endptr, *tok, *name;
struct map *map = ms->map;
@@ -52,7 +52,8 @@ static struct ins_ops s390_call_ops = {
static int s390_mov__parse(struct arch *arch __maybe_unused,
struct ins_operands *ops,
- struct map_symbol *ms __maybe_unused)
+ struct map_symbol *ms __maybe_unused,
+ struct disasm_line *dl __maybe_unused)
{
char *s = strchr(ops->raw, ','), *target, *endptr;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 1451caf25e77..ce99db291c5e 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2123,20 +2123,33 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
for_each_insn_op_loc(loc, i, op_loc) {
const char *insn_str = ops->source.raw;
bool multi_regs = ops->source.multi_regs;
+ bool mem_ref = ops->source.mem_ref;
if (i == INSN_OP_TARGET) {
insn_str = ops->target.raw;
multi_regs = ops->target.multi_regs;
+ mem_ref = ops->target.mem_ref;
}
/* Invalidate the register by default */
op_loc->reg1 = -1;
op_loc->reg2 = -1;
- if (insn_str == NULL)
- continue;
+ if (insn_str == NULL) {
+ if (!arch__is(arch, "powerpc"))
+ continue;
+ }
- if (strchr(insn_str, arch->objdump.memory_ref_char)) {
+ /*
+ * For powerpc, call get_powerpc_regs function which extracts the
+ * required fields for op_loc, ie reg1, reg2, offset from the
+ * raw instruction.
+ */
+ if (arch__is(arch, "powerpc")) {
+ op_loc->mem_ref = mem_ref;
+ op_loc->multi_regs = multi_regs;
+ get_powerpc_regs(dl->raw.raw_insn, !i, op_loc);
+ } else if (strchr(insn_str, arch->objdump.memory_ref_char)) {
op_loc->mem_ref = true;
op_loc->multi_regs = multi_regs;
extract_reg_offset(arch, insn_str, op_loc);
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 646290b043b2..8e45f0874e03 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -256,7 +256,8 @@ bool ins__is_fused(struct arch *arch, const char *ins1, const char *ins2)
return arch->ins_is_fused(arch, ins1, ins2);
}
-static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms)
+static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms,
+ struct disasm_line *dl __maybe_unused)
{
char *endptr, *tok, *name;
struct map *map = ms->map;
@@ -351,7 +352,8 @@ static inline const char *validate_comma(const char *c, struct ins_operands *ops
return c;
}
-static int jump__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms)
+static int jump__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms,
+ struct disasm_line *dl __maybe_unused)
{
struct map *map = ms->map;
struct symbol *sym = ms->sym;
@@ -510,7 +512,8 @@ static int comment__symbol(char *raw, char *comment, u64 *addrp, char **namep)
return 0;
}
-static int lock__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms)
+static int lock__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms,
+ struct disasm_line *dl __maybe_unused)
{
ops->locked.ops = zalloc(sizeof(*ops->locked.ops));
if (ops->locked.ops == NULL)
@@ -525,7 +528,7 @@ static int lock__parse(struct arch *arch, struct ins_operands *ops, struct map_s
goto out_free_ops;
if (ops->locked.ins.ops->parse &&
- ops->locked.ins.ops->parse(arch, ops->locked.ops, ms) < 0)
+ ops->locked.ins.ops->parse(arch, ops->locked.ops, ms, NULL) < 0)
goto out_free_ops;
return 0;
@@ -596,7 +599,8 @@ static bool check_multi_regs(struct arch *arch, const char *op)
return count > 1;
}
-static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms __maybe_unused)
+static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms __maybe_unused,
+ struct disasm_line *dl __maybe_unused)
{
char *s = strchr(ops->raw, ','), *target, *comment, prev;
@@ -674,7 +678,8 @@ static struct ins_ops mov_ops = {
.scnprintf = mov__scnprintf,
};
-static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map_symbol *ms __maybe_unused)
+static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map_symbol *ms __maybe_unused,
+ struct disasm_line *dl __maybe_unused)
{
char *target, *comment, *s, prev;
@@ -815,7 +820,7 @@ static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, str
if (!dl->ins.ops)
return;
- if (dl->ins.ops->parse && dl->ins.ops->parse(arch, &dl->ops, ms) < 0)
+ if (dl->ins.ops->parse && dl->ins.ops->parse(arch, &dl->ops, ms, dl) < 0)
dl->ins.ops = NULL;
}
diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
index c835759c8e2b..30be0a94ea04 100644
--- a/tools/perf/util/disasm.h
+++ b/tools/perf/util/disasm.h
@@ -62,6 +62,7 @@ struct ins_operands {
bool offset_avail;
bool outside;
bool multi_regs;
+ bool mem_ref;
} target;
union {
struct {
@@ -69,6 +70,7 @@ struct ins_operands {
char *name;
u64 addr;
bool multi_regs;
+ bool mem_ref;
} source;
struct {
struct ins ins;
@@ -83,7 +85,8 @@ struct ins_operands {
struct ins_ops {
void (*free)(struct ins_operands *ops);
- int (*parse)(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms);
+ int (*parse)(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms,
+ struct disasm_line *dl);
int (*scnprintf)(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name);
};
diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
index 01fb25a1150a..75b28dcc8317 100644
--- a/tools/perf/util/include/dwarf-regs.h
+++ b/tools/perf/util/include/dwarf-regs.h
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _PERF_DWARF_REGS_H_
#define _PERF_DWARF_REGS_H_
+#include "annotate.h"
#define DWARF_REG_PC 0xd3af9c /* random number */
#define DWARF_REG_FB 0xd3affb /* random number */
@@ -31,6 +32,16 @@ static inline int get_dwarf_regnum(const char *name __maybe_unused,
}
#endif
+#if !defined(__powerpc__) || !defined(HAVE_DWARF_SUPPORT)
+static inline void get_powerpc_regs(u32 raw_insn __maybe_unused, int is_source __maybe_unused,
+ struct annotated_op_loc *op_loc __maybe_unused)
+{
+ return;
+}
+#else
+void get_powerpc_regs(u32 raw_insn, int is_source, struct annotated_op_loc *op_loc);
+#endif
+
#ifdef HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
/*
* Arch should support fetching the offset of a register in pt_regs
--
2.43.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH V7 07/18] tools/perf: Add parse function for memory instructions in powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (5 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 06/18] tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 08/18] tools/perf: Add support to identify memory instructions of opcode 31 " Athira Rajeev
` (12 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Use the raw instruction code and macros to identify memory instructions
and to extract the register fields and the offset. The implementation
handles D-form, X-form and DS-form instructions. Two main functions are added.
New parse function "load_store__parse" as the instruction ops parser for
memory instructions. Unlike other parsers (like mov__parse), this parser
fills in only the "multi_regs" field for source/target and the newly added
"mem_ref" field. No other fields are set because there is no need to parse
the disassembled code here: arch specific macros take care of extracting
the offset and registers, which is easier and more precise.
In powerpc, all instructions with a primary opcode from 32 to 63
are memory instructions. Update the "ins__find" function to also take
"raw_insn" as a parameter.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
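A minimal sketch of the primary opcode test stated in the commit message, for illustration only. The function name ex_is_mem_primary_opcode() is hypothetical. The test relies on the fact that every 6-bit value from 32 to 63 has bit 0x20 set, while 0 to 31 do not.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the primary opcode (top 6 bits of the
 * instruction word) is in the range 32..63.
 */
static bool ex_is_mem_primary_opcode(uint32_t raw_insn)
{
	uint32_t opcode = (raw_insn >> 26) & 0x3f;

	/* 32..63 is exactly the set of 6-bit values with bit 0x20 set. */
	return (opcode & 0x20) != 0;
}
```

For example, "lwz r10,0(r9)" (0x81490000, opcode 32) passes the test, while "addi r3,r1,16" (0x38610010, opcode 14) and "ldx r10,r3,r4" (0x7D43202A, opcode 31) do not. Opcode 31 needs a separate check.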
.../perf/arch/powerpc/annotate/instructions.c | 16 ++++++
tools/perf/util/disasm.c | 54 +++++++++++++++++--
tools/perf/util/disasm.h | 2 +-
3 files changed, 66 insertions(+), 6 deletions(-)
diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index d57fd023ef9c..b084423d8477 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -49,6 +49,22 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
return ops;
}
+#define PPC_OP(op) (((op) >> 26) & 0x3F)
+
+static struct ins_ops *check_ppc_insn(u32 raw_insn)
+{
+ int opcode = PPC_OP(raw_insn);
+
+ /*
+ * Instructions with opcode 32 to 63 are memory
+ * instructions in powerpc
+ */
+ if ((opcode & 0x20))
+ return &load_store_ops;
+
+ return NULL;
+}
+
static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
{
if (!arch->initialized) {
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 8e45f0874e03..b30277a930c0 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -37,6 +37,7 @@ static struct ins_ops mov_ops;
static struct ins_ops nop_ops;
static struct ins_ops lock_ops;
static struct ins_ops ret_ops;
+static struct ins_ops load_store_ops;
static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name);
@@ -522,7 +523,7 @@ static int lock__parse(struct arch *arch, struct ins_operands *ops, struct map_s
if (disasm_line__parse(ops->raw, &ops->locked.ins.name, &ops->locked.ops->raw) < 0)
goto out_free_ops;
- ops->locked.ins.ops = ins__find(arch, ops->locked.ins.name);
+ ops->locked.ins.ops = ins__find(arch, ops->locked.ins.name, 0);
if (ops->locked.ins.ops == NULL)
goto out_free_ops;
@@ -678,6 +679,37 @@ static struct ins_ops mov_ops = {
.scnprintf = mov__scnprintf,
};
+static int load_store__scnprintf(struct ins *ins, char *bf, size_t size,
+ struct ins_operands *ops, int max_ins_name)
+{
+ return scnprintf(bf, size, "%-*s %s", max_ins_name, ins->name,
+ ops->raw);
+}
+
+/*
+ * Sets the fields: multi_regs and "mem_ref".
+ * "mem_ref" is set for ops->source which is later used to
+ * fill the objdump->memory_ref_char field. This ops is currently
+ * used by powerpc and since binary instruction code is used to
+ * extract opcode, regs and offset, no other parsing is needed here
+ */
+static int load_store__parse(struct arch *arch __maybe_unused, struct ins_operands *ops,
+ struct map_symbol *ms __maybe_unused, struct disasm_line *dl __maybe_unused)
+{
+ ops->source.mem_ref = true;
+ ops->source.multi_regs = false;
+
+ ops->target.mem_ref = false;
+ ops->target.multi_regs = false;
+
+ return 0;
+}
+
+static struct ins_ops load_store_ops = {
+ .parse = load_store__parse,
+ .scnprintf = load_store__scnprintf,
+};
+
static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map_symbol *ms __maybe_unused,
struct disasm_line *dl __maybe_unused)
{
@@ -769,11 +801,23 @@ static void ins__sort(struct arch *arch)
qsort(arch->instructions, nmemb, sizeof(struct ins), ins__cmp);
}
-static struct ins_ops *__ins__find(struct arch *arch, const char *name)
+static struct ins_ops *__ins__find(struct arch *arch, const char *name, u32 raw_insn)
{
struct ins *ins;
const int nmemb = arch->nr_instructions;
+ if (arch__is(arch, "powerpc")) {
+ /*
+ * For powerpc, identify the instruction ops
+ * from the opcode using raw_insn.
+ */
+ struct ins_ops *ops;
+
+ ops = check_ppc_insn(raw_insn);
+ if (ops)
+ return ops;
+ }
+
if (!arch->sorted_instructions) {
ins__sort(arch);
arch->sorted_instructions = true;
@@ -803,9 +847,9 @@ static struct ins_ops *__ins__find(struct arch *arch, const char *name)
return ins ? ins->ops : NULL;
}
-struct ins_ops *ins__find(struct arch *arch, const char *name)
+struct ins_ops *ins__find(struct arch *arch, const char *name, u32 raw_insn)
{
- struct ins_ops *ops = __ins__find(arch, name);
+ struct ins_ops *ops = __ins__find(arch, name, raw_insn);
if (!ops && arch->associate_instruction_ops)
ops = arch->associate_instruction_ops(arch, name);
@@ -815,7 +859,7 @@ struct ins_ops *ins__find(struct arch *arch, const char *name)
static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map_symbol *ms)
{
- dl->ins.ops = ins__find(arch, dl->ins.name);
+ dl->ins.ops = ins__find(arch, dl->ins.name, dl->raw.raw_insn);
if (!dl->ins.ops)
return;
diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
index 30be0a94ea04..c1bb1e484bfb 100644
--- a/tools/perf/util/disasm.h
+++ b/tools/perf/util/disasm.h
@@ -105,7 +105,7 @@ struct annotate_args {
struct arch *arch__find(const char *name);
bool arch__is(struct arch *arch, const char *name);
-struct ins_ops *ins__find(struct arch *arch, const char *name);
+struct ins_ops *ins__find(struct arch *arch, const char *name, u32 raw_insn);
int ins__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name);
--
2.43.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH V7 08/18] tools/perf: Add support to identify memory instructions of opcode 31 in powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (6 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 07/18] tools/perf: Add parse function for memory instructions in powerpc Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 09/18] tools/perf: Add some of the arithmetic instructions to support instruction tracking " Athira Rajeev
` (11 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
There are memory instructions in powerpc with primary opcode 31.
Example: "ldx RT,RA,RB". Its X-form layout is as below:
______________________________________
| 31 | RT | RA | RB | 21 |/|
--------------------------------------
0 6 11 16 21 30 31
The opcode for "ldx" is 31. Other memory instructions such as ldux,
stbx, lwzx and lhaux also have opcode 31. But not all instructions with
opcode 31 are memory instructions. An example is the add instruction:
"add RT,RA,RB".
The value in bits 21-30 [ 21 for ldx ] differs between these
instructions. The patch uses this value to assign instruction ops for
these cases. The naming convention and the values used to identify them
are picked from the defines in "arch/powerpc/include/asm/ppc-opcode.h".
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
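The lookup described in the commit message can be sketched as follows. This is an illustration under assumptions: the ex_* names are hypothetical, and the table holds only a small subset of the extended opcodes (values taken from the table in this patch). The table must stay sorted by value because bsearch() requires sorted input.

```c
#include <stdint.h>
#include <stdlib.h>

struct ex_xop {
	const char *name;
	int value;	/* extended opcode, bits 21-30 */
};

/* Small illustrative subset, sorted by value. */
static const struct ex_xop ex_mem_xops[] = {
	{ "ldx",   21 },
	{ "lwzx",  23 },
	{ "ldux",  53 },
	{ "stbx",  215 },
	{ "lhaux", 375 },
};

static int ex_cmp_xop(const void *a, const void *b)
{
	const struct ex_xop *x = a;
	const struct ex_xop *y = b;

	return x->value - y->value;
}

/* Return 1 if raw_insn has opcode 31 and its bits 21-30 are in the table. */
static int ex_is_mem_op31(uint32_t raw_insn)
{
	struct ex_xop key = { "key", (int)((raw_insn >> 1) & 0x3ff) };

	if (((raw_insn >> 26) & 0x3f) != 31)
		return 0;

	return bsearch(&key, ex_mem_xops,
		       sizeof(ex_mem_xops) / sizeof(ex_mem_xops[0]),
		       sizeof(ex_mem_xops[0]), ex_cmp_xop) != NULL;
}
```

Here "ldx r10,r3,r4" (0x7D43202A) and "stbx r5,r6,r7" (0x7CA639AE) are found in the table, while "add r31,r3,r4" (0x7FE32214, bits 21-30 = 266) is not.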
.../perf/arch/powerpc/annotate/instructions.c | 107 +++++++++++++++++-
tools/perf/util/disasm.c | 3 +
2 files changed, 108 insertions(+), 2 deletions(-)
diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index b084423d8477..1ffb64c6bd0d 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -49,18 +49,121 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
return ops;
}
-#define PPC_OP(op) (((op) >> 26) & 0x3F)
+#define PPC_OP(op) (((op) >> 26) & 0x3F)
+#define PPC_21_30(R) (((R) >> 1) & 0x3ff)
+
+struct insn_offset {
+ const char *name;
+ int value;
+};
+
+/*
+ * There are memory instructions with opcode 31 which are
+ * of X Form, Example:
+ * ldx RT,RA,RB
+ * ______________________________________
+ * | 31 | RT | RA | RB | 21 |/|
+ * --------------------------------------
+ * 0 6 11 16 21 30 31
+ *
+ * But not all instructions with opcode 31 are memory instructions.
+ * Example: add RT,RA,RB
+ *
+ * Use bits 21 to 30 to check memory insns with 31 as opcode.
+ * In ins_array below, for ldx instruction:
+ * name => OP_31_XOP_LDX
+ * value => 21
+ */
+
+static struct insn_offset ins_array[] = {
+ { .name = "OP_31_XOP_LXSIWZX", .value = 12, },
+ { .name = "OP_31_XOP_LWARX", .value = 20, },
+ { .name = "OP_31_XOP_LDX", .value = 21, },
+ { .name = "OP_31_XOP_LWZX", .value = 23, },
+ { .name = "OP_31_XOP_LDUX", .value = 53, },
+ { .name = "OP_31_XOP_LWZUX", .value = 55, },
+ { .name = "OP_31_XOP_LXSIWAX", .value = 76, },
+ { .name = "OP_31_XOP_LDARX", .value = 84, },
+ { .name = "OP_31_XOP_LBZX", .value = 87, },
+ { .name = "OP_31_XOP_LVX", .value = 103, },
+ { .name = "OP_31_XOP_LBZUX", .value = 119, },
+ { .name = "OP_31_XOP_STXSIWX", .value = 140, },
+ { .name = "OP_31_XOP_STDX", .value = 149, },
+ { .name = "OP_31_XOP_STWX", .value = 151, },
+ { .name = "OP_31_XOP_STDUX", .value = 181, },
+ { .name = "OP_31_XOP_STWUX", .value = 183, },
+ { .name = "OP_31_XOP_STBX", .value = 215, },
+ { .name = "OP_31_XOP_STVX", .value = 231, },
+ { .name = "OP_31_XOP_STBUX", .value = 247, },
+ { .name = "OP_31_XOP_LHZX", .value = 279, },
+ { .name = "OP_31_XOP_LHZUX", .value = 311, },
+ { .name = "OP_31_XOP_LXVDSX", .value = 332, },
+ { .name = "OP_31_XOP_LWAX", .value = 341, },
+ { .name = "OP_31_XOP_LHAX", .value = 343, },
+ { .name = "OP_31_XOP_LWAUX", .value = 373, },
+ { .name = "OP_31_XOP_LHAUX", .value = 375, },
+ { .name = "OP_31_XOP_STHX", .value = 407, },
+ { .name = "OP_31_XOP_STHUX", .value = 439, },
+ { .name = "OP_31_XOP_LXSSPX", .value = 524, },
+ { .name = "OP_31_XOP_LDBRX", .value = 532, },
+ { .name = "OP_31_XOP_LSWX", .value = 533, },
+ { .name = "OP_31_XOP_LWBRX", .value = 534, },
+ { .name = "OP_31_XOP_LFSUX", .value = 567, },
+ { .name = "OP_31_XOP_LXSDX", .value = 588, },
+ { .name = "OP_31_XOP_LSWI", .value = 597, },
+ { .name = "OP_31_XOP_LFDX", .value = 599, },
+ { .name = "OP_31_XOP_LFDUX", .value = 631, },
+ { .name = "OP_31_XOP_STXSSPX", .value = 652, },
+ { .name = "OP_31_XOP_STDBRX", .value = 660, },
+ { .name = "OP_31_XOP_STXWX", .value = 661, },
+ { .name = "OP_31_XOP_STWBRX", .value = 662, },
+ { .name = "OP_31_XOP_STFSX", .value = 663, },
+ { .name = "OP_31_XOP_STFSUX", .value = 695, },
+ { .name = "OP_31_XOP_STXSDX", .value = 716, },
+ { .name = "OP_31_XOP_STSWI", .value = 725, },
+ { .name = "OP_31_XOP_STFDX", .value = 727, },
+ { .name = "OP_31_XOP_STFDUX", .value = 759, },
+ { .name = "OP_31_XOP_LXVW4X", .value = 780, },
+ { .name = "OP_31_XOP_LHBRX", .value = 790, },
+ { .name = "OP_31_XOP_LXVD2X", .value = 844, },
+ { .name = "OP_31_XOP_LFIWAX", .value = 855, },
+ { .name = "OP_31_XOP_LFIWZX", .value = 887, },
+ { .name = "OP_31_XOP_STXVW4X", .value = 908, },
+ { .name = "OP_31_XOP_STHBRX", .value = 918, },
+ { .name = "OP_31_XOP_STXVD2X", .value = 972, },
+ { .name = "OP_31_XOP_STFIWX", .value = 983, },
+};
+
+static int cmp_offset(const void *a, const void *b)
+{
+ const struct insn_offset *val1 = a;
+ const struct insn_offset *val2 = b;
+
+ return (val1->value - val2->value);
+}
static struct ins_ops *check_ppc_insn(u32 raw_insn)
{
int opcode = PPC_OP(raw_insn);
+ int mem_insn_31 = PPC_21_30(raw_insn);
+ struct insn_offset *ret;
+ struct insn_offset mem_insns_31_opcode = {
+ "OP_31_INSN",
+ mem_insn_31
+ };
/*
* Instructions with opcode 32 to 63 are memory
* instructions in powerpc
*/
- if ((opcode & 0x20))
+ if ((opcode & 0x20)) {
return &load_store_ops;
+ } else if (opcode == 31) {
+ /* Check for memory instructions with opcode 31 */
+ ret = bsearch(&mem_insns_31_opcode, ins_array, ARRAY_SIZE(ins_array), sizeof(ins_array[0]), cmp_offset);
+ if (ret != NULL)
+ return &load_store_ops;
+ }
return NULL;
}
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index b30277a930c0..d39ff19ea081 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -698,6 +698,9 @@ static int load_store__parse(struct arch *arch __maybe_unused, struct ins_operan
{
ops->source.mem_ref = true;
ops->source.multi_regs = false;
+ /* opcode 31 is of X form */
+ if (PPC_OP(dl->raw.raw_insn) == 31)
+ ops->source.multi_regs = true;
ops->target.mem_ref = false;
ops->target.multi_regs = false;
--
2.43.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH V7 09/18] tools/perf: Add some of the arithmetic instructions to support instruction tracking in powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (7 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 08/18] tools/perf: Add support to identify memory instructions of opcode 31 " Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 10/18] tools/perf: Add more instructions for instruction tracking Athira Rajeev
` (10 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Data type profiling has the concept of instruction tracking.
Example sequence in powerpc:
ld r10,264(r3)
mr r31,r3
<after some sequence>
ld r9,312(r31)
or differently
lwz r10,264(r3)
add r31, r3, RB
lwz r9, 0(r31)
If a sample is hit at "lwz r9, 0(r31)", the data type of r31 depends
on the preceding instruction sequence. To track those earlier
instructions, the patch adds changes to identify some of the arithmetic
instructions that have opcode 31. Since some memory instructions
also have opcode 31, use bits 22:30 to filter the
arithmetic instructions here. There are also instructions with just
two operands, like addme and addze. The patch adds new instruction ops
"arithmetic_ops" to handle these cases.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
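A minimal sketch of how bits 22:30 can separate the two-operand cases from the three-operand ones, for illustration only. All ex_*/EX_* names are hypothetical, and the function assumes the instruction was already classified as an XO-form arithmetic instruction. Using the 9-bit field (bits 22:30) rather than the 10-bit one (bits 21:30) leaves out bit 21, the OE bit, so "add" and "addo" map to the same value.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_OP(insn)		(((insn) >> 26) & 0x3f)
#define EX_XO_22_30(insn)	(((insn) >> 1) & 0x1ff)

/* Extended opcodes of the single-source-register forms. */
enum {
	EX_SUBFZE = 200,
	EX_ADDZE  = 202,
	EX_SUBFME = 232,
	EX_ADDME  = 234,
};

/*
 * For an opcode 31 arithmetic instruction, return true when the source
 * side uses two registers (RA and RB) and false when it uses only RA.
 */
static bool ex_source_has_two_regs(uint32_t raw_insn)
{
	if (EX_OP(raw_insn) != 31)
		return false;

	switch (EX_XO_22_30(raw_insn)) {
	case EX_SUBFZE:
	case EX_ADDZE:
	case EX_SUBFME:
	case EX_ADDME:
		return false;
	default:
		return true;
	}
}
```

For example, "add r31,r3,r4" (0x7FE32214) yields true and "addze r5,r6" (0x7CA60194) yields false. Setting the OE bit (0x7FE32614 for "addo") does not change the extracted value 266.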
.../perf/arch/powerpc/annotate/instructions.c | 49 ++++++++++++++++++
tools/perf/util/disasm.c | 51 +++++++++++++++++++
2 files changed, 100 insertions(+)
diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index 1ffb64c6bd0d..aa5ee09fa28f 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -51,6 +51,7 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
#define PPC_OP(op) (((op) >> 26) & 0x3F)
#define PPC_21_30(R) (((R) >> 1) & 0x3ff)
+#define PPC_22_30(R) (((R) >> 1) & 0x1ff)
struct insn_offset {
const char *name;
@@ -134,6 +135,44 @@ static struct insn_offset ins_array[] = {
{ .name = "OP_31_XOP_STFIWX", .value = 983, },
};
+/*
+ * Arithmetic instructions which are having opcode as 31.
+ * These instructions are tracked to save the register state
+ * changes. Example:
+ *
+ * lwz r10,264(r3)
+ * add r31, r3, r3
+ * lwz r9, 0(r31)
+ *
+ * Here instruction tracking needs to identify the "add"
+ * instruction and save data type of r3 to r31. If a sample
+ * is hit at next "lwz r9, 0(r31)", by this instruction tracking,
+ * data type of r31 can be resolved.
+ */
+static struct insn_offset arithmetic_ins_op_31[] = {
+ { .name = "SUB_CARRY_XO_FORM", .value = 8, },
+ { .name = "MUL_HDW_XO_FORM1", .value = 9, },
+ { .name = "ADD_CARRY_XO_FORM", .value = 10, },
+ { .name = "MUL_HW_XO_FORM1", .value = 11, },
+ { .name = "SUB_XO_FORM", .value = 40, },
+ { .name = "MUL_HDW_XO_FORM", .value = 73, },
+ { .name = "MUL_HW_XO_FORM", .value = 75, },
+ { .name = "SUB_EXT_XO_FORM", .value = 136, },
+ { .name = "ADD_EXT_XO_FORM", .value = 138, },
+ { .name = "SUB_ZERO_EXT_XO_FORM", .value = 200, },
+ { .name = "ADD_ZERO_EXT_XO_FORM", .value = 202, },
+ { .name = "SUB_EXT_XO_FORM2", .value = 232, },
+ { .name = "MUL_DW_XO_FORM", .value = 233, },
+ { .name = "ADD_EXT_XO_FORM2", .value = 234, },
+ { .name = "MUL_W_XO_FORM", .value = 235, },
+ { .name = "ADD_XO_FORM", .value = 266, },
+ { .name = "DIV_DW_XO_FORM1", .value = 457, },
+ { .name = "DIV_W_XO_FORM1", .value = 459, },
+ { .name = "DIV_DW_XO_FORM", .value = 489, },
+ { .name = "DIV_W_XO_FORM", .value = 491, },
+};
+
+
static int cmp_offset(const void *a, const void *b)
{
const struct insn_offset *val1 = a;
@@ -163,6 +202,16 @@ static struct ins_ops *check_ppc_insn(u32 raw_insn)
ret = bsearch(&mem_insns_31_opcode, ins_array, ARRAY_SIZE(ins_array), sizeof(ins_array[0]), cmp_offset);
if (ret != NULL)
return &load_store_ops;
+ else {
+ mem_insns_31_opcode.value = PPC_22_30(raw_insn);
+ ret = bsearch(&mem_insns_31_opcode, arithmetic_ins_op_31, ARRAY_SIZE(arithmetic_ins_op_31),
+ sizeof(arithmetic_ins_op_31[0]), cmp_offset);
+ if (ret != NULL)
+ return &arithmetic_ops;
+ /* Bits 21 to 30 have value 444 for the "mr" insn, i.e. OR X form */
+ if (PPC_21_30(raw_insn) == 444)
+ return &arithmetic_ops;
+ }
}
return NULL;
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index d39ff19ea081..801d57287a35 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -38,6 +38,7 @@ static struct ins_ops nop_ops;
static struct ins_ops lock_ops;
static struct ins_ops ret_ops;
static struct ins_ops load_store_ops;
+static struct ins_ops arithmetic_ops;
static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name);
@@ -679,6 +680,56 @@ static struct ins_ops mov_ops = {
.scnprintf = mov__scnprintf,
};
+#define PPC_22_30(R) (((R) >> 1) & 0x1ff)
+#define MINUS_EXT_XO_FORM 234
+#define SUB_EXT_XO_FORM 232
+#define ADD_ZERO_EXT_XO_FORM 202
+#define SUB_ZERO_EXT_XO_FORM 200
+
+static int arithmetic__scnprintf(struct ins *ins, char *bf, size_t size,
+ struct ins_operands *ops, int max_ins_name)
+{
+ return scnprintf(bf, size, "%-*s %s", max_ins_name, ins->name,
+ ops->raw);
+}
+
+/*
+ * Sets the fields: multi_regs and "mem_ref".
+ * "mem_ref" is set for ops->source which is later used to
+ * fill the objdump->memory_ref_char field. This ops is currently
+ * used by powerpc and since binary instruction code is used to
+ * extract opcode, regs and offset, no other parsing is needed here.
+ *
+ * Don't set multi_regs for 4 cases since they have only one operand
+ * for source:
+ * - Add to Minus One Extended XO-form ( Ex: addme, addmeo )
+ * - Subtract From Minus One Extended XO-form ( Ex: subfme )
+ * - Add to Zero Extended XO-form ( Ex: addze, addzeo )
+ * - Subtract From Zero Extended XO-form ( Ex: subfze )
+ */
+static int arithmetic__parse(struct arch *arch __maybe_unused, struct ins_operands *ops,
+ struct map_symbol *ms __maybe_unused, struct disasm_line *dl)
+{
+ int opcode = PPC_OP(dl->raw.raw_insn), xo = PPC_22_30(dl->raw.raw_insn);
+
+ ops->source.mem_ref = false;
+ if (opcode == 31) {
+ if ((xo != MINUS_EXT_XO_FORM) && (xo != SUB_EXT_XO_FORM) \
+ && (xo != ADD_ZERO_EXT_XO_FORM) && (xo != SUB_ZERO_EXT_XO_FORM))
+ ops->source.multi_regs = true;
+ }
+
+ ops->target.mem_ref = false;
+ ops->target.multi_regs = false;
+
+ return 0;
+}
+
+static struct ins_ops arithmetic_ops = {
+ .parse = arithmetic__parse,
+ .scnprintf = arithmetic__scnprintf,
+};
+
static int load_store__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH V7 10/18] tools/perf: Add more instructions for instruction tracking
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (8 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 09/18] tools/perf: Add some of the arithmetic instructions to support instruction tracking " Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 11/18] tools/perf: Update instruction tracking for powerpc Athira Rajeev
` (9 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Add a few more instructions and use the primary opcode as the search key
to find out whether an instruction is supported by the architecture. The
added ones are: addi, addic, addic., addis, subfic and mulli.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/arch/powerpc/annotate/instructions.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index aa5ee09fa28f..aa25a336d8d0 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -172,6 +172,14 @@ static struct insn_offset arithmetic_ins_op_31[] = {
{ .name = "DIV_W_XO_FORM", .value = 491, },
};
+static struct insn_offset arithmetic_two_ops[] = {
+ { .name = "mulli", .value = 7, },
+ { .name = "subfic", .value = 8, },
+ { .name = "addic", .value = 12, },
+ { .name = "addic.", .value = 13, },
+ { .name = "addi", .value = 14, },
+ { .name = "addis", .value = 15, },
+};
static int cmp_offset(const void *a, const void *b)
{
@@ -212,6 +220,12 @@ static struct ins_ops *check_ppc_insn(u32 raw_insn)
if (PPC_21_30(raw_insn) == 444)
return &arithmetic_ops;
}
+ } else {
+ mem_insns_31_opcode.value = opcode;
+ ret = bsearch(&mem_insns_31_opcode, arithmetic_two_ops, ARRAY_SIZE(arithmetic_two_ops),
+ sizeof(arithmetic_two_ops[0]), cmp_offset);
+ if (ret != NULL)
+ return &arithmetic_ops;
}
return NULL;
--
2.43.0
* [PATCH V7 11/18] tools/perf: Update instruction tracking for powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (9 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 10/18] tools/perf: Add more instructions for instruction tracking Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 12/18] tools/perf: Make capstone_init non-static so that it can be used during symbol disassemble Athira Rajeev
` (8 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Add instruction tracking function "update_insn_state_powerpc" for
powerpc. Example sequence in powerpc:
ld r10,264(r3)
mr r31,r3
<<after some sequence>>
ld r9,312(r31)
Consider the case where the sample points to "ld r9,312(r31)".
Here the memory reference is "312(r31)", where 312 is the offset
and r31 is the source register. The preceding instructions show that
the register state of r3 is moved to r31. So to identify the data type
for the r31 access, the earlier "mr" instruction needs to be tracked and
the state type entry has to be updated. The current instruction tracking
support in the perf tools infrastructure is specific to x86. This patch
adds the same support for powerpc.
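The idea of propagating register state across a move can be sketched in isolation. The following is a simplified illustration, not the perf implementation: "struct reg_state", "NR_REGS" and "track_reg_move" are hypothetical names, and an integer "type_id" stands in for the DWARF type that perf actually keeps per register.

```c
#include <stdbool.h>

#define NR_REGS 32

/* Simplified, hypothetical stand-in for perf's per-register type state. */
struct reg_state {
	int type_id;	/* placeholder for the DWARF type of the value */
	bool ok;	/* true when type_id is known to be valid */
};

/*
 * Propagate the type on a register move such as "mr r31,r3".
 * If the source type is unknown, the destination becomes unknown too.
 */
static void track_reg_move(struct reg_state *regs, int dst, int src)
{
	if (dst < 0 || dst >= NR_REGS)
		return;

	if (src < 0 || src >= NR_REGS || !regs[src].ok) {
		regs[dst].ok = false;
		return;
	}

	regs[dst].type_id = regs[src].type_id;
	regs[dst].ok = true;
}
```

With r3 holding a known type, track_reg_move(regs, 31, 3) makes a later access through r31 resolvable to that same type.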
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
.../perf/arch/powerpc/annotate/instructions.c | 59 +++++++++++++++++++
tools/perf/util/annotate-data.c | 9 ++-
tools/perf/util/disasm.c | 3 +
3 files changed, 70 insertions(+), 1 deletion(-)
diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index aa25a336d8d0..af1032572bf3 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -231,6 +231,65 @@ static struct ins_ops *check_ppc_insn(u32 raw_insn)
return NULL;
}
+/*
+ * Instruction tracking function to track register state moves.
+ * Example sequence:
+ * ld r10,264(r3)
+ * mr r31,r3
+ * <<after some sequence>
+ * ld r9,312(r31)
+ *
+ * Previous instruction sequence shows that register state of r3
+ * is moved to r31. update_insn_state_powerpc tracks these state
+ * changes
+ */
+#ifdef HAVE_DWARF_SUPPORT
+static void update_insn_state_powerpc(struct type_state *state,
+ struct data_loc_info *dloc, Dwarf_Die * cu_die __maybe_unused,
+ struct disasm_line *dl)
+{
+ struct annotated_insn_loc loc;
+ struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
+ struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
+ struct type_state_reg *tsr;
+ u32 insn_offset = dl->al.offset;
+
+ if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
+ return;
+
+ /*
+ * Value 444 for bits 21:30 is for "mr"
+ * instruction. "mr" is extended OR. So set the
+ * source and destination reg correctly
+ */
+ if (PPC_21_30(dl->raw.raw_insn) == 444) {
+ int src_reg = src->reg1;
+
+ src->reg1 = dst->reg1;
+ dst->reg1 = src_reg;
+ }
+
+ if (!has_reg_type(state, dst->reg1))
+ return;
+
+ tsr = &state->regs[dst->reg1];
+
+ if (!has_reg_type(state, src->reg1) ||
+ !state->regs[src->reg1].ok) {
+ tsr->ok = false;
+ return;
+ }
+
+ tsr->type = state->regs[src->reg1].type;
+ tsr->kind = state->regs[src->reg1].kind;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] reg%d -> reg%d",
+ insn_offset, src->reg1, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+}
+#endif /* HAVE_DWARF_SUPPORT */
+
static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
{
if (!arch->initialized) {
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 7a48c3d72b89..734acdd8c4b7 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -1080,6 +1080,13 @@ static int find_data_type_insn(struct data_loc_info *dloc,
return ret;
}
+static int arch_supports_insn_tracking(struct data_loc_info *dloc)
+{
+ if ((arch__is(dloc->arch, "x86")) || (arch__is(dloc->arch, "powerpc")))
+ return 1;
+ return 0;
+}
+
/*
* Construct a list of basic blocks for each scope with variables and try to find
* the data type by updating a type state table through instructions.
@@ -1094,7 +1101,7 @@ static int find_data_type_block(struct data_loc_info *dloc,
int ret = -1;
/* TODO: other architecture support */
- if (!arch__is(dloc->arch, "x86"))
+ if (!arch_supports_insn_tracking(dloc))
return -1;
prev_dst_ip = dst_ip = dloc->ip;
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 801d57287a35..a839f037bdaf 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -157,6 +157,9 @@ static struct arch architectures[] = {
{
.name = "powerpc",
.init = powerpc__annotate_init,
+#ifdef HAVE_DWARF_SUPPORT
+ .update_insn_state = update_insn_state_powerpc,
+#endif
},
{
.name = "riscv64",
--
2.43.0
* [PATCH V7 12/18] tools/perf: Make capstone_init non-static so that it can be used during symbol disassemble
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (10 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 11/18] tools/perf: Update instruction tracking for powerpc Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 13/18] tools/perf: Use capstone_init and remove open_capstone_handle from disasm.c Athira Rajeev
` (7 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
symbol__disassemble_capstone in util/disasm.c calls the function
open_capstone_handle to open/init capstone. We already have a
capstone_init function in "util/print_insn.c", but it is defined
as a static function there. Make it non-static and declare it in
print_insn.h.
open_capstone_handle checks the disassembler_style option from
annotation_options to decide whether to set CS_OPT_SYNTAX_ATT.
Add the same logic to capstone_init through a new "disassembler_style"
argument; the existing caller, fprintf_insn_asm, passes true.
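The rule for choosing AT&T syntax is the one open_capstone_handle applies to the disassembler_style string. As a small standalone sketch (the helper name "want_att_syntax" is made up for this example and does not exist in perf):

```c
#include <stdbool.h>
#include <string.h>

/*
 * AT&T syntax is the default: select it when no style was given
 * or when "att" was requested explicitly.
 */
static bool want_att_syntax(const char *disassembler_style)
{
	return !disassembler_style || !strcmp(disassembler_style, "att");
}
```

A caller would compute this flag once and pass it as the boolean argument of capstone_init.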
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/util/print_insn.c | 12 +++++++++---
tools/perf/util/print_insn.h | 5 +++++
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/print_insn.c b/tools/perf/util/print_insn.c
index a950e9157d2d..a76aae81d7a0 100644
--- a/tools/perf/util/print_insn.c
+++ b/tools/perf/util/print_insn.c
@@ -32,7 +32,7 @@ size_t sample__fprintf_insn_raw(struct perf_sample *sample, FILE *fp)
#ifdef HAVE_LIBCAPSTONE_SUPPORT
#include <capstone/capstone.h>
-static int capstone_init(struct machine *machine, csh *cs_handle, bool is64)
+int capstone_init(struct machine *machine, csh *cs_handle, bool is64, bool disassembler_style)
{
cs_arch arch;
cs_mode mode;
@@ -62,7 +62,13 @@ static int capstone_init(struct machine *machine, csh *cs_handle, bool is64)
}
if (machine__normalized_is(machine, "x86")) {
- cs_option(*cs_handle, CS_OPT_SYNTAX, CS_OPT_SYNTAX_ATT);
+ /*
+ * In case of using capstone_init while symbol__disassemble
+ * setting CS_OPT_SYNTAX_ATT depends if disassembler_style opts
+ * is set via annotation args
+ */
+ if (disassembler_style)
+ cs_option(*cs_handle, CS_OPT_SYNTAX, CS_OPT_SYNTAX_ATT);
/*
* Resolving address operands to symbols is implemented
* on x86 by investigating instruction details.
@@ -122,7 +128,7 @@ ssize_t fprintf_insn_asm(struct machine *machine, struct thread *thread, u8 cpum
int ret;
/* TODO: Try to initiate capstone only once but need a proper place. */
- ret = capstone_init(machine, &cs_handle, is64bit);
+ ret = capstone_init(machine, &cs_handle, is64bit, true);
if (ret < 0)
return ret;
diff --git a/tools/perf/util/print_insn.h b/tools/perf/util/print_insn.h
index 07d11af3fc1c..2c8ee41c4a5d 100644
--- a/tools/perf/util/print_insn.h
+++ b/tools/perf/util/print_insn.h
@@ -19,4 +19,9 @@ ssize_t fprintf_insn_asm(struct machine *machine, struct thread *thread, u8 cpum
bool is64bit, const uint8_t *code, size_t code_size,
uint64_t ip, int *lenp, int print_opts, FILE *fp);
+#ifdef HAVE_LIBCAPSTONE_SUPPORT
+#include <capstone/capstone.h>
+int capstone_init(struct machine *machine, csh *cs_handle, bool is64, bool disassembler_style);
+#endif
+
#endif /* PERF_PRINT_INSN_H */
--
2.43.0
* [PATCH V7 13/18] tools/perf: Use capstone_init and remove open_capstone_handle from disasm.c
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (11 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 12/18] tools/perf: Make capstone_init non-static so that it can be used during symbol disassemble Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 14/18] tools/perf: Add support to use libcapstone in powerpc Athira Rajeev
` (6 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
capstone_init is made available for all archs to use and is updated to
enable support for CS_ARCH_PPC as well. This patch removes
open_capstone_handle and uses capstone_init in all the places.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/util/disasm.c | 42 +++++++++++-------------------------
tools/perf/util/print_insn.c | 3 +++
2 files changed, 15 insertions(+), 30 deletions(-)
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index a839f037bdaf..a848e6f5f05a 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -26,6 +26,7 @@
#include "symbol.h"
#include "util.h"
#include "sort.h"
+#include "print_insn.h"
static regex_t file_lineno;
@@ -1510,32 +1511,6 @@ symbol__disassemble_bpf_image(struct symbol *sym,
#ifdef HAVE_LIBCAPSTONE_SUPPORT
#include <capstone/capstone.h>
-static int open_capstone_handle(struct annotate_args *args, bool is_64bit,
- csh *handle)
-{
- struct annotation_options *opt = args->options;
- cs_mode mode = is_64bit ? CS_MODE_64 : CS_MODE_32;
-
- /* TODO: support more architectures */
- if (!arch__is(args->arch, "x86"))
- return -1;
-
- if (cs_open(CS_ARCH_X86, mode, handle) != CS_ERR_OK)
- return -1;
-
- if (!opt->disassembler_style ||
- !strcmp(opt->disassembler_style, "att"))
- cs_option(*handle, CS_OPT_SYNTAX, CS_OPT_SYNTAX_ATT);
-
- /*
- * Resolving address operands to symbols is implemented
- * on x86 by investigating instruction details.
- */
- cs_option(*handle, CS_OPT_DETAIL, CS_OPT_ON);
-
- return 0;
-}
-
struct find_file_offset_data {
u64 ip;
u64 offset;
@@ -1632,6 +1607,7 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
cs_insn *insn;
char disasm_buf[512];
struct disasm_line *dl;
+ bool disassembler_style = false;
if (args->options->objdump_path)
return -1;
@@ -1646,7 +1622,11 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
&is_64bit) == 0)
goto err;
- if (open_capstone_handle(args, is_64bit, &handle) < 0)
+ if (!args->options->disassembler_style ||
+ !strcmp(args->options->disassembler_style, "att"))
+ disassembler_style = true;
+
+ if (capstone_init(maps__machine(args->ms.maps), &handle, is_64bit, disassembler_style) < 0)
goto err;
needs_cs_close = true;
@@ -1966,9 +1946,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
}
#ifdef HAVE_LIBCAPSTONE_SUPPORT
- err = symbol__disassemble_capstone(symfs_filename, sym, args);
- if (err == 0)
- goto out_remove_tmp;
+ if (arch__is(args->arch, "x86")) {
+ err = symbol__disassemble_capstone(symfs_filename, sym, args);
+ if (err == 0)
+ goto out_remove_tmp;
+ }
#endif
err = asprintf(&command,
diff --git a/tools/perf/util/print_insn.c b/tools/perf/util/print_insn.c
index a76aae81d7a0..79dec5ab3bef 100644
--- a/tools/perf/util/print_insn.c
+++ b/tools/perf/util/print_insn.c
@@ -52,6 +52,9 @@ int capstone_init(struct machine *machine, csh *cs_handle, bool is64, bool disas
} else if (machine__normalized_is(machine, "s390")) {
arch = CS_ARCH_SYSZ;
mode = CS_MODE_BIG_ENDIAN;
+ } else if (machine__normalized_is(machine, "powerpc")) {
+ arch = CS_ARCH_PPC;
+ mode = CS_MODE_64;
} else {
return -1;
}
--
2.43.0
* [PATCH V7 14/18] tools/perf: Add support to use libcapstone in powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (12 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 13/18] tools/perf: Use capstone_init and remove open_capstone_handle from disasm.c Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 15/18] tools/perf: Add support to find global register variables using find_data_type_global_reg Athira Rajeev
` (5 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
perf now uses the capstone library (if available) to disassemble
instructions for perf annotate, to speed things up. Currently this is
supported only on the x86 architecture. This patch includes changes to
enable it on powerpc. For now, the method is used only for the data type
sort keys, and only the binary code (raw instruction) is read. This is
because the powerpc approach to understanding instructions and register
fields uses the raw instruction. "cs_disasm" is currently not enabled:
while attempting cs_disasm, the observation was that some instructions
were not identified (ex: extswsli, maddld) and the code had to fall back
to objdump. Hence enabling "cs_disasm" is left as a TODO for powerpc in
a code comment.
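Reading the raw instructions can be sketched as follows. This is an illustration only, under the assumption that every instruction in the buffer is one 32-bit word in host byte order; "insn_count", "read_insn" and "ext_opcode_21_30" are hypothetical helper names, not perf functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Each instruction is assumed to be one 32-bit word, hence len / 4. */
static size_t insn_count(size_t len)
{
	return len / 4;
}

/* Read the i-th instruction word from the buffer in host byte order. */
static uint32_t read_insn(const uint8_t *buf, size_t i)
{
	uint32_t word;

	memcpy(&word, buf + i * 4, sizeof(word));
	return word;
}

/* Bits 21:30 (MSB-0 numbering): the extended opcode of X-form insns. */
static unsigned int ext_opcode_21_30(uint32_t raw_insn)
{
	return (raw_insn >> 1) & 0x3ff;
}
```

For the word 0x7c7f1b78 (mr r31,r3, which is "or r31,r3,r3"), ext_opcode_21_30() yields 444, the value the instruction tracking code checks for.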
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/util/disasm.c | 143 +++++++++++++++++++++++++++++++++++++++
1 file changed, 143 insertions(+)
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index a848e6f5f05a..63681df6482b 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -1585,6 +1585,144 @@ static void print_capstone_detail(cs_insn *insn, char *buf, size_t len,
}
}
+static int symbol__disassemble_capstone_powerpc(char *filename, struct symbol *sym,
+ struct annotate_args *args)
+{
+ struct annotation *notes = symbol__annotation(sym);
+ struct map *map = args->ms.map;
+ struct dso *dso = map__dso(map);
+ struct nscookie nsc;
+ u64 start = map__rip_2objdump(map, sym->start);
+ u64 end = map__rip_2objdump(map, sym->end);
+ u64 len = end - start;
+ u64 offset;
+ int i, fd, count;
+ bool is_64bit = false;
+ bool needs_cs_close = false;
+ u8 *buf = NULL;
+ struct find_file_offset_data data = {
+ .ip = start,
+ };
+ csh handle;
+ char disasm_buf[512];
+ struct disasm_line *dl;
+ u32 *line;
+ bool disassembler_style = false;
+
+ if (args->options->objdump_path)
+ return -1;
+
+ nsinfo__mountns_enter(dso->nsinfo, &nsc);
+ fd = open(filename, O_RDONLY);
+ nsinfo__mountns_exit(&nsc);
+ if (fd < 0)
+ return -1;
+
+ if (file__read_maps(fd, /*exe=*/true, find_file_offset, &data,
+ &is_64bit) == 0)
+ goto err;
+
+ if (!args->options->disassembler_style ||
+ !strcmp(args->options->disassembler_style, "att"))
+ disassembler_style = true;
+
+ if (capstone_init(maps__machine(args->ms.maps), &handle, is_64bit, disassembler_style) < 0)
+ goto err;
+
+ needs_cs_close = true;
+
+ buf = malloc(len);
+ if (buf == NULL)
+ goto err;
+
+ count = pread(fd, buf, len, data.offset);
+ close(fd);
+ fd = -1;
+
+ if ((u64)count != len)
+ goto err;
+
+ line = (u32 *)buf;
+
+ /* add the function address and name */
+ scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
+ start, sym->name);
+
+ args->offset = -1;
+ args->line = disasm_buf;
+ args->line_nr = 0;
+ args->fileloc = NULL;
+ args->ms.sym = sym;
+
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+
+ /*
+ * TODO: enable disassm for powerpc
+ * count = cs_disasm(handle, buf, len, start, len, &insn);
+ *
+ * For now, only binary code is saved in disassembled line
+ * to be used in "type" and "typeoff" sort keys. Each raw code
+ * is 32 bit instruction. So use "len/4" to get the number of
+ * entries.
+ */
+ count = len/4;
+
+ for (i = 0, offset = 0; i < count; i++) {
+ args->offset = offset;
+ sprintf(args->line, "%x", line[i]);
+
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+
+ offset += 4;
+ }
+
+ /* It failed in the middle */
+ if (offset != len) {
+ struct list_head *list = &notes->src->source;
+
+ /* Discard all lines and fallback to objdump */
+ while (!list_empty(list)) {
+ dl = list_first_entry(list, struct disasm_line, al.node);
+
+ list_del_init(&dl->al.node);
+ disasm_line__free(dl);
+ }
+ count = -1;
+ }
+
+out:
+ if (needs_cs_close)
+ cs_close(&handle);
+ free(buf);
+ return count < 0 ? count : 0;
+
+err:
+ if (fd >= 0)
+ close(fd);
+ if (needs_cs_close) {
+ struct disasm_line *tmp;
+
+ /*
+ * It probably failed in the middle of the above loop.
+ * Release any resources it might add.
+ */
+ list_for_each_entry_safe(dl, tmp, &notes->src->source, al.node) {
+ list_del(&dl->al.node);
+ free(dl);
+ }
+ }
+ count = -1;
+ goto out;
+}
+
static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
struct annotate_args *args)
{
@@ -1942,6 +2080,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
err = symbol__disassemble_raw(symfs_filename, sym, args);
if (err == 0)
goto out_remove_tmp;
+#ifdef HAVE_LIBCAPSTONE_SUPPORT
+ err = symbol__disassemble_capstone_powerpc(symfs_filename, sym, args);
+ if (err == 0)
+ goto out_remove_tmp;
+#endif
}
}
--
2.43.0
* [PATCH V7 15/18] tools/perf: Add support to find global register variables using find_data_type_global_reg
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (13 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 14/18] tools/perf: Add support to use libcapstone in powerpc Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-18 5:11 ` Namhyung Kim
2024-07-13 16:55 ` [PATCH V7 16/18] tools/perf: Add support for global_die to capture name of variable in case of register defined variable Athira Rajeev
` (4 subsequent siblings)
19 siblings, 1 reply; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
There are cases where a global register variable is defined and
associated with a specified register. For example, in powerpc, two
registers are defined to represent variables:
1. r13: represents local_paca
register struct paca_struct *local_paca asm("r13");
2. r1: represents stack_pointer
register void *__stack_pointer asm("r1");
These regs are present in dwarf debug as DW_OP_reg as part of variables
in the cu_die (compile unit). These are not present in die search done
in the list of nested scopes since these are global register variables.
Example for local_paca represented by r13:
<<>>
<1><18dc6b4>: Abbrev Number: 128 (DW_TAG_variable)
<18dc6b6> DW_AT_name : (indirect string, offset: 0x3861): local_paca
<18dc6ba> DW_AT_decl_file : 48
<18dc6bb> DW_AT_decl_line : 36
<18dc6bc> DW_AT_decl_column : 30
<18dc6bd> DW_AT_type : <0x18dc6c3>
<18dc6c1> DW_AT_external : 1
<18dc6c1> DW_AT_location : 1 byte block: 5d (DW_OP_reg13 (r13))
<1><18dc6c3>: Abbrev Number: 3 (DW_TAG_pointer_type)
<18dc6c4> DW_AT_byte_size : 8
<18dc6c4> DW_AT_type : <0x18dc353>
Where DW_AT_type : <0x18dc6c3> further points to :
<1><18dc6c3>: Abbrev Number: 3 (DW_TAG_pointer_type)
<18dc6c4> DW_AT_byte_size : 8
<18dc6c4> DW_AT_type : <0x18dc353>
which belongs to:
<1><18dc353>: Abbrev Number: 67 (DW_TAG_structure_type)
<18dc354> DW_AT_name : (indirect string, offset: 0x56cd): paca_struct
<18dc358> DW_AT_byte_size : 2944
<18dc35a> DW_AT_alignment : 128
<18dc35b> DW_AT_decl_file : 48
<18dc35c> DW_AT_decl_line : 61
<18dc35d> DW_AT_decl_column : 8
<18dc35d> DW_AT_sibling : <0x18dc6b4>
<<>>
Similar is case with "r1".
<<>>
<1><18dd772>: Abbrev Number: 129 (DW_TAG_variable)
<18dd774> DW_AT_name : (indirect string, offset: 0x11ba): current_stack_pointer
<18dd778> DW_AT_decl_file : 51
<18dd779> DW_AT_decl_line : 1468
<18dd77b> DW_AT_decl_column : 24
<18dd77c> DW_AT_type : <0x18da5cd>
<18dd780> DW_AT_external : 1
<18dd780> DW_AT_location : 1 byte block: 51 (DW_OP_reg1 (r1))
where 18da5cd is:
<1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type)
<18da5ce> DW_AT_byte_size : 8
<18da5cf> DW_AT_encoding : 7 (unsigned)
<18da5d0> DW_AT_name : (indirect string, offset: 0x55c7): long unsigned int
<<>>
To identify the data type for these two special cases, iterate over the
variables in the CU die (Compile Unit) and match them with the register.
If the variable is a base type, i.e. die_get_real_type returns NULL,
set the offset to zero. With these changes, the data type "paca_struct"
for r13 and "long unsigned int" for r1 are identified.
Snippet from ./perf report -s type,typeoff
12.85% long unsigned int long unsigned int +0 (no field)
4.68% struct paca_struct struct paca_struct +2312 (__current)
4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask)
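The matching step itself is a plain list walk. A minimal sketch, assuming a simplified list node ("struct global_reg_var" and "find_global_reg_var" are hypothetical names used only here; perf collects the real entries from DWARF):

```c
#include <stddef.h>

/* Hypothetical, simplified entry for a variable collected from the CU. */
struct global_reg_var {
	const char *name;		/* DW_AT_name of the variable */
	int reg;			/* register from DW_OP_regN */
	struct global_reg_var *next;
};

/* Return the first variable bound to @reg, or NULL if there is none. */
static const struct global_reg_var *
find_global_reg_var(const struct global_reg_var *vars, int reg)
{
	for (; vars; vars = vars->next) {
		if (vars->reg == reg)
			return vars;
	}
	return NULL;
}
```

With entries for r13 (local_paca) and r1 (current_stack_pointer) in the list, a lookup for register 13 returns the local_paca entry, and a lookup for any other register returns NULL.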
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/util/annotate-data.c | 42 ++++++++++++++++++++++++++++
tools/perf/util/annotate.c | 8 ++++++
tools/perf/util/annotate.h | 1 +
tools/perf/util/include/dwarf-regs.h | 1 +
4 files changed, 52 insertions(+)
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 734acdd8c4b7..a5b4429ede57 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -1170,6 +1170,42 @@ static int find_data_type_block(struct data_loc_info *dloc,
return ret;
}
+/*
+ * Handle cases where a global register variable is defined and
+ * associated with a specified register. These regs are
+ * present in dwarf debug as DW_OP_reg as part of variables
+ * in the cu_die (compile unit). Iterate over variables in the
+ * cu_die and match with reg to identify data type die.
+ */
+static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_Die *cu_die,
+ Dwarf_Die *type_die)
+{
+ Dwarf_Die vr_die;
+ int ret = -1;
+ struct die_var_type *var_types, *vt = NULL;
+
+ die_collect_vars(cu_die, &vt);
+ for (var_types = vt; var_types; var_types = var_types->next) {
+ if (var_types->reg != reg)
+ continue;
+ if (dwarf_offdie(dloc->di->dbg, var_types->die_off, &vr_die)) {
+ if (die_get_real_type(&vr_die, type_die) == NULL) {
+ dloc->type_offset = 0;
+ dwarf_offdie(dloc->di->dbg, var_types->die_off, type_die);
+ }
+ pr_debug_type_name(type_die, TSR_KIND_TYPE);
+ ret = 0;
+ pr_debug_dtp("found by CU for %s (die:%#lx)\n",
+ dwarf_diename(type_die), (long)dwarf_dieoffset(type_die));
+ break;
+ }
+ }
+
+ delete_var_types(vt);
+
+ return ret;
+}
+
/* The result will be saved in @type_die */
static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
{
@@ -1217,6 +1253,12 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
pr_debug_dtp("CU for %s (die:%#lx)\n",
dwarf_diename(&cu_die), (long)dwarf_dieoffset(&cu_die));
+ if (loc->reg_type == DWARF_REG_GLOBAL) {
+ ret = find_data_type_global_reg(dloc, reg, &cu_die, type_die);
+ if (!ret)
+ goto out;
+ }
+
if (reg == DWARF_REG_PC) {
if (get_global_var_type(&cu_die, dloc, dloc->ip, dloc->var_addr,
&offset, type_die)) {
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ce99db291c5e..8db2f32700aa 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2425,6 +2425,14 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
op_loc->reg1 = DWARF_REG_PC;
}
+ /* Global reg variable 13 and 1
+ * assign to DWARF_REG_GLOBAL
+ */
+ if (arch__is(arch, "powerpc")) {
+ if ((op_loc->reg1 == 13) || (op_loc->reg1 == 1))
+ op_loc->reg_type = DWARF_REG_GLOBAL;
+ }
+
mem_type = find_data_type(&dloc);
if (mem_type == NULL && is_stack_canary(arch, op_loc)) {
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 9ba772f46270..ad69842a8ebc 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -475,6 +475,7 @@ struct annotated_op_loc {
bool mem_ref;
bool multi_regs;
bool imm;
+ int reg_type;
};
enum annotated_insn_ops {
diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
index 75b28dcc8317..a5e8e9498683 100644
--- a/tools/perf/util/include/dwarf-regs.h
+++ b/tools/perf/util/include/dwarf-regs.h
@@ -5,6 +5,7 @@
#define DWARF_REG_PC 0xd3af9c /* random number */
#define DWARF_REG_FB 0xd3affb /* random number */
+#define DWARF_REG_GLOBAL 0xd3affc /* random number */
#ifdef HAVE_DWARF_SUPPORT
const char *get_arch_regstr(unsigned int n);
--
2.43.0
* [PATCH V7 16/18] tools/perf: Add support for global_die to capture name of variable in case of register defined variable
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (14 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 15/18] tools/perf: Add support to find global register variables using find_data_type_global_reg Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-18 5:25 ` Namhyung Kim
2024-07-13 16:55 ` [PATCH V7 17/18] tools/perf: Update data_type_cmp and sort__typeoff_sort function to include var_name in comparison Athira Rajeev
` (3 subsequent siblings)
19 siblings, 1 reply; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
In case of a register-defined variable (found using
find_data_type_global_reg), if the type of the variable happens to be a
base type (for example, long unsigned int), perf report captures it as:
12.85% long unsigned int long unsigned int +0 (no field)
The above data type actually refers to samples captured while
accessing "r1", which represents the current stack pointer in powerpc.
register void *__stack_pointer asm("r1");
The dwarf debug contains this as:
<<>>
<1><18dd772>: Abbrev Number: 129 (DW_TAG_variable)
<18dd774> DW_AT_name : (indirect string, offset: 0x11ba): current_stack_pointer
<18dd778> DW_AT_decl_file : 51
<18dd779> DW_AT_decl_line : 1468
<18dd77b> DW_AT_decl_column : 24
<18dd77c> DW_AT_type : <0x18da5cd>
<18dd780> DW_AT_external : 1
<18dd780> DW_AT_location : 1 byte block: 51 (DW_OP_reg1 (r1))
where 18da5cd is:
<1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type)
<18da5ce> DW_AT_byte_size : 8
<18da5cf> DW_AT_encoding : 7 (unsigned)
<18da5d0> DW_AT_name : (indirect string, offset: 0x55c7): long unsigned int
<<>>
To make this clearer to the user, capture the DW_AT_name of the
variable and save it as part of Dwarf_Global. Dwarf_Global is used so
that the name can be retrieved while presenting the result.
Update the "dso__findnew_data_type" function to set "var_name" if the
variable name is set as part of Dwarf_Global.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
tools/perf/util/annotate-data.c | 30 ++++++++++++++++++++++++------
tools/perf/util/dwarf-aux.c | 1 +
tools/perf/util/dwarf-aux.h | 1 +
3 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index a5b4429ede57..8d05f3dbddf6 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -268,28 +268,37 @@ static void delete_members(struct annotated_member *member)
}
static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
- Dwarf_Die *type_die)
+ Dwarf_Die *type_die, Dwarf_Global *global_die)
{
struct annotated_data_type *result = NULL;
struct annotated_data_type key;
struct rb_node *node;
struct strbuf sb;
+ struct strbuf sb_var_name;
char *type_name;
+ char *var_name = NULL;
Dwarf_Word size;
strbuf_init(&sb, 32);
+ strbuf_init(&sb_var_name, 32);
if (die_get_typename_from_type(type_die, &sb) < 0)
strbuf_add(&sb, "(unknown type)", 14);
+ if (global_die->name) {
+ strbuf_addstr(&sb_var_name, global_die->name);
+ var_name = strbuf_detach(&sb_var_name, NULL);
+ }
type_name = strbuf_detach(&sb, NULL);
dwarf_aggregate_size(type_die, &size);
/* Check existing nodes in dso->data_types tree */
key.self.type_name = type_name;
+ key.self.var_name = var_name;
key.self.size = size;
node = rb_find(&key, dso__data_types(dso), data_type_cmp);
if (node) {
result = rb_entry(node, struct annotated_data_type, node);
free(type_name);
+ free(var_name);
return result;
}
@@ -297,10 +306,12 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
result = zalloc(sizeof(*result));
if (result == NULL) {
free(type_name);
+ free(var_name);
return NULL;
}
result->self.type_name = type_name;
+ result->self.var_name = var_name;
result->self.size = size;
INIT_LIST_HEAD(&result->self.children);
@@ -1178,7 +1189,7 @@ static int find_data_type_block(struct data_loc_info *dloc,
* cu_die and match with reg to identify data type die.
*/
static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_Die *cu_die,
- Dwarf_Die *type_die)
+ Dwarf_Die *type_die, Dwarf_Global *global_die)
{
Dwarf_Die vr_die;
int ret = -1;
@@ -1191,8 +1202,11 @@ static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_
if (dwarf_offdie(dloc->di->dbg, var_types->die_off, &vr_die)) {
if (die_get_real_type(&vr_die, type_die) == NULL) {
dloc->type_offset = 0;
+ global_die->name = var_types->name;
dwarf_offdie(dloc->di->dbg, var_types->die_off, type_die);
}
+ global_die->die_offset = (long)dwarf_dieoffset(type_die);
+ global_die->cu_offset = (long)dwarf_dieoffset(cu_die);
pr_debug_type_name(type_die, TSR_KIND_TYPE);
ret = 0;
pr_debug_dtp("found by CU for %s (die:%#lx)\n",
@@ -1207,7 +1221,8 @@ static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_
}
/* The result will be saved in @type_die */
-static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
+static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die,
+ Dwarf_Global *global_die)
{
struct annotated_op_loc *loc = dloc->op;
Dwarf_Die cu_die, var_die;
@@ -1221,6 +1236,8 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
u64 pc;
char buf[64];
+ memset(global_die, 0, sizeof(Dwarf_Global));
+
if (dloc->op->multi_regs)
snprintf(buf, sizeof(buf), "reg%d, reg%d", dloc->op->reg1, dloc->op->reg2);
else if (dloc->op->reg1 == DWARF_REG_PC)
@@ -1254,7 +1271,7 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
dwarf_diename(&cu_die), (long)dwarf_dieoffset(&cu_die));
if (loc->reg_type == DWARF_REG_GLOBAL) {
- ret = find_data_type_global_reg(dloc, reg, &cu_die, type_die);
+ ret = find_data_type_global_reg(dloc, reg, &cu_die, type_die, global_die);
if (!ret)
goto out;
}
@@ -1390,6 +1407,7 @@ struct annotated_data_type *find_data_type(struct data_loc_info *dloc)
struct annotated_data_type *result = NULL;
struct dso *dso = map__dso(dloc->ms->map);
Dwarf_Die type_die;
+ Dwarf_Global global_die;
dloc->di = debuginfo__new(dso__long_name(dso));
if (dloc->di == NULL) {
@@ -1405,10 +1423,10 @@ struct annotated_data_type *find_data_type(struct data_loc_info *dloc)
dloc->fbreg = -1;
- if (find_data_type_die(dloc, &type_die) < 0)
+ if (find_data_type_die(dloc, &type_die, &global_die) < 0)
goto out;
- result = dso__findnew_data_type(dso, &type_die);
+ result = dso__findnew_data_type(dso, &type_die, &global_die);
out:
debuginfo__delete(dloc->di);
diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 44ef968a7ad3..9e61ff326651 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1610,6 +1610,7 @@ static int __die_collect_vars_cb(Dwarf_Die *die_mem, void *arg)
vt->reg = reg_from_dwarf_op(ops);
vt->offset = offset_from_dwarf_op(ops);
vt->next = *var_types;
+ vt->name = dwarf_diename(die_mem);
*var_types = vt;
return DIE_FIND_CB_SIBLING;
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index 24446412b869..406a5b1e269b 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -146,6 +146,7 @@ struct die_var_type {
u64 addr;
int reg;
int offset;
+ const char *name;
};
/* Return type info of a member at offset */
--
2.43.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH V7 17/18] tools/perf: Update data_type_cmp and sort__typeoff_sort function to include var_name in comparison
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (15 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 16/18] tools/perf: Add support for global_die to capture name of variable in case of register defined variable Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 18/18] tools/perf: Set instruction name to be used with insn-stat when using raw instruction Athira Rajeev
` (2 subsequent siblings)
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Currently data_type_cmp() only compares size and type name.
When two data type entries have the same type name but a
different var_name, the comparison cannot distinguish them as
two different entries.
Consider a "long unsigned int" with var_name "X" and a global
variable that is also "long unsigned int". Since data_type_cmp()
currently uses only type_name for comparison ("long unsigned int"),
it won't distinguish these as separate entries. Update the
functions "data_type_cmp" as well as "sort__typeoff_sort" to
compare var_name after type_name when it exists. In order to
use cmp_null, make the cmp_null in sort.c non-static.
Also update "hist_entry__typeoff_snprintf" to print var_name if
it is set. With the changes:
11.42% long unsigned int long unsigned int +0 (current_stack_pointer)
4.68% struct paca_struct struct paca_struct +2312 (__current)
4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask)
2.69% struct paca_struct struct paca_struct +2808 (canary)
2.68% struct paca_struct struct paca_struct +8 (paca_index)
2.24% struct paca_struct struct paca_struct +48 (data_offset)
1.43% long unsigned int long unsigned int +0 (no field)
Using ./perf report -s type,typeoff -H:
17.65% struct paca_struct
4.68% struct paca_struct +2312 (__current)
4.57% struct paca_struct +2354 (irq_soft_mask)
2.69% struct paca_struct +2808 (canary)
2.68% struct paca_struct +8 (paca_index)
2.24% struct paca_struct +48 (data_offset)
0.55% struct paca_struct +2816 (mmiowb_state.nesting_count)
0.18% struct paca_struct +2818 (mmiowb_state.mmiowb_pending)
0.03% struct paca_struct +2352 (hsrr_valid)
0.02% struct paca_struct +2356 (irq_work_pending)
0.00% struct paca_struct +0 (lppaca_ptr)
12.85% long unsigned int
11.42% long unsigned int +0 (current_stack_pointer)
1.43% long unsigned int +0 (no field)
With perf report -s type:
17.65% struct paca_struct
12.85% long unsigned int
1.69% struct task_struct
1.51% struct rq
With perf report -s typeoff:
11.42% long unsigned int +0 (current_stack_pointer)
4.68% struct paca_struct +2312 (__current)
4.57% struct paca_struct +2354 (irq_soft_mask)
2.69% struct paca_struct +2808 (canary)
2.68% struct paca_struct +8 (paca_index)
2.24% struct paca_struct +48 (data_offset)
1.43% long unsigned int +0 (no field)
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
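Note: the comparison order described in the commit message above (size,
then type_name, then var_name with a NULL check) can be sketched in
isolation as below. This is only an illustration, not the perf code:
"struct type_key", "cmp_nullable" and "type_key_cmp" are hypothetical
names, and the NULL ordering (NULL sorts first) is an assumption modelled
on how cmp_null is used in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the compared fields of a data type entry. */
struct type_key {
	int size;
	const char *type_name;	/* always set */
	const char *var_name;	/* NULL when no variable name is known */
};

/*
 * Meant to be called when at least one side is NULL:
 * two NULLs compare equal, otherwise NULL sorts first.
 */
static int cmp_nullable(const void *l, const void *r)
{
	if (!l && !r)
		return 0;
	return !l ? -1 : 1;
}

/* Compare by size, then type_name, then var_name. */
static int type_key_cmp(const struct type_key *a, const struct type_key *b)
{
	int ret;

	assert(a->type_name && b->type_name);

	if (a->size != b->size)
		return a->size < b->size ? -1 : 1;

	ret = strcmp(a->type_name, b->type_name);
	if (ret)
		return ret;

	/* Both names present: order by name. */
	if (a->var_name && b->var_name)
		return strcmp(a->var_name, b->var_name);

	/* At most one name present: entries differ unless both are NULL. */
	return cmp_nullable(a->var_name, b->var_name);
}
```

With this ordering, a "long unsigned int" entry named
current_stack_pointer and an unnamed "long unsigned int" entry of the
same size compare as different, which is what keeps them on separate
lines in the report output.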
tools/perf/util/annotate-data.c | 23 +++++++++++++++++++++--
tools/perf/util/sort.c | 25 ++++++++++++++++++++++---
tools/perf/util/sort.h | 1 +
3 files changed, 44 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 8d05f3dbddf6..ea69c8d3d856 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -167,7 +167,7 @@ static void exit_type_state(struct type_state *state)
}
/*
- * Compare type name and size to maintain them in a tree.
+ * Compare type name, var_name and size to maintain them in a tree.
* I'm not sure if DWARF would have information of a single type in many
* different places (compilation units). If not, it could compare the
* offset of the type entry in the .debug_info section.
@@ -176,12 +176,31 @@ static int data_type_cmp(const void *_key, const struct rb_node *node)
{
const struct annotated_data_type *key = _key;
struct annotated_data_type *type;
+ int64_t ret = 0;
type = rb_entry(node, struct annotated_data_type, node);
if (key->self.size != type->self.size)
return key->self.size - type->self.size;
- return strcmp(key->self.type_name, type->self.type_name);
+
+ ret = strcmp(key->self.type_name, type->self.type_name);
+ if (ret)
+ return ret;
+
+ /*
+ * Compare var_name if it exists for both key and type.
+ * If only one of the nodes has var_name, return
+ * non-zero. This is to indicate nodes are not the
+ * same if one has var_name, but the other doesn't.
+ */
+ if (key->self.var_name && type->self.var_name) {
+ ret = strcmp(key->self.var_name, type->self.var_name);
+ if (ret)
+ return ret;
+ } else if (!key->self.var_name != !type->self.var_name)
+ return cmp_null(key->self.var_name, type->self.var_name);
+
+ return ret;
}
static bool data_type_less(struct rb_node *node_a, const struct rb_node *node_b)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index cd39ea972193..25761d01dbd0 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -95,7 +95,7 @@ static int repsep_snprintf(char *bf, size_t size, const char *fmt, ...)
return n;
}
-static int64_t cmp_null(const void *l, const void *r)
+int64_t cmp_null(const void *l, const void *r)
{
if (!l && !r)
return 0;
@@ -2267,9 +2267,25 @@ sort__typeoff_sort(struct hist_entry *left, struct hist_entry *right)
right_type = right->mem_type;
}
+ /*
+ * Compare type_name first. Next, compare var_name if it exists
+ * for both left and right hist_entry. If only one of the entries
+ * has var_name, return non-zero. This is to indicate entries
+ * are not the same if one has var_name, but the other
+ * doesn't.
+ * If type_name and var_name are the same, use mem_type_off field.
+ */
ret = strcmp(left_type->self.type_name, right_type->self.type_name);
if (ret)
return ret;
+
+ if (left_type->self.var_name && right_type->self.var_name) {
+ ret = strcmp(left_type->self.var_name, right_type->self.var_name);
+ if (ret)
+ return ret;
+ } else if (!left_type->self.var_name != !right_type->self.var_name)
+ return cmp_null(left_type->self.var_name, right_type->self.var_name);
+
return left->mem_type_off - right->mem_type_off;
}
@@ -2305,9 +2321,12 @@ static int hist_entry__typeoff_snprintf(struct hist_entry *he, char *bf,
char buf[4096];
buf[0] = '\0';
- if (list_empty(&he_type->self.children))
+ if (list_empty(&he_type->self.children)) {
snprintf(buf, sizeof(buf), "no field");
- else
+ if (he_type->self.var_name)
+ snprintf(buf, sizeof(buf), "%s", he_type->self.var_name);
+
+ } else
fill_member_name(buf, sizeof(buf), &he_type->self,
he->mem_type_off, true);
buf[4095] = '\0';
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 0bd0ee3ae76b..41346d2b940e 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -151,4 +151,5 @@ sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right);
int64_t
_sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r);
char *hist_entry__srcline(struct hist_entry *he);
+int64_t cmp_null(const void *l, const void *r);
#endif /* __PERF_SORT_H */
--
2.43.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH V7 18/18] tools/perf: Set instruction name to be used with insn-stat when using raw instruction
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (16 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 17/18] tools/perf: Update data_type_cmp and sort__typeoff_sort function to include var_name in comparison Athira Rajeev
@ 2024-07-13 16:55 ` Athira Rajeev
2024-07-16 14:18 ` [PATCH V7 00/18] Add data type profiling support for powerpc kajoljain
2024-07-18 5:34 ` Namhyung Kim
19 siblings, 0 replies; 25+ messages in thread
From: Athira Rajeev @ 2024-07-13 16:55 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, irogers, namhyung, segher,
christophe.leroy
Cc: linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
atrajeev, kjain, disgoel
Since the "ins.name" is not set while using raw instruction,
perf annotate with insn-stat gives wrong data:
Result from "./perf annotate --data-type --insn-stat":
Annotate Instruction stats
total 615, ok 419 (68.1%), bad 196 (31.9%)
Name : Good Bad
-----------------------------------------------------------
: 419 196
This patch sets "dl->ins.name" in the arch specific function "check_ppc_insn"
while initialising "struct disasm_line". Also update the "ins__find" function
to take "struct disasm_line" as a parameter so that its name field can be set
in the arch specific call.
With the patch changes:
Annotate Instruction stats
total 609, ok 446 (73.2%), bad 163 (26.8%)
Name/opcode : Good Bad
-----------------------------------------------------------
58 : 323 80
32 : 49 43
34 : 33 11
OP_31_XOP_LDX : 8 20
40 : 23 0
OP_31_XOP_LWARX : 5 1
OP_31_XOP_LWZX : 2 3
OP_31_XOP_LDARX : 3 0
33 : 0 2
OP_31_XOP_LBZX : 0 1
OP_31_XOP_LWAX : 0 1
OP_31_XOP_LHZX : 0 1
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
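Note: the naming rule used in the commit message above (a raw
instruction with primary opcode 32 to 63 is a memory instruction and is
reported under its decimal opcode) can be sketched standalone as below.
This is an illustration only; the helper names are hypothetical and the
code does not use the perf structures. It assumes the 32-bit instruction
word has already been read in host order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Primary opcode: the top 6 bits of a 32-bit powerpc instruction word. */
static unsigned int ppc_primary_opcode(uint32_t raw_insn)
{
	return (raw_insn >> 26) & 0x3f;
}

/* Opcodes 32..63 are exactly the ones with bit 0x20 set. */
static int ppc_opcode_is_load_store(unsigned int opcode)
{
	return (opcode & 0x20) != 0;
}

/*
 * Write the decimal primary opcode into buf as the instruction name.
 * Returns 0 on success, -1 if the opcode is outside 32..63 or the
 * name does not fit in buf.
 */
static int ppc_name_from_opcode(uint32_t raw_insn, char *buf, size_t len)
{
	unsigned int opcode = ppc_primary_opcode(raw_insn);
	int n;

	assert(buf != NULL && len > 0);

	if (!ppc_opcode_is_load_store(opcode))
		return -1;

	n = snprintf(buf, len, "%u", opcode);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For the word 0xe8810138 (ld r4,312(r1)) this produces the name "58",
matching the first row of the stats table above. Opcode 31 instructions
are not covered by this sketch; the patch names those from its own table.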
.../perf/arch/powerpc/annotate/instructions.c | 18 +++++++++++++++---
tools/perf/builtin-annotate.c | 4 ++--
tools/perf/util/annotate.c | 2 +-
tools/perf/util/disasm.c | 10 +++++-----
tools/perf/util/disasm.h | 2 +-
5 files changed, 24 insertions(+), 12 deletions(-)
diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index af1032572bf3..ede9eeade0ab 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -189,8 +189,9 @@ static int cmp_offset(const void *a, const void *b)
return (val1->value - val2->value);
}
-static struct ins_ops *check_ppc_insn(u32 raw_insn)
+static struct ins_ops *check_ppc_insn(struct disasm_line *dl)
{
+ int raw_insn = dl->raw.raw_insn;
int opcode = PPC_OP(raw_insn);
int mem_insn_31 = PPC_21_30(raw_insn);
struct insn_offset *ret;
@@ -198,19 +199,30 @@ static struct ins_ops *check_ppc_insn(u32 raw_insn)
"OP_31_INSN",
mem_insn_31
};
+ char name_insn[32];
/*
* Instructions with opcode 32 to 63 are memory
* instructions in powerpc
*/
if ((opcode & 0x20)) {
+ /*
+ * Set name in case of raw instruction to
+ * opcode to be used in insn-stat
+ */
+ if (!strlen(dl->ins.name)) {
+ sprintf(name_insn, "%d", opcode);
+ dl->ins.name = strdup(name_insn);
+ }
return &load_store_ops;
} else if (opcode == 31) {
/* Check for memory instructions with opcode 31 */
ret = bsearch(&mem_insns_31_opcode, ins_array, ARRAY_SIZE(ins_array), sizeof(ins_array[0]), cmp_offset);
- if (ret != NULL)
+ if (ret) {
+ if (!strlen(dl->ins.name))
+ dl->ins.name = strdup(ret->name);
return &load_store_ops;
- else {
+ } else {
mem_insns_31_opcode.value = PPC_22_30(raw_insn);
ret = bsearch(&mem_insns_31_opcode, arithmetic_ins_op_31, ARRAY_SIZE(arithmetic_ins_op_31),
sizeof(arithmetic_ins_op_31[0]), cmp_offset);
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index b10b7f005658..cf60392b1c19 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -396,10 +396,10 @@ static void print_annotate_item_stat(struct list_head *head, const char *title)
printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n\n", total,
total_good, 100.0 * total_good / (total ?: 1),
total_bad, 100.0 * total_bad / (total ?: 1));
- printf(" %-10s: %5s %5s\n", "Name", "Good", "Bad");
+ printf(" %-20s: %5s %5s\n", "Name/opcode", "Good", "Bad");
printf("-----------------------------------------------------------\n");
list_for_each_entry(istat, head, list)
- printf(" %-10s: %5d %5d\n", istat->name, istat->good, istat->bad);
+ printf(" %-20s: %5d %5d\n", istat->name, istat->good, istat->bad);
printf("\n");
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 8db2f32700aa..e1f24dff8042 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2229,7 +2229,7 @@ static struct annotated_item_stat *annotate_data_stat(struct list_head *head,
return NULL;
istat->name = strdup(name);
- if (istat->name == NULL) {
+ if ((istat->name == NULL) || (!strlen(istat->name))) {
free(istat);
return NULL;
}
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 63681df6482b..cd283c42195c 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -859,7 +859,7 @@ static void ins__sort(struct arch *arch)
qsort(arch->instructions, nmemb, sizeof(struct ins), ins__cmp);
}
-static struct ins_ops *__ins__find(struct arch *arch, const char *name, u32 raw_insn)
+static struct ins_ops *__ins__find(struct arch *arch, const char *name, struct disasm_line *dl)
{
struct ins *ins;
const int nmemb = arch->nr_instructions;
@@ -871,7 +871,7 @@ static struct ins_ops *__ins__find(struct arch *arch, const char *name, u32 raw_
*/
struct ins_ops *ops;
- ops = check_ppc_insn(raw_insn);
+ ops = check_ppc_insn(dl);
if (ops)
return ops;
}
@@ -905,9 +905,9 @@ static struct ins_ops *__ins__find(struct arch *arch, const char *name, u32 raw_
return ins ? ins->ops : NULL;
}
-struct ins_ops *ins__find(struct arch *arch, const char *name, u32 raw_insn)
+struct ins_ops *ins__find(struct arch *arch, const char *name, struct disasm_line *dl)
{
- struct ins_ops *ops = __ins__find(arch, name, raw_insn);
+ struct ins_ops *ops = __ins__find(arch, name, dl);
if (!ops && arch->associate_instruction_ops)
ops = arch->associate_instruction_ops(arch, name);
@@ -917,7 +917,7 @@ struct ins_ops *ins__find(struct arch *arch, const char *name, u32 raw_insn)
static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map_symbol *ms)
{
- dl->ins.ops = ins__find(arch, dl->ins.name, dl->raw.raw_insn);
+ dl->ins.ops = ins__find(arch, dl->ins.name, dl);
if (!dl->ins.ops)
return;
diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
index c1bb1e484bfb..f56beedeb9da 100644
--- a/tools/perf/util/disasm.h
+++ b/tools/perf/util/disasm.h
@@ -105,7 +105,7 @@ struct annotate_args {
struct arch *arch__find(const char *name);
bool arch__is(struct arch *arch, const char *name);
-struct ins_ops *ins__find(struct arch *arch, const char *name, u32 raw_insn);
+struct ins_ops *ins__find(struct arch *arch, const char *name, struct disasm_line *dl);
int ins__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name);
--
2.43.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH V7 00/18] Add data type profiling support for powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (17 preceding siblings ...)
2024-07-13 16:55 ` [PATCH V7 18/18] tools/perf: Set instruction name to be used with insn-stat when using raw instruction Athira Rajeev
@ 2024-07-16 14:18 ` kajoljain
2024-07-18 5:34 ` Namhyung Kim
19 siblings, 0 replies; 25+ messages in thread
From: kajoljain @ 2024-07-16 14:18 UTC (permalink / raw)
To: Athira Rajeev, acme, jolsa, adrian.hunter, irogers, namhyung,
segher, christophe.leroy
Cc: linux-kernel, akanksha, linux-perf-users, maddy, disgoel,
linuxppc-dev
Patchset looks fine to me.
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Thanks,
Kajol Jain
On 7/13/24 22:25, Athira Rajeev wrote:
> The patchset from Namhyung added support for data type profiling
> in perf tool. This enabled support to associate PMU samples to data
> types they refer to using DWARF debug information. With the upstream
> perf, it is currently possible to run perf report or perf annotate to
> view the data type information on x86.
>
> The initial patchset posted here had the changes needed to enable data
> type profiling support for powerpc.
>
> https://lore.kernel.org/all/6e09dc28-4a2e-49d8-a2b5-ffb3396a9952@csgroup.eu/T/
>
> The main changes were:
> 1. powerpc instruction mnemonic table to associate load/store
> instructions with move_ops, which is used to identify if an instruction
> is a memory access one.
> 2. To get register number and access offset from the given
> instruction, code uses fields from "struct arch" -> objdump.
> Added entry for powerpc here.
> 3. A get_arch_regnum to return register number from the
> register name string.
>
> But the approach used in the initial patchset relied on parsing the
> disassembled code, which is what the current perf tool implementation does.
>
> Example: lwz r10,0(r9)
>
> This line "lwz r10,0(r9)" is parsed to extract instruction name,
> registers names and offset. Also to find whether there is a memory
> reference in the operands, "memory_ref_char" field of objdump is used.
> For x86, "(" is used as memory_ref_char to tackle instructions of the
> form "mov (%rax), %rcx".
>
> In the case of powerpc, instructions using "(" are not the only memory
> instructions. For example, the above instruction can also be of the extended
> form (X form) "lwzx r10,0,r19". In order to easily identify the instruction
> category and extract the source/target registers, the second patchset added
> support to use the raw instruction. With the raw instruction, macros are added
> to extract opcode and register fields.
> Link to second patchset:
> https://lore.kernel.org/all/20240506121906.76639-1-atrajeev@linux.vnet.ibm.com/
>
> Example representation using --show-raw-insn in objdump gives result:
>
> 38 01 81 e8 ld r4,312(r1)
>
> Here "38 01 81 e8" is the raw instruction representation. In powerpc,
> this translates to instruction form: "ld RT,DS(RA)" and binary code
> as:
> _____________________________________
> | 58 | RT | RA | DS | |
> -------------------------------------
> 0 6 11 16 30 31
>
> The second patchset used "objdump" again to read the raw instruction.
> But since there is no need to disassemble and the binary code can be read
> directly from the DSO, the third patchset (i.e. this patchset) uses the
> approach below. The approach preferred in powerpc to parse a sample for data
> type profiling in the V3 patchset is:
> - Read directly from DSO using dso__data_read_offset
> - If that fails for any case, fallback to using libcapstone
> - If libcapstone is not supported, approach will use objdump
>
> Patchset adds support to pick the opcode and reg fields from this
> raw/binary instruction code. This approach came in from review comment
> by Segher Boessenkool and Christophe for the initial patchset.
>
> Apart from that, instruction tracking is enabled for powerpc and a
> support function is added to find variables defined as registers.
> For example, in powerpc, the two registers below are
> defined to represent variables:
> 1. r13: represents local_paca
> register struct paca_struct *local_paca asm("r13");
>
> 2. r1: represents stack_pointer
> register void *__stack_pointer asm("r1");
>
> These are handled in this patchset.
>
> - Patch 1 is to move register state type structures to a header file
> so that they can be referred to from other arch specific files
> - Patch 2 is to make instruction tracking a callback in "struct arch"
> so that it can be implemented by other archs easily and defined in arch
> specific files
> - Patch 3 is to handle state type regs array size for x86 and powerpc
> - Patch 4 adds support to capture and parse raw instruction in powerpc
> using dso__data_read_offset utility
> - Patch 4 also adds logic to support using objdump when doing default "perf
> report" or "perf annotate" since that needs the disassembled instruction.
> - Patch 5 adds disasm_line__parse to parse raw instruction for powerpc
> - Patch 6 updates parameters for reg extract functions to use raw
> instruction on powerpc
> - Patch 7 updates ins__find to carry raw_insn and also adds parse
> callback for memory instructions for powerpc
> - Patch 8 adds support to identify memory instructions of opcode 31 in
> powerpc
> - Patch 9 adds more instructions to support instruction tracking in powerpc
> - Patches 10 and 11 handle instruction tracking for powerpc.
> - Patches 12, 13 and 14 add support to use libcapstone in powerpc
> - Patches 15 and 16 handle support to find global register variables
> - Patch 17 updates data type compare functions data_type_cmp and
> sort__typeoff_sort to include var_name along with type_name in
> comparison.
> - Patch 18 handles insn-stat option for perf annotate
>
> Note:
> - There are remaining unknowns (25%) as seen in annotate Instruction stats
> below.
> - This patchset is not tested on powerpc32. In next step of enhancements
> along with handling remaining unknowns, plan to cover powerpc32 changes
> based on how testing goes.
>
> With the current patchset:
>
> ./perf record -a -e mem-loads sleep 1
> ./perf report -s type,typeoff --hierarchy --group --stdio
> ./perf annotate --data-type --insn-stat
>
> perf annotate logs:
> ==================
>
>
> Annotate Instruction stats
> total 609, ok 446 (73.2%), bad 163 (26.8%)
>
> Name/opcode : Good Bad
> -----------------------------------------------------------
> 58 : 323 80
> 32 : 49 43
> 34 : 33 11
> OP_31_XOP_LDX : 8 20
> 40 : 23 0
> OP_31_XOP_LWARX : 5 1
> OP_31_XOP_LWZX : 2 3
> OP_31_XOP_LDARX : 3 0
> 33 : 0 2
> OP_31_XOP_LBZX : 0 1
> OP_31_XOP_LWAX : 0 1
> OP_31_XOP_LHZX : 0 1
>
> perf report logs:
> =================
>
> Total Lost Samples: 0
>
> Samples: 1K of event 'mem-loads'
> Event count (approx.): 937238
>
> Overhead Data Type Data Type Offset
> ........ ......... ................
> 48.60% (unknown) (unknown) +0 (no field)
> 11.42% long unsigned int long unsigned int +0 (current_stack_pointer)
> 4.68% struct paca_struct struct paca_struct +2312 (__current)
> 4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask)
> 2.69% struct paca_struct struct paca_struct +2808 (canary)
> 2.68% struct paca_struct struct paca_struct +8 (paca_index)
> 2.24% struct paca_struct struct paca_struct +48 (data_offset)
> 1.43% long unsigned int long unsigned int +0 (no field)
> 1.41% struct vm_fault struct vm_fault +0 (vma)
> 1.29% struct task_struct struct task_struct +276 (flags)
> 1.03% struct pt_regs struct pt_regs +264 (user_regs.msr)
> 0.90% struct security_hook_list struct security_hook_list +0 (list.next)
> 0.76% struct irq_desc struct irq_desc +304 (irq_data.chip)
> 0.76% struct rq struct rq +2856 (cpu)
> 0.72% long long unsigned int long long unsigned int +0 (no field)
>
> Thanks
> Athira Rajeev
>
> Changelog:
> From v6 -> v7:
> - Addressed review comments from Namhyung
> Changed format string space to %-20s while printing
> instruction stats in patch 18.
> Use cmp_null in patch 17 while comparing var_name to
> properly sort with correct order.
>
> From v5 -> v6:
> - Addressed review comments from Namhyung
> Conditionally define TYPE_STATE_MAX_REGS based on arch.
> Added macro for defining width of the raw codes and spaces
> in disasm_line__parse_powerpc.
> Call disasm_line__parse from disasm_line__parse_powerpc
> for generic code.
> Renamed symbol__disassemble_dso to symbol__disassemble_raw.
> Fixed find_data_type_global_reg to correctly free var_types
> and change indent level.
> Fixed data_type_cmp and sort__typeoff_sort to include var_name
> in comparing data type entries.
>
> From v4 -> v5:
> - Addressed review comments from Namhyung
> Handle max number of type state regs as 16 for x86 and 32 for
> powerpc.
> Added generic support for objdump patch first and DSO read
> optimisation next
> combined patch 3 and patch 4 in patchseries V4 to one patch
> Changed reference for "raw_insn" to use "u32"
> Split "parse" callback patch changes and "ins__find" patch
> changes into two
> Instead of making weak function, added get_powerpc_regs to
> extract register and offset fields for powerpc
> - Addressed compilation failure when "dwarf.h" is not present, i.e.
> elfutils devel is not present. Used includes for #ifdef HAVE_DWARF_SUPPORT
> when including functions that use Dwarf references. Also
> conditionally include some of the header files.
>
> From v3->v4:
> - Addressed review comments from Ian by using capstone_init from
> "util/print_insn.c" instead of "open_capstone_handle".
> - Addressed review comment from Namhyung by moving "opcode"
> field from "struct ins" to "struct disasm_line"
>
> From v2->v3:
> - Addressed review comments from Christophe and Namhyung for V2
> - Changed the approach in powerpc to parse sample for data
> type profiling as:
> Read directly from DSO using dso__data_read_offset
> If that fails for any case, fallback to using libcapstone
> If libcapstone is not supported, approach will use objdump
> - Include instructions with opcode as 31 and correctly categorize
> them as memory or arithmetic instructions.
> - Include more instructions for instruction tracking in powerpc
>
> From v1->v2:
> - Addressed suggestion from Christophe Leroy and Segher Boessenkool
> to use the binary code (raw insn) to fetch opcode, register and
> offset fields.
> - Added support for instruction tracking in powerpc
> - Find the register defined variables (r13 and r1 which points to
> local_paca and current_stack_pointer in powerpc)
>
> Athira Rajeev (18):
> tools/perf: Move the data structures related to register type to
> header file
> tools/perf: Add "update_insn_state" callback function to handle arch
> specific instruction tracking
> tools/perf: Update TYPE_STATE_MAX_REGS to include max of regs in
> powerpc
> tools/perf: Add disasm_line__parse to parse raw instruction for
> powerpc
> tools/perf: Add support to capture and parse raw instruction in
> powerpc using dso__data_read_offset utility
> tools/perf: Update parameters for reg extract functions to use raw
> instruction on powerpc
> tools/perf: Add parse function for memory instructions in powerpc
> tools/perf: Add support to identify memory instructions of opcode 31
> in powerpc
> tools/perf: Add some of the arithmetic instructions to support
> instruction tracking in powerpc
> tools/perf: Add more instructions for instruction tracking
> tools/perf: Update instruction tracking for powerpc
> tools/perf: Make capstone_init non-static so that it can be used
> during symbol disassemble
> tools/perf: Use capstone_init and remove open_capstone_handle from
> disasm.c
> tools/perf: Add support to use libcapstone in powerpc
> tools/perf: Add support to find global register variables using
> find_data_type_global_reg
> tools/perf: Add support for global_die to capture name of variable in
> case of register defined variable
> tools/perf: Update data_type_cmp and sort__typeoff_sort function to
> include var_name in comparison
> tools/perf: Set instruction name to be used with insn-stat when using
> raw instruction
>
> tools/include/linux/string.h | 2 +
> tools/lib/string.c | 13 +
> tools/perf/arch/arm64/annotate/instructions.c | 3 +-
> .../arch/loongarch/annotate/instructions.c | 6 +-
> .../perf/arch/powerpc/annotate/instructions.c | 254 ++++++++
> tools/perf/arch/powerpc/util/dwarf-regs.c | 53 ++
> tools/perf/arch/s390/annotate/instructions.c | 5 +-
> tools/perf/arch/x86/annotate/instructions.c | 377 ++++++++++++
> tools/perf/builtin-annotate.c | 4 +-
> tools/perf/util/annotate-data.c | 544 ++++--------------
> tools/perf/util/annotate-data.h | 83 +++
> tools/perf/util/annotate.c | 29 +-
> tools/perf/util/annotate.h | 6 +-
> tools/perf/util/disasm.c | 468 +++++++++++++--
> tools/perf/util/disasm.h | 19 +-
> tools/perf/util/dwarf-aux.c | 1 +
> tools/perf/util/dwarf-aux.h | 1 +
> tools/perf/util/include/dwarf-regs.h | 12 +
> tools/perf/util/print_insn.c | 15 +-
> tools/perf/util/print_insn.h | 5 +
> tools/perf/util/sort.c | 25 +-
> tools/perf/util/sort.h | 1 +
> 22 files changed, 1421 insertions(+), 505 deletions(-)
>
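For reference, the raw-instruction example quoted above ("38 01 81 e8",
ld r4,312(r1), form "ld RT,DS(RA)") can be decoded by hand. The sketch
below is illustrative only and is not the code from the patchset: the
struct and function names are hypothetical, and it assumes the four bytes
are shown in little-endian order, as they would be on ppc64le.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the decoded DS-form fields. */
struct ds_form {
	unsigned int opcode;	/* bits 0..5   (primary opcode) */
	unsigned int rt;	/* bits 6..10  (target register) */
	unsigned int ra;	/* bits 11..15 (base register) */
	int32_t disp;		/* bits 16..29, as a byte displacement */
};

/* Assemble a 32-bit word from 4 bytes stored least significant first. */
static uint32_t load_le32(const unsigned char *b)
{
	assert(b != NULL);
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Split a DS-form instruction word into its fields. */
static struct ds_form decode_ds_form(uint32_t insn)
{
	struct ds_form f;
	int32_t d = (int32_t)(insn & 0xfffc);	/* drop the 2 low XO bits */

	if (d & 0x8000)				/* sign-extend from 16 bits */
		d -= 0x10000;

	f.opcode = (insn >> 26) & 0x3f;
	f.rt = (insn >> 21) & 0x1f;
	f.ra = (insn >> 16) & 0x1f;
	f.disp = d;
	return f;
}
```

Feeding it the bytes 38 01 81 e8 gives opcode 58, RT 4, RA 1 and a
displacement of 312, which agrees with the disassembly "ld r4,312(r1)".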
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH V7 15/18] tools/perf: Add support to find global register variables using find_data_type_global_reg
2024-07-13 16:55 ` [PATCH V7 15/18] tools/perf: Add support to find global register variables using find_data_type_global_reg Athira Rajeev
@ 2024-07-18 5:11 ` Namhyung Kim
0 siblings, 0 replies; 25+ messages in thread
From: Namhyung Kim @ 2024-07-18 5:11 UTC (permalink / raw)
To: Athira Rajeev
Cc: acme, jolsa, adrian.hunter, irogers, segher, christophe.leroy,
linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
kjain, disgoel
Hello,
On Sat, Jul 13, 2024 at 10:25:26PM +0530, Athira Rajeev wrote:
> There are cases where a global register variable is defined and associated
> with a specified register. For example, in powerpc, two registers are
> defined to represent variables:
> 1. r13: represents local_paca
> register struct paca_struct *local_paca asm("r13");
>
> 2. r1: represents stack_pointer
> register void *__stack_pointer asm("r1");
>
> These regs are present in dwarf debug as DW_OP_reg as part of variables
> in the cu_die (compile unit). These are not present in die search done
> in the list of nested scopes since these are global register variables.
>
> Example for local_paca represented by r13:
>
> <<>>
> <1><18dc6b4>: Abbrev Number: 128 (DW_TAG_variable)
> <18dc6b6> DW_AT_name : (indirect string, offset: 0x3861): local_paca
> <18dc6ba> DW_AT_decl_file : 48
> <18dc6bb> DW_AT_decl_line : 36
> <18dc6bc> DW_AT_decl_column : 30
> <18dc6bd> DW_AT_type : <0x18dc6c3>
> <18dc6c1> DW_AT_external : 1
> <18dc6c1> DW_AT_location : 1 byte block: 5d (DW_OP_reg13 (r13))
>
> <1><18dc6c3>: Abbrev Number: 3 (DW_TAG_pointer_type)
> <18dc6c4> DW_AT_byte_size : 8
> <18dc6c4> DW_AT_type : <0x18dc353>
>
> Where DW_AT_type : <0x18dc6c3> further points to :
>
> <1><18dc6c3>: Abbrev Number: 3 (DW_TAG_pointer_type)
> <18dc6c4> DW_AT_byte_size : 8
> <18dc6c4> DW_AT_type : <0x18dc353>
>
> which belongs to:
>
> <1><18dc353>: Abbrev Number: 67 (DW_TAG_structure_type)
> <18dc354> DW_AT_name : (indirect string, offset: 0x56cd): paca_struct
> <18dc358> DW_AT_byte_size : 2944
> <18dc35a> DW_AT_alignment : 128
> <18dc35b> DW_AT_decl_file : 48
> <18dc35c> DW_AT_decl_line : 61
> <18dc35d> DW_AT_decl_column : 8
> <18dc35d> DW_AT_sibling : <0x18dc6b4>
> <<>>
>
> Similar is case with "r1".
>
> <<>>
> <1><18dd772>: Abbrev Number: 129 (DW_TAG_variable)
> <18dd774> DW_AT_name : (indirect string, offset: 0x11ba): current_stack_pointer
> <18dd778> DW_AT_decl_file : 51
> <18dd779> DW_AT_decl_line : 1468
> <18dd77b> DW_AT_decl_column : 24
> <18dd77c> DW_AT_type : <0x18da5cd>
> <18dd780> DW_AT_external : 1
> <18dd780> DW_AT_location : 1 byte block: 51 (DW_OP_reg1 (r1))
>
> where 18da5cd is:
>
> <1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type)
> <18da5ce> DW_AT_byte_size : 8
> <18da5cf> DW_AT_encoding : 7 (unsigned)
> <18da5d0> DW_AT_name : (indirect string, offset: 0x55c7): long unsigned int
> <<>>
>
> To identify data type for these two special cases, iterate over
> variables in the CU die (Compile Unit) and match it with the register.
> If the variable is a base type, ie die_get_real_type will return NULL
I'm not sure why die_get_real_type() returned NULL. The variable has
the type attribute and the function runs the loop only if the type is
either const, restrict, volatile, shared or typedef. So I think it
should return the base_type DIE.
> here, set offset to zero. With the changes, data type for "paca_struct"
> and "long unsigned int" for r1 is identified.
>
> Snippet from ./perf report -s type,type_off
>
> 12.85% long unsigned int long unsigned int +0 (no field)
> 4.68% struct paca_struct struct paca_struct +2312 (__current)
> 4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask)
>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
> tools/perf/util/annotate-data.c | 42 ++++++++++++++++++++++++++++
> tools/perf/util/annotate.c | 8 ++++++
> tools/perf/util/annotate.h | 1 +
> tools/perf/util/include/dwarf-regs.h | 1 +
> 4 files changed, 52 insertions(+)
>
> diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
> index 734acdd8c4b7..a5b4429ede57 100644
> --- a/tools/perf/util/annotate-data.c
> +++ b/tools/perf/util/annotate-data.c
> @@ -1170,6 +1170,42 @@ static int find_data_type_block(struct data_loc_info *dloc,
> return ret;
> }
>
> +/*
> + * Handle cases where define a global register variable and
> + * associate it with a specified register. These regs are
> + * present in dwarf debug as DW_OP_reg as part of variables
> + * in the cu_die (compile unit). Iterate over variables in the
> + * cu_die and match with reg to identify data type die.
> + */
> +static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_Die *cu_die,
> + Dwarf_Die *type_die)
> +{
> + Dwarf_Die vr_die;
> + int ret = -1;
> + struct die_var_type *var_types, *vt = NULL;
> +
> + die_collect_vars(cu_die, &vt);
> + for (var_types = vt; var_types; var_types = var_types->next) {
> + if (var_types->reg != reg)
> + continue;
> + if (dwarf_offdie(dloc->di->dbg, var_types->die_off, &vr_die)) {
> + if (die_get_real_type(&vr_die, type_die) == NULL) {
OK, I think I know the reason. You don't need to call die_get_real_type()
here because var_types already holds the type, not the variable. Usually we
want a pointer type for a variable, but for the final result you want the
target type of the pointer.
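To illustrate the point, here is a minimal, self-contained sketch (not the
perf code). It models a DIE as a tag plus an optional DW_AT_type link;
struct toy_die, toy_get_type() and toy_get_real_type() are made-up names
that only mimic the shape of die_get_type() and die_get_real_type().
Starting the walk from a base type DIE gives NULL because there is no
DW_AT_type to follow, while starting from a pointer type DIE gives its
target type:

```c
#include <stddef.h>

/* Toy stand-ins for DWARF tags; only the ones needed here. */
enum toy_tag {
	TOY_TAG_VARIABLE,
	TOY_TAG_BASE_TYPE,
	TOY_TAG_POINTER_TYPE,
	TOY_TAG_STRUCT_TYPE,
	TOY_TAG_CONST_TYPE,
	TOY_TAG_TYPEDEF,
};

/* A DIE reduced to its tag and its DW_AT_type link (NULL if absent). */
struct toy_die {
	enum toy_tag tag;
	const struct toy_die *type;
};

/* Follow DW_AT_type once (assumed to mirror what die_get_type() does). */
static const struct toy_die *toy_get_type(const struct toy_die *die)
{
	return die->type;
}

/*
 * Follow DW_AT_type, skipping qualifiers and typedefs.  The first step is
 * unconditional, so the argument is expected to be a variable (or anything
 * else that has a type), not the final type itself.
 */
static const struct toy_die *toy_get_real_type(const struct toy_die *die)
{
	do {
		die = toy_get_type(die);
		if (die == NULL)
			return NULL;
	} while (die->tag == TOY_TAG_CONST_TYPE || die->tag == TOY_TAG_TYPEDEF);

	return die;
}
```

With a variable DIE whose type is a base type, the walk ends at the base
type. With the base type DIE itself as the starting point it returns NULL,
which looks like the case the patch hit.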
> + dloc->type_offset = 0;
> + dwarf_offdie(dloc->di->dbg, var_types->die_off, type_die);
> + }
> + pr_debug_type_name(type_die, TSR_KIND_TYPE);
> + ret = 0;
> + pr_debug_dtp("found by CU for %s (die:%#lx)\n",
I think it's better to say "found by global register %d".
> + dwarf_diename(type_die), (long)dwarf_dieoffset(type_die));
> + break;
> + }
> + }
> +
> + delete_var_types(vt);
> +
> + return ret;
> +}
> +
> /* The result will be saved in @type_die */
> static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
> {
> @@ -1217,6 +1253,12 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
> pr_debug_dtp("CU for %s (die:%#lx)\n",
> dwarf_diename(&cu_die), (long)dwarf_dieoffset(&cu_die));
>
> + if (loc->reg_type == DWARF_REG_GLOBAL) {
> + ret = find_data_type_global_reg(dloc, reg, &cu_die, type_die);
> + if (!ret)
> + goto out;
> + }
> +
> if (reg == DWARF_REG_PC) {
> if (get_global_var_type(&cu_die, dloc, dloc->ip, dloc->var_addr,
> &offset, type_die)) {
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index ce99db291c5e..8db2f32700aa 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -2425,6 +2425,14 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
> op_loc->reg1 = DWARF_REG_PC;
> }
>
> + /* Global reg variable 13 and 1
> + * assign to DWARF_REG_GLOBAL
> + */
> + if (arch__is(arch, "powerpc")) {
> + if ((op_loc->reg1 == 13) || (op_loc->reg1 == 1))
> + op_loc->reg_type = DWARF_REG_GLOBAL;
> + }
> +
> mem_type = find_data_type(&dloc);
>
> if (mem_type == NULL && is_stack_canary(arch, op_loc)) {
> diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
> index 9ba772f46270..ad69842a8ebc 100644
> --- a/tools/perf/util/annotate.h
> +++ b/tools/perf/util/annotate.h
> @@ -475,6 +475,7 @@ struct annotated_op_loc {
> bool mem_ref;
> bool multi_regs;
> bool imm;
> + int reg_type;
Just bool global_reg would be enough unless you plan to add more
register types.
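Something like the sketch below is what I have in mind. It is only an
illustration: struct toy_op_loc is a cut-down stand-in for
struct annotated_op_loc, and ppc_is_global_reg() and toy_mark_global_reg()
are hypothetical helpers. The register numbers 1 and 13 are the ones from
the patch.

```c
#include <stdbool.h>

/* Cut-down stand-in for struct annotated_op_loc (hypothetical). */
struct toy_op_loc {
	int reg1;
	bool mem_ref;
	bool global_reg;	/* set when reg1 backs a global register variable */
};

/* On powerpc, r1 (stack pointer) and r13 (local_paca) are the two cases. */
static bool ppc_is_global_reg(int reg)
{
	return reg == 1 || reg == 13;
}

/* Mark the operand location; the caller is assumed to have checked the arch. */
static void toy_mark_global_reg(struct toy_op_loc *loc)
{
	loc->global_reg = ppc_is_global_reg(loc->reg1);
}
```

The check in find_data_type_die() would then test the flag directly rather
than compare against a magic value.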
Thanks,
Namhyung
> };
>
> enum annotated_insn_ops {
> diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
> index 75b28dcc8317..a5e8e9498683 100644
> --- a/tools/perf/util/include/dwarf-regs.h
> +++ b/tools/perf/util/include/dwarf-regs.h
> @@ -5,6 +5,7 @@
>
> #define DWARF_REG_PC 0xd3af9c /* random number */
> #define DWARF_REG_FB 0xd3affb /* random number */
> +#define DWARF_REG_GLOBAL 0xd3affc /* random number */
>
> #ifdef HAVE_DWARF_SUPPORT
> const char *get_arch_regstr(unsigned int n);
> --
> 2.43.0
>
* Re: [PATCH V7 16/18] tools/perf: Add support for global_die to capture name of variable in case of register defined variable
2024-07-13 16:55 ` [PATCH V7 16/18] tools/perf: Add support for global_die to capture name of variable in case of register defined variable Athira Rajeev
@ 2024-07-18 5:25 ` Namhyung Kim
0 siblings, 0 replies; 25+ messages in thread
From: Namhyung Kim @ 2024-07-18 5:25 UTC (permalink / raw)
To: Athira Rajeev
Cc: acme, jolsa, adrian.hunter, irogers, segher, christophe.leroy,
linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
kjain, disgoel
On Sat, Jul 13, 2024 at 10:25:27PM +0530, Athira Rajeev wrote:
> In case of register defined variable (found using
> find_data_type_global_reg), if the type of variable happens to be base
> type (example, long unsigned int), perf report captures it as:
>
> 12.85% long unsigned int long unsigned int +0 (no field)
>
> The above data type is actually referring to samples captured while
> accessing "r1" which represents current stack pointer in powerpc.
> register void *__stack_pointer asm("r1");
>
> The dwarf debug contains this as:
>
> <<>>
> <1><18dd772>: Abbrev Number: 129 (DW_TAG_variable)
> <18dd774> DW_AT_name : (indirect string, offset: 0x11ba): current_stack_pointer
> <18dd778> DW_AT_decl_file : 51
> <18dd779> DW_AT_decl_line : 1468
> <18dd77b> DW_AT_decl_column : 24
> <18dd77c> DW_AT_type : <0x18da5cd>
> <18dd780> DW_AT_external : 1
> <18dd780> DW_AT_location : 1 byte block: 51 (DW_OP_reg1 (r1))
>
> where 18da5cd is:
>
> <1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type)
> <18da5ce> DW_AT_byte_size : 8
> <18da5cf> DW_AT_encoding : 7 (unsigned)
> <18da5d0> DW_AT_name : (indirect string, offset: 0x55c7): long unsigned int
> <<>>
Actually this is different from your description. I expect
DW_TAG_variable
DW_AT_name: __stack_pointer
DW_AT_type: <pointer_type> (void *)
But it seems your DWARF has
DW_TAG_variable
DW_AT_name: current_stack_pointer
DW_AT_type: <base_type> (long unsigned int)
>
> To make it more clear to the user, capture the DW_AT_name of the
> variable and save it as part of Dwarf_Global. Dwarf_Global is used so
> that it can be used and retrieved while presenting the result.
>
> Update "dso__findnew_data_type" function to set "var_name" if
> variable name is set as part of Dwarf_Global.
>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
> tools/perf/util/annotate-data.c | 30 ++++++++++++++++++++++++------
> tools/perf/util/dwarf-aux.c | 1 +
> tools/perf/util/dwarf-aux.h | 1 +
> 3 files changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
> index a5b4429ede57..8d05f3dbddf6 100644
> --- a/tools/perf/util/annotate-data.c
> +++ b/tools/perf/util/annotate-data.c
> @@ -268,28 +268,37 @@ static void delete_members(struct annotated_member *member)
> }
>
> static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
> - Dwarf_Die *type_die)
> + Dwarf_Die *type_die, Dwarf_Global *global_die)
> {
> struct annotated_data_type *result = NULL;
> struct annotated_data_type key;
> struct rb_node *node;
> struct strbuf sb;
> + struct strbuf sb_var_name;
> char *type_name;
> + char *var_name = NULL;
> Dwarf_Word size;
>
> strbuf_init(&sb, 32);
> + strbuf_init(&sb_var_name, 32);
> if (die_get_typename_from_type(type_die, &sb) < 0)
> strbuf_add(&sb, "(unknown type)", 14);
> + if (global_die->name) {
> + strbuf_addstr(&sb_var_name, global_die->name);
> + var_name = strbuf_detach(&sb_var_name, NULL);
I think you can just use strdup(global_die->name).
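A rough sketch of that, assuming the name may be NULL when no global
register variable was matched. toy_dup_var_name() is a hypothetical helper,
not something in the tree; the point is only that no strbuf is needed to
copy a plain string:

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a heap copy of @name, or NULL when there is
 * no name (or the allocation fails).  The caller owns the result and
 * releases it with free().
 */
static char *toy_dup_var_name(const char *name)
{
	return name ? strdup(name) : NULL;
}
```

The existing free(var_name) calls on the lookup-hit and allocation-failure
paths would keep working, since free(NULL) is a no-op.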
> + }
> type_name = strbuf_detach(&sb, NULL);
> dwarf_aggregate_size(type_die, &size);
>
> /* Check existing nodes in dso->data_types tree */
> key.self.type_name = type_name;
> + key.self.var_name = var_name;
> key.self.size = size;
> node = rb_find(&key, dso__data_types(dso), data_type_cmp);
> if (node) {
> result = rb_entry(node, struct annotated_data_type, node);
> free(type_name);
> + free(var_name);
> return result;
> }
>
> @@ -297,10 +306,12 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
> result = zalloc(sizeof(*result));
> if (result == NULL) {
> free(type_name);
> + free(var_name);
> return NULL;
> }
>
> result->self.type_name = type_name;
> + result->self.var_name = var_name;
> result->self.size = size;
> INIT_LIST_HEAD(&result->self.children);
>
> @@ -1178,7 +1189,7 @@ static int find_data_type_block(struct data_loc_info *dloc,
> * cu_die and match with reg to identify data type die.
> */
> static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_Die *cu_die,
> - Dwarf_Die *type_die)
> + Dwarf_Die *type_die, Dwarf_Global *global_die)
> {
> Dwarf_Die vr_die;
> int ret = -1;
> @@ -1191,8 +1202,11 @@ static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_
> if (dwarf_offdie(dloc->di->dbg, var_types->die_off, &vr_die)) {
> if (die_get_real_type(&vr_die, type_die) == NULL) {
> dloc->type_offset = 0;
> + global_die->name = var_types->name;
> dwarf_offdie(dloc->di->dbg, var_types->die_off, type_die);
> }
> + global_die->die_offset = (long)dwarf_dieoffset(type_die);
> + global_die->cu_offset = (long)dwarf_dieoffset(cu_die);
It seems all you need is the name of the variable. Can we simply pass
the name instead of Dwarf_Global?
> pr_debug_type_name(type_die, TSR_KIND_TYPE);
> ret = 0;
> pr_debug_dtp("found by CU for %s (die:%#lx)\n",
> @@ -1207,7 +1221,8 @@ static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_
> }
>
> /* The result will be saved in @type_die */
> -static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
> +static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die,
> + Dwarf_Global *global_die)
> {
> struct annotated_op_loc *loc = dloc->op;
> Dwarf_Die cu_die, var_die;
> @@ -1221,6 +1236,8 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
> u64 pc;
> char buf[64];
>
> + memset(global_die, 0, sizeof(Dwarf_Global));
> +
> if (dloc->op->multi_regs)
> snprintf(buf, sizeof(buf), "reg%d, reg%d", dloc->op->reg1, dloc->op->reg2);
> else if (dloc->op->reg1 == DWARF_REG_PC)
> @@ -1254,7 +1271,7 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
> dwarf_diename(&cu_die), (long)dwarf_dieoffset(&cu_die));
>
> if (loc->reg_type == DWARF_REG_GLOBAL) {
> - ret = find_data_type_global_reg(dloc, reg, &cu_die, type_die);
> + ret = find_data_type_global_reg(dloc, reg, &cu_die, type_die, global_die);
> if (!ret)
> goto out;
> }
> @@ -1390,6 +1407,7 @@ struct annotated_data_type *find_data_type(struct data_loc_info *dloc)
> struct annotated_data_type *result = NULL;
> struct dso *dso = map__dso(dloc->ms->map);
> Dwarf_Die type_die;
> + Dwarf_Global global_die;
>
> dloc->di = debuginfo__new(dso__long_name(dso));
> if (dloc->di == NULL) {
> @@ -1405,10 +1423,10 @@ struct annotated_data_type *find_data_type(struct data_loc_info *dloc)
>
> dloc->fbreg = -1;
>
> - if (find_data_type_die(dloc, &type_die) < 0)
> + if (find_data_type_die(dloc, &type_die, &global_die) < 0)
> goto out;
>
> - result = dso__findnew_data_type(dso, &type_die);
> + result = dso__findnew_data_type(dso, &type_die, &global_die);
>
> out:
> debuginfo__delete(dloc->di);
> diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
> index 44ef968a7ad3..9e61ff326651 100644
> --- a/tools/perf/util/dwarf-aux.c
> +++ b/tools/perf/util/dwarf-aux.c
> @@ -1610,6 +1610,7 @@ static int __die_collect_vars_cb(Dwarf_Die *die_mem, void *arg)
> vt->reg = reg_from_dwarf_op(ops);
> vt->offset = offset_from_dwarf_op(ops);
> vt->next = *var_types;
> + vt->name = dwarf_diename(die_mem);
Hmm.. maybe we can just collect variables (not their types) directly,
then we can get the name from the variable DIE without saving it in
the struct die_var_type.
Thanks,
Namhyung
> *var_types = vt;
>
> return DIE_FIND_CB_SIBLING;
> diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
> index 24446412b869..406a5b1e269b 100644
> --- a/tools/perf/util/dwarf-aux.h
> +++ b/tools/perf/util/dwarf-aux.h
> @@ -146,6 +146,7 @@ struct die_var_type {
> u64 addr;
> int reg;
> int offset;
> + const char *name;
> };
>
> /* Return type info of a member at offset */
> --
> 2.43.0
>
* Re: [PATCH V7 00/18] Add data type profiling support for powerpc
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
` (18 preceding siblings ...)
2024-07-16 14:18 ` [PATCH V7 00/18] Add data type profiling support for powerpc kajoljain
@ 2024-07-18 5:34 ` Namhyung Kim
2024-07-18 6:11 ` Athira Rajeev
19 siblings, 1 reply; 25+ messages in thread
From: Namhyung Kim @ 2024-07-18 5:34 UTC (permalink / raw)
To: Athira Rajeev
Cc: acme, jolsa, adrian.hunter, irogers, segher, christophe.leroy,
linux-kernel, linux-perf-users, linuxppc-dev, akanksha, maddy,
kjain, disgoel
Hello,
On Sat, Jul 13, 2024 at 10:25:11PM +0530, Athira Rajeev wrote:
> The patchset from Namhyung added support for data type profiling
> in perf tool. This enabled support to associate PMU samples to data
> types they refer using DWARF debug information. With the upstream
> perf, it is currently possible to run perf report or perf annotate to
> view the data type information on x86.
>
> The initial patchset posted here had the changes needed to enable data type
> profiling support for powerpc.
>
> https://lore.kernel.org/all/6e09dc28-4a2e-49d8-a2b5-ffb3396a9952@csgroup.eu/T/
>
> Main changes were:
> 1. powerpc instruction mnemonic table to associate load/store
> instructions with move_ops, which is used to identify if an instruction
> is a memory access one.
> 2. To get register number and access offset from the given
> instruction, code uses fields from "struct arch" -> objdump.
> Added entry for powerpc here.
> 3. A get_arch_regnum to return register number from the
> register name string.
>
> But the approach used in the initial patchset used parsing of
> disassembled code which the current perf tool implementation does.
>
> Example: lwz r10,0(r9)
>
> This line "lwz r10,0(r9)" is parsed to extract instruction name,
> registers names and offset. Also to find whether there is a memory
> reference in the operands, "memory_ref_char" field of objdump is used.
> For x86, "(" is used as memory_ref_char to tackle instructions of the
> form "mov (%rax), %rcx".
>
> In case of powerpc, instructions using "(" are not the only memory
> instructions. For example, the above instruction can also be of extended form (X
> form) "lwzx r10,0,r19". In order to easily identify the instruction category
> and extract the source/target registers, second patchset added support to use
> raw instruction. With raw instruction, macros are added to extract opcode
> and register fields.
> Link to second patchset:
> https://lore.kernel.org/all/20240506121906.76639-1-atrajeev@linux.vnet.ibm.com/
>
> Example representation using --show-raw-insn in objdump gives result:
>
> 38 01 81 e8 ld r4,312(r1)
>
> Here "38 01 81 e8" is the raw instruction representation. In powerpc,
> this translates to instruction form: "ld RT,DS(RA)" and binary code
> as:
> _____________________________________
> | 58 | RT | RA | DS | |
> -------------------------------------
> 0 6 11 16 30 31
>
> Second patchset used "objdump" again to read the raw instruction.
> But since there is no need to disassemble and binary code can be read
> directly from the DSO, the third patchset (i.e. this patchset) uses the
> approach below. The approach preferred in powerpc to parse samples for data
> type profiling in the V3 patchset is:
> - Read directly from DSO using dso__data_read_offset
> - If that fails for any case, fallback to using libcapstone
> - If libcapstone is not supported, approach will use objdump
>
> Patchset adds support to pick the opcode and reg fields from this
> raw/binary instruction code. This approach came in from review comment
> by Segher Boessenkool and Christophe for the initial patchset.
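As a worked example of the field extraction described above, the sketch
below decodes the "38 01 81 e8" bytes from the objdump line. It is not the
code from the patchset: struct ppc_ds_form, ppc_insn_from_le_bytes() and
ppc_decode_ds_form() are hypothetical names, and it assumes the bytes are
shown in little-endian memory order. Bit positions in the diagram use IBM
numbering (bit 0 is the most significant bit).

```c
#include <stdint.h>

/* Decoded fields of a DS-form instruction such as "ld RT,DS(RA)". */
struct ppc_ds_form {
	unsigned int opcode;	/* bits 0-5  */
	unsigned int rt;	/* bits 6-10 */
	unsigned int ra;	/* bits 11-15 */
	int disp;		/* bits 16-29, already scaled to bytes */
	unsigned int xo;	/* bits 30-31 */
};

/* Assemble the 32-bit instruction word from little-endian bytes. */
static uint32_t ppc_insn_from_le_bytes(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

static struct ppc_ds_form ppc_decode_ds_form(uint32_t insn)
{
	struct ppc_ds_form f;
	int32_t d = (int32_t)(insn & 0xfffc);	/* DS field with XO bits cleared */

	if (d & 0x8000)		/* sign-extend the 16-bit displacement */
		d -= 0x10000;

	f.opcode = insn >> 26;
	f.rt = (insn >> 21) & 0x1f;
	f.ra = (insn >> 16) & 0x1f;
	f.disp = d;
	f.xo = insn & 0x3;
	return f;
}
```

For the bytes above the word is 0xe8810138, which decodes to opcode 58,
RT 4, RA 1 and a displacement of 312, matching "ld r4,312(r1)".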
>
> Apart from that, instruction tracking is enabled for powerpc and
> support function is added to find variables defined as registers
> Example, in powerpc, below two registers are
> defined to represent variable:
> 1. r13: represents local_paca
> register struct paca_struct *local_paca asm("r13");
>
> 2. r1: represents stack_pointer
> register void *__stack_pointer asm("r1");
>
> These are handled in this patchset.
>
> - Patch 1 is to rearrange register state type structures to header file
> so that it can be referred to from other arch specific files
> - Patch 2 is to make instruction tracking a callback in "struct arch"
> so that it can be implemented by other archs easily and defined in arch
> specific files
> - Patch 3 is to handle state type regs array size for x86 and powerpc
> - Patch 4 adds support to capture and parse raw instruction in powerpc
> using dso__data_read_offset utility
> - Patch 4 also adds logic to support using objdump when doing default "perf
> report" or "perf annotate" since that needs the disassembled instruction.
> - Patch 5 adds disasm_line__parse to parse raw instruction for powerpc
> - Patch 6 update parameters for reg extract functions to use raw
> instruction on powerpc
> - Patch 7 updates ins__find to carry raw_insn and also adds parse
> callback for memory instructions for powerpc
> - Patch 8 add support to identify memory instructions of opcode 31 in
> powerpc
> - Patch 9 adds more instructions to support instruction tracking in powerpc
> - Patch 10 and 11 handles instruction tracking for powerpc.
> - Patch 12, 13 and 14 add support to use libcapstone in powerpc
> - Patch 15 and patch 16 handles support to find global register variables
> - Patch 17 updates data type compare functions data_type_cmp and
> sort__typeoff_sort to include var_name along with type_name in
> comparison.
> - Patch 18 handles insn-stat option for perf annotate
>
> Note:
> - There are remaining unknowns (25%) as seen in annotate Instruction stats
> below.
> - This patchset is not tested on powerpc32. In next step of enhancements
> along with handling remaining unknowns, plan to cover powerpc32 changes
> based on how testing goes.
>
> With the current patchset:
>
> ./perf record -a -e mem-loads sleep 1
> ./perf report -s type,typeoff --hierarchy --group --stdio
> ./perf annotate --data-type --insn-stat
>
> perf annotate logs:
> ==================
>
>
> Annotate Instruction stats
> total 609, ok 446 (73.2%), bad 163 (26.8%)
>
> Name/opcode : Good Bad
> -----------------------------------------------------------
> 58 : 323 80
> 32 : 49 43
> 34 : 33 11
> OP_31_XOP_LDX : 8 20
> 40 : 23 0
> OP_31_XOP_LWARX : 5 1
> OP_31_XOP_LWZX : 2 3
> OP_31_XOP_LDARX : 3 0
> 33 : 0 2
> OP_31_XOP_LBZX : 0 1
> OP_31_XOP_LWAX : 0 1
> OP_31_XOP_LHZX : 0 1
>
> perf report logs:
> =================
>
> Total Lost Samples: 0
>
> Samples: 1K of event 'mem-loads'
> Event count (approx.): 937238
>
> Overhead Data Type Data Type Offset
> ........ ......... ................
> 48.60% (unknown) (unknown) +0 (no field)
> 11.42% long unsigned int long unsigned int +0 (current_stack_pointer)
> 4.68% struct paca_struct struct paca_struct +2312 (__current)
> 4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask)
> 2.69% struct paca_struct struct paca_struct +2808 (canary)
> 2.68% struct paca_struct struct paca_struct +8 (paca_index)
> 2.24% struct paca_struct struct paca_struct +48 (data_offset)
> 1.43% long unsigned int long unsigned int +0 (no field)
> 1.41% struct vm_fault struct vm_fault +0 (vma)
> 1.29% struct task_struct struct task_struct +276 (flags)
> 1.03% struct pt_regs struct pt_regs +264 (user_regs.msr)
> 0.90% struct security_hook_list struct security_hook_list +0 (list.next)
> 0.76% struct irq_desc struct irq_desc +304 (irq_data.chip)
> 0.76% struct rq struct rq +2856 (cpu)
> 0.72% long long unsigned int long long unsigned int +0 (no field)
Thanks for your work! But I think you need to split the series into the
basic part and the global register support part, which needs more review.
For patches 1 to 14:
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Thanks,
Namhyung
>
> Thanks
> Athira Rajeev
>
> Changelog:
> From v6 -> v7:
> - Addressed review comments from Namhyung
> Changed format string space to %-20s while printing
> instruction stats in patch 18.
> Use cmp_null in patch 17 while comparing var_name to
> properly sort with correct order.
>
> From v5 -> v6:
> - Addressed review comments from Namhyung
> Conditionally define TYPE_STATE_MAX_REGS based on arch.
> Added macro for defining width of the raw codes and spaces
> in disasm_line__parse_powerpc.
> Call disasm_line__parse from disasm_line__parse_powerpc
> for generic code.
> Renamed symbol__disassemble_dso to symbol__disassemble_raw.
> Fixed find_data_type_global_reg to correctly free var_types
> and change indent level.
> Fixed data_type_cmp and sort__typeoff_sort to include var_name
> in comparing data type entries.
>
> From v4 -> v5:
> - Addressed review comments from Namhyung
> Handle max number of type state regs as 16 for x86 and 32 for
> powerpc.
> Added generic support for objdump patch first and DSO read
> optimisation next
> combined patch 3 and patch 4 in patchseries V4 to one patch
> Changed reference for "raw_insn" to use "u32"
> Splitted "parse" callback patch changes and "ins__find" patch
> changes into two
> Instead of making weak function, added get_powerpc_regs to
> extract register and offset fields for powerpc
> - Addressed compilation failure when "dwarf.h" is not present ie
> elfutils devel is not present. Used includes for #ifdef HAVE_DWARF_SUPPORT
> when including functions that use Dwarf references. Also
> conditionally include some of the header files.
>
> From v3->v4:
> - Addressed review comments from Ian by using capstone_init from
> "util/print_insn.c" instead of "open_capstone_handle".
> - Addressed review comment from Namhyung by moving "opcode"
> field from "struct ins" to "struct disasm_line"
>
> From v2->v3:
> - Addressed review comments from Christophe and Namhyung for V2
> - Changed the approach in powerpc to parse sample for data
> type profiling as:
> Read directly from DSO using dso__data_read_offset
> If that fails for any case, fallback to using libcapstone
> If libcapstone is not supported, approach will use objdump
> - Include instructions with opcode as 31 and correctly categorize
> them as memory or arithmetic instructions.
> - Include more instructions for instruction tracking in powerpc
>
> From v1->v2:
> - Addressed suggestion from Christophe Leroy and Segher Boessenkool
> to use the binary code (raw insn) to fetch opcode, register and
> offset fields.
> - Added support for instruction tracking in powerpc
> - Find the register defined variables (r13 and r1 which points to
> local_paca and current_stack_pointer in powerpc)
>
> Athira Rajeev (18):
> tools/perf: Move the data structures related to register type to
> header file
> tools/perf: Add "update_insn_state" callback function to handle arch
> specific instruction tracking
> tools/perf: Update TYPE_STATE_MAX_REGS to include max of regs in
> powerpc
> tools/perf: Add disasm_line__parse to parse raw instruction for
> powerpc
> tools/perf: Add support to capture and parse raw instruction in
> powerpc using dso__data_read_offset utility
> tools/perf: Update parameters for reg extract functions to use raw
> instruction on powerpc
> tools/perf: Add parse function for memory instructions in powerpc
> tools/perf: Add support to identify memory instructions of opcode 31
> in powerpc
> tools/perf: Add some of the arithmetic instructions to support
> instruction tracking in powerpc
> tools/perf: Add more instructions for instruction tracking
> tools/perf: Update instruction tracking for powerpc
> tools/perf: Make capstone_init non-static so that it can be used
> during symbol disassemble
> tools/perf: Use capstone_init and remove open_capstone_handle from
> disasm.c
> tools/perf: Add support to use libcapstone in powerpc
> tools/perf: Add support to find global register variables using
> find_data_type_global_reg
> tools/perf: Add support for global_die to capture name of variable in
> case of register defined variable
> tools/perf: Update data_type_cmp and sort__typeoff_sort function to
> include var_name in comparison
> tools/perf: Set instruction name to be used with insn-stat when using
> raw instruction
>
> tools/include/linux/string.h | 2 +
> tools/lib/string.c | 13 +
> tools/perf/arch/arm64/annotate/instructions.c | 3 +-
> .../arch/loongarch/annotate/instructions.c | 6 +-
> .../perf/arch/powerpc/annotate/instructions.c | 254 ++++++++
> tools/perf/arch/powerpc/util/dwarf-regs.c | 53 ++
> tools/perf/arch/s390/annotate/instructions.c | 5 +-
> tools/perf/arch/x86/annotate/instructions.c | 377 ++++++++++++
> tools/perf/builtin-annotate.c | 4 +-
> tools/perf/util/annotate-data.c | 544 ++++--------------
> tools/perf/util/annotate-data.h | 83 +++
> tools/perf/util/annotate.c | 29 +-
> tools/perf/util/annotate.h | 6 +-
> tools/perf/util/disasm.c | 468 +++++++++++++--
> tools/perf/util/disasm.h | 19 +-
> tools/perf/util/dwarf-aux.c | 1 +
> tools/perf/util/dwarf-aux.h | 1 +
> tools/perf/util/include/dwarf-regs.h | 12 +
> tools/perf/util/print_insn.c | 15 +-
> tools/perf/util/print_insn.h | 5 +
> tools/perf/util/sort.c | 25 +-
> tools/perf/util/sort.h | 1 +
> 22 files changed, 1421 insertions(+), 505 deletions(-)
>
> --
> 2.43.0
>
* Re: [PATCH V7 00/18] Add data type profiling support for powerpc
2024-07-18 5:34 ` Namhyung Kim
@ 2024-07-18 6:11 ` Athira Rajeev
2024-07-18 6:43 ` Namhyung Kim
0 siblings, 1 reply; 25+ messages in thread
From: Athira Rajeev @ 2024-07-18 6:11 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Adrian Hunter, irogers,
segher, christophe.leroy, linux-kernel, linux-perf-users,
linuxppc-dev, akanksha, maddy, kjain, disgoel
> On 18 Jul 2024, at 11:04 AM, Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> On Sat, Jul 13, 2024 at 10:25:11PM +0530, Athira Rajeev wrote:
>> The patchset from Namhyung added support for data type profiling
>> in perf tool. This enabled support to associate PMU samples to data
>> types they refer using DWARF debug information. With the upstream
>> perf, it is currently possible to run perf report or perf annotate to
>> view the data type information on x86.
>>
>> The initial patchset posted here had the changes needed to enable data type
>> profiling support for powerpc.
>>
>> https://lore.kernel.org/all/6e09dc28-4a2e-49d8-a2b5-ffb3396a9952@csgroup.eu/T/
>>
>> Main changes were:
>> 1. powerpc instruction mnemonic table to associate load/store
>> instructions with move_ops, which is used to identify if an instruction
>> is a memory access one.
>> 2. To get register number and access offset from the given
>> instruction, code uses fields from "struct arch" -> objdump.
>> Added entry for powerpc here.
>> 3. A get_arch_regnum to return register number from the
>> register name string.
>>
>> But the approach used in the initial patchset used parsing of
>> disassembled code which the current perf tool implementation does.
>>
>> Example: lwz r10,0(r9)
>>
>> This line "lwz r10,0(r9)" is parsed to extract instruction name,
>> registers names and offset. Also to find whether there is a memory
>> reference in the operands, "memory_ref_char" field of objdump is used.
>> For x86, "(" is used as memory_ref_char to tackle instructions of the
>> form "mov (%rax), %rcx".
>>
>> In case of powerpc, instructions using "(" are not the only memory
>> instructions. For example, the above instruction can also be of extended form (X
>> form) "lwzx r10,0,r19". In order to easily identify the instruction category
>> and extract the source/target registers, second patchset added support to use
>> raw instruction. With raw instruction, macros are added to extract opcode
>> and register fields.
>> Link to second patchset:
>> https://lore.kernel.org/all/20240506121906.76639-1-atrajeev@linux.vnet.ibm.com/
>>
>> Example representation using --show-raw-insn in objdump gives result:
>>
>> 38 01 81 e8 ld r4,312(r1)
>>
>> Here "38 01 81 e8" is the raw instruction representation. In powerpc,
>> this translates to instruction form: "ld RT,DS(RA)" and binary code
>> as:
>> _____________________________________
>> | 58 | RT | RA | DS | |
>> -------------------------------------
>> 0 6 11 16 30 31
>>
>> Second patchset used "objdump" again to read the raw instruction.
>> But since there is no need to disassemble and binary code can be read
>> directly from the DSO, the third patchset (i.e. this patchset) uses the
>> approach below. The approach preferred in powerpc to parse samples for data
>> type profiling in the V3 patchset is:
>> - Read directly from DSO using dso__data_read_offset
>> - If that fails for any case, fallback to using libcapstone
>> - If libcapstone is not supported, approach will use objdump
>>
>> Patchset adds support to pick the opcode and reg fields from this
>> raw/binary instruction code. This approach came in from review comment
>> by Segher Boessenkool and Christophe for the initial patchset.
>>
>> Apart from that, instruction tracking is enabled for powerpc and
>> support function is added to find variables defined as registers
>> Example, in powerpc, below two registers are
>> defined to represent variable:
>> 1. r13: represents local_paca
>> register struct paca_struct *local_paca asm("r13");
>>
>> 2. r1: represents stack_pointer
>> register void *__stack_pointer asm("r1");
>>
>> These are handled in this patchset.
>>
>> - Patch 1 is to rearrange register state type structures to header file
>> so that it can be referred to from other arch specific files
>> - Patch 2 is to make instruction tracking a callback in "struct arch"
>> so that it can be implemented by other archs easily and defined in arch
>> specific files
>> - Patch 3 is to handle state type regs array size for x86 and powerpc
>> - Patch 4 adds support to capture and parse raw instruction in powerpc
>> using dso__data_read_offset utility
>> - Patch 4 also adds logic to support using objdump when doing default "perf
>> report" or "perf annotate" since that needs the disassembled instruction.
>> - Patch 5 adds disasm_line__parse to parse raw instruction for powerpc
>> - Patch 6 updates parameters for reg extract functions to use raw
>> instruction on powerpc
>> - Patch 7 updates ins__find to carry raw_insn and also adds parse
>> callback for memory instructions for powerpc
>> - Patch 8 adds support to identify memory instructions of opcode 31 in
>> powerpc
>> - Patch 9 adds more instructions to support instruction tracking in powerpc
>> - Patch 10 and 11 handle instruction tracking for powerpc.
>> - Patch 12, 13 and 14 add support to use libcapstone in powerpc
>> - Patch 15 and patch 16 handle support to find global register variables
>> - Patch 17 updates data type compare functions data_type_cmp and
>> sort__typeoff_sort to include var_name along with type_name in
>> comparison.
>> - Patch 18 handles insn-stat option for perf annotate
>>
>> Note:
>> - There are remaining unknowns (25%) as seen in annotate Instruction stats
>> below.
>> - This patchset is not tested on powerpc32. In the next step of enhancements,
>> along with handling the remaining unknowns, we plan to cover powerpc32
>> changes based on how testing goes.
>>
>> With the current patchset:
>>
>> ./perf record -a -e mem-loads sleep 1
>> ./perf report -s type,typeoff --hierarchy --group --stdio
>> ./perf annotate --data-type --insn-stat
>>
>> perf annotate logs:
>> ==================
>>
>>
>> Annotate Instruction stats
>> total 609, ok 446 (73.2%), bad 163 (26.8%)
>>
>> Name/opcode : Good Bad
>> -----------------------------------------------------------
>> 58 : 323 80
>> 32 : 49 43
>> 34 : 33 11
>> OP_31_XOP_LDX : 8 20
>> 40 : 23 0
>> OP_31_XOP_LWARX : 5 1
>> OP_31_XOP_LWZX : 2 3
>> OP_31_XOP_LDARX : 3 0
>> 33 : 0 2
>> OP_31_XOP_LBZX : 0 1
>> OP_31_XOP_LWAX : 0 1
>> OP_31_XOP_LHZX : 0 1
>>
>> perf report logs:
>> =================
>>
>> Total Lost Samples: 0
>>
>> Samples: 1K of event 'mem-loads'
>> Event count (approx.): 937238
>>
>> Overhead Data Type Data Type Offset
>> ........ ......... ................
>> 48.60% (unknown) (unknown) +0 (no field)
>> 11.42% long unsigned int long unsigned int +0 (current_stack_pointer)
>> 4.68% struct paca_struct struct paca_struct +2312 (__current)
>> 4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask)
>> 2.69% struct paca_struct struct paca_struct +2808 (canary)
>> 2.68% struct paca_struct struct paca_struct +8 (paca_index)
>> 2.24% struct paca_struct struct paca_struct +48 (data_offset)
>> 1.43% long unsigned int long unsigned int +0 (no field)
>> 1.41% struct vm_fault struct vm_fault +0 (vma)
>> 1.29% struct task_struct struct task_struct +276 (flags)
>> 1.03% struct pt_regs struct pt_regs +264 (user_regs.msr)
>> 0.90% struct security_hook_list struct security_hook_list +0 (list.next)
>> 0.76% struct irq_desc struct irq_desc +304 (irq_data.chip)
>> 0.76% struct rq struct rq +2856 (cpu)
>> 0.72% long long unsigned int long long unsigned int +0 (no field)
>
> Thanks for your work! But I think you need to split the basic part and
> global register support part which needs more review.
>
> For the patch 1 to 14:
> Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Hi Namhyung
Thanks for all the suggestions and reviews. I will check the latest comments for patches 15 and 16 (patch 17 also depends on the global register support part). But patch 18 does not depend on the global register support patches. Along with patches 1 to 14, can you please add patch 18 as well?
Thanks
Athira
>
> Thanks,
> Namhyung
>
>>
>> Thanks
>> Athira Rajeev
>>
>> Changelog:
>> From v6 -> v7:
>> - Addressed review comments from Namhyung
>> Changed format string space to %-20s while printing
>> instruction stats in patch 18.
>> Use cmp_null in patch 17 while comparing var_name to
>> properly sort with correct order.
>>
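A minimal sketch of the comparison idea in the v6 -> v7 notes above: compare type_name first, then var_name, with a NULL-safe helper for entries that have no variable name. The struct and function names here are hypothetical stand-ins, not the actual data_type_cmp or cmp_null from perf.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced entry: only the two names that matter here.
 * type_name is assumed to be non-NULL; var_name may be NULL. */
struct type_entry_sketch {
	const char *type_name;
	const char *var_name;
};

/* NULL-safe ordering for the case where at least one side is NULL:
 * NULL sorts before non-NULL, two NULLs compare equal. */
static int cmp_null_sketch(const void *l, const void *r)
{
	if (!l && !r)
		return 0;
	return l ? 1 : -1;
}

/* Compare by type name first, then by variable name. */
static int type_entry_cmp(const struct type_entry_sketch *a,
			  const struct type_entry_sketch *b)
{
	int ret = strcmp(a->type_name, b->type_name);

	if (ret)
		return ret;
	if (!a->var_name || !b->var_name)
		return cmp_null_sketch(a->var_name, b->var_name);
	return strcmp(a->var_name, b->var_name);
}
```

With this ordering, two entries of the same type stay distinct when only one of them carries a variable name, as with the two "long unsigned int" rows in the perf report output.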
>> From v5 -> v6:
>> - Addressed review comments from Namhyung
>> Conditionally define TYPE_STATE_MAX_REGS based on arch.
>> Added macro for defining width of the raw codes and spaces
>> in disasm_line__parse_powerpc.
>> Call disasm_line__parse from disasm_line__parse_powerpc
>> for generic code.
>> Renamed symbol__disassemble_dso to symbol__disassemble_raw.
>> Fixed find_data_type_global_reg to correctly free var_types
>> and change indent level.
>> Fixed data_type_cmp and sort__typeoff_sort to include var_name
>> in comparing data type entries.
>>
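The conditional definition of TYPE_STATE_MAX_REGS mentioned above could look roughly like the sketch below, using the values given in the v4 -> v5 notes (16 for x86, 32 for powerpc). Selecting on the compiler's target macros is an assumption made for this example, and the two structs are hypothetical placeholders rather than the real perf types.

```c
/* Assumption: pick the register count from the compiler's predefined
 * target macros. powerpc has 32 general purpose registers; the other
 * branch keeps the value used for x86. */
#if defined(__powerpc__) || defined(__powerpc64__)
#define TYPE_STATE_MAX_REGS	32
#else
#define TYPE_STATE_MAX_REGS	16
#endif

/* Hypothetical per-register tracking slot. */
struct reg_state_sketch {
	int kind;	/* what the register currently holds */
	int ok;		/* non-zero when the slot is valid */
};

/* Hypothetical state table sized by the arch-dependent constant. */
struct type_state_sketch {
	struct reg_state_sketch regs[TYPE_STATE_MAX_REGS];
};
```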
>> From v4 -> v5:
>> - Addressed review comments from Namhyung
>> Handle max number of type state regs as 16 for x86 and 32 for
>> powerpc.
>> Added generic support for objdump patch first and DSO read
>> optimisation next
>> Combined patch 3 and patch 4 in patch series V4 into one patch
>> Changed reference for "raw_insn" to use "u32"
>> Split "parse" callback patch changes and "ins__find" patch
>> changes into two
>> Instead of making weak function, added get_powerpc_regs to
>> extract register and offset fields for powerpc
>> - Addressed compilation failure when "dwarf.h" is not present, i.e.
>> elfutils devel is not present. Used #ifdef HAVE_DWARF_SUPPORT
>> around functions that use Dwarf references. Also
>> conditionally include some of the header files.
>>
>> From v3->v4:
>> - Addressed review comments from Ian by using capstone_init from
>> "util/print_insn.c" instead of "open_capstone_handle".
>> - Addressed review comment from Namhyung by moving "opcode"
>> field from "struct ins" to "struct disasm_line"
>>
>> From v2->v3:
>> - Addressed review comments from Christophe and Namhyung for V2
>> - Changed the approach in powerpc to parse sample for data
>> type profiling as:
>> Read directly from DSO using dso__data_read_offset
>> If that fails for any case, fallback to using libcapstone
>> If libcapstone is not supported, approach will use objdump
>> - Include instructions with opcode as 31 and correctly categorize
>> them as memory or arithmetic instructions.
>> - Include more instructions for instruction tracking in powerpc
>>
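For opcode 31, the primary opcode alone does not say whether an instruction touches memory; the 10-bit extended opcode has to be checked as well. The sketch below shows one way to do that for the X-form loads that appear in the instruction stats earlier in this mail. The macro and function names are hypothetical; the extended opcode values are the ones defined by the Power ISA.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extended opcodes (X-form, primary opcode 31) for a few loads. */
#define SKETCH_XOP_LWARX	20
#define SKETCH_XOP_LDX		21
#define SKETCH_XOP_LWZX		23
#define SKETCH_XOP_LDARX	84
#define SKETCH_XOP_LBZX		87
#define SKETCH_XOP_LHZX		279
#define SKETCH_XOP_LWAX		341

/* Extended opcode: bits 21-30 in IBM numbering. */
static unsigned int ppc_xop(uint32_t insn)
{
	return (insn >> 1) & 0x3ff;
}

/* True when insn is an opcode 31 instruction whose extended opcode
 * is one of the loads listed above. Anything else, including
 * arithmetic such as "add", is reported as not a load. */
static bool ppc_op31_is_load(uint32_t insn)
{
	if (((insn >> 26) & 0x3f) != 31)
		return false;

	switch (ppc_xop(insn)) {
	case SKETCH_XOP_LWARX:
	case SKETCH_XOP_LDX:
	case SKETCH_XOP_LWZX:
	case SKETCH_XOP_LDARX:
	case SKETCH_XOP_LBZX:
	case SKETCH_XOP_LHZX:
	case SKETCH_XOP_LWAX:
		return true;
	default:
		return false;
	}
}
```

Here 0x7d40982e encodes "lwzx r10,0,r19" and is classified as a load, while 0x7c642a14 ("add r3,r4,r5") shares primary opcode 31 but is not.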
>> From v1->v2:
>> - Addressed suggestion from Christophe Leroy and Segher Boessenkool
>> to use the binary code (raw insn) to fetch opcode, register and
>> offset fields.
>> - Added support for instruction tracking in powerpc
>> - Find the register defined variables (r13 and r1 which point to
>> local_paca and current_stack_pointer in powerpc)
>>
>> Athira Rajeev (18):
>> tools/perf: Move the data structures related to register type to
>> header file
>> tools/perf: Add "update_insn_state" callback function to handle arch
>> specific instruction tracking
>> tools/perf: Update TYPE_STATE_MAX_REGS to include max of regs in
>> powerpc
>> tools/perf: Add disasm_line__parse to parse raw instruction for
>> powerpc
>> tools/perf: Add support to capture and parse raw instruction in
>> powerpc using dso__data_read_offset utility
>> tools/perf: Update parameters for reg extract functions to use raw
>> instruction on powerpc
>> tools/perf: Add parse function for memory instructions in powerpc
>> tools/perf: Add support to identify memory instructions of opcode 31
>> in powerpc
>> tools/perf: Add some of the arithmetic instructions to support
>> instruction tracking in powerpc
>> tools/perf: Add more instructions for instruction tracking
>> tools/perf: Update instruction tracking for powerpc
>> tools/perf: Make capstone_init non-static so that it can be used
>> during symbol disassemble
>> tools/perf: Use capstone_init and remove open_capstone_handle from
>> disasm.c
>> tools/perf: Add support to use libcapstone in powerpc
>> tools/perf: Add support to find global register variables using
>> find_data_type_global_reg
>> tools/perf: Add support for global_die to capture name of variable in
>> case of register defined variable
>> tools/perf: Update data_type_cmp and sort__typeoff_sort function to
>> include var_name in comparison
>> tools/perf: Set instruction name to be used with insn-stat when using
>> raw instruction
>>
>> tools/include/linux/string.h | 2 +
>> tools/lib/string.c | 13 +
>> tools/perf/arch/arm64/annotate/instructions.c | 3 +-
>> .../arch/loongarch/annotate/instructions.c | 6 +-
>> .../perf/arch/powerpc/annotate/instructions.c | 254 ++++++++
>> tools/perf/arch/powerpc/util/dwarf-regs.c | 53 ++
>> tools/perf/arch/s390/annotate/instructions.c | 5 +-
>> tools/perf/arch/x86/annotate/instructions.c | 377 ++++++++++++
>> tools/perf/builtin-annotate.c | 4 +-
>> tools/perf/util/annotate-data.c | 544 ++++--------------
>> tools/perf/util/annotate-data.h | 83 +++
>> tools/perf/util/annotate.c | 29 +-
>> tools/perf/util/annotate.h | 6 +-
>> tools/perf/util/disasm.c | 468 +++++++++++++--
>> tools/perf/util/disasm.h | 19 +-
>> tools/perf/util/dwarf-aux.c | 1 +
>> tools/perf/util/dwarf-aux.h | 1 +
>> tools/perf/util/include/dwarf-regs.h | 12 +
>> tools/perf/util/print_insn.c | 15 +-
>> tools/perf/util/print_insn.h | 5 +
>> tools/perf/util/sort.c | 25 +-
>> tools/perf/util/sort.h | 1 +
>> 22 files changed, 1421 insertions(+), 505 deletions(-)
>>
>> --
>> 2.43.0
* Re: [PATCH V7 00/18] Add data type profiling support for powerpc
2024-07-18 6:11 ` Athira Rajeev
@ 2024-07-18 6:43 ` Namhyung Kim
0 siblings, 0 replies; 25+ messages in thread
From: Namhyung Kim @ 2024-07-18 6:43 UTC (permalink / raw)
To: Athira Rajeev
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Adrian Hunter, irogers,
segher, christophe.leroy, linux-kernel, linux-perf-users,
linuxppc-dev, akanksha, maddy, kjain, disgoel
On Wed, Jul 17, 2024 at 11:12 PM Athira Rajeev
<atrajeev@linux.vnet.ibm.com> wrote:
>
>
>
> > On 18 Jul 2024, at 11:04 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hello,
> >
> > On Sat, Jul 13, 2024 at 10:25:11PM +0530, Athira Rajeev wrote:
> >> [...]
> >
> > Thanks for your work! But I think you need to split the basic part and
> > global register support part which needs more review.
> >
> > For the patch 1 to 14:
> > Reviewed-by: Namhyung Kim <namhyung@kernel.org>
>
> Hi Namhyung
>
> Thanks for all the suggestions and reviews. I will check the latest comments for patches 15 and 16 (patch 17 also depends on the global register support part). But patch 18 does not depend on the global register support patches. Along with patches 1 to 14, can you please add patch 18 as well?
Sure, feel free to add it to patch 18.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Thanks,
Namhyung
Thread overview: 25+ messages
2024-07-13 16:55 [PATCH V7 00/18] Add data type profiling support for powerpc Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 01/18] tools/perf: Move the data structures related to register type to header file Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 02/18] tools/perf: Add "update_insn_state" callback function to handle arch specific instruction tracking Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 03/18] tools/perf: Update TYPE_STATE_MAX_REGS to include max of regs in powerpc Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 04/18] tools/perf: Add disasm_line__parse to parse raw instruction for powerpc Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 05/18] tools/perf: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 06/18] tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 07/18] tools/perf: Add parse function for memory instructions in powerpc Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 08/18] tools/perf: Add support to identify memory instructions of opcode 31 " Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 09/18] tools/perf: Add some of the arithmetic instructions to support instruction tracking " Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 10/18] tools/perf: Add more instructions for instruction tracking Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 11/18] tools/perf: Update instruction tracking for powerpc Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 12/18] tools/perf: Make capstone_init non-static so that it can be used during symbol disassemble Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 13/18] tools/perf: Use capstone_init and remove open_capstone_handle from disasm.c Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 14/18] tools/perf: Add support to use libcapstone in powerpc Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 15/18] tools/perf: Add support to find global register variables using find_data_type_global_reg Athira Rajeev
2024-07-18 5:11 ` Namhyung Kim
2024-07-13 16:55 ` [PATCH V7 16/18] tools/perf: Add support for global_die to capture name of variable in case of register defined variable Athira Rajeev
2024-07-18 5:25 ` Namhyung Kim
2024-07-13 16:55 ` [PATCH V7 17/18] tools/perf: Update data_type_cmp and sort__typeoff_sort function to include var_name in comparison Athira Rajeev
2024-07-13 16:55 ` [PATCH V7 18/18] tools/perf: Set instruction name to be used with insn-stat when using raw instruction Athira Rajeev
2024-07-16 14:18 ` [PATCH V7 00/18] Add data type profiling support for powerpc kajoljain
2024-07-18 5:34 ` Namhyung Kim
2024-07-18 6:11 ` Athira Rajeev
2024-07-18 6:43 ` Namhyung Kim