From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>,
Kan Liang <kan.liang@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 4/6] perf annotate-data: Check memory access with two registers
Date: Thu, 2 May 2024 11:05:04 -0300
Message-ID: <ZjOdkHraWXZIuSy_@x1>
In-Reply-To: <20240502060011.1838090-5-namhyung@kernel.org>
On Wed, May 01, 2024 at 11:00:09PM -0700, Namhyung Kim wrote:
> The following instruction pattern is used to access a global variable.
>
> mov $0x231c0, %rax
> movslq %edi, %rcx
> mov -0x7dc94ae0(,%rcx,8), %rcx
> cmpl $0x0, 0xa60(%rcx,%rax,1) <<<--- here
>
> The first instruction set the address of the per-cpu variable (here, it
> is 'runqueus' of struct rq). The second instruction seems like a cpu

You mean 'runqueues', i.e. this one:

  kernel/sched/core.c
  DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);

?

But that 0xa60 would be in an alignment hole, at least in:

$ pahole --hex rq | egrep 0xa40 -A12
	struct mm_struct *         prev_mm;              /* 0xa40   0x8 */
	unsigned int               clock_update_flags;   /* 0xa48   0x4 */

	/* XXX 4 bytes hole, try to pack */

	u64                        clock;                /* 0xa50   0x8 */

	/* XXX 40 bytes hole, try to pack */

	/* --- cacheline 42 boundary (2688 bytes) --- */
	u64                        clock_task __attribute__((__aligned__(64))); /* 0xa80   0x8 */
	u64                        clock_pelt;           /* 0xa88   0x8 */
	long unsigned int          lost_idle_time;       /* 0xa90   0x8 */
$ uname -a
Linux toolbox 6.7.11-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Mar 27 16:50:39 UTC 2024 x86_64 GNU/Linux
$
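
The hole check can be redone by hand from the pahole output above. A minimal sketch, assuming a hypothetical `struct member_span` table that holds only the members shown there (this is not perf or pahole code):

```c
#include <stddef.h>

/* Hypothetical, simplified view of one struct member as pahole prints it. */
struct member_span {
	const char *name;
	size_t offset;	/* byte offset within struct rq */
	size_t size;	/* size in bytes */
};

/* Members around 0xa60, copied from the pahole output above (6.7.11 fc39). */
static const struct member_span rq_members[] = {
	{ "prev_mm",            0xa40, 0x8 },
	{ "clock_update_flags", 0xa48, 0x4 },
	{ "clock",              0xa50, 0x8 },
	{ "clock_task",         0xa80, 0x8 },
	{ "clock_pelt",         0xa88, 0x8 },
	{ "lost_idle_time",     0xa90, 0x8 },
};

#define NR_RQ_MEMBERS (sizeof(rq_members) / sizeof(rq_members[0]))

/* Return the member covering 'offset', or NULL if it lands in a hole. */
static const struct member_span *find_member(const struct member_span *m,
					     size_t nr, size_t offset)
{
	for (size_t i = 0; i < nr; i++) {
		if (offset >= m[i].offset && offset < m[i].offset + m[i].size)
			return &m[i];
	}
	return NULL;
}
```

With this table, `find_member(rq_members, NR_RQ_MEMBERS, 0xa60)` returns NULL: 'clock' ends at 0xa58 and 'clock_task' starts at 0xa80, so 0xa60 falls in the 40 byte hole on this particular kernel build. Other builds may lay out struct rq differently.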
The paragraph then reads:
----
The first instruction sets the address of the per-cpu variable (here, it
is 'runqueues' of type 'struct rq'). The second instruction seems to load
the cpu number used to index the per-cpu base. The third instruction gets
the base offset of the per-cpu area for that cpu. The last instruction
compares the value of the per-cpu variable at offset 0xa60.
----
Ok?
> number of the per-cpu base. The third instruction get the base offset
> of per-cpu area for that cpu. The last instruction compares the value
> of the per-cpu variable at the offset of 0xa60.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/util/annotate-data.c | 44 +++++++++++++++++++++++++++++----
> 1 file changed, 39 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
> index f1e52a531563..245e3ef3e2ff 100644
> --- a/tools/perf/util/annotate-data.c
> +++ b/tools/perf/util/annotate-data.c
> @@ -1031,22 +1031,37 @@ static void update_insn_state_x86(struct type_state *state,
>  		else if (has_reg_type(state, sreg) &&
>  			 state->regs[sreg].kind == TSR_KIND_PERCPU_BASE) {
>  			u64 ip = dloc->ms->sym->start + dl->al.offset;
> +			u64 var_addr = src->offset;
>  			int offset;
>
> +			if (src->multi_regs) {
> +				int reg2 = (sreg == src->reg1) ? src->reg2 : src->reg1;
> +
> +				if (has_reg_type(state, reg2) && state->regs[reg2].ok &&
> +				    state->regs[reg2].kind == TSR_KIND_CONST)
> +					var_addr += state->regs[reg2].imm_value;
> +			}
> +
>  			/*
>  			 * In kernel, %gs points to a per-cpu region for the
>  			 * current CPU. Access with a constant offset should
>  			 * be treated as a global variable access.
>  			 */
> -			if (get_global_var_type(cu_die, dloc, ip, src->offset,
> +			if (get_global_var_type(cu_die, dloc, ip, var_addr,
>  						&offset, &type_die) &&
>  			    die_get_member_type(&type_die, offset, &type_die)) {
>  				tsr->type = type_die;
>  				tsr->kind = TSR_KIND_TYPE;
>  				tsr->ok = true;
>
> -				pr_debug_dtp("mov [%x] percpu %#x(reg%d) -> reg%d",
> -					     insn_offset, src->offset, sreg, dst->reg1);
> +				if (src->multi_regs) {
> +					pr_debug_dtp("mov [%x] percpu %#x(reg%d,reg%d) -> reg%d",
> +						     insn_offset, src->offset, src->reg1,
> +						     src->reg2, dst->reg1);
> +				} else {
> +					pr_debug_dtp("mov [%x] percpu %#x(reg%d) -> reg%d",
> +						     insn_offset, src->offset, sreg, dst->reg1);
> +				}
>  				pr_debug_type_name(&tsr->type, tsr->kind);
>  			} else {
>  				tsr->ok = false;
> @@ -1340,6 +1355,17 @@ static int check_matching_type(struct type_state *state,
>
>  		pr_debug_dtp(" percpu var\n");
>
> +		if (dloc->op->multi_regs) {
> +			int reg2 = dloc->op->reg2;
> +
> +			if (dloc->op->reg2 == reg)
> +				reg2 = dloc->op->reg1;
> +
> +			if (has_reg_type(state, reg2) && state->regs[reg2].ok &&
> +			    state->regs[reg2].kind == TSR_KIND_CONST)
> +				var_addr += state->regs[reg2].imm_value;
> +		}
> +
>  		if (get_global_var_type(cu_die, dloc, dloc->ip, var_addr,
>  					&var_offset, type_die)) {
>  			dloc->type_offset = var_offset;
> @@ -1527,8 +1553,16 @@ static int find_data_type_block(struct data_loc_info *dloc, int reg,
>  		found = find_data_type_insn(dloc, reg, &basic_blocks, var_types,
>  					    cu_die, type_die);
>  		if (found > 0) {
> -			pr_debug_dtp("found by insn track: %#x(reg%d) type-offset=%#x\n",
> -				     dloc->op->offset, reg, dloc->type_offset);
> +			char buf[64];
> +
> +			if (dloc->op->multi_regs)
> +				snprintf(buf, sizeof(buf), "reg%d, reg%d",
> +					 dloc->op->reg1, dloc->op->reg2);
> +			else
> +				snprintf(buf, sizeof(buf), "reg%d", dloc->op->reg1);
> +
> +			pr_debug_dtp("found by insn track: %#x(%s) type-offset=%#x\n",
> +				     dloc->op->offset, buf, dloc->type_offset);
>  			pr_debug_type_name(type_die, TSR_KIND_TYPE);
>  			ret = 0;
>  			break;
> --
> 2.45.0.rc1.225.g2a3ae87e7f-goog
>
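
For readers following along, the address computation both hunks add can be sketched in isolation. This is a simplified model, not the perf code: `struct reg_state`, `struct mem_operand`, `NR_REGS` and `percpu_var_addr()` are hypothetical stand-ins for the tool's type state and operand location, and the register numbers in the usage note assume the x86-64 DWARF numbering (0 is %rax, 2 is %rcx).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for perf's per-register type state. */
enum reg_kind { KIND_NONE, KIND_CONST, KIND_PERCPU_BASE };

struct reg_state {
	bool ok;		/* state is known at this instruction */
	enum reg_kind kind;
	uint64_t imm_value;	/* meaningful only for KIND_CONST */
};

/* Hypothetical memory operand: offset(reg1, reg2, scale). */
struct mem_operand {
	int reg1, reg2;
	int64_t offset;
	bool multi_regs;
};

#define NR_REGS 16	/* assumption: only general purpose registers are tracked */

/*
 * Start from the displacement. If the operand uses two registers and the
 * one that is NOT the per-cpu base holds a known constant, add it: that
 * constant is the per-cpu variable's address.
 */
static uint64_t percpu_var_addr(const struct reg_state *regs,
				const struct mem_operand *op, int base_reg)
{
	uint64_t var_addr = op->offset;

	if (op->multi_regs) {
		int reg2 = (base_reg == op->reg1) ? op->reg2 : op->reg1;

		if (reg2 >= 0 && reg2 < NR_REGS && regs[reg2].ok &&
		    regs[reg2].kind == KIND_CONST)
			var_addr += regs[reg2].imm_value;
	}
	return var_addr;
}
```

For the cmpl in the commit message, %rax holds the constant 0x231c0 and %rcx is the per-cpu base, so the operand 0xa60(%rcx,%rax,1) resolves to 0x231c0 + 0xa60 = 0x23c20. A global variable lookup at that address would then be expected to report the variable at 0x231c0 with a member offset of 0xa60.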
Thread overview: 14+ messages
2024-05-02 6:00 [PATCHSET 0/6] perf annotate-data: Small updates in the data type profiling (v1) Namhyung Kim
2024-05-02 6:00 ` [PATCH 1/6] perf dwarf-aux: Add die_collect_global_vars() Namhyung Kim
2024-05-02 6:00 ` [PATCH 2/6] perf annotate-data: Collect global variables in advance Namhyung Kim
2024-05-02 13:50 ` Arnaldo Carvalho de Melo
2024-05-02 18:23 ` Namhyung Kim
2024-05-02 23:28 ` Namhyung Kim
2024-05-02 6:00 ` [PATCH 3/6] perf annotate-data: Handle direct global variable access Namhyung Kim
2024-05-02 6:00 ` [PATCH 4/6] perf annotate-data: Check memory access with two registers Namhyung Kim
2024-05-02 14:05 ` Arnaldo Carvalho de Melo [this message]
2024-05-02 18:14 ` Namhyung Kim
2024-05-04 18:26 ` Arnaldo Carvalho de Melo
2024-05-02 6:00 ` [PATCH 5/6] perf annotate-data: Handle multi regs in find_data_type_block() Namhyung Kim
2024-05-02 6:00 ` [PATCH 6/6] perf annotate-data: Check kind of stack variables Namhyung Kim
2024-05-02 14:25 ` [PATCHSET 0/6] perf annotate-data: Small updates in the data type profiling (v1) Arnaldo Carvalho de Melo