From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>,
Kan Liang <kan.liang@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 4/6] perf annotate-data: Check memory access with two registers
Date: Thu, 2 May 2024 11:05:04 -0300
Message-ID: <ZjOdkHraWXZIuSy_@x1>
In-Reply-To: <20240502060011.1838090-5-namhyung@kernel.org>
On Wed, May 01, 2024 at 11:00:09PM -0700, Namhyung Kim wrote:
> The following instruction pattern is used to access a global variable.
>
> mov $0x231c0, %rax
> movslq %edi, %rcx
> mov -0x7dc94ae0(,%rcx,8), %rcx
> cmpl $0x0, 0xa60(%rcx,%rax,1) <<<--- here
>
> The first instruction set the address of the per-cpu variable (here, it
> is 'runqueus' of struct rq). The second instruction seems like a cpu
You mean 'runqueues', i.e. this one?

  kernel/sched/core.c
  DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);

But then 0xa60 would fall in an alignment hole, at least on this kernel:

$ pahole --hex rq | egrep 0xa40 -A12
	struct mm_struct *         prev_mm;              /* 0xa40   0x8 */
	unsigned int               clock_update_flags;   /* 0xa48   0x4 */

	/* XXX 4 bytes hole, try to pack */

	u64                        clock;                /* 0xa50   0x8 */

	/* XXX 40 bytes hole, try to pack */

	/* --- cacheline 42 boundary (2688 bytes) --- */
	u64                        clock_task __attribute__((__aligned__(64))); /* 0xa80   0x8 */
	u64                        clock_pelt;           /* 0xa88   0x8 */
	long unsigned int          lost_idle_time;       /* 0xa90   0x8 */
$ uname -a
Linux toolbox 6.7.11-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Mar 27 16:50:39 UTC 2024 x86_64 GNU/Linux
$
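
To make the hole check explicit, here is a minimal sketch that looks up an
offset in a table of (offset, size) pairs copied from the pahole output
above. The 'struct member' type and member_at() are hypothetical helpers for
illustration only, not perf or pahole code, and the table only covers the
members shown for this particular 6.7.11 build.

```c
#include <stddef.h>

/* Hypothetical record describing one struct member. */
struct member {
	const char *name;
	size_t offset;
	size_t size;
};

/* Offsets and sizes copied from the pahole output for 'struct rq' above. */
static const struct member rq_members[] = {
	{ "prev_mm",            0xa40, 0x8 },
	{ "clock_update_flags", 0xa48, 0x4 },
	{ "clock",              0xa50, 0x8 },
	{ "clock_task",         0xa80, 0x8 },
	{ "clock_pelt",         0xa88, 0x8 },
	{ "lost_idle_time",     0xa90, 0x8 },
};

/* Return the member that covers 'offset', or NULL if it lands in a hole. */
static const struct member *member_at(size_t offset)
{
	size_t n = sizeof(rq_members) / sizeof(rq_members[0]);

	for (size_t i = 0; i < n; i++) {
		const struct member *m = &rq_members[i];

		if (offset >= m->offset && offset < m->offset + m->size)
			return m;
	}
	return NULL;
}
```

With this table, member_at(0xa60) returns NULL because 'clock' ends at 0xa58
and 'clock_task' only starts at 0xa80. A different kernel config may of
course lay out 'struct rq' differently.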

The paragraph would then read:

----
The first instruction sets the address of the per-cpu variable (here, it
is 'runqueues' of type 'struct rq'). The second instruction seems to load
a cpu number for the per-cpu base. The third instruction gets the base
offset of the per-cpu area for that cpu. The last instruction compares
the value of the per-cpu variable at offset 0xa60.
----

Ok?
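
As a rough model of what that paragraph describes, the address used by the
last instruction can be written out in C. This is only a sketch under the
reading given above: percpu_member_addr() and the 'percpu_offset' table are
hypothetical names standing in for whatever the third instruction reads, not
kernel or perf APIs.

```c
#include <stdint.h>

/*
 * Hypothetical model of the four-instruction pattern:
 *   1st insn: load the address of the per-cpu variable (var_addr)
 *   2nd insn: sign-extend the cpu number into a 64-bit index
 *   3rd insn: load the per-cpu base offset for that cpu from a table
 *   4th insn: access member_off(base, var_addr, 1)
 */
static uint64_t percpu_member_addr(const uint64_t *percpu_offset, int cpu,
				   uint64_t var_addr, uint64_t member_off)
{
	int64_t idx = (int64_t)cpu;          /* 2nd insn */
	uint64_t base = percpu_offset[idx];  /* 3rd insn: 8-byte table entries */

	return base + var_addr + member_off; /* operand of the 4th insn */
}
```

The point for the type lookup is that the variable is identified by
var_addr plus the displacement, independent of the per-cpu base, which is why
the constant held in the second register has to be added back.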
> number of the per-cpu base. The third instruction get the base offset
> of per-cpu area for that cpu. The last instruction compares the value
> of the per-cpu variable at the offset of 0xa60.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/util/annotate-data.c | 44 +++++++++++++++++++++++++++++----
> 1 file changed, 39 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
> index f1e52a531563..245e3ef3e2ff 100644
> --- a/tools/perf/util/annotate-data.c
> +++ b/tools/perf/util/annotate-data.c
> @@ -1031,22 +1031,37 @@ static void update_insn_state_x86(struct type_state *state,
> else if (has_reg_type(state, sreg) &&
> state->regs[sreg].kind == TSR_KIND_PERCPU_BASE) {
> u64 ip = dloc->ms->sym->start + dl->al.offset;
> + u64 var_addr = src->offset;
> int offset;
>
> + if (src->multi_regs) {
> + int reg2 = (sreg == src->reg1) ? src->reg2 : src->reg1;
> +
> + if (has_reg_type(state, reg2) && state->regs[reg2].ok &&
> + state->regs[reg2].kind == TSR_KIND_CONST)
> + var_addr += state->regs[reg2].imm_value;
> + }
> +
> /*
> * In kernel, %gs points to a per-cpu region for the
> * current CPU. Access with a constant offset should
> * be treated as a global variable access.
> */
> - if (get_global_var_type(cu_die, dloc, ip, src->offset,
> + if (get_global_var_type(cu_die, dloc, ip, var_addr,
> &offset, &type_die) &&
> die_get_member_type(&type_die, offset, &type_die)) {
> tsr->type = type_die;
> tsr->kind = TSR_KIND_TYPE;
> tsr->ok = true;
>
> - pr_debug_dtp("mov [%x] percpu %#x(reg%d) -> reg%d",
> - insn_offset, src->offset, sreg, dst->reg1);
> + if (src->multi_regs) {
> + pr_debug_dtp("mov [%x] percpu %#x(reg%d,reg%d) -> reg%d",
> + insn_offset, src->offset, src->reg1,
> + src->reg2, dst->reg1);
> + } else {
> + pr_debug_dtp("mov [%x] percpu %#x(reg%d) -> reg%d",
> + insn_offset, src->offset, sreg, dst->reg1);
> + }
> pr_debug_type_name(&tsr->type, tsr->kind);
> } else {
> tsr->ok = false;
> @@ -1340,6 +1355,17 @@ static int check_matching_type(struct type_state *state,
>
> pr_debug_dtp(" percpu var\n");
>
> + if (dloc->op->multi_regs) {
> + int reg2 = dloc->op->reg2;
> +
> + if (dloc->op->reg2 == reg)
> + reg2 = dloc->op->reg1;
> +
> + if (has_reg_type(state, reg2) && state->regs[reg2].ok &&
> + state->regs[reg2].kind == TSR_KIND_CONST)
> + var_addr += state->regs[reg2].imm_value;
> + }
> +
> if (get_global_var_type(cu_die, dloc, dloc->ip, var_addr,
> &var_offset, type_die)) {
> dloc->type_offset = var_offset;
> @@ -1527,8 +1553,16 @@ static int find_data_type_block(struct data_loc_info *dloc, int reg,
> found = find_data_type_insn(dloc, reg, &basic_blocks, var_types,
> cu_die, type_die);
> if (found > 0) {
> - pr_debug_dtp("found by insn track: %#x(reg%d) type-offset=%#x\n",
> - dloc->op->offset, reg, dloc->type_offset);
> + char buf[64];
> +
> + if (dloc->op->multi_regs)
> + snprintf(buf, sizeof(buf), "reg%d, reg%d",
> + dloc->op->reg1, dloc->op->reg2);
> + else
> + snprintf(buf, sizeof(buf), "reg%d", dloc->op->reg1);
> +
> + pr_debug_dtp("found by insn track: %#x(%s) type-offset=%#x\n",
> + dloc->op->offset, buf, dloc->type_offset);
> pr_debug_type_name(type_die, TSR_KIND_TYPE);
> ret = 0;
> break;
> --
> 2.45.0.rc1.225.g2a3ae87e7f-goog
>
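
For reference, the logic both hunks add can be sketched in isolation like
this. The types below are simplified, hypothetical stand-ins for the perf
structures touched by the patch (only the fields the hunks use), and
resolve_var_addr() is a name made up for this example; the bounds check
stands in for has_reg_type().

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the register state tracked per instruction. */
enum reg_kind { KIND_NONE, KIND_CONST, KIND_PERCPU_BASE };

struct reg_state {
	bool ok;
	enum reg_kind kind;
	int64_t imm_value;
};

/* Simplified memory operand: offset(reg1) or offset(reg1,reg2). */
struct mem_op {
	int reg1;
	int reg2;
	bool multi_regs;
	int64_t offset;
};

/*
 * Start from the displacement and, for a two-register operand, add the
 * constant held by the register that is not the per-cpu base.
 */
static uint64_t resolve_var_addr(const struct mem_op *op, int base_reg,
				 const struct reg_state *regs, int nr_regs)
{
	uint64_t var_addr = op->offset;

	if (op->multi_regs) {
		int reg2 = (base_reg == op->reg1) ? op->reg2 : op->reg1;

		if (reg2 >= 0 && reg2 < nr_regs && regs[reg2].ok &&
		    regs[reg2].kind == KIND_CONST)
			var_addr += regs[reg2].imm_value;
	}
	return var_addr;
}
```

With the example from the commit message, the displacement 0xa60 and the
constant 0x231c0 add up to the address handed to the global variable lookup.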