From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: namhyung@kernel.org, kim.phillips@amd.com, peterz@infradead.org,
	mingo@redhat.com, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	irogers@google.com, adrian.hunter@intel.com,
	kan.liang@linux.intel.com, changbin.du@huawei.com,
	yangjihong1@huawei.com, zwisler@chromium.org,
	wangming01@loongson.cn, chenhuacai@kernel.org,
	kprateek.nayak@amd.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org, sandipan.das@amd.com,
	ananth.narayan@amd.com, santosh.shukla@amd.com
Subject: Re: [PATCH 2/2] perf header: Additional note on AMD IBS for max_precise pmu cap
Date: Thu, 9 Nov 2023 16:56:15 -0300	[thread overview]
Message-ID: <ZU05X1SLwk1NTW59@kernel.org> (raw)
In-Reply-To: <20231107083331.901-2-ravi.bangoria@amd.com>

On Tue, Nov 07, 2023 at 02:03:31PM +0530, Ravi Bangoria wrote:
> From: Arnaldo Carvalho de Melo <acme@kernel.org>

Applied this one; waiting a bit longer for Ian's comments to be addressed.

- Arnaldo
 
> The x86 core PMU exposes its maximum supported precision level via the
> max_precise PMU capability. Although the AMD core PMU does not support
> precise mode, certain core PMU events with precise_ip > 0 are allowed
> and forwarded to the IBS OP PMU. Display a note about this in the perf
> report header and document the details in the perf-list man page.
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
>  tools/perf/Documentation/perf-list.txt | 12 +++++++-----
>  tools/perf/util/env.c                  | 18 ++++++++++++++++++
>  tools/perf/util/env.h                  |  2 ++
>  tools/perf/util/header.c               |  8 ++++++++
>  4 files changed, 35 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
> index d5f78e125efe..1b90575ee3c8 100644
> --- a/tools/perf/Documentation/perf-list.txt
> +++ b/tools/perf/Documentation/perf-list.txt
> @@ -81,11 +81,13 @@ For Intel systems precise event sampling is implemented with PEBS
>  which supports up to precise-level 2, and precise level 3 for
>  some special cases
>  
> -On AMD systems it is implemented using IBS (up to precise-level 2).
> -The precise modifier works with event types 0x76 (cpu-cycles, CPU
> -clocks not halted) and 0xC1 (micro-ops retired). Both events map to
> -IBS execution sampling (IBS op) with the IBS Op Counter Control bit
> -(IbsOpCntCtl) set respectively (see the
> +On AMD systems it is implemented using IBS OP (up to precise-level 2).
> +Unlike Intel PEBS, which provides levels of precision, the AMD core PMU
> +is inherently non-precise and IBS is inherently precise (i.e. ibs_op//,
> +ibs_op//p, ibs_op//pp and ibs_op//ppp are all the same). The precise modifier
> +works with event types 0x76 (cpu-cycles, CPU clocks not halted) and 0xC1
> +(micro-ops retired). Both events map to IBS execution sampling (IBS op)
> +with the IBS Op Counter Control bit (IbsOpCntCtl) set respectively (see the
>  Core Complex (CCX) -> Processor x86 Core -> Instruction Based Sampling (IBS)
>  section of the [AMD Processor Programming Reference (PPR)] relevant to the
>  family, model and stepping of the processor being used).
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index 44140b7f596a..cbc18b22ace5 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -531,6 +531,24 @@ int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu)
>  	return cpu.cpu >= 0 && cpu.cpu < env->nr_numa_map ? env->numa_map[cpu.cpu] : -1;
>  }
>  
> +bool perf_env__has_pmu_mapping(struct perf_env *env, const char *pmu_name)
> +{
> +	char *pmu_mapping = env->pmu_mappings, *colon;
> +
> +	for (int i = 0; i < env->nr_pmu_mappings; ++i) {
> +		if (strtoul(pmu_mapping, &colon, 0) == ULONG_MAX || *colon != ':')
> +			goto out_error;
> +
> +		pmu_mapping = colon + 1;
> +		if (strcmp(pmu_mapping, pmu_name) == 0)
> +			return true;
> +
> +		pmu_mapping += strlen(pmu_mapping) + 1;
> +	}
> +out_error:
> +	return false;
> +}
> +
>  char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
>  			     const char *cap)
>  {
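
For anyone reading along, the lookup above can be sketched outside of perf.
This is only an illustration, assuming (as the loop implies) that the mappings
buffer holds nr records of the form "<type>:<name>", each terminated by a NUL
byte and packed back to back. The function name has_pmu_mapping(), the
sample_mappings array and the type numbers in it are made up for the example
and are not taken from a real perf.data file.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical standalone variant of the lookup: 'buf' holds 'nr' records,
 * each "<type>:<name>" terminated by a NUL byte, packed back to back.
 */
static bool has_pmu_mapping(const char *buf, int nr, const char *pmu_name)
{
	const char *p = buf;
	char *colon;

	for (int i = 0; i < nr; i++) {
		/* Parse the numeric PMU type; it must be followed by ':'. */
		if (strtoul(p, &colon, 0) == ULONG_MAX || *colon != ':')
			return false;

		/* The PMU name starts right after the colon. */
		p = colon + 1;
		if (strcmp(p, pmu_name) == 0)
			return true;

		/* Skip the name and its terminating NUL to reach the next record. */
		p += strlen(p) + 1;
	}
	return false;
}

/* Made-up sample data; adjacent string literals embed the NUL separators. */
static const char sample_mappings[] = "4:cpu\0" "8:ibs_fetch\0" "9:ibs_op\0";
```

With this sample buffer and nr = 3, looking up "ibs_op" succeeds and looking up
"msr" fails. A record that does not start with a number followed by ':' ends
the walk early and is reported as "not found", which matches the out_error
path in the patch.
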
> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index 4566c51f2fd9..56aea562c61b 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -174,4 +174,6 @@ struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
>  int perf_env__numa_node(struct perf_env *env, struct perf_cpu cpu);
>  char *perf_env__find_pmu_cap(struct perf_env *env, const char *pmu_name,
>  			     const char *cap);
> +
> +bool perf_env__has_pmu_mapping(struct perf_env *env, const char *pmu_name);
>  #endif /* __PERF_ENV_H */
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index e86b9439ffee..3cc288d14002 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -2145,6 +2145,14 @@ static void print_pmu_caps(struct feat_fd *ff, FILE *fp)
>  		__print_pmu_caps(fp, pmu_caps->nr_caps, pmu_caps->caps,
>  				 pmu_caps->pmu_name);
>  	}
> +
> +	if (strcmp(perf_env__arch(&ff->ph->env), "x86") == 0 &&
> +	    perf_env__has_pmu_mapping(&ff->ph->env, "ibs_op")) {
> +		char *max_precise = perf_env__find_pmu_cap(&ff->ph->env, "cpu", "max_precise");
> +
> +		if (max_precise != NULL && atoi(max_precise) == 0)
> +			fprintf(fp, "# AMD systems use the ibs_op// PMU for some precise events, e.g.: cycles:p, see the 'perf list' man page for further details.\n");
> +	}
>  }
>  
>  static void print_pmu_mappings(struct feat_fd *ff, FILE *fp)
> -- 
> 2.41.0
> 
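
The condition guarding the new note in print_pmu_caps() can be isolated as a
small predicate, which makes the three requirements easier to see: the
recorded arch is x86, an ibs_op PMU mapping exists, and the cpu PMU reports
max_precise as 0. This is a sketch only; want_ibs_note() is a hypothetical
helper that is not part of perf, and the has_ibs_op flag stands in for the
perf_env__has_pmu_mapping() call.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of perf: returns true when the header note
 * should be printed. 'max_precise' is the capability value as a string, or
 * NULL when the capability was not recorded.
 */
static bool want_ibs_note(const char *arch, bool has_ibs_op,
			  const char *max_precise)
{
	/* The note is only relevant on x86 systems that expose ibs_op. */
	if (strcmp(arch, "x86") != 0 || !has_ibs_op)
		return false;

	/* A core PMU with max_precise == 0 has no precise mode of its own. */
	return max_precise != NULL && atoi(max_precise) == 0;
}
```

One thing to keep in mind: atoi() returns 0 for a string that does not start
with a number, so a malformed max_precise value would also satisfy the check.
If that ever matters, strtol() with an end-pointer check would be the stricter
choice.
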

-- 

- Arnaldo

Thread overview: 11+ messages
2023-11-07  8:33 [PATCH 1/2] perf tool AMD: Use non-precise cycles as default event on certain Zen2 processors Ravi Bangoria
2023-11-07  8:33 ` [PATCH 2/2] perf header: Additional note on AMD IBS for max_precise pmu cap Ravi Bangoria
2023-11-09 19:56   ` Arnaldo Carvalho de Melo [this message]
2023-11-07 18:22 ` [PATCH 1/2] perf tool AMD: Use non-precise cycles as default event on certain Zen2 processors Ian Rogers
2023-11-10  9:37   ` Ravi Bangoria
2023-11-09 21:53 ` Namhyung Kim
2023-11-10  9:46   ` Ravi Bangoria
2023-12-08 23:33     ` Namhyung Kim
2023-12-11 13:53       ` Ravi Bangoria
2023-12-11 23:01         ` Namhyung Kim
2023-12-12  4:18           ` Ravi Bangoria
