linux-perf-users.vger.kernel.org archive mirror
From: James Clark <james.clark@arm.com>
To: Ian Rogers <irogers@google.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Arnaldo Carvalho de Melo" <acme@kernel.org>,
	"Mark Rutland" <mark.rutland@arm.com>,
	"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"Namhyung Kim" <namhyung@kernel.org>,
	"Adrian Hunter" <adrian.hunter@intel.com>,
	"Suzuki K Poulose" <suzuki.poulose@arm.com>,
	"Mike Leach" <mike.leach@linaro.org>,
	"John Garry" <john.g.garry@oracle.com>,
	"Will Deacon" <will@kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Darren Hart" <dvhart@infradead.org>,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	"André Almeida" <andrealmeid@igalia.com>,
	"Kan Liang" <kan.liang@linux.intel.com>,
	"K Prateek Nayak" <kprateek.nayak@amd.com>,
	"Sean Christopherson" <seanjc@google.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Kajol Jain" <kjain@linux.ibm.com>,
	"Athira Rajeev" <atrajeev@linux.vnet.ibm.com>,
	"Andrew Jones" <ajones@ventanamicro.com>,
	"Alexandre Ghiti" <alexghiti@rivosinc.com>,
	"Atish Patra" <atishp@rivosinc.com>,
	"Steinar H. Gunderson" <sesse@google.com>,
	"Yang Jihong" <yangjihong1@huawei.com>,
	"Yang Li" <yang.lee@linux.alibaba.com>,
	"Changbin Du" <changbin.du@huawei.com>,
	"Sandipan Das" <sandipan.das@amd.com>,
	"Ravi Bangoria" <ravi.bangoria@amd.com>,
	"Paran Lee" <p4ranlee@gmail.com>,
	"Nick Desaulniers" <ndesaulniers@google.com>,
	"Huacai Chen" <chenhuacai@kernel.org>,
	"Yanteng Si" <siyanteng@loongson.cn>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org,
	bpf@vger.kernel.org, "Leo Yan" <leo.yan@linaro.org>
Subject: Re: [PATCH v1 07/14] perf arm-spe/cs-etm: Directly iterate CPU maps
Date: Tue, 12 Dec 2023 14:36:21 +0000
Message-ID: <2adf8e9c-e08d-a772-bfe2-378d6759721f@arm.com>
In-Reply-To: <e3a01313-ed03-bc54-0260-5445fb2c15ee@arm.com>



On 12/12/2023 14:17, James Clark wrote:
> 
> 
> On 29/11/2023 06:02, Ian Rogers wrote:
>> Rather than iterate all CPUs and see if they are in CPU maps, directly
>> iterate the CPU map. Similarly make use of the intersect
>> function. Switch perf_cpu_map__has_any_cpu_or_is_empty to more
>> appropriate alternatives.
>>
>> Signed-off-by: Ian Rogers <irogers@google.com>
>> ---
>>  tools/perf/arch/arm/util/cs-etm.c    | 77 ++++++++++++----------------
>>  tools/perf/arch/arm64/util/arm-spe.c |  4 +-
>>  2 files changed, 34 insertions(+), 47 deletions(-)
>>
>> diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
>> index 77e6663c1703..a68a72f2f668 100644
>> --- a/tools/perf/arch/arm/util/cs-etm.c
>> +++ b/tools/perf/arch/arm/util/cs-etm.c
>> @@ -197,38 +197,32 @@ static int cs_etm_validate_timestamp(struct auxtrace_record *itr,
>>  static int cs_etm_validate_config(struct auxtrace_record *itr,
>>  				  struct evsel *evsel)
>>  {
>> -	int i, err = -EINVAL;
>> +	int idx, err = -EINVAL;
>>  	struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus;
>>  	struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
>> +	struct perf_cpu_map *intersect_cpus = perf_cpu_map__intersect(event_cpus, online_cpus);
>> +	struct perf_cpu cpu;
>>  
>> -	/* Set option of each CPU we have */
>> -	for (i = 0; i < cpu__max_cpu().cpu; i++) {
>> -		struct perf_cpu cpu = { .cpu = i, };
>> -
>> -		/*
>> -		 * In per-cpu case, do the validation for CPUs to work with.
>> -		 * In per-thread case, the CPU map is empty.  Since the traced
>> -		 * program can run on any CPUs in this case, thus don't skip
>> -		 * validation.
>> -		 */
>> -		if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus) &&
>> -		    !perf_cpu_map__has(event_cpus, cpu))
>> -			continue;
> 
> This has broken validation for per-thread sessions.
> perf_cpu_map__intersect() doesn't seem to handle the case where an
> 'any' map is intersected with an online map. For this to work it should
> return the online map, and that seems like the sensible behaviour for
> it to have.
> 
> At least that was my initial impression; I only debugged it far enough
> to see that the loop is now skipped entirely.
> 
>> -
>> -		if (!perf_cpu_map__has(online_cpus, cpu))
>> -			continue;
>> +	perf_cpu_map__put(online_cpus);
>>  
>> -		err = cs_etm_validate_context_id(itr, evsel, i);
>> +	/*
>> +	 * Set option of each CPU we have. In per-cpu case, do the validation
>> +	 * for CPUs to work with.  In per-thread case, the CPU map is empty.
>> +	 * Since the traced program can run on any CPUs in this case, thus don't
>> +	 * skip validation.
>> +	 */
>> +	perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) {
>> +		err = cs_etm_validate_context_id(itr, evsel, cpu.cpu);
>>  		if (err)
>>  			goto out;
>> -		err = cs_etm_validate_timestamp(itr, evsel, i);
>> +		err = cs_etm_validate_timestamp(itr, evsel, idx);
>>  		if (err)
>>  			goto out;
>>  	}
>>  
>>  	err = 0;
>>  out:
>> -	perf_cpu_map__put(online_cpus);
>> +	perf_cpu_map__put(intersect_cpus);
>>  	return err;
>>  }
>>  
>> @@ -435,7 +429,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
>>  	 * Also the case of per-cpu mmaps, need the contextID in order to be notified
>>  	 * when a context switch happened.
>>  	 */
>> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
>> +	if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) {
>>  		evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
>>  					   "timestamp", 1);
>>  		evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
>> @@ -461,7 +455,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
>>  	evsel->core.attr.sample_period = 1;
>>  
>>  	/* In per-cpu case, always need the time of mmap events etc */
>> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus))
>> +	if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus))
>>  		evsel__set_sample_bit(evsel, TIME);
>>  
>>  	err = cs_etm_validate_config(itr, cs_etm_evsel);
>> @@ -533,38 +527,32 @@ static size_t
>>  cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
>>  		      struct evlist *evlist __maybe_unused)
>>  {
>> -	int i;
>> +	int idx;
>>  	int etmv3 = 0, etmv4 = 0, ete = 0;
>>  	struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus;
>>  	struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
>> +	struct perf_cpu cpu;
>>  
>>  	/* cpu map is not empty, we have specific CPUs to work with */
>> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) {
>> -		for (i = 0; i < cpu__max_cpu().cpu; i++) {
>> -			struct perf_cpu cpu = { .cpu = i, };
>> -
>> -			if (!perf_cpu_map__has(event_cpus, cpu) ||
>> -			    !perf_cpu_map__has(online_cpus, cpu))
>> -				continue;
>> +	if (!perf_cpu_map__is_empty(event_cpus)) {
>> +		struct perf_cpu_map *intersect_cpus =
>> +			perf_cpu_map__intersect(event_cpus, online_cpus);
>>  
>> -			if (cs_etm_is_ete(itr, i))
>> +		perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) {
>> +			if (cs_etm_is_ete(itr, cpu.cpu))

Similar problem here. For a per-thread session the CPU map is not empty
(it's an 'any' map, presumably of length 1), so we take this first if
branch rather than the else below, which is for the 'any' scenario.

The intersect with online CPUs then yields an empty map, so no CPU
metadata is recorded and the session fails.

If you made the intersect work in the way I mentioned above, we could
also delete the else below, because that's just another way to convert
from 'any' to 'all online'.
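
To make the suggestion concrete, here is a minimal standalone model of the
two intersect behaviours. It is only a sketch under stated assumptions: the
model_* names and struct model_map are hypothetical and do not exist in
libperf, the 'any' map is modelled as a single -1 entry, and maps are
assumed to be sorted. It is not the real perf_cpu_map__intersect().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CPUS 8
#define MODEL_ANY_CPU (-1)	/* assumption: stands in for the "any CPU" entry */

/* Hypothetical stand-in for struct perf_cpu_map: a sorted array of CPUs. */
struct model_map {
	int nr;
	int cpus[MODEL_MAX_CPUS];
};

/* True for the per-thread style map: exactly one "any CPU" entry. */
static bool model_is_any(const struct model_map *m)
{
	return m->nr == 1 && m->cpus[0] == MODEL_ANY_CPU;
}

/* Plain set intersection of two sorted maps; 'any' gets no special case. */
static struct model_map model_intersect_strict(const struct model_map *a,
					       const struct model_map *b)
{
	struct model_map out = { .nr = 0 };
	int i = 0, j = 0;

	assert(a != NULL && b != NULL);
	while (i < a->nr && j < b->nr) {
		if (a->cpus[i] < b->cpus[j]) {
			i++;
		} else if (a->cpus[i] > b->cpus[j]) {
			j++;
		} else {
			out.cpus[out.nr++] = a->cpus[i];
			i++;
			j++;
		}
	}
	return out;
}

/* Suggested behaviour: 'any' acts as a wildcard and yields the other map. */
static struct model_map model_intersect_any_aware(const struct model_map *a,
						  const struct model_map *b)
{
	assert(a != NULL && b != NULL);
	if (model_is_any(a))
		return *b;
	if (model_is_any(b))
		return *a;
	return model_intersect_strict(a, b);
}

/* Number of iterations a loop that skips the "any CPU" entry would run. */
static int model_count_skip_any(const struct model_map *m)
{
	int n = 0;

	for (int i = 0; i < m->nr; i++) {
		if (m->cpus[i] != MODEL_ANY_CPU)
			n++;
	}
	return n;
}
```

In this model, intersecting an 'any' map with four online CPUs gives an
empty result under the strict version, so a skip-any loop never runs, while
the any-aware version returns the four online CPUs. For an ordinary per-cpu
map both versions agree.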

>>  				ete++;
>> -			else if (cs_etm_is_etmv4(itr, i))
>> +			else if (cs_etm_is_etmv4(itr, cpu.cpu))
>>  				etmv4++;
>>  			else
>>  				etmv3++;
>>  		}
>> +		perf_cpu_map__put(intersect_cpus);
>>  	} else {
>>  		/* get configuration for all CPUs in the system */
>> -		for (i = 0; i < cpu__max_cpu().cpu; i++) {
>> -			struct perf_cpu cpu = { .cpu = i, };
>> -
>> -			if (!perf_cpu_map__has(online_cpus, cpu))
>> -				continue;
>> -
>> -			if (cs_etm_is_ete(itr, i))
>> +		perf_cpu_map__for_each_cpu(cpu, idx, online_cpus) {
>> +			if (cs_etm_is_ete(itr, cpu.cpu))
>>  				ete++;
>> -			else if (cs_etm_is_etmv4(itr, i))
>> +			else if (cs_etm_is_etmv4(itr, cpu.cpu))
>>  				etmv4++;
>>  			else
>>  				etmv3++;
>> @@ -814,15 +802,14 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
>>  		return -EINVAL;
>>  
>>  	/* If the cpu_map is empty all online CPUs are involved */
>> -	if (perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) {
>> +	if (perf_cpu_map__is_empty(event_cpus)) {
>>  		cpu_map = online_cpus;
>>  	} else {
>>  		/* Make sure all specified CPUs are online */
>> -		for (i = 0; i < perf_cpu_map__nr(event_cpus); i++) {
>> -			struct perf_cpu cpu = { .cpu = i, };
>> +		struct perf_cpu cpu;
>>  
>> -			if (perf_cpu_map__has(event_cpus, cpu) &&
>> -			    !perf_cpu_map__has(online_cpus, cpu))
>> +		perf_cpu_map__for_each_cpu(cpu, i, event_cpus) {
>> +			if (!perf_cpu_map__has(online_cpus, cpu))
>>  				return -EINVAL;
>>  		}
>>  
>> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
>> index 51ccbfd3d246..0b52e67edb3b 100644
>> --- a/tools/perf/arch/arm64/util/arm-spe.c
>> +++ b/tools/perf/arch/arm64/util/arm-spe.c
>> @@ -232,7 +232,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
>>  	 * In the case of per-cpu mmaps, sample CPU for AUX event;
>>  	 * also enable the timestamp tracing for samples correlation.
>>  	 */
>> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
>> +	if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) {
>>  		evsel__set_sample_bit(arm_spe_evsel, CPU);
>>  		evsel__set_config_if_unset(arm_spe_pmu, arm_spe_evsel,
>>  					   "ts_enable", 1);
>> @@ -265,7 +265,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
>>  	tracking_evsel->core.attr.sample_period = 1;
>>  
>>  	/* In per-cpu case, always need the time of mmap events etc */
>> -	if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
>> +	if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) {
>>  		evsel__set_sample_bit(tracking_evsel, TIME);
>>  		evsel__set_sample_bit(tracking_evsel, CPU);
>>  
