From: kajoljain <kjain@linux.ibm.com>
To: Kim Phillips <kim.phillips@amd.com>, Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Ian Rogers <irogers@google.com>, Ingo Molnar <mingo@redhat.com>,
Joao Martins <joao.m.martins@oracle.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Mark Rutland <mark.rutland@arm.com>,
Michael Petlan <mpetlan@redhat.com>,
Namhyung Kim <namhyung@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Robert Richter <robert.richter@amd.com>,
Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH 2/2] perf tools: Improve IBS error handling
Date: Wed, 8 Dec 2021 12:03:01 +0530 [thread overview]
Message-ID: <e937591b-542d-c3ac-bc64-d5223c27f70d@linux.ibm.com> (raw)
In-Reply-To: <4281dce8-0e2a-cfe8-3b05-1b9a455d09a9@amd.com>
On 11/30/21 3:39 AM, Kim Phillips wrote:
> On 11/24/21 2:00 AM, kajoljain wrote:
>> On 11/23/21 8:55 PM, Kim Phillips wrote:
>>> On 11/23/21 2:40 AM, kajoljain wrote:
>>>> On 10/8/21 12:47 AM, Kim Phillips wrote:
>>>>> On 10/7/21 12:28 PM, Jiri Olsa wrote:
>>>>>> On Mon, Oct 04, 2021 at 04:41:14PM -0500, Kim Phillips wrote:
>>>>>>> ---
>>>>>>> tools/perf/util/evsel.c | 24 ++++++++++++++++++++++++
>>>>>>> 1 file changed, 24 insertions(+)
>>>>>>>
>>>>>>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>>>>>>> index b915840690d4..f8a9cbd99314 100644
>>>>>>> --- a/tools/perf/util/evsel.c
>>>>>>> +++ b/tools/perf/util/evsel.c
>>>>>>> @@ -2743,9 +2743,22 @@ static bool find_process(const char *name)
>>>>>>> return ret ? false : true;
>>>>>>> }
>>>>>>> +static bool is_amd(const char *arch, const char *cpuid)
>>>>>>> +{
>>>>>>> + return arch && !strcmp("x86", arch) && cpuid &&
>>>>>>> strstarts(cpuid,
>>>>>>> "AuthenticAMD");
>>>>>>> +}
>>>>>>> +
>>>>>>> +static bool is_amd_ibs(struct evsel *evsel)
>>>>>>> +{
>>>>>>> + return evsel->core.attr.precise_ip || !strncmp(evsel->pmu_name,
>>>>>>> "ibs", 3);
>>>>>>> +}
>>>>>>> +
>>>>>>> int evsel__open_strerror(struct evsel *evsel, struct target
>>>>>>> *target,
>>>>>>> int err, char *msg, size_t size)
>>>>>>> {
>>>>>>> + struct perf_env *env = evsel__env(evsel);
>>>>>>> + const char *arch = perf_env__arch(env);
>>>>>>> + const char *cpuid = perf_env__cpuid(env);
>>>>>>> char sbuf[STRERR_BUFSIZE];
>>>>>>> int printed = 0, enforced = 0;
>>>>>>> @@ -2841,6 +2854,17 @@ int evsel__open_strerror(struct evsel
>>>>>>> *evsel, struct target *target,
>>>>>>> return scnprintf(msg, size, "wrong clockid (%d).",
>>>>>>> clockid);
>>>>>>> if (perf_missing_features.aux_output)
>>>>>>> return scnprintf(msg, size, "The 'aux_output'
>>>>>>> feature
>>>>>>> is not supported, update the kernel.");
>>>>>>> + if (is_amd(arch, cpuid)) {
>>>>>>> + if (is_amd_ibs(evsel)) {
>>>>>>
>>>>>> would single 'is_amd_ibs' call be better? checking on both amd and
>>>>>> ibs
>>>>>
>>>>> Good suggestion. If you look at the later patch in the
>>>>> BRS series, I have rewritten it to add the new
>>>>> AMD PMU like so:
>>>>>
>>>>> if (is_amd()) {
>>>>> if (is_amd_ibs()) {
>>>>> if (evsel->this)
>>>>> return
>>>>> if (evsel->that)
>>>>> return
>>>>> }
>>>>> + if (is_amd_brs()) {
>>>>> + if (evsel->this)
>>>>> + return
>>>>> + if (evsel->that)
>>>>> + return
>>>>> + }
>>>>> }
>>>>
>>>> Hi Kim,
>>>> From my point of view, it won't be a good idea of adding so many
>>>> checks in common function definition itself.
>>>> Can you just create a check to see if its amd machine and then add a
>>>> function call which will handle all four conditions together?
>>>>
>>>> which is basically for:
>>>>
>>>> + if (is_amd(arch, cpuid)) {
>>>> + if (is_amd_ibs(evsel)) {
>>>> + if (evsel->core.attr.exclude_kernel)
>>>> + return scnprintf(msg, size,
>>>> + "AMD IBS can't exclude kernel events. Try running at a higher
>>>> privilege level.");
>>>> + if (!evsel->core.system_wide)
>>>> + return scnprintf(msg, size,
>>>> + "AMD IBS may only be available in system-wide/per-cpu mode. Try
>>>> using
>>>> -a, or -C and workload affinity");
>>>> + }
>>>>
>>>> and this:
>>>>
>>>> + if (is_amd_brs(evsel)) {
>>>> + if (evsel->core.attr.freq)
>>>> + return scnprintf(msg, size,
>>>> + "AMD Branch Sampling does not support frequency mode sampling,
>>>> must
>>>> pass a fixed sampling period via -c option or
>>>> cpu/branch-brs,period=xxxx/.");
>>>> + /* another reason is that the period is too small */
>>>> + return scnprintf(msg, size,
>>>> + "AMD Branch Sampling does not support sampling period smaller than
>>>> what is reported in /sys/devices/cpu/caps/branches.");
>>>> + }
>>>
>>> IIRC, I tried something like that but carrying the
>>>
>>>
>>> struct target *target, int err, char *msg, size_t size
>>>
>>> parameters made things worse.
>>>
>>>> So, incase we are in amd machine, common function evsel__open_strerror
>>>> will call function may be something like amd_evesel_open_strerror_check
>>>> which will look for both ibs and brs conditions and return
>>>> corresponding
>>>> error statement.
>>>
>>> The vast majority of decisions made by evsel__open_strerror are
>>> going to be common across most arch/uarches. AMD has only these
>>> two pesky exceptions to the rule and therefore IMO it's ok
>>> to have them inline with the common function, since the decisions
>>> are so deeply intertwined. A new amd_evsel_open_strerror_check
>>> sounds like it'd duplicate too much of the common function code
>>> in order to handle the common error cases.
>>
>> Hi Kim,
>> Sorry for the confusion, what I meant by adding new function is just
>> to handle these corner error cases and not duplicating whole
>> evsel__open_strerror code.
>>
>> Maybe something like below code, Its just prototype of code to show you
>> the flow, you can refine it and check for any build or indentation
>> issues using checkpatch.pl script.
>>
>> So basically, in common function we can just have 2 calls, first to
>> check if we are in amd system and second to return corresponding error
>> message, rather then adding whole chunk of if's which are specific to
>> amd.
>>
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index ac0127be0459..adefb162ae08 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -2852,9 +2852,40 @@ static bool find_process(const char *name)
>> return ret ? false : true;
>> }
>>
>> +static bool is_amd(const char *arch, const char *cpuid)
>> +{
>> + return arch && !strcmp("x86", arch) && cpuid && strstarts(cpuid,
>> "AuthenticAMD");
>> +}
>> +
>> +static int error_amd_ibs_brs(struct evsel *evsel, char *msg, size_t
>> size)
>> +{
>> + if (evsel->core.attr.precise_ip || !strncmp(evsel->pmu_name,
>> "ibs", 3)) {
>> + if (evsel->core.attr.exclude_kernel)
>> + return scnprintf(msg, size,
>> + "AMD IBS can't exclude kernel events. Try running at a higher
>> privilege level.");
>> + if (!evsel->core.system_wide)
>> + return scnprintf(msg, size,
>> + "AMD IBS may only be available in system-wide/per-cpu mode. Try
>> using -a, or -C and workload affinity");
>> + }
>> +
>> + if (((evsel->core.attr.config & 0xff) == 0xc4) &&
>> (evsel->core.attr.sample_type & PERF_SAMPLE_BRANCH_STACK)) {
>> + if (evsel->core.attr.freq) {
>> + return scnprintf(msg, size,
>> + "AMD Branch Sampling does not support frequency mode sampling,
>> must pass a fixed sampling
>> + period via -c option or cpu/branch-brs,period=xxxx/.");
>> + /* another reason is that the period is too small */
>> + return scnprintf(msg, size,
>> + "AMD Branch Sampling does not support sampling period smaller
>> than what is reported in /sys/devices/cpu/caps/branches.");
>> + }
>> + }
>> +}
>> +
>> int evsel__open_strerror(struct evsel *evsel, struct target *target,
>> int err, char *msg, size_t size)
>> {
>> + struct perf_env *env = evsel__env(evsel);
>> + const char *arch = perf_env__arch(env);
>> + const char *cpuid = perf_env__cpuid(env);
>> char sbuf[STRERR_BUFSIZE];
>> int printed = 0, enforced = 0;
>>
>> @@ -2950,6 +2981,8 @@ int evsel__open_strerror(struct evsel *evsel,
>> struct target *target,
>> return scnprintf(msg, size, "wrong clockid
>> (%d).", clockid);
>> if (perf_missing_features.aux_output)
>> return scnprintf(msg, size, "The 'aux_output'
>> feature is not supported, update the kernel.");
>> + if (is_amd(arch, cpuid))
>> + return error_amd_ibs_brs(evsel, msg, size);
>> break;
>> case ENODATA:
>> return scnprintf(msg, size, "Cannot collect data source
>> with the load latency event alone. "
>
> That change will makes AMD machines fail to fall back to the default
> "The sys_perf_event_open() syscall returned with..." error string
> in case it's not those AMD IBS and BRS sub-conditions.
Yes right, as I mentioned before, the code I pointed was just a
prototype to show you the flow, these corner cases can be handled on top
of it.
>
> Is having the AMD error code checking in the main evsel__open_strerror()
> so bad? Other arches and their PMU implementations may find error
> conditions that they have in common with AMD's, therefore
> opening up the code for opposite types of refactoring and
> reuse than what is being requested here. E.g., I've seen
> other hardware configurations - not specific to one architecture -
> that could also use this message:
>
From my understanding, adding too many checks in common function
for a specific arch is not a good practice. Since you already adding
multiple functions to get information like ,if current system is
amd/ibs/brs. Can't we rather just add a single function and handled all
these checks there?
That's just my thoughts, if maintainers are ok with it, then its fine
for me too.
Thanks,
Kajol Jain
> {"AMD IBS"->"%s",pmu_name} may only be available in system-wide/per-cpu
> mode. Try using -a, or -C and workload affinity");
>
> Thanks,
>
> Kim
next prev parent reply other threads:[~2021-12-08 6:33 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-04 21:41 [PATCH 1/2] perf evsel: Make evsel__env always return a valid env Kim Phillips
2021-10-04 21:41 ` [PATCH 2/2] perf tools: Improve IBS error handling Kim Phillips
2021-10-07 17:28 ` Jiri Olsa
2021-10-07 19:17 ` Kim Phillips
2021-11-18 22:45 ` Kim Phillips
2021-11-23 8:40 ` kajoljain
2021-11-23 15:25 ` Kim Phillips
2021-11-24 8:00 ` kajoljain
2021-11-29 22:09 ` Kim Phillips
2021-12-08 6:33 ` kajoljain [this message]
2021-12-08 17:14 ` Kim Phillips
2021-11-23 8:26 ` [PATCH 1/2] perf evsel: Make evsel__env always return a valid env kajoljain
2022-03-22 21:36 ` Arnaldo Carvalho de Melo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e937591b-542d-c3ac-bc64-d5223c27f70d@linux.ibm.com \
--to=kjain@linux.ibm.com \
--cc=acme@redhat.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=boris.ostrovsky@oracle.com \
--cc=eranian@google.com \
--cc=irogers@google.com \
--cc=joao.m.martins@oracle.com \
--cc=jolsa@redhat.com \
--cc=kim.phillips@amd.com \
--cc=konrad.wilk@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=mpetlan@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=robert.richter@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).