From: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
To: "Zhang, Rui" <rui.zhang@intel.com>,
	"alexander.shishkin@linux.intel.com"
	<alexander.shishkin@linux.intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"Hunter, Adrian" <adrian.hunter@intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"irogers@google.com" <irogers@google.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"gustavoars@kernel.org" <gustavoars@kernel.org>,
	"kan.liang@linux.intel.com" <kan.liang@linux.intel.com>,
	"kees@kernel.org" <kees@kernel.org>,
	"mark.rutland@arm.com" <mark.rutland@arm.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"bp@alien8.de" <bp@alien8.de>,
	"acme@kernel.org" <acme@kernel.org>,
	"jolsa@kernel.org" <jolsa@kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"namhyung@kernel.org" <namhyung@kernel.org>
Cc: "ravi.bangoria@amd.com" <ravi.bangoria@amd.com>,
	"kprateek.nayak@amd.com" <kprateek.nayak@amd.com>,
	"gautham.shenoy@amd.com" <gautham.shenoy@amd.com>,
	"linux-perf-users@vger.kernel.org"
	<linux-perf-users@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-hardening@vger.kernel.org"
	<linux-hardening@vger.kernel.org>,
	"sandipan.das@amd.com" <sandipan.das@amd.com>,
	"ananth.narayan@amd.com" <ananth.narayan@amd.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Subject: Re: [PATCH 6/6] perf/x86/rapl: Add per-core energy counter support for AMD CPUs
Date: Thu, 13 Jun 2024 12:09:47 +0530	[thread overview]
Message-ID: <f98c662d-9270-4974-a12b-60ea995d0aa6@amd.com> (raw)
In-Reply-To: <d7f1d65c7e8872dee2a97ee47be191496d048d1d.camel@intel.com>

Hi Rui,

On 6/11/2024 2:00 PM, Zhang, Rui wrote:
>> @@ -345,9 +353,14 @@ static int rapl_pmu_event_init(struct perf_event *event)
>>         u64 cfg = event->attr.config & RAPL_EVENT_MASK;
>>         int bit, ret = 0;
>>         struct rapl_pmu *rapl_pmu;
>> +       struct rapl_pmus *curr_rapl_pmus;
>>  
>>         /* only look at RAPL events */
>> -       if (event->attr.type != rapl_pmus->pmu.type)
>> +       if (event->attr.type == rapl_pmus->pmu.type)
>> +               curr_rapl_pmus = rapl_pmus;
>> +       else if (rapl_pmus_per_core && event->attr.type == rapl_pmus_per_core->pmu.type)
>> +               curr_rapl_pmus = rapl_pmus_per_core;
>> +       else
>>                 return -ENOENT;
> 
> can we use container_of(event->pmu, struct rapl_pmus, pmu)?

Yes, that would be cleaner; I will add it in the next version.
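
Something along these lines (rough sketch only, assuming both PMUs embed
their "struct pmu pmu" member inside struct rapl_pmus as in the hunk above):

	struct rapl_pmus *curr_rapl_pmus = container_of(event->pmu,
							struct rapl_pmus, pmu);

	/* only look at RAPL events */
	if (event->attr.type != curr_rapl_pmus->pmu.type)
		return -ENOENT;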

> 
>>  
>>         /* check only supported bits are set */
>> @@ -374,9 +387,14 @@ static int rapl_pmu_event_init(struct perf_event *event)
>>                 return -EINVAL;
>>  
>>         /* must be done before validate_group */
>> -       rapl_pmu = cpu_to_rapl_pmu(event->cpu);
>> +       if (curr_rapl_pmus == rapl_pmus_per_core)
>> +               rapl_pmu = curr_rapl_pmus->rapl_pmu[topology_core_id(event->cpu)];
>> +       else
>> +               rapl_pmu = curr_rapl_pmus->rapl_pmu[get_rapl_pmu_idx(event->cpu)];
>> +
>>         if (!rapl_pmu)
>>                 return -EINVAL;
> 
> Current code has PERF_EV_CAP_READ_ACTIVE_PKG flag set.
> Can you help me understand why it does not affect the new per-core pmu?

Good question. I went back and looked through the code, and it turns out that
we never go through the code path that checks this flag and decides whether to
run on the local CPU (the CPU on which perf is running) or on event->cpu.

So having or not having this flag makes no difference here. I ran a small
experiment to confirm this.

On a single-package system, any core should be able to read the energy-pkg
RAPL MSR and return the value, so there should be no need for an SMP call to
event->cpu. However, the ftrace output below shows that only core 0 executes
the PMU event even though perf stat was launched for core 1.

--------------------------------------------------------------------------

root@shatadru:/sys/kernel/tracing# perf stat -C 1 -e power/energy-pkg/ -- dd if=/dev/zero of=/dev/null bs=1M count=100000
100000+0 records in
100000+0 records out
104857600000 bytes (105 GB, 98 GiB) copied, 2.03295 s, 51.6 GB/s

 Performance counter stats for 'CPU(s) 1':

            231.59 Joules power/energy-pkg/

       2.033916467 seconds time elapsed

root@shatadru:/sys/kernel/tracing# echo 0 > tracing_on
root@shatadru:/sys/kernel/tracing# cat trace
# tracer: function
#
# entries-in-buffer/entries-written: 12/12   #P:192
#
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
            perf-3309    [096] ...1.  3422.558183: rapl_get_attr_cpumask <-dev_attr_show
            perf-3309    [001] ...1.  3422.559436: rapl_pmu_event_init <-perf_try_init_event
            perf-3309    [001] ...1.  3422.559441: rapl_pmu_event_init <-perf_try_init_event
            perf-3309    [001] ...1.  3422.559449: rapl_pmu_event_init <-perf_try_init_event
            perf-3309    [001] ...1.  3422.559537: smp_call_function_single <-event_function_call	<-- SMP call to the event owner CPU (i.e. CPU0)
          <idle>-0       [000] d.h3.  3422.559544: rapl_pmu_event_add <-event_sched_in			<-- CPU# column changed to 0
          <idle>-0       [000] d.h4.  3422.559545: __rapl_pmu_event_start <-rapl_pmu_event_add
            perf-3309    [001] ...1.  3424.593398: smp_call_function_single <-event_function_call	<-- SMP call to the event owner CPU (i.e. CPU0)
          <idle>-0       [000] d.h3.  3424.593403: rapl_pmu_event_del <-event_sched_out			<-- CPU# column changed to 0
          <idle>-0       [000] d.h3.  3424.593403: rapl_pmu_event_stop <-rapl_pmu_event_del
          <idle>-0       [000] d.h4.  3424.593404: rapl_event_update.isra.0 <-rapl_pmu_event_stop
            perf-3309    [001] ...1.  3424.593514: smp_call_function_single <-event_function_call

--------------------------------------------------------------------------

So, since we always use event->cpu to run the event, the per-core PMU is not
affected by this flag.

In any case, in the next version I will enable this flag selectively, only for
package-scope events. But we will also need to look into fixing the flag's
ineffectiveness.
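
The selective enabling would look roughly like this (sketch only, reusing the
curr_rapl_pmus naming from the hunk above):

	/* Only package-scope events can be read from any CPU in the package */
	if (curr_rapl_pmus != rapl_pmus_per_core)
		event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;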

> 
>> +
>>         event->cpu = rapl_pmu->cpu;
>>         event->pmu_private = rapl_pmu;
>>         event->hw.event_base = rapl_msrs[bit].msr;
>> @@ -408,17 +426,38 @@ static struct attribute_group rapl_pmu_attr_group = {
>>         .attrs = rapl_pmu_attrs,
>>  };
>>  
>> +static ssize_t rapl_get_attr_per_core_cpumask(struct device *dev,
>> +                                              struct device_attribute *attr, char *buf)
>> +{
>> +       return cpumap_print_to_pagebuf(true, buf,
>> +                                      &rapl_pmus_per_core->cpumask);
>> +}
>> +
>> +static struct device_attribute dev_attr_per_core_cpumask =
>> +       __ATTR(cpumask, 0444, rapl_get_attr_per_core_cpumask, NULL);
> 
> DEVICE_ATTR

I was not able to use DEVICE_ATTR() because there is already a
"struct device_attribute dev_attr_cpumask" created for the package PMU cpumask
using DEVICE_ATTR(). So I had to create "struct device_attribute
dev_attr_per_core_cpumask" manually to avoid a variable-name clash.
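
To illustrate the clash (the 0444 mode just mirrors the hunk above):
DEVICE_ATTR(cpumask, ...) always expands to a C identifier named
dev_attr_cpumask, so a second use for the per-core cpumask would collide,
hence the open-coded __ATTR():

	/* package PMU: DEVICE_ATTR() defines "dev_attr_cpumask" */
	static DEVICE_ATTR(cpumask, 0444, rapl_get_attr_cpumask, NULL);

	/* per-core PMU: same sysfs name "cpumask", distinct C identifier */
	static struct device_attribute dev_attr_per_core_cpumask =
		__ATTR(cpumask, 0444, rapl_get_attr_per_core_cpumask, NULL);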

> 
>> +
>> +static struct attribute *rapl_pmu_per_core_attrs[] = {
>> +       &dev_attr_per_core_cpumask.attr,
>> +       NULL,
>> +};
>> +
>> +static struct attribute_group rapl_pmu_per_core_attr_group = {
>> +       .attrs = rapl_pmu_per_core_attrs,
>> +};
>> +
>>  RAPL_EVENT_ATTR_STR(energy-cores, rapl_cores, "event=0x01");
>>  RAPL_EVENT_ATTR_STR(energy-pkg  ,   rapl_pkg, "event=0x02");
>>  RAPL_EVENT_ATTR_STR(energy-ram  ,   rapl_ram, "event=0x03");
>>  RAPL_EVENT_ATTR_STR(energy-gpu  ,   rapl_gpu, "event=0x04");
>>  RAPL_EVENT_ATTR_STR(energy-psys,   rapl_psys, "event=0x05");
>> +RAPL_EVENT_ATTR_STR(energy-per-core,   rapl_per_core, "event=0x06");
> 
> energy-per-core is for a separate pmu, so the event id does not need to
> be 6. The same applies to PERF_RAPL_PERCORE.

Correct, will fix in next version.
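
Since energy-per-core lives in its own PMU, its config value only needs to be
unique within that PMU, so it can simply start its own numbering, e.g. (value
illustrative):

	RAPL_EVENT_ATTR_STR(energy-per-core, rapl_per_core, "event=0x01");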

> 
>>  
>>  static struct rapl_model model_amd_hygon = {
>> -       .events         = BIT(PERF_RAPL_PKG),
>> +       .events         = BIT(PERF_RAPL_PKG) |
>> +                         BIT(PERF_RAPL_PERCORE),
>>         .msr_power_unit = MSR_AMD_RAPL_POWER_UNIT,
>>         .rapl_msrs      = amd_rapl_msrs,
>> +       .per_core = true,
>>  };
> 
> can we use bit PERF_RAPL_PERCORE to check per_core pmu suppot?

Makes sense, will modify.
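
I.e. something like this rough sketch, assuming the rapl_model pointer made
global earlier in this series:

	/* instead of checking a separate rapl_model->per_core field */
	if (rapl_model->events & BIT(PERF_RAPL_PERCORE)) {
		/* set up and register the per-core PMU */
	}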

> 
> Just FYI, arch/x86/events/intel/cstate.c handles package/module/core
> scope cstate pmus. It uses a different approach in the probing part,
> which IMO is clearer.

Yes, I went through it. I see that separate variables are used to mark the
valid events for package and core scope, and a wrapper function around
perf_msr_probe() is created. I will check whether that approach makes sense
here as well; a rough sketch of what I mean is below.
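
All names below are illustrative, not the actual rapl.c symbols; the idea is
one counter mask per scope, probed from separate MSR arrays:

	static unsigned long rapl_pkg_cntr_mask;
	static unsigned long rapl_core_cntr_mask;

	static int __init rapl_probe_msrs(struct rapl_model *rm)
	{
		/* probe package-scope and core-scope MSR arrays separately */
		rapl_pkg_cntr_mask  = perf_msr_probe(rm->rapl_msrs, PERF_RAPL_PKG_MAX,
						     false, (void *) &rm->pkg_events);
		rapl_core_cntr_mask = perf_msr_probe(rm->rapl_core_msrs, PERF_RAPL_CORE_MAX,
						     false, (void *) &rm->core_events);
		return 0;
	}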

Thanks for the review,
Dhananjay

> 
> thanks,
> rui
> 
