All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Zhang, Rui" <rui.zhang@intel.com>
To: "alexander.shishkin@linux.intel.com"
	<alexander.shishkin@linux.intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"Hunter, Adrian" <adrian.hunter@intel.com>,
	"oleksandr@natalenko.name" <oleksandr@natalenko.name>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"irogers@google.com" <irogers@google.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"gustavoars@kernel.org" <gustavoars@kernel.org>,
	"kan.liang@linux.intel.com" <kan.liang@linux.intel.com>,
	"kees@kernel.org" <kees@kernel.org>,
	"mark.rutland@arm.com" <mark.rutland@arm.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"Dhananjay.Ugwekar@amd.com" <Dhananjay.Ugwekar@amd.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"acme@kernel.org" <acme@kernel.org>,
	"jolsa@kernel.org" <jolsa@kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"namhyung@kernel.org" <namhyung@kernel.org>
Cc: "ravi.bangoria@amd.com" <ravi.bangoria@amd.com>,
	"gautham.shenoy@amd.com" <gautham.shenoy@amd.com>,
	"kprateek.nayak@amd.com" <kprateek.nayak@amd.com>,
	"linux-perf-users@vger.kernel.org"
	<linux-perf-users@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-hardening@vger.kernel.org"
	<linux-hardening@vger.kernel.org>,
	"ananth.narayan@amd.com" <ananth.narayan@amd.com>,
	"sandipan.das@amd.com" <sandipan.das@amd.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Subject: Re: [PATCH v3 10/10] perf/x86/rapl: Add per-core energy counter support for AMD CPUs
Date: Thu, 27 Jun 2024 06:49:26 +0000	[thread overview]
Message-ID: <f2dfe380f06bdde6bf4aabc9c45ddea7e28d35fd.camel@intel.com> (raw)
In-Reply-To: <dff31583-adaf-4da8-954e-f35f7ef5a5d3@amd.com>

Hi, Dhananjay
> 
> > 
> > [...]
> > 
> > > @@ -685,6 +774,13 @@ static void __init rapl_advertise(void)
> > >                                 rapl_pkg_domain_names[i],
> > > rapl_hw_unit[i]);
> > >                 }
> > >         }
> > > +
> > > +       for (i = 0; i < NR_RAPL_CORE_DOMAINS; i++) {
> > > +               if (rapl_core_cntr_mask & (1 << i)) {
> > > +                       pr_info("hw unit of domain %s 2^-%d
> > > Joules\n",
> > > +                               rapl_core_domain_names[i],
> > > rapl_hw_unit[i]);
> > 
> > rapl_hw_unit[] is for package pmu only and
> > rapl_hw_unit[0] is rapl_hw_unit[PERF_RAPL_PP0] rather than
> > rapl_hw_unit[PERF_RAPL_PER_CORE]
> > 
> > you cannot use rapl_hw_unit[i] to represent per-core rapl domain
> > unit.
> 
> Yes right, I saw that all the elements in the rapl_hw_unit array were
> actually 
> using the value from the same register "MSR_RAPL_POWER_UNIT" or
> "MSR_AMD_RAPL_POWER_UNIT".
> Except for the two quirks,
>  
>  737         case
> RAPL_UNIT_QUIRK_INTEL_HSW:                                           
>                                                                      
>                                                       
>  738                 rapl_hw_unit[PERF_RAPL_RAM] =
> 16;                                                                  
>                                                                      
>                      
>  739                
> break;                                                               
>                                                                      
>                                                    
>  740         /* SPR uses a fixed energy unit for Psys domain. */
>  741         case RAPL_UNIT_QUIRK_INTEL_SPR:
>  742                 rapl_hw_unit[PERF_RAPL_PSYS] = 0;
>  743                 break;
> 
> So, as for AMD systems the rapl_hw_unit[] elements will always have
> the same value, I ended 
> up using the rapl_hw_unit[PERF_RAPL_PP0] for
> rapl_hw_unit[PERF_RAPL_PER_CORE], but I do realize
> it is quite hacky. So, better to do it cleanly and add a separate
> array/variable for the core events.
> 
yeah, that is much better.
> 

> > 
> > >  
> > >  static struct rapl_model model_amd_hygon = {
> > >         .pkg_events     = BIT(PERF_RAPL_PKG),
> > > +       .core_events    = BIT(PERF_RAPL_PER_CORE),
> > >         .msr_power_unit = MSR_AMD_RAPL_POWER_UNIT,
> > > -       .rapl_msrs      = amd_rapl_pkg_msrs,
> > > +       .rapl_pkg_msrs  = amd_rapl_pkg_msrs,
> > > +       .rapl_core_msrs = amd_rapl_core_msrs,
> > >  };
> > >  
> > >  static const struct x86_cpu_id rapl_model_match[] __initconst =
> > > {
> > > @@ -858,6 +957,11 @@ static int __init rapl_pmu_init(void)
> > >  {
> > >         const struct x86_cpu_id *id;
> > >         int ret;
> > > +       int nr_rapl_pmu = topology_max_packages() *
> > > topology_max_dies_per_package();
> > > +       int nr_cores = topology_max_packages() *
> > > topology_num_cores_per_package();
> > 
> > I'd suggest either using two variables nr_pkgs/nr_cores, or reuse
> > one
> > variable nr_rapl_pmu for both pkg pmu and per-core pmu.
> 
> I understand your point, but the problem with that is, there are
> actually three scopes needed here
> 
> Some Intel systems need a *die* scope for the rapl_pmus_pkg PMU
> Some Intel systems and all AMD systems need a *package* scope for the
> rapl_pmus_pkg PMU
> And AMD systems need a *core* scope for the rapl_pmus_per_core PMU
> 
> I think what we can do is three variables, nr_dies (for all Intel
> systems as before), 
> nr_pkgs(for AMD systems rapl_pmus_pkg PMU)

Not necessarily, we already uses rapl_pmus_pkg for intel systems,
right?

>  and nr_cores(for rapl_pmus_per_core PMU)
> 
> Sounds good?

what about just one variable "count" and reuse it for every cases?

> 
> > 
> > > +
> > > +       if (rapl_pmu_is_pkg_scope())
> > > +               nr_rapl_pmu = topology_max_packages();
> > >  
> > >         id = x86_match_cpu(rapl_model_match);
> > >         if (!id)
> > > @@ -865,17 +969,34 @@ static int __init rapl_pmu_init(void)
> > >  
> > >         rapl_model = (struct rapl_model *) id->driver_data;
> > >  
> > > -       rapl_pkg_cntr_mask = perf_msr_probe(rapl_model-
> > > >rapl_msrs,
> > > PERF_RAPL_PKG_EVENTS_MAX,
> > > +       rapl_pkg_cntr_mask = perf_msr_probe(rapl_model-
> > > > rapl_pkg_msrs, PERF_RAPL_PKG_EVENTS_MAX,
> > >                                         false, (void *)
> > > &rapl_model-
> > > > pkg_events);
> > >  
> > >         ret = rapl_check_hw_unit();
> > >         if (ret)
> > >                 return ret;
> > >  
> > > -       ret = init_rapl_pmus(&rapl_pmus_pkg);
> > > +       ret = init_rapl_pmus(&rapl_pmus_pkg, nr_rapl_pmu,
> > > rapl_attr_groups, rapl_attr_update);
> > >         if (ret)
> > >                 return ret;
> > >  
> > > +       if (rapl_model->core_events) {
> > > +               rapl_core_cntr_mask = perf_msr_probe(rapl_model-
> > > > rapl_core_msrs,
> > > +                                                   
> > > PERF_RAPL_CORE_EVENTS_MAX, false,
> > > +                                                    (void *)
> > > &rapl_model->core_events);
> > > +
> > > +               ret = init_rapl_pmus(&rapl_pmus_core, nr_cores,
> > > +                                    rapl_per_core_attr_groups,
> > > rapl_per_core_attr_update);
> > > +               if (ret) {
> > > +                       /*
> > > +                        * If initialization of per_core PMU
> > > fails,
> > > reset per_core
> > > +                        * flag, and continue with power PMU
> > > initialization.
> > > +                        */
> > > +                       pr_warn("Per-core PMU initialization
> > > failed
> > > (%d)\n", ret);
> > > +                       rapl_model->core_events = 0UL;
> > > +               }
> > > +       }
> > > +
> > >         /*
> > >          * Install callbacks. Core will call them for each online
> > > cpu.
> > >          */
> > > @@ -889,6 +1010,20 @@ static int __init rapl_pmu_init(void)
> > >         if (ret)
> > >                 goto out1;
> > >  
> > > +       if (rapl_model->core_events) {
> > > +               ret = perf_pmu_register(&rapl_pmus_core->pmu,
> > > "power_per_core", -1);
> > > +               if (ret) {
> > > +                       /*
> > > +                        * If registration of per_core PMU fails,
> > > cleanup per_core PMU
> > > +                        * variables, reset the per_core flag and
> > > keep the
> > > +                        * power PMU untouched.
> > > +                        */
> > > +                       pr_warn("Per-core PMU registration failed
> > > (%d)\n", ret);
> > > +                       cleanup_rapl_pmus(rapl_pmus_core);
> > > +                       rapl_model->core_events = 0UL;
> > > +               }
> > > +       }
> > > +
> > >         rapl_advertise();
> > >         return 0;
> > >  
> > > @@ -906,5 +1041,9 @@ static void __exit intel_rapl_exit(void)
> > >         cpuhp_remove_state_nocalls(CPUHP_AP_PERF_X86_RAPL_ONLINE)
> > > ;
> > >         perf_pmu_unregister(&rapl_pmus_pkg->pmu);
> > >         cleanup_rapl_pmus(rapl_pmus_pkg);
> > > +       if (rapl_model->core_events) {
> > > +               perf_pmu_unregister(&rapl_pmus_core->pmu);
> > > +               cleanup_rapl_pmus(rapl_pmus_core);
> > > +       }
> > 
> > we do check rapl_pmus_core before accessing it, but we never check
> > rapl_pmus_pkg because the previous code assumes it always exists.
> > 
> > so could there be a problem if some one starts the per-core pmu
> > when
> > pkg pmu is unregistered and cleaned up?
> > 
> > say, in rapl_pmu_event_init(),
> > 
> > if (event->attr.type == rapl_pmus_pkg->pmu.type ||
> >    (rapl_pmus_core && event->attr.type == rapl_pmus_core-
> > >pmu.type))
> > 
> > this can break because rapl_pmus_pkg is freed, right?
> 
> Hmm, I think this situation can't arise as whenever the power PMU
> fails, we 
> directly go to the failure path and dont setup the per-core PMU(which
> means 
> no one will be able to start the per-core PMU), 
> Please let me know if there is a scenario where this assumption can
> fail.

I mean if we do module unload and access power-per-core pmu at the same
time, could there be a race?

why not just unregister and cleanup the per-core pmu before the pkg
pmu?
> 

thanks,
rui


  reply	other threads:[~2024-06-27  6:49 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-24  5:58 [PATCH v3 00/10] Add per-core RAPL energy counter support for AMD CPUs Dhananjay Ugwekar
2024-06-24  5:58 ` [PATCH v3 01/10] x86/topology: Introduce topology_logical_core_id() Dhananjay Ugwekar
2024-06-26 14:14   ` Zhang, Rui
2024-06-24  5:58 ` [PATCH v3 02/10] perf/x86/rapl: Fix the energy-pkg event for AMD CPUs Dhananjay Ugwekar
2024-06-26 14:18   ` Zhang, Rui
2024-06-26 15:34     ` Dhananjay Ugwekar
2024-07-01 12:59   ` Peter Zijlstra
2024-07-02  9:39     ` Dhananjay Ugwekar
2024-06-24  5:59 ` [PATCH v3 03/10] perf/x86/rapl: Rename rapl_pmu variables Dhananjay Ugwekar
2024-06-24  5:59 ` [PATCH v3 04/10] perf/x86/rapl: Make rapl_model struct global Dhananjay Ugwekar
2024-06-24  5:59 ` [PATCH v3 05/10] perf/x86/rapl: Move cpumask variable to rapl_pmus struct Dhananjay Ugwekar
2024-07-01 13:02   ` Peter Zijlstra
2024-07-02 10:16     ` Dhananjay Ugwekar
2024-06-24  5:59 ` [PATCH v3 06/10] perf/x86/rapl: Add wrapper for online/offline functions Dhananjay Ugwekar
2024-06-24  5:59 ` [PATCH v3 07/10] perf/x86/rapl: Add an argument to the cleanup and init functions Dhananjay Ugwekar
2024-06-24  5:59 ` [PATCH v3 08/10] perf/x86/rapl: Modify the generic variable names to *_pkg* Dhananjay Ugwekar
2024-07-01 13:08   ` Peter Zijlstra
2024-07-02  2:25     ` Zhang, Rui
2024-07-02 10:20       ` Dhananjay Ugwekar
2024-07-03  4:07         ` Zhang, Rui
2024-07-03  6:31           ` Dhananjay Ugwekar
2024-07-05  2:18             ` Zhang, Rui
2024-07-05  9:24               ` Peter Zijlstra
2024-07-08  1:56                 ` Zhang, Rui
2024-06-24  5:59 ` [PATCH v3 09/10] perf/x86/rapl: Remove the global variable rapl_msrs Dhananjay Ugwekar
2024-06-24  5:59 ` [PATCH v3 10/10] perf/x86/rapl: Add per-core energy counter support for AMD CPUs Dhananjay Ugwekar
2024-06-26 15:18   ` Zhang, Rui
2024-06-26 16:37     ` Dhananjay Ugwekar
2024-06-27  6:49       ` Zhang, Rui [this message]
2024-06-27 11:13         ` Dhananjay Ugwekar
2024-06-24  6:39 ` [PATCH v3 00/10] Add per-core RAPL " K Prateek Nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f2dfe380f06bdde6bf4aabc9c45ddea7e28d35fd.camel@intel.com \
    --to=rui.zhang@intel.com \
    --cc=Dhananjay.Ugwekar@amd.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=ananth.narayan@amd.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=gautham.shenoy@amd.com \
    --cc=gustavoars@kernel.org \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=kees@kernel.org \
    --cc=kprateek.nayak@amd.com \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=oleksandr@natalenko.name \
    --cc=peterz@infradead.org \
    --cc=ravi.bangoria@amd.com \
    --cc=sandipan.das@amd.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.