From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: maddy@linux.ibm.com, linuxppc-dev@lists.ozlabs.org,
	rnsastry@linux.ibm.com, kjain@linux.ibm.com
Subject: Re: [PATCH V2] powerpc/perf: Enable PMU counters post partition migration if PMU is active
Date: Mon, 25 Oct 2021 22:37:30 +0530	[thread overview]
Message-ID: <EDB0F140-EEE8-424C-9D15-4D1465BC24F1@linux.vnet.ibm.com> (raw)
In-Reply-To: <1634812863.5e9oss88pa.astroid@bobo.none>




> On 21-Oct-2021, at 4:24 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> Excerpts from Athira Rajeev's message of July 11, 2021 10:25 pm:
>> During Live Partition Migration (LPM), it is observed that perf
>> counter values read as zero after migration completes. 'perf stat'
>> on a workload continues to show counts after migration, since the
>> PMU gets disabled/enabled across sched switches. But in the case
>> of system/cpu-wide monitoring, 'perf stat' reports zero counts
>> after migration completes.
>> 
>> Example:
>> ./perf stat -e r1001e -I 1000
>>           time             counts unit events
>>     1.001010437         22,137,414      r1001e
>>     2.002495447         15,455,821      r1001e
>> <<>> As seen in the logs below, the counter values show zero
>>        after migration completes.
>> <<>>
>>    86.142535370    129,392,333,440      r1001e
>>    87.144714617                  0      r1001e
>>    88.146526636                  0      r1001e
>>    89.148085029                  0      r1001e
>> 
>> Here the PMU is enabled at the start of the perf session and
>> counter values are read at intervals; counters are only disabled
>> at the end of the session. The powerpc mobility code currently
>> does not disable and re-enable the PMU counters across partition
>> migration. Since the PMU register values are also not
>> saved/restored during migration, registers such as Monitor Mode
>> Control Register 0 (MMCR0) and Monitor Mode Control Register 1
>> (MMCR1) no longer contain the values they were programmed with.
>> Hence the PMU counters are not enabled correctly after migration.
>> 
>> Fix this in the mobility code by disabling and re-enabling the
>> PMU on all CPUs before and after migration. The patch introduces
>> two functions, mobility_pmu_disable() and mobility_pmu_enable().
>> mobility_pmu_disable() is called before the processor threads go
>> to suspend state, to disable the PMU counters; the disable is
>> done only if there are active events running on that CPU.
>> mobility_pmu_enable() is called after the processor threads are
>> back online, to re-enable the PMU counters.
>> 
>> Since the Performance Monitor Counters (PMCs) are not
>> saved/restored during LPM, the PMC value reads as zero after
>> migration while 'event->hw.prev_count' still holds a non-zero
>> value. This causes a problem
> 
> Interesting. Are they defined to not be migrated, or may not be 
> migrated?
> 
> I wonder what QEMU migration does with PMU registers.
> 
>> when updating event->count, since we always accumulate
>> (PMC value - event->hw.prev_count) into event->count. If
>> event->hw.prev_count is greater than the PMC value, event->count
>> goes negative. Fix this by re-initialising 'prev_count' for all
>> events when re-enabling them. A new field 'migrate' is
>> introduced in 'struct cpu_hw_events' to achieve this for LPM
>> cases in power_pmu_enable(). The 'migrate' value is used to
>> clear the PMC index (stored in event->hw.idx) for all events so
>> that event count settings get re-initialised correctly.
>> 
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> [ Fixed compilation error reported by kernel test robot ]
>> Reported-by: kernel test robot <lkp@intel.com>
>> ---
>> Change from v1 -> v2:
>> - Moved the mobility_pmu_enable and mobility_pmu_disable
>>   declarations under CONFIG_PPC_PERF_CTRS in rtas.h.
>>   Also included 'asm/rtas.h' in core-book3s to fix the
>>   compilation warning reported by kernel test robot.
>> 
>> arch/powerpc/include/asm/rtas.h           |  8 ++++++
>> arch/powerpc/perf/core-book3s.c           | 44 ++++++++++++++++++++++++++++---
>> arch/powerpc/platforms/pseries/mobility.c |  4 +++
>> 3 files changed, 53 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
>> index 9dc97d2..cea72d7 100644
>> --- a/arch/powerpc/include/asm/rtas.h
>> +++ b/arch/powerpc/include/asm/rtas.h
>> @@ -380,5 +380,13 @@ static inline void rtas_initialize(void) { }
>> static inline void read_24x7_sys_info(void) { }
>> #endif
>> 
>> +#ifdef CONFIG_PPC_PERF_CTRS
>> +void mobility_pmu_disable(void);
>> +void mobility_pmu_enable(void);
>> +#else
>> +static inline void mobility_pmu_disable(void) { }
>> +static inline void mobility_pmu_enable(void) { }
>> +#endif
>> +
>> #endif /* __KERNEL__ */
>> #endif /* _POWERPC_RTAS_H */
> 
> It's not implemented in rtas, maybe consider putting this into a perf 
> header?

Hi Nick,
Thanks for the review comments.

Sure, I will move this to the perf_event header file.

> 
>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>> index bb0ee71..90da7fa 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -18,6 +18,7 @@
>> #include <asm/ptrace.h>
>> #include <asm/code-patching.h>
>> #include <asm/interrupt.h>
>> +#include <asm/rtas.h>
>> 
>> #ifdef CONFIG_PPC64
>> #include "internal.h"
>> @@ -58,6 +59,7 @@ struct cpu_hw_events {
>> 
>> 	/* Store the PMC values */
>> 	unsigned long pmcs[MAX_HWEVENTS];
>> +	int migrate;
>> };
>> 
>> static DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events);
>> @@ -1335,6 +1337,22 @@ static void power_pmu_disable(struct pmu *pmu)
>> }
>> 
>> /*
>> + * Called from powerpc mobility code
>> + * before migration to disable counters
>> + * if the PMU is active.
>> + */
>> +void mobility_pmu_disable(void)
>> +{
>> +	struct cpu_hw_events *cpuhw;
>> +
>> +	cpuhw = this_cpu_ptr(&cpu_hw_events);
>> +	if (cpuhw->n_events != 0) {
>> +		power_pmu_disable(NULL);
>> +		cpuhw->migrate = 1;
>> +	}
> 
> Rather than add this migrate logic into power_pmu_enable(), would you be 
> able to do the work in the mobility callbacks instead? Something like
> this:
> 
> void mobility_pmu_disable(void)
> {
>        struct cpu_hw_events *cpuhw;
> 
>        cpuhw = this_cpu_ptr(&cpu_hw_events);
>        if (cpuhw->n_events != 0) {
>                int i;
> 
>                power_pmu_disable(NULL);
>                /*
>                 * Read off any pre-existing events because the register
>                 * values may not be migrated.
>                 */
>                for (i = 0; i < cpuhw->n_events; ++i) {
>                        event = cpuhw->event[i];
>                        if (event->hw.idx) {
>                                power_pmu_read(event);
>                                event->hw.idx = 0;
>                        }
>                }
>        }
> }
> 
> void mobility_pmu_enable(void)
> {
>        struct cpu_hw_events *cpuhw;
> 
>        cpuhw = this_cpu_ptr(&cpu_hw_events);
>        cpuhw->n_added = cpuhw->n_events;
>        power_pmu_enable(NULL);
> }
> 

Ok Nick, will address these changes in the next version.
With this approach, I won't need the "migrate" field.
>> +}
>> +
>> +/*
>>  * Re-enable all events if disable == 0.
>>  * If we were previously disabled and events were added, then
>>  * put the new config on the PMU.
>> @@ -1379,8 +1397,10 @@ static void power_pmu_enable(struct pmu *pmu)
>> 	 * no need to recalculate MMCR* settings and reset the PMCs.
>> 	 * Just reenable the PMU with the current MMCR* settings
>> 	 * (possibly updated for removal of events).
>> +	 * While reenabling PMU during partition migration, continue
>> +	 * with normal flow.
>> 	 */
>> -	if (!cpuhw->n_added) {
>> +	if (!cpuhw->n_added && !cpuhw->migrate) {
>> 		mtspr(SPRN_MMCRA, cpuhw->mmcr.mmcra & ~MMCRA_SAMPLE_ENABLE);
>> 		mtspr(SPRN_MMCR1, cpuhw->mmcr.mmcr1);
>> 		if (ppmu->flags & PPMU_ARCH_31)
>> @@ -1434,11 +1454,15 @@ static void power_pmu_enable(struct pmu *pmu)
>> 	/*
>> 	 * Read off any pre-existing events that need to move
>> 	 * to another PMC.
>> +	 * While enabling PMU during partition migration,
>> +	 * skip power_pmu_read since all event count settings
>> +	 * needs to be re-initialised after migration.
>> 	 */
>> 	for (i = 0; i < cpuhw->n_events; ++i) {
>> 		event = cpuhw->event[i];
>> -		if (event->hw.idx && event->hw.idx != hwc_index[i] + 1) {
>> -			power_pmu_read(event);
>> +		if ((event->hw.idx && event->hw.idx != hwc_index[i] + 1) || (cpuhw->migrate)) {
>> +			if (!cpuhw->migrate)
>> +				power_pmu_read(event);
>> 			write_pmc(event->hw.idx, 0);
>> 			event->hw.idx = 0;
>> 		}
>> @@ -1506,6 +1530,20 @@ static void power_pmu_enable(struct pmu *pmu)
>> 	local_irq_restore(flags);
>> }
>> 
>> +/*
>> + * Called from powerpc mobility code
>> + * during migration completion to
>> + * enable back PMU counters.
>> + */
>> +void mobility_pmu_enable(void)
>> +{
>> +	struct cpu_hw_events *cpuhw;
>> +
>> +	cpuhw = this_cpu_ptr(&cpu_hw_events);
>> +	power_pmu_enable(NULL);
>> +	cpuhw->migrate = 0;
>> +}
>> +
>> static int collect_events(struct perf_event *group, int max_count,
>> 			  struct perf_event *ctrs[], u64 *events,
>> 			  unsigned int *flags)
>> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
>> index e83e089..ff7a77c 100644
>> --- a/arch/powerpc/platforms/pseries/mobility.c
>> +++ b/arch/powerpc/platforms/pseries/mobility.c
>> @@ -476,6 +476,8 @@ static int do_join(void *arg)
>> retry:
>> 	/* Must ensure MSR.EE off for H_JOIN. */
>> 	hard_irq_disable();
>> +	/* Disable PMU before suspend */
>> +	mobility_pmu_disable();
>> 	hvrc = plpar_hcall_norets(H_JOIN);
>> 
>> 	switch (hvrc) {
> 
> Does the retry case cause mobility_pmu_disable to be called twice 
> without mobility_pmu_enable called? Would be neater if it was
> balanced.
> 

I will be moving these mobility callbacks to "pseries_migrate_partition" as Nathan suggested.
Will send a V3 with the new changes.

Thanks
Athira

> 
>> @@ -530,6 +532,8 @@ static int do_join(void *arg)
>> 	 * reset the watchdog.
>> 	 */
>> 	touch_nmi_watchdog();
>> +	/* Enable PMU after resuming */
>> +	mobility_pmu_enable();
>> 	return ret;
>> }
> 
> Not really a big deal but while you're updating it, you could leave 
> touch_nmi_watchdog() at the very end.
> 
> Thanks,
> Nick



