LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC PATCH] powerpc/pseries/svm: capture instruction faulting on MMIO access, in sprg0 register
From: Segher Boessenkool @ 2020-07-20 20:10 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: aik, Ram Pai, kvm-ppc, bharata, sathnaga, sukadev, linuxppc-dev,
	bauerman, david
In-Reply-To: <18e3bcee-8a3a-bd13-c995-8e4168471f74@linux.ibm.com>

On Mon, Jul 20, 2020 at 11:39:56AM +0200, Laurent Dufour wrote:
> Le 16/07/2020 à 10:32, Ram Pai a écrit :
> >+	if (is_secure_guest()) {					\
> >+		__asm__ __volatile__("mfsprg0 %3;"			\
> >+				"lnia %2;"				\
> >+				"ld %2,12(%2);"				\
> >+				"mtsprg0 %2;"				\
> >+				"sync;"					\
> >+				#insn" %0,%y1;"				\
> >+				"twi 0,%0,0;"				\
> >+				"isync;"				\
> >+				"mtsprg0 %3"				\
> >+			: "=r" (ret)					\
> >+			: "Z" (*addr), "r" (0), "r" (0)			\
> 
> I'm wondering if SPRG0 is restored to its original value.
> You're using the same register (r0) for parameters 2 and 3, so when doing 
> lnia %2, you're overwriting the SPRG0 value you saved in r0 just earlier.

It is putting the value 0 in the registers the compiler chooses for
operands 2 and 3.  But operand 3 is written, while the asm says it is an
input.  It needs an earlyclobber as well.

> It may be clearer to use explicit registers for %2 and %3 and to mark them 
> as modified for the compiler.

That is not a good idea, imnsho.


Segher

^ permalink raw reply

* Re: [RFC PATCH] powerpc/pseries/svm: capture instruction faulting on MMIO access, in sprg0 register
From: Segher Boessenkool @ 2020-07-20 20:24 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: aik, Ram Pai, kvm-ppc, bharata, sathnaga, sukadev, linuxppc-dev,
	bauerman, david
In-Reply-To: <20200720201041.GM30544@gate.crashing.org>

On Mon, Jul 20, 2020 at 03:10:41PM -0500, Segher Boessenkool wrote:
> On Mon, Jul 20, 2020 at 11:39:56AM +0200, Laurent Dufour wrote:
> > Le 16/07/2020 à 10:32, Ram Pai a écrit :
> > >+	if (is_secure_guest()) {					\
> > >+		__asm__ __volatile__("mfsprg0 %3;"			\
> > >+				"lnia %2;"				\
> > >+				"ld %2,12(%2);"				\
> > >+				"mtsprg0 %2;"				\
> > >+				"sync;"					\
> > >+				#insn" %0,%y1;"				\
> > >+				"twi 0,%0,0;"				\
> > >+				"isync;"				\
> > >+				"mtsprg0 %3"				\
> > >+			: "=r" (ret)					\
> > >+			: "Z" (*addr), "r" (0), "r" (0)			\
> > 
> > I'm wondering if SPRG0 is restored to its original value.
> > You're using the same register (r0) for parameters 2 and 3, so when doing 
> > lnia %2, you're overwriting the SPRG0 value you saved in r0 just earlier.
> 
> It is putting the value 0 in the registers the compiler chooses for
> operands 2 and 3.  But operand 3 is written, while the asm says it is an
> input.  It needs an earlyclobber as well.
> 
> > It may be clearer to use explicit registers for %2 and %3 and to mark them 
> > as modified for the compiler.
> 
> That is not a good idea, imnsho.

(The explicit register number part, I mean; operand 2 should be an
output as well, yes.)


Segher

^ permalink raw reply

* Re: [PATCH] powerpc/boot: Use address-of operator on section symbols
From: Segher Boessenkool @ 2020-07-20 21:02 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Arnd Bergmann, Geoff Levand, Linux Kernel Mailing List,
	clang-built-linux, Paul Mackerras, Joel Stanley,
	Nathan Chancellor, linuxppc-dev
In-Reply-To: <CAMuHMdU_KfQ-RT_nev5LgN=Vj_P97Fn=nwRoC6ZREFLa3Ysj7w@mail.gmail.com>

Hi!

On Sat, Jul 18, 2020 at 09:50:50AM +0200, Geert Uytterhoeven wrote:
> On Wed, Jun 24, 2020 at 6:02 AM Nathan Chancellor
> <natechancellor@gmail.com> wrote:
> >         /* If we have an image attached to us, it overrides anything
> >          * supplied by the loader. */
> > -       if (_initrd_end > _initrd_start) {
> > +       if (&_initrd_end > &_initrd_start) {
> 
> Are you sure that fix is correct?
> 
>     extern char _initrd_start[];
>     extern char _initrd_end[];
>     extern char _esm_blob_start[];
>     extern char _esm_blob_end[];
> 
> Of course the result of their comparison is a constant, as the addresses
> are constant.  If clangs warns about it, perhaps that warning should be moved
> to W=1?
> 
> But adding "&" is not correct, according to C.

Why not?

6.5.3.2/3
The unary & operator yields the address of its operand.  [...]
Otherwise, the result is a pointer to the object or function designated
by its operand.

This is the same as using the name of an array without anything else,
yes.  It is a bit clearer if it would not be declared as array, perhaps,
but it is correct just fine like this.


Segher

^ permalink raw reply

* Re: [PATCH net-next] net: fs_enet: remove redundant null check
From: David Miller @ 2020-07-21  0:42 UTC (permalink / raw)
  To: zhangchangzhong; +Cc: kuba, netdev, linuxppc-dev, linux-kernel
In-Reply-To: <1595243553-12325-1-git-send-email-zhangchangzhong@huawei.com>

From: Zhang Changzhong <zhangchangzhong@huawei.com>
Date: Mon, 20 Jul 2020 19:12:33 +0800

> Because clk_prepare_enable and clk_disable_unprepare already
> checked NULL clock parameter, so the additional checks are
> unnecessary, just remove them.
> 
> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH] powerpc/fault: kernel can extend a user process's stack
From: Daniel Axtens @ 2020-07-21  0:57 UTC (permalink / raw)
  To: Michael Ellerman, Michal Suchánek
  Cc: Tom Lane, linuxppc-dev, Daniel Black
In-Reply-To: <878sfethsj.fsf@mpe.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> writes:

> Michal Suchánek <msuchanek@suse.de> writes:
>> Hello,
>>
>> On Wed, Dec 11, 2019 at 08:37:21PM +1100, Daniel Axtens wrote:
>>> > Fixes: 14cf11af6cf6 ("powerpc: Merge enough to start building in
>>> > arch/powerpc.")
>>> 
>>> Wow, that's pretty ancient! I'm also not sure it's right - in that same
>>> patch, arch/ppc64/mm/fault.c contains:
>>> 
>>> ^1da177e4c3f4 (Linus Torvalds         2005-04-16 15:20:36 -0700 213)            if (address + 2048 < uregs->gpr[1]
>>> ^1da177e4c3f4 (Linus Torvalds         2005-04-16 15:20:36 -0700 214)                && (!user_mode(regs) || !store_updates_sp(regs)))
>>> ^1da177e4c3f4 (Linus Torvalds         2005-04-16 15:20:36 -0700 215)                    goto bad_area;
>>> 
>>> Which is the same as the new arch/powerpc/mm/fault.c code:
>>> 
>>> 14cf11af6cf60 (Paul Mackerras 2005-09-26 16:04:21 +1000 234)            if (address + 2048 < uregs->gpr[1]
>>> 14cf11af6cf60 (Paul Mackerras 2005-09-26 16:04:21 +1000 235)                && (!user_mode(regs) || !store_updates_sp(regs)))
>>> 14cf11af6cf60 (Paul Mackerras 2005-09-26 16:04:21 +1000 236)                    goto bad_area;
>>> 
>>> So either they're both right or they're both wrong, either way I'm not
>>> sure how this patch is to blame.
>>
>> Is there any progress on resolving this?
>>
>> I did not notice any followup patch nor this one being merged/refuted.
>
> It ended up with this:
>
> https://lore.kernel.org/linuxppc-dev/20200703141327.1732550-2-mpe@ellerman.id.au/
>
>
> Which I was hoping would get some reviews :)

Ah, I missed this. I'll give it a look as soon as I can.

Kind regards,
Daniel

>
> I'll probably merge the whole series into next this week.
>
> cheers

^ permalink raw reply

* Re: Question about NUMA distance calculation in powerpc/mm/numa.c
From: Michael Ellerman @ 2020-07-21  1:36 UTC (permalink / raw)
  To: Daniel Henrique Barboza, linuxppc-dev
In-Reply-To: <e5c3b1f1-d6ac-50d5-95f5-3c6e830a078e@gmail.com>

Daniel Henrique Barboza <danielhb413@gmail.com> writes:
> Hello,
>
>
> I didn't find an explanation about the 'double the distance' logic in
> 'git log' or anywhere in the kernel docs:
>
>
> (arch/powerpc/mm/numa.c, __node_distance()):

Adding more context:

  int distance = LOCAL_DISTANCE;
  ...

> for (i = 0; i < distance_ref_points_depth; i++) {
> 	if (distance_lookup_table[a][i] == distance_lookup_table[b][i])
> 		break;
>
> 	/* Double the distance for each NUMA level */
> 	distance *= 2;
> }

And:

#define LOCAL_DISTANCE		10
#define REMOTE_DISTANCE		20


So AFAICS the doubling is just a way to ensure we go from LOCAL_DISTANCE
to REMOTE_DISTANCE at the first level, and then after that it's fairly
arbitrary.

cheers

^ permalink raw reply

* Re: [PATCH v3 0/4] powerpc/mm/radix: Memory unplug fixes
From: Michael Ellerman @ 2020-07-21  1:45 UTC (permalink / raw)
  To: Nathan Lynch, Aneesh Kumar K.V; +Cc: linuxppc-dev, Bharata B Rao
In-Reply-To: <87r1tb1rw2.fsf@linux.ibm.com>

Nathan Lynch <nathanl@linux.ibm.com> writes:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>> This is the next version of the fixes for memory unplug on radix.
>> The issues and the fix are described in the actual patches.
>
> I guess this isn't actually causing problems at runtime right now, but I
> notice calls to resize_hpt_for_hotplug() from arch_add_memory() and
> arch_remove_memory(), which ought to be mmu-agnostic:
>
> int __ref arch_add_memory(int nid, u64 start, u64 size,
> 			  struct mhp_params *params)
> {
> 	unsigned long start_pfn = start >> PAGE_SHIFT;
> 	unsigned long nr_pages = size >> PAGE_SHIFT;
> 	int rc;
>
> 	resize_hpt_for_hotplug(memblock_phys_mem_size());
>
> 	start = (unsigned long)__va(start);
> 	rc = create_section_mapping(start, start + size, nid,
> 				    params->pgprot);
> ...

Hmm well spotted.

That does return early if the ops are not setup:

int resize_hpt_for_hotplug(unsigned long new_mem_size)
{
	unsigned target_hpt_shift;

	if (!mmu_hash_ops.resize_hpt)
		return 0;


And:

void __init hpte_init_pseries(void)
{
	...
	if (firmware_has_feature(FW_FEATURE_HPT_RESIZE))
		mmu_hash_ops.resize_hpt = pseries_lpar_resize_hpt;

And that comes in via ibm,hypertas-functions:

	{FW_FEATURE_HPT_RESIZE,		"hcall-hpt-resize"},


But firmware is not necessarily going to add/remove that call based on
whether we're using hash/radix.

So I think a follow-up patch is needed to make this more robust.

Aneesh/Bharata what platform did you test this series on? I'm curious
how this didn't break.

cheers

^ permalink raw reply

* Re: [PATCH 07/11] Powerpc/numa: Detect support for coregroup
From: Srikar Dronamraju @ 2020-07-21  2:57 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Nathan Lynch, Gautham R Shenoy, Oliver OHalloran, Michael Neuling,
	Anton Blanchard, linuxppc-dev, Nick Piggin
In-Reply-To: <875zaithmk.fsf@mpe.ellerman.id.au>

* Michael Ellerman <mpe@ellerman.id.au> [2020-07-20 23:56:19]:

> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> 
> > Add support for grouping cores based on the device-tree classification.
> > - The last domain in the associativity domains always refers to the
> > core.
> > - If primary reference domain happens to be the penultimate domain in
> > the associativity domains device-tree property, then there are no
> > coregroups. However if its not a penultimate domain, then there are
> > coregroups. There can be more than one coregroup. For now we would be
> > interested in the last or the smallest coregroups.
> 
> Should I know what a "coregroup" is? It's not a term I'm familiar with.
> 

Coregroup is a group or subset of cores from the same chip/DIE that share
some resources. This is very similar to MC domain aka Multi-Core Cache (MC),
(kernel/sched/topology.c) where cores share the same cache. In the
Multi-Core Cache domain, all the cores of that domain share the last level
of cache.

Will add the description to the commit message.

> When you repost can you expand the Cc list to include lkml and
> scheduler/topology folks please.
> 

Okay, will add them, I shall copy to LKML, Peter Zijlstra and Ingo Molnar.
Do let me know if you want anyone else to added.

> cheers

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply

* Re: [powerpc:next-test 103/106] arch/powerpc/mm/book3s64/radix_pgtable.c:513:21: error: use of undeclared identifier 'SECTION_SIZE_BITS'
From: Michael Ellerman @ 2020-07-21  3:13 UTC (permalink / raw)
  To: Christophe Leroy, Aneesh Kumar K.V
  Cc: linuxppc-dev, Bharata B Rao, kbuild-all, kernel test robot,
	clang-built-linux
In-Reply-To: <20200720183900.Horde.y2dVSL93KA1P6bzz7IKxoA1@messagerie.si.c-s.fr>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> a écrit :
...
>>
>> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c  
>> b/arch/powerpc/mm/book3s64/radix_pgtable.c
>> index bba45fc0b7b2..c5bf2ef73c36 100644
>> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
>> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
>> @@ -492,6 +492,7 @@ static int __init  
>> @@ -532,6 +533,15 @@ static unsigned long radix_memory_block_size(void)
>>  	return mem_block_size;
>>  }
>>
>> +#else   /* CONFIG_MEMORY_HOTPLUG */
>> +
>> +static unsigned long radix_memory_block_size(void)
>> +{
>> +	return 1UL * 1024 * 1024 * 1024;
>
> Use SZ_1G instead ?

I've already squashed that in.

I'd take a patch to convert all cases though, I see at least:

  arch/powerpc/boot/ep8248e.c:    mem_size *= 1024 * 1024;
  arch/powerpc/boot/ep88xc.c:     mem_size *= 1024 * 1024;
  arch/powerpc/include/asm/kexec.h:#define KEXEC_SOURCE_MEMORY_LIMIT      (2 * 1024 * 1024 * 1024UL - 1)
  arch/powerpc/include/asm/kexec.h:#define KEXEC_DESTINATION_MEMORY_LIMIT (2 * 1024 * 1024 * 1024UL - 1)
  arch/powerpc/include/asm/kexec.h:#define KEXEC_CONTROL_MEMORY_LIMIT     (2 * 1024 * 1024 * 1024UL - 1)
  arch/powerpc/kernel/iommu.c:    if ((tbl->it_size << tbl->it_page_shift) >= (1UL * 1024 * 1024 * 1024))
  arch/powerpc/kernel/setup-common.c:                        (unsigned int)(total_memory / (1024 * 1024)));
  arch/powerpc/mm/book3s64/radix_pgtable.c:               mem_block_size = 1UL * 1024 * 1024 * 1024;
  arch/powerpc/mm/book3s64/radix_pgtable.c:       return 1UL * 1024 * 1024 * 1024;
  arch/powerpc/mm/ioremap_32.c:   if (p < 16 * 1024 * 1024)
  arch/powerpc/platforms/powernv/setup.c:         return 256UL * 1024 * 1024;
  arch/powerpc/platforms/pseries/cmm.c:   signed long min_mem_pages = (min_mem_mb * 1024 * 1024) / PAGE_SIZE;

cheers

^ permalink raw reply

* Re: [PATCH v4 10/10] powerpc/watchpoint: Remove 512 byte boundary
From: Ravi Bangoria @ 2020-07-21  3:24 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: Christophe Leroy, apopple, mikey, miltonm, peterz, oleg,
	Nicholas Piggin, linux-kernel, Paul Mackerras, jolsa, fweisbec,
	pedromfc, naveen.n.rao, linuxppc-dev, mingo, Ravi Bangoria
In-Reply-To: <CACzsE9og50tH9jRZjWYDgbFxdTkDXJq3gMuP8uxPWfrrREo=4w@mail.gmail.com>

Hi Jordan,

On 7/20/20 12:24 PM, Jordan Niethe wrote:
> On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
> <ravi.bangoria@linux.ibm.com> wrote:
>>
>> Power10 has removed 512 bytes boundary from match criteria. i.e. The watch
>> range can cross 512 bytes boundary.
> It looks like this change is not mentioned in ISA v3.1 Book III 9.4
> Data Address Watchpoint. It could be useful to mention that in the
> commit message.

Yes, ISA 3.1 Book III 9.4 has a documentation mistake and hopefully it
will be fixed in the next version of ISA. Though, this is mentioned in
ISA 3.1 change log:

   Multiple DEAW:
   Added a second Data Address Watchpoint. [H]DAR is
   set to the first byte of overlap. 512B boundary is
   removed.

I'll mention this in the commit description.

> Also I wonder if could add a test for this to the ptrace-hwbreak selftest?

Yes, I already have a selftest for this in perf-hwbreak. Will send that soon.

Thanks,
Ravi

^ permalink raw reply

* Re: [PATCH v3 0/4] powerpc/mm/radix: Memory unplug fixes
From: Bharata B Rao @ 2020-07-21  3:29 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Nathan Lynch, Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <87tuy1sksv.fsf@mpe.ellerman.id.au>

On Tue, Jul 21, 2020 at 11:45:20AM +1000, Michael Ellerman wrote:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
> > "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> >> This is the next version of the fixes for memory unplug on radix.
> >> The issues and the fix are described in the actual patches.
> >
> > I guess this isn't actually causing problems at runtime right now, but I
> > notice calls to resize_hpt_for_hotplug() from arch_add_memory() and
> > arch_remove_memory(), which ought to be mmu-agnostic:
> >
> > int __ref arch_add_memory(int nid, u64 start, u64 size,
> > 			  struct mhp_params *params)
> > {
> > 	unsigned long start_pfn = start >> PAGE_SHIFT;
> > 	unsigned long nr_pages = size >> PAGE_SHIFT;
> > 	int rc;
> >
> > 	resize_hpt_for_hotplug(memblock_phys_mem_size());
> >
> > 	start = (unsigned long)__va(start);
> > 	rc = create_section_mapping(start, start + size, nid,
> > 				    params->pgprot);
> > ...
> 
> Hmm well spotted.
> 
> That does return early if the ops are not setup:
> 
> int resize_hpt_for_hotplug(unsigned long new_mem_size)
> {
> 	unsigned target_hpt_shift;
> 
> 	if (!mmu_hash_ops.resize_hpt)
> 		return 0;
> 
> 
> And:
> 
> void __init hpte_init_pseries(void)
> {
> 	...
> 	if (firmware_has_feature(FW_FEATURE_HPT_RESIZE))
> 		mmu_hash_ops.resize_hpt = pseries_lpar_resize_hpt;
> 
> And that comes in via ibm,hypertas-functions:
> 
> 	{FW_FEATURE_HPT_RESIZE,		"hcall-hpt-resize"},
> 
> 
> But firmware is not necessarily going to add/remove that call based on
> whether we're using hash/radix.

Correct but hpte_init_pseries() will not be called for radix guests.

> 
> So I think a follow-up patch is needed to make this more robust.
> 
> Aneesh/Bharata what platform did you test this series on? I'm curious
> how this didn't break.

I have tested memory hotplug/unplug for radix guest on zz platform and
sanity-tested this for hash guest on P8.

As noted above, mmu_hash_ops.resize_hpt will not be set for radix
guest and hence we won't see any breakage.

However a separate patch to fix this will be good.

Regards,
Bharata.

^ permalink raw reply

* Re: [v3 01/15] powerpc/perf: Update cpu_hw_event to use `struct` for storing MMCR registers
From: Jordan Niethe @ 2020-07-21  3:42 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: Gautham R Shenoy, mikey, maddy, kvm, kvm-ppc, svaidyan, acme,
	jolsa, linuxppc-dev
In-Reply-To: <1594996707-3727-2-git-send-email-atrajeev@linux.vnet.ibm.com>

On Sat, Jul 18, 2020 at 12:48 AM Athira Rajeev
<atrajeev@linux.vnet.ibm.com> wrote:
>
> core-book3s currently uses array to store the MMCR registers as part
> of per-cpu `cpu_hw_events`. This patch does a clean up to use `struct`
> to store mmcr regs instead of array. This will make code easier to read
> and reduces chance of any subtle bug that may come in the future, say
> when new registers are added. Patch updates all relevant code that was
> using MMCR array ( cpuhw->mmcr[x]) to use newly introduced `struct`.
> This includes the PMU driver code for supported platforms (power5
> to power9) and ISA macros for counter support functions.
>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/perf_event_server.h | 10 ++++--
>  arch/powerpc/perf/core-book3s.c              | 53 +++++++++++++---------------
>  arch/powerpc/perf/isa207-common.c            | 20 +++++------
>  arch/powerpc/perf/isa207-common.h            |  4 +--
>  arch/powerpc/perf/mpc7450-pmu.c              | 21 +++++++----
>  arch/powerpc/perf/power5+-pmu.c              | 17 ++++-----
>  arch/powerpc/perf/power5-pmu.c               | 17 ++++-----
>  arch/powerpc/perf/power6-pmu.c               | 16 ++++-----
>  arch/powerpc/perf/power7-pmu.c               | 17 ++++-----
>  arch/powerpc/perf/ppc970-pmu.c               | 24 ++++++-------
>  10 files changed, 105 insertions(+), 94 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
> index 3e9703f..f9a3668 100644
> --- a/arch/powerpc/include/asm/perf_event_server.h
> +++ b/arch/powerpc/include/asm/perf_event_server.h
> @@ -17,6 +17,12 @@
>
>  struct perf_event;
>
> +struct mmcr_regs {
> +       unsigned long mmcr0;
> +       unsigned long mmcr1;
> +       unsigned long mmcr2;
> +       unsigned long mmcra;
> +};
>  /*
>   * This struct provides the constants and functions needed to
>   * describe the PMU on a particular POWER-family CPU.
> @@ -28,7 +34,7 @@ struct power_pmu {
>         unsigned long   add_fields;
>         unsigned long   test_adder;
>         int             (*compute_mmcr)(u64 events[], int n_ev,
> -                               unsigned int hwc[], unsigned long mmcr[],
> +                               unsigned int hwc[], struct mmcr_regs *mmcr,
>                                 struct perf_event *pevents[]);
>         int             (*get_constraint)(u64 event_id, unsigned long *mskp,
>                                 unsigned long *valp);
> @@ -41,7 +47,7 @@ struct power_pmu {
>         unsigned long   group_constraint_val;
>         u64             (*bhrb_filter_map)(u64 branch_sample_type);
>         void            (*config_bhrb)(u64 pmu_bhrb_filter);
> -       void            (*disable_pmc)(unsigned int pmc, unsigned long mmcr[]);
> +       void            (*disable_pmc)(unsigned int pmc, struct mmcr_regs *mmcr);
>         int             (*limited_pmc_event)(u64 event_id);
>         u32             flags;
>         const struct attribute_group    **attr_groups;
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index cd6a742..18b1b6a 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -37,12 +37,7 @@ struct cpu_hw_events {
>         struct perf_event *event[MAX_HWEVENTS];
>         u64 events[MAX_HWEVENTS];
>         unsigned int flags[MAX_HWEVENTS];
> -       /*
> -        * The order of the MMCR array is:
> -        *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
> -        *  - 32-bit, MMCR0, MMCR1, MMCR2
> -        */
> -       unsigned long mmcr[4];
> +       struct mmcr_regs mmcr;
>         struct perf_event *limited_counter[MAX_LIMITED_HWCOUNTERS];
>         u8  limited_hwidx[MAX_LIMITED_HWCOUNTERS];
>         u64 alternatives[MAX_HWEVENTS][MAX_EVENT_ALTERNATIVES];
> @@ -121,7 +116,7 @@ static void ebb_event_add(struct perf_event *event) { }
>  static void ebb_switch_out(unsigned long mmcr0) { }
>  static unsigned long ebb_switch_in(bool ebb, struct cpu_hw_events *cpuhw)
>  {
> -       return cpuhw->mmcr[0];
> +       return cpuhw->mmcr.mmcr0;
>  }
>
>  static inline void power_pmu_bhrb_enable(struct perf_event *event) {}
> @@ -590,7 +585,7 @@ static void ebb_switch_out(unsigned long mmcr0)
>
>  static unsigned long ebb_switch_in(bool ebb, struct cpu_hw_events *cpuhw)
>  {
> -       unsigned long mmcr0 = cpuhw->mmcr[0];
> +       unsigned long mmcr0 = cpuhw->mmcr.mmcr0;
>
>         if (!ebb)
>                 goto out;
> @@ -624,7 +619,7 @@ static unsigned long ebb_switch_in(bool ebb, struct cpu_hw_events *cpuhw)
>          * unfreeze counters, it should not set exclude_xxx in its events and
>          * instead manage the MMCR2 entirely by itself.
>          */
> -       mtspr(SPRN_MMCR2, cpuhw->mmcr[3] | current->thread.mmcr2);
> +       mtspr(SPRN_MMCR2, cpuhw->mmcr.mmcr2 | current->thread.mmcr2);
>  out:
>         return mmcr0;
>  }
> @@ -1232,9 +1227,9 @@ static void power_pmu_disable(struct pmu *pmu)
>                 /*
>                  * Disable instruction sampling if it was enabled
>                  */
> -               if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
> +               if (cpuhw->mmcr.mmcra & MMCRA_SAMPLE_ENABLE) {
>                         mtspr(SPRN_MMCRA,
> -                             cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
> +                             cpuhw->mmcr.mmcra & ~MMCRA_SAMPLE_ENABLE);
>                         mb();
>                         isync();
>                 }
> @@ -1308,18 +1303,18 @@ static void power_pmu_enable(struct pmu *pmu)
>          * (possibly updated for removal of events).
>          */
>         if (!cpuhw->n_added) {
> -               mtspr(SPRN_MMCRA, cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
> -               mtspr(SPRN_MMCR1, cpuhw->mmcr[1]);
> +               mtspr(SPRN_MMCRA, cpuhw->mmcr.mmcra & ~MMCRA_SAMPLE_ENABLE);
> +               mtspr(SPRN_MMCR1, cpuhw->mmcr.mmcr1);
>                 goto out_enable;
>         }
>
>         /*
>          * Clear all MMCR settings and recompute them for the new set of events.
>          */
> -       memset(cpuhw->mmcr, 0, sizeof(cpuhw->mmcr));
> +       memset(&cpuhw->mmcr, 0, sizeof(cpuhw->mmcr));
>
>         if (ppmu->compute_mmcr(cpuhw->events, cpuhw->n_events, hwc_index,
> -                              cpuhw->mmcr, cpuhw->event)) {
> +                              &cpuhw->mmcr, cpuhw->event)) {
>                 /* shouldn't ever get here */
>                 printk(KERN_ERR "oops compute_mmcr failed\n");
>                 goto out;
> @@ -1333,11 +1328,11 @@ static void power_pmu_enable(struct pmu *pmu)
>                  */
>                 event = cpuhw->event[0];
>                 if (event->attr.exclude_user)
> -                       cpuhw->mmcr[0] |= MMCR0_FCP;
> +                       cpuhw->mmcr.mmcr0 |= MMCR0_FCP;
>                 if (event->attr.exclude_kernel)
> -                       cpuhw->mmcr[0] |= freeze_events_kernel;
> +                       cpuhw->mmcr.mmcr0 |= freeze_events_kernel;
>                 if (event->attr.exclude_hv)
> -                       cpuhw->mmcr[0] |= MMCR0_FCHV;
> +                       cpuhw->mmcr.mmcr0 |= MMCR0_FCHV;
>         }
>
>         /*
> @@ -1346,12 +1341,12 @@ static void power_pmu_enable(struct pmu *pmu)
>          * Then unfreeze the events.
>          */
>         ppc_set_pmu_inuse(1);
> -       mtspr(SPRN_MMCRA, cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
> -       mtspr(SPRN_MMCR1, cpuhw->mmcr[1]);
> -       mtspr(SPRN_MMCR0, (cpuhw->mmcr[0] & ~(MMCR0_PMC1CE | MMCR0_PMCjCE))
> +       mtspr(SPRN_MMCRA, cpuhw->mmcr.mmcra & ~MMCRA_SAMPLE_ENABLE);
> +       mtspr(SPRN_MMCR1, cpuhw->mmcr.mmcr1);
> +       mtspr(SPRN_MMCR0, (cpuhw->mmcr.mmcr0 & ~(MMCR0_PMC1CE | MMCR0_PMCjCE))
>                                 | MMCR0_FC);
>         if (ppmu->flags & PPMU_ARCH_207S)
> -               mtspr(SPRN_MMCR2, cpuhw->mmcr[3]);
> +               mtspr(SPRN_MMCR2, cpuhw->mmcr.mmcr2);
>
>         /*
>          * Read off any pre-existing events that need to move
> @@ -1402,7 +1397,7 @@ static void power_pmu_enable(struct pmu *pmu)
>                 perf_event_update_userpage(event);
>         }
>         cpuhw->n_limited = n_lim;
> -       cpuhw->mmcr[0] |= MMCR0_PMXE | MMCR0_FCECE;
> +       cpuhw->mmcr.mmcr0 |= MMCR0_PMXE | MMCR0_FCECE;
>
>   out_enable:
>         pmao_restore_workaround(ebb);
> @@ -1418,9 +1413,9 @@ static void power_pmu_enable(struct pmu *pmu)
>         /*
>          * Enable instruction sampling if necessary
>          */
> -       if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
> +       if (cpuhw->mmcr.mmcra & MMCRA_SAMPLE_ENABLE) {
>                 mb();
> -               mtspr(SPRN_MMCRA, cpuhw->mmcr[2]);
> +               mtspr(SPRN_MMCRA, cpuhw->mmcr.mmcra);
>         }
>
>   out:
> @@ -1550,7 +1545,7 @@ static void power_pmu_del(struct perf_event *event, int ef_flags)
>                                 cpuhw->flags[i-1] = cpuhw->flags[i];
>                         }
>                         --cpuhw->n_events;
> -                       ppmu->disable_pmc(event->hw.idx - 1, cpuhw->mmcr);
> +                       ppmu->disable_pmc(event->hw.idx - 1, &cpuhw->mmcr);
>                         if (event->hw.idx) {
>                                 write_pmc(event->hw.idx, 0);
>                                 event->hw.idx = 0;
> @@ -1571,7 +1566,7 @@ static void power_pmu_del(struct perf_event *event, int ef_flags)
>         }
>         if (cpuhw->n_events == 0) {
>                 /* disable exceptions if no events are running */
> -               cpuhw->mmcr[0] &= ~(MMCR0_PMXE | MMCR0_FCECE);
> +               cpuhw->mmcr.mmcr0 &= ~(MMCR0_PMXE | MMCR0_FCECE);
>         }
>
>         if (has_branch_stack(event))
> @@ -2240,7 +2235,7 @@ static void __perf_event_interrupt(struct pt_regs *regs)
>          * XXX might want to use MSR.PM to keep the events frozen until
>          * we get back out of this interrupt.
>          */
> -       write_mmcr0(cpuhw, cpuhw->mmcr[0]);
> +       write_mmcr0(cpuhw, cpuhw->mmcr.mmcr0);
>
>         if (nmi)
>                 nmi_exit();
> @@ -2262,7 +2257,7 @@ static int power_pmu_prepare_cpu(unsigned int cpu)
>
>         if (ppmu) {
>                 memset(cpuhw, 0, sizeof(*cpuhw));
> -               cpuhw->mmcr[0] = MMCR0_FC;
> +               cpuhw->mmcr.mmcr0 = MMCR0_FC;
>         }
>         return 0;
>  }
> diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
> index 4c86da5..2fe63f2 100644
> --- a/arch/powerpc/perf/isa207-common.c
> +++ b/arch/powerpc/perf/isa207-common.c
> @@ -363,7 +363,7 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
>  }
>
>  int isa207_compute_mmcr(u64 event[], int n_ev,
> -                              unsigned int hwc[], unsigned long mmcr[],
> +                              unsigned int hwc[], struct mmcr_regs *mmcr,
>                                struct perf_event *pevents[])
>  {
>         unsigned long mmcra, mmcr1, mmcr2, unit, combine, psel, cache, val;
> @@ -464,30 +464,30 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>         }
>
>         /* Return MMCRx values */
> -       mmcr[0] = 0;
> +       mmcr->mmcr0 = 0;
>
>         /* pmc_inuse is 1-based */
>         if (pmc_inuse & 2)
> -               mmcr[0] = MMCR0_PMC1CE;
> +               mmcr->mmcr0 = MMCR0_PMC1CE;
>
>         if (pmc_inuse & 0x7c)
> -               mmcr[0] |= MMCR0_PMCjCE;
> +               mmcr->mmcr0 |= MMCR0_PMCjCE;
>
>         /* If we're not using PMC 5 or 6, freeze them */
>         if (!(pmc_inuse & 0x60))
> -               mmcr[0] |= MMCR0_FC56;
> +               mmcr->mmcr0 |= MMCR0_FC56;
>
> -       mmcr[1] = mmcr1;
> -       mmcr[2] = mmcra;
> -       mmcr[3] = mmcr2;
> +       mmcr->mmcr1 = mmcr1;
> +       mmcr->mmcra = mmcra;
> +       mmcr->mmcr2 = mmcr2;
>
>         return 0;
>  }
>
> -void isa207_disable_pmc(unsigned int pmc, unsigned long mmcr[])
> +void isa207_disable_pmc(unsigned int pmc, struct mmcr_regs *mmcr)
>  {
>         if (pmc <= 3)
> -               mmcr[1] &= ~(0xffUL << MMCR1_PMCSEL_SHIFT(pmc + 1));
> +               mmcr->mmcr1 &= ~(0xffUL << MMCR1_PMCSEL_SHIFT(pmc + 1));
>  }
>
>  static int find_alternative(u64 event, const unsigned int ev_alt[][MAX_ALT], int size)
> diff --git a/arch/powerpc/perf/isa207-common.h b/arch/powerpc/perf/isa207-common.h
> index 63fd4f3..df968fd 100644
> --- a/arch/powerpc/perf/isa207-common.h
> +++ b/arch/powerpc/perf/isa207-common.h
> @@ -217,9 +217,9 @@
>
>  int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp);
>  int isa207_compute_mmcr(u64 event[], int n_ev,
> -                               unsigned int hwc[], unsigned long mmcr[],
> +                               unsigned int hwc[], struct mmcr_regs *mmcr,
>                                 struct perf_event *pevents[]);
> -void isa207_disable_pmc(unsigned int pmc, unsigned long mmcr[]);
> +void isa207_disable_pmc(unsigned int pmc, struct mmcr_regs *mmcr);
>  int isa207_get_alternatives(u64 event, u64 alt[], int size, unsigned int flags,
>                                         const unsigned int ev_alt[][MAX_ALT]);
>  void isa207_get_mem_data_src(union perf_mem_data_src *dsrc, u32 flags,
> diff --git a/arch/powerpc/perf/mpc7450-pmu.c b/arch/powerpc/perf/mpc7450-pmu.c
> index 4d5ef92..826de25 100644
> --- a/arch/powerpc/perf/mpc7450-pmu.c
> +++ b/arch/powerpc/perf/mpc7450-pmu.c
> @@ -257,7 +257,7 @@ static int mpc7450_get_alternatives(u64 event, unsigned int flags, u64 alt[])
>   * Compute MMCR0/1/2 values for a set of events.
>   */
>  static int mpc7450_compute_mmcr(u64 event[], int n_ev, unsigned int hwc[],
> -                               unsigned long mmcr[],
> +                               struct mmcr_regs *mmcr,
>                                 struct perf_event *pevents[])
>  {
>         u8 event_index[N_CLASSES][N_COUNTER];
> @@ -321,9 +321,16 @@ static int mpc7450_compute_mmcr(u64 event[], int n_ev, unsigned int hwc[],
>                 mmcr0 |= MMCR0_PMCnCE;
>
>         /* Return MMCRx values */
> -       mmcr[0] = mmcr0;
> -       mmcr[1] = mmcr1;
> -       mmcr[2] = mmcr2;
> +       mmcr->mmcr0 = mmcr0;
> +       mmcr->mmcr1 = mmcr1;
> +       mmcr->mmcr2 = mmcr2;
Will this mmcr->mmcr2 be used anywere, or will it always use mmcr->mmcra?
> +       /*
> +        * 32-bit doesn't have an MMCRA and uses SPRN_MMCR2 to define
> +        * SPRN_MMCRA. So assign mmcra of cpu_hw_events with `mmcr2`
> +        * value to ensure that any write to this SPRN_MMCRA will
> +        * use mmcr2 value.
> +        */
Something like this comment would probably be useful to near the struct mmcr.
> +       mmcr->mmcra = mmcr2;
>         return 0;
>  }
>
> @@ -331,12 +338,12 @@ static int mpc7450_compute_mmcr(u64 event[], int n_ev, unsigned int hwc[],
>   * Disable counting by a PMC.
>   * Note that the pmc argument is 0-based here, not 1-based.
>   */
> -static void mpc7450_disable_pmc(unsigned int pmc, unsigned long mmcr[])
> +static void mpc7450_disable_pmc(unsigned int pmc, struct mmcr_regs *mmcr)
>  {
>         if (pmc <= 1)
> -               mmcr[0] &= ~(pmcsel_mask[pmc] << pmcsel_shift[pmc]);
> +               mmcr->mmcr0 &= ~(pmcsel_mask[pmc] << pmcsel_shift[pmc]);
>         else
> -               mmcr[1] &= ~(pmcsel_mask[pmc] << pmcsel_shift[pmc]);
> +               mmcr->mmcr1 &= ~(pmcsel_mask[pmc] << pmcsel_shift[pmc]);
>  }
>
>  static int mpc7450_generic_events[] = {
> diff --git a/arch/powerpc/perf/power5+-pmu.c b/arch/powerpc/perf/power5+-pmu.c
> index f857454..5f0821e 100644
> --- a/arch/powerpc/perf/power5+-pmu.c
> +++ b/arch/powerpc/perf/power5+-pmu.c
> @@ -448,7 +448,8 @@ static int power5p_marked_instr_event(u64 event)
>  }
>
>  static int power5p_compute_mmcr(u64 event[], int n_ev,
> -                               unsigned int hwc[], unsigned long mmcr[], struct perf_event *pevents[])
> +                               unsigned int hwc[], struct mmcr_regs *mmcr,
> +                               struct perf_event *pevents[])
>  {
>         unsigned long mmcr1 = 0;
>         unsigned long mmcra = 0;
> @@ -586,20 +587,20 @@ static int power5p_compute_mmcr(u64 event[], int n_ev,
>         }
>
>         /* Return MMCRx values */
> -       mmcr[0] = 0;
> +       mmcr->mmcr0 = 0;
>         if (pmc_inuse & 1)
> -               mmcr[0] = MMCR0_PMC1CE;
> +               mmcr->mmcr0 = MMCR0_PMC1CE;
>         if (pmc_inuse & 0x3e)
> -               mmcr[0] |= MMCR0_PMCjCE;
> -       mmcr[1] = mmcr1;
> -       mmcr[2] = mmcra;
> +               mmcr->mmcr0 |= MMCR0_PMCjCE;
> +       mmcr->mmcr1 = mmcr1;
> +       mmcr->mmcra = mmcra;
>         return 0;
>  }
>
> -static void power5p_disable_pmc(unsigned int pmc, unsigned long mmcr[])
> +static void power5p_disable_pmc(unsigned int pmc, struct mmcr_regs *mmcr)
>  {
>         if (pmc <= 3)
> -               mmcr[1] &= ~(0x7fUL << MMCR1_PMCSEL_SH(pmc));
> +               mmcr->mmcr1 &= ~(0x7fUL << MMCR1_PMCSEL_SH(pmc));
>  }
>
>  static int power5p_generic_events[] = {
> diff --git a/arch/powerpc/perf/power5-pmu.c b/arch/powerpc/perf/power5-pmu.c
> index da52eca..426021d 100644
> --- a/arch/powerpc/perf/power5-pmu.c
> +++ b/arch/powerpc/perf/power5-pmu.c
> @@ -379,7 +379,8 @@ static int power5_marked_instr_event(u64 event)
>  }
>
>  static int power5_compute_mmcr(u64 event[], int n_ev,
> -                              unsigned int hwc[], unsigned long mmcr[], struct perf_event *pevents[])
> +                              unsigned int hwc[], struct mmcr_regs *mmcr,
> +                              struct perf_event *pevents[])
>  {
>         unsigned long mmcr1 = 0;
>         unsigned long mmcra = MMCRA_SDAR_DCACHE_MISS | MMCRA_SDAR_ERAT_MISS;
> @@ -528,20 +529,20 @@ static int power5_compute_mmcr(u64 event[], int n_ev,
>         }
>
>         /* Return MMCRx values */
> -       mmcr[0] = 0;
> +       mmcr->mmcr0 = 0;
>         if (pmc_inuse & 1)
> -               mmcr[0] = MMCR0_PMC1CE;
> +               mmcr->mmcr0 = MMCR0_PMC1CE;
>         if (pmc_inuse & 0x3e)
> -               mmcr[0] |= MMCR0_PMCjCE;
> -       mmcr[1] = mmcr1;
> -       mmcr[2] = mmcra;
> +               mmcr->mmcr0 |= MMCR0_PMCjCE;
> +       mmcr->mmcr1 = mmcr1;
> +       mmcr->mmcra = mmcra;
>         return 0;
>  }
>
> -static void power5_disable_pmc(unsigned int pmc, unsigned long mmcr[])
> +static void power5_disable_pmc(unsigned int pmc, struct mmcr_regs *mmcr)
>  {
>         if (pmc <= 3)
> -               mmcr[1] &= ~(0x7fUL << MMCR1_PMCSEL_SH(pmc));
> +               mmcr->mmcr1 &= ~(0x7fUL << MMCR1_PMCSEL_SH(pmc));
>  }
>
>  static int power5_generic_events[] = {
> diff --git a/arch/powerpc/perf/power6-pmu.c b/arch/powerpc/perf/power6-pmu.c
> index 3929cac..e343a51 100644
> --- a/arch/powerpc/perf/power6-pmu.c
> +++ b/arch/powerpc/perf/power6-pmu.c
> @@ -171,7 +171,7 @@ static int power6_marked_instr_event(u64 event)
>   * Assign PMC numbers and compute MMCR1 value for a set of events
>   */
>  static int p6_compute_mmcr(u64 event[], int n_ev,
> -                          unsigned int hwc[], unsigned long mmcr[], struct perf_event *pevents[])
> +                          unsigned int hwc[], struct mmcr_regs *mmcr, struct perf_event *pevents[])
>  {
>         unsigned long mmcr1 = 0;
>         unsigned long mmcra = MMCRA_SDAR_DCACHE_MISS | MMCRA_SDAR_ERAT_MISS;
> @@ -243,13 +243,13 @@ static int p6_compute_mmcr(u64 event[], int n_ev,
>                 if (pmc < 4)
>                         mmcr1 |= (unsigned long)psel << MMCR1_PMCSEL_SH(pmc);
>         }
> -       mmcr[0] = 0;
> +       mmcr->mmcr0 = 0;
>         if (pmc_inuse & 1)
> -               mmcr[0] = MMCR0_PMC1CE;
> +               mmcr->mmcr0 = MMCR0_PMC1CE;
>         if (pmc_inuse & 0xe)
> -               mmcr[0] |= MMCR0_PMCjCE;
> -       mmcr[1] = mmcr1;
> -       mmcr[2] = mmcra;
> +               mmcr->mmcr0 |= MMCR0_PMCjCE;
> +       mmcr->mmcr1 = mmcr1;
> +       mmcr->mmcra = mmcra;
>         return 0;
>  }
>
> @@ -457,11 +457,11 @@ static int p6_get_alternatives(u64 event, unsigned int flags, u64 alt[])
>         return nalt;
>  }
>
> -static void p6_disable_pmc(unsigned int pmc, unsigned long mmcr[])
> +static void p6_disable_pmc(unsigned int pmc, struct mmcr_regs *mmcr)
>  {
>         /* Set PMCxSEL to 0 to disable PMCx */
>         if (pmc <= 3)
> -               mmcr[1] &= ~(0xffUL << MMCR1_PMCSEL_SH(pmc));
> +               mmcr->mmcr1 &= ~(0xffUL << MMCR1_PMCSEL_SH(pmc));
>  }
>
>  static int power6_generic_events[] = {
> diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
> index a137813..3152336 100644
> --- a/arch/powerpc/perf/power7-pmu.c
> +++ b/arch/powerpc/perf/power7-pmu.c
> @@ -242,7 +242,8 @@ static int power7_marked_instr_event(u64 event)
>  }
>
>  static int power7_compute_mmcr(u64 event[], int n_ev,
> -                              unsigned int hwc[], unsigned long mmcr[], struct perf_event *pevents[])
> +                              unsigned int hwc[], struct mmcr_regs *mmcr,
> +                              struct perf_event *pevents[])
>  {
>         unsigned long mmcr1 = 0;
>         unsigned long mmcra = MMCRA_SDAR_DCACHE_MISS | MMCRA_SDAR_ERAT_MISS;
> @@ -298,20 +299,20 @@ static int power7_compute_mmcr(u64 event[], int n_ev,
>         }
>
>         /* Return MMCRx values */
> -       mmcr[0] = 0;
> +       mmcr->mmcr0 = 0;
>         if (pmc_inuse & 1)
> -               mmcr[0] = MMCR0_PMC1CE;
> +               mmcr->mmcr0 = MMCR0_PMC1CE;
>         if (pmc_inuse & 0x3e)
> -               mmcr[0] |= MMCR0_PMCjCE;
> -       mmcr[1] = mmcr1;
> -       mmcr[2] = mmcra;
> +               mmcr->mmcr0 |= MMCR0_PMCjCE;
> +       mmcr->mmcr1 = mmcr1;
> +       mmcr->mmcra = mmcra;
>         return 0;
>  }
>
> -static void power7_disable_pmc(unsigned int pmc, unsigned long mmcr[])
> +static void power7_disable_pmc(unsigned int pmc, struct mmcr_regs *mmcr)
>  {
>         if (pmc <= 3)
> -               mmcr[1] &= ~(0xffUL << MMCR1_PMCSEL_SH(pmc));
> +               mmcr->mmcr1 &= ~(0xffUL << MMCR1_PMCSEL_SH(pmc));
>  }
>
>  static int power7_generic_events[] = {
> diff --git a/arch/powerpc/perf/ppc970-pmu.c b/arch/powerpc/perf/ppc970-pmu.c
> index 4035d93..89a90ab 100644
> --- a/arch/powerpc/perf/ppc970-pmu.c
> +++ b/arch/powerpc/perf/ppc970-pmu.c
> @@ -253,7 +253,8 @@ static int p970_get_alternatives(u64 event, unsigned int flags, u64 alt[])
>  }
>
>  static int p970_compute_mmcr(u64 event[], int n_ev,
> -                            unsigned int hwc[], unsigned long mmcr[], struct perf_event *pevents[])
> +                            unsigned int hwc[], struct mmcr_regs *mmcr,
> +                            struct perf_event *pevents[])
>  {
>         unsigned long mmcr0 = 0, mmcr1 = 0, mmcra = 0;
>         unsigned int pmc, unit, byte, psel;
> @@ -393,27 +394,26 @@ static int p970_compute_mmcr(u64 event[], int n_ev,
>         mmcra |= 0x2000;        /* mark only one IOP per PPC instruction */
>
>         /* Return MMCRx values */
> -       mmcr[0] = mmcr0;
> -       mmcr[1] = mmcr1;
> -       mmcr[2] = mmcra;
> +       mmcr->mmcr0 = mmcr0;
> +       mmcr->mmcr1 = mmcr1;
> +       mmcr->mmcra = mmcra;
>         return 0;
>  }
>
> -static void p970_disable_pmc(unsigned int pmc, unsigned long mmcr[])
> +static void p970_disable_pmc(unsigned int pmc, struct mmcr_regs *mmcr)
>  {
> -       int shift, i;
> +       int shift;
>
> +       /*
> +        * Setting the PMCxSEL field to 0x08 disables PMC x.
> +        */
>         if (pmc <= 1) {
>                 shift = MMCR0_PMC1SEL_SH - 7 * pmc;
> -               i = 0;
> +               mmcr->mmcr0 = (mmcr->mmcr0 & ~(0x1fUL << shift)) | (0x08UL << shift);
>         } else {
>                 shift = MMCR1_PMC3SEL_SH - 5 * (pmc - 2);
> -               i = 1;
> +               mmcr->mmcr1 = (mmcr->mmcr1 & ~(0x1fUL << shift)) | (0x08UL << shift);
>         }
> -       /*
> -        * Setting the PMCxSEL field to 0x08 disables PMC x.
> -        */
> -       mmcr[i] = (mmcr[i] & ~(0x1fUL << shift)) | (0x08UL << shift);
>  }
>
>  static int ppc970_generic_events[] = {
> --
> 1.8.3.1
>

^ permalink raw reply

* [PATCH 2/2] ASoC: bindings: fsl-asoc-card: Support properties for configuring dai fmt
From: Shengjiu Wang @ 2020-07-21  3:41 UTC (permalink / raw)
  To: timur, nicoleotsuka, Xiubo.Lee, festevam, lgirdwood, broonie,
	perex, tiwai, alsa-devel, robh+dt, devicetree
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1595302910-19688-1-git-send-email-shengjiu.wang@nxp.com>

In order to support configuring dai fmt through DT, add some properties.
These properiese are same as the properties in simple card.

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
 Documentation/devicetree/bindings/sound/fsl-asoc-card.txt | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/fsl-asoc-card.txt b/Documentation/devicetree/bindings/sound/fsl-asoc-card.txt
index 8a6a3d0fda5e..63ebf52b43e8 100644
--- a/Documentation/devicetree/bindings/sound/fsl-asoc-card.txt
+++ b/Documentation/devicetree/bindings/sound/fsl-asoc-card.txt
@@ -71,6 +71,11 @@ Optional properties:
 
   - hp-det-gpio		: The GPIO that detect headphones are plugged in
   - mic-det-gpio	: The GPIO that detect microphones are plugged in
+  - bitclock-master	: Indicates dai-link bit clock master; for details see simple-card.yaml.
+  - frame-master	: Indicates dai-link frame master; for details see simple-card.yaml.
+  - dai-format		: audio format, for details see simple-card.yaml.
+  - frame-inversion	: dai-link uses frame clock inversion, for details see simple-card.yaml.
+  - bitclock-inversion	: dai-link uses bit clock inversion, for details see simple-card.yaml.
 
 Optional unless SSI is selected as a CPU DAI:
 
-- 
2.27.0


^ permalink raw reply related

* [PATCH 1/2] ASoC: fsl-asoc-card: Support configuring dai fmt from DT
From: Shengjiu Wang @ 2020-07-21  3:41 UTC (permalink / raw)
  To: timur, nicoleotsuka, Xiubo.Lee, festevam, lgirdwood, broonie,
	perex, tiwai, alsa-devel, robh+dt, devicetree
  Cc: linuxppc-dev, linux-kernel

Support same propeties as simple card for configuring fmt
from DT.
In order to make this change compatible with old DT, these
properties are optional.

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
 sound/soc/fsl/fsl-asoc-card.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/sound/soc/fsl/fsl-asoc-card.c b/sound/soc/fsl/fsl-asoc-card.c
index ee80d02b56c6..4848ba61d083 100644
--- a/sound/soc/fsl/fsl-asoc-card.c
+++ b/sound/soc/fsl/fsl-asoc-card.c
@@ -531,11 +531,14 @@ static int fsl_asoc_card_probe(struct platform_device *pdev)
 	struct device_node *cpu_np, *codec_np, *asrc_np;
 	struct device_node *np = pdev->dev.of_node;
 	struct platform_device *asrc_pdev = NULL;
+	struct device_node *bitclkmaster = NULL;
+	struct device_node *framemaster = NULL;
 	struct platform_device *cpu_pdev;
 	struct fsl_asoc_card_priv *priv;
 	struct device *codec_dev = NULL;
 	const char *codec_dai_name;
 	const char *codec_dev_name;
+	unsigned int daifmt;
 	u32 width;
 	int ret;
 
@@ -667,6 +670,31 @@ static int fsl_asoc_card_probe(struct platform_device *pdev)
 		goto asrc_fail;
 	}
 
+	/* Format info from DT is optional. */
+	daifmt = snd_soc_of_parse_daifmt(np, NULL,
+					 &bitclkmaster, &framemaster);
+	daifmt &= ~SND_SOC_DAIFMT_MASTER_MASK;
+	if (bitclkmaster || framemaster) {
+		if (codec_np == bitclkmaster)
+			daifmt |= (codec_np == framemaster) ?
+				SND_SOC_DAIFMT_CBM_CFM : SND_SOC_DAIFMT_CBM_CFS;
+		else
+			daifmt |= (codec_np == framemaster) ?
+				SND_SOC_DAIFMT_CBS_CFM : SND_SOC_DAIFMT_CBS_CFS;
+
+		/* Override dai_fmt with value from DT */
+		priv->dai_fmt = daifmt;
+	}
+
+	/* Change direction according to format */
+	if (priv->dai_fmt & SND_SOC_DAIFMT_CBM_CFM) {
+		priv->cpu_priv.sysclk_dir[TX] = SND_SOC_CLOCK_IN;
+		priv->cpu_priv.sysclk_dir[RX] = SND_SOC_CLOCK_IN;
+	}
+
+	of_node_put(bitclkmaster);
+	of_node_put(framemaster);
+
 	if (!fsl_asoc_card_is_ac97(priv) && !codec_dev) {
 		dev_err(&pdev->dev, "failed to find codec device\n");
 		ret = -EPROBE_DEFER;
-- 
2.27.0


^ permalink raw reply related

* Re: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically
From: Ravi Bangoria @ 2020-07-21  3:57 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: Christophe Leroy, apopple, mikey, miltonm, peterz, oleg,
	Nicholas Piggin, linux-kernel, Paul Mackerras, jolsa, fweisbec,
	pedromfc, naveen.n.rao, linuxppc-dev, mingo, Ravi Bangoria
In-Reply-To: <CACzsE9r0acLUkV35mVxy1AEK_xObs0yz+fD6UdbNdc6uz=Buqw@mail.gmail.com>



On 7/20/20 9:12 AM, Jordan Niethe wrote:
> On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
> <ravi.bangoria@linux.ibm.com> wrote:
>>
>> So far Book3S Powerpc supported only one watchpoint. Power10 is
>> introducing 2nd DAWR. Enable 2nd DAWR support for Power10.
>> Availability of 2nd DAWR will depend on CPU_FTR_DAWR1.
>>
>> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
>> ---
>>   arch/powerpc/include/asm/cputable.h      | 4 +++-
>>   arch/powerpc/include/asm/hw_breakpoint.h | 5 +++--
>>   2 files changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>> index 3445c86e1f6f..36a0851a7a9b 100644
>> --- a/arch/powerpc/include/asm/cputable.h
>> +++ b/arch/powerpc/include/asm/cputable.h
>> @@ -633,7 +633,9 @@ enum {
>>    * Maximum number of hw breakpoint supported on powerpc. Number of
>>    * breakpoints supported by actual hw might be less than this.
>>    */
>> -#define HBP_NUM_MAX    1
>> +#define HBP_NUM_MAX    2
>> +#define HBP_NUM_ONE    1
>> +#define HBP_NUM_TWO    2
> I wonder if these defines are necessary - has it any advantage over
> just using the literal?

No, not really. Initially I had something like:

#define HBP_NUM_MAX    2
#define HBP_NUM_P8_P9  1
#define HBP_NUM_P10    2

But then I thought it's also not right. So I made it _ONE and _TWO.
Now the function that decides nr watchpoints dynamically (nr_wp_slots)
is in different file, I thought to keep it like this so it would be
easier to figure out why _MAX is 2.

>>
>>   #endif /* !__ASSEMBLY__ */
>>
>> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
>> index cb424799da0d..d4eab1694bcd 100644
>> --- a/arch/powerpc/include/asm/hw_breakpoint.h
>> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
>> @@ -5,10 +5,11 @@
>>    * Copyright 2010, IBM Corporation.
>>    * Author: K.Prasad <prasad@linux.vnet.ibm.com>
>>    */
>> -
> Was removing this line deliberate?

Nah. Will remove that hunk.

>>   #ifndef _PPC_BOOK3S_64_HW_BREAKPOINT_H
>>   #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
>>
>> +#include <asm/cpu_has_feature.h>
>> +
>>   #ifdef __KERNEL__
>>   struct arch_hw_breakpoint {
>>          unsigned long   address;
>> @@ -46,7 +47,7 @@ struct arch_hw_breakpoint {
>>
>>   static inline int nr_wp_slots(void)
>>   {
>> -       return HBP_NUM_MAX;
>> +       return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_TWO : HBP_NUM_ONE;
> So it'd be something like:
> +       return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_MAX : 1;
> But thinking that there might be more slots added in the future, it
> may be better to make the number of slots a variable that is set
> during the init and then have this function return that.

Not sure I follow. What do you mean by setting number of slots a
variable that is set during the init?

Thanks,
Ravi

^ permalink raw reply

* Re: [PATCH v2 13/16] scripts/kallsyms: move ignored symbol types to is_ignored_symbol()
From: Finn Thain @ 2020-07-21  4:00 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linuxppc-dev, Linux Kernel Mailing List,
	Linux Kbuild mailing list
In-Reply-To: <CAK7LNATVDfHJG=+G6DqkR-vSyVhb8KmxcT0ae+ukdi_otthi8A@mail.gmail.com>

On Mon, 20 Jul 2020, Masahiro Yamada wrote:

> 
> I got a similar report before.
> 
> I'd like to know whether or not
> this is the same issue as fixed by
> 7883a14339299773b2ce08dcfd97c63c199a9289
> 

The problem can be observed with 3d77e6a8804ab ("Linux 5.7").
So it appears that 7883a14339299 ("scripts/kallsyms: fix wrong 
kallsyms_relative_base") is not sufficient to fix it.

> 
> Does your problem happen on the latest kernel?

Unfortunately this cross compiler (gcc 4.6.4) is too old to build 
v5.8-rc1 or later. I will have to upgrade.

> Which version of binutils are you using?
> 

This toolchain uses binutils 2.22.

In case it helps,

$ powerpc-linux-gnu-nm -n vmlinux |head                  
         w mach_chrp
00000005 a LG_CACHELINE_BYTES
00000005 a LG_CACHELINE_BYTES
00000005 a LG_CACHELINE_BYTES
0000000c a Hash_bits
0000001f a CACHELINE_MASK
0000001f a CACHELINE_MASK
0000001f a CACHELINE_MASK
00000020 a CACHELINE_BYTES
00000020 a CACHELINE_BYTES
00000020 a CACHELINE_BYTES
00000020 a reg
0003ffc0 a Hash_msk
c0000000 T _start
c0000000 A _stext
c0000000 A _text
c000000c T __start
c0000054 t __after_mmu_off
c0000090 t turn_on_mmu
c00000c4 T __secondary_hold
c00000dc T __secondary_hold_spinloop
c00000e0 T __secondary_hold_acknowledge
c0000100 t Reset

^ permalink raw reply

* Re: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically
From: Jordan Niethe @ 2020-07-21  4:41 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: Christophe Leroy, miltonm, mikey, peterz, oleg, Nicholas Piggin,
	linux-kernel, Paul Mackerras, jolsa, fweisbec, pedromfc,
	naveen.n.rao, linuxppc-dev, mingo
In-Reply-To: <ccfcf488-0ec9-1737-8368-a848de1d72d1@linux.ibm.com>

On Tue, Jul 21, 2020 at 1:57 PM Ravi Bangoria
<ravi.bangoria@linux.ibm.com> wrote:
>
>
>
> On 7/20/20 9:12 AM, Jordan Niethe wrote:
> > On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
> > <ravi.bangoria@linux.ibm.com> wrote:
> >>
> >> So far Book3S Powerpc supported only one watchpoint. Power10 is
> >> introducing 2nd DAWR. Enable 2nd DAWR support for Power10.
> >> Availability of 2nd DAWR will depend on CPU_FTR_DAWR1.
> >>
> >> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> >> ---
> >>   arch/powerpc/include/asm/cputable.h      | 4 +++-
> >>   arch/powerpc/include/asm/hw_breakpoint.h | 5 +++--
> >>   2 files changed, 6 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
> >> index 3445c86e1f6f..36a0851a7a9b 100644
> >> --- a/arch/powerpc/include/asm/cputable.h
> >> +++ b/arch/powerpc/include/asm/cputable.h
> >> @@ -633,7 +633,9 @@ enum {
> >>    * Maximum number of hw breakpoint supported on powerpc. Number of
> >>    * breakpoints supported by actual hw might be less than this.
> >>    */
> >> -#define HBP_NUM_MAX    1
> >> +#define HBP_NUM_MAX    2
> >> +#define HBP_NUM_ONE    1
> >> +#define HBP_NUM_TWO    2
> > I wonder if these defines are necessary - has it any advantage over
> > just using the literal?
>
> No, not really. Initially I had something like:
>
> #define HBP_NUM_MAX    2
> #define HBP_NUM_P8_P9  1
> #define HBP_NUM_P10    2
>
> But then I thought it's also not right. So I made it _ONE and _TWO.
> Now the function that decides nr watchpoints dynamically (nr_wp_slots)
> is in different file, I thought to keep it like this so it would be
> easier to figure out why _MAX is 2.
>
> >>
> >>   #endif /* !__ASSEMBLY__ */
> >>
> >> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
> >> index cb424799da0d..d4eab1694bcd 100644
> >> --- a/arch/powerpc/include/asm/hw_breakpoint.h
> >> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
> >> @@ -5,10 +5,11 @@
> >>    * Copyright 2010, IBM Corporation.
> >>    * Author: K.Prasad <prasad@linux.vnet.ibm.com>
> >>    */
> >> -
> > Was removing this line deliberate?
>
> Nah. Will remove that hunk.
>
> >>   #ifndef _PPC_BOOK3S_64_HW_BREAKPOINT_H
> >>   #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
> >>
> >> +#include <asm/cpu_has_feature.h>
> >> +
> >>   #ifdef __KERNEL__
> >>   struct arch_hw_breakpoint {
> >>          unsigned long   address;
> >> @@ -46,7 +47,7 @@ struct arch_hw_breakpoint {
> >>
> >>   static inline int nr_wp_slots(void)
> >>   {
> >> -       return HBP_NUM_MAX;
> >> +       return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_TWO : HBP_NUM_ONE;
> > So it'd be something like:
> > +       return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_MAX : 1;
> > But thinking that there might be more slots added in the future, it
> > may be better to make the number of slots a variable that is set
> > during the init and then have this function return that.
>
> Not sure I follow. What do you mean by setting number of slots a
> variable that is set during the init?
Sorry I was unclear there.
I was just looking and saw arm also has a variable number of hw breakpoints.
If we did something like how they handle it, it might look something like:

static int num_wp_slots __ro_after_init;

int nr_wp_slots(void) {
    return num_wp_slots;
}

static int __init arch_hw_breakpoint_init(void) {
    num_wp_slots = work out how many wp_slots
}
arch_initcall(arch_hw_breakpoint_init);

Then we wouldn't have to calculate everytime nr_wp_slots() is called.
In the future if more wp's are added nr_wp_slots() will get more complicated.
But just an idea, feel free to ignore.

>
> Thanks,
> Ravi

^ permalink raw reply

* Re: [PATCH v3 0/4] powerpc/mm/radix: Memory unplug fixes
From: Aneesh Kumar K.V @ 2020-07-21  4:42 UTC (permalink / raw)
  To: Michael Ellerman, Nathan Lynch; +Cc: linuxppc-dev, Bharata B Rao
In-Reply-To: <87tuy1sksv.fsf@mpe.ellerman.id.au>

On 7/21/20 7:15 AM, Michael Ellerman wrote:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>>> This is the next version of the fixes for memory unplug on radix.
>>> The issues and the fix are described in the actual patches.
>>
>> I guess this isn't actually causing problems at runtime right now, but I
>> notice calls to resize_hpt_for_hotplug() from arch_add_memory() and
>> arch_remove_memory(), which ought to be mmu-agnostic:
>>
>> int __ref arch_add_memory(int nid, u64 start, u64 size,
>> 			  struct mhp_params *params)
>> {
>> 	unsigned long start_pfn = start >> PAGE_SHIFT;
>> 	unsigned long nr_pages = size >> PAGE_SHIFT;
>> 	int rc;
>>
>> 	resize_hpt_for_hotplug(memblock_phys_mem_size());
>>
>> 	start = (unsigned long)__va(start);
>> 	rc = create_section_mapping(start, start + size, nid,
>> 				    params->pgprot);
>> ...
> 
> Hmm well spotted.
> 
> That does return early if the ops are not setup:
> 
> int resize_hpt_for_hotplug(unsigned long new_mem_size)
> {
> 	unsigned target_hpt_shift;
> 
> 	if (!mmu_hash_ops.resize_hpt)
> 		return 0;
> 
> 
> And:
> 
> void __init hpte_init_pseries(void)
> {
> 	...
> 	if (firmware_has_feature(FW_FEATURE_HPT_RESIZE))
> 		mmu_hash_ops.resize_hpt = pseries_lpar_resize_hpt;
> 
> And that comes in via ibm,hypertas-functions:
> 
> 	{FW_FEATURE_HPT_RESIZE,		"hcall-hpt-resize"},
> 
> 
> But firmware is not necessarily going to add/remove that call based on
> whether we're using hash/radix.
> 


We are good there because hpte_init_pseries is only called for hash 
translation.

early_init_mmu()
-> hash__early_init_mmu
    -> hpte_init_pseries
       -> mmu_hash_ops.resize_hpt = pseries_lpar_resize_hpt;

> So I think a follow-up patch is needed to make this more robust.
> 
> Aneesh/Bharata what platform did you test this series on? I'm curious
> how this didn't break.

All the changes are tested with kvm.

-aneesh


^ permalink raw reply

* Re: [v3 02/15] KVM: PPC: Book3S HV: Cleanup updates for kvm vcpu MMCR
From: Paul Mackerras @ 2020-07-21  3:54 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: ego, mikey, maddy, kvm, kvm-ppc, svaidyan, acme, jolsa,
	linuxppc-dev
In-Reply-To: <1594996707-3727-3-git-send-email-atrajeev@linux.vnet.ibm.com>

On Fri, Jul 17, 2020 at 10:38:14AM -0400, Athira Rajeev wrote:
> Currently `kvm_vcpu_arch` stores all Monitor Mode Control registers
> in a flat array in order: mmcr0, mmcr1, mmcra, mmcr2, mmcrs
> Split this to give mmcra and mmcrs its own entries in vcpu and
> use a flat array for mmcr0 to mmcr2. This patch implements this
> cleanup to make code easier to read.

Changing the way KVM stores these values internally is fine, but
changing the user ABI is not.  This part:

> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
> index 264e266..e55d847 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -510,8 +510,8 @@ struct kvm_ppc_cpu_char {
>  
>  #define KVM_REG_PPC_MMCR0	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x10)
>  #define KVM_REG_PPC_MMCR1	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x11)
> -#define KVM_REG_PPC_MMCRA	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x12)
> -#define KVM_REG_PPC_MMCR2	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x13)
> +#define KVM_REG_PPC_MMCR2	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x12)
> +#define KVM_REG_PPC_MMCRA	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x13)

means that existing userspace programs that used to work would now be
broken.  That is not acceptable (breaking the user ABI is only ever
acceptable with a very compelling reason).  So NAK to this part of the
patch.

Regards,
Paul.

^ permalink raw reply

* Re: [PATCH v4 5/7] powerpc/iommu: Move iommu_table cleaning routine to iommu_table_clean
From: Alexey Kardashevskiy @ 2020-07-21  4:59 UTC (permalink / raw)
  To: Leonardo Bras, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Joel Stanley, Christophe Leroy,
	Thiago Jung Bauermann, Ram Pai, Brian King
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20200716071658.467820-6-leobras.c@gmail.com>



On 16/07/2020 17:16, Leonardo Bras wrote:
> Move the part of iommu_table_free() that does struct iommu_table cleaning
> into iommu_table_clean, so we can invoke it separately.
> 
> This new function is useful for cleaning struct iommu_table before
> initializing it again with a new DMA window, without having it freed and
> allocated again.
> 
> Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> ---
>  arch/powerpc/kernel/iommu.c | 30 ++++++++++++++++++------------
>  1 file changed, 18 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index 9704f3f76e63..c3242253a4e7 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -735,21 +735,10 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid,
>  	return tbl;
>  }
>  
> -static void iommu_table_free(struct kref *kref)
> +static void iommu_table_clean(struct iommu_table *tbl)


iommu_table_free() + iommu_init_table() + set_iommu_table_base() should
work too, why new helper?

There is also iommu_table_clear() which does a different thing so you
need a better name.

Second, iommu_table_free() would print a warning if any IOMMU page is in
use and it would be ok as we would only see this when hot-unplugging a
PE because we always kept the default window.

Btw you must be seeing these warnings now every time you create DDW with
these patches as at least the first page is reserved, do not you?

Since we are replacing a table for a device which is still in the
system, we should not try messing with its DMA if it already has
mappings so the warning should become an error preventing DDW. It is
rather hard to trigger in practice but I could hack a driver to ask for
32bit DMA mask first, map few pages and then ask for 64bit DMA mask, it
is not illegal, I think. So this needs a new helper - "bool
iommu_table_in_use(tbl)" - to use in enable_ddw(). Or I am overthinking
this?... Thanks,



>  {
>  	unsigned long bitmap_sz;
>  	unsigned int order;
> -	struct iommu_table *tbl;
> -
> -	tbl = container_of(kref, struct iommu_table, it_kref);
> -
> -	if (tbl->it_ops->free)
> -		tbl->it_ops->free(tbl);
> -
> -	if (!tbl->it_map) {
> -		kfree(tbl);
> -		return;
> -	}
>  
>  	iommu_table_release_pages(tbl);
>  
> @@ -763,6 +752,23 @@ static void iommu_table_free(struct kref *kref)
>  	/* free bitmap */
>  	order = get_order(bitmap_sz);
>  	free_pages((unsigned long) tbl->it_map, order);
> +}
> +
> +static void iommu_table_free(struct kref *kref)
> +{
> +	struct iommu_table *tbl;
> +
> +	tbl = container_of(kref, struct iommu_table, it_kref);
> +
> +	if (tbl->it_ops->free)
> +		tbl->it_ops->free(tbl);
> +
> +	if (!tbl->it_map) {
> +		kfree(tbl);
> +		return;
> +	}
> +
> +	iommu_table_clean(tbl);
>  
>  	/* free table */
>  	kfree(tbl);
> 

-- 
Alexey

^ permalink raw reply

* Re: [PATCH] KVM: PPC: Book3S HV: Use feature flag CPU_FTR_P9_TIDR when accessing TIDR
From: Paul Mackerras @ 2020-07-21  5:04 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev, kvm-ppc, kvm, David Gibson
In-Reply-To: <20200623165027.271215-1-clg@kaod.org>

On Tue, Jun 23, 2020 at 06:50:27PM +0200, Cédric Le Goater wrote:
> The TIDR register is only available on POWER9 systems and code
> accessing this register is not always protected by the CPU_FTR_P9_TIDR
> flag. Fix that to make sure POWER10 systems won't use it as TIDR has
> been removed.

I'm concerned about what this patch would do if we are trying to
migrate from a P9 guest to a guest on P10 in P9-compat mode, in that
the destination QEMU would get an error on doing the SET_ONE_REG for
the TIDR.  I don't think the lack of TIDR is worth failing the
migration for given that TIDR only actually does anything if you are
using an accelerator, and KVM has never supported use of accelerators
in guests.  I'm cc'ing David Gibson for his comments on the
compatibility and migration issues.

In any case, given that both move to and move from TIDR will be no-ops
on P10 (for privileged code), I don't think there is a great urgency
for this patch.

Paul.

^ permalink raw reply

* Re: [PATCH] KVM: PPC: Book3S HV: Use feature flag CPU_FTR_P9_TIDR when accessing TIDR
From: David Gibson @ 2020-07-21  5:39 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: kvm-ppc, linuxppc-dev, Cédric Le Goater, kvm
In-Reply-To: <20200721050445.GA3878639@thinks.paulus.ozlabs.org>

[-- Attachment #1: Type: text/plain, Size: 1492 bytes --]

On Tue, Jul 21, 2020 at 03:04:45PM +1000, Paul Mackerras wrote:
> On Tue, Jun 23, 2020 at 06:50:27PM +0200, Cédric Le Goater wrote:
> > The TIDR register is only available on POWER9 systems and code
> > accessing this register is not always protected by the CPU_FTR_P9_TIDR
> > flag. Fix that to make sure POWER10 systems won't use it as TIDR has
> > been removed.
> 
> I'm concerned about what this patch would do if we are trying to
> migrate from a P9 guest to a guest on P10 in P9-compat mode, in that
> the destination QEMU would get an error on doing the SET_ONE_REG for
> the TIDR.  I don't think the lack of TIDR is worth failing the
> migration for given that TIDR only actually does anything if you are
> using an accelerator, and KVM has never supported use of accelerators
> in guests.  I'm cc'ing David Gibson for his comments on the
> compatibility and migration issues.

Having thought about this a bit more, I don't think it matters.  We're
going to have to update qemu to handle POWER10 anyway.  If this causes
a problem, this would just add one small thing to whatever we need to
fix there.

> In any case, given that both move to and move from TIDR will be no-ops
> on P10 (for privileged code), I don't think there is a great urgency
> for this patch.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 2/2] KVM: PPC: Book3S HV: rework secure mem slot dropping
From: Paul Mackerras @ 2020-07-21  5:46 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: Ram Pai, linux-kernel, kvm-ppc, bharata, sathnaga, sukadev,
	linuxppc-dev, bauerman
In-Reply-To: <0588d16a-8548-0f55-1132-400807a390a1@linux.ibm.com>

On Wed, Jul 08, 2020 at 02:16:36PM +0200, Laurent Dufour wrote:
> Le 08/07/2020 à 13:25, Bharata B Rao a écrit :
> > On Fri, Jul 03, 2020 at 05:59:14PM +0200, Laurent Dufour wrote:
> > > When a secure memslot is dropped, all the pages backed in the secure device
> > > (aka really backed by secure memory by the Ultravisor) should be paged out
> > > to a normal page. Previously, this was achieved by triggering the page
> > > fault mechanism which is calling kvmppc_svm_page_out() on each pages.
> > > 
> > > This can't work when hot unplugging a memory slot because the memory slot
> > > is flagged as invalid and gfn_to_pfn() is then not trying to access the
> > > page, so the page fault mechanism is not triggered.
> > > 
> > > Since the final goal is to make a call to kvmppc_svm_page_out() it seems
> > > simpler to directly calling it instead of triggering such a mechanism. This
> > > way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a
> > > memslot.
> > 
> > Yes, this appears much simpler.
> 
> Thanks Bharata for reviewing this.
> 
> > 
> > > 
> > > Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock,
> > > the call to __kvmppc_svm_page_out() is made.
> > > As __kvmppc_svm_page_out needs the vma pointer to migrate the pages, the
> > > VMA is fetched in a lazy way, to not trigger find_vma() all the time. In
> > > addition, the mmap_sem is help in read mode during that time, not in write
> > > mode since the virual memory layout is not impacted, and
> > > kvm->arch.uvmem_lock prevents concurrent operation on the secure device.
> > > 
> > > Cc: Ram Pai <linuxram@us.ibm.com>
> > > Cc: Bharata B Rao <bharata@linux.ibm.com>
> > > Cc: Paul Mackerras <paulus@ozlabs.org>
> > > Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> > > ---
> > >   arch/powerpc/kvm/book3s_hv_uvmem.c | 54 ++++++++++++++++++++----------
> > >   1 file changed, 37 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
> > > index 852cc9ae6a0b..479ddf16d18c 100644
> > > --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
> > > +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
> > > @@ -533,35 +533,55 @@ static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
> > >    * fault on them, do fault time migration to replace the device PTEs in
> > >    * QEMU page table with normal PTEs from newly allocated pages.
> > >    */
> > > -void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> > > +void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
> > >   			     struct kvm *kvm, bool skip_page_out)
> > >   {
> > >   	int i;
> > >   	struct kvmppc_uvmem_page_pvt *pvt;
> > > -	unsigned long pfn, uvmem_pfn;
> > > -	unsigned long gfn = free->base_gfn;
> > > +	struct page *uvmem_page;
> > > +	struct vm_area_struct *vma = NULL;
> > > +	unsigned long uvmem_pfn, gfn;
> > > +	unsigned long addr, end;
> > > +
> > > +	down_read(&kvm->mm->mmap_sem);
> > 
> > You should be using mmap_read_lock(kvm->mm) with recent kernels.
> 
> Absolutely, shame on me, I reviewed Michel's series about that!
> 
> Paul, Michael, could you fix that when pulling this patch or should I sent a
> whole new series?

Given that Ram has reworked his series, I think it would be best if
you rebase on top of his new series and re-send your series.

Thanks,
Paul.

^ permalink raw reply

* ASMedia ASM2142 USB host controller tries to DMA to address zero when doing bulk reads from multiple devices
From: Forest Crossman @ 2020-07-21  5:49 UTC (permalink / raw)
  To: linux-usb; +Cc: linuxppc-dev

Hello, again!

After fixing the issue in my previous thread using this patch[1], I
decided to do some stress-testing of the controller to make sure it
could handle my intended workloads and that there were no further DMA
address issues that would need to be fixed. Unfortunately, it looks
like there's still more work to be done: when I try to do long bulk
reads from multiple devices simultaneously, eventually the host
controller sends a DMA write to address zero, which then triggers EEH
in my POWER9 system, causing the controller card to get hotplug-reset,
which of course kills the disk-reading processes. For more details on
the EEH errors, you can see my kernel's EEH message log[2]. The
results of the various tests I performed are listed below.

Test results (all failures are due to DMA writes to address zero, all
hubs are USB 3.0/3.1 Gen1 only, and all disks are accessed via the
usb-storage driver):
- Reading simultaneously from two or more disks behind a hub connected
to one port on the host controller:
  - FAIL after 20-50 GB of data transferred for each device.
- Reading simultaneously from two disks, each connected directly to
one port on the host controller:
  - FAIL after about 800 GB of data transferred for each device.
- Reading from one disk behind a hub connected to one port on the host
controller:
  - OK for at least 2.7 TB of data transferred (I didn't test the
whole 8 TB disk).
- Writing simultaneously to two FL2000 dongles (using osmo-fl2k's
"fl2k_test"), each connected directly to one port on the host
controller:
  - OK, was able to write several dozen terabytes to each device over
the course of a little over 21 hours.

Seeing how simultaneous writes to multiple devices and reads from
single devices both seem to work fine, I assume that means this is
being caused by some race condition in the host controller firmware
when it responds to multiple read requests. I also assume we're not
going to be able to convince ASMedia to both fix the bug in their
firmware and release the details on how to flash it from Linux, so I
guess we'll just have to figure out how to make the driver talk to the
controller in a way that avoids triggering the bad DMA write. As
before, I decided to try a little kernel hacking of my own before
sending this email, and tried separately enabling the
XHCI_BROKEN_STREAMS and XHCI_ASMEDIA_MODIFY_FLOWCONTROL quirks in an
attempt to fix this. As you might expect since you're reading this
message, neither of those quirks fixed the issue, nor did they even
make the transfers last any longer before failing.

So now I've reached the limits of my understanding, and I need some
help devising a fix. If anyone has any comments to that effect, or any
questions about my hardware configuration, testing methodology, etc.,
please don't hesitate to air them. Also, if anyone needs me to perform
additional tests, or collect more log information, I'd be happy to do
that as well.


Thanks in advance for your help,

Forest


[1]: https://patchwork.kernel.org/patch/11669989/
[2]: https://paste.debian.net/hidden/2a442aa6

^ permalink raw reply

* Re: [v3 12/15] powerpc/perf: Add support for outputting extended regs in perf intr_regs
From: kajoljain @ 2020-07-21  6:02 UTC (permalink / raw)
  To: Athira Rajeev, mpe, acme, jolsa
  Cc: ego, mikey, maddy, kvm, kvm-ppc, svaidyan, linuxppc-dev
In-Reply-To: <1594996707-3727-13-git-send-email-atrajeev@linux.vnet.ibm.com>



On 7/17/20 8:08 PM, Athira Rajeev wrote:
> From: Anju T Sudhakar <anju@linux.vnet.ibm.com>
> 
> Add support for perf extended register capability in powerpc.
> The capability flag PERF_PMU_CAP_EXTENDED_REGS, is used to indicate the
> PMU which support extended registers. The generic code define the mask
> of extended registers as 0 for non supported architectures.
> 
> Patch adds extended regs support for power9 platform by
> exposing MMCR0, MMCR1 and MMCR2 registers.
> 
> REG_RESERVED mask needs update to include extended regs.
> `PERF_REG_EXTENDED_MASK`, contains mask value of the supported registers,
> is defined at runtime in the kernel based on platform since the supported
> registers may differ from one processor version to another and hence the
> MASK value.
> 
> with patch
> ----------
> 
> available registers: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11
> r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26
> r27 r28 r29 r30 r31 nip msr orig_r3 ctr link xer ccr softe
> trap dar dsisr sier mmcra mmcr0 mmcr1 mmcr2
> 
> PERF_RECORD_SAMPLE(IP, 0x1): 4784/4784: 0 period: 1 addr: 0
> ... intr regs: mask 0xffffffffffff ABI 64-bit
> .... r0    0xc00000000012b77c
> .... r1    0xc000003fe5e03930
> .... r2    0xc000000001b0e000
> .... r3    0xc000003fdcddf800
> .... r4    0xc000003fc7880000
> .... r5    0x9c422724be
> .... r6    0xc000003fe5e03908
> .... r7    0xffffff63bddc8706
> .... r8    0x9e4
> .... r9    0x0
> .... r10   0x1
> .... r11   0x0
> .... r12   0xc0000000001299c0
> .... r13   0xc000003ffffc4800
> .... r14   0x0
> .... r15   0x7fffdd8b8b00
> .... r16   0x0
> .... r17   0x7fffdd8be6b8
> .... r18   0x7e7076607730
> .... r19   0x2f
> .... r20   0xc00000001fc26c68
> .... r21   0xc0002041e4227e00
> .... r22   0xc00000002018fb60
> .... r23   0x1
> .... r24   0xc000003ffec4d900
> .... r25   0x80000000
> .... r26   0x0
> .... r27   0x1
> .... r28   0x1
> .... r29   0xc000000001be1260
> .... r30   0x6008010
> .... r31   0xc000003ffebb7218
> .... nip   0xc00000000012b910
> .... msr   0x9000000000009033
> .... orig_r3 0xc00000000012b86c
> .... ctr   0xc0000000001299c0
> .... link  0xc00000000012b77c
> .... xer   0x0
> .... ccr   0x28002222
> .... softe 0x1
> .... trap  0xf00
> .... dar   0x0
> .... dsisr 0x80000000000
> .... sier  0x0
> .... mmcra 0x80000000000
> .... mmcr0 0x82008090
> .... mmcr1 0x1e000000
> .... mmcr2 0x0
>  ... thread: perf:4784
> 
> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
> [Defined PERF_REG_EXTENDED_MASK at run time to add support for different platforms ]
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---

Patch looks good to me.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>

Thanks,
Kajol Jain

>  arch/powerpc/include/asm/perf_event_server.h |  8 +++++++
>  arch/powerpc/include/uapi/asm/perf_regs.h    | 14 +++++++++++-
>  arch/powerpc/perf/core-book3s.c              |  1 +
>  arch/powerpc/perf/perf_regs.c                | 34 +++++++++++++++++++++++++---
>  arch/powerpc/perf/power9-pmu.c               |  6 +++++
>  5 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
> index 832450a..bf85d1a 100644
> --- a/arch/powerpc/include/asm/perf_event_server.h
> +++ b/arch/powerpc/include/asm/perf_event_server.h
> @@ -15,6 +15,9 @@
>  #define MAX_EVENT_ALTERNATIVES	8
>  #define MAX_LIMITED_HWCOUNTERS	2
>  
> +extern u64 PERF_REG_EXTENDED_MASK;
> +#define PERF_REG_EXTENDED_MASK	PERF_REG_EXTENDED_MASK
> +
>  struct perf_event;
>  
>  struct mmcr_regs {
> @@ -62,6 +65,11 @@ struct power_pmu {
>  	int 		*blacklist_ev;
>  	/* BHRB entries in the PMU */
>  	int		bhrb_nr;
> +	/*
> +	 * set this flag with `PERF_PMU_CAP_EXTENDED_REGS` if
> +	 * the pmu supports extended perf regs capability
> +	 */
> +	int		capabilities;
>  };
>  
>  /*
> diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
> index f599064..225c64c 100644
> --- a/arch/powerpc/include/uapi/asm/perf_regs.h
> +++ b/arch/powerpc/include/uapi/asm/perf_regs.h
> @@ -48,6 +48,18 @@ enum perf_event_powerpc_regs {
>  	PERF_REG_POWERPC_DSISR,
>  	PERF_REG_POWERPC_SIER,
>  	PERF_REG_POWERPC_MMCRA,
> -	PERF_REG_POWERPC_MAX,
> +	/* Extended registers */
> +	PERF_REG_POWERPC_MMCR0,
> +	PERF_REG_POWERPC_MMCR1,
> +	PERF_REG_POWERPC_MMCR2,
> +	/* Max regs without the extended regs */
> +	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
>  };
> +
> +#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
> +
> +/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
> +#define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) - PERF_REG_PMU_MASK)
> +
> +#define PERF_REG_MAX_ISA_300   (PERF_REG_POWERPC_MMCR2 + 1)
>  #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 31c0535..d5a9529 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -2316,6 +2316,7 @@ int register_power_pmu(struct power_pmu *pmu)
>  		pmu->name);
>  
>  	power_pmu.attr_groups = ppmu->attr_groups;
> +	power_pmu.capabilities |= (ppmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS);
>  
>  #ifdef MSR_HV
>  	/*
> diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
> index a213a0a..b0cf68f 100644
> --- a/arch/powerpc/perf/perf_regs.c
> +++ b/arch/powerpc/perf/perf_regs.c
> @@ -13,9 +13,11 @@
>  #include <asm/ptrace.h>
>  #include <asm/perf_regs.h>
>  
> +u64 PERF_REG_EXTENDED_MASK;
> +
>  #define PT_REGS_OFFSET(id, r) [id] = offsetof(struct pt_regs, r)
>  
> -#define REG_RESERVED (~((1ULL << PERF_REG_POWERPC_MAX) - 1))
> +#define REG_RESERVED (~(PERF_REG_EXTENDED_MASK | PERF_REG_PMU_MASK))
>  
>  static unsigned int pt_regs_offset[PERF_REG_POWERPC_MAX] = {
>  	PT_REGS_OFFSET(PERF_REG_POWERPC_R0,  gpr[0]),
> @@ -69,10 +71,26 @@
>  	PT_REGS_OFFSET(PERF_REG_POWERPC_MMCRA, dsisr),
>  };
>  
> +/* Function to return the extended register values */
> +static u64 get_ext_regs_value(int idx)
> +{
> +	switch (idx) {
> +	case PERF_REG_POWERPC_MMCR0:
> +		return mfspr(SPRN_MMCR0);
> +	case PERF_REG_POWERPC_MMCR1:
> +		return mfspr(SPRN_MMCR1);
> +	case PERF_REG_POWERPC_MMCR2:
> +		return mfspr(SPRN_MMCR2);
> +	default: return 0;
> +	}
> +}
> +
>  u64 perf_reg_value(struct pt_regs *regs, int idx)
>  {
> -	if (WARN_ON_ONCE(idx >= PERF_REG_POWERPC_MAX))
> -		return 0;
> +	u64 PERF_REG_EXTENDED_MAX;
> +
> +	if (cpu_has_feature(CPU_FTR_ARCH_300))
> +		PERF_REG_EXTENDED_MAX = PERF_REG_MAX_ISA_300;
>  
>  	if (idx == PERF_REG_POWERPC_SIER &&
>  	   (IS_ENABLED(CONFIG_FSL_EMB_PERF_EVENT) ||
> @@ -85,6 +103,16 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
>  	    IS_ENABLED(CONFIG_PPC32)))
>  		return 0;
>  
> +	if (idx >= PERF_REG_POWERPC_MAX && idx < PERF_REG_EXTENDED_MAX)
> +		return get_ext_regs_value(idx);
> +
> +	/*
> +	 * If the idx is referring to value beyond the
> +	 * supported registers, return 0 with a warning
> +	 */
> +	if (WARN_ON_ONCE(idx >= PERF_REG_EXTENDED_MAX))
> +		return 0;
> +
>  	return regs_get_register(regs, pt_regs_offset[idx]);
>  }
>  
> diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
> index 05dae38..2a57e93 100644
> --- a/arch/powerpc/perf/power9-pmu.c
> +++ b/arch/powerpc/perf/power9-pmu.c
> @@ -90,6 +90,8 @@ enum {
>  #define POWER9_MMCRA_IFM3		0x00000000C0000000UL
>  #define POWER9_MMCRA_BHRB_MASK		0x00000000C0000000UL
>  
> +extern u64 PERF_REG_EXTENDED_MASK;
> +
>  /* Nasty Power9 specific hack */
>  #define PVR_POWER9_CUMULUS		0x00002000
>  
> @@ -434,6 +436,7 @@ static void power9_config_bhrb(u64 pmu_bhrb_filter)
>  	.cache_events		= &power9_cache_events,
>  	.attr_groups		= power9_pmu_attr_groups,
>  	.bhrb_nr		= 32,
> +	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
>  };
>  
>  int init_power9_pmu(void)
> @@ -457,6 +460,9 @@ int init_power9_pmu(void)
>  		}
>  	}
>  
> +	/* Set the PERF_REG_EXTENDED_MASK here */
> +	PERF_REG_EXTENDED_MASK = PERF_REG_PMU_MASK_300;
> +
>  	rc = register_power_pmu(&power9_pmu);
>  	if (rc)
>  		return rc;
> 

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox