LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH v10 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
From: Dan Williams @ 2020-06-05 19:49 UTC
  To: Ira Weiny
  Cc: Santosh Sivaraj, linux-nvdimm, Linux Kernel Mailing List,
	Steven Rostedt, Oliver O'Halloran, Aneesh Kumar K . V,
	Vaibhav Jain, linuxppc-dev
In-Reply-To: <20200605171313.GO1505637@iweiny-DESK2.sc.intel.com>

On Fri, Jun 5, 2020 at 10:13 AM Ira Weiny <ira.weiny@intel.com> wrote:
>
> On Fri, Jun 05, 2020 at 05:11:34AM +0530, Vaibhav Jain wrote:
> > Since papr_scm_ndctl() can be called from outside papr_scm, its
> > exposed to the possibility of receiving NULL as value of 'cmd_rc'
> > argument. This patch updates papr_scm_ndctl() to protect against such
> > possibility by assigning it pointer to a local variable in case cmd_rc
> > == NULL.
> >
> > Finally the patch also updates the 'default' clause of the switch-case
> > block removing a 'return' statement thereby ensuring that value of
> > 'cmd_rc' is always logged when papr_scm_ndctl() returns.
> >
> > Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> > ---
> > Changelog:
> >
> > v9..v10
> > * New patch in the series
>
> Thanks for making this a separate patch; it is easier to see what is going on
> here.
>
> > ---
> >  arch/powerpc/platforms/pseries/papr_scm.c | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> > index 0c091622b15e..6512fe6a2874 100644
> > --- a/arch/powerpc/platforms/pseries/papr_scm.c
> > +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> > @@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >  {
> >       struct nd_cmd_get_config_size *get_size_hdr;
> >       struct papr_scm_priv *p;
> > +     int rc;
> >
> >       /* Only dimm-specific calls are supported atm */
> >       if (!nvdimm)
> >               return -EINVAL;
> >
> > +     /* Use a local variable in case cmd_rc pointer is NULL */
> > +     if (!cmd_rc)
> > +             cmd_rc = &rc;
> > +
>
> This protects you from the NULL.  However...
>
> >       p = nvdimm_provider_data(nvdimm);
> >
> >       switch (cmd) {
> > @@ -381,12 +386,13 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >               break;
> >
> >       default:
> > -             return -EINVAL;
> > +             dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
> > +             *cmd_rc = -EINVAL;
>
> ... I think you are conflating rc and cmd_rc...
>
> >       }
> >
> >       dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
> >
> > -     return 0;
> > +     return *cmd_rc;
>
> ... this changes the behavior of the current commands.  Now if the underlying
> papr_scm_meta_[get|set]() fails you return that failure as rc rather than 0.
>
> Is that ok?

The expectation is that rc is "did the command get sent to the device,
or did it fail for 'transport' reasons". The role of cmd_rc is to
translate the specific status response of the command into a common
error code. The expectations are:

rc < 0: Error code, Linux terminated the ioctl before talking to hardware

rc == 0: Linux successfully submitted the command to hardware, cmd_rc
is valid for command specific response

rc > 0: Linux successfully submitted the command, but detected that
only a subset of the data was accepted for "write"-style commands, or
that only a subset of the data was returned for "read"-style commands,
i.e. short-write / short-read semantics. cmd_rc is valid in this case
and it's up to userspace to determine if a short transfer is an error
or not.
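
The three cases above can be sketched as a small userspace C function.
This is only an illustration under stated assumptions: demo_ndctl(), the
demo_dev_status codes, and the choice to report the untransferred byte
count as the positive rc are all hypothetical and are not taken from
papr_scm or libnvdimm.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device status codes, for illustration only. */
enum demo_dev_status { DEMO_DEV_OK = 0, DEMO_DEV_BUSY = 1 };

/*
 * Returns the transport result (rc) and stores the translated,
 * command-specific status in *cmd_rc, which is only meaningful
 * when the return value is >= 0.
 */
int demo_ndctl(int cmd_known, enum demo_dev_status status,
	       size_t requested, size_t transferred, int *cmd_rc)
{
	int unused_rc;

	/* Tolerate callers that do not care about cmd_rc. */
	if (!cmd_rc)
		cmd_rc = &unused_rc;

	/* rc < 0: terminated before talking to the device; cmd_rc untouched. */
	if (!cmd_known)
		return -EINVAL;

	/* The command reached the device: translate its status to an errno. */
	*cmd_rc = (status == DEMO_DEV_OK) ? 0 : -EBUSY;

	/* rc > 0: short transfer (assumed here: number of bytes not moved). */
	if (transferred < requested)
		return (int)(requested - transferred);

	/* rc == 0: command submitted in full. */
	return 0;
}
```

A caller would check rc first and only then look at cmd_rc: rc < 0 is a
submission failure, while rc >= 0 with cmd_rc < 0 is a failure reported
by the device itself.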

> Also 'logging cmd_rc' in the invalid cmd case does not seem quite right unless
> you really want rc to be cmd_rc.
>
> The architecture is designed to separate errors which occur in the kernel vs
> errors in the firmware/dimm.  Are they always the same?  The current code
> differentiates them.

Yeah, they're distinct, transport vs end-point / command-specific
status returns.


* Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.8-1 tag
From: Linus Torvalds @ 2020-06-05 19:01 UTC
  To: Michael Ellerman
  Cc: ego, emmanuel.nicolet, chenzhou10, jniethe5, linuxram, kernelfans,
	Linux Kernel Mailing List, st5pub, Oliver O'Halloran, huhai,
	Markus Elfring, rzinsly, leobras.c, mikey, Herbert Xu,
	Aneesh Kumar K.V, haren, michal.simek, mahesh, Takashi Iwai,
	kjain, leonardo, Naveen N. Rao, Ravi Bangoria, ajd, Arnd Bergmann,
	Stephen Rothwell, alistair, Nick Piggin, wangxiongfeng2, Qian Cai,
	clg, Nathan Chancellor, hbathini, Christophe Leroy, geoff,
	Dmitry Torokhov, Gustavo A. R. Silva, wsa, sbobroff, fbarrat,
	Christophe JAILLET, Andrew Morton, linuxppc-dev
In-Reply-To: <87eeqth3hi.fsf@mpe.ellerman.id.au>

On Fri, Jun 5, 2020 at 9:38 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> I've pushed the result of my resolution of the conflicts to the powerpc/merge
> branch, if you want to look at that, though I've also tried to describe it in
> full below.

I ended up doing the machine_check_exception() differently, because I
felt the code itself was done wrong and I wanted to add a note about
that.

Having the same function have completely different semantics depending
on a platform issue is just fundamentally wrong: it not only makes for
fragile code, but also means that you can't do single-image kernels.

It should be two different functions, possibly just

   non_nmi_fn() { ... }

   nmi_fn() { nmi_enter(); non_nmi_fn(); nmi_exit(); }

and now you don't have odd rules for the same function that depend on
how the platform happens to call it.
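
A minimal userspace sketch of that split, assuming stubbed-out stand-ins
for nmi_enter()/nmi_exit() so that it builds outside the kernel. All
demo_* names are made up for illustration and this is not the code that
was merged.

```c
/* Stand-ins for the kernel's NMI accounting; hypothetical stubs. */
int demo_nmi_depth;   /* current nesting depth */
int demo_depth_seen;  /* depth observed by the most recent handler run */
int demo_calls;       /* number of times the common body ran */

void demo_nmi_enter(void) { demo_nmi_depth++; }
void demo_nmi_exit(void)  { demo_nmi_depth--; }

/* Common body: identical semantics no matter who calls it. */
void demo_mce_handler(void)
{
	demo_depth_seen = demo_nmi_depth;
	demo_calls++;
}

/* NMI entry point: the only place that touches NMI state. */
void demo_mce_handler_nmi(void)
{
	demo_nmi_enter();
	demo_mce_handler();
	demo_nmi_exit();
}
```

Each platform then wires up whichever entry point matches how it
delivers the exception, and neither function needs a config check.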

I didn't do the above. I did something that looked like the old code,
but had a comment. Oh well.

But thanks for describing the merge, I'd have missed the place where
there was a new use of pgd_offset().

..and then when I actually compared whether I otherwise got the same
result as you, I realized that this all depends on the module tree.

I'll go merge that first, and then re-do this all. Oh well.

               Linus


* Re: [PATCH v10 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
From: Ira Weiny @ 2020-06-05 18:36 UTC
  To: Vaibhav Jain
  Cc: Santosh Sivaraj, linux-nvdimm, linux-kernel, Steven Rostedt,
	Oliver O'Halloran, Aneesh Kumar K . V, Dan Williams,
	linuxppc-dev
In-Reply-To: <20200604234136.253703-7-vaibhav@linux.ibm.com>

On Fri, Jun 05, 2020 at 05:11:36AM +0530, Vaibhav Jain wrote:
> This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
> that returns a newly introduced 'struct nd_papr_pdsm_health' instance
> containing dimm health information back to user space in response to
> ND_CMD_CALL. This functionality is implemented in newly introduced
> papr_pdsm_health() that queries the nvdimm health information and
> then copies this information to the package payload whose layout is
> defined by 'struct nd_papr_pdsm_health'.
> 
> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
> Changelog:
> 
> v9..v10:
> * Removed code in papr_pdsm_health that performed validation on pdsm
>   payload version and corresponding struct and defines used for
>   validation of payload version.
> * Dropped usage of struct papr_pdsm_health in 'struct
>   papr_scm_priv'. Instead papr_psdm_health() now uses
>   'papr_scm_priv.health_bitmap' to populate the pdsm payload.
> * Above change also fixes the problem where this patch was removing
>   the code that was previously introduced in this patch-series.
>   [ Ira ]
> * Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the
>   space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct
>   nd_cmd_pkg'. This def is useful in validating payload sizes.
> * Reworked papr_pdsm_health() to enforce a specific payload size for
>   'PAPR_PDSM_HEALTH' pdsm request.
> 
> Resend:
> * Added ack from Aneesh.
> 
> v8..v9:
> * s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
> * s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
> * Renamed papr_scm_get_health() to papr_psdm_health()
> * Updated patch description to replace papr-scm dimm with nvdimm.
> 
> v7..v8:
> * None
> 
> Resend:
> * None
> 
> v6..v7:
> * Updated flags_show() to use seq_buf_printf(). [Mpe]
> * Updated papr_scm_get_health() to use newly introduced
>   __drc_pmem_query_health() bypassing the cache [Mpe].
> 
> v5..v6:
> * Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
>   guard against possibility of different compilers adding different
>   paddings to the struct [ Dan Williams ]
> 
> * Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
>   'bool' and also updated drc_pmem_query_health() to take this into
>   account. [ Dan Williams ]
> 
> v4..v5:
> * None
> 
> v3..v4:
> * Call the DSM_PAPR_SCM_HEALTH service function from
>   papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]
> 
> v2..v3:
> * Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
>   as its exported to the userspace [Aneesh]
> * Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
>   from enum to #defines [Aneesh]
> 
> v1..v2:
> * New patch in the series
> ---
>  arch/powerpc/include/uapi/asm/papr_pdsm.h | 33 +++++++++++
>  arch/powerpc/platforms/pseries/papr_scm.c | 70 +++++++++++++++++++++++
>  2 files changed, 103 insertions(+)
> 
> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> index 8b1a4f8fa316..c4c990ede5d4 100644
> --- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> @@ -71,12 +71,17 @@ struct nd_pdsm_cmd_pkg {
>  	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
>  } __packed;
>  
> +/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' */
> +#define ND_PDSM_ENVELOPE_HDR_SIZE \
> +	(sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg))
> +

This is kind of a weird name for this.

Isn't this just the ND PDSM header size?  What does 'envelope' mean here?

>  /*
>   * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
>   * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
>   */
>  enum papr_pdsm {
>  	PAPR_PDSM_MIN = 0x0,
> +	PAPR_PDSM_HEALTH,
>  	PAPR_PDSM_MAX,
>  };
>  
> @@ -95,4 +100,32 @@ static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
>  		return (void *)(pcmd->payload);
>  }
>  
> +/* Various nvdimm health indicators */
> +#define PAPR_PDSM_DIMM_HEALTHY       0
> +#define PAPR_PDSM_DIMM_UNHEALTHY     1
> +#define PAPR_PDSM_DIMM_CRITICAL      2
> +#define PAPR_PDSM_DIMM_FATAL         3
> +
> +/*
> + * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
> + * Various flags indicate the health status of the dimm.
> + *
> + * dimm_unarmed		: Dimm not armed. So contents wont persist.
> + * dimm_bad_shutdown	: Previous shutdown did not persist contents.
> + * dimm_bad_restore	: Contents from previous shutdown werent restored.
> + * dimm_scrubbed	: Contents of the dimm have been scrubbed.
> + * dimm_locked		: Contents of the dimm cant be modified until CEC reboot
> + * dimm_encrypted	: Contents of dimm are encrypted.
> + * dimm_health		: Dimm health indicator. One of PAPR_PDSM_DIMM_XXXX
> + */
> +struct nd_papr_pdsm_health {
> +	__u8 dimm_unarmed;
> +	__u8 dimm_bad_shutdown;
> +	__u8 dimm_bad_restore;
> +	__u8 dimm_scrubbed;
> +	__u8 dimm_locked;
> +	__u8 dimm_encrypted;
> +	__u16 dimm_health;
> +} __packed;
> +
>  #endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index 05eb56ecab5e..984942be24c1 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -421,6 +421,72 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
>  	return 0;
>  }
>  
> +/* Fetch the DIMM health info and populate it in provided package. */
> +static int papr_pdsm_health(struct papr_scm_priv *p,
> +			    struct nd_pdsm_cmd_pkg *pkg)
> +{
> +	int rc;
> +	struct nd_papr_pdsm_health health = { 0 };
> +	u16 copysize = sizeof(struct nd_papr_pdsm_health);
> +	u16 payload_size = pkg->hdr.nd_size_out - ND_PDSM_ENVELOPE_HDR_SIZE;
> +
> +	/* Ensure correct payload size that can hold struct nd_papr_pdsm_health */
> +	if (payload_size != copysize) {
> +		dev_dbg(&p->pdev->dev,
> +			"Unexpected payload-size (%u). Expected (%u)",
> +			pkg->hdr.nd_size_out, copysize);
> +		rc = -ENOSPC;
> +		goto out;
> +	}
> +
> +	/* Ensure dimm health mutex is taken preventing concurrent access */
> +	rc = mutex_lock_interruptible(&p->health_mutex);
> +	if (rc)
> +		goto out;
> +
> +	/* Always fetch upto date dimm health data ignoring cached values */
> +	rc = __drc_pmem_query_health(p);
> +	if (rc) {
> +		mutex_unlock(&p->health_mutex);
> +		goto out;
> +	}
> +
> +	/* update health struct with various flags derived from health bitmap */
> +	health = (struct nd_papr_pdsm_health) {
> +		.dimm_unarmed = p->health_bitmap & PAPR_PMEM_UNARMED_MASK,
> +		.dimm_bad_shutdown = p->health_bitmap & PAPR_PMEM_BAD_SHUTDOWN_MASK,
> +		.dimm_bad_restore = p->health_bitmap & PAPR_PMEM_BAD_RESTORE_MASK,
> +		.dimm_encrypted = p->health_bitmap & PAPR_PMEM_ENCRYPTED,
> +		.dimm_locked = p->health_bitmap & PAPR_PMEM_SCRUBBED_AND_LOCKED,
> +		.dimm_scrubbed = p->health_bitmap & PAPR_PMEM_SCRUBBED_AND_LOCKED,

Are you sure these work?  These are not assignments to a bool so I don't think
gcc will do what you want here.
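
To illustrate the concern, here is a small userspace sketch. It assumes
a flag that lives above bit 7 of a 64-bit bitmap; DEMO_MASK_HIGH and the
demo_* helpers are hypothetical and do not use the real PAPR_PMEM_*
mask values.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MASK_HIGH (1ULL << 40) /* hypothetical flag above bit 7 */

/* Storing the masked value in a u8 keeps only the low 8 bits. */
uint8_t demo_flag_truncating(uint64_t bitmap)
{
	/* The cast spells out the narrowing an assignment would do implicitly. */
	return (uint8_t)(bitmap & DEMO_MASK_HIGH);
}

/* Conversion to bool yields 1 for any non-zero value. */
bool demo_flag_bool(uint64_t bitmap)
{
	return bitmap & DEMO_MASK_HIGH;
}

/* Normalizing with !! first makes the u8 version behave like bool. */
uint8_t demo_flag_normalized(uint64_t bitmap)
{
	return !!(bitmap & DEMO_MASK_HIGH);
}
```

With the flag set, the truncating variant reports 0 while the bool and
'!!' variants report 1, so a __u8 field needs the explicit
normalization.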

Ira

> +		.dimm_health = PAPR_PDSM_DIMM_HEALTHY,
> +	};
> +
> +	/* Update field dimm_health based on health_bitmap flags */
> +	if (p->health_bitmap & PAPR_PMEM_HEALTH_FATAL)
> +		health.dimm_health = PAPR_PDSM_DIMM_FATAL;
> +	else if (p->health_bitmap & PAPR_PMEM_HEALTH_CRITICAL)
> +		health.dimm_health = PAPR_PDSM_DIMM_CRITICAL;
> +	else if (p->health_bitmap & PAPR_PMEM_HEALTH_UNHEALTHY)
> +		health.dimm_health = PAPR_PDSM_DIMM_UNHEALTHY;
> +
> +	/* struct populated hence can release the mutex now */
> +	mutex_unlock(&p->health_mutex);
> +
> +	dev_dbg(&p->pdev->dev, "Copying payload size=%u\n", copysize);
> +
> +	/* Copy the health struct to the payload */
> +	memcpy(pdsm_cmd_to_payload(pkg), &health, copysize);
> +
> +	/* Update fw size including size of struct nd_pdsm_cmd_pkg fields */
> +	pkg->hdr.nd_fw_size = copysize + ND_PDSM_ENVELOPE_HDR_SIZE;
> +
> +out:
> +	dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
> +
> +	return rc;
> +}
> +
>  /*
>   * For a given pdsm request call an appropriate service function.
>   * Note: Use 'nd_pdsm_cmd_pkg.cmd_status to report psdm servicing errors. Hence
> @@ -435,6 +501,10 @@ static void papr_scm_service_pdsm(struct papr_scm_priv *p,
>  
>  	/* Call pdsm service function */
>  	switch (pdsm) {
> +	case PAPR_PDSM_HEALTH:
> +		pkg->cmd_status = papr_pdsm_health(p, pkg);
> +		break;
> +
>  	default:
>  		dev_dbg(&p->pdev->dev, "PDSM[0x%x]: Unsupported PDSM request\n",
>  			pdsm);
> -- 
> 2.26.2
> 


* Re: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
From: Dan Williams @ 2020-06-05 18:19 UTC
  To: Vaibhav Jain
  Cc: Santosh Sivaraj, Weiny, Ira, linux-nvdimm@lists.01.org,
	Aneesh Kumar K . V, linux-kernel@vger.kernel.org, Steven Rostedt,
	Oliver O'Halloran, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <873679h72g.fsf@linux.ibm.com>

On Fri, Jun 5, 2020 at 8:22 AM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
[..]
> > Oh, why not define a maximal health payload with all the attributes
> > you know about today, leave some room for future expansion, and then
> > report a validity flag for each attribute? This is how the "intel"
> > smart-health payload works. If they ever needed to extend the payload
> > they would increase the size and add more validity flags. Old
> > userspace never groks the new fields, new userspace knows to ask for
> > and parse the larger payload.
> >
> > See the flags field in 'struct nd_intel_smart' (in ndctl) and the
> > translation of those flags to ndctl generic attribute flags
> > intel_cmd_smart_get_flags().
> >
> > In general I'd like ndctl to understand the superset of all health
> > attributes across all vendors. For the truly vendor specific ones it
> > would mean that the health flags with a specific "papr_scm" back-end
> > just would never be set on an "intel" device. I.e. look at the "hpe"
> > and "msft" health backends. They only set a subset of the valid flags
> > that could be reported.
>
> Thanks, this sounds good. In fact the papr_scm implementation in ndctl
> advertises support for only a subset of ND_SMART_* flags right now.
>
> Using 'flags' instead of 'version' was indeed discussed during
> v7..v9. However re-looking at the 'msft' and 'hpe' implementations the
> approach of maximal health payload tagged with a flags field looks more
> intuitive and I would prefer implementing this scheme in this patch-set.
>
> The current set of health data exchanged between libndctl and
> papr_scm via 'struct nd_papr_pdsm_health' (e.g. various health status
> bits, nvdimm arming status etc.) is guaranteed to be always available,
> hence associating its availability with a flag won't be of much use as
> the flag will always be set.
>
> However as you suggested, extending the 'struct nd_papr_pdsm_health' in
> future to accommodate new attributes like 'life-remaining' can be done
> via adding them to the end of the struct and setting a flag field to
> indicate its presence.
>
> So I have the following proposal:
> * Add a new '__u32 extension_flags' field at beginning of 'struct
>   nd_papr_pdsm_health'
> * Set the size of the struct to 184-bytes which is the maximum possible
>   size for a pdsm payload.
> * 'papr_scm' kernel driver will currently set 'extension_flags' to 0
>   indicating no extension fields.
>
> * A future patch that adds support for 'life-remaining' adds the new field
>   at the end of the known fields in 'struct nd_papr_pdsm_health'.
> * When provided to the papr_scm kernel module, if 'life-remaining' data is
>   available it is populated and the corresponding flag is set in the
>   'extension_flags' field, indicating its presence.
> * When received by the libndctl papr_scm implementation, it tests if the
>   extension_flags have the associated 'life-remaining' flag set and, if so,
>   returns the ND_SMART_USED_VALID flag from
>   ndctl_cmd_smart_get_flags().
>
> Implementing first 3 items above in the current patchset should be
> fairly trivial.
>
> Does that sound reasonable?

This sounds good to me.
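
For reference, the layout proposed above could look roughly like the
sketch below. It is only a guess at the shape under stated assumptions:
struct demo_pdsm_health, DEMO_EXT_LIFE_REMAINING, the 16-bit
life_remaining field, and the reserved padding are hypothetical names
and types, and the 184-byte size is taken from the proposal. The packed
attribute is the GCC/Clang spelling.

```c
#include <stdint.h>

#define DEMO_PDSM_PAYLOAD_MAX   184        /* size quoted in the proposal */
#define DEMO_EXT_LIFE_REMAINING (1u << 0)  /* hypothetical validity flag */

struct demo_pdsm_health {
	uint32_t extension_flags;   /* which extension fields are valid */
	uint8_t  dimm_unarmed;
	uint8_t  dimm_bad_shutdown;
	uint8_t  dimm_bad_restore;
	uint8_t  dimm_scrubbed;
	uint8_t  dimm_locked;
	uint8_t  dimm_encrypted;
	uint16_t dimm_health;
	uint16_t life_remaining;    /* hypothetical future extension field */
	uint8_t  reserved[DEMO_PDSM_PAYLOAD_MAX - 14]; /* room to grow */
} __attribute__((packed));

_Static_assert(sizeof(struct demo_pdsm_health) == DEMO_PDSM_PAYLOAD_MAX,
	       "payload must stay at the fixed maximum size");

/* Returns 1 and fills *out only if the producer marked the field valid. */
int demo_get_life_remaining(const struct demo_pdsm_health *h, uint16_t *out)
{
	if (!(h->extension_flags & DEMO_EXT_LIFE_REMAINING))
		return 0;
	*out = h->life_remaining;
	return 1;
}
```

Older userspace that never looks at extension_flags still reads the
always-valid fields at the same offsets, while newer userspace checks
the flag before trusting an extension field.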


* Re: [musl] Re: ppc64le and 32-bit LE userland compatibility
From: Rich Felker @ 2020-06-05 17:50 UTC
  To: Segher Boessenkool
  Cc: libc-alpha, eery, Daniel Kolesa, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, Michal Suchánek,
	linuxppc-dev, Joseph Myers
In-Reply-To: <20200605172702.GP31009@gate.crashing.org>

On Fri, Jun 05, 2020 at 12:27:02PM -0500, Segher Boessenkool wrote:
> On Fri, Jun 05, 2020 at 04:18:18AM +0200, Daniel Kolesa wrote:
> > On Fri, Jun 5, 2020, at 01:35, Segher Boessenkool wrote:
> > > > The thing is, I've yet to see in which way the ELFv2 ABI *actually* requires VSX - I don't think compiling for 970 introduces any actual differences. There will be omissions, yes - but then the more accurate thing would be to say that a subset of ELFv2 is used, rather than it being a different ABI per se.
> > > 
> > > Two big things are that binaries that someone else made are supposed to
> > > work for you as well -- including binaries using VSX registers, or any
> > > instructions that require ISA 2.07 (or some older ISA after 970).  This
> > > includes DSOs (shared libraries).  So for a distribution this means that
> > > they will not use VSX *anywhere*, or only in very specialised things.
> > > That is a many-years setback, for people/situations where it could be
> > > used.
> > 
> > Third party precompiled stuff doesn't really need to concern us, since none really exists.
> 
> .... Yet.  And if you claim you support ELFv2, not mentioning the ways
> your implementation deviates from it, users will be unhappy.
> 
> > It's also still an upgrade over ELFv1 regardless (I mean, the same things apply there).
> 
> Yeah, in mostly minor ways, but it all adds up for sure.
> 
> > I'm also not really all that convinced that vectors make a huge difference in non-specialized code (autovectorization still has a way to go)
> 
> They do make a huge difference, depending on the application of course.
> But VSX is not just vectors even: it also gives you twice as many
> floating point scalars (64 now), and in newer versions of the ISA it can
> be beneficially used for integer scalars even.

Vectorization is useful for a lot of things, and I'm sure there are
specialized workloads that benefit from 64 scalars, but I've never
encountered a place where having more than 16 registers made a
practical difference.

The fact that there are specialized areas where this stuff matters
does not imply there aren't huge domains where it's completely
irrelevant.

> > and code written to use vector instructions should probably check
> > auxval and take those paths at runtime.
> 
> No, that is exactly the point of requiring ISA 2.07.  Anything can use
> ISA 2.07 (incl. VSX) without checking first, and without having a
> fallback to some other implementation.  Going from ISA 2.01 to 2.07 is
> more than a decade of improvements, it is not trivial at all.

This only affects code that's non-portable and PPC-specific, which a
lot of people have no interest in and don't care about. Any portable
code is going to either only use vectors via the compiler's choice to
vectorize or conditionally on being one of a set of supported targets
with a vector ISA it supports available. Anyone building for a target
that doesn't have them just gets the portable version of the code.
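
As a rough sketch of the runtime check mentioned in the quoted text,
code that wants an optional VSX path on Linux can test the AT_HWCAP
auxiliary vector entry. The demo_* function names are hypothetical; the
fallback definition of PPC_FEATURE_HAS_VSX mirrors the value in the
kernel's powerpc uapi cputable.h and is only there so the snippet
builds on other architectures, where the result is not meaningful.

```c
#include <sys/auxv.h>

#ifndef PPC_FEATURE_HAS_VSX
#define PPC_FEATURE_HAS_VSX 0x00000080 /* fallback for non-powerpc builds */
#endif

/* Pure helper: does this AT_HWCAP word advertise VSX? */
int demo_hwcap_has_vsx(unsigned long hwcap)
{
	return (hwcap & PPC_FEATURE_HAS_VSX) != 0;
}

/* Queries the running process's auxiliary vector (Linux, glibc or musl). */
int demo_cpu_has_vsx(void)
{
	return demo_hwcap_has_vsx(getauxval(AT_HWCAP));
}
```

A library would call demo_cpu_has_vsx() once at startup and select
either the vector or the portable implementation based on the result.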

I think a lot of the unnecessary fighting on this topic is arising
from differences of opinion over what an ABI entails. I would call
what you're talking about a "platform" and more of a platform-specific
*API* than an ABI -- it's about guarantees of interfaces available to
the programmer, not implementation details of linkage.

Rich


* Re: [musl] Re: ppc64le and 32-bit LE userland compatibility
From: Segher Boessenkool @ 2020-06-05 17:27 UTC
  To: Daniel Kolesa
  Cc: Rich Felker, libc-alpha, eery, musl, Will Springer,
	Palmer Dabbelt via binutils, via libc-dev, Michal Suchánek,
	linuxppc-dev, Joseph Myers
In-Reply-To: <17459c98-3bd3-4a5d-a828-993b6deef44f@www.fastmail.com>

On Fri, Jun 05, 2020 at 04:18:18AM +0200, Daniel Kolesa wrote:
> On Fri, Jun 5, 2020, at 01:35, Segher Boessenkool wrote:
> > > The thing is, I've yet to see in which way the ELFv2 ABI *actually* requires VSX - I don't think compiling for 970 introduces any actual differences. There will be omissions, yes - but then the more accurate thing would be to say that a subset of ELFv2 is used, rather than it being a different ABI per se.
> > 
> > Two big things are that binaries that someone else made are supposed to
> > work for you as well -- including binaries using VSX registers, or any
> > instructions that require ISA 2.07 (or some older ISA after 970).  This
> > includes DSOs (shared libraries).  So for a distribution this means that
> > they will not use VSX *anywhere*, or only in very specialised things.
> > That is a many-years setback, for people/situations where it could be
> > used.
> 
> Third party precompiled stuff doesn't really need to concern us, since none really exists.

... Yet.  And if you claim you support ELFv2, not mentioning the ways
your implementation deviates from it, users will be unhappy.

> It's also still an upgrade over ELFv1 regardless (I mean, the same things apply there).

Yeah, in mostly minor ways, but it all adds up for sure.

> I'm also not really all that convinced that vectors make a huge difference in non-specialized code (autovectorization still has a way to go)

They do make a huge difference, depending on the application of course.
But VSX is not just vectors even: it also gives you twice as many
floating point scalars (64 now), and in newer versions of the ISA it can
be beneficially used for integer scalars even.

> and code written to use vector instructions should probably check auxval and take those paths at runtime.

No, that is exactly the point of requiring ISA 2.07.  Anything can use
ISA 2.07 (incl. VSX) without checking first, and without having a
fallback to some other implementation.  Going from ISA 2.01 to 2.07 is
more than a decade of improvements, it is not trivial at all.


> As for other instructions, fair enough, but from my rough testing, it doesn't make such a massive difference for average case

That depends on what you call the average case.  Code that is control
and memory-bound will not benefit much from *anything* :-)

> (and where it does, one can always rebuild their thing with CFLAGS=-mcpu=power9)

Yeah, but it helps quite a bit if your system (shared) libraries get all
improvements they can as well.


I'm not trying to dissuade you from not requiring VSX and 2.07 -- this
sounds like your best option, given the constraints.  I'm just saying
the cost is not trivial (even ignoring the ABI divergence).


> > The target name allows to make such distinctions: this could for example
> > be  powerpc64-*-linux-void  (maybe I put the distinction in the wrong
> > part of the name here?  The glibc people will know better, and "void" is
> > probably not a great name anyway).
> 
> Hm, I'm not a huge fan of putting ABI specifics in the triplet, it feels wrong - there is no precedent for it with POWER (ARM did it with EABI though),

Maybe look at what the various BSDs use?  We do have things like this.

> the last part should remain 'gnu' as it's still glibc; besides, gcc is compiled for exactly one target triplet, and traditionally with ppc compilers it's always been possible to target everything with just one compiler (endian, 32bit, 64bit, abi...).

This isn't completely true.

Yes, the compiler allows you to change word size, endianness, ABI, some
more things.  That does not mean you can actually build working binaries
for all resulting combinations.  As a trivial example, it will still
pick up the same libraries from the same library paths usually, and
those will spectacularly fail to work.

We are biarch for some targets, which means that both powerpc-linux
targets and powerpc64-linux targets can actually handle both of those,
with just -m32 or -m64 needed to switch which configuration is used.
But you cannot magically transparently switch to many other
configurations: for those, you just build a separate toolchain for that
specific (variant) configuration, in the general case.

> The best way would probably be adding a new -mabi, e.g. -mabi=elfv2-novsx (just an example), which would behave exactly like -mabi=elfv2, except it'd emit some extra detection macro

Yeah, that sounds like a good idea.  Patches welcome :-)

(A separate target name is still needed, but this will make development
simpler for sure).


Segher


* Re: [PATCH v10 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
From: Ira Weiny @ 2020-06-05 17:13 UTC
  To: Vaibhav Jain
  Cc: Santosh Sivaraj, linux-nvdimm, linux-kernel, Steven Rostedt,
	Oliver O'Halloran, Aneesh Kumar K . V, Dan Williams,
	linuxppc-dev
In-Reply-To: <20200604234136.253703-5-vaibhav@linux.ibm.com>

On Fri, Jun 05, 2020 at 05:11:34AM +0530, Vaibhav Jain wrote:
> Since papr_scm_ndctl() can be called from outside papr_scm, its
> exposed to the possibility of receiving NULL as value of 'cmd_rc'
> argument. This patch updates papr_scm_ndctl() to protect against such
> possibility by assigning it pointer to a local variable in case cmd_rc
> == NULL.
> 
> Finally the patch also updates the 'default' clause of the switch-case
> block removing a 'return' statement thereby ensuring that value of
> 'cmd_rc' is always logged when papr_scm_ndctl() returns.
> 
> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
> Changelog:
> 
> v9..v10
> * New patch in the series

Thanks for making this a separate patch; it is easier to see what is going on
here.

> ---
>  arch/powerpc/platforms/pseries/papr_scm.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index 0c091622b15e..6512fe6a2874 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
>  {
>  	struct nd_cmd_get_config_size *get_size_hdr;
>  	struct papr_scm_priv *p;
> +	int rc;
>  
>  	/* Only dimm-specific calls are supported atm */
>  	if (!nvdimm)
>  		return -EINVAL;
>  
> +	/* Use a local variable in case cmd_rc pointer is NULL */
> +	if (!cmd_rc)
> +		cmd_rc = &rc;
> +

This protects you from the NULL.  However...

>  	p = nvdimm_provider_data(nvdimm);
>  
>  	switch (cmd) {
> @@ -381,12 +386,13 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
>  		break;
>  
>  	default:
> -		return -EINVAL;
> +		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
> +		*cmd_rc = -EINVAL;

... I think you are conflating rc and cmd_rc...

>  	}
>  
>  	dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
>  
> -	return 0;
> +	return *cmd_rc;

... this changes the behavior of the current commands.  Now if the underlying
papr_scm_meta_[get|set]() fails you return that failure as rc rather than 0.

Is that ok?

Also 'logging cmd_rc' in the invalid cmd case does not seem quite right unless
you really want rc to be cmd_rc.

The architecture is designed to separate errors which occur in the kernel vs
errors in the firmware/dimm.  Are they always the same?  The current code
differentiates them.

Ira

>  }
>  
>  static ssize_t flags_show(struct device *dev,
> -- 
> 2.26.2
> 


* [GIT PULL] Please pull powerpc/linux.git powerpc-5.8-1 tag
From: Michael Ellerman @ 2020-06-05 16:38 UTC
  To: Linus Torvalds
  Cc: ego, emmanuel.nicolet, chenzhou10, jniethe5, linuxram, kernelfans,
	linux-kernel, st5pub, oohall, huhai, elfring, rzinsly, leobras.c,
	mikey, herbert, aneesh.kumar, haren, michal.simek, mahesh, tiwai,
	kjain, leonardo, naveen.n.rao, ravi.bangoria, ajd, arnd, sfr,
	alistair, npiggin, wangxiongfeng2, cai, clg, natechancellor,
	hbathini, christophe.leroy, geoff, dmitry.torokhov, gustavoars,
	wsa, sbobroff, fbarrat, christophe.jaillet, akpm, linuxppc-dev

Hi Linus,

Please pull powerpc updates for 5.8.

Unfortunately we've ended up with quite a few conflicts, which is primarily my
fault for pushing things to next too late. Lesson learnt.

I've pushed the result of my resolution of the conflicts to the powerpc/merge
branch, if you want to look at that, though I've also tried to describe it in
full below.

Firstly there's a conflict in arch/powerpc/kernel/traps.c in
machine_check_exception() vs 69ea03b56ed2 ("hardirq/nmi: Allow nested
nmi_enter()"). That change made nmi_enter() handle nesting natively, but in
parallel we changed our code to only call nmi_enter() on some configurations
which interacts badly.

The condition on the call to nmi_enter() needs updating, as well as the two
calls to nmi_exit(), and then the comment as well. So I've just included the end
result for the bulk of the function:

void machine_check_exception(struct pt_regs *regs)
{
	int recover = 0;

	/*
	 * BOOK3S_64 does not call this handler as a non-maskable interrupt
	 * (it uses its own early real-mode handler to handle the MCE proper
	 * and then raises irq_work to call this handler when interrupts are
	 * enabled).
	 */
	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64))
		nmi_enter();

	__this_cpu_inc(irq_stat.mce_exceptions);

...

	if (check_io_access(regs))
		goto bail;

	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64))
		nmi_exit();

	die("Machine check", regs, SIGBUS);

	/* Must die if the interrupt is not recoverable */
	if (!(regs->msr & MSR_RI))
		die("Unrecoverable Machine check", regs, SIGBUS);

	return;

bail:
	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64))
		nmi_exit();
}

Then there are two conflicts with 2fb4706057bc ("powerpc: add support for folded
p4d page tables").

The first one in arch/powerpc/mm/kasan/kasan_init_32.c in
kasan_remap_early_shadow_ro() is simple, just remove the for loop entirely.

Then in arch/powerpc/mm/ptdump/ptdump.c, in walk_pagetables(), the for loop
should end up being:

	for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
		p4d_t *p4d = p4d_offset(pgd, 0);

		if (p4d_none(*p4d) || p4d_is_leaf(*p4d))
			note_page(st, addr, 1, p4d_val(*p4d), PGDIR_SIZE);
		else if (is_hugepd(__hugepd(p4d_val(*p4d))))
			walk_hugepd(st, (hugepd_t *)p4d, addr, PGDIR_SHIFT, 1);
		else
			/* p4d exists */
			walk_pud(st, p4d, addr);
	}

Finally, we need this hunk applied to avoid a build break:

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 61fc9e8f12d3..af7f13cf90cf 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -239,7 +239,7 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p
 	pte_basic_t old = pte_val(*p);
 	pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
 	int num, i;
-	pmd_t *pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
+	pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, addr), addr), addr), addr);
 
 	if (!huge)
 		num = PAGE_SIZE / SZ_4K;


Let me know if any of that doesn't make sense or otherwise causes problems.
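
For anyone wondering why the pte_update() hunk is enough: the toy model below
sketches how a folded p4d level behaves, assuming the generic folded-p4d
scheme where p4d_offset() hands back the pgd entry it was given. The toy_*
types, helpers and constants are invented for this illustration and are not
the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PGDIR_SHIFT		22
#define TOY_PTRS_PER_PGD	4

/* Toy types, not the kernel's definitions. */
typedef struct { uintptr_t val; } toy_pgd_t;
typedef struct { toy_pgd_t pgd; } toy_p4d_t;	/* folded: wraps a pgd entry */

/* Pick the top-level entry that covers addr. */
static toy_pgd_t *toy_pgd_offset(toy_pgd_t *table, unsigned long addr)
{
	return table + ((addr >> TOY_PGDIR_SHIFT) % TOY_PTRS_PER_PGD);
}

/* With the level folded, "descending" just reinterprets the same entry. */
static toy_p4d_t *toy_p4d_offset(toy_pgd_t *pgd, unsigned long addr)
{
	(void)addr;	/* a folded level has exactly one slot */
	return (toy_p4d_t *)pgd;
}

/* Usage example: the extra call changes the type, not the entry reached. */
static void toy_selfcheck(void)
{
	toy_pgd_t table[TOY_PTRS_PER_PGD] = { {0x10}, {0x20}, {0x30}, {0x40} };
	unsigned long addr = 2UL << TOY_PGDIR_SHIFT;
	toy_pgd_t *pgd = toy_pgd_offset(table, addr);
	toy_p4d_t *p4d = toy_p4d_offset(pgd, addr);

	assert((void *)p4d == (void *)pgd);
	assert(p4d->pgd.val == 0x30);
}
```

Under that assumption, adding p4d_offset() to the chain only fixes up the
pointer type handed to the next level, so the walk still lands on the same
entry.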

There are also a few out-of-area changes which I'll highlight FYI:

arch/s390/include/asm/pgtable.h	# 93a98695f2f9 mm: change pmdp_huge_get_and_clear_full take vm_area_struct as arg
include/asm-generic/pgtable.h
mm/huge_memory.c

drivers/input/serio/		# e4f4ffa8a98c input: i8042 - Remove special PowerPC handling
include/linux/hw_breakpoint.h	# ef3534a94fdb hw-breakpoints: Fix build warnings with clang
kernel/events/hw_breakpoint.c	# 29da4f91c0c1 powerpc/watchpoint: Don't allow concurrent perf and ptrace events
sound/...			# f16dca3e30c1 sound: ac97: Remove sound driver for ancient platform

cheers


The following changes since commit ae83d0b416db002fe95601e7f97f64b59514d936:

  Linux 5.7-rc2 (2020-04-19 14:35:30 -0700)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.8-1

for you to fetch changes up to 1395375c592770fe5158a592944aaeed67fa94ff:

  Merge branch 'topic/ppc-kvm' into next (2020-06-03 13:44:51 +1000)

------------------------------------------------------------------
powerpc updates for 5.8

 - Support for userspace to send requests directly to the on-chip GZIP
   accelerator on Power9.

 - Rework of our lockless page table walking (__find_linux_pte()) to make it
   safe against parallel page table manipulations without relying on an IPI for
   serialisation.

 - A series of fixes & enhancements to make our machine check handling more
   robust.

 - Lots of plumbing to add support for "prefixed" (64-bit) instructions on
   Power10.

 - Support for using huge pages for the linear mapping on 8xx (32-bit).

 - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound driver.

 - Removal of some obsolete 40x platforms and associated cruft.

 - Initial support for booting on Power10.

 - Lots of other small features, cleanups & fixes.

Thanks to:
  Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Andrey Abramov,
  Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent Abali, Cédric Le
  Goater, Chen Zhou, Christian Zigotzky, Christophe JAILLET, Christophe Leroy,
  Dmitry Torokhov, Emmanuel Nicolet, Erhard F., Gautham R. Shenoy, Geoff Levand,
  George Spelvin, Greg Kurz, Gustavo A. R. Silva, Gustavo Walbon, Haren Myneni,
  Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Kees Cook, Leonardo
  Bras, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
  Neuling, Michal Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao,
  Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram
  Pai, Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher
  Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler, Wolfram
  Sang, Xiongfeng Wang.

------------------------------------------------------------------
Alistair Popple (7):
      powerpc: Enable Prefixed Instructions
      powerpc: Add new HWCAP bits
      powerpc: Add support for ISA v3.1
      powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected
      powerpc/dt_cpu_ftrs: Enable Prefixed Instructions
      powerpc/dt_cpu_ftrs: Add MMA feature
      powerpc: Add POWER10 architected mode

Andrew Donnellan (2):
      ocxl: Fix misleading comment
      cxl: Remove dead Kconfig options

Andrey Abramov (1):
      powerpc: module_[32|64].c: replace swap function with built-in one

Aneesh Kumar K.V (24):
      mm: change pmdp_huge_get_and_clear_full take vm_area_struct as arg
      powerpc/pkeys: Avoid using lockless page table walk
      powerpc/pkeys: Check vma before returning key fault error to the user
      powerpc/mm/hash64: use _PAGE_PTE when checking for pte_present
      powerpc/hash64: Restrict page table lookup using init_mm with __flush_hash_table_range
      powerpc/book3s64/hash: Use the pte_t address from the caller
      powerpc/mce: Don't reload pte val in addr_to_pfn
      powerpc/perf/callchain: Use __get_user_pages_fast in read_user_stack_slow
      powerpc/kvm/book3s: switch from raw_spin_*lock to arch_spin_lock.
      powerpc/kvm/book3s: Add helper to walk partition scoped linux page table.
      powerpc/kvm/nested: Add helper to walk nested shadow linux page table.
      powerpc/kvm/book3s: Use kvm helpers to walk shadow or secondary table
      powerpc/kvm/book3s: Add helper for host page table walk
      powerpc/kvm/book3s: Use find_kvm_host_pte in page fault handler
      powerpc/kvm/book3s: Use find_kvm_host_pte in h_enter
      powerpc/kvm/book3s: use find_kvm_host_pte in pute_tce functions
      powerpc/kvm/book3s: Avoid using rmap to protect parallel page table update.
      powerpc/kvm/book3s: use find_kvm_host_pte in kvmppc_book3s_instantiate_page
      powerpc/kvm/book3s: Use find_kvm_host_pte in kvmppc_get_hpa
      powerpc/kvm/book3s: Use pte_present instead of opencoding _PAGE_PRESENT check
      powerpc/mm/book3s64: Avoid sending IPI on clearing PMD
      powerpc/mm/book3s64: Fix MADV_DONTNEED and parallel page fault race
      powerpc/book3s64/radix/tlb: Determine hugepage flush correctly
      powerpc/book3s64/kvm: Fix secondary page table walk warning during migration

Chen Zhou (1):
      powerpc/powernv: add NULL check after kzalloc

Christophe JAILLET (1):
      powerpc/powernv: Fix a warning message

Christophe Leroy (83):
      powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
      powerpc/uaccess: Implement unsafe_copy_to_user() as a simple loop
      powerpc/uaccess: Implement user_read_access_begin and user_write_access_begin
      powerpc/8xx: Update email address in MAINTAINERS
      drivers/powerpc: Replace _ALIGN_UP() by ALIGN()
      powerpc: Replace _ALIGN_DOWN() by ALIGN_DOWN()
      powerpc: Replace _ALIGN_UP() by ALIGN()
      powerpc: Replace _ALIGN() by ALIGN()
      powerpc: Remove _ALIGN_UP(), _ALIGN_DOWN() and _ALIGN()
      powerpc/kasan: Fix stack overflow by increasing THREAD_SHIFT
      powerpc/kasan: Fix error detection on memory allocation
      powerpc/kasan: Fix issues by lowering KASAN_SHADOW_END
      powerpc/kasan: Fix shadow pages allocation failure
      powerpc/kasan: Remove unnecessary page table locking
      powerpc/kasan: Refactor update of early shadow mappings
      powerpc/kasan: Declare kasan_init_region() weak
      powerpc/ptdump: Add _PAGE_COHERENT flag
      powerpc/ptdump: Display size of BATs
      powerpc/ptdump: Standardise display of BAT flags
      powerpc/ptdump: Properly handle non standard page size
      powerpc/ptdump: Handle hugepd at PGD level
      powerpc/32s: Don't warn when mapping RO data ROX.
      powerpc/mm: Allocate static page tables for fixmap
      powerpc/mm: Fix conditions to perform MMU specific management by blocks on PPC32.
      powerpc/mm: PTE_ATOMIC_UPDATES is only for 40x
      powerpc/mm: Refactor pte_update() on nohash/32
      powerpc/mm: Refactor pte_update() on book3s/32
      powerpc/mm: Standardise __ptep_test_and_clear_young() params between PPC32 and PPC64
      powerpc/mm: Standardise pte_update() prototype between PPC32 and PPC64
      powerpc/mm: Create a dedicated pte_update() for 8xx
      powerpc/mm: Reduce hugepd size for 8M hugepages on 8xx
      powerpc/8xx: Drop CONFIG_8xx_COPYBACK option
      powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.
      powerpc/8xx: Manage 512k huge pages as standard pages.
      powerpc/8xx: Only 8M pages are hugepte pages now
      powerpc/8xx: MM_SLICE is not needed anymore
      powerpc/8xx: Move PPC_PIN_TLB options into 8xx Kconfig
      powerpc/8xx: Add function to set pinned TLBs
      powerpc/8xx: Don't set IMMR map anymore at boot
      powerpc/8xx: Always pin TLBs at startup.
      powerpc/8xx: Drop special handling of Linear and IMMR mappings in I/D TLB handlers
      powerpc/8xx: Remove now unused TLB miss functions
      powerpc/8xx: Move DTLB perf handling closer.
      powerpc/mm: Don't be too strict with _etext alignment on PPC32
      powerpc/8xx: Refactor kernel address boundary comparison
      powerpc/8xx: Add a function to early map kernel via huge pages
      powerpc/8xx: Map IMMR with a huge page
      powerpc/8xx: Map linear memory with huge pages
      powerpc/8xx: Allow STRICT_KERNEL_RwX with pinned TLB
      powerpc/8xx: Allow large TLBs with DEBUG_PAGEALLOC
      powerpc/8xx: Implement dedicated kasan_init_region()
      powerpc/32s: Allow mapping with BATs with DEBUG_PAGEALLOC
      powerpc/32s: Implement dedicated kasan_init_region()
      powerpc/40x: Rework 40x PTE access and TLB miss
      powerpc/pgtable: Drop PTE_ATOMIC_UPDATES
      powerpc/40x: Remove support for IBM 403GCX
      powerpc/40x: Remove STB03xxx
      powerpc/40x: Remove WALNUT
      powerpc/40x: Remove EP405
      powerpc/40x: Remove support for ISS Simulator
      powerpc/40x: Remove support for IBM 405GP
      powerpc/40x: Remove IBM405 Erratum #51
      powerpc: Remove IBM405 Erratum #77
      powerpc/40x: Avoid using r12 in TLB miss handlers
      powerpc/40x: Don't save CR in SPRN_SPRG_SCRATCH6
      powerpc/kprobes: Use probe_address() to read instructions
      powerpc/52xx: Blacklist functions running with MMU disabled for kprobe
      powerpc/82xx: Blacklist pq2_restart() for kprobe
      powerpc/83xx: Blacklist mpc83xx_deep_resume() for kprobe
      powerpc/powermac: Blacklist functions running with MMU disabled for kprobe
      powerpc/mem: Blacklist flush_dcache_icache_phys() for kprobe
      powerpc/32s: Make local symbols non visible in hash_low.
      powerpc/32s: Blacklist functions running with MMU disabled for kprobe
      powerpc/rtas: Remove machine_check_in_rtas()
      powerpc/32: Blacklist functions running with MMU disabled for kprobe
      powerpc/entry32: Blacklist exception entry points for kprobe.
      powerpc/entry32: Blacklist syscall exit points for kprobe.
      powerpc/entry32: Blacklist exception exit points for kprobe.
      powerpc/8xx: Reduce time spent in allow_user_access() and friends
      powerpc/uaccess: Don't set KUAP by default on book3s/32
      powerpc/uaccess: Don't set KUEP by default on book3s/32
      powerpc/32: Disable KASAN with pages bigger than 16k
      powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG

Cédric Le Goater (3):
      powerpc/xive: Enforce load-after-store ordering when StoreEOI is active
      powerpc/xive: Clear the page tables for the ESB IO mapping
      powerpc/xive: Do not expose a debugfs file when XIVE is disabled

Dmitry Torokhov (1):
      macintosh/ams-input: switch to using input device polling mode

Emmanuel Nicolet (1):
      ps3disk: use the default segment boundary

Gautham R. Shenoy (5):
      powerpc: Move idle_loop_prolog()/epilog() functions to header file
      powerpc/idle: Store PURR snapshot in a per-cpu global variable
      powerpc/pseries: Account for SPURR ticks on idle CPUs
      powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
      Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr

Geoff Levand (5):
      powerpc/head_check: Automatic verbosity
      powerpc/wrapper: Output linker map file
      powerpc/head_check: Avoid broken pipe
      powerpc/ps3: Fix kexec shutdown hang
      powerpc/ps3: Add check for otheros image size

Gustavo A. R. Silva (2):
      powerpc: Replace zero-length array with flexible-array
      powerpc/mm: Replace zero-length array with flexible-array

Haren Myneni (23):
      powerpc/xive: Define xive_native_alloc_irq_on_chip()
      powerpc/vas: Define nx_fault_stamp in coprocessor_request_block
      powerpc/vas: Alloc and setup IRQ and trigger port address
      powerpc/vas: Setup fault window per VAS instance
      powerpc/vas: Register NX with fault window ID and IRQ port value
      powerpc/vas: Take reference to PID and mm for user space windows
      powerpc/vas: Setup thread IRQ handler per VAS instance
      powerpc/vas: Update CSB and notify process for fault CRBs
      powerpc/vas: Return credits after handling fault
      powerpc/vas: Print CRB and FIFO values
      powerpc/vas: Do not use default credits for receive window
      powerpc/vas: Display process stuck message
      powerpc/vas: Free send window in VAS instance after credits returned
      powerpc: Use mm_context vas_windows counter to issue CP_ABORT
      powerpc/vas: Initialize window attributes for GZIP coprocessor type
      powerpc/vas: Define VAS_TX_WIN_OPEN ioctl API
      powerpc/vas: Add VAS user space API
      crypto/nx: Initialize coproc entry with kzalloc
      crypto/nx: Rename nx-842-powernv file name to nx-common-powernv
      crypto/nx: Make enable code generic to add new GZIP compression type
      crypto/nx: Enable and setup GZIP compression type
      crypto/nx: Remove 'pid' in vas_tx_win_attr struct
      Documentation/powerpc: VAS API

Hari Bathini (3):
      powerpc/fadump: use static allocation for reserved memory ranges
      powerpc/fadump: consider reserved ranges while reserving memory
      powerpc/fadump: Account for memory_limit while reserving memory

Jordan Niethe (30):
      powerpc/xmon: Remove store_inst() for patch_instruction()
      powerpc/xmon: Move breakpoint instructions to own array
      powerpc/xmon: Move breakpoints to text section
      powerpc/xmon: Use bitwise calculations in_breakpoint_table()
      powerpc: Change calling convention for create_branch() et. al.
      powerpc: Use a macro for creating instructions from u32s
      powerpc: Use an accessor for instructions
      powerpc: Use a function for getting the instruction op code
      powerpc: Use a function for byte swapping instructions
      powerpc: Introduce functions for instruction equality
      powerpc: Use a datatype for instructions
      powerpc: Use a function for reading instructions
      powerpc: Add a probe_user_read_inst() function
      powerpc: Add a probe_kernel_read_inst() function
      powerpc/kprobes: Use patch_instruction()
      powerpc: Define and use get_user_instr() et. al.
      powerpc: Introduce a function for reporting instruction length
      powerpc/xmon: Use a function for reading instructions
      powerpc/xmon: Move insertion of breakpoint for xol'ing
      powerpc: Make test_translate_branch() independent of instruction length
      powerpc: Define new SRR1 bits for a ISA v3.1
      powerpc/optprobes: Add register argument to patch_imm64_load_insns()
      powerpc: Add prefixed instructions to instruction data type
      powerpc: Test prefixed code patching
      powerpc: Test prefixed instructions in feature fixups
      powerpc/xmon: Don't allow breakpoints on suffixes
      powerpc/kprobes: Don't allow breakpoints on suffixes
      powerpc: Support prefixed instructions in alignment handler
      powerpc sstep: Add support for prefixed load/stores
      powerpc sstep: Add support for prefixed fixed-point arithmetic

Kajol Jain (5):
      powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run
      powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details
      powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show processor details
      Documentation/ABI: Add ABI documentation for chips and sockets
      powerpc/pseries: Update hv-24x7 information after migration

Leonardo Bras (4):
      powerpc/kernel: Enables memory hot-remove after reboot on pseries guests
      powerpc/crash: Use NMI context for printk when starting to crash
      powerpc/rtas: Move type/struct definitions from rtas.h into rtas-types.h
      powerpc/rtas: Implement reentrant rtas call

Markus Elfring (2):
      drivers/ps3: Remove duplicate error messages
      net/ps3_gelic_net: Remove duplicate error message

Michael Ellerman (24):
      Merge VAS page fault handling into next
      Merge NX gzip support into next
      Merge branch 'topic/uaccess' into topic/uaccess-ppc
      Merge tag 'kvm-ppc-fixes-5.7-1' into topic/ppc-kvm
      Merge the lockless page table walk rework into next
      powerpc/uaccess: Don't use "m<>" constraint
      powerpc/64: Don't initialise init_task->thread.regs
      powerpc: Drop unneeded cast in task_pt_regs()
      selftests/powerpc: Add a test of counting larx/stcx
      drivers/macintosh: Fix memleak in windfarm_pm112 driver
      powerpc/64: Update Speculation_Store_Bypass in /proc/<pid>/status
      Merge branch 'topic/uaccess-ppc' into next
      Merge branch 'topic/ppc-kvm' into next
      Merge "Use hugepages to map kernel mem on 8xx" into next
      Merge branch 'fixes' into next
      powerpc: Add ppc_inst_next()
      powerpc: Add ppc_inst_as_u64()
      powerpc/configs/64s: Enable CONFIG_PRINTK_CALLER
      powerpc/xmon: Show task->thread.regs in process display
      powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()
      powerpc/64s: Don't let DT CPU features set FSCR_DSCR
      powerpc/64s: Save FSCR to init_task.thread.fscr after feature init
      powerpc/64s: Don't set FSCR bits in INIT_THREAD
      Merge branch 'topic/ppc-kvm' into next

Michael Neuling (3):
      powerpc/tm: Document h/rfid and mtmsrd quirk
      powerpc: Fix misleading small cores print
      powerpc/configs: Add LIBNVDIMM to ppc64_defconfig

Michal Simek (2):
      powerpc: Remove Xilinx PPC405/PPC440 support
      sound: ac97: Remove sound driver for ancient platform

Nathan Chancellor (2):
      powerpc/wii: Fix declaration made after definition
      input: i8042 - Remove special PowerPC handling

Naveen N. Rao (4):
      powerpc/64: Have MPROFILE_KERNEL depend on FUNCTION_TRACER
      powerpc/module_64: Consolidate ftrace code
      powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations
      powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel

Nicholas Piggin (26):
      powerpc/64s: Always has full regs, so remove remnant checks
      powerpc: Use set_trap() and avoid open-coding trap masking
      powerpc: trap_is_syscall() helper to hide syscall trap number
      powerpc: Use trap metadata to prevent double restart rather than zeroing trap
      powerpc/64s/exception: Fix machine check no-loss idle wakeup
      powerpc/64s/exceptions: Fix in_mce accounting in unrecoverable path
      powerpc/64s/exceptions: Change irq reconcile for NMIs from reusing _DAR to RESULT
      powerpc/64s/exceptions: Machine check reconcile irq state
      powerpc/pseries/ras: Avoid calling rtas_token() in NMI paths
      powerpc/pseries/ras: Fix FWNMI_VALID off by one
      powerpc/pseries/ras: fwnmi avoid modifying r3 in error case
      powerpc/pseries/ras: fwnmi sreset should not interlock
      powerpc/pseries: Limit machine check stack to 4GB
      powerpc/pseries: Machine check use rtas_call_unlocked() with args on stack
      powerpc/64s: machine check interrupt update NMI accounting
      powerpc: Implement ftrace_enabled() helpers
      powerpc/64s: machine check do not trace real-mode handler
      powerpc/traps: Do not trace system reset
      powerpc/traps: Make unrecoverable NMIs die instead of panic
      powerpc/64s: Fix early_init_mmu section mismatch
      powerpc/64s/hash: Add stress_slb kernel boot option to increase SLB faults
      powerpc/64s/radix: Don't prefetch DAR in update_mmu_cache
      powerpc/64: Refactor interrupt exit irq disabling sequence
      powerpc/64s/kuap: Add missing isync to KUAP restore paths
      powerpc/64/kuap: Conditionally restore AMR in interrupt exit
      powerpc/64s/kuap: Conditionally restore AMR in kuap_restore_amr asm

Oliver O'Halloran (14):
      powerpc/powernv/npu: Clean up compound table group initialisation
      powerpc/powernv/iov: Don't add VFs to iommu group during PE config
      powerpc/powernv/pci: Register iommu group at PE DMA setup
      powerpc/powernv/pci: Add device to iommu group during dma_dev_setup()
      powerpc/powernv/pci: Delete old iommu recursive iommu setup
      powerpc/powernv/pci: Move tce size parsing to pci-ioda-tce.c
      powerpc/powernv/npu: Move IOMMU group setup into npu-dma.c
      powerpc/powernv: Add a print indicating when an IODA PE is released
      powerpc/powernv/pci: Add an explaination for PNV_IODA_PE_BUS_ALL
      powerpc/powernv/pci: Add helper to find ioda_pe from BDFN
      powerpc/powernv/pci: Re-work bus PE configuration
      powerpc/powernv/pci: Reserve the root bus PE during init
      powerpc/powernv/pci: Sprinkle around some WARN_ON()s
      powerpc/pseries: Make vio and ibmebus initcalls pseries specific

Pingfan Liu (1):
      powerpc/crashkernel: Take "mem=" option into account

Qian Cai (1):
      powerpc/64s/pgtable: fix an undefined behaviour

Ram Pai (1):
      powerpc/xive: Share the event-queue page with the Hypervisor.

Raphael Moreira Zinsly (5):
      selftests/powerpc: Add header files for GZIP engine test
      selftests/powerpc: Add header files for NX compresion/decompression
      selftests/powerpc: Add NX-GZIP engine compress testcase
      selftests/powerpc: Add NX-GZIP engine decompress testcase
      selftests/powerpc: Add README for GZIP engine tests

Ravi Bangoria (17):
      powerpc/watchpoint: Rename current DAWR macros
      powerpc/watchpoint: Add SPRN macros for second DAWR
      powerpc/watchpoint: Introduce function to get nr watchpoints dynamically
      powerpc/watchpoint/ptrace: Return actual num of available watchpoints
      powerpc/watchpoint: Provide DAWR number to set_dawr
      powerpc/watchpoint: Provide DAWR number to __set_breakpoint
      powerpc/watchpoint: Get watchpoint count dynamically while disabling them
      powerpc/watchpoint: Disable all available watchpoints when !dawr_force_enable
      powerpc/watchpoint: Convert thread_struct->hw_brk to an array
      powerpc/watchpoint: Use loop for thread_struct->ptrace_bps
      powerpc/watchpoint: Introduce is_ptrace_bp() function
      powerpc/watchpoint: Use builtin ALIGN*() macros
      powerpc/watchpoint: Prepare handler to handle more than one watchpoint
      powerpc/watchpoint: Don't allow concurrent perf and ptrace events
      powerpc/watchpoint/xmon: Don't allow breakpoint overwriting
      powerpc/watchpoint/xmon: Support 2nd DAWR
      hw-breakpoints: Fix build warnings with clang

Sam Bobroff (2):
      powerpc/eeh: Fix pseries_eeh_configure_bridge()
      powerpc/eeh: Release EEH device state synchronously

Stephen Rothwell (1):
      powerpc/vas: Include linux/types.h in uapi/asm/vas-api.h

Wolfram Sang (1):
      powerpc/5200: update contact email

Xiongfeng Wang (1):
      powerpc/ps3: Move static keyword to the front of declaration

huhai (1):
      powerpc/4xx: Don't unmap NULL mbase


 Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 |   21 +
 Documentation/ABI/testing/sysfs-devices-system-cpu               |   39 +
 Documentation/admin-guide/kernel-parameters.txt                  |    5 +
 Documentation/devicetree/bindings/xilinx.txt                     |  143 ---
 Documentation/powerpc/bootwrapper.rst                            |   28 +-
 Documentation/powerpc/index.rst                                  |    1 +
 Documentation/powerpc/transactional_memory.rst                   |   27 +
 Documentation/powerpc/vas-api.rst                                |  292 +++++
 Documentation/userspace-api/ioctl/ioctl-number.rst               |    1 +
 MAINTAINERS                                                      |    2 +-
 arch/powerpc/Kconfig                                             |   69 +-
 arch/powerpc/Kconfig.debug                                       |    2 +-
 arch/powerpc/boot/Makefile                                       |   14 +-
 arch/powerpc/boot/dts/Makefile                                   |    1 -
 arch/powerpc/boot/dts/ep405.dts                                  |  230 ----
 arch/powerpc/boot/dts/pcm032.dts                                 |    4 +-
 arch/powerpc/boot/dts/virtex440-ml507.dts                        |  406 ------
 arch/powerpc/boot/dts/virtex440-ml510.dts                        |  466 -------
 arch/powerpc/boot/dts/walnut.dts                                 |  246 ----
 arch/powerpc/boot/ep405.c                                        |   71 --
 arch/powerpc/boot/ops.h                                          |    1 -
 arch/powerpc/boot/serial.c                                       |    5 -
 arch/powerpc/boot/treeboot-walnut.c                              |   81 --
 arch/powerpc/boot/uartlite.c                                     |   79 --
 arch/powerpc/boot/virtex.c                                       |   97 --
 arch/powerpc/boot/virtex405-head.S                               |   31 -
 arch/powerpc/boot/wrapper                                        |   26 +-
 arch/powerpc/configs/40x/acadia_defconfig                        |    1 -
 arch/powerpc/configs/40x/ep405_defconfig                         |   62 -
 arch/powerpc/configs/40x/kilauea_defconfig                       |    1 -
 arch/powerpc/configs/40x/klondike_defconfig                      |    1 -
 arch/powerpc/configs/40x/makalu_defconfig                        |    1 -
 arch/powerpc/configs/40x/obs600_defconfig                        |    1 -
 arch/powerpc/configs/40x/virtex_defconfig                        |   75 --
 arch/powerpc/configs/44x/virtex5_defconfig                       |   74 --
 arch/powerpc/configs/adder875_defconfig                          |    1 -
 arch/powerpc/configs/ep88xc_defconfig                            |    1 -
 arch/powerpc/configs/mpc866_ads_defconfig                        |    1 -
 arch/powerpc/configs/mpc885_ads_defconfig                        |    1 -
 arch/powerpc/configs/powernv_defconfig                           |    1 +
 arch/powerpc/configs/ppc40x_defconfig                            |    9 -
 arch/powerpc/configs/ppc44x_defconfig                            |    8 -
 arch/powerpc/configs/ppc64_defconfig                             |    2 +
 arch/powerpc/configs/pseries_defconfig                           |    1 +
 arch/powerpc/configs/tqm8xx_defconfig                            |    1 -
 arch/powerpc/include/asm/asm-405.h                               |   19 -
 arch/powerpc/include/asm/atomic.h                                |   11 -
 arch/powerpc/include/asm/bitops.h                                |    4 -
 arch/powerpc/include/asm/book3s/32/kup.h                         |    7 +-
 arch/powerpc/include/asm/book3s/32/pgtable.h                     |   82 +-
 arch/powerpc/include/asm/book3s/64/kup-radix.h                   |   41 +-
 arch/powerpc/include/asm/book3s/64/mmu.h                         |    5 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h                     |   50 +-
 arch/powerpc/include/asm/book3s/64/tlbflush-hash.h               |    3 +-
 arch/powerpc/include/asm/cache.h                                 |    2 +-
 arch/powerpc/include/asm/cmpxchg.h                               |   11 -
 arch/powerpc/include/asm/code-patching.h                         |   37 +-
 arch/powerpc/include/asm/cputable.h                              |   22 +-
 arch/powerpc/include/asm/debug.h                                 |    2 +-
 arch/powerpc/include/asm/drmem.h                                 |    1 +
 arch/powerpc/include/asm/fadump-internal.h                       |    4 +-
 arch/powerpc/include/asm/firmware.h                              |    1 +
 arch/powerpc/include/asm/fixmap.h                                |    4 +
 arch/powerpc/include/asm/ftrace.h                                |   14 +
 arch/powerpc/include/asm/futex.h                                 |    3 -
 arch/powerpc/include/asm/hugetlb.h                               |    4 -
 arch/powerpc/include/asm/hw_breakpoint.h                         |   31 +-
 arch/powerpc/include/asm/icswx.h                                 |   20 +-
 arch/powerpc/include/asm/idle.h                                  |   93 ++
 arch/powerpc/include/asm/inst.h                                  |  131 ++
 arch/powerpc/include/asm/iommu.h                                 |    4 +-
 arch/powerpc/include/asm/kasan.h                                 |   10 +-
 arch/powerpc/include/asm/kprobes.h                               |    2 +-
 arch/powerpc/include/asm/kup.h                                   |   14 +-
 arch/powerpc/include/asm/kvm_book3s.h                            |    2 +-
 arch/powerpc/include/asm/kvm_book3s_64.h                         |   44 +-
 arch/powerpc/include/asm/mmu.h                                   |   10 +-
 arch/powerpc/include/asm/mmu_context.h                           |   30 +
 arch/powerpc/include/asm/module.h                                |    3 -
 arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h                 |   32 +-
 arch/powerpc/include/asm/nohash/32/mmu-8xx.h                     |   90 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h                     |  120 +-
 arch/powerpc/include/asm/nohash/32/pte-40x.h                     |   23 +-
 arch/powerpc/include/asm/nohash/32/pte-8xx.h                     |    4 +-
 arch/powerpc/include/asm/nohash/32/slice.h                       |   20 -
 arch/powerpc/include/asm/nohash/64/pgtable.h                     |   28 +-
 arch/powerpc/include/asm/nohash/pgtable.h                        |    4 +-
 arch/powerpc/include/asm/paca.h                                  |    2 +
 arch/powerpc/include/asm/page.h                                  |    7 -
 arch/powerpc/include/asm/pgtable.h                               |    2 +
 arch/powerpc/include/asm/ppc-opcode.h                            |    3 +
 arch/powerpc/include/asm/processor.h                             |   11 +-
 arch/powerpc/include/asm/prom.h                                  |    1 +
 arch/powerpc/include/asm/ptrace.h                                |   46 +-
 arch/powerpc/include/asm/reg.h                                   |   19 +-
 arch/powerpc/include/asm/reg_booke.h                             |   54 -
 arch/powerpc/include/asm/rtas-types.h                            |  124 ++
 arch/powerpc/include/asm/rtas.h                                  |  125 +-
 arch/powerpc/include/asm/slice.h                                 |    2 -
 arch/powerpc/include/asm/spinlock.h                              |    4 -
 arch/powerpc/include/asm/sstep.h                                 |   17 +-
 arch/powerpc/include/asm/switch_to.h                             |    2 -
 arch/powerpc/include/asm/syscall.h                               |    5 +-
 arch/powerpc/include/asm/time.h                                  |   12 -
 arch/powerpc/include/asm/uaccess.h                               |  149 ++-
 arch/powerpc/include/asm/uprobes.h                               |    7 +-
 arch/powerpc/include/asm/vas.h                                   |   13 +-
 arch/powerpc/include/asm/xilinx_intc.h                           |   16 -
 arch/powerpc/include/asm/xilinx_pci.h                            |   21 -
 arch/powerpc/include/asm/xive-regs.h                             |    8 +
 arch/powerpc/include/asm/xive.h                                  |    9 +-
 arch/powerpc/include/uapi/asm/cputable.h                         |    2 +
 arch/powerpc/include/uapi/asm/vas-api.h                          |   24 +
 arch/powerpc/kernel/align.c                                      |   18 +-
 arch/powerpc/kernel/asm-offsets.c                                |    8 +
 arch/powerpc/kernel/cpu_setup_6xx.S                              |    2 +
 arch/powerpc/kernel/cpu_setup_power.S                            |   22 +-
 arch/powerpc/kernel/cputable.c                                   |  124 +-
 arch/powerpc/kernel/crash_dump.c                                 |    7 +-
 arch/powerpc/kernel/dawr.c                                       |   23 +-
 arch/powerpc/kernel/dt_cpu_ftrs.c                                |   32 +-
 arch/powerpc/kernel/eeh.c                                        |   31 +
 arch/powerpc/kernel/entry_32.S                                   |   69 +-
 arch/powerpc/kernel/entry_64.S                                   |    8 +-
 arch/powerpc/kernel/epapr_paravirt.c                             |    7 +-
 arch/powerpc/kernel/exceptions-64s.S                             |   51 +-
 arch/powerpc/kernel/fadump.c                                     |  155 ++-
 arch/powerpc/kernel/fpu.S                                        |    1 +
 arch/powerpc/kernel/head_32.S                                    |    2 +-
 arch/powerpc/kernel/head_40x.S                                   |  316 +----
 arch/powerpc/kernel/head_64.S                                    |    9 +-
 arch/powerpc/kernel/head_8xx.S                                   |  354 +++---
 arch/powerpc/kernel/head_booke.h                                 |    2 +-
 arch/powerpc/kernel/hw_breakpoint.c                              |  641 ++++++++--
 arch/powerpc/kernel/idle_6xx.S                                   |    1 +
 arch/powerpc/kernel/idle_e500.S                                  |    1 +
 arch/powerpc/kernel/jump_label.c                                 |    5 +-
 arch/powerpc/kernel/kgdb.c                                       |    9 +-
 arch/powerpc/kernel/kprobes.c                                    |   47 +-
 arch/powerpc/kernel/l2cr_6xx.S                                   |    1 +
 arch/powerpc/kernel/mce.c                                        |   16 +-
 arch/powerpc/kernel/mce_power.c                                  |   19 +-
 arch/powerpc/kernel/misc.S                                       |    2 +
 arch/powerpc/kernel/misc_32.S                                    |   11 +-
 arch/powerpc/kernel/module_32.c                                  |   17 +-
 arch/powerpc/kernel/module_64.c                                  |  301 ++---
 arch/powerpc/kernel/nvram_64.c                                   |    4 +-
 arch/powerpc/kernel/optprobes.c                                  |   99 +-
 arch/powerpc/kernel/optprobes_head.S                             |    3 +
 arch/powerpc/kernel/paca.c                                       |   32 +
 arch/powerpc/kernel/pci-hotplug.c                                |    2 -
 arch/powerpc/kernel/pci_64.c                                     |    6 +-
 arch/powerpc/kernel/process.c                                    |  113 +-
 arch/powerpc/kernel/prom.c                                       |   38 +-
 arch/powerpc/kernel/prom_init.c                                  |   36 +-
 arch/powerpc/kernel/ptrace/ptrace-noadv.c                        |   72 +-
 arch/powerpc/kernel/ptrace/ptrace-tm.c                           |    2 +-
 arch/powerpc/kernel/ptrace/ptrace-view.c                         |    2 +-
 arch/powerpc/kernel/ptrace/ptrace32.c                            |    4 +-
 arch/powerpc/kernel/rtas.c                                       |   52 +
 arch/powerpc/kernel/security.c                                   |   48 +-
 arch/powerpc/kernel/setup-common.c                               |    4 -
 arch/powerpc/kernel/setup_32.c                                   |   10 +-
 arch/powerpc/kernel/setup_64.c                                   |   15 +-
 arch/powerpc/kernel/signal.c                                     |   22 +-
 arch/powerpc/kernel/signal_32.c                                  |    2 +-
 arch/powerpc/kernel/signal_64.c                                  |   10 +-
 arch/powerpc/kernel/smp.c                                        |    2 +-
 arch/powerpc/kernel/swsusp_32.S                                  |    2 +
 arch/powerpc/kernel/syscall_64.c                                 |   72 +-
 arch/powerpc/kernel/sysfs.c                                      |   82 +-
 arch/powerpc/kernel/trace/ftrace.c                               |  168 +--
 arch/powerpc/kernel/traps.c                                      |   46 +-
 arch/powerpc/kernel/uprobes.c                                    |    5 +-
 arch/powerpc/kernel/vecemu.c                                     |   20 +-
 arch/powerpc/kernel/vector.S                                     |    1 +
 arch/powerpc/kernel/vmlinux.lds.S                                |    3 +-
 arch/powerpc/kexec/core.c                                        |    8 +-
 arch/powerpc/kexec/crash.c                                       |    3 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c                              |   13 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c                           |   71 +-
 arch/powerpc/kvm/book3s_64_vio_hv.c                              |   66 +-
 arch/powerpc/kvm/book3s_hv.c                                     |   15 +-
 arch/powerpc/kvm/book3s_hv_nested.c                              |   39 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c                              |   60 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S                          |   23 +-
 arch/powerpc/kvm/book3s_xive_native.c                            |    6 +
 arch/powerpc/kvm/book3s_xive_template.c                          |    3 +
 arch/powerpc/kvm/emulate_loadstore.c                             |    2 +-
 arch/powerpc/lib/Makefile                                        |    2 +-
 arch/powerpc/lib/code-patching.c                                 |  307 +++--
 arch/powerpc/lib/feature-fixups-test.S                           |   69 ++
 arch/powerpc/lib/feature-fixups.c                                |  163 ++-
 arch/powerpc/lib/inst.c                                          |   73 ++
 arch/powerpc/lib/sstep.c                                         |  460 ++++---
 arch/powerpc/lib/test_code-patching.S                            |   20 +
 arch/powerpc/lib/test_emulate_step.c                             |   56 +-
 arch/powerpc/mm/book3s32/hash_low.S                              |   32 +-
 arch/powerpc/mm/book3s32/mmu.c                                   |   12 +-
 arch/powerpc/mm/book3s64/hash_pgtable.c                          |   11 -
 arch/powerpc/mm/book3s64/hash_tlb.c                              |   22 +-
 arch/powerpc/mm/book3s64/hash_utils.c                            |   72 +-
 arch/powerpc/mm/book3s64/internal.h                              |   16 +
 arch/powerpc/mm/book3s64/pgtable.c                               |   37 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c                         |   21 +-
 arch/powerpc/mm/book3s64/radix_tlb.c                             |    4 +-
 arch/powerpc/mm/book3s64/slb.c                                   |  166 ++-
 arch/powerpc/mm/fault.c                                          |   92 +-
 arch/powerpc/mm/hugetlbpage.c                                    |   43 +-
 arch/powerpc/mm/init_32.c                                        |   12 +-
 arch/powerpc/mm/init_64.c                                        |    4 +-
 arch/powerpc/mm/kasan/8xx.c                                      |   74 ++
 arch/powerpc/mm/kasan/Makefile                                   |    2 +
 arch/powerpc/mm/kasan/book3s_32.c                                |   57 +
 arch/powerpc/mm/kasan/kasan_init_32.c                            |   88 +-
 arch/powerpc/mm/mem.c                                            |    2 +
 arch/powerpc/mm/mmu_decl.h                                       |    4 +
 arch/powerpc/mm/nohash/40x.c                                     |    4 +-
 arch/powerpc/mm/nohash/8xx.c                                     |  227 ++--
 arch/powerpc/mm/pgtable.c                                        |   34 +-
 arch/powerpc/mm/pgtable_32.c                                     |   22 +-
 arch/powerpc/mm/ptdump/8xx.c                                     |    5 +
 arch/powerpc/mm/ptdump/bats.c                                    |   41 +-
 arch/powerpc/mm/ptdump/ptdump.c                                  |   73 +-
 arch/powerpc/mm/ptdump/ptdump.h                                  |    3 +
 arch/powerpc/mm/ptdump/shared.c                                  |    5 +
 arch/powerpc/mm/slice.c                                          |    2 +-
 arch/powerpc/perf/8xx-pmu.c                                      |   19 +-
 arch/powerpc/perf/callchain_64.c                                 |   46 +-
 arch/powerpc/perf/core-book3s.c                                  |    4 +-
 arch/powerpc/perf/hv-24x7.c                                      |   96 +-
 arch/powerpc/platforms/40x/Kconfig                               |   76 --
 arch/powerpc/platforms/40x/Makefile                              |    3 -
 arch/powerpc/platforms/40x/ep405.c                               |  123 --
 arch/powerpc/platforms/40x/virtex.c                              |   54 -
 arch/powerpc/platforms/40x/walnut.c                              |   65 -
 arch/powerpc/platforms/44x/Kconfig                               |   40 +-
 arch/powerpc/platforms/44x/Makefile                              |    2 -
 arch/powerpc/platforms/44x/virtex.c                              |   60 -
 arch/powerpc/platforms/44x/virtex_ml510.c                        |   30 -
 arch/powerpc/platforms/4xx/pci.c                                 |    4 +-
 arch/powerpc/platforms/52xx/lite5200_sleep.S                     |    2 +
 arch/powerpc/platforms/82xx/pq2.c                                |    3 +
 arch/powerpc/platforms/83xx/suspend-asm.S                        |    1 +
 arch/powerpc/platforms/86xx/mpc86xx_smp.c                        |    5 +-
 arch/powerpc/platforms/8xx/Kconfig                               |   50 +-
 arch/powerpc/platforms/Kconfig                                   |    4 -
 arch/powerpc/platforms/Kconfig.cputype                           |    6 +-
 arch/powerpc/platforms/cell/iommu.c                              |    6 +-
 arch/powerpc/platforms/embedded6xx/wii.c                         |   25 +-
 arch/powerpc/platforms/powermac/bootx_init.c                     |   14 +-
 arch/powerpc/platforms/powermac/cache.S                          |    2 +
 arch/powerpc/platforms/powermac/nvram.c                          |    2 +-
 arch/powerpc/platforms/powermac/sleep.S                          |    5 +-
 arch/powerpc/platforms/powermac/smp.c                            |    5 +-
 arch/powerpc/platforms/powernv/Makefile                          |    2 +-
 arch/powerpc/platforms/powernv/idle.c                            |    2 +-
 arch/powerpc/platforms/powernv/npu-dma.c                         |  117 +-
 arch/powerpc/platforms/powernv/opal-fadump.c                     |    2 +-
 arch/powerpc/platforms/powernv/opal.c                            |    4 +
 arch/powerpc/platforms/powernv/pci-ioda-tce.c                    |   28 +
 arch/powerpc/platforms/powernv/pci-ioda.c                        |  299 ++---
 arch/powerpc/platforms/powernv/pci.c                             |   20 -
 arch/powerpc/platforms/powernv/pci.h                             |   28 +-
 arch/powerpc/platforms/powernv/vas-api.c                         |  278 +++++
 arch/powerpc/platforms/powernv/vas-debug.c                       |    2 +-
 arch/powerpc/platforms/powernv/vas-fault.c                       |  382 ++++++
 arch/powerpc/platforms/powernv/vas-window.c                      |  238 +++-
 arch/powerpc/platforms/powernv/vas.c                             |   85 +-
 arch/powerpc/platforms/powernv/vas.h                             |   59 +-
 arch/powerpc/platforms/ps3/mm.c                                  |   52 +-
 arch/powerpc/platforms/ps3/setup.c                               |    2 +-
 arch/powerpc/platforms/pseries/eeh_pseries.c                     |    8 +-
 arch/powerpc/platforms/pseries/ibmebus.c                         |    3 +-
 arch/powerpc/platforms/pseries/mobility.c                        |    3 +
 arch/powerpc/platforms/pseries/ras.c                             |   62 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c                     |    2 +-
 arch/powerpc/platforms/pseries/setup.c                           |   22 +-
 arch/powerpc/platforms/pseries/vio.c                             |    7 +-
 arch/powerpc/sysdev/Makefile                                     |    2 -
 arch/powerpc/sysdev/cpm_common.c                                 |    2 +
 arch/powerpc/sysdev/xics/ics-rtas.c                              |   22 +-
 arch/powerpc/sysdev/xilinx_intc.c                                |   88 --
 arch/powerpc/sysdev/xilinx_pci.c                                 |  132 --
 arch/powerpc/sysdev/xive/common.c                                |   13 +-
 arch/powerpc/sysdev/xive/native.c                                |    6 +-
 arch/powerpc/sysdev/xive/spapr.c                                 |    7 +
 arch/powerpc/tools/head_check.sh                                 |    8 +-
 arch/powerpc/xmon/Makefile                                       |    2 +-
 arch/powerpc/xmon/xmon.c                                         |  229 ++--
 arch/powerpc/xmon/xmon_bpts.S                                    |   11 +
 arch/powerpc/xmon/xmon_bpts.h                                    |   14 +
 arch/s390/include/asm/pgtable.h                                  |    4 +-
 drivers/block/ps3disk.c                                          |    1 -
 drivers/char/Kconfig                                             |    2 +-
 drivers/cpuidle/cpuidle-pseries.c                                |   39 +-
 drivers/crypto/nx/Makefile                                       |    2 +-
 drivers/crypto/nx/{nx-842-powernv.c => nx-common-powernv.c}      |  204 ++-
 drivers/input/serio/i8042-ppcio.h                                |   57 -
 drivers/input/serio/i8042.h                                      |    2 -
 drivers/macintosh/Kconfig                                        |    1 -
 drivers/macintosh/ams/ams-input.c                                |   37 +-
 drivers/macintosh/ams/ams.h                                      |    4 +-
 drivers/macintosh/windfarm_pm112.c                               |   21 +-
 drivers/misc/cxl/Kconfig                                         |    8 -
 drivers/misc/ocxl/context.c                                      |    2 +-
 drivers/net/ethernet/toshiba/ps3_gelic_net.c                     |    2 -
 drivers/ps3/ps3-lpm.c                                            |    8 +-
 drivers/ps3/ps3-vuart.c                                          |    5 +-
 drivers/vfio/pci/vfio_pci_nvlink2.c                              |    2 +-
 drivers/video/fbdev/Kconfig                                      |    2 +-
 drivers/video/fbdev/ps3fb.c                                      |    4 +-
 include/asm-generic/pgtable.h                                    |    4 +-
 include/linux/hw_breakpoint.h                                    |    4 +
 kernel/events/hw_breakpoint.c                                    |   16 +
 mm/huge_memory.c                                                 |    4 +-
 sound/drivers/Kconfig                                            |   12 -
 sound/drivers/Makefile                                           |    2 -
 sound/drivers/ml403-ac97cr.c                                     | 1298 --------------------
 sound/drivers/pcm-indirect2.c                                    |  560 ---------
 sound/drivers/pcm-indirect2.h                                    |  127 --
 sound/ppc/snd_ps3.c                                              |    2 +-
 tools/testing/selftests/powerpc/Makefile                         |    1 +
 tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules         |    1 +
 tools/testing/selftests/powerpc/nx-gzip/Makefile                 |    8 +
 tools/testing/selftests/powerpc/nx-gzip/README                   |   45 +
 tools/testing/selftests/powerpc/nx-gzip/gunz_test.c              | 1028 ++++++++++++++++
 tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c             |  433 +++++++
 tools/testing/selftests/powerpc/nx-gzip/gzip_vas.c               |  316 +++++
 tools/testing/selftests/powerpc/nx-gzip/include/copy-paste.h     |   56 +
 tools/testing/selftests/powerpc/nx-gzip/include/crb.h            |  155 +++
 tools/testing/selftests/powerpc/nx-gzip/include/nx.h             |   38 +
 tools/testing/selftests/powerpc/nx-gzip/include/nx_dbg.h         |   95 ++
 tools/testing/selftests/powerpc/nx-gzip/include/nxu.h            |  650 ++++++++++
 tools/testing/selftests/powerpc/nx-gzip/include/vas-api.h        |    1 +
 tools/testing/selftests/powerpc/nx-gzip/nx-gzip-test.sh          |   46 +
 tools/testing/selftests/powerpc/pmu/.gitignore                   |    1 +
 tools/testing/selftests/powerpc/pmu/Makefile                     |    8 +-
 tools/testing/selftests/powerpc/pmu/count_stcx_fail.c            |  161 +++
 tools/testing/selftests/powerpc/pmu/ebb/trace.h                  |    4 +-
 tools/testing/selftests/powerpc/pmu/loop.S                       |   35 +
 tools/testing/selftests/powerpc/signal/Makefile                  |    2 +-
 tools/testing/selftests/powerpc/signal/sig_sc_double_restart.c   |  174 +++
 343 files changed, 10392 insertions(+), 8589 deletions(-)
 create mode 100644 Documentation/powerpc/vas-api.rst
 delete mode 100644 arch/powerpc/boot/dts/ep405.dts
 delete mode 100644 arch/powerpc/boot/dts/virtex440-ml507.dts
 delete mode 100644 arch/powerpc/boot/dts/virtex440-ml510.dts
 delete mode 100644 arch/powerpc/boot/dts/walnut.dts
 delete mode 100644 arch/powerpc/boot/ep405.c
 delete mode 100644 arch/powerpc/boot/treeboot-walnut.c
 delete mode 100644 arch/powerpc/boot/uartlite.c
 delete mode 100644 arch/powerpc/boot/virtex.c
 delete mode 100644 arch/powerpc/boot/virtex405-head.S
 delete mode 100644 arch/powerpc/configs/40x/ep405_defconfig
 delete mode 100644 arch/powerpc/configs/40x/virtex_defconfig
 delete mode 100644 arch/powerpc/configs/44x/virtex5_defconfig
 delete mode 100644 arch/powerpc/include/asm/asm-405.h
 create mode 100644 arch/powerpc/include/asm/idle.h
 create mode 100644 arch/powerpc/include/asm/inst.h
 delete mode 100644 arch/powerpc/include/asm/nohash/32/slice.h
 create mode 100644 arch/powerpc/include/asm/rtas-types.h
 delete mode 100644 arch/powerpc/include/asm/xilinx_intc.h
 delete mode 100644 arch/powerpc/include/asm/xilinx_pci.h
 create mode 100644 arch/powerpc/include/uapi/asm/vas-api.h
 create mode 100644 arch/powerpc/lib/inst.c
 create mode 100644 arch/powerpc/lib/test_code-patching.S
 create mode 100644 arch/powerpc/mm/book3s64/internal.h
 create mode 100644 arch/powerpc/mm/kasan/8xx.c
 create mode 100644 arch/powerpc/mm/kasan/book3s_32.c
 delete mode 100644 arch/powerpc/platforms/40x/ep405.c
 delete mode 100644 arch/powerpc/platforms/40x/virtex.c
 delete mode 100644 arch/powerpc/platforms/40x/walnut.c
 delete mode 100644 arch/powerpc/platforms/44x/virtex.c
 delete mode 100644 arch/powerpc/platforms/44x/virtex_ml510.c
 create mode 100644 arch/powerpc/platforms/powernv/vas-api.c
 create mode 100644 arch/powerpc/platforms/powernv/vas-fault.c
 delete mode 100644 arch/powerpc/sysdev/xilinx_intc.c
 delete mode 100644 arch/powerpc/sysdev/xilinx_pci.c
 create mode 100644 arch/powerpc/xmon/xmon_bpts.S
 create mode 100644 arch/powerpc/xmon/xmon_bpts.h
 rename drivers/crypto/nx/{nx-842-powernv.c => nx-common-powernv.c} (87%)
 delete mode 100644 drivers/input/serio/i8042-ppcio.h
 delete mode 100644 sound/drivers/ml403-ac97cr.c
 delete mode 100644 sound/drivers/pcm-indirect2.c
 delete mode 100644 sound/drivers/pcm-indirect2.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/Makefile
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/README
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/gunz_test.c
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/gzip_vas.c
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/include/copy-paste.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/include/crb.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/include/nx.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/include/nx_dbg.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/include/nxu.h
 create mode 120000 tools/testing/selftests/powerpc/nx-gzip/include/vas-api.h
 create mode 100755 tools/testing/selftests/powerpc/nx-gzip/nx-gzip-test.sh
 create mode 100644 tools/testing/selftests/powerpc/pmu/count_stcx_fail.c
 create mode 100644 tools/testing/selftests/powerpc/signal/sig_sc_double_restart.c


* Re: Boot issue with the latest Git kernel
From: Christian Zigotzky @ 2020-06-05 16:23 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev, jroedel
  Cc: Darren Stevens, Christoph Hellwig, R.T.Dickinson,
	Christian Zigotzky
In-Reply-To: <014e1268-dcce-61a3-8bcd-a06c43e0dfaf@csgroup.eu>

On 04 June 2020 at 7:15 pm, Christophe Leroy wrote:
> Yes today's linux-next boots on my powerpc 8xx board.
>
> Christophe
Hello Christophe,

Thanks for testing.

I was able to perform a 'git bisect' [1] and identified the bad commit [2].
I reverted this commit, and after that the kernel boots and works
without any problems.

Could you please check this commit?

Thanks,
Christian


[1] https://forum.hyperion-entertainment.com/viewtopic.php?p=50772#p50772
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2ba3e6947aed9bb9575eb1603c0ac6e39185d32a


* Re: [PATCH v2] cxl: Remove dead Kconfig option
From: Michael Ellerman @ 2020-06-05 16:16 UTC (permalink / raw)
  To: Andrew Donnellan, linuxppc-dev; +Cc: fbarrat
In-Reply-To: <20200602070545.11942-1-ajd@linux.ibm.com>

Andrew Donnellan <ajd@linux.ibm.com> writes:
> The CXL_AFU_DRIVER_OPS Kconfig option was added to coordinate merging of
> new features. It no longer serves any purpose, so remove it.
>
> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
>
> ---
> v1->v2:
> - keep CXL_LIB for now to avoid breaking a driver that is currently out of tree

Sorry I already merged v1.

cheers



* Re: [PATCH] tpm: ibmvtpm: Wait for ready buffer before probing for TPM2 attributes
From: Stefan Berger @ 2020-06-05 15:33 UTC (permalink / raw)
  To: David Gibson, Michael Ellerman, Peter Huewe, Jarkko Sakkinen,
	Jason Gunthorpe, Nayna Jain
  Cc: linuxppc-dev, linux-integrity, Paul Mackerras, linux-kernel
In-Reply-To: <20200605063719.456277-1-david@gibson.dropbear.id.au>

On 6/5/20 2:37 AM, David Gibson wrote:
> The tpm2_get_cc_attrs_tbl() call will result in TPM commands being issued,
> which will need the use of the internal command/response buffer.  But,
> we're issuing this *before* we've waited to make sure that buffer is
> allocated.
>
> This can result in intermittent failures to probe if the hypervisor / TPM
> implementation doesn't respond quickly enough.  I find it fails almost
> every time with an 8 vcpu guest under KVM with software emulated TPM.

Uuuh. Thanks!


> Fixes: 18b3670d79ae9 "tpm: ibmvtpm: Add support for TPM2"
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>



> ---
>   drivers/char/tpm/tpm_ibmvtpm.c | 14 +++++++-------
>   1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
> index 09fe45246b8c..994385bf37c0 100644
> --- a/drivers/char/tpm/tpm_ibmvtpm.c
> +++ b/drivers/char/tpm/tpm_ibmvtpm.c
> @@ -683,13 +683,6 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
>   	if (rc)
>   		goto init_irq_cleanup;
>   
> -	if (!strcmp(id->compat, "IBM,vtpm20")) {
> -		chip->flags |= TPM_CHIP_FLAG_TPM2;
> -		rc = tpm2_get_cc_attrs_tbl(chip);
> -		if (rc)
> -			goto init_irq_cleanup;
> -	}
> -
>   	if (!wait_event_timeout(ibmvtpm->crq_queue.wq,
>   				ibmvtpm->rtce_buf != NULL,
>   				HZ)) {
> @@ -697,6 +690,13 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
>   		goto init_irq_cleanup;
>   	}
>   
> +	if (!strcmp(id->compat, "IBM,vtpm20")) {
> +		chip->flags |= TPM_CHIP_FLAG_TPM2;
> +		rc = tpm2_get_cc_attrs_tbl(chip);
> +		if (rc)
> +			goto init_irq_cleanup;
> +	}
> +
>   	return tpm_chip_register(chip);
>   init_irq_cleanup:
>   	do {
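
The reordering in the quoted hunk can be illustrated outside the kernel with a
minimal user-space sketch. Everything below is hypothetical: 'struct fake_vtpm',
the 'fake_*' helpers and the poll counter are stand-ins invented for
illustration and are not the driver's real types or API. The sketch only models
the ordering: a command issued before the command/response buffer exists fails,
while waiting for the buffer first lets the same command succeed.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver state, not the real struct ibmvtpm_dev. */
struct fake_vtpm {
	void *rtce_buf;        /* command/response buffer, NULL until allocated */
	int polls_until_ready; /* polls before the simulated allocation happens */
};

/* Simulated bounded wait: returns 0 once the buffer exists, else -ETIMEDOUT. */
static int fake_wait_for_buffer(struct fake_vtpm *dev, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (dev->rtce_buf)
			return 0;
		/* Assumption: the "hypervisor" answers after a fixed number of polls. */
		if (dev->polls_until_ready > 0 && --dev->polls_until_ready == 0)
			dev->rtce_buf = malloc(64);
	}
	return dev->rtce_buf ? 0 : -ETIMEDOUT;
}

/* Any command needs the buffer; without it the command fails with -EIO. */
static int fake_send_command(const struct fake_vtpm *dev)
{
	return dev->rtce_buf ? 0 : -EIO;
}

/* Old ordering: the command is issued before the wait. */
static int fake_probe_before_fix(struct fake_vtpm *dev, int max_polls)
{
	int rc = fake_send_command(dev);

	if (rc)
		return rc;
	return fake_wait_for_buffer(dev, max_polls);
}

/* New ordering: wait for the buffer first, then issue the command. */
static int fake_probe_after_fix(struct fake_vtpm *dev, int max_polls)
{
	int rc = fake_wait_for_buffer(dev, max_polls);

	if (rc)
		return rc;
	return fake_send_command(dev);
}
```

With a device whose buffer appears after three polls, fake_probe_before_fix()
fails immediately, while fake_probe_after_fix() succeeds as long as the poll
budget is large enough. The error codes are arbitrary choices for the sketch.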





* RE: [RESEND PATCH v9 4/5] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
From: Vaibhav Jain @ 2020-06-05 15:21 UTC (permalink / raw)
  To: Williams, Dan J, linuxppc-dev@lists.ozlabs.org,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
  Cc: Aneesh Kumar K . V, Santosh Sivaraj, Oliver O'Halloran,
	Steven Rostedt, Weiny, Ira
In-Reply-To: <BN6PR11MB4132FA66A84CBD798AADCEC1C6860@BN6PR11MB4132.namprd11.prod.outlook.com>

"Williams, Dan J" <dan.j.williams@intel.com> writes:

>> -----Original Message-----
>> From: Vaibhav Jain <vaibhav@linux.ibm.com>
>> Sent: Thursday, June 4, 2020 2:06 AM
>> To: Williams, Dan J <dan.j.williams@intel.com>;
>> linuxppc-dev@lists.ozlabs.org; linux-nvdimm@lists.01.org;
>> linux-kernel@vger.kernel.org
>> Cc: Santosh Sivaraj <santosh@fossix.org>; Aneesh Kumar K . V
>> <aneesh.kumar@linux.ibm.com>; Steven Rostedt <rostedt@goodmis.org>;
>> Oliver O'Halloran <oohall@gmail.com>; Weiny, Ira <ira.weiny@intel.com>
>> Subject: RE: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for
>> PAPR nvdimm specific methods
>> 
>> Hi Dan,
>> 
>> Thanks for review and insights on this. My responses below:
>> 
>> "Williams, Dan J" <dan.j.williams@intel.com> writes:
>> 
>> > [ forgive formatting I'm temporarily stuck using Outlook this week... ]
>> >
>> >> From: Vaibhav Jain <vaibhav@linux.ibm.com>
>> > [..]
>> >>
>> >> Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
>> >> module and add the command family NVDIMM_FAMILY_PAPR to the white
>> >> list of NVDIMM command sets. Also advertise support for ND_CMD_CALL
>> >> for the nvdimm command mask and implement necessary scaffolding in
>> >> the module to handle ND_CMD_CALL ioctl and PDSM requests that we
>> >> receive.
>> >>
>> >> The layout of the PDSM request as we expect from libnvdimm/libndctl
>> >> is described in newly introduced uapi header 'papr_pdsm.h' which
>> >> defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used to
>> >> communicate the PDSM request via member 'nd_cmd_pkg.nd_command' and
>> >> size of payload that need to be sent/received for servicing the PDSM.
>> >>
>> >> A new function is_cmd_valid() is implemented that reads the args to
>> >> papr_scm_ndctl() and performs sanity tests on them. A new function
>> >> papr_scm_service_pdsm() is introduced and is called from
>> >> papr_scm_ndctl() in case of a PDSM request is received via
>> >> ND_CMD_CALL command from libnvdimm.
>> >>
>> >> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
>> >> Cc: Dan Williams <dan.j.williams@intel.com>
>> >> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> >> Cc: Ira Weiny <ira.weiny@intel.com>
>> >> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> >> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> >> ---
>> >> Changelog:
>> >>
>> >> Resend:
>> >> * Added ack from Aneesh.
>> >>
>> >> v8..v9:
>> >> * Reduced the usage of term SCM replacing it with appropriate
>> >>   replacement [ Dan Williams, Aneesh ]
>> >> * Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
>> >> * s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
>> >> * s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
>> >> * Minor updates to 'papr_pdsm.h' to replace usage of term 'SCM'.
>> >> * Minor update to patch description.
>> >>
>> >> v7..v8:
>> >> * Removed the 'payload_offset' field from 'struct
>> >>   nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
>> >>   at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
>> >> * To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
>> >>   'reserved' field of 10-bytes is introduced. [ Aneesh ]
>> >> * Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
>> >>   [ Ira ]
>> >>
>> >> Resend:
>> >> * None
>> >>
>> >> v6..v7 :
>> >> * Removed the re-definitions of __packed macro from papr_scm_pdsm.h
>> >>   [Mpe].
>> >> * Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
>> >> * Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
>> >>   [Mpe].
>> >> * Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]
>> >>
>> >> v5..v6 :
>> >> * Changed the usage of the term DSM to PDSM to distinguish it from the
>> >>   ACPI term [ Dan Williams ]
>> >> * Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
>> >>   to reflect the new terminology.
>> >> * Updated the patch description and title to reflect the new terminology.
>> >> * Squashed patch to introduce new command family in 'ndctl.h' with
>> >>   this patch [ Dan Williams ]
>> >> * Updated the papr_scm_pdsm method starting index from 0x10000 to 0x0
>> >>   [ Dan Williams ]
>> >> * Removed redundant license text from the papr_scm_pdsm.h file.
>> >>   [ Dan Williams ]
>> >> * s/envelop/envelope/ at various places [ Dan Williams ]
>> >> * Added '__packed' attribute to command package header to guard
>> >>   against different compiler adding paddings between the fields.
>> >>   [ Dan Williams]
>> >> * Converted various pr_debug to dev_debug [ Dan Williams ]
>> >>
>> >> v4..v5 :
>> >> * None
>> >>
>> >> v3..v4 :
>> >> * None
>> >>
>> >> v2..v3 :
>> >> * Updated the patch prefix to 'ndctl/uapi' [Aneesh]
>> >>
>> >> v1..v2 :
>> >> * None
>> >> ---
>> >>  arch/powerpc/include/uapi/asm/papr_pdsm.h | 136 ++++++++++++++++++++++
>> >>  arch/powerpc/platforms/pseries/papr_scm.c | 101 +++++++++++++++-
>> >>  include/uapi/linux/ndctl.h                |   1 +
>> >>  3 files changed, 232 insertions(+), 6 deletions(-)
>> >>  create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h
>> >>
>> >> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
>> >> new file mode 100644
>> >> index 000000000000..6407fefcc007
>> >> --- /dev/null
>> >> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
>> >> @@ -0,0 +1,136 @@
>> >> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
>> >> +/*
>> >> + * PAPR nvDimm Specific Methods (PDSM) and structs for libndctl
>> >> + *
>> >> + * (C) Copyright IBM 2020
>> >> + *
>> >> + * Author: Vaibhav Jain <vaibhav at linux.ibm.com>
>> >> + */
>> >> +
>> >> +#ifndef _UAPI_ASM_POWERPC_PAPR_PDSM_H_
>> >> +#define _UAPI_ASM_POWERPC_PAPR_PDSM_H_
>> >> +
>> >> +#include <linux/types.h>
>> >> +
>> >> +/*
>> >> + * PDSM Envelope:
>> >> + *
>> >> + * The ioctl ND_CMD_CALL transfers data between user-space and kernel via
>> >> + * envelope which consists of a header and user-defined payload sections.
>> >> + * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
>> >> + * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload' field.
>> >> + * There is reserved field that can used to introduce new fields to the
>> >> + * structure in future. It also tries to ensure that 'nd_pdsm_cmd_pkg.payload'
>> >> + * lies at a 8-byte boundary.
>> >> + *
>> >> + *  +-------------+---------------------+---------------------------+
>> >> + *  |   64-Bytes  |       16-Bytes      |       Max 176-Bytes       |
>> >> + *  +-------------+---------------------+---------------------------+
>> >> + *  |               nd_pdsm_cmd_pkg     |                           |
>> >> + *  |-------------+                     |                           |
>> >> + *  |  nd_cmd_pkg |                     |                           |
>> >> + *  +-------------+---------------------+---------------------------+
>> >> + *  | nd_family   |                     |                           |
>> >> + *  | nd_size_out | cmd_status          |                           |
>> >> + *  | nd_size_in  | payload_version     |     payload               |
>> >> + *  | nd_command  | reserved            |                           |
>> >> + *  | nd_fw_size  |                     |                           |
>> >> + *  +-------------+---------------------+---------------------------+
>> >> + *
>> >> + * PDSM Header:
>> >> + *
>> >> + * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
>> >> + * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to member
>> >> + * 'nd_cmd_pkg.nd_command'. Apart from size information of the envelope which is
>> >> + * contained in 'struct nd_cmd_pkg', the header also has members following
>> >> + * members:
>> >> + *
>> >> + * 'cmd_status'		: (Out) Errors if any encountered while servicing PDSM.
>> >> + * 'payload_version'	: (In/Out) Version number associated with the payload.
>> >> + * 'reserved'		: Not used and reserved for future.
>> >> + *
>> >> + * PDSM Payload:
>> >> + *
>> >> + * The layout of the PDSM Payload is defined by various structs shared between
>> >> + * papr_scm and libndctl so that contents of payload can be interpreted. During
>> >> + * servicing of a PDSM the papr_scm module will read input args from the payload
>> >> + * field by casting its contents to an appropriate struct pointer based on the
>> >> + * PDSM command. Similarly the output of servicing the PDSM command will be
>> >> + * copied to the payload field using the same struct.
>> >> + *
>> >> + * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope size, which
>> >> + * leaves around 176 bytes for the envelope payload (ignoring any padding that
>> >> + * the compiler may silently introduce).
>> >> + *
>> >> + * Payload Version:
>> >> + *
>> >> + * A 'payload_version' field is present in PDSM header that indicates a specific
>> >> + * version of the structure present in PDSM Payload for a given PDSM command.
>> >> + * This provides backward compatibility in case the PDSM Payload structure
>> >> + * evolves and different structures are supported by 'papr_scm' and 'libndctl'.
>> >> + *
>> >> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the version
>> >> + * of the payload struct it supports via 'payload_version' field. The 'papr_scm'
>> >> + * module when servicing the PDSM envelope checks the 'payload_version' and then
>> >> + * uses 'payload struct version' == MIN('payload_version field',
>> >> + * 'max payload-struct-version supported by papr_scm') to service the PDSM.
>> >> + * After servicing the PDSM, 'papr_scm' put the negotiated version of payload
>> >> + * struct in returned 'payload_version' field.
>> >> + *
>> >> + * Libndctl on receiving the envelope back from papr_scm again checks the
>> >> + * 'payload_version' field and based on it use the appropriate version dsm
>> >> + * struct to parse the results.
>> >> + *
>> >> + * Backward Compatibility:
>> >> + *
>> >> + * Above scheme of exchanging different versioned PDSM struct between libndctl
>> >> + * and papr_scm should provide backward compatibility until following two
>> >> + * assumptions/conditions when defining new PDSM structs hold:
>> >> + *
>> >> + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
>> >> + *
>> >> + * 1. T(X) is a proper subset of T(Y) if Y > X.
>> >> + *    i.e Each new version of PDSM struct should retain existing struct
>> >> + *    attributes from previous version
>> >> + *
>> >> + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
>> >> + *    it should also support T(1), T(2)...T(X - 1).
>> >> + *    i.e When adding support for new version of a PDSM struct, libndctl
>> >> + *    and papr_scm should retain support of the existing PDSM struct
>> >> + *    version they support.
>> >> + */
>> >> +
>> >> +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm */
>> >> +struct nd_pdsm_cmd_pkg {
>> >> +	struct nd_cmd_pkg hdr;	/* Package header containing sub-cmd */
>> >> +	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
>> >> +	__u16 reserved[5];	/* Ignored and to be used in future */
>> >> +	__u16 payload_version;	/* In/Out: version of the payload */
>> >> +	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
>> >> +} __packed;
>> >> +
>> >> +/*
>> >> + * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
>> >> + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
>> >> + */
>> >> +enum papr_pdsm {
>> >> +	PAPR_PDSM_MIN = 0x0,
>> >> +	PAPR_PDSM_MAX,
>> >> +};
>> >> +
>> >> +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */
>> >> +static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg *cmd)
>> >> +{
>> >> +	return (struct nd_pdsm_cmd_pkg *) cmd;
>> >> +}
>> >> +
>> >> +/* Return the payload pointer for a given pcmd */
>> >> +static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
>> >> +{
>> >> +	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
>> >> +		return NULL;
>> >> +	else
>> >> +		return (void *)(pcmd->payload);
>> >> +}
>> >> +
>> >> +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
>> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>> >> index 149431594839..5e2237e7ec08 100644
>> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> >> @@ -15,13 +15,15 @@
>> >>  #include <linux/seq_buf.h>
>> >>
>> >>  #include <asm/plpar_wrappers.h>
>> >> +#include <asm/papr_pdsm.h>
>> >>
>> >>  #define BIND_ANY_ADDR (~0ul)
>> >>
>> >>  #define PAPR_SCM_DIMM_CMD_MASK \
>> >>  	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
>> >>  	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
>> >> -	 (1ul << ND_CMD_SET_CONFIG_DATA))
>> >> +	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
>> >> +	 (1ul << ND_CMD_CALL))
>> >>
>> >>  /* DIMM health bitmap bitmap indicators */
>> >>  /* SCM device is unable to persist memory contents */
>> >> @@ -350,16 +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
>> >>  	return 0;
>> >>  }
>> >>
>> >> +/*
>> >> + * Validate the inputs args to dimm-control function and return '0' if valid.
>> >> + * This also does initial sanity validation to ND_CMD_CALL sub-command packages.
>> >> + */
>> >> +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
>> >> +		       unsigned int buf_len)
>> >> +{
>> >> +	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
>> >> +	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
>> >> +	struct papr_scm_priv *p;
>> >> +
>> >> +	/* Only dimm-specific calls are supported atm */
>> >> +	if (!nvdimm)
>> >> +		return -EINVAL;
>> >> +
>> >> +	/* get the provider date from struct nvdimm */
>> >> +	p = nvdimm_provider_data(nvdimm);
>> >> +
>> >> +	if (!test_bit(cmd, &cmd_mask)) {
>> >> +		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
>> >> +		return -EINVAL;
>> >> +	} else if (cmd == ND_CMD_CALL) {
>> >> +
>> >> +		/* Verify the envelope package */
>> >> +		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
>> >> +			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
>> >> +				buf_len);
>> >> +			return -EINVAL;
>> >> +		}
>> >> +
>> >> +		/* Verify that the PDSM family is valid */
>> >> +		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
>> >> +			dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n",
>> >> +				pkg->hdr.nd_family);
>> >> +			return -EINVAL;
>> >> +
>> >> +		}
>> >> +
>> >> +		/* We except a payload with all PDSM commands */
>> >> +		if (pdsm_cmd_to_payload(pkg) == NULL) {
>> >> +			dev_dbg(&p->pdev->dev,
>> >> +				"Empty payload for sub-command=0x%llx\n",
>> >> +				pkg->hdr.nd_command);
>> >> +			return -EINVAL;
>> >> +		}
>> >> +	}
>> >> +
>> >> +	/* Command looks valid */
>> >
>> <snip>
>> > So this is where I would expect the kernel to validate the command vs
>> > a known list of supported commands / payloads. One of the goals of
>> > requiring public documentation of any commands that libnvdimm might
>> > support for the ioctl path is to give the kernel the ability to gate
>> > future enabling on consideration of a common kernel front-end
>> > interface. I believe this would also address questions about the
>> > versioning scheme because userspace would be actively prevented from
>> > sending command payloads that were not first explicitly enabled in the
>> > kernel. This interface as it stands in this patch set seems to be a
>> > very thin / "anything goes" passthrough with no consideration for that
>> > policy.
>> >
>> > As an example of the utility of this policy, consider the recent
>> > support for nvdimm security commands that allow a passphrase to be set
>> > and issue commands like "unlock" and "secure erase". The kernel
>> > actively prevents those commands from being sent from userspace. See
>> > acpi_nfit_clear_to_send() and nd_cmd_clear_to_send(). The reasoning is
>> > that it enforces the kernel's nvdimm security model that uses
>> > encrypted/trusted keys to protect key material (clear text keys
>> > only-ever exist in kernel-space). Yes, that restriction is painful for
>> > people that don't want the kernel's security model and just want the
>> > simplicity of passing clear-text keys around, but it's necessary for
>> > the kernel to have any chance to provide a common abstraction across
>> > vendors. The pain of negotiating every single command with what the
>> > kernel will support is useful for the long term health of the kernel.
>> > It forces ongoing conversations across vendors to consolidate
>> > interfaces and reuse kernel best practices like encrypted/trusted
>> > keys. Code acceptance is the only real gate the kernel has to enforce
>> > cooperation across vendors.
>> >
>> > The expectation is that the kernel does not allow any command to pass
>> > that is not explicitly listed in a bitmap of known commands. I would
>> > expect that if you changed the payload of an existing command that
>> > would likely require a new entry in this bitmap. The goal is to give
>> > the kernel a chance to constrain the passthrough interface to afford a
>> > chance to have a discussion of what might be done in a common
>> > implementation. Another example is the label-area read-write commands.
>> > The kernel needs explicit control to ensure that it owns the label
>> > area and that userspace is not able to corrupt it (write it behind the
>> > kernel's back).
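
To make the "bitmap of known commands" idea above concrete, here is a minimal
user-space sketch, not the libnvdimm implementation. Every name and command
number in it (sketch_cmd, sketch_pdsm, SKETCH_CMD_MASK, SKETCH_PDSM_MASK,
sketch_clear_to_send) is hypothetical and chosen only for illustration. It
assumes two levels of gating: the ioctl command itself, and the sub-command
carried inside a CALL-style envelope.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical command numbers, for illustration only. */
enum sketch_cmd {
	SKETCH_CMD_GET_CONFIG_SIZE = 4,
	SKETCH_CMD_GET_CONFIG_DATA = 5,
	SKETCH_CMD_SET_CONFIG_DATA = 6,
	SKETCH_CMD_CALL = 10,
};

/* Hypothetical sub-commands carried inside a CALL envelope. */
enum sketch_pdsm {
	SKETCH_PDSM_HEALTH = 1,
};

#define SKETCH_BIT(n)	(1ul << (n))

/* Allow-list of top-level commands the driver explicitly supports. */
#define SKETCH_CMD_MASK \
	(SKETCH_BIT(SKETCH_CMD_GET_CONFIG_SIZE) | \
	 SKETCH_BIT(SKETCH_CMD_GET_CONFIG_DATA) | \
	 SKETCH_BIT(SKETCH_CMD_SET_CONFIG_DATA) | \
	 SKETCH_BIT(SKETCH_CMD_CALL))

/* Allow-list of sub-commands; a new payload needs a new bit here. */
#define SKETCH_PDSM_MASK	SKETCH_BIT(SKETCH_PDSM_HEALTH)

static_assert(SKETCH_CMD_CALL < sizeof(unsigned long) * CHAR_BIT,
	      "command numbers must fit in the bitmap");

/* Return non-zero only if 'nr' fits in the mask and its bit is set. */
static int sketch_bit_allowed(unsigned long mask, unsigned long nr)
{
	return nr < sizeof(mask) * CHAR_BIT && (mask & SKETCH_BIT(nr));
}

/* Reject anything that is not explicitly listed at both levels. */
static int sketch_clear_to_send(unsigned int cmd, unsigned long sub_cmd)
{
	if (!sketch_bit_allowed(SKETCH_CMD_MASK, cmd))
		return -EINVAL;
	if (cmd == SKETCH_CMD_CALL &&
	    !sketch_bit_allowed(SKETCH_PDSM_MASK, sub_cmd))
		return -EINVAL;
	return 0;
}
```

With this shape, a caller would see 0 for (SKETCH_CMD_CALL,
SKETCH_PDSM_HEALTH) and -EINVAL for any sub-command that has no bit in
SKETCH_PDSM_MASK, so an unknown payload never reaches the device.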
>> >
>> > Now that said, I have battle scars with some OEMs that just want a
>> > generic passthrough interface so they never need to work with the
>> > kernel community again and can just write their custom validation
>> > tooling and be done. I've mostly been successful in that fight outside
>> > of the gaping hole of ND_CMD_VENDOR. That's the path that ipmctl has
>> > used to issue commands that have not made it into the public
>> > specification on docs.pmem.io. My warning shot for that is the
>> > "disable_vendor_specific" module option that administrators can set to
>> > only allow commands that the kernel explicitly knows the effects of to
>> > be issued. The result is only tooling / enabling that submits to this
>> > auditing regime is guaranteed to work everywhere.
>> 
>> Agree with points made above. With this patchset we aren't really trying to
>> push an ioctl passthrough to exchange arbitrary data with papr-scm module.
>> Nor do we want to bypass the kernel community for any future
>> enhancements on this interface. We made some design choices based on
>> our understanding of certain restrictions we saw in ndctl/libndctl. Specifically,
>> we wanted to avoid issuing two CMD_CALL ioctl roundtrips.
>> 
>> That being said I had an extended discussion with Aneesh rethinking the
>> 'version' field and we both agreed *to remove this field* from the proposed
>> 'struct nd_pdsm_cmd_pkg'. This should resolve the contentions around this
>> Patch-4 in this patchset. Since the 'version' field isn't extensively used right
>> now the impact on the patchset would be small.
>> 
>> >
>> > So, that long explanation out of the way, what does that mean for this
>> > patch set? I'd like to understand if you still see a need for a
>> > versioning scheme if the implementation is required to explicitly list
>> > all the commands it supports? I.e. that the kernel need not worry
>> > about userspace sending future unknown payloads because unknown
>> > payloads are blocked. Also if your interface has anything similar to a
>> > "vendor specific" passthrough I would like to require that go through
>> > the ND_CMD_VENDOR ioctl, so that the kernel still has a common check
>> > point to prevent vendor specific "I don't want to talk to the kernel
>> > community" shenanigans, but even better if ND_CMD_VENDOR is something
>> > the kernel can eventually jettison because nobody is using it.
>> 
>> As I mentioned above this isn't a 'vendor specific passthrough'
>> mechanism. The 'version' field was proposed to avoid two CMD_CALL ioctl
>> roundtrips to fetch and report extended nvdimm health data like
>> 'life-remaining' which isn't always available for papr-scm.
>

> Oh, why not define a maximal health payload with all the attributes
> you know about today, leave some room for future expansion, and then
> report a validity flag for each attribute? This is how the "intel"
> smart-health payload works. If they ever needed to extend the payload
> they would increase the size and add more validity flags. Old
> userspace never groks the new fields, new userspace knows to ask for
> and parse the larger payload.
>
> See the flags field in 'struct nd_intel_smart' (in ndctl) and the
> translation of those flags to ndctl generic attribute flags
> intel_cmd_smart_get_flags().
>
> In general I'd like ndctl to understand the superset of all health
> attributes across all vendors. For the truly vendor specific ones it
> would mean that the health flags with a specific "papr_scm" back-end
> just would never be set on an "intel" device. I.e. look at the "hpe"
> and "msft" health backends. They only set a subset of the valid flags
> that could be reported.

Thanks, this sounds good. In fact, the papr_scm implementation in ndctl
advertises support for only a subset of ND_SMART_* flags right now.

Using 'flags' instead of 'version' was indeed discussed during
v7..v9. However, after re-looking at the 'msft' and 'hpe' implementations,
the approach of a maximal health payload tagged with a flags field looks
more intuitive and I would prefer implementing this scheme in this patch-set.

The current set of health data exchanged between libndctl and
papr_scm via 'struct nd_papr_pdsm_health' (e.g. various health status
bits, nvdimm arming status etc.) is guaranteed to be always available,
hence associating its availability with a flag won't be very useful as
the flag will always be set.

However, as you suggested, extending 'struct nd_papr_pdsm_health' in
future to accommodate new attributes like 'life-remaining' can be done
by adding them to the end of the struct and setting a bit in a flags
field to indicate their presence.

So I have the following proposal:
* Add a new '__u32 extension_flags' field at the beginning of 'struct
  nd_papr_pdsm_health'.
* Set the size of the struct to 184 bytes, which is the maximum possible
  size for a pdsm payload.
* The 'papr_scm' kernel driver will currently set 'extension_flags' to 0,
  indicating no extension fields.

* A future patch that adds support for 'life-remaining' adds the new field
  at the end of the known fields in 'struct nd_papr_pdsm_health'.
* When the struct is provided to the papr_scm kernel module, if
  'life-remaining' data is available it is populated and the corresponding
  flag is set in the 'extension_flags' field to indicate its presence.
* When the struct is received by the libndctl papr_scm implementation, it
  tests whether 'extension_flags' has the 'life-remaining' flag set and, if
  so, returns the ND_SMART_USED_VALID flag from
  ndctl_cmd_smart_get_flags().
  
Implementing the first 3 items above in the current patchset should be
fairly trivial.
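
To illustrate the proposal, here is a rough user-space sketch of a fixed-size
health payload with an extension flags field. It is only a sketch under stated
assumptions: the struct name, the field names, the flag name
SKETCH_EXT_LIFE_REMAINING and the helper functions are all hypothetical, and
the 184-byte size is simply the figure quoted above rather than something the
code derives.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for illustration; not taken from the patch. */
#define SKETCH_PAYLOAD_SIZE		184
#define SKETCH_EXT_LIFE_REMAINING	(1u << 0)

struct sketch_pdsm_health {
	union {
		struct {
			/* Bitmap saying which extension fields are valid */
			uint32_t extension_flags;
			/* Fields that are always valid */
			uint8_t dimm_unarmed;
			uint8_t dimm_bad_shutdown;
			uint8_t dimm_health;
			uint8_t pad;
			/* Extension field: valid only if its flag is set */
			uint16_t life_remaining;
		};
		/* Pins the payload size so later additions do not change it */
		uint8_t buf[SKETCH_PAYLOAD_SIZE];
	};
};

static_assert(sizeof(struct sketch_pdsm_health) == SKETCH_PAYLOAD_SIZE,
	      "payload size must stay fixed");

/* Producer side: set the flag only when the data really is available. */
static void sketch_fill_health(struct sketch_pdsm_health *h,
			       int have_life, uint16_t life)
{
	memset(h, 0, sizeof(*h));
	h->dimm_health = 1;
	if (have_life) {
		h->life_remaining = life;
		h->extension_flags |= SKETCH_EXT_LIFE_REMAINING;
	}
}

/* Consumer side: return 1 and store the value only if it is flagged valid. */
static int sketch_get_life_remaining(const struct sketch_pdsm_health *h,
				     uint16_t *out)
{
	if (!(h->extension_flags & SKETCH_EXT_LIFE_REMAINING))
		return 0;
	*out = h->life_remaining;
	return 1;
}
```

An old consumer that knows nothing about 'life_remaining' keeps working
because the struct size and the offsets of the existing fields do not change;
a new consumer checks the flag before trusting the field.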

Does that sound reasonable?

Thanks,
~ Vaibhav

>
>> However we just realized instead of relying on 'version' field we can
>> advertise support for these extended attributes via nvdimm-flags from sysfs.
>> Looking at the nvdimm-flags libndctl can use an appropriate pdsm command
>> and struct to fetch the dimm health information from papr_scm via
>> CMD_CALL.
>> 
>> But that's something we plan to do in future and not with the current
>> patchset which only reports fixed set of nvdimm health attributes.
>> 
>> >
>> > I feel like this is a conversation that will take a few days to
>> > resolve, which does not leave time to push this for v5.8. That said, I
>> > do think the health flags patches at the beginning of this series are
>> > low risk and uncontentious. How about I merge those for v5.8 and
>> > circle back to get this ioctl path queued early in v5.8-rc? Apologies
>> > for the late feedback on this relative to v5.8.
>> >
>> Thanks for this consideration. Agree to the proposal. However, the changes to
>> the patchset for removing the 'version' field are fairly small, hence I can quickly push
>> an updated patch series incorporating the rest of the review comments from Ira.
>> 
>> Does that sound reasonable?
>

^ permalink raw reply

* Re: [PATCH v1 2/4] KVM: PPC: Book3S HV: track shared GFNs of secure VMs
From: Ram Pai @ 2020-06-05 14:38 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: cclaudio, kvm-ppc, bharata, aneesh.kumar, sukadev, linuxppc-dev,
	bauerman, david
In-Reply-To: <4e1a5f90-984a-129c-d336-98fc90019379@linux.ibm.com>

> >diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> >index 803940d..3448459 100644
> >--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> >+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> >@@ -1100,7 +1100,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
> >  	unsigned int shift;
> >  	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START)
> >-		kvmppc_uvmem_drop_pages(memslot, kvm, true);
> >+		kvmppc_uvmem_drop_pages(memslot, kvm, true, false);
> 
> Why purge_gfn is false here?
> That function is called when dropping a hot-plugged memslot.

This function does not know in what context it is called. Since
its job is just to flush the memslot, it cannot assume anything
about purging the pages in the memslot.

.snip..


RP

^ permalink raw reply

* [PATCH] powerpc/mm: Fix typo in IS_ENABLED()
From: Kees Cook @ 2020-06-05 14:18 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Joe Perches, linuxppc-dev, linux-kernel

From: Joe Perches <joe@perches.com>

IS_ENABLED() matches names exactly, so the missing "CONFIG_" prefix
means this code would never be built.

Also fixes a missing newline in pr_warn().
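
For readers unfamiliar with why the prefix matters, the following is a
simplified, stand-alone sketch of the preprocessor trick behind this kind of
check. It is not the kernel's definition (that lives in
include/linux/kconfig.h and differs in detail); all SKETCH_* names are
hypothetical. The point it shows is that an option name that is not defined
to 1 quietly evaluates to 0 rather than producing a build error.

```c
#include <assert.h>

/* Expands to "0," only when the pasted suffix is exactly 1. */
#define SKETCH_PLACEHOLDER_1 0,

/* Pick the second argument; extra arguments are ignored. */
#define SKETCH_SECOND(ignored, val, ...) val

/*
 * If 'x' is defined to 1, the placeholder becomes "0," and the argument
 * list is (0, 1, 0, 0), so the result is 1. Otherwise the placeholder is
 * an unknown identifier, the list is (junk 1, 0, 0), and the result is 0.
 */
#define SKETCH_DEFINED_(arg1_or_junk) SKETCH_SECOND(arg1_or_junk 1, 0, 0)
#define SKETCH_DEFINED__(val) SKETCH_DEFINED_(SKETCH_PLACEHOLDER_##val)
#define SKETCH_IS_DEFINED(x) SKETCH_DEFINED__(x)

/* Enabled means built-in (NAME=1) or modular (NAME_MODULE=1). */
#define SKETCH_IS_ENABLED(option) \
	(SKETCH_IS_DEFINED(option) || SKETCH_IS_DEFINED(option##_MODULE))

/* Stand-in for what a generated config header would provide. */
#define CONFIG_STRICT_KERNEL_RWX 1

/* With the prefix, the option is seen as enabled. */
static_assert(SKETCH_IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) == 1,
	      "prefixed name is enabled");

/* Without the prefix, the name is unknown and silently evaluates to 0. */
static_assert(SKETCH_IS_ENABLED(STRICT_KERNEL_RWX) == 0,
	      "unprefixed name is never enabled");
```

Because the unprefixed form still compiles, the compiler gives no hint that
the guarded branch is dead, which is how a typo like this can go unnoticed.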

Signed-off-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/lkml/b08611018fdb6d88757c6008a5c02fa0e07b32fb.camel@perches.com
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/powerpc/mm/book3s64/hash_utils.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 8ed2411c3f39..cf2e1b06e5d4 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -660,11 +660,10 @@ static void __init htab_init_page_sizes(void)
 		 * Pick a size for the linear mapping. Currently, we only
 		 * support 16M, 1M and 4K which is the default
 		 */
-		if (IS_ENABLED(STRICT_KERNEL_RWX) &&
+		if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) &&
 		    (unsigned long)_stext % 0x1000000) {
 			if (mmu_psize_defs[MMU_PAGE_16M].shift)
-				pr_warn("Kernel not 16M aligned, "
-					"disabling 16M linear map alignment");
+				pr_warn("Kernel not 16M aligned, disabling 16M linear map alignment\n");
 			aligned = false;
 		}
 
-- 
2.25.1


-- 
Kees Cook

^ permalink raw reply related

* Re: [PATCH v4 1/4] riscv: Move kernel mapping to vmalloc zone
From: Alex Ghiti @ 2020-06-05 12:30 UTC (permalink / raw)
  To: Zong Li
  Cc: Albert Ou, Anup Patel, linux-kernel@vger.kernel.org List,
	Atish Patra, Paul Mackerras, Paul Walmsley, Palmer Dabbelt,
	linux-riscv, linuxppc-dev
In-Reply-To: <CANXhq0qjWKCqbY4BmCa1wZKYY_Dax8fGj1s4Q_ZipaFPo9dz8g@mail.gmail.com>

Hi Zong,

On 6/3/20 at 10:52 PM, Zong Li wrote:
> On Wed, Jun 3, 2020 at 4:01 PM Alexandre Ghiti <alex@ghiti.fr> wrote:
>> This is a preparatory patch for relocatable kernel.
>>
>> The kernel used to be linked at PAGE_OFFSET address and used to be loaded
>> physically at the beginning of the main memory. Therefore, we could use
>> the linear mapping for the kernel mapping.
>>
>> But the relocated kernel base address will be different from PAGE_OFFSET
>> and since in the linear mapping, two different virtual addresses cannot
>> point to the same physical address, the kernel mapping needs to lie outside
>> the linear mapping.
>>
>> In addition, because modules and BPF must be close to the kernel (inside
>> +-2GB window), the kernel is placed at the end of the vmalloc zone minus
>> 2GB, which leaves room for modules and BPF. The kernel could not be
>> placed at the beginning of the vmalloc zone since other vmalloc
>> allocations from the kernel could get all the +-2GB window around the
>> kernel which would prevent new modules and BPF programs from being loaded.
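
As a rough illustration of the two mappings described above, the sketch below
mirrors the address arithmetic of the patch in plain user-space C. It is not
the kernel code: the struct and function names are hypothetical, and the
PAGE_OFFSET and load address used in the usage note are example values, not
taken from any real configuration. It assumes VMALLOC_END == PAGE_OFFSET - 1,
as in the quoted pgtable.h hunk, so the kernel mapping starts 2GB below
PAGE_OFFSET.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_2G 0x80000000ULL

/* Hypothetical container for the values the patch keeps in globals. */
struct sketch_layout {
	uint64_t page_offset;		/* start of the linear mapping */
	uint64_t kernel_virt_addr;	/* start of the kernel mapping */
	uint64_t va_pa_offset;		/* linear VA minus PA */
	uint64_t va_kernel_pa_offset;	/* kernel VA minus PA */
};

static struct sketch_layout sketch_layout_init(uint64_t page_offset,
					       uint64_t load_pa)
{
	struct sketch_layout l;

	assert(page_offset >= SKETCH_SZ_2G);
	l.page_offset = page_offset;
	/* VMALLOC_END - SZ_2G + 1, with VMALLOC_END == PAGE_OFFSET - 1 */
	l.kernel_virt_addr = page_offset - SKETCH_SZ_2G;
	l.va_pa_offset = page_offset - load_pa;
	l.va_kernel_pa_offset = l.kernel_virt_addr - load_pa;
	return l;
}

/*
 * Same decision as the patched __va_to_pa_nodebug(): addresses at or above
 * PAGE_OFFSET belong to the linear mapping, everything below is treated as
 * the kernel mapping.
 */
static uint64_t sketch_va_to_pa(const struct sketch_layout *l, uint64_t va)
{
	return va >= l->page_offset ? va - l->va_pa_offset
				    : va - l->va_kernel_pa_offset;
}
```

For example, with a PAGE_OFFSET of 0xffffffe000000000 and a load address of
0x80200000, both the kernel-mapping address kernel_virt_addr + 0x1000 and the
linear-mapping address PAGE_OFFSET + 0x1000 translate to the physical address
0x80201000, which is why each mapping needs its own offset.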
>>
>> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
>> ---
>>   arch/riscv/boot/loader.lds.S     |  3 +-
>>   arch/riscv/include/asm/page.h    | 10 +++++-
>>   arch/riscv/include/asm/pgtable.h | 38 ++++++++++++++-------
>>   arch/riscv/kernel/head.S         |  3 +-
>>   arch/riscv/kernel/module.c       |  4 +--
>>   arch/riscv/kernel/vmlinux.lds.S  |  3 +-
>>   arch/riscv/mm/init.c             | 58 +++++++++++++++++++++++++-------
>>   arch/riscv/mm/physaddr.c         |  2 +-
>>   8 files changed, 88 insertions(+), 33 deletions(-)
>>
>> diff --git a/arch/riscv/boot/loader.lds.S b/arch/riscv/boot/loader.lds.S
>> index 47a5003c2e28..62d94696a19c 100644
>> --- a/arch/riscv/boot/loader.lds.S
>> +++ b/arch/riscv/boot/loader.lds.S
>> @@ -1,13 +1,14 @@
>>   /* SPDX-License-Identifier: GPL-2.0 */
>>
>>   #include <asm/page.h>
>> +#include <asm/pgtable.h>
>>
>>   OUTPUT_ARCH(riscv)
>>   ENTRY(_start)
>>
>>   SECTIONS
>>   {
>> -       . = PAGE_OFFSET;
>> +       . = KERNEL_LINK_ADDR;
>>
>>          .payload : {
>>                  *(.payload)
>> diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
>> index 2d50f76efe48..48bb09b6a9b7 100644
>> --- a/arch/riscv/include/asm/page.h
>> +++ b/arch/riscv/include/asm/page.h
>> @@ -90,18 +90,26 @@ typedef struct page *pgtable_t;
>>
>>   #ifdef CONFIG_MMU
>>   extern unsigned long va_pa_offset;
>> +extern unsigned long va_kernel_pa_offset;
>>   extern unsigned long pfn_base;
>>   #define ARCH_PFN_OFFSET                (pfn_base)
>>   #else
>>   #define va_pa_offset           0
>> +#define va_kernel_pa_offset    0
>>   #define ARCH_PFN_OFFSET                (PAGE_OFFSET >> PAGE_SHIFT)
>>   #endif /* CONFIG_MMU */
>>
>>   extern unsigned long max_low_pfn;
>>   extern unsigned long min_low_pfn;
>> +extern unsigned long kernel_virt_addr;
>>
>>   #define __pa_to_va_nodebug(x)  ((void *)((unsigned long) (x) + va_pa_offset))
>> -#define __va_to_pa_nodebug(x)  ((unsigned long)(x) - va_pa_offset)
>> +#define linear_mapping_va_to_pa(x)     ((unsigned long)(x) - va_pa_offset)
>> +#define kernel_mapping_va_to_pa(x)     \
>> +       ((unsigned long)(x) - va_kernel_pa_offset)
>> +#define __va_to_pa_nodebug(x)          \
>> +       (((x) >= PAGE_OFFSET) ?         \
>> +               linear_mapping_va_to_pa(x) : kernel_mapping_va_to_pa(x))
>>
>>   #ifdef CONFIG_DEBUG_VIRTUAL
>>   extern phys_addr_t __virt_to_phys(unsigned long x);
>> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>> index 35b60035b6b0..94ef3b49dfb6 100644
>> --- a/arch/riscv/include/asm/pgtable.h
>> +++ b/arch/riscv/include/asm/pgtable.h
>> @@ -11,23 +11,29 @@
>>
>>   #include <asm/pgtable-bits.h>
>>
>> -#ifndef __ASSEMBLY__
>> -
>> -/* Page Upper Directory not used in RISC-V */
>> -#include <asm-generic/pgtable-nopud.h>
>> -#include <asm/page.h>
>> -#include <asm/tlbflush.h>
>> -#include <linux/mm_types.h>
>> -
>> -#ifdef CONFIG_MMU
>> +#ifndef CONFIG_MMU
>> +#define KERNEL_VIRT_ADDR       PAGE_OFFSET
>> +#define KERNEL_LINK_ADDR       PAGE_OFFSET
>> +#else
>> +/*
>> + * Leave 2GB for modules and BPF that must lie within a 2GB range around
>> + * the kernel.
>> + */
>> +#define KERNEL_VIRT_ADDR       (VMALLOC_END - SZ_2G + 1)
>> +#define KERNEL_LINK_ADDR       KERNEL_VIRT_ADDR
>>
>>   #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
>>   #define VMALLOC_END      (PAGE_OFFSET - 1)
>>   #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
>>
>>   #define BPF_JIT_REGION_SIZE    (SZ_128M)
>> -#define BPF_JIT_REGION_START   (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
>> -#define BPF_JIT_REGION_END     (VMALLOC_END)
>> +#define BPF_JIT_REGION_START   PFN_ALIGN((unsigned long)&_end)
>> +#define BPF_JIT_REGION_END     (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
>> +
>> +#ifdef CONFIG_64BIT
>> +#define VMALLOC_MODULE_START   BPF_JIT_REGION_END
>> +#define VMALLOC_MODULE_END     (((unsigned long)&_start & PAGE_MASK) + SZ_2G)
>> +#endif
>>
>>   /*
>>    * Roughly size the vmemmap space to be large enough to fit enough
>> @@ -57,9 +63,16 @@
>>   #define FIXADDR_SIZE     PGDIR_SIZE
>>   #endif
>>   #define FIXADDR_START    (FIXADDR_TOP - FIXADDR_SIZE)
>> -
>>   #endif
>>
>> +#ifndef __ASSEMBLY__
>> +
>> +/* Page Upper Directory not used in RISC-V */
>> +#include <asm-generic/pgtable-nopud.h>
>> +#include <asm/page.h>
>> +#include <asm/tlbflush.h>
>> +#include <linux/mm_types.h>
>> +
>>   #ifdef CONFIG_64BIT
>>   #include <asm/pgtable-64.h>
>>   #else
>> @@ -483,6 +496,7 @@ static inline void __kernel_map_pages(struct page *page, int numpages, int enabl
>>
>>   #define kern_addr_valid(addr)   (1) /* FIXME */
>>
>> +extern char _start[];
>>   extern void *dtb_early_va;
>>   void setup_bootmem(void);
>>   void paging_init(void);
>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>> index 98a406474e7d..8f5bb7731327 100644
>> --- a/arch/riscv/kernel/head.S
>> +++ b/arch/riscv/kernel/head.S
>> @@ -49,7 +49,8 @@ ENTRY(_start)
>>   #ifdef CONFIG_MMU
>>   relocate:
>>          /* Relocate return address */
>> -       li a1, PAGE_OFFSET
>> +       la a1, kernel_virt_addr
>> +       REG_L a1, 0(a1)
>>          la a2, _start
>>          sub a1, a1, a2
>>          add ra, ra, a1
>> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
>> index 8bbe5dbe1341..1a8fbe05accf 100644
>> --- a/arch/riscv/kernel/module.c
>> +++ b/arch/riscv/kernel/module.c
>> @@ -392,12 +392,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>>   }
>>
>>   #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
>> -#define VMALLOC_MODULE_START \
>> -        max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>>   void *module_alloc(unsigned long size)
>>   {
>>          return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
>> -                                   VMALLOC_END, GFP_KERNEL,
>> +                                   VMALLOC_MODULE_END, GFP_KERNEL,
>>                                      PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
>>                                      __builtin_return_address(0));
>>   }
>> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
>> index 0339b6bbe11a..a9abde62909f 100644
>> --- a/arch/riscv/kernel/vmlinux.lds.S
>> +++ b/arch/riscv/kernel/vmlinux.lds.S
>> @@ -4,7 +4,8 @@
>>    * Copyright (C) 2017 SiFive
>>    */
>>
>> -#define LOAD_OFFSET PAGE_OFFSET
>> +#include <asm/pgtable.h>
>> +#define LOAD_OFFSET KERNEL_LINK_ADDR
>>   #include <asm/vmlinux.lds.h>
>>   #include <asm/page.h>
>>   #include <asm/cache.h>
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index 736de6c8739f..37be2eb45e58 100644
>> --- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -22,6 +22,9 @@
>>
>>   #include "../kernel/head.h"
>>
>> +unsigned long kernel_virt_addr = KERNEL_VIRT_ADDR;
>> +EXPORT_SYMBOL(kernel_virt_addr);
>> +
>>   unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
>>                                                          __page_aligned_bss;
>>   EXPORT_SYMBOL(empty_zero_page);
>> @@ -178,8 +181,12 @@ void __init setup_bootmem(void)
>>   }
>>
>>   #ifdef CONFIG_MMU
>> +/* Offset between linear mapping virtual address and kernel load address */
>>   unsigned long va_pa_offset;
>>   EXPORT_SYMBOL(va_pa_offset);
>> +/* Offset between kernel mapping virtual address and kernel load address */
>> +unsigned long va_kernel_pa_offset;
>> +EXPORT_SYMBOL(va_kernel_pa_offset);
>>   unsigned long pfn_base;
>>   EXPORT_SYMBOL(pfn_base);
>>
>> @@ -271,7 +278,7 @@ static phys_addr_t __init alloc_pmd(uintptr_t va)
>>          if (mmu_enabled)
>>                  return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
>>
>> -       pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT;
>> +       pmd_num = (va - kernel_virt_addr) >> PGDIR_SHIFT;
>>          BUG_ON(pmd_num >= NUM_EARLY_PMDS);
>>          return (uintptr_t)&early_pmd[pmd_num * PTRS_PER_PMD];
>>   }
>> @@ -372,14 +379,30 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
>>   #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
>>   #endif
>>
>> +static uintptr_t load_pa, load_sz;
>> +
>> +void create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size)
> It could be static if this function is only used in this file, as
> kbuild test reported.
> Apart from this, it looks good to me.
> Reviewed-by: Zong Li <zong.li@sifive.com>


Thanks, that was the missing Reviewed-by of this series :) I'll send a v5
right now.

Looking forward to seeing your KASLR patchset on top of that.

Alex


>
>> +{
>> +       uintptr_t va, end_va;
>> +
>> +       end_va = kernel_virt_addr + load_sz;
>> +       for (va = kernel_virt_addr; va < end_va; va += map_size)
>> +               create_pgd_mapping(pgdir, va,
>> +                                  load_pa + (va - kernel_virt_addr),
>> +                                  map_size, PAGE_KERNEL_EXEC);
>> +}
>> +
>>   asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>   {
>>          uintptr_t va, end_va;
>> -       uintptr_t load_pa = (uintptr_t)(&_start);
>> -       uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
>>          uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
>>
>> +       load_pa = (uintptr_t)(&_start);
>> +       load_sz = (uintptr_t)(&_end) - load_pa;
>> +
>>          va_pa_offset = PAGE_OFFSET - load_pa;
>> +       va_kernel_pa_offset = kernel_virt_addr - load_pa;
>> +
>>          pfn_base = PFN_DOWN(load_pa);
>>
>>          /*
>> @@ -402,26 +425,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>          create_pmd_mapping(fixmap_pmd, FIXADDR_START,
>>                             (uintptr_t)fixmap_pte, PMD_SIZE, PAGE_TABLE);
>>          /* Setup trampoline PGD and PMD */
>> -       create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>> +       create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>                             (uintptr_t)trampoline_pmd, PGDIR_SIZE, PAGE_TABLE);
>> -       create_pmd_mapping(trampoline_pmd, PAGE_OFFSET,
>> +       create_pmd_mapping(trampoline_pmd, kernel_virt_addr,
>>                             load_pa, PMD_SIZE, PAGE_KERNEL_EXEC);
>>   #else
>>          /* Setup trampoline PGD */
>> -       create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>> +       create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>                             load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
>>   #endif
>>
>>          /*
>> -        * Setup early PGD covering entire kernel which will allows
>> +        * Setup early PGD covering entire kernel which will allow
>>           * us to reach paging_init(). We map all memory banks later
>>           * in setup_vm_final() below.
>>           */
>> -       end_va = PAGE_OFFSET + load_sz;
>> -       for (va = PAGE_OFFSET; va < end_va; va += map_size)
>> -               create_pgd_mapping(early_pg_dir, va,
>> -                                  load_pa + (va - PAGE_OFFSET),
>> -                                  map_size, PAGE_KERNEL_EXEC);
>> +       create_kernel_page_table(early_pg_dir, map_size);
>>
>>          /* Create fixed mapping for early FDT parsing */
>>          end_va = __fix_to_virt(FIX_FDT) + FIX_FDT_SIZE;
>> @@ -441,6 +460,7 @@ static void __init setup_vm_final(void)
>>          uintptr_t va, map_size;
>>          phys_addr_t pa, start, end;
>>          struct memblock_region *reg;
>> +       static struct vm_struct vm_kernel = { 0 };
>>
>>          /* Set mmu_enabled flag */
>>          mmu_enabled = true;
>> @@ -467,10 +487,22 @@ static void __init setup_vm_final(void)
>>                  for (pa = start; pa < end; pa += map_size) {
>>                          va = (uintptr_t)__va(pa);
>>                          create_pgd_mapping(swapper_pg_dir, va, pa,
>> -                                          map_size, PAGE_KERNEL_EXEC);
>> +                                          map_size, PAGE_KERNEL);
>>                  }
>>          }
>>
>> +       /* Map the kernel */
>> +       create_kernel_page_table(swapper_pg_dir, PMD_SIZE);
>> +
>> +       /* Reserve the vmalloc area occupied by the kernel */
>> +       vm_kernel.addr = (void *)kernel_virt_addr;
>> +       vm_kernel.phys_addr = load_pa;
>> +       vm_kernel.size = (load_sz + PMD_SIZE - 1) & ~(PMD_SIZE - 1);
>> +       vm_kernel.flags = VM_MAP | VM_NO_GUARD;
>> +       vm_kernel.caller = __builtin_return_address(0);
>> +
>> +       vm_area_add_early(&vm_kernel);
>> +
>>          /* Clear fixmap PTE and PMD mappings */
>>          clear_fixmap(FIX_PTE);
>>          clear_fixmap(FIX_PMD);
>> diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
>> index e8e4dcd39fed..35703d5ef5fd 100644
>> --- a/arch/riscv/mm/physaddr.c
>> +++ b/arch/riscv/mm/physaddr.c
>> @@ -23,7 +23,7 @@ EXPORT_SYMBOL(__virt_to_phys);
>>
>>   phys_addr_t __phys_addr_symbol(unsigned long x)
>>   {
>> -       unsigned long kernel_start = (unsigned long)PAGE_OFFSET;
>> +       unsigned long kernel_start = (unsigned long)kernel_virt_addr;
>>          unsigned long kernel_end = (unsigned long)_end;
>>
>>          /*
>> --
>> 2.20.1
>>
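
As a reading aid for the hunks quoted above, here is a minimal user-space
sketch of two calculations the patch relies on: rounding the kernel image
size up to a PMD boundary (as done for vm_kernel.size) and translating a
kernel-mapping virtual address back to a physical address through
va_kernel_pa_offset. The sketch_* names and the 2 MiB SKETCH_PMD_SIZE are
assumptions made for illustration, and the subtraction in
sketch_kernel_va_to_pa() is only the presumed use of the offset; the real
translation macros are not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed PMD size (2 MiB); the real value depends on the configuration. */
#define SKETCH_PMD_SIZE ((uintptr_t)1 << 21)

/* Round size up to the next multiple of align (align must be a power of two).
 * Same expression as the vm_kernel.size computation in the quoted hunk. */
static inline uintptr_t sketch_round_up(uintptr_t size, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Presumed use of va_kernel_pa_offset: the offset is computed as
 * kernel_virt_addr - load_pa, so subtracting it from a kernel-mapping
 * virtual address yields the physical address. */
static inline uintptr_t sketch_kernel_va_to_pa(uintptr_t va,
					       uintptr_t kernel_virt_addr,
					       uintptr_t load_pa)
{
	uintptr_t va_kernel_pa_offset = kernel_virt_addr - load_pa;

	return va - va_kernel_pa_offset;
}
```

With these helpers, an image of SKETCH_PMD_SIZE + 1 bytes reserves two PMD
entries worth of vmalloc space, which is why the reserved area can be larger
than load_sz.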


* [PATCH AUTOSEL 4.9 5/6] drivers/net/ibmvnic: Update VNIC protocol version reporting
From: Sasha Levin @ 2020-06-05 12:26 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, netdev, Thomas Falcon, linuxppc-dev,
	David S . Miller
In-Reply-To: <20200605122620.2882962-1-sashal@kernel.org>

From: Thomas Falcon <tlfalcon@linux.ibm.com>

[ Upstream commit 784688993ebac34dffe44a9f2fabbe126ebfd4db ]

VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)
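
For readers unfamiliar with the issue, the following is a minimal user-space
sketch (not the driver code) of why a big-endian field has to be converted
before it is logged. sketch_be16_to_cpu() is a portable stand-in for the
kernel's be16_to_cpu(), and sketch_raw_le_read() models what a little-endian
CPU would print if it used the field unconverted; both names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 16-bit big-endian value from its wire bytes. This gives the same
 * result on any host, which is what be16_to_cpu() guarantees in the kernel. */
static inline uint16_t sketch_be16_to_cpu(const uint8_t wire[2])
{
	assert(wire != NULL);
	return (uint16_t)((wire[0] << 8) | wire[1]);
}

/* Model of a little-endian CPU reading the same two bytes without a swap:
 * the first byte on the wire lands in the low-order position. */
static inline uint16_t sketch_raw_le_read(const uint8_t wire[2])
{
	assert(wire != NULL);
	return (uint16_t)(wire[0] | (wire[1] << 8));
}
```

For a protocol version of 1 sent as the bytes 0x00 0x01, the converted value
is 1 while the unconverted little-endian read is 256, which matches the kind
of misleading log output the commit message describes.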

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 897a87ae8655..20f7ab4aa2f1 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3362,12 +3362,10 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 			dev_err(dev, "Error %ld in VERSION_EXCHG_RSP\n", rc);
 			break;
 		}
-		dev_info(dev, "Partner protocol version is %d\n",
-			 crq->version_exchange_rsp.version);
-		if (be16_to_cpu(crq->version_exchange_rsp.version) <
-		    ibmvnic_version)
-			ibmvnic_version =
+		ibmvnic_version =
 			    be16_to_cpu(crq->version_exchange_rsp.version);
+		dev_info(dev, "Partner protocol version is %d\n",
+			 ibmvnic_version);
 		send_cap_queries(adapter);
 		break;
 	case QUERY_CAPABILITY_RSP:
-- 
2.25.1



* [PATCH AUTOSEL 4.14 7/8] drivers/net/ibmvnic: Update VNIC protocol version reporting
From: Sasha Levin @ 2020-06-05 12:26 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, netdev, Thomas Falcon, linuxppc-dev,
	David S . Miller
In-Reply-To: <20200605122609.2882841-1-sashal@kernel.org>

From: Thomas Falcon <tlfalcon@linux.ibm.com>

[ Upstream commit 784688993ebac34dffe44a9f2fabbe126ebfd4db ]

VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 956fbb164e6f..85c11dafb4cd 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3560,12 +3560,10 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 			dev_err(dev, "Error %ld in VERSION_EXCHG_RSP\n", rc);
 			break;
 		}
-		dev_info(dev, "Partner protocol version is %d\n",
-			 crq->version_exchange_rsp.version);
-		if (be16_to_cpu(crq->version_exchange_rsp.version) <
-		    ibmvnic_version)
-			ibmvnic_version =
+		ibmvnic_version =
 			    be16_to_cpu(crq->version_exchange_rsp.version);
+		dev_info(dev, "Partner protocol version is %d\n",
+			 ibmvnic_version);
 		send_cap_queries(adapter);
 		break;
 	case QUERY_CAPABILITY_RSP:
-- 
2.25.1



* [PATCH AUTOSEL 4.19 8/9] drivers/net/ibmvnic: Update VNIC protocol version reporting
From: Sasha Levin @ 2020-06-05 12:25 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, netdev, Thomas Falcon, linuxppc-dev,
	David S . Miller
In-Reply-To: <20200605122558.2882712-1-sashal@kernel.org>

From: Thomas Falcon <tlfalcon@linux.ibm.com>

[ Upstream commit 784688993ebac34dffe44a9f2fabbe126ebfd4db ]

VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index abfd990ba4d8..645298628b6f 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4295,12 +4295,10 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 			dev_err(dev, "Error %ld in VERSION_EXCHG_RSP\n", rc);
 			break;
 		}
-		dev_info(dev, "Partner protocol version is %d\n",
-			 crq->version_exchange_rsp.version);
-		if (be16_to_cpu(crq->version_exchange_rsp.version) <
-		    ibmvnic_version)
-			ibmvnic_version =
+		ibmvnic_version =
 			    be16_to_cpu(crq->version_exchange_rsp.version);
+		dev_info(dev, "Partner protocol version is %d\n",
+			 ibmvnic_version);
 		send_cap_queries(adapter);
 		break;
 	case QUERY_CAPABILITY_RSP:
-- 
2.25.1



* [PATCH AUTOSEL 5.4 13/14] drivers/net/ibmvnic: Update VNIC protocol version reporting
From: Sasha Levin @ 2020-06-05 12:25 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, netdev, Thomas Falcon, linuxppc-dev,
	David S . Miller
In-Reply-To: <20200605122540.2882539-1-sashal@kernel.org>

From: Thomas Falcon <tlfalcon@linux.ibm.com>

[ Upstream commit 784688993ebac34dffe44a9f2fabbe126ebfd4db ]

VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index aaa03ce5796f..5a42ddeecfe5 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4536,12 +4536,10 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 			dev_err(dev, "Error %ld in VERSION_EXCHG_RSP\n", rc);
 			break;
 		}
-		dev_info(dev, "Partner protocol version is %d\n",
-			 crq->version_exchange_rsp.version);
-		if (be16_to_cpu(crq->version_exchange_rsp.version) <
-		    ibmvnic_version)
-			ibmvnic_version =
+		ibmvnic_version =
 			    be16_to_cpu(crq->version_exchange_rsp.version);
+		dev_info(dev, "Partner protocol version is %d\n",
+			 ibmvnic_version);
 		send_cap_queries(adapter);
 		break;
 	case QUERY_CAPABILITY_RSP:
-- 
2.25.1



* [PATCH AUTOSEL 5.6 16/17] drivers/net/ibmvnic: Update VNIC protocol version reporting
From: Sasha Levin @ 2020-06-05 12:25 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Sasha Levin, netdev, Thomas Falcon, linuxppc-dev,
	David S . Miller
In-Reply-To: <20200605122517.2882338-1-sashal@kernel.org>

From: Thomas Falcon <tlfalcon@linux.ibm.com>

[ Upstream commit 784688993ebac34dffe44a9f2fabbe126ebfd4db ]

VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 3de549c6c693..197dc5b2c090 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4678,12 +4678,10 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 			dev_err(dev, "Error %ld in VERSION_EXCHG_RSP\n", rc);
 			break;
 		}
-		dev_info(dev, "Partner protocol version is %d\n",
-			 crq->version_exchange_rsp.version);
-		if (be16_to_cpu(crq->version_exchange_rsp.version) <
-		    ibmvnic_version)
-			ibmvnic_version =
+		ibmvnic_version =
 			    be16_to_cpu(crq->version_exchange_rsp.version);
+		dev_info(dev, "Partner protocol version is %d\n",
+			 ibmvnic_version);
 		send_cap_queries(adapter);
 		break;
 	case QUERY_CAPABILITY_RSP:
-- 
2.25.1



* Re: [PATCH 6/7] powerpc/perf: power10 Performance Monitoring support
From: kernel test robot @ 2020-06-05 10:29 UTC (permalink / raw)
  To: Athira Rajeev, mpe
  Cc: mikey, mikey, kbuild-all, maddy, atrajeev, linuxppc-dev
In-Reply-To: <1591343830-8286-7-git-send-email-atrajeev@linux.vnet.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 1630 bytes --]

Hi Athira,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on next-20200605]
[cannot apply to kvm-ppc/kvm-ppc-next mpe/next v5.7]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest using the '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Athira-Rajeev/powerpc-perf-Add-support-for-power10-PMU-Hardware/20200605-161850
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>, old ones prefixed by <<):

>> arch/powerpc/perf/power10-pmu.c:405:5: warning: no previous prototype for 'init_power10_pmu' [-Wmissing-prototypes]
     405 | int init_power10_pmu(void)
         |     ^~~~~~~~~~~~~~~~

vim +/init_power10_pmu +405 arch/powerpc/perf/power10-pmu.c

   404	
 > 405	int init_power10_pmu(void)
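
A minimal sketch of the two usual ways to silence -Wmissing-prototypes for a
non-static function, assuming nothing about which one fits this series. All
sketch_* names are hypothetical; in a real tree the prototype in option 2
would normally live in a shared header rather than in the .c file.

```c
#include <assert.h>

/* Option 1: the function is only used in this file, so make it static.
 * A static function needs no separate prototype. */
static int sketch_init_local_pmu(void)
{
	return 0;
}

/* Option 2: the function is called from another file, so a prototype must
 * be visible before the definition (typically via a shared header). */
int sketch_init_shared_pmu(void);

int sketch_init_shared_pmu(void)
{
	int rc = sketch_init_local_pmu();

	assert(rc == 0);
	return rc;
}
```

Which option applies depends on whether init_power10_pmu() has callers
outside power10-pmu.c.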

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 66059 bytes --]


* Re: [PATCH v1 2/4] KVM: PPC: Book3S HV: track shared GFNs of secure VMs
From: Laurent Dufour @ 2020-06-05  9:48 UTC (permalink / raw)
  To: Ram Pai, kvm-ppc, linuxppc-dev
  Cc: cclaudio, bharata, aneesh.kumar, sukadev, bauerman, david
In-Reply-To: <1590892071-25549-3-git-send-email-linuxram@us.ibm.com>

Le 31/05/2020 à 04:27, Ram Pai a écrit :
> During the life of SVM, its GFNs can transition from secure to shared
> state and vice-versa. Since the kernel does not track GFNs that are
> shared, it is not possible to disambiguate a shared GFN from a GFN whose
> PFN has not yet been migrated to a device-PFN.
> 
> The ability to identify a shared GFN is needed to skip migrating its PFN
> to device PFN. This functionality is leveraged in a subsequent patch.
> 
> Add the ability to identify the state of a GFN.
> 
> Cc: Paul Mackerras <paulus@ozlabs.org>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Bharata B Rao <bharata@linux.ibm.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Cc: Laurent Dufour <ldufour@linux.ibm.com>
> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Claudio Carvalho <cclaudio@linux.ibm.com>
> Cc: kvm-ppc@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>   arch/powerpc/include/asm/kvm_book3s_uvmem.h |   6 +-
>   arch/powerpc/kvm/book3s_64_mmu_radix.c      |   2 +-
>   arch/powerpc/kvm/book3s_hv.c                |   2 +-
>   arch/powerpc/kvm/book3s_hv_uvmem.c          | 115 ++++++++++++++++++++++++++--
>   4 files changed, 113 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> index 5a9834e..f0c5708 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
> @@ -21,7 +21,8 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
>   int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn);
>   unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm);
>   void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> -			     struct kvm *kvm, bool skip_page_out);
> +			     struct kvm *kvm, bool skip_page_out,
> +			     bool purge_gfn);
>   #else
>   static inline int kvmppc_uvmem_init(void)
>   {
> @@ -75,6 +76,7 @@ static inline int kvmppc_send_page_to_uv(struct kvm *kvm, unsigned long gfn)
>   
>   static inline void
>   kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> -			struct kvm *kvm, bool skip_page_out) { }
> +			struct kvm *kvm, bool skip_page_out,
> +			bool purge_gfn) { }
>   #endif /* CONFIG_PPC_UV */
>   #endif /* __ASM_KVM_BOOK3S_UVMEM_H__ */
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index 803940d..3448459 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -1100,7 +1100,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
>   	unsigned int shift;
>   
>   	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START)
> -		kvmppc_uvmem_drop_pages(memslot, kvm, true);
> +		kvmppc_uvmem_drop_pages(memslot, kvm, true, false);

Why is purge_gfn false here?
That function is called when dropping a hot-plugged memslot.
That being said, when called by kvmppc_core_commit_memory_region_hv(), the
memslot is then freed by kvmppc_uvmem_slot_free(), so the shared state will not
remain for long, but there is a window...

>   
>   	if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE)
>   		return;
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 103d13e..4c62bfe 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -5467,7 +5467,7 @@ static int kvmhv_svm_off(struct kvm *kvm)
>   			continue;
>   
>   		kvm_for_each_memslot(memslot, slots) {
> -			kvmppc_uvmem_drop_pages(memslot, kvm, true);
> +			kvmppc_uvmem_drop_pages(memslot, kvm, true, true);
>   			uv_unregister_mem_slot(kvm->arch.lpid, memslot->id);
>   		}
>   	}
> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
> index ea4a1f1..2ef1e03 100644
> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
> @@ -99,14 +99,56 @@
>   static DEFINE_SPINLOCK(kvmppc_uvmem_bitmap_lock);
>   
>   #define KVMPPC_UVMEM_PFN	(1UL << 63)
> +#define KVMPPC_UVMEM_SHARED	(1UL << 62)
> +#define KVMPPC_UVMEM_FLAG_MASK	(KVMPPC_UVMEM_PFN | KVMPPC_UVMEM_SHARED)
> +#define KVMPPC_UVMEM_PFN_MASK	(~KVMPPC_UVMEM_FLAG_MASK)
>   
>   struct kvmppc_uvmem_slot {
>   	struct list_head list;
>   	unsigned long nr_pfns;
>   	unsigned long base_pfn;
> +	/*
> +	 * pfns array has an entry for each GFN of the memory slot.
> +	 *
> +	 * The GFN can be in one of the following states.
> +	 *
> +	 * (a) Secure - The GFN is secure. Only Ultravisor can access it.
> +	 * (b) Shared - The GFN is shared. Both Hypervisor and Ultravisor
> +	 *		can access it.
> +	 * (c) Normal - The GFN is a normal.  Only Hypervisor can access it.
> +	 *
> +	 * Secure GFN is associated with a devicePFN. Its pfn[] has
> +	 * KVMPPC_UVMEM_PFN flag set, and has the value of the device PFN
> +	 * KVMPPC_UVMEM_SHARED flag unset, and has the value of the device PFN
> +	 *
> +	 * Shared GFN is associated with a memoryPFN. Its pfn[] has
> +	 * KVMPPC_UVMEM_SHARED flag set. But its KVMPPC_UVMEM_PFN is not set,
> +	 * and there is no PFN value stored.
> +	 *
> +	 * Normal GFN is not associated with memoryPFN. Its pfn[] has
> +	 * KVMPPC_UVMEM_SHARED and KVMPPC_UVMEM_PFN flag unset, and no PFN
> +	 * value is stored.
> +	 *
> +	 * Any other combination of values in pfn[] leads to undefined
> +	 * behavior.
> +	 *
> +	 * Life cycle of a GFN --
> +	 *
> +	 * ---------------------------------------------------------
> +	 * |        |     Share	 |  Unshare | SVM	|slot      |
> +	 * |	    |		 |	    | abort/	|flush	   |
> +	 * |	    |		 |	    | terminate	|	   |
> +	 * ---------------------------------------------------------
> +	 * |        |            |          |           |          |
> +	 * | Secure |     Shared | Secure   |Normal	|Secure    |
> +	 * |        |            |          |           |          |
> +	 * | Shared |     Shared | Secure   |Normal     |Shared    |
> +	 * |        |            |          |           |          |
> +	 * | Normal |     Shared | Secure   |Normal     |Normal    |
> +	 * ---------------------------------------------------------
> +	 */
>   	unsigned long *pfns;
>   };
> -
>   struct kvmppc_uvmem_page_pvt {
>   	struct kvm *kvm;
>   	unsigned long gpa;
> @@ -175,7 +217,12 @@ static void kvmppc_uvmem_pfn_remove(unsigned long gfn, struct kvm *kvm)
>   
>   	list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
>   		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
> -			p->pfns[gfn - p->base_pfn] = 0;
> +			/*
> +			 * Reset everything, but keep the KVMPPC_UVMEM_SHARED
> +			 * flag intact.  A gfn continues to be shared or
> +			 * unshared, with or without an associated device pfn.
> +			 */
> +			p->pfns[gfn - p->base_pfn] &= KVMPPC_UVMEM_SHARED;
>   			return;
>   		}
>   	}
> @@ -193,7 +240,7 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
>   			if (p->pfns[index] & KVMPPC_UVMEM_PFN) {
>   				if (uvmem_pfn)
>   					*uvmem_pfn = p->pfns[index] &
> -						     ~KVMPPC_UVMEM_PFN;
> +						     KVMPPC_UVMEM_PFN_MASK;
>   				return true;
>   			} else
>   				return false;
> @@ -202,6 +249,38 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
>   	return false;
>   }
>   
> +static void kvmppc_gfn_uvmem_shared(unsigned long gfn, struct kvm *kvm,
> +		bool set)
> +{
> +	struct kvmppc_uvmem_slot *p;
> +
> +	list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
> +		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
> +			unsigned long index = gfn - p->base_pfn;
> +
> +			if (set)
> +				p->pfns[index] |= KVMPPC_UVMEM_SHARED;
> +			else
> +				p->pfns[index] &= ~KVMPPC_UVMEM_SHARED;
> +			return;
> +		}
> +	}
> +}
> +
> +bool kvmppc_gfn_is_uvmem_shared(unsigned long gfn, struct kvm *kvm)
> +{
> +	struct kvmppc_uvmem_slot *p;
> +
> +	list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
> +		if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
> +			unsigned long index = gfn - p->base_pfn;
> +
> +			return (p->pfns[index] & KVMPPC_UVMEM_SHARED);
> +		}
> +	}
> +	return false;
> +}
> +
>   unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
>   {
>   	struct kvm_memslots *slots;
> @@ -256,9 +335,13 @@ unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
>    * is HV side fault on these pages. Next we *get* these pages, forcing
>    * fault on them, do fault time migration to replace the device PTEs in
>    * QEMU page table with normal PTEs from newly allocated pages.
> + *
> + * if @purge_gfn is set, cleanup any information related to each of
> + * the GFNs associated with this memory slot.
>    */
>   void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
> -			     struct kvm *kvm, bool skip_page_out)
> +			     struct kvm *kvm, bool skip_page_out,
> +			     bool purge_gfn)
>   {
>   	int i;
>   	struct kvmppc_uvmem_page_pvt *pvt;
> @@ -269,11 +352,22 @@ void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
>   		struct page *uvmem_page;
>   
>   		mutex_lock(&kvm->arch.uvmem_lock);
> +
> +		if (purge_gfn) {
> +			/*
> +			 * cleanup the shared status of the GFN here.
> +			 * Any device PFN associated with the GFN shall
> +			 * be cleaned up later, in kvmppc_uvmem_page_free()
> +			 * when the device PFN is actually disassociated
> +			 * from the GFN.
> +			 */
> +			kvmppc_gfn_uvmem_shared(gfn, kvm, false);
> +		}
> +
>   		if (!kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
>   			mutex_unlock(&kvm->arch.uvmem_lock);
>   			continue;
>   		}
> -
>   		uvmem_page = pfn_to_page(uvmem_pfn);
>   		pvt = uvmem_page->zone_device_data;
>   		pvt->skip_page_out = skip_page_out;
> @@ -304,7 +398,7 @@ unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm)
>   	srcu_idx = srcu_read_lock(&kvm->srcu);
>   
>   	kvm_for_each_memslot(memslot, kvm_memslots(kvm))
> -		kvmppc_uvmem_drop_pages(memslot, kvm, false);
> +		kvmppc_uvmem_drop_pages(memslot, kvm, false, true);
>   
>   	srcu_read_unlock(&kvm->srcu, srcu_idx);
>   
> @@ -470,8 +564,11 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
>   		goto retry;
>   	}
>   
> -	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0, page_shift))
> +	if (!uv_page_in(kvm->arch.lpid, pfn << page_shift, gpa, 0,
> +				page_shift)) {
> +		kvmppc_gfn_uvmem_shared(gfn, kvm, true);
>   		ret = H_SUCCESS;
> +	}
>   	kvm_release_pfn_clean(pfn);
>   	mutex_unlock(&kvm->arch.uvmem_lock);
>   out:
> @@ -527,8 +624,10 @@ unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
>   		goto out_unlock;
>   
>   	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift,
> -				&downgrade))
> +				&downgrade)) {
> +		kvmppc_gfn_uvmem_shared(gfn, kvm, false);
>   		ret = H_SUCCESS;
> +	}
>   out_unlock:
>   	mutex_unlock(&kvm->arch.uvmem_lock);
>   out:
> 
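
To make the state encoding described in the quoted comment easier to follow,
here is a minimal user-space sketch that decodes a pfns[] entry into the
three documented states. The two flag values are taken from the quoted
patch; every sketch_* and SKETCH_* name, and the explicit INVALID state for
the combinations the comment calls undefined, are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_UVMEM_PFN	(1ULL << 63)
#define SKETCH_UVMEM_SHARED	(1ULL << 62)
#define SKETCH_UVMEM_FLAG_MASK	(SKETCH_UVMEM_PFN | SKETCH_UVMEM_SHARED)
#define SKETCH_UVMEM_PFN_MASK	(~SKETCH_UVMEM_FLAG_MASK)

enum sketch_gfn_state {
	SKETCH_GFN_NORMAL,	/* no flag set, no PFN value stored */
	SKETCH_GFN_SHARED,	/* SHARED set, no PFN value stored */
	SKETCH_GFN_SECURE,	/* PFN flag set, device PFN stored */
	SKETCH_GFN_INVALID,	/* combinations the comment leaves undefined */
};

static inline enum sketch_gfn_state sketch_classify_gfn(uint64_t entry)
{
	int has_pfn = (entry & SKETCH_UVMEM_PFN) != 0;
	int shared = (entry & SKETCH_UVMEM_SHARED) != 0;

	if (has_pfn && shared)
		return SKETCH_GFN_INVALID;
	if (has_pfn)
		return SKETCH_GFN_SECURE;
	/* Without the PFN flag, no PFN value may be stored. */
	if (entry & SKETCH_UVMEM_PFN_MASK)
		return SKETCH_GFN_INVALID;
	return shared ? SKETCH_GFN_SHARED : SKETCH_GFN_NORMAL;
}

/* Extract the device PFN of a secure entry. */
static inline uint64_t sketch_secure_pfn(uint64_t entry)
{
	assert(sketch_classify_gfn(entry) == SKETCH_GFN_SECURE);
	return entry & SKETCH_UVMEM_PFN_MASK;
}

/* Same masking as the quoted kvmppc_uvmem_pfn_remove() hunk: drop the
 * device PFN and its flag but keep the SHARED bit. */
static inline uint64_t sketch_pfn_remove(uint64_t entry)
{
	return entry & SKETCH_UVMEM_SHARED;
}
```

Seen this way, a slot flush with purge_gfn == false leaves a SHARED entry in
the SHARED state, which is the window the question above is about.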



* [PATCH 7/7] powerpc/perf: support BHRB disable bit and new filtering modes
From: Athira Rajeev @ 2020-06-05  7:57 UTC (permalink / raw)
  To: mpe; +Cc: mikey, mikey, maddy, linuxppc-dev, atrajeev
In-Reply-To: <1591343830-8286-1-git-send-email-atrajeev@linux.vnet.ibm.com>

PowerISA v3.1 has a few updates for the Branch History Rolling Buffer (BHRB).
The first is the addition of a BHRB disable bit and the second is new
filtering modes for BHRB.

BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
bit 26, namely "BHRB Recording Disable (BHRBRD)". This field controls
whether BHRB entries are written when BHRB recording is enabled by other
bits. This patch implements support for the BHRB disable bit.

Secondly, PowerISA v3.1 introduces filtering support for
PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter support
for "ind_call" and "cond" in power10_bhrb_filter_map().

Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace
via BHRB buffer") added a check in bhrb_read() to filter kernel addresses
out of the BHRB buffer. This patch skips that check on PowerISA v3.1 based
processors, since PowerISA v3.1 allows only MSR[PR]=1 addresses to be
written to the BHRB buffer.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c       | 27 +++++++++++++++++++++------
 arch/powerpc/perf/isa207-common.c     | 13 +++++++++++++
 arch/powerpc/perf/power10-pmu.c       | 13 +++++++++++--
 arch/powerpc/platforms/powernv/idle.c | 14 ++++++++++++++
 4 files changed, 59 insertions(+), 8 deletions(-)
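
As a rough illustration of the disable-bit handling described above, here is
a minimal user-space sketch of the "disabled by default, enabled on request"
logic. It is not the kernel code: struct sketch_event and the sketch_* names
are hypothetical, and the bit position assumes IBM bit numbering, where bit 0
is the most significant bit of the 64-bit register, so MMCRA bit 26
corresponds to 1ULL << (63 - 26).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MMCRA bit 26 (BHRBRD) in IBM numbering; assumed mapping, see lead-in. */
#define SKETCH_MMCRA_BHRB_DISABLE (1ULL << (63 - 26))

/* Hypothetical, reduced view of an event: does it request BHRB data? */
struct sketch_event {
	bool wants_bhrb;
};

/* Start with BHRB recording disabled on ISA v3.1 and clear the disable bit
 * only if at least one event asks for branch history. */
static inline uint64_t sketch_compute_mmcra(const struct sketch_event *ev,
					    size_t n, bool isa_v31)
{
	uint64_t mmcra = 0;
	size_t i;

	assert(ev != NULL || n == 0);

	if (isa_v31)
		mmcra |= SKETCH_MMCRA_BHRB_DISABLE;

	for (i = 0; i < n; i++) {
		if (isa_v31 && ev[i].wants_bhrb)
			mmcra &= ~SKETCH_MMCRA_BHRB_DISABLE;
	}

	return mmcra;
}
```

On pre-v3.1 processors the bit is never touched, so the sketch returns 0 for
any event list there.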

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 44c86a9..d3856ff 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -464,9 +464,13 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
 			 * addresses at this point. Check the privileges before
 			 * exporting it to userspace (avoid exposure of regions
 			 * where we could have speculative execution)
+			 * In case of ISA 310, BHRB will capture only user-space
+			 * addresses, hence include a check before the filtering code
 			 */
-			if (is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0)
-				continue;
+			if (!(ppmu->flags & PPMU_ARCH_310S))
+				if (is_kernel_addr(addr) &&
+				perf_allow_kernel(&event->attr) != 0)
+					continue;
 
 			/* Branches are read most recent first (ie. mfbhrb 0 is
 			 * the most recent branch).
@@ -1210,7 +1214,7 @@ static void write_mmcr0(struct cpu_hw_events *cpuhw, unsigned long mmcr0)
 static void power_pmu_disable(struct pmu *pmu)
 {
 	struct cpu_hw_events *cpuhw;
-	unsigned long flags, mmcr0, val;
+	unsigned long flags, mmcr0, val, mmcra = 0;
 
 	if (!ppmu)
 		return;
@@ -1243,12 +1247,23 @@ static void power_pmu_disable(struct pmu *pmu)
 		mb();
 		isync();
 
+		val = mmcra = cpuhw->mmcr[2];
+
 		/*
 		 * Disable instruction sampling if it was enabled
 		 */
-		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
-			mtspr(SPRN_MMCRA,
-			      cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
+		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE)
+			mmcra = cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE;
+
+		/* Disable BHRB via mmcra [:26] for p10 if needed */
+		if (!(cpuhw->mmcr[2] & MMCRA_BHRB_DISABLE))
+			mmcra |= MMCRA_BHRB_DISABLE;
+
+		/* Write SPRN_MMCRA if mmcra has either disabled
+		 * instruction sampling or BHRB
+		 */
+		if (val != mmcra) {
+			mtspr(SPRN_MMCRA, mmcra);
 			mb();
 			isync();
 		}
diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 7d4839e..463d925 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -404,6 +404,12 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 
 	mmcra = mmcr1 = mmcr2 = mmcr3 = 0;
 
+	/* Disable bhrb unless explicitly requested
+	 * by setting MMCRA [:26] bit.
+	 */
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		mmcra |= MMCRA_BHRB_DISABLE;
+
 	/* Second pass: assign PMCs, set all MMCR1 fields */
 	for (i = 0; i < n_ev; ++i) {
 		pmc     = (event[i] >> EVENT_PMC_SHIFT) & EVENT_PMC_MASK;
@@ -475,10 +481,17 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 		}
 
 		if (event[i] & EVENT_WANTS_BHRB) {
+			/* set MMCRA[:26] to 0 for Power10 to enable BHRB */
+			if (cpu_has_feature(CPU_FTR_ARCH_31))
+				mmcra &= ~MMCRA_BHRB_DISABLE;
 			val = (event[i] >> EVENT_IFM_SHIFT) & EVENT_IFM_MASK;
 			mmcra |= val << MMCRA_IFM_SHIFT;
 		}
 
+		/* set MMCRA[:26] to 0 if there is user request for BHRB */
+		if (cpu_has_feature(CPU_FTR_ARCH_31) && has_branch_stack(pevents[i]))
+			mmcra &= ~MMCRA_BHRB_DISABLE;
+
 		if (pevents[i]->attr.exclude_user)
 			mmcr2 |= MMCR2_FCP(pmc);
 
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index 07d0781..8effc18 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -82,6 +82,8 @@
 
 /* MMCRA IFM bits - POWER10 */
 #define POWER10_MMCRA_IFM1		0x0000000040000000UL
+#define POWER10_MMCRA_IFM2		0x0000000080000000UL
+#define POWER10_MMCRA_IFM3		0x00000000C0000000UL
 #define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
 
 /* Table of alternatives, sorted by column 0 */
@@ -245,8 +247,15 @@ static u64 power10_bhrb_filter_map(u64 branch_sample_type)
 	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
 		return -1;
 
-	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
-		return -1;
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL) {
+		pmu_bhrb_filter |= POWER10_MMCRA_IFM2;
+		return pmu_bhrb_filter;
+	}
+
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
+		pmu_bhrb_filter |= POWER10_MMCRA_IFM3;
+		return pmu_bhrb_filter;
+	}
 
 	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
 		return -1;
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 2dd4673..7db99c7 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -611,6 +611,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
 	unsigned long srr1;
 	unsigned long pls;
 	unsigned long mmcr0 = 0;
+	unsigned long mmcra_bhrb = 0;
 	struct p9_sprs sprs = {}; /* avoid false used-uninitialised */
 	bool sprs_saved = false;
 
@@ -657,6 +658,15 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
 		  */
 		mmcr0		= mfspr(SPRN_MMCR0);
 	}
+
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		/* POWER10 uses MMCRA[:26] as the BHRB disable bit
+		 * to disable BHRB logic when not used. Hence save and
+		 * restore MMCRA after a state-loss idle.
+		 */
+		mmcra_bhrb		= mfspr(SPRN_MMCRA);
+	}
+
 	if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
 		sprs.lpcr	= mfspr(SPRN_LPCR);
 		sprs.hfscr	= mfspr(SPRN_HFSCR);
@@ -721,6 +731,10 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
 			mtspr(SPRN_MMCR0, mmcr0);
 		}
 
+		/* Reload MMCRA to restore BHRB disable bit for POWER10 */
+		if (cpu_has_feature(CPU_FTR_ARCH_31))
+			mtspr(SPRN_MMCRA, mmcra_bhrb);
+
 		/*
 		 * DD2.2 and earlier need to set then clear bit 60 in MMCRA
 		 * to ensure the PMU starts running.
-- 
1.8.3.1



* [PATCH 6/7] powerpc/perf: power10 Performance Monitoring support
From: Athira Rajeev @ 2020-06-05  7:57 UTC (permalink / raw)
  To: mpe; +Cc: mikey, mikey, maddy, linuxppc-dev, atrajeev
In-Reply-To: <1591343830-8286-1-git-send-email-atrajeev@linux.vnet.ibm.com>

Base enablement patch to register performance monitoring
hardware support for power10. The patch introduces the raw event
encoding format and defines the supported list of events, the config
fields for the event attributes, and their corresponding bit values,
which are exported via sysfs.

The patch also enhances the support functions in isa207-common.c to
include power10 PMU hardware.
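
To illustrate the raw event encoding, here is a minimal user-space
sketch (not part of the patch) that splits a raw event code into a few
of the config fields. The bit ranges are taken from the PMU_FORMAT_ATTR
entries in power10-pmu.c below; the helper event_field(), the struct
p10_event_fields and decode_p10_event() are hypothetical names used
only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits lo..hi (inclusive, bit 0 = least significant) of config. */
static uint64_t event_field(uint64_t config, unsigned int lo, unsigned int hi)
{
	unsigned int width;
	uint64_t mask;

	assert(lo <= hi && hi < 64);
	width = hi - lo + 1;
	mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
	return (config >> lo) & mask;
}

/* Hypothetical container for a subset of the fields, illustration only. */
struct p10_event_fields {
	uint64_t pmcxsel;	/* config:0-7   */
	uint64_t mark;		/* config:8     */
	uint64_t pmc;		/* config:16-19 */
	uint64_t sample_mode;	/* config:24-28 */
	uint64_t thresh_sel;	/* config:29-31 */
	uint64_t thresh_stop;	/* config:32-35 */
	uint64_t thresh_start;	/* config:36-39 */
};

/* Decode the fields above using the bit ranges from the format attrs. */
static struct p10_event_fields decode_p10_event(uint64_t config)
{
	struct p10_event_fields f;

	f.pmcxsel      = event_field(config, 0, 7);
	f.mark         = event_field(config, 8, 8);
	f.pmc          = event_field(config, 16, 19);
	f.sample_mode  = event_field(config, 24, 28);
	f.thresh_sel   = event_field(config, 29, 31);
	f.thresh_stop  = event_field(config, 32, 35);
	f.thresh_start = event_field(config, 36, 39);
	return f;
}
```

Decoding MEM_LOADS (0x34340401e0) this way gives pmcxsel 0xe0, mark 1
and pmc 4, which is the PM_MRK_INST_CMPL base event (0x401e0), with
thresh_start 3 and thresh_stop 4.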

[Enablement of base PMU driver code]
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[Addition of ISA macros for counter support functions]
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/perf/Makefile              |   2 +-
 arch/powerpc/perf/core-book3s.c         |   2 +
 arch/powerpc/perf/internal.h            |   1 +
 arch/powerpc/perf/isa207-common.c       |  59 ++++-
 arch/powerpc/perf/isa207-common.h       |  33 ++-
 arch/powerpc/perf/power10-events-list.h |  81 ++++++
 arch/powerpc/perf/power10-pmu.c         | 422 ++++++++++++++++++++++++++++++++
 7 files changed, 589 insertions(+), 11 deletions(-)
 create mode 100644 arch/powerpc/perf/power10-events-list.h
 create mode 100644 arch/powerpc/perf/power10-pmu.c

diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile
index 53d614e..c02854d 100644
--- a/arch/powerpc/perf/Makefile
+++ b/arch/powerpc/perf/Makefile
@@ -9,7 +9,7 @@ obj-$(CONFIG_PPC_PERF_CTRS)	+= core-book3s.o bhrb.o
 obj64-$(CONFIG_PPC_PERF_CTRS)	+= ppc970-pmu.o power5-pmu.o \
 				   power5+-pmu.o power6-pmu.o power7-pmu.o \
 				   isa207-common.o power8-pmu.o power9-pmu.o \
-				   generic-compat-pmu.o
+				   generic-compat-pmu.o power10-pmu.o
 obj32-$(CONFIG_PPC_PERF_CTRS)	+= mpc7450-pmu.o
 
 obj-$(CONFIG_PPC_POWERNV)	+= imc-pmu.o
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 6de81d1..44c86a9 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2331,6 +2331,8 @@ static int __init init_ppc64_pmu(void)
 		return 0;
 	else if (!init_power9_pmu())
 		return 0;
+	else if (!init_power10_pmu())
+		return 0;
 	else if (!init_ppc970_pmu())
 		return 0;
 	else
diff --git a/arch/powerpc/perf/internal.h b/arch/powerpc/perf/internal.h
index f755c64..80bbf72 100644
--- a/arch/powerpc/perf/internal.h
+++ b/arch/powerpc/perf/internal.h
@@ -9,4 +9,5 @@
 extern int init_power7_pmu(void);
 extern int init_power8_pmu(void);
 extern int init_power9_pmu(void);
+extern int init_power10_pmu(void);
 extern int init_generic_compat_pmu(void);
diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 4c86da5..7d4839e 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -55,7 +55,9 @@ static bool is_event_valid(u64 event)
 {
 	u64 valid_mask = EVENT_VALID_MASK;
 
-	if (cpu_has_feature(CPU_FTR_ARCH_300))
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		valid_mask = p10_EVENT_VALID_MASK;
+	else if (cpu_has_feature(CPU_FTR_ARCH_300))
 		valid_mask = p9_EVENT_VALID_MASK;
 
 	return !(event & ~valid_mask);
@@ -69,6 +71,14 @@ static inline bool is_event_marked(u64 event)
 	return false;
 }
 
+static unsigned long sdar_mod_val(u64 event)
+{
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		return p10_SDAR_MODE(event);
+
+	return p9_SDAR_MODE(event);
+}
+
 static void mmcra_sdar_mode(u64 event, unsigned long *mmcra)
 {
 	/*
@@ -79,7 +89,7 @@ static void mmcra_sdar_mode(u64 event, unsigned long *mmcra)
 	 * MMCRA[SDAR_MODE] will be programmed as "0b01" for continous sampling
 	 * mode and will be un-changed when setting MMCRA[63] (Marked events).
 	 *
-	 * Incase of Power9:
+	 * Incase of Power9/power10:
 	 * Marked event: MMCRA[SDAR_MODE] will be set to 0b00 ('No Updates'),
 	 *               or if group already have any marked events.
 	 * For rest
@@ -90,8 +100,8 @@ static void mmcra_sdar_mode(u64 event, unsigned long *mmcra)
 	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
 		if (is_event_marked(event) || (*mmcra & MMCRA_SAMPLE_ENABLE))
 			*mmcra &= MMCRA_SDAR_MODE_NO_UPDATES;
-		else if (p9_SDAR_MODE(event))
-			*mmcra |=  p9_SDAR_MODE(event) << MMCRA_SDAR_MODE_SHIFT;
+		else if (sdar_mod_val(event))
+			*mmcra |= sdar_mod_val(event) << MMCRA_SDAR_MODE_SHIFT;
 		else
 			*mmcra |= MMCRA_SDAR_MODE_DCACHE;
 	} else
@@ -134,7 +144,11 @@ static bool is_thresh_cmp_valid(u64 event)
 	/*
 	 * Check the mantissa upper two bits are not zero, unless the
 	 * exponent is also zero. See the THRESH_CMP_MANTISSA doc.
+	 * Power10: thresh_cmp is replaced by l2_l3 event select.
 	 */
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		return false;
+
 	cmp = (event >> EVENT_THR_CMP_SHIFT) & EVENT_THR_CMP_MASK;
 	exp = cmp >> 7;
 
@@ -251,7 +265,12 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
 
 	pmc   = (event >> EVENT_PMC_SHIFT)        & EVENT_PMC_MASK;
 	unit  = (event >> EVENT_UNIT_SHIFT)       & EVENT_UNIT_MASK;
-	cache = (event >> EVENT_CACHE_SEL_SHIFT)  & EVENT_CACHE_SEL_MASK;
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		cache = (event >> EVENT_CACHE_SEL_SHIFT) &
+			p10_EVENT_CACHE_SEL_MASK;
+	else
+		cache = (event >> EVENT_CACHE_SEL_SHIFT) &
+			EVENT_CACHE_SEL_MASK;
 	ebb   = (event >> EVENT_EBB_SHIFT)        & EVENT_EBB_MASK;
 
 	if (pmc) {
@@ -283,7 +302,10 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
 	}
 
 	if (unit >= 6 && unit <= 9) {
-		if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+		if (cpu_has_feature(CPU_FTR_ARCH_31) && (unit == 6)) {
+			mask |= CNST_L2L3_GROUP_MASK;
+			value |= CNST_L2L3_GROUP_VAL(event >> p10_L2L3_EVENT_SHIFT);
+		} else if (cpu_has_feature(CPU_FTR_ARCH_300)) {
 			mask  |= CNST_CACHE_GROUP_MASK;
 			value |= CNST_CACHE_GROUP_VAL(event & 0xff);
 
@@ -367,6 +389,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 			       struct perf_event *pevents[])
 {
 	unsigned long mmcra, mmcr1, mmcr2, unit, combine, psel, cache, val;
+	unsigned long mmcr3;
 	unsigned int pmc, pmc_inuse;
 	int i;
 
@@ -379,7 +402,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 			pmc_inuse |= 1 << pmc;
 	}
 
-	mmcra = mmcr1 = mmcr2 = 0;
+	mmcra = mmcr1 = mmcr2 = mmcr3 = 0;
 
 	/* Second pass: assign PMCs, set all MMCR1 fields */
 	for (i = 0; i < n_ev; ++i) {
@@ -438,8 +461,17 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 			mmcra |= val << MMCRA_THR_CTL_SHIFT;
 			val = (event[i] >> EVENT_THR_SEL_SHIFT) & EVENT_THR_SEL_MASK;
 			mmcra |= val << MMCRA_THR_SEL_SHIFT;
-			val = (event[i] >> EVENT_THR_CMP_SHIFT) & EVENT_THR_CMP_MASK;
-			mmcra |= thresh_cmp_val(val);
+			if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+				val = (event[i] >> EVENT_THR_CMP_SHIFT) &
+					EVENT_THR_CMP_MASK;
+				mmcra |= thresh_cmp_val(val);
+			}
+		}
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31) && (unit == 6)) {
+			val = (event[i] >> p10_L2L3_EVENT_SHIFT) &
+				p10_EVENT_L2L3_SEL_MASK;
+			mmcr2 |= val << p10_L2L3_SEL_SHIFT;
 		}
 
 		if (event[i] & EVENT_WANTS_BHRB) {
@@ -460,6 +492,14 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 				mmcr2 |= MMCR2_FCS(pmc);
 		}
 
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			if (pmc <= 4) {
+				val = (event[i] >> p10_EVENT_MMCR3_SHIFT) &
+					p10_EVENT_MMCR3_MASK;
+				mmcr3 |= val << MMCR3_SHIFT(pmc);
+			}
+		}
+
 		hwc[i] = pmc - 1;
 	}
 
@@ -480,6 +520,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 	mmcr[1] = mmcr1;
 	mmcr[2] = mmcra;
 	mmcr[3] = mmcr2;
+	mmcr[4] = mmcr3;
 
 	return 0;
 }
diff --git a/arch/powerpc/perf/isa207-common.h b/arch/powerpc/perf/isa207-common.h
index 63fd4f3..85cbce5 100644
--- a/arch/powerpc/perf/isa207-common.h
+++ b/arch/powerpc/perf/isa207-common.h
@@ -87,6 +87,31 @@
 	 EVENT_LINUX_MASK					|	\
 	 EVENT_PSEL_MASK))
 
+/* Constants to support power10 raw encoding format */
+#define p10_SDAR_MODE_SHIFT		22
+#define p10_SDAR_MODE_MASK		0x3ull
+#define p10_SDAR_MODE(v)		(((v) >> p10_SDAR_MODE_SHIFT) & \
+					p10_SDAR_MODE_MASK)
+#define p10_EVENT_L2L3_SEL_MASK		0x1f
+#define p10_L2L3_SEL_SHIFT		3
+#define p10_L2L3_EVENT_SHIFT		40
+#define p10_EVENT_THRESH_MASK		0xffffull
+#define p10_EVENT_CACHE_SEL_MASK	0x3ull
+#define p10_EVENT_MMCR3_MASK		0x7fffull
+#define p10_EVENT_MMCR3_SHIFT		45
+
+#define p10_EVENT_VALID_MASK		\
+	((p10_SDAR_MODE_MASK   << p10_SDAR_MODE_SHIFT		|	\
+	(p10_EVENT_THRESH_MASK  << EVENT_THRESH_SHIFT)		|	\
+	(EVENT_SAMPLE_MASK     << EVENT_SAMPLE_SHIFT)		|	\
+	(p10_EVENT_CACHE_SEL_MASK  << EVENT_CACHE_SEL_SHIFT)	|	\
+	(EVENT_PMC_MASK        << EVENT_PMC_SHIFT)		|	\
+	(EVENT_UNIT_MASK       << EVENT_UNIT_SHIFT)		|	\
+	(p9_EVENT_COMBINE_MASK << p9_EVENT_COMBINE_SHIFT)	|	\
+	(p10_EVENT_MMCR3_MASK  << p10_EVENT_MMCR3_SHIFT)	|	\
+	(EVENT_MARKED_MASK     << EVENT_MARKED_SHIFT)		|	\
+	 EVENT_LINUX_MASK					|	\
+	EVENT_PSEL_MASK))
 /*
  * Layout of constraint bits:
  *
@@ -135,6 +160,9 @@
 #define CNST_CACHE_PMC4_VAL	(1ull << 54)
 #define CNST_CACHE_PMC4_MASK	CNST_CACHE_PMC4_VAL
 
+#define CNST_L2L3_GROUP_VAL(v)	(((v) & 0x1full) << 55)
+#define CNST_L2L3_GROUP_MASK	CNST_L2L3_GROUP_VAL(0x1f)
+
 /*
  * For NC we are counting up to 4 events. This requires three bits, and we need
  * the fifth event to overflow and set the 4th bit. To achieve that we bias the
@@ -191,7 +219,7 @@
 #define MMCRA_THR_CTR_EXP(v)		(((v) >> MMCRA_THR_CTR_EXP_SHIFT) &\
 						MMCRA_THR_CTR_EXP_MASK)
 
-/* MMCR1 Threshold Compare bit constant for power9 */
+/* MMCRA Threshold Compare bit constant for power9/power10 */
 #define p9_MMCRA_THR_CMP_SHIFT	45
 
 /* Bits in MMCR2 for PowerISA v2.07 */
@@ -202,6 +230,9 @@
 #define MAX_ALT				2
 #define MAX_PMU_COUNTERS		6
 
+/* Bits in MMCR3 for PowerISA v3.10 */
+#define MMCR3_SHIFT(pmc)		(49 - (15 * ((pmc) - 1)))
+
 #define ISA207_SIER_TYPE_SHIFT		15
 #define ISA207_SIER_TYPE_MASK		(0x7ull << ISA207_SIER_TYPE_SHIFT)
 
diff --git a/arch/powerpc/perf/power10-events-list.h b/arch/powerpc/perf/power10-events-list.h
new file mode 100644
index 0000000..a15bb87
--- /dev/null
+++ b/arch/powerpc/perf/power10-events-list.h
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Performance counter support for POWER10 processors.
+ *
+ * Copyright 2020 Madhavan Srinivasan, IBM Corporation.
+ * Copyright 2020 Athira Rajeev, IBM Corporation.
+ */
+
+/*
+ * Power10 event codes.
+ */
+EVENT(PM_RUN_CYC,				0x600f4);
+EVENT(PM_DISP_STALL_CYC,			0x100f8);
+EVENT(PM_EXEC_STALL,				0x30008);
+EVENT(PM_RUN_INST_CMPL,				0x500fa);
+EVENT(PM_BR_FIN,				0x10068);
+EVENT(PM_BR_MPRED_FIN,				0x45884);
+
+/* All L1 D cache load references counted at finish, gated by reject */
+EVENT(PM_LD_REF_L1,				0x100fc);
+/* Load Missed L1 */
+EVENT(PM_LD_DEMAND_MISS_L1_FIN,			0x400f0);
+EVENT(PM_LD_MISS_L1,				0x3e054);
+/* Store Missed L1 */
+EVENT(PM_ST_MISS_L1,				0x300f0);
+/* L1 cache data prefetches */
+EVENT(PM_LD_PREFETCH_CACHE_LINE_MISS,		0x1002c);
+/* Demand iCache Miss */
+EVENT(PM_L1_ICACHE_MISS,			0x200fc);
+/* Instruction fetches from L1 */
+EVENT(PM_INST_FROM_L1,				0x04080);
+/* Instruction Demand sectors written into IL1 */
+EVENT(PM_INST_FROM_L1MISS,			0x03f00000001c040);
+/* Instruction prefetch written into IL1 */
+EVENT(PM_IC_PREF_REQ,				0x040a0);
+/* The data cache was reloaded from local core's L3 due to a demand load */
+EVENT(PM_DATA_FROM_L3,				0x01340000001c040);
+/* Demand LD - L3 Miss (not L2 hit and not L3 hit) */
+EVENT(PM_DATA_FROM_L3MISS,			0x300fe);
+/* All successful D-side store dispatches for this thread */
+EVENT(PM_L2_ST,					0x010000046080);
+/* All successful D-side store dispatches for this thread that were L2 Miss */
+EVENT(PM_L2_ST_MISS,				0x26880);
+/* Total HW L3 prefetches(Load+store) */
+EVENT(PM_L3_PF_MISS_L3,				0xd8b8);
+/* Branch Load misses */
+EVENT(PM_BR_MPRED_CMPL,				0x400f6);
+/* Branch loads */
+EVENT(PM_BR_CMPL,				0x4d05e);
+/* Data PTEG reload */
+EVENT(PM_DTLB_MISS,				0x300fc);
+/* ITLB Reloaded */
+EVENT(PM_ITLB_MISS,				0x400fc);
+
+EVENT(PM_RUN_CYC_ALT,				0x0001e);
+EVENT(PM_RUN_INST_CMPL_ALT,			0x00002);
+
+/*
+ * Memory Access Events
+ *
+ * The primary PMU event used here is PM_MRK_INST_CMPL (0x401e0).
+ * To capture memory profiling data, these MMCRA bits need to be
+ * programmed along with the corresponding raw event format
+ * encoding.
+ *
+ * MMCRA bits encoding needed are
+ *     SM (Sampling Mode)
+ *     EM (Eligibility for Random Sampling)
+ *     TECE (Threshold Event Counter Event)
+ *     TS (Threshold Start Event)
+ *     TE (Threshold End Event)
+ *
+ * Corresponding Raw Encoding bits:
+ *     sample [EM,SM]
+ *     thresh_sel (TECE)
+ *     thresh start (TS)
+ *     thresh end (TE)
+ */
+
+EVENT(MEM_LOADS,				0x34340401e0);
+EVENT(MEM_STORES,				0x343c0401e0);
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
new file mode 100644
index 0000000..07d0781
--- /dev/null
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -0,0 +1,422 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Performance counter support for POWER10 processors.
+ *
+ * Copyright 2020 Madhavan Srinivasan, IBM Corporation.
+ * Copyright 2020 Athira Rajeev, IBM Corporation.
+ */
+
+#define pr_fmt(fmt)	"power10-pmu: " fmt
+
+#include "isa207-common.h"
+
+/*
+ * Raw event encoding for Power10:
+ *
+ *        60        56        52        48        44        40        36        32
+ * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
+ *   | | [ ]   [ src_match ] [  src_mask ]   | [ ] [ l2l3_sel ]  [  thresh_ctl   ]
+ *   | |  |                                  |  |                         |
+ *   | |  *- IFM (Linux)                     |  |        thresh start/stop -*
+ *   | *- BHRB (Linux)                       |  src_sel
+ *   *- EBB (Linux)                          *invert_bit
+ *
+ *        28        24        20        16        12         8         4         0
+ * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
+ *   [   ] [  sample ]   [ ] [ ]   [ pmc ]   [unit ]   [ ]   m   [    pmcxsel    ]
+ *     |        |        |    |                        |     |
+ *     |        |        |    |                        |     *- mark
+ *     |        |        |    *- L1/L2/L3 cache_sel    |
+ *     |        |        sdar_mode                     |
+ *     |        *- sampling mode for marked events     *- combine
+ *     |
+ *     *- thresh_sel
+ *
+ * Below uses IBM bit numbering.
+ *
+ * MMCR1[x:y] = unit    (PMCxUNIT)
+ * MMCR1[24]   = pmc1combine[0]
+ * MMCR1[25]   = pmc1combine[1]
+ * MMCR1[26]   = pmc2combine[0]
+ * MMCR1[27]   = pmc2combine[1]
+ * MMCR1[28]   = pmc3combine[0]
+ * MMCR1[29]   = pmc3combine[1]
+ * MMCR1[30]   = pmc4combine[0]
+ * MMCR1[31]   = pmc4combine[1]
+ *
+ * if pmc == 3 and unit == 0 and pmcxsel[0:6] == 0b0101011
+ *	MMCR1[20:27] = thresh_ctl
+ * else if pmc == 4 and unit == 0xf and pmcxsel[0:6] == 0b0101001
+ *	MMCR1[20:27] = thresh_ctl
+ * else
+ *	MMCRA[48:55] = thresh_ctl   (THRESH START/END)
+ *
+ * if thresh_sel:
+ *	MMCRA[45:47] = thresh_sel
+ *
+ * if l2l3_sel:
+ *	MMCR2[56:60] = l2l3_sel[0:4]
+ *
+ * MMCR1[16] = cache_sel[0]
+ * MMCR1[17] = cache_sel[1]
+ *
+ * if mark:
+ *	MMCRA[63]    = 1		(SAMPLE_ENABLE)
+ *	MMCRA[57:59] = sample[0:2]	(RAND_SAMP_ELIG)
+ *	MMCRA[61:62] = sample[3:4]	(RAND_SAMP_MODE)
+ *
+ * if EBB and BHRB:
+ *	MMCRA[32:33] = IFM
+ *
+ * MMCRA[SDAR_MODE]  = sdar_mode[0:1]
+ */
+
+/*
+ * Some power10 event codes.
+ */
+#define EVENT(_name, _code)     enum{_name = _code}
+
+#include "power10-events-list.h"
+
+#undef EVENT
+
+/* MMCRA IFM bits - POWER10 */
+#define POWER10_MMCRA_IFM1		0x0000000040000000UL
+#define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
+
+/* Table of alternatives, sorted by column 0 */
+static const unsigned int power10_event_alternatives[][MAX_ALT] = {
+	{ PM_RUN_CYC_ALT,		PM_RUN_CYC },
+	{ PM_RUN_INST_CMPL_ALT,		PM_RUN_INST_CMPL },
+};
+
+static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
+{
+	int num_alt = 0;
+
+	num_alt = isa207_get_alternatives(event, alt,
+					  ARRAY_SIZE(
+					  power10_event_alternatives), flags,
+					  power10_event_alternatives);
+
+	return num_alt;
+}
+
+GENERIC_EVENT_ATTR(cpu-cycles,			PM_RUN_CYC);
+GENERIC_EVENT_ATTR(stalled-cycles-frontend,	PM_DISP_STALL_CYC);
+GENERIC_EVENT_ATTR(stalled-cycles-backend,	PM_EXEC_STALL);
+GENERIC_EVENT_ATTR(instructions,		PM_RUN_INST_CMPL);
+GENERIC_EVENT_ATTR(branch-instructions,		PM_BR_FIN);
+GENERIC_EVENT_ATTR(branch-misses,		PM_BR_MPRED_FIN);
+GENERIC_EVENT_ATTR(cache-references,		PM_LD_REF_L1);
+GENERIC_EVENT_ATTR(cache-misses,		PM_LD_DEMAND_MISS_L1_FIN);
+GENERIC_EVENT_ATTR(mem-loads,			MEM_LOADS);
+GENERIC_EVENT_ATTR(mem-stores,			MEM_STORES);
+
+CACHE_EVENT_ATTR(L1-dcache-load-misses,		PM_LD_MISS_L1);
+CACHE_EVENT_ATTR(L1-dcache-loads,		PM_LD_REF_L1);
+CACHE_EVENT_ATTR(L1-dcache-prefetches,		PM_LD_PREFETCH_CACHE_LINE_MISS);
+CACHE_EVENT_ATTR(L1-dcache-store-misses,	PM_ST_MISS_L1);
+CACHE_EVENT_ATTR(L1-icache-load-misses,		PM_L1_ICACHE_MISS);
+CACHE_EVENT_ATTR(L1-icache-loads,		PM_INST_FROM_L1);
+CACHE_EVENT_ATTR(L1-icache-prefetches,		PM_IC_PREF_REQ);
+CACHE_EVENT_ATTR(LLC-load-misses,		PM_DATA_FROM_L3MISS);
+CACHE_EVENT_ATTR(LLC-loads,			PM_DATA_FROM_L3);
+CACHE_EVENT_ATTR(LLC-prefetches,		PM_L3_PF_MISS_L3);
+CACHE_EVENT_ATTR(LLC-store-misses,		PM_L2_ST_MISS);
+CACHE_EVENT_ATTR(LLC-stores,			PM_L2_ST);
+CACHE_EVENT_ATTR(branch-load-misses,		PM_BR_MPRED_CMPL);
+CACHE_EVENT_ATTR(branch-loads,			PM_BR_CMPL);
+CACHE_EVENT_ATTR(dTLB-load-misses,		PM_DTLB_MISS);
+CACHE_EVENT_ATTR(iTLB-load-misses,		PM_ITLB_MISS);
+
+static struct attribute *power10_events_attr[] = {
+	GENERIC_EVENT_PTR(PM_RUN_CYC),
+	GENERIC_EVENT_PTR(PM_DISP_STALL_CYC),
+	GENERIC_EVENT_PTR(PM_EXEC_STALL),
+	GENERIC_EVENT_PTR(PM_RUN_INST_CMPL),
+	GENERIC_EVENT_PTR(PM_BR_FIN),
+	GENERIC_EVENT_PTR(PM_BR_MPRED_FIN),
+	GENERIC_EVENT_PTR(PM_LD_REF_L1),
+	GENERIC_EVENT_PTR(PM_LD_DEMAND_MISS_L1_FIN),
+	GENERIC_EVENT_PTR(MEM_LOADS),
+	GENERIC_EVENT_PTR(MEM_STORES),
+	CACHE_EVENT_PTR(PM_LD_MISS_L1),
+	CACHE_EVENT_PTR(PM_LD_REF_L1),
+	CACHE_EVENT_PTR(PM_LD_PREFETCH_CACHE_LINE_MISS),
+	CACHE_EVENT_PTR(PM_ST_MISS_L1),
+	CACHE_EVENT_PTR(PM_L1_ICACHE_MISS),
+	CACHE_EVENT_PTR(PM_INST_FROM_L1),
+	CACHE_EVENT_PTR(PM_IC_PREF_REQ),
+	CACHE_EVENT_PTR(PM_DATA_FROM_L3MISS),
+	CACHE_EVENT_PTR(PM_DATA_FROM_L3),
+	CACHE_EVENT_PTR(PM_L3_PF_MISS_L3),
+	CACHE_EVENT_PTR(PM_L2_ST_MISS),
+	CACHE_EVENT_PTR(PM_L2_ST),
+	CACHE_EVENT_PTR(PM_BR_MPRED_CMPL),
+	CACHE_EVENT_PTR(PM_BR_CMPL),
+	CACHE_EVENT_PTR(PM_DTLB_MISS),
+	CACHE_EVENT_PTR(PM_ITLB_MISS),
+	NULL
+};
+
+static struct attribute_group power10_pmu_events_group = {
+	.name = "events",
+	.attrs = power10_events_attr,
+};
+
+PMU_FORMAT_ATTR(event,          "config:0-59");
+PMU_FORMAT_ATTR(pmcxsel,        "config:0-7");
+PMU_FORMAT_ATTR(mark,           "config:8");
+PMU_FORMAT_ATTR(combine,        "config:10-11");
+PMU_FORMAT_ATTR(unit,           "config:12-15");
+PMU_FORMAT_ATTR(pmc,            "config:16-19");
+PMU_FORMAT_ATTR(cache_sel,      "config:20-21");
+PMU_FORMAT_ATTR(sdar_mode,      "config:22-23");
+PMU_FORMAT_ATTR(sample_mode,    "config:24-28");
+PMU_FORMAT_ATTR(thresh_sel,     "config:29-31");
+PMU_FORMAT_ATTR(thresh_stop,    "config:32-35");
+PMU_FORMAT_ATTR(thresh_start,   "config:36-39");
+PMU_FORMAT_ATTR(l2l3_sel,       "config:40-44");
+PMU_FORMAT_ATTR(src_sel,        "config:45-46");
+PMU_FORMAT_ATTR(invert_bit,     "config:47");
+PMU_FORMAT_ATTR(src_mask,       "config:48-53");
+PMU_FORMAT_ATTR(src_match,      "config:54-59");
+
+static struct attribute *power10_pmu_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_pmcxsel.attr,
+	&format_attr_mark.attr,
+	&format_attr_combine.attr,
+	&format_attr_unit.attr,
+	&format_attr_pmc.attr,
+	&format_attr_cache_sel.attr,
+	&format_attr_sdar_mode.attr,
+	&format_attr_sample_mode.attr,
+	&format_attr_thresh_sel.attr,
+	&format_attr_thresh_stop.attr,
+	&format_attr_thresh_start.attr,
+	&format_attr_l2l3_sel.attr,
+	&format_attr_src_sel.attr,
+	&format_attr_invert_bit.attr,
+	&format_attr_src_mask.attr,
+	&format_attr_src_match.attr,
+	NULL,
+};
+
+static struct attribute_group power10_pmu_format_group = {
+	.name = "format",
+	.attrs = power10_pmu_format_attr,
+};
+
+static const struct attribute_group *power10_pmu_attr_groups[] = {
+	&power10_pmu_format_group,
+	&power10_pmu_events_group,
+	NULL,
+};
+
+static int power10_generic_events[] = {
+	[PERF_COUNT_HW_CPU_CYCLES] =			PM_RUN_CYC,
+	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =	PM_DISP_STALL_CYC,
+	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =	PM_EXEC_STALL,
+	[PERF_COUNT_HW_INSTRUCTIONS] =			PM_RUN_INST_CMPL,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] =		PM_BR_FIN,
+	[PERF_COUNT_HW_BRANCH_MISSES] =			PM_BR_MPRED_FIN,
+	[PERF_COUNT_HW_CACHE_REFERENCES] =		PM_LD_REF_L1,
+	[PERF_COUNT_HW_CACHE_MISSES] =			PM_LD_DEMAND_MISS_L1_FIN,
+};
+
+static u64 power10_bhrb_filter_map(u64 branch_sample_type)
+{
+	u64 pmu_bhrb_filter = 0;
+
+	/* BHRB and regular PMU events share the same privilege state
+	 * filter configuration. BHRB is always recorded along with a
+	 * regular PMU event. As the privilege state filter is handled
+	 * in the basic PMC configuration of the accompanying regular
+	 * PMU event, we ignore any separate BHRB specific request.
+	 */
+
+	/* No branch filter requested */
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
+		return pmu_bhrb_filter;
+
+	/* Invalid branch filter options - HW does not support */
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		return -1;
+
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
+		return -1;
+
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
+		return -1;
+
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
+		pmu_bhrb_filter |= POWER10_MMCRA_IFM1;
+		return pmu_bhrb_filter;
+	}
+
+	/* Everything else is unsupported */
+	return -1;
+}
+
+static void power10_config_bhrb(u64 pmu_bhrb_filter)
+{
+	pmu_bhrb_filter &= POWER10_MMCRA_BHRB_MASK;
+
+	/* Enable BHRB filter in PMU */
+	mtspr(SPRN_MMCRA, (mfspr(SPRN_MMCRA) | pmu_bhrb_filter));
+}
+
+#define C(x)	PERF_COUNT_HW_CACHE_##x
+
+/*
+ * Table of generalized cache-related events.
+ * 0 means not supported, -1 means nonsensical, other values
+ * are event codes.
+ */
+static u64 power10_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+	[C(L1D)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = PM_LD_REF_L1,
+			[C(RESULT_MISS)] = PM_LD_MISS_L1,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = 0,
+			[C(RESULT_MISS)] = PM_ST_MISS_L1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = PM_LD_PREFETCH_CACHE_LINE_MISS,
+			[C(RESULT_MISS)] = 0,
+		},
+	},
+	[C(L1I)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = PM_INST_FROM_L1,
+			[C(RESULT_MISS)] = PM_L1_ICACHE_MISS,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = PM_INST_FROM_L1MISS,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = PM_IC_PREF_REQ,
+			[C(RESULT_MISS)] = 0,
+		},
+	},
+	[C(LL)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = PM_DATA_FROM_L3,
+			[C(RESULT_MISS)] = PM_DATA_FROM_L3MISS,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = PM_L2_ST,
+			[C(RESULT_MISS)] = PM_L2_ST_MISS,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = PM_L3_PF_MISS_L3,
+			[C(RESULT_MISS)] = 0,
+		},
+	},
+	[C(DTLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = 0,
+			[C(RESULT_MISS)] = PM_DTLB_MISS,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+	},
+	[C(ITLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = 0,
+			[C(RESULT_MISS)] = PM_ITLB_MISS,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+	},
+	[C(BPU)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = PM_BR_CMPL,
+			[C(RESULT_MISS)] = PM_BR_MPRED_CMPL,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+	},
+	[C(NODE)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+	},
+};
+
+#undef C
+
+static struct power_pmu power10_pmu = {
+	.name			= "POWER10",
+	.n_counter		= MAX_PMU_COUNTERS,
+	.add_fields		= ISA207_ADD_FIELDS,
+	.test_adder		= ISA207_TEST_ADDER,
+	.group_constraint_mask	= CNST_CACHE_PMC4_MASK,
+	.group_constraint_val	= CNST_CACHE_PMC4_VAL,
+	.compute_mmcr		= isa207_compute_mmcr,
+	.config_bhrb		= power10_config_bhrb,
+	.bhrb_filter_map	= power10_bhrb_filter_map,
+	.get_constraint		= isa207_get_constraint,
+	.get_alternatives	= power10_get_alternatives,
+	.get_mem_data_src	= isa207_get_mem_data_src,
+	.get_mem_weight		= isa207_get_mem_weight,
+	.disable_pmc		= isa207_disable_pmc,
+	.flags			= PPMU_HAS_SIER | PPMU_ARCH_207S |
+				  PPMU_ARCH_310S,
+	.n_generic		= ARRAY_SIZE(power10_generic_events),
+	.generic_events		= power10_generic_events,
+	.cache_events		= &power10_cache_events,
+	.attr_groups		= power10_pmu_attr_groups,
+	.bhrb_nr		= 32,
+};
+
+int init_power10_pmu(void)
+{
+	int rc;
+
+	/* Comes from cpu_specs[] */
+	if (!cur_cpu_spec->oprofile_cpu_type ||
+	    strcmp(cur_cpu_spec->oprofile_cpu_type, "ppc64/power10"))
+		return -ENODEV;
+
+	rc = register_power_pmu(&power10_pmu);
+	if (rc)
+		return rc;
+
+	/* Tell userspace that EBB is supported */
+	cur_cpu_spec->cpu_user_features2 |= PPC_FEATURE2_EBB;
+
+	return 0;
+}
-- 
1.8.3.1



* [PATCH 5/7] powerpc/perf: Update Power PMU cache_events to u64 type
From: Athira Rajeev @ 2020-06-05  7:57 UTC (permalink / raw)
  To: mpe; +Cc: mikey, mikey, maddy, linuxppc-dev, atrajeev
In-Reply-To: <1591343830-8286-1-git-send-email-atrajeev@linux.vnet.ibm.com>

Events of type PERF_TYPE_HW_CACHE were described for the Power PMU
as: int (*cache_events)[type][op][result];

where the type, op, and result values unpacked from the event attribute
config value are used to generate the raw event code at runtime.

So far the event code values used to create these cache-related
events fit within 32 bits, so the `int` type worked. In power10,
some of the event codes are 64-bit values, hence update the
Power PMU cache_events to `u64` type in the `power_pmu` struct.
Also propagate this change to all existing PMU driver code paths
which use ppmu->cache_events.
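
As a rough illustration of why the wider type is needed, the following
user-space sketch (not part of the patch) shows what is lost when a
64-bit event code is kept in a 32-bit slot. The event codes are the
PM_DATA_FROM_L3 and PM_DATA_FROM_L3MISS values from the power10 events
list in this series; narrow_entry() and survives_32bit() are
hypothetical helpers, and uint32_t stands in for the old `int` table
entry on the assumption that int is 32 bits wide.

```c
#include <stdint.h>

/* Event codes as listed in power10-events-list.h in this series. */
#define PM_DATA_FROM_L3		0x01340000001c040ULL
#define PM_DATA_FROM_L3MISS	0x300feULL

/* Keep only the low 32 bits, as a 32-bit table entry would. */
static uint32_t narrow_entry(uint64_t code)
{
	return (uint32_t)code;
}

/* Returns 1 if the code is unchanged after a round trip through 32 bits. */
static int survives_32bit(uint64_t code)
{
	return (uint64_t)narrow_entry(code) == code;
}
```

With these helpers, PM_DATA_FROM_L3 is reduced to 0x1c040 and no longer
matches the original code, while a code such as PM_DATA_FROM_L3MISS
fits in 32 bits and is unaffected.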

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/perf_event_server.h | 2 +-
 arch/powerpc/perf/core-book3s.c              | 2 +-
 arch/powerpc/perf/generic-compat-pmu.c       | 2 +-
 arch/powerpc/perf/mpc7450-pmu.c              | 2 +-
 arch/powerpc/perf/power5+-pmu.c              | 2 +-
 arch/powerpc/perf/power5-pmu.c               | 2 +-
 arch/powerpc/perf/power6-pmu.c               | 2 +-
 arch/powerpc/perf/power7-pmu.c               | 2 +-
 arch/powerpc/perf/power8-pmu.c               | 2 +-
 arch/powerpc/perf/power9-pmu.c               | 2 +-
 arch/powerpc/perf/ppc970-pmu.c               | 2 +-
 11 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 895aeaa..cb207f8 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -47,7 +47,7 @@ struct power_pmu {
 	const struct attribute_group	**attr_groups;
 	int		n_generic;
 	int		*generic_events;
-	int		(*cache_events)[PERF_COUNT_HW_CACHE_MAX]
+	u64		(*cache_events)[PERF_COUNT_HW_CACHE_MAX]
 			       [PERF_COUNT_HW_CACHE_OP_MAX]
 			       [PERF_COUNT_HW_CACHE_RESULT_MAX];
 
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 9db72cd..6de81d1 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1818,7 +1818,7 @@ static void hw_perf_event_destroy(struct perf_event *event)
 static int hw_perf_cache_event(u64 config, u64 *eventp)
 {
 	unsigned long type, op, result;
-	int ev;
+	u64 ev;
 
 	if (!ppmu->cache_events)
 		return -EINVAL;
diff --git a/arch/powerpc/perf/generic-compat-pmu.c b/arch/powerpc/perf/generic-compat-pmu.c
index 5e5a54d..eb8a6aaf 100644
--- a/arch/powerpc/perf/generic-compat-pmu.c
+++ b/arch/powerpc/perf/generic-compat-pmu.c
@@ -101,7 +101,7 @@ enum {
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int generic_compat_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 generic_compat_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[ C(L1D) ] = {
 		[ C(OP_READ) ] = {
 			[ C(RESULT_ACCESS) ] = 0,
diff --git a/arch/powerpc/perf/mpc7450-pmu.c b/arch/powerpc/perf/mpc7450-pmu.c
index 4d5ef92..cf1eb89 100644
--- a/arch/powerpc/perf/mpc7450-pmu.c
+++ b/arch/powerpc/perf/mpc7450-pmu.c
@@ -354,7 +354,7 @@ static void mpc7450_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int mpc7450_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 mpc7450_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0,		0x225	},
 		[C(OP_WRITE)] = {	0,		0x227	},
diff --git a/arch/powerpc/perf/power5+-pmu.c b/arch/powerpc/perf/power5+-pmu.c
index f857454..9252281 100644
--- a/arch/powerpc/perf/power5+-pmu.c
+++ b/arch/powerpc/perf/power5+-pmu.c
@@ -618,7 +618,7 @@ static void power5p_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power5p_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power5p_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0x1c10a8,	0x3c1088	},
 		[C(OP_WRITE)] = {	0x2c10a8,	0xc10c3		},
diff --git a/arch/powerpc/perf/power5-pmu.c b/arch/powerpc/perf/power5-pmu.c
index da52eca..3b36630 100644
--- a/arch/powerpc/perf/power5-pmu.c
+++ b/arch/powerpc/perf/power5-pmu.c
@@ -560,7 +560,7 @@ static void power5_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power5_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power5_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0x4c1090,	0x3c1088	},
 		[C(OP_WRITE)] = {	0x3c1090,	0xc10c3		},
diff --git a/arch/powerpc/perf/power6-pmu.c b/arch/powerpc/perf/power6-pmu.c
index 3929cac..540b78d 100644
--- a/arch/powerpc/perf/power6-pmu.c
+++ b/arch/powerpc/perf/power6-pmu.c
@@ -481,7 +481,7 @@ static void p6_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * are event codes.
  * The "DTLB" and "ITLB" events relate to the DERAT and IERAT.
  */
-static int power6_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power6_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0x280030,	0x80080		},
 		[C(OP_WRITE)] = {	0x180032,	0x80088		},
diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index a137813..2b7f375 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -332,7 +332,7 @@ static void power7_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power7_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power7_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0xc880,		0x400f0	},
 		[C(OP_WRITE)] = {	0,		0x300f0	},
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 3a5fcc2..5282e84 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -253,7 +253,7 @@ static void power8_config_bhrb(u64 pmu_bhrb_filter)
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power8_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power8_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[ C(L1D) ] = {
 		[ C(OP_READ) ] = {
 			[ C(RESULT_ACCESS) ] = PM_LD_REF_L1,
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 08c3ef7..05dae38 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -310,7 +310,7 @@ static void power9_config_bhrb(u64 pmu_bhrb_filter)
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power9_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power9_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[ C(L1D) ] = {
 		[ C(OP_READ) ] = {
 			[ C(RESULT_ACCESS) ] = PM_LD_REF_L1,
diff --git a/arch/powerpc/perf/ppc970-pmu.c b/arch/powerpc/perf/ppc970-pmu.c
index 4035d93..2970d1e 100644
--- a/arch/powerpc/perf/ppc970-pmu.c
+++ b/arch/powerpc/perf/ppc970-pmu.c
@@ -432,7 +432,7 @@ static void p970_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int ppc970_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 ppc970_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0x8810,		0x3810	},
 		[C(OP_WRITE)] = {	0x7810,		0x813	},
-- 
1.8.3.1
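
As a rough illustration of what the int to u64 change in these tables affects, here is a minimal userspace sketch. It assumes the motivation is event codes wider than 32 bits, which the hunks above do not state. The names demo_store_as_int, demo_classify and DEMO_WIDE_EVENT are hypothetical and do not appear in any PMU driver; the 0 / -1 convention is taken from the comment repeated above each table.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* userspace stand-in for the kernel's u64 */

static_assert(sizeof(u64) == 8, "u64 must be 64 bits wide");

/* Hypothetical event code that needs more than 32 bits. */
#define DEMO_WIDE_EVENT	0x100000001ULL

/* What a 32-bit 'int' table element would retain of a wide code. */
static int demo_store_as_int(u64 code)
{
	return (int)(uint32_t)code;	/* only the low 32 bits survive */
}

/*
 * Classify a table entry using the convention from the table comments:
 * 0 = not supported, -1 = nonsensical, anything else = event code.
 * In a u64 table a -1 initializer is stored as all ones, so the check
 * compares against (u64)-1 instead of testing for a negative value.
 */
static int demo_classify(u64 entry)
{
	if (entry == 0)
		return 0;	/* not supported */
	if (entry == (u64)-1)
		return -1;	/* nonsensical */
	return 1;		/* usable event code */
}
```

With an unsigned element type, a check such as "entry < 0" can never be true, so any consumer of these tables would need an equality comparison like the one sketched here.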



