LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [RESEND PATCH v7 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP
From: Vaibhav Jain @ 2020-05-19 19:00 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams
In-Reply-To: <20200519190058.257981-1-vaibhav@linux.ibm.com>

Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit big-endian integers, bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via newly introduced dimm specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried, 60s after the previous successful hcall.

The patch also adds a  documentation text describing flags reported by
the the new sysfs attribute 'papr/flags' is also introduced at
Documentation/ABI/testing/sysfs-bus-papr-scm.

[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

Resend:
* None

v6..v7 :
* Used the exported buf_seq_printf() function to generate content for
  'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
  and removed the papr_scm.h file [Mpe]
* Some minor consistency issued in sysfs-bus-papr-scm
  documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two function one of which takes
  care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
  READ_ONCE(). [Mpe]

v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
  [Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
  integer [Mpe]
* Updated patch description to reflect the changes made in this
  version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
  flags_show() [Dan Williams]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
       	 NVDIMM unarmed [Aneesh]

v1..v2 :
* New patch in the series.
---
 Documentation/ABI/testing/sysfs-bus-papr-scm |  27 +++
 arch/powerpc/platforms/pseries/papr_scm.c    | 169 ++++++++++++++++++-
 2 files changed, 194 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-scm

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-scm b/Documentation/ABI/testing/sysfs-bus-papr-scm
new file mode 100644
index 000000000000..6143d06072f1
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-scm
@@ -0,0 +1,27 @@
+What:		/sys/bus/nd/devices/nmemX/papr/flags
+Date:		Apr, 2020
+KernelVersion:	v5.8
+Contact:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
+Description:
+		(RO) Report flags indicating various states of a
+		papr-scm NVDIMM device. Each flag maps to a one or
+		more bits set in the dimm-health-bitmap retrieved in
+		response to H_SCM_HEALTH hcall. The details of the bit
+		flags returned in response to this hcall is available
+		at 'Documentation/powerpc/papr_hcalls.rst' . Below are
+		the flags reported in this sysfs file:
+
+		* "not_armed"	: Indicates that NVDIMM contents will not
+				  survive a power cycle.
+		* "flush_fail"	: Indicates that NVDIMM contents
+				  couldn't be flushed during last
+				  shut-down event.
+		* "restore_fail": Indicates that NVDIMM contents
+				  couldn't be restored during NVDIMM
+				  initialization.
+		* "encrypted"	: NVDIMM contents are encrypted.
+		* "smart_notify": There is health event for the NVDIMM.
+		* "scrubbed"	: Indicating that contents of the
+				  NVDIMM have been scrubbed.
+		* "locked"	: Indicating that NVDIMM contents cant
+				  be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f35592423380..142636e1a59f 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -12,6 +12,7 @@
 #include <linux/libnvdimm.h>
 #include <linux/platform_device.h>
 #include <linux/delay.h>
+#include <linux/seq_buf.h>
 
 #include <asm/plpar_wrappers.h>
 
@@ -22,6 +23,44 @@
 	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
 	 (1ul << ND_CMD_SET_CONFIG_DATA))
 
+/* DIMM health bitmap bitmap indicators */
+/* SCM device is unable to persist memory contents */
+#define PAPR_SCM_DIMM_UNARMED                   (1ULL << (63 - 0))
+/* SCM device failed to persist memory contents */
+#define PAPR_SCM_DIMM_SHUTDOWN_DIRTY            (1ULL << (63 - 1))
+/* SCM device contents are persisted from previous IPL */
+#define PAPR_SCM_DIMM_SHUTDOWN_CLEAN            (1ULL << (63 - 2))
+/* SCM device contents are not persisted from previous IPL */
+#define PAPR_SCM_DIMM_EMPTY                     (1ULL << (63 - 3))
+/* SCM device memory life remaining is critically low */
+#define PAPR_SCM_DIMM_HEALTH_CRITICAL           (1ULL << (63 - 4))
+/* SCM device will be garded off next IPL due to failure */
+#define PAPR_SCM_DIMM_HEALTH_FATAL              (1ULL << (63 - 5))
+/* SCM contents cannot persist due to current platform health status */
+#define PAPR_SCM_DIMM_HEALTH_UNHEALTHY          (1ULL << (63 - 6))
+/* SCM device is unable to persist memory contents in certain conditions */
+#define PAPR_SCM_DIMM_HEALTH_NON_CRITICAL       (1ULL << (63 - 7))
+/* SCM device is encrypted */
+#define PAPR_SCM_DIMM_ENCRYPTED                 (1ULL << (63 - 8))
+/* SCM device has been scrubbed and locked */
+#define PAPR_SCM_DIMM_SCRUBBED_AND_LOCKED       (1ULL << (63 - 9))
+
+/* Bits status indicators for health bitmap indicating unarmed dimm */
+#define PAPR_SCM_DIMM_UNARMED_MASK (PAPR_SCM_DIMM_UNARMED |		\
+				    PAPR_SCM_DIMM_HEALTH_UNHEALTHY)
+
+/* Bits status indicators for health bitmap indicating unflushed dimm */
+#define PAPR_SCM_DIMM_BAD_SHUTDOWN_MASK (PAPR_SCM_DIMM_SHUTDOWN_DIRTY)
+
+/* Bits status indicators for health bitmap indicating unrestored dimm */
+#define PAPR_SCM_DIMM_BAD_RESTORE_MASK  (PAPR_SCM_DIMM_EMPTY)
+
+/* Bit status indicators for smart event notification */
+#define PAPR_SCM_DIMM_SMART_EVENT_MASK (PAPR_SCM_DIMM_HEALTH_CRITICAL | \
+					PAPR_SCM_DIMM_HEALTH_FATAL |	\
+					PAPR_SCM_DIMM_HEALTH_UNHEALTHY)
+
+/* private struct associated with each region */
 struct papr_scm_priv {
 	struct platform_device *pdev;
 	struct device_node *dn;
@@ -39,6 +78,15 @@ struct papr_scm_priv {
 	struct resource res;
 	struct nd_region *region;
 	struct nd_interleave_set nd_set;
+
+	/* Protect dimm health data from concurrent read/writes */
+	struct mutex health_mutex;
+
+	/* Last time the health information of the dimm was updated */
+	unsigned long lasthealth_jiffies;
+
+	/* Health information for the dimm */
+	u64 health_bitmap;
 };
 
 static int drc_pmem_bind(struct papr_scm_priv *p)
@@ -144,6 +192,62 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
 	return drc_pmem_bind(p);
 }
 
+/*
+ * Issue hcall to retrieve dimm health info and populate papr_scm_priv with the
+ * health information.
+ */
+static int __drc_pmem_query_health(struct papr_scm_priv *p)
+{
+	unsigned long ret[PLPAR_HCALL_BUFSIZE];
+	s64 rc;
+
+	/* issue the hcall */
+	rc = plpar_hcall(H_SCM_HEALTH, ret, p->drc_index);
+	if (rc != H_SUCCESS) {
+		dev_err(&p->pdev->dev,
+			 "Failed to query health information, Err:%lld\n", rc);
+		rc = -ENXIO;
+		goto out;
+	}
+
+	p->lasthealth_jiffies = jiffies;
+	p->health_bitmap = ret[0] & ret[1];
+
+	dev_dbg(&p->pdev->dev,
+		"Queried dimm health info. Bitmap:0x%016lx Mask:0x%016lx\n",
+		ret[0], ret[1]);
+out:
+	return rc;
+}
+
+/* Min interval in seconds for assuming stable dimm health */
+#define MIN_HEALTH_QUERY_INTERVAL 60
+
+/* Query cached health info and if needed call drc_pmem_query_health */
+static int drc_pmem_query_health(struct papr_scm_priv *p)
+{
+	unsigned long cache_timeout;
+	s64 rc;
+
+	/* Protect concurrent modifications to papr_scm_priv */
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		return rc;
+
+	/* Jiffies offset for which the health data is assumed to be same */
+	cache_timeout = p->lasthealth_jiffies +
+		msecs_to_jiffies(MIN_HEALTH_QUERY_INTERVAL * 1000);
+
+	/* Fetch new health info is its older than MIN_HEALTH_QUERY_INTERVAL */
+	if (time_after(jiffies, cache_timeout))
+		rc = __drc_pmem_query_health(p);
+	else
+		/* Assume cached health data is valid */
+		rc = 0;
+
+	mutex_unlock(&p->health_mutex);
+	return rc;
+}
 
 static int papr_scm_meta_get(struct papr_scm_priv *p,
 			     struct nd_cmd_get_config_data_hdr *hdr)
@@ -286,6 +390,64 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 	return 0;
 }
 
+static ssize_t flags_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct nvdimm *dimm = to_nvdimm(dev);
+	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
+	struct seq_buf s;
+	u64 health;
+	int rc;
+
+	rc = drc_pmem_query_health(p);
+	if (rc)
+		return rc;
+
+	/* Copy health_bitmap locally, check masks & update out buffer */
+	health = READ_ONCE(p->health_bitmap);
+
+	seq_buf_init(&s, buf, PAGE_SIZE);
+	if (health & PAPR_SCM_DIMM_UNARMED_MASK)
+		seq_buf_printf(&s, "not_armed ");
+
+	if (health & PAPR_SCM_DIMM_BAD_SHUTDOWN_MASK)
+		seq_buf_printf(&s, "flush_fail ");
+
+	if (health & PAPR_SCM_DIMM_BAD_RESTORE_MASK)
+		seq_buf_printf(&s, "restore_fail ");
+
+	if (health & PAPR_SCM_DIMM_ENCRYPTED)
+		seq_buf_printf(&s, "encrypted ");
+
+	if (health & PAPR_SCM_DIMM_SMART_EVENT_MASK)
+		seq_buf_printf(&s, "smart_notify ");
+
+	if (health & PAPR_SCM_DIMM_SCRUBBED_AND_LOCKED)
+		seq_buf_printf(&s, "scrubbed locked ");
+
+	if (seq_buf_used(&s))
+		seq_buf_printf(&s, "\n");
+
+	return seq_buf_used(&s);
+}
+DEVICE_ATTR_RO(flags);
+
+/* papr_scm specific dimm attributes */
+static struct attribute *papr_scm_nd_attributes[] = {
+	&dev_attr_flags.attr,
+	NULL,
+};
+
+static struct attribute_group papr_scm_nd_attribute_group = {
+	.name = "papr",
+	.attrs = papr_scm_nd_attributes,
+};
+
+static const struct attribute_group *papr_scm_dimm_attr_groups[] = {
+	&papr_scm_nd_attribute_group,
+	NULL,
+};
+
 static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 {
 	struct device *dev = &p->pdev->dev;
@@ -312,8 +474,8 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 	dimm_flags = 0;
 	set_bit(NDD_LABELING, &dimm_flags);
 
-	p->nvdimm = nvdimm_create(p->bus, p, NULL, dimm_flags,
-				  PAPR_SCM_DIMM_CMD_MASK, 0, NULL);
+	p->nvdimm = nvdimm_create(p->bus, p, papr_scm_dimm_attr_groups,
+				  dimm_flags, PAPR_SCM_DIMM_CMD_MASK, 0, NULL);
 	if (!p->nvdimm) {
 		dev_err(dev, "Error creating DIMM object for %pOF\n", p->dn);
 		goto err;
@@ -399,6 +561,9 @@ static int papr_scm_probe(struct platform_device *pdev)
 	if (!p)
 		return -ENOMEM;
 
+	/* Initialize the dimm mutex */
+	mutex_init(&p->health_mutex);
+
 	/* optional DT properties */
 	of_property_read_u32(dn, "ibm,metadata-size", &metadata_size);
 
-- 
2.26.2


^ permalink raw reply related

* [RESEND PATCH v7 1/5] powerpc: Document details on H_SCM_HEALTH hcall
From: Vaibhav Jain @ 2020-05-19 19:00 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams
In-Reply-To: <20200519190058.257981-1-vaibhav@linux.ibm.com>

Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

Resend:
* None

v6..v7:
* None

v5..v6
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 43 ++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..9a5ba5eaf323 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,48 @@ from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the NVDIMM. The asserted bits in the health-bitmap indicate one or more states
+(described in table below) of the NVDIMM and health-bit-valid-bitmap indicate
+which bits in health-bitmap are valid.
+
+Health Bitmap Flags:
+
++------+-----------------------------------------------------------------------+
+|  Bit |               Definition                                              |
++======+=======================================================================+
+|  00  | SCM device is unable to persist memory contents.                      |
+|      | If the system is powered down, nothing will be saved.                 |
++------+-----------------------------------------------------------------------+
+|  01  | SCM device failed to persist memory contents. Either contents were not|
+|      | saved successfully on power down or were not restored properly on     |
+|      | power up.                                                             |
++------+-----------------------------------------------------------------------+
+|  02  | SCM device contents are persisted from previous IPL. The data from    |
+|      | the last boot were successfully restored.                             |
++------+-----------------------------------------------------------------------+
+|  03  | SCM device contents are not persisted from previous IPL. There was no |
+|      | data to restore from the last boot.                                   |
++------+-----------------------------------------------------------------------+
+|  04  | SCM device memory life remaining is critically low                    |
++------+-----------------------------------------------------------------------+
+|  05  | SCM device will be garded off next IPL due to failure                 |
++------+-----------------------------------------------------------------------+
+|  06  | SCM contents cannot persist due to current platform health status. A  |
+|      | hardware failure may prevent data from being saved or restored.       |
++------+-----------------------------------------------------------------------+
+|  07  | SCM device is unable to persist memory contents in certain conditions |
++------+-----------------------------------------------------------------------+
+|  08  | SCM device is encrypted                                               |
++------+-----------------------------------------------------------------------+
+|  09  | SCM device has successfully completed a requested erase or secure     |
+|      | erase procedure.                                                      |
++------+-----------------------------------------------------------------------+
+|10:63 | Reserved / Unused                                                     |
++------+-----------------------------------------------------------------------+
 
 **H_SCM_PERFORMANCE_STATS**
 
-- 
2.26.2


^ permalink raw reply related

* [RESEND PATCH v7 0/5] powerpc/papr_scm: Add support for reporting nvdimm health
From: Vaibhav Jain @ 2020-05-19 19:00 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams

The PAPR standard[1][3] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[2].  Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.

To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[2].  This health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.

These changes coupled with proposed ndtcl changes located at Ref[4]
should provide a way for the user to retrieve NVDIMM health status
using ndtcl.

Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in a emulation environment:

 # ndctl list -DH
[
  {
    "dev":"nmem0",
    "health":{
      "health_state":"fatal",
      "shutdown_state":"dirty"
    }
  }
]

Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:

 # ndctl list -d nmem0 -H
[
  {
    "dev":"nmem0",
    "health":{
      "health_state":"ok",
      "shutdown_state":"clean"
    }
  }
]

PAPR NVDIMM-Specific-Methods(PDSM)
==================================

PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMS. PDSMs
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_scm_pdsm.h'. Support for
more PDSMs will be added in future.

Structure of the patch-set
==========================

The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
thats used in subsequent patches to generate sysfs attribute content.

Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.

Fourth patches deal with implementing support for servicing PDSM
commands in papr_scm module.

Finally the last patch implements support for servicing PDSM
'PAPR_SCM_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.

Changelog:
==========

Resend:
* Added ack from Steven Rostedt on Patch-2 that exports kernel symbol
  seq_buf_printf()

v6..v7:

* Incorporate various review comments from Mpe.  Removed papr_scm.h
* Added a patch to export seq_buf_printf() [Mpe, Steven Rostedt]
* header file and moved its contents to papr_scm.c.
* Split function drc_pmem_query_health() into two functions, one that takes
  care of caching and concurrency and other one that doesn't.
* Fixed a possible incorrect way to make local copy of nvdimm health data.
* Some variable renames changed as suggested in previous review.
* Removed unused macros/defines from papr_scm_pdsm.h
* Updated papr_scm_pdsm.h to remove usage of __KERNEL__ define.
* Updated papr_scm_pdsm.h to remove redefinition of __packed macro.

v5..v6:

* Incorporate review comments from Mpe and Dan Williams.
* Changed the usage of term DSM to PDSM as former conflicted with
  usage in ACPI context.
* UAPI updates to remove usage of bool and marking the structs 
  defined as 'packed'.
* Simplified the health-bitmap handling in papr_scm to use u64
  instead of __be64 integers.
* Caching of the health information so reading the dimm-flag file
  doesn't result in costly hcalls everytime.
* Changed dimm-flag 'save_fail' to 'flush_fail'
* Moved the dimm flag file from 'papr_flags' to 'papr/flags'.
* Added a patch to document H_SCM_HEALTH hcall return values.
* Added sysfs ABI documentation for newly introduce dimm-flag
  sysfs file 'papr/flags'

v4..v5:

* Fixed a bug in new implementation of papr_scm_ndctl() that was triggering
  a false error condition.

v3..v4:

* Restructured papr_scm_ndctl() to dispatch ND_CMD_CALL commands to a new
  function named papr_scm_service_dsm() to serivice DSM requests. [Aneesh]

v2..v3:

* Updated the papr_scm_dsm.h header to be more confimant general kernel
  guidelines for UAPI headers. [Aneesh]

* Changed the definition of macro PAPR_SCM_DIMM_UNARMED_MASK to not
  include case when the NVDIMM is unarmed because its a vPMEM
  NVDIMM. [Aneesh]

v1..v2:

* Restructured the patch-set based on review comments on V1 patch-set to
simplify the patch review. Multiple small patches have been combined into
single patches to reduce cross referencing that was needed in earlier
patch-set. Hence most of the patches in this patch-set as now new. [Aneesh]

* Removed the initial work done for fetch NVDIMM performance statistics.
These changes will be re-proposed in a separate patch-set. [Aneesh]

* Simplified handling of versioning of 'struct
nd_papr_scm_dimm_health_stat_v1' as only one version of the structure is
currently in existence.

References:
[1] "Power Architecture Platform Reference"
      https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[2] commit 58b278f568f0
     ("powerpc: Provide initial documentation for PAPR hcalls")
[3] "Linux on Power Architecture Platform Reference"
     https://members.openpowerfoundation.org/document/dl/469
[4] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v7

Vaibhav Jain (5):
  powerpc: Document details on H_SCM_HEALTH hcall
  seq_buf: Export seq_buf_printf() to external modules
  powerpc/papr_scm: Fetch nvdimm health information from PHYP
  ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
  powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_HEALTH

 Documentation/ABI/testing/sysfs-bus-papr-scm  |  27 ++
 Documentation/powerpc/papr_hcalls.rst         |  43 ++-
 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h | 173 +++++++++
 arch/powerpc/platforms/pseries/papr_scm.c     | 363 +++++++++++++++++-
 include/uapi/linux/ndctl.h                    |   1 +
 lib/seq_buf.c                                 |   1 +
 6 files changed, 595 insertions(+), 13 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-scm
 create mode 100644 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h

-- 
2.26.2


^ permalink raw reply

* Re: [PATCH v2 3/5] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
From: Dan Williams @ 2020-05-19 18:59 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: alistair, linuxppc-dev, linux-nvdimm
In-Reply-To: <87d070f2vs.fsf@linux.ibm.com>

On Tue, May 19, 2020 at 6:53 AM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> Dan Williams <dan.j.williams@intel.com> writes:
>
> > On Mon, May 18, 2020 at 10:30 PM Aneesh Kumar K.V
> > <aneesh.kumar@linux.ibm.com> wrote:
>
> ...
>
> >> Applications using new instructions will behave as expected when running
> >> on P8 and P9. Only future hardware will differentiate between 'dcbf' and
> >> 'dcbfps'
> >
> > Right, this is the problem. Applications using new instructions behave
> > as expected, the kernel has been shipping of_pmem and papr_scm for
> > several cycles now, you're saying that the DAX applications written
> > against those platforms are going to be broken on P8 and P9?
>
> The expecation is that both kernel and userspace would get upgraded to
> use the new instruction before actual persistent memory devices are
> made available.
>
> >
> >> > I'm thinking the kernel
> >> > should go as far as to disable DAX operation by default on new
> >> > hardware until userspace asserts that it is prepared to switch to the
> >> > new implementation. Is there any other way to ensure the forward
> >> > compatibility of deployed ppc64 DAX applications?
> >>
> >> AFAIU there is no released persistent memory hardware on ppc64 platform
> >> and we need to make sure before applications get enabled to use these
> >> persistent memory devices, they should switch to use the new
> >> instruction?
> >
> > Right, I want the kernel to offer some level of safety here because
> > everything you are describing sounds like a flag day conversion. Am I
> > misreading? Is there some other gate that prevents existing users of
> > of_pmem and papr_scm from having their expectations violated when
> > running on P8 / P9 hardware? Maybe there's tighter ecosystem control
> > that I'm just not familiar with, I'm only going off the fact that the
> > kernel has shipped a non-zero number of NVDIMM drivers that build with
> > ARCH=ppc64 for several cycles.
>
> If we are looking at adding changes to kernel that will prevent a kernel
> from running on newer hardware in a specific case, we could as well take
> the changes to get the kernel use the newer instructions right?

Oh, no, I'm not talking about stopping the kernel from running. I'm
simply recommending that support for MAP_SYNC mappings (userspace
managed flushing) be disabled by default on PPC with either a
compile-time or run-time default to assert that userspace has been
audited for legacy applications or that the platform owner is
otherwise willing to take the risk.

> But I agree with your concern that if we have older kernel/applications
> that continue to use `dcbf` on future hardware we will end up
> having issues w.r.t powerfail consistency. The plan is what you outlined
> above as tighter ecosystem control. Considering we don't have a pmem
> device generally available, we get both kernel and userspace upgraded
> to use these new instructions before such a device is made available.

Ok, I think a compile time kernel option with a runtime override
satisfies my concern. Does that work for you?

^ permalink raw reply

* Re: [PATCH] arch/{mips,sparc,microblaze,powerpc}: Don't enable pagefault/preempt twice
From: Ira Weiny @ 2020-05-19 18:40 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Peter Zijlstra, Dave Hansen, dri-devel, linux-mips,
	James E.J. Bottomley, Max Filippov, Paul Mackerras,
	H. Peter Anvin, sparclinux, Dan Williams, Helge Deller, x86,
	linux-csky, Christoph Hellwig, Ingo Molnar, linux-snps-arc,
	linux-xtensa, Borislav Petkov, Al Viro, Andy Lutomirski,
	Thomas Gleixner, linux-arm-kernel, Chris Zankel,
	Thomas Bogendoerfer, linux-parisc, linux-kernel, Christian Koenig,
	Andrew Morton, linuxppc-dev, David S. Miller
In-Reply-To: <20200519165422.GA5838@roeck-us.net>

On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> > enables when vaddr is not in the fixmap.
> > 
> > Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> microblazeel works with this patch,

Awesome...  Andrew in my rush yesterday I should have put a reported by on the
patch for Guenter as well.

Sorry about that Guenter,
Ira

> as do the nosmp sparc32 boot tests,
> but sparc32 boot tests with SMP enabled still fail with lots of messages
> such as:
> 
> BUG: Bad page state in process swapper/0  pfn:006a1
> page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
> flags: 0x0()
> raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
> page dumped because: nonzero mapcount
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
> [f00e7ab8 :
> bad_page+0xa8/0x108 ]
> [f00e8b54 :
> free_pcppages_bulk+0x154/0x52c ]
> [f00ea024 :
> free_unref_page+0x54/0x6c ]
> [f00ed864 :
> free_reserved_area+0x58/0xec ]
> [f0527104 :
> kernel_init+0x14/0x110 ]
> [f000b77c :
> ret_from_kernel_thread+0xc/0x38 ]
> [00000000 :
> 0x0 ]
> 
> Code path leading to that message is different but always the same
> from free_unref_page().
> 
> Still testing ppc images.
> 
> Guenter

^ permalink raw reply

* Re: [PATCH v2 6/7] powerpc/dt_cpu_ftrs: Add MMA feature
From: Paul A. Clarke @ 2020-05-19 17:51 UTC (permalink / raw)
  To: Alistair Popple; +Cc: aneesh.kumar, mikey, linuxppc-dev, npiggin
In-Reply-To: <20200519003157.31946-7-alistair@popple.id.au>

On Tue, May 19, 2020 at 10:31:56AM +1000, Alistair Popple wrote:
> Matrix multiple accumulate (MMA) is a new feature added to ISAv3.1 and

Conclusion is that this should be "Matrix-Multiply Assist", but then there
are a couple more below...

> POWER10. Support on powernv can be selected via a firmware CPU device
> tree feature which enables it via a PCR bit.
> 
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
> ---
>  arch/powerpc/include/asm/reg.h    |  3 ++-
>  arch/powerpc/kernel/dt_cpu_ftrs.c | 17 ++++++++++++++++-
>  2 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 1931b1142599..c446863a40cf 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -479,7 +479,8 @@
>  #define   PCR_VEC_DIS	(__MASK(63-0))	/* Vec. disable (bit NA since POWER8) */
>  #define   PCR_VSX_DIS	(__MASK(63-1))	/* VSX disable (bit NA since POWER8) */
>  #define   PCR_TM_DIS	(__MASK(63-2))	/* Trans. memory disable (POWER8) */
> -#define   PCR_HIGH_BITS	(PCR_VEC_DIS | PCR_VSX_DIS | PCR_TM_DIS)
> +#define   PCR_MMA_DIS	(__MASK(63-3)) /* Matrix-Multiply Accelerator */

also here.

> +#define   PCR_HIGH_BITS	(PCR_MMA_DIS | PCR_VEC_DIS | PCR_VSX_DIS | PCR_TM_DIS)
>  /*
>   * These bits are used in the function kvmppc_set_arch_compat() to specify and
>   * determine both the compatibility level which we want to emulate and the
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index 93c340906aad..e7540ee5cad8 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -75,6 +75,7 @@ static struct {
>  	u64	lpcr_clear;
>  	u64	hfscr;
>  	u64	fscr;
> +	u64	pcr;
>  } system_registers;
> 
>  static void (*init_pmu_registers)(void);
> @@ -102,7 +103,7 @@ static void __restore_cpu_cpufeatures(void)
>  	if (hv_mode) {
>  		mtspr(SPRN_LPID, 0);
>  		mtspr(SPRN_HFSCR, system_registers.hfscr);
> -		mtspr(SPRN_PCR, PCR_MASK);
> +		mtspr(SPRN_PCR, system_registers.pcr);
>  	}
>  	mtspr(SPRN_FSCR, system_registers.fscr);
> 
> @@ -555,6 +556,18 @@ static int __init feat_enable_large_ci(struct dt_cpu_feature *f)
>  	return 1;
>  }
> 
> +static int __init feat_enable_mma(struct dt_cpu_feature *f)
> +{
> +	u64 pcr;
> +
> +	feat_enable(f);
> +	pcr = mfspr(SPRN_PCR);
> +	pcr &= ~PCR_MMA_DIS;
> +	mtspr(SPRN_PCR, pcr);
> +
> +	return 1;
> +}
> +
>  struct dt_cpu_feature_match {
>  	const char *name;
>  	int (*enable)(struct dt_cpu_feature *f);
> @@ -629,6 +642,7 @@ static struct dt_cpu_feature_match __initdata
>  	{"vector-binary16", feat_enable, 0},
>  	{"wait-v3", feat_enable, 0},
>  	{"prefix-instructions", feat_enable, 0},
> +	{"matrix-multiply-accumulate", feat_enable_mma, 0},

and presumably here as well.

>  };

PC

^ permalink raw reply

* Re: [PATCH v2 1/7] powerpc: Add new HWCAP bits
From: Paul A. Clarke @ 2020-05-19 17:48 UTC (permalink / raw)
  To: Alistair Popple; +Cc: aneesh.kumar, mikey, linuxppc-dev, npiggin
In-Reply-To: <20200519003157.31946-2-alistair@popple.id.au>

On Tue, May 19, 2020 at 10:31:51AM +1000, Alistair Popple wrote:
> POWER10 introduces two new architectural features - ISAv3.1 and matrix
> multiply accumulate (MMA) instructions. Userspace detects the presence
> of these features via two HWCAP bits introduced in this patch. These
> bits have been agreed to by the compiler and binutils team.
> 
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
> ---
>  arch/powerpc/include/uapi/asm/cputable.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/include/uapi/asm/cputable.h b/arch/powerpc/include/uapi/asm/cputable.h
> index 540592034740..2692a56bf20b 100644
> --- a/arch/powerpc/include/uapi/asm/cputable.h
> +++ b/arch/powerpc/include/uapi/asm/cputable.h
> @@ -50,6 +50,8 @@
>  #define PPC_FEATURE2_DARN		0x00200000 /* darn random number insn */
>  #define PPC_FEATURE2_SCV		0x00100000 /* scv syscall */
>  #define PPC_FEATURE2_HTM_NO_SUSPEND	0x00080000 /* TM w/out suspended state */
> +#define PPC_FEATURE2_ARCH_3_1		0x00040000 /* ISA 3.1 */
> +#define PPC_FEATURE2_MMA		0x00020000 /* Matrix Multiply Accumulate */

Carrying the conclusion of the recent discussion further, this should also be
"Matrix-Multiply Assist".

PC

^ permalink raw reply

* Re: [PATCH] arch/{mips,sparc,microblaze,powerpc}: Don't enable pagefault/preempt twice
From: Guenter Roeck @ 2020-05-19 16:54 UTC (permalink / raw)
  To: ira.weiny
  Cc: Peter Zijlstra, Dave Hansen, dri-devel, linux-mips,
	James E.J. Bottomley, Max Filippov, Paul Mackerras,
	H. Peter Anvin, sparclinux, Dan Williams, Helge Deller, x86,
	linux-csky, Christoph Hellwig, Ingo Molnar, linux-snps-arc,
	linux-xtensa, Borislav Petkov, Al Viro, Andy Lutomirski,
	Thomas Gleixner, linux-arm-kernel, Chris Zankel,
	Thomas Bogendoerfer, linux-parisc, linux-kernel, Christian Koenig,
	Andrew Morton, linuxppc-dev, David S. Miller
In-Reply-To: <20200518184843.3029640-1-ira.weiny@intel.com>

On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> enables when vaddr is not in the fixmap.
> 
> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

microblazeel works with this patch, as do the nosmp sparc32 boot tests,
but sparc32 boot tests with SMP enabled still fail with lots of messages
such as:

BUG: Bad page state in process swapper/0  pfn:006a1
page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
flags: 0x0()
raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
[f00e7ab8 :
bad_page+0xa8/0x108 ]
[f00e8b54 :
free_pcppages_bulk+0x154/0x52c ]
[f00ea024 :
free_unref_page+0x54/0x6c ]
[f00ed864 :
free_reserved_area+0x58/0xec ]
[f0527104 :
kernel_init+0x14/0x110 ]
[f000b77c :
ret_from_kernel_thread+0xc/0x38 ]
[00000000 :
0x0 ]

Code path leading to that message is different but always the same
from free_unref_page().

Still testing ppc images.

Guenter

^ permalink raw reply

* Re: [PATCH V3 07/15] arch/kunmap_atomic: Consolidate duplicate code
From: Ira Weiny @ 2020-05-19 16:29 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Peter Zijlstra, Dave Hansen, dri-devel, linux-mips,
	James E.J. Bottomley, Max Filippov, Paul Mackerras,
	H. Peter Anvin, sparclinux, Dan Williams, Helge Deller, x86,
	linux-csky, Christoph Hellwig, Ingo Molnar, linux-snps-arc,
	linux-xtensa, Borislav Petkov, Al Viro, Andy Lutomirski,
	Thomas Gleixner, linux-arm-kernel, Chris Zankel,
	Thomas Bogendoerfer, linux-parisc, linux-kernel, Christian Koenig,
	Andrew Morton, linuxppc-dev, David S. Miller
In-Reply-To: <47757f51-15f2-3abe-9035-abdb3ba5816e@roeck-us.net>

On Mon, May 18, 2020 at 07:50:36PM -0700, Guenter Roeck wrote:
> Hi Ira,
> 
> On 5/18/20 5:03 PM, Ira Weiny wrote:
> > On Sun, May 17, 2020 at 09:29:32PM -0700, Guenter Roeck wrote:
> >> On Sun, May 17, 2020 at 08:49:39PM -0700, Ira Weiny wrote:
> >>> On Sat, May 16, 2020 at 03:33:06PM -0700, Guenter Roeck wrote:
> >>>> On Thu, May 07, 2020 at 07:59:55AM -0700, ira.weiny@intel.com wrote:
> >>>>> From: Ira Weiny <ira.weiny@intel.com>
> >>>>>
> >>>

Sorry for the delay I missed this email last night...  I blame outlook...  ;-)

...

> >>> Do you have a kernel config?  Specifically is CONFIG_HIGHMEM set?
> >>>
> >> See below. Yes, CONFIG_HIGHMEM is set.
> >>
> >> The scripts used to build and boot the image are at:
> >>
> >> https://github.com/groeck/linux-build-test/tree/master/rootfs/microblazeel
> > 
> > Despite finding the obvious error earlier today I've still been trying to get
> > this to work.
> > 
> > I had to make some slight modifications to use the 0-day cross compile build
> > and my local qemu build.  But those were pretty minor modifications.  I'm
> > running on x86_64 host.
> > 
> > With those slight mods to the scripts I get the following error even without my
> > patch set on 5.7-rc4.  I have 1 cpu pegged at 100% while it is running...  Is
> > there anything I can do to get more debug output?  Perhaps I just need to let
> > it run longer?
> > 
> 
> I don't think so. Try running it with "-d" parameter (run-qemu-microblazeel.sh
> -d petalogix-s3adsp1800); that gives you the qemu command line. Once it says
> "running", abort the script and execute qemu directly.

FYI Minor nit...  a simple copy/paste failed...  that print of the cmd line
did not include quotes around the -append text:

09:06:03 > /home/iweiny/dev/qemu/microblazeel-softmmu/qemu-system-microblazeel
   -M petalogix-s3adsp1800 -m 256 -kernel arch/microblaze/boot/linux.bin
   -no-reboot -initrd /tmp/buildbot-cache/microblazeel/rootfs.cpio -append
   panic=-1 slub_debug=FZPUA rdinit=/sbin/init console=ttyUL0,115200 -monitor
   none -serial stdio -nographic

qemu-system-microblazeel: slub_debug=FZPUA: Could not open 'slub_debug=FZPUA': No such file or directory

> Oh, and please update
> the repository; turns out I didn't push for a while and made a number of
> changes.

Cool beans...  I've updated.

> 
> My compiler was compiled with buildroot (a long time ago). I don't recall if
> it needed something special in the configuration, unfortunately.

AFAICT the compile is working...  It is running from the command line now...  I
expected it to be slow so I have also increased the timeouts last night.  So
far it still fails.  I did notice that there is a new 'R' in the wait output.

<quote>
.........................R......................... failed (silent)
------------
qemu log:
qemu-system-microblazeel: terminating on signal 15 from pid 3357146 (/bin/bash)
</quote>

I was hoping that meant it found qemu 'running' but looks like that was just a
retry...  :-(

Last night I increased some of the timeouts I could find.

<quote>
 LOOPTIME=5     # Wait time before checking status
 -MAXTIME=150    # Maximum wait time for qemu session to complete
 -MAXSTIME=60    # Maximum wait time for qemu session to generate output
 +#MAXTIME=150   # Maximum wait time for qemu session to complete
 +#MAXSTIME=60   # Maximum wait time for qemu session to generate output
 +MAXTIME=300    # Maximum wait time for qemu session to complete
 +MAXSTIME=120   # Maximum wait time for qemu session to generate output
</quote>

But thanks to the qemu command line hint I can see these were not nearly
enough...  (It has been running for > 20 minutes...  and I'm not getting
output...)  Or I've done something really wrong.  Shouldn't qemu be at least
showing something on the terminal by now?  I normally run qemu with different
display options (and my qemu foo is weak) so I'm not sure what I should be
seeing with this command line.

09:06:28 > /home/iweiny/dev/qemu/microblazeel-softmmu/qemu-system-microblazeel
  -M petalogix-s3adsp1800 -m 256 -kernel arch/microblaze/boot/linux.bin
  -no-reboot -initrd /tmp/buildbot-cache/microblazeel/rootfs.cpio -append
  "panic=-1 slub_debug=FZPUA rdinit=/sbin/init console=ttyUL0,115200" -monitor
  none -serial stdio -nographic

Maybe I just have too slow of a machine...  :-/

My qemu was built back in March.  I'm updating that now...

Sorry for being so dense...
Ira

^ permalink raw reply

* Re: [PATCH v2 6/7] powerpc/dt_cpu_ftrs: Add MMA feature
From: Paul A. Clarke @ 2020-05-19 15:49 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Alistair Popple, mikey, linuxppc-dev, npiggin, aneesh.kumar
In-Reply-To: <20200519152855.GN31009@gate.crashing.org>

On Tue, May 19, 2020 at 10:28:55AM -0500, Segher Boessenkool wrote:
> On Tue, May 19, 2020 at 10:22:40AM -0500, Paul A. Clarke wrote:
> > On Tue, May 19, 2020 at 10:05:56AM -0500, Segher Boessenkool wrote:
> > > On Tue, May 19, 2020 at 09:49:22AM -0500, Paul A. Clarke wrote:
> > > > On Tue, May 19, 2020 at 10:31:56AM +1000, Alistair Popple wrote:
> > > > > Matrix multiple accumulate (MMA) is a new feature added to ISAv3.1 and
> > > > 
> > > > nit: "Matrix-Multiply Accelerator".
> > > 
> > > "Matrix-Multiply Assist" in fact :-)
> > 
> > Not according to the ISA (p. 1129).
> 
> There is one mistake in the 3.1 ABI yes, in the description for the PCR
> bit of this same name.  Everything else calls this "assist", like all
> similar things are called as well (whereas an "accelerator" is an
> external device).

I stand corrected. The PCR bit description is in error, and that's what I
was going by.  Section 7.6.1.12 "VSX Matrix-Multiply Assist (MMA)
Instructions" seems to be more definitive.  Thanks, Segher, for correcting
my correction.

Should be "Matrix-Multiply Assist".

:-)

PC

^ permalink raw reply

* Re: [PATCH v2 6/7] powerpc/dt_cpu_ftrs: Add MMA feature
From: Segher Boessenkool @ 2020-05-19 15:28 UTC (permalink / raw)
  To: Paul A. Clarke
  Cc: Alistair Popple, mikey, linuxppc-dev, npiggin, aneesh.kumar
In-Reply-To: <20200519152240.GB24922@oc3272150783.ibm.com>

On Tue, May 19, 2020 at 10:22:40AM -0500, Paul A. Clarke wrote:
> On Tue, May 19, 2020 at 10:05:56AM -0500, Segher Boessenkool wrote:
> > On Tue, May 19, 2020 at 09:49:22AM -0500, Paul A. Clarke wrote:
> > > On Tue, May 19, 2020 at 10:31:56AM +1000, Alistair Popple wrote:
> > > > Matrix multiple accumulate (MMA) is a new feature added to ISAv3.1 and
> > > 
> > > nit: "Matrix-Multiply Accelerator".
> > 
> > "Matrix-Multiply Assist" in fact :-)
> 
> Not according to the ISA (p. 1129).

There is one mistake in the 3.1 ABI yes, in the description for the PCR
bit of this same name.  Everything else calls this "assist", like all
similar things are called as well (whereas an "accelerator" is an
external device).


Segher

^ permalink raw reply

* Re: [PATCH v2 6/7] powerpc/dt_cpu_ftrs: Add MMA feature
From: Paul A. Clarke @ 2020-05-19 15:22 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Alistair Popple, mikey, linuxppc-dev, npiggin, aneesh.kumar
In-Reply-To: <20200519150556.GM31009@gate.crashing.org>

On Tue, May 19, 2020 at 10:05:56AM -0500, Segher Boessenkool wrote:
> On Tue, May 19, 2020 at 09:49:22AM -0500, Paul A. Clarke wrote:
> > On Tue, May 19, 2020 at 10:31:56AM +1000, Alistair Popple wrote:
> > > Matrix multiple accumulate (MMA) is a new feature added to ISAv3.1 and
> > 
> > nit: "Matrix-Multiply Accelerator".
> 
> "Matrix-Multiply Assist" in fact :-)

Not according to the ISA (p. 1129).

PC

^ permalink raw reply

* Re: [PATCH v2 6/7] powerpc/dt_cpu_ftrs: Add MMA feature
From: Segher Boessenkool @ 2020-05-19 15:05 UTC (permalink / raw)
  To: Paul A. Clarke
  Cc: Alistair Popple, mikey, linuxppc-dev, npiggin, aneesh.kumar
In-Reply-To: <20200519144922.GA24922@oc3272150783.ibm.com>

On Tue, May 19, 2020 at 09:49:22AM -0500, Paul A. Clarke wrote:
> On Tue, May 19, 2020 at 10:31:56AM +1000, Alistair Popple wrote:
> > Matrix multiple accumulate (MMA) is a new feature added to ISAv3.1 and
> 
> nit: "Matrix-Multiply Accelerator".

"Matrix-Multiply Assist" in fact :-)


Segher

^ permalink raw reply

* Re: [PATCH v2 6/7] powerpc/dt_cpu_ftrs: Add MMA feature
From: Paul A. Clarke @ 2020-05-19 14:49 UTC (permalink / raw)
  To: Alistair Popple; +Cc: aneesh.kumar, mikey, linuxppc-dev, npiggin
In-Reply-To: <20200519003157.31946-7-alistair@popple.id.au>

On Tue, May 19, 2020 at 10:31:56AM +1000, Alistair Popple wrote:
> Matrix multiple accumulate (MMA) is a new feature added to ISAv3.1 and

nit: "Matrix-Multiply Accelerator".

PC

^ permalink raw reply

* [powerpc:next 108/110] kernel/events/hw_breakpoint.c:216:12: warning: no previous prototype for function 'arch_reserve_bp_slot'
From: kbuild test robot @ 2020-05-19 14:47 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: clang-built-linux, Michael Neuling, kbuild-all, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3032 bytes --]

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
head:   30df74d67d48949da87e3a5b57c381763e8fd526
commit: 29da4f91c0c1fbda12b8a31be0d564930208c92e [108/110] powerpc/watchpoint: Don't allow concurrent perf and ptrace events
config: x86_64-allyesconfig (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 135b877874fae96b4372c8a3fbfaa8ff44ff86e3)
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        git checkout 29da4f91c0c1fbda12b8a31be0d564930208c92e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>, old ones prefixed by <<):

kernel/events/hw_breakpoint.c:71:12: warning: no previous prototype for function 'hw_breakpoint_weight' [-Wmissing-prototypes]
__weak int hw_breakpoint_weight(struct perf_event *bp)
^
kernel/events/hw_breakpoint.c:71:8: note: declare 'static' if the function is not intended to be used outside of this translation unit
__weak int hw_breakpoint_weight(struct perf_event *bp)
^
static
>> kernel/events/hw_breakpoint.c:216:12: warning: no previous prototype for function 'arch_reserve_bp_slot' [-Wmissing-prototypes]
__weak int arch_reserve_bp_slot(struct perf_event *bp)
^
kernel/events/hw_breakpoint.c:216:8: note: declare 'static' if the function is not intended to be used outside of this translation unit
__weak int arch_reserve_bp_slot(struct perf_event *bp)
^
static
>> kernel/events/hw_breakpoint.c:221:13: warning: no previous prototype for function 'arch_release_bp_slot' [-Wmissing-prototypes]
__weak void arch_release_bp_slot(struct perf_event *bp)
^
kernel/events/hw_breakpoint.c:221:8: note: declare 'static' if the function is not intended to be used outside of this translation unit
__weak void arch_release_bp_slot(struct perf_event *bp)
^
static
kernel/events/hw_breakpoint.c:228:13: warning: no previous prototype for function 'arch_unregister_hw_breakpoint' [-Wmissing-prototypes]
__weak void arch_unregister_hw_breakpoint(struct perf_event *bp)
^
kernel/events/hw_breakpoint.c:228:8: note: declare 'static' if the function is not intended to be used outside of this translation unit
__weak void arch_unregister_hw_breakpoint(struct perf_event *bp)
^
static
4 warnings generated.

vim +/arch_reserve_bp_slot +216 kernel/events/hw_breakpoint.c

   215	
 > 216	__weak int arch_reserve_bp_slot(struct perf_event *bp)
   217	{
   218		return 0;
   219	}
   220	
 > 221	__weak void arch_release_bp_slot(struct perf_event *bp)
   222	{
   223	}
   224	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 73420 bytes --]

^ permalink raw reply

* Re: [PATCH v2 3/5] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
From: Aneesh Kumar K.V @ 2020-05-19 13:52 UTC (permalink / raw)
  To: Dan Williams; +Cc: alistair, linuxppc-dev, linux-nvdimm
In-Reply-To: <CAPcyv4g+oE305Q5bYWkNBKFifB9c0TZo6+hqFQnqiFqU5QFrhQ@mail.gmail.com>

Dan Williams <dan.j.williams@intel.com> writes:

> On Mon, May 18, 2020 at 10:30 PM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:

...

>> Applications using new instructions will behave as expected when running
>> on P8 and P9. Only future hardware will differentiate between 'dcbf' and
>> 'dcbfps'
>
> Right, this is the problem. Applications using new instructions behave
> as expected, the kernel has been shipping of_pmem and papr_scm for
> several cycles now, you're saying that the DAX applications written
> against those platforms are going to be broken on P8 and P9?

The expecation is that both kernel and userspace would get upgraded to
use the new instruction before actual persistent memory devices are
made available.

>
>> > I'm thinking the kernel
>> > should go as far as to disable DAX operation by default on new
>> > hardware until userspace asserts that it is prepared to switch to the
>> > new implementation. Is there any other way to ensure the forward
>> > compatibility of deployed ppc64 DAX applications?
>>
>> AFAIU there is no released persistent memory hardware on ppc64 platform
>> and we need to make sure before applications get enabled to use these
>> persistent memory devices, they should switch to use the new
>> instruction?
>
> Right, I want the kernel to offer some level of safety here because
> everything you are describing sounds like a flag day conversion. Am I
> misreading? Is there some other gate that prevents existing users of
> of_pmem and papr_scm from having their expectations violated when
> running on P8 / P9 hardware? Maybe there's tighter ecosystem control
> that I'm just not familiar with, I'm only going off the fact that the
> kernel has shipped a non-zero number of NVDIMM drivers that build with
> ARCH=ppc64 for several cycles.

If we are looking at adding changes to kernel that will prevent a kernel
from running on newer hardware in a specific case, we could as well take
the changes to get the kernel use the newer instructions right?

But I agree with your concern that if we have older kernel/applications
that continue to use `dcbf` on future hardware we will end up
having issues w.r.t powerfail consistency. The plan is what you outlined
above as tighter ecosystem control. Considering we don't have a pmem
device generally available, we get both kernel and userspace upgraded
to use these new instructions before such a device is made available.

-aneesh

^ permalink raw reply

* Re: [PATCH] ASoC: fsl_asrc: Set ASR76K and ASR56K based on processing clock
From: Mark Brown @ 2020-05-19 13:19 UTC (permalink / raw)
  To: Shengjiu Wang, timur, nicoleotsuka, lgirdwood, Xiubo.Lee, tiwai,
	perex, festevam, alsa-devel
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1589278979-31008-1-git-send-email-shengjiu.wang@nxp.com>

On Tue, 12 May 2020 18:22:59 +0800, Shengjiu Wang wrote:
> The processing clock is different for platforms, so it is better
> to set ASR76K and ASR56K based on processing clock, rather than
> hard coding the value for them.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-5.8

Thanks!

[1/1] ASoC: fsl_asrc: Set ASR76K and ASR56K based on processing clock
      commit: f8953043e6e24f9425249464605a541d00950709

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

^ permalink raw reply

* Re: [PATCH] ASoC: fsl: imx-pcm-dma: Don't request dma channel in probe
From: Lucas Stach @ 2020-05-19 10:01 UTC (permalink / raw)
  To: Shengjiu Wang, timur, nicoleotsuka, Xiubo.Lee, festevam,
	lgirdwood, broonie, perex, tiwai, shawnguo, s.hauer, kernel,
	linux-imx, sumit.semwal, alsa-devel, linuxppc-dev,
	linux-arm-kernel, linux-kernel, linux-media, dri-devel,
	linaro-mm-sig
In-Reply-To: <1589881301-4143-1-git-send-email-shengjiu.wang@nxp.com>

Am Dienstag, den 19.05.2020, 17:41 +0800 schrieb Shengjiu Wang:
> There are two requirements that we need to move the request
> of dma channel from probe to open.

How do you handle -EPROBE_DEFER return code from the channel request if
you don't do it in probe?

> - When dma device binds with power-domains, the power will
> be enabled when we request dma channel. If the request of dma
> channel happen on probe, then the power-domains will be always
> enabled after kernel boot up,  which is not good for power
> saving,  so we need to move the request of dma channel to .open();

This is certainly something which could be fixed in the dmaengine
driver.

> - With FE-BE case, if the dma channel is requested in probe,
> then there will be below issue, which is caused by that the
> dma channel will be requested duplicately

Why is this requested a second time? Is this just some missing cleanup
on a deferred probe path?

Regards,
Lucas

> [  638.906268] sysfs: cannot create duplicate filename '/devices/soc0/soc/2000000.bus/2000000.spba-bus/2024000.esai/dma:tx'
> [  638.919061] CPU: 1 PID: 673 Comm: aplay Not tainted 5.7.0-rc1-12956-gfc64b2585593 #287
> [  638.927113] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [  638.933690] [<c0110dd8>] (unwind_backtrace) from [<c010b8ec>] (show_stack+0x10/0x14)
> [  638.941464] [<c010b8ec>] (show_stack) from [<c0557fc0>] (dump_stack+0xe4/0x118)
> [  638.948808] [<c0557fc0>] (dump_stack) from [<c032aeb4>] (sysfs_warn_dup+0x50/0x64)
> [  638.956406] [<c032aeb4>] (sysfs_warn_dup) from [<c032b1a8>] (sysfs_do_create_link_sd+0xc8/0xd4)
> [  638.965134] [<c032b1a8>] (sysfs_do_create_link_sd) from [<c05dc668>] (dma_request_chan+0xb0/0x210)
> [  638.974120] [<c05dc668>] (dma_request_chan) from [<c05dc7d0>] (dma_request_slave_channel+0x8/0x14)
> [  638.983111] [<c05dc7d0>] (dma_request_slave_channel) from [<c09d5548>] (fsl_asrc_dma_hw_params+0x1e0/0x438)
> [  638.992881] [<c09d5548>] (fsl_asrc_dma_hw_params) from [<c09c1654>] (soc_pcm_hw_params+0x4a0/0x6a8)
> [  639.001952] [<c09c1654>] (soc_pcm_hw_params) from [<c09c39d4>] (dpcm_fe_dai_hw_params+0x70/0xe4)
> [  639.010765] [<c09c39d4>] (dpcm_fe_dai_hw_params) from [<c099b274>] (snd_pcm_hw_params+0x158/0x418)
> [  639.019750] [<c099b274>] (snd_pcm_hw_params) from [<c099c5a0>] (snd_pcm_ioctl+0x734/0x183c)
> [  639.028129] [<c099c5a0>] (snd_pcm_ioctl) from [<c029ff94>] (ksys_ioctl+0x2ac/0xb98)
> [  639.035812] [<c029ff94>] (ksys_ioctl) from [<c0100080>] (ret_fast_syscall+0x0/0x28)
> [  639.043490] Exception stack(0xec529fa8 to 0xec529ff0)
> [  639.048565] 9fa0:                   bee84650 01321870 00000004 c25c4111 bee84650 0002000f
> [  639.056766] 9fc0: bee84650 01321870 01321820 00000036 00001f40 00000000 0002c2f8 00000003
> [  639.064964] 9fe0: b6f483fc bee8451c b6ee2655 b6e1dcf8
> [  639.070339] fsl-esai-dai 2024000.esai: Cannot create DMA dma:tx symlink
> 
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> ---
>  sound/soc/fsl/imx-pcm-dma.c | 173 +++++++++++++++++++++++++++++++++---
>  1 file changed, 159 insertions(+), 14 deletions(-)
> 
> diff --git a/sound/soc/fsl/imx-pcm-dma.c b/sound/soc/fsl/imx-pcm-dma.c
> index 04a9bc749016..dae53b384df4 100644
> --- a/sound/soc/fsl/imx-pcm-dma.c
> +++ b/sound/soc/fsl/imx-pcm-dma.c
> @@ -11,6 +11,7 @@
>  #include <linux/dmaengine.h>
>  #include <linux/types.h>
>  #include <linux/module.h>
> +#include <linux/dma-mapping.h>
>  
>  #include <sound/core.h>
>  #include <sound/pcm.h>
> @@ -29,24 +30,168 @@ static bool filter(struct dma_chan *chan, void *param)
>  	return true;
>  }
>  
> -static const struct snd_dmaengine_pcm_config imx_dmaengine_pcm_config = {
> -	.prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config,
> -	.compat_filter_fn = filter,
> -};
> +static int imx_pcm_hw_params(struct snd_soc_component *component,
> +			     struct snd_pcm_substream *substream,
> +			     struct snd_pcm_hw_params *params)
> +{
> +	struct snd_pcm_runtime *runtime = substream->runtime;
> +	struct snd_soc_pcm_runtime *rtd = substream->private_data;
> +	struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0);
> +	struct snd_dmaengine_dai_dma_data *dma_data;
> +	struct dma_slave_config config;
> +	struct dma_chan *chan;
> +	int ret = 0;
>  
> -int imx_pcm_dma_init(struct platform_device *pdev, size_t size)
> +	snd_pcm_set_runtime_buffer(substream, &substream->dma_buffer);
> +	runtime->dma_bytes = params_buffer_bytes(params);
> +
> +	chan = snd_dmaengine_pcm_get_chan(substream);
> +	if (!chan)
> +		return -EINVAL;
> +
> +	ret = snd_hwparams_to_dma_slave_config(substream, params, &config);
> +	if (ret)
> +		return ret;
> +
> +	dma_data = snd_soc_dai_get_dma_data(cpu_dai, substream);
> +	if (!dma_data)
> +		return -EINVAL;
> +
> +	snd_dmaengine_pcm_set_config_from_dai_data(substream,
> +						   dma_data,
> +						   &config);
> +	return dmaengine_slave_config(chan, &config);
> +}
> +
> +static int imx_pcm_hw_free(struct snd_soc_component *component,
> +			   struct snd_pcm_substream *substream)
>  {
> -	struct snd_dmaengine_pcm_config *config;
> +	snd_pcm_set_runtime_buffer(substream, NULL);
> +	return 0;
> +}
> +
> +static snd_pcm_uframes_t imx_pcm_pointer(struct snd_soc_component *component,
> +					 struct snd_pcm_substream *substream)
> +{
> +	return snd_dmaengine_pcm_pointer(substream);
> +}
> +
> +static int imx_pcm_trigger(struct snd_soc_component *component,
> +			   struct snd_pcm_substream *substream, int cmd)
> +{
> +	return snd_dmaengine_pcm_trigger(substream, cmd);
> +}
> +
> +static int imx_pcm_open(struct snd_soc_component *component,
> +			struct snd_pcm_substream *substream)
> +{
> +	struct snd_soc_pcm_runtime *rtd = substream->private_data;
> +	bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
> +	struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0);
> +	struct snd_dmaengine_dai_dma_data *dma_data;
> +	struct device *dev = component->dev;
> +	struct snd_pcm_hardware hw;
> +	struct dma_chan *chan;
> +	int ret;
> +
> +	ret = snd_pcm_hw_constraint_integer(substream->runtime,
> +					    SNDRV_PCM_HW_PARAM_PERIODS);
> +	if (ret < 0) {
> +		dev_err(dev, "failed to set pcm hw params periods\n");
> +		return ret;
> +	}
> +
> +	dma_data = snd_soc_dai_get_dma_data(cpu_dai, substream);
> +	if (!dma_data)
> +		return -EINVAL;
> +
> +	chan = dma_request_slave_channel(cpu_dai->dev, tx ? "tx" : "rx");
> +	if (!chan) {
> +		/* Try to request channel using compat_filter_fn */
> +		chan = snd_dmaengine_pcm_request_channel(filter,
> +							 dma_data->filter_data);
> +		if (!chan)
> +			return -ENXIO;
> +	}
>  
> -	config = devm_kzalloc(&pdev->dev,
> -			sizeof(struct snd_dmaengine_pcm_config), GFP_KERNEL);
> -	if (!config)
> -		return -ENOMEM;
> -	*config = imx_dmaengine_pcm_config;
> +	ret = snd_dmaengine_pcm_open(substream, chan);
> +	if (ret)
> +		goto pcm_open_fail;
>  
> -	return devm_snd_dmaengine_pcm_register(&pdev->dev,
> -		config,
> -		SND_DMAENGINE_PCM_FLAG_COMPAT);
> +	memset(&hw, 0, sizeof(hw));
> +	hw.info = SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
> +			SNDRV_PCM_INFO_INTERLEAVED;
> +	hw.periods_min = 2;
> +	hw.periods_max = UINT_MAX;
> +	hw.period_bytes_min = 256;
> +	hw.period_bytes_max = dma_get_max_seg_size(chan->device->dev);
> +	hw.buffer_bytes_max = IMX_DEFAULT_DMABUF_SIZE;
> +	hw.fifo_size = dma_data->fifo_size;
> +
> +	/* Refine the hw according to caps of DMA. */
> +	ret = snd_dmaengine_pcm_refine_runtime_hwparams(substream,
> +							dma_data,
> +							&hw,
> +							chan);
> +	if (ret < 0)
> +		goto refine_runtime_hwparams_fail;
> +
> +	snd_soc_set_runtime_hwparams(substream, &hw);
> +
> +	/* Support allocate memory from IRAM */
> +	ret = snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV_IRAM,
> +				  chan->device->dev,
> +				  hw.buffer_bytes_max,
> +				  &substream->dma_buffer);
> +	if (ret < 0)
> +		goto alloc_pagas_fail;
> +
> +	return 0;
> +
> +alloc_pagas_fail:
> +refine_runtime_hwparams_fail:
> +	snd_dmaengine_pcm_close(substream);
> +pcm_open_fail:
> +	dma_release_channel(chan);
> +
> +	return ret;
> +}
> +
> +static int imx_pcm_close(struct snd_soc_component *component,
> +			 struct snd_pcm_substream *substream)
> +{
> +	if (substream) {
> +		snd_dma_free_pages(&substream->dma_buffer);
> +		substream->dma_buffer.area = NULL;
> +		substream->dma_buffer.addr = 0;
> +	}
> +
> +	return snd_dmaengine_pcm_close_release_chan(substream);
> +}
> +
> +static int imx_pcm_new(struct snd_soc_component *component,
> +		       struct snd_soc_pcm_runtime *rtd)
> +{
> +	struct snd_card *card = rtd->card->snd_card;
> +
> +	return dma_coerce_mask_and_coherent(card->dev, DMA_BIT_MASK(32));
> +}
> +
> +static const struct snd_soc_component_driver imx_pcm_component = {
> +	.name           = "imx-pcm-dma",
> +	.pcm_construct	= imx_pcm_new,
> +	.open		= imx_pcm_open,
> +	.close		= imx_pcm_close,
> +	.hw_params	= imx_pcm_hw_params,
> +	.hw_free	= imx_pcm_hw_free,
> +	.trigger	= imx_pcm_trigger,
> +	.pointer	= imx_pcm_pointer,
> +};
> +
> +int imx_pcm_dma_init(struct platform_device *pdev, size_t size)
> +{
> +	return devm_snd_soc_register_component(&pdev->dev,
> +					       &imx_pcm_component, NULL, 0);
>  }
>  EXPORT_SYMBOL_GPL(imx_pcm_dma_init);
>  


^ permalink raw reply

* [PATCH] ASoC: fsl: imx-pcm-dma: Don't request dma channel in probe
From: Shengjiu Wang @ 2020-05-19  9:41 UTC (permalink / raw)
  To: timur, nicoleotsuka, Xiubo.Lee, festevam, lgirdwood, broonie,
	perex, tiwai, shawnguo, s.hauer, kernel, linux-imx, sumit.semwal,
	alsa-devel, linuxppc-dev, linux-arm-kernel, linux-kernel,
	linux-media, dri-devel, linaro-mm-sig

There are two requirements that we need to move the request
of dma channel from probe to open.

- When dma device binds with power-domains, the power will
be enabled when we request dma channel. If the request of dma
channel happen on probe, then the power-domains will be always
enabled after kernel boot up,  which is not good for power
saving,  so we need to move the request of dma channel to .open();

- With FE-BE case, if the dma channel is requested in probe,
then there will be below issue, which is caused by that the
dma channel will be requested duplicately

[  638.906268] sysfs: cannot create duplicate filename '/devices/soc0/soc/2000000.bus/2000000.spba-bus/2024000.esai/dma:tx'
[  638.919061] CPU: 1 PID: 673 Comm: aplay Not tainted 5.7.0-rc1-12956-gfc64b2585593 #287
[  638.927113] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[  638.933690] [<c0110dd8>] (unwind_backtrace) from [<c010b8ec>] (show_stack+0x10/0x14)
[  638.941464] [<c010b8ec>] (show_stack) from [<c0557fc0>] (dump_stack+0xe4/0x118)
[  638.948808] [<c0557fc0>] (dump_stack) from [<c032aeb4>] (sysfs_warn_dup+0x50/0x64)
[  638.956406] [<c032aeb4>] (sysfs_warn_dup) from [<c032b1a8>] (sysfs_do_create_link_sd+0xc8/0xd4)
[  638.965134] [<c032b1a8>] (sysfs_do_create_link_sd) from [<c05dc668>] (dma_request_chan+0xb0/0x210)
[  638.974120] [<c05dc668>] (dma_request_chan) from [<c05dc7d0>] (dma_request_slave_channel+0x8/0x14)
[  638.983111] [<c05dc7d0>] (dma_request_slave_channel) from [<c09d5548>] (fsl_asrc_dma_hw_params+0x1e0/0x438)
[  638.992881] [<c09d5548>] (fsl_asrc_dma_hw_params) from [<c09c1654>] (soc_pcm_hw_params+0x4a0/0x6a8)
[  639.001952] [<c09c1654>] (soc_pcm_hw_params) from [<c09c39d4>] (dpcm_fe_dai_hw_params+0x70/0xe4)
[  639.010765] [<c09c39d4>] (dpcm_fe_dai_hw_params) from [<c099b274>] (snd_pcm_hw_params+0x158/0x418)
[  639.019750] [<c099b274>] (snd_pcm_hw_params) from [<c099c5a0>] (snd_pcm_ioctl+0x734/0x183c)
[  639.028129] [<c099c5a0>] (snd_pcm_ioctl) from [<c029ff94>] (ksys_ioctl+0x2ac/0xb98)
[  639.035812] [<c029ff94>] (ksys_ioctl) from [<c0100080>] (ret_fast_syscall+0x0/0x28)
[  639.043490] Exception stack(0xec529fa8 to 0xec529ff0)
[  639.048565] 9fa0:                   bee84650 01321870 00000004 c25c4111 bee84650 0002000f
[  639.056766] 9fc0: bee84650 01321870 01321820 00000036 00001f40 00000000 0002c2f8 00000003
[  639.064964] 9fe0: b6f483fc bee8451c b6ee2655 b6e1dcf8
[  639.070339] fsl-esai-dai 2024000.esai: Cannot create DMA dma:tx symlink

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
 sound/soc/fsl/imx-pcm-dma.c | 173 +++++++++++++++++++++++++++++++++---
 1 file changed, 159 insertions(+), 14 deletions(-)

diff --git a/sound/soc/fsl/imx-pcm-dma.c b/sound/soc/fsl/imx-pcm-dma.c
index 04a9bc749016..dae53b384df4 100644
--- a/sound/soc/fsl/imx-pcm-dma.c
+++ b/sound/soc/fsl/imx-pcm-dma.c
@@ -11,6 +11,7 @@
 #include <linux/dmaengine.h>
 #include <linux/types.h>
 #include <linux/module.h>
+#include <linux/dma-mapping.h>
 
 #include <sound/core.h>
 #include <sound/pcm.h>
@@ -29,24 +30,168 @@ static bool filter(struct dma_chan *chan, void *param)
 	return true;
 }
 
-static const struct snd_dmaengine_pcm_config imx_dmaengine_pcm_config = {
-	.prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config,
-	.compat_filter_fn = filter,
-};
+static int imx_pcm_hw_params(struct snd_soc_component *component,
+			     struct snd_pcm_substream *substream,
+			     struct snd_pcm_hw_params *params)
+{
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	struct snd_soc_pcm_runtime *rtd = substream->private_data;
+	struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0);
+	struct snd_dmaengine_dai_dma_data *dma_data;
+	struct dma_slave_config config;
+	struct dma_chan *chan;
+	int ret = 0;
 
-int imx_pcm_dma_init(struct platform_device *pdev, size_t size)
+	snd_pcm_set_runtime_buffer(substream, &substream->dma_buffer);
+	runtime->dma_bytes = params_buffer_bytes(params);
+
+	chan = snd_dmaengine_pcm_get_chan(substream);
+	if (!chan)
+		return -EINVAL;
+
+	ret = snd_hwparams_to_dma_slave_config(substream, params, &config);
+	if (ret)
+		return ret;
+
+	dma_data = snd_soc_dai_get_dma_data(cpu_dai, substream);
+	if (!dma_data)
+		return -EINVAL;
+
+	snd_dmaengine_pcm_set_config_from_dai_data(substream,
+						   dma_data,
+						   &config);
+	return dmaengine_slave_config(chan, &config);
+}
+
+static int imx_pcm_hw_free(struct snd_soc_component *component,
+			   struct snd_pcm_substream *substream)
 {
-	struct snd_dmaengine_pcm_config *config;
+	snd_pcm_set_runtime_buffer(substream, NULL);
+	return 0;
+}
+
+static snd_pcm_uframes_t imx_pcm_pointer(struct snd_soc_component *component,
+					 struct snd_pcm_substream *substream)
+{
+	return snd_dmaengine_pcm_pointer(substream);
+}
+
+static int imx_pcm_trigger(struct snd_soc_component *component,
+			   struct snd_pcm_substream *substream, int cmd)
+{
+	return snd_dmaengine_pcm_trigger(substream, cmd);
+}
+
+static int imx_pcm_open(struct snd_soc_component *component,
+			struct snd_pcm_substream *substream)
+{
+	struct snd_soc_pcm_runtime *rtd = substream->private_data;
+	bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
+	struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0);
+	struct snd_dmaengine_dai_dma_data *dma_data;
+	struct device *dev = component->dev;
+	struct snd_pcm_hardware hw;
+	struct dma_chan *chan;
+	int ret;
+
+	ret = snd_pcm_hw_constraint_integer(substream->runtime,
+					    SNDRV_PCM_HW_PARAM_PERIODS);
+	if (ret < 0) {
+		dev_err(dev, "failed to set pcm hw params periods\n");
+		return ret;
+	}
+
+	dma_data = snd_soc_dai_get_dma_data(cpu_dai, substream);
+	if (!dma_data)
+		return -EINVAL;
+
+	chan = dma_request_slave_channel(cpu_dai->dev, tx ? "tx" : "rx");
+	if (!chan) {
+		/* Try to request channel using compat_filter_fn */
+		chan = snd_dmaengine_pcm_request_channel(filter,
+							 dma_data->filter_data);
+		if (!chan)
+			return -ENXIO;
+	}
 
-	config = devm_kzalloc(&pdev->dev,
-			sizeof(struct snd_dmaengine_pcm_config), GFP_KERNEL);
-	if (!config)
-		return -ENOMEM;
-	*config = imx_dmaengine_pcm_config;
+	ret = snd_dmaengine_pcm_open(substream, chan);
+	if (ret)
+		goto pcm_open_fail;
 
-	return devm_snd_dmaengine_pcm_register(&pdev->dev,
-		config,
-		SND_DMAENGINE_PCM_FLAG_COMPAT);
+	memset(&hw, 0, sizeof(hw));
+	hw.info = SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
+			SNDRV_PCM_INFO_INTERLEAVED;
+	hw.periods_min = 2;
+	hw.periods_max = UINT_MAX;
+	hw.period_bytes_min = 256;
+	hw.period_bytes_max = dma_get_max_seg_size(chan->device->dev);
+	hw.buffer_bytes_max = IMX_DEFAULT_DMABUF_SIZE;
+	hw.fifo_size = dma_data->fifo_size;
+
+	/* Refine the hw according to caps of DMA. */
+	ret = snd_dmaengine_pcm_refine_runtime_hwparams(substream,
+							dma_data,
+							&hw,
+							chan);
+	if (ret < 0)
+		goto refine_runtime_hwparams_fail;
+
+	snd_soc_set_runtime_hwparams(substream, &hw);
+
+	/* Support allocate memory from IRAM */
+	ret = snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV_IRAM,
+				  chan->device->dev,
+				  hw.buffer_bytes_max,
+				  &substream->dma_buffer);
+	if (ret < 0)
+		goto alloc_pagas_fail;
+
+	return 0;
+
+alloc_pagas_fail:
+refine_runtime_hwparams_fail:
+	snd_dmaengine_pcm_close(substream);
+pcm_open_fail:
+	dma_release_channel(chan);
+
+	return ret;
+}
+
+static int imx_pcm_close(struct snd_soc_component *component,
+			 struct snd_pcm_substream *substream)
+{
+	if (substream) {
+		snd_dma_free_pages(&substream->dma_buffer);
+		substream->dma_buffer.area = NULL;
+		substream->dma_buffer.addr = 0;
+	}
+
+	return snd_dmaengine_pcm_close_release_chan(substream);
+}
+
+static int imx_pcm_new(struct snd_soc_component *component,
+		       struct snd_soc_pcm_runtime *rtd)
+{
+	struct snd_card *card = rtd->card->snd_card;
+
+	return dma_coerce_mask_and_coherent(card->dev, DMA_BIT_MASK(32));
+}
+
+static const struct snd_soc_component_driver imx_pcm_component = {
+	.name           = "imx-pcm-dma",
+	.pcm_construct	= imx_pcm_new,
+	.open		= imx_pcm_open,
+	.close		= imx_pcm_close,
+	.hw_params	= imx_pcm_hw_params,
+	.hw_free	= imx_pcm_hw_free,
+	.trigger	= imx_pcm_trigger,
+	.pointer	= imx_pcm_pointer,
+};
+
+int imx_pcm_dma_init(struct platform_device *pdev, size_t size)
+{
+	return devm_snd_soc_register_component(&pdev->dev,
+					       &imx_pcm_component, NULL, 0);
 }
 EXPORT_SYMBOL_GPL(imx_pcm_dma_init);
 
-- 
2.21.0


^ permalink raw reply related

* [powerpc:next 17/110] arch/powerpc/kernel/fadump.c:513:18: warning: variable 'mem_boundary' set but not used
From: kbuild test robot @ 2020-05-19  9:41 UTC (permalink / raw)
  To: Hari Bathini; +Cc: linuxppc-dev, kbuild-all, Mahesh Salgaonkar

[-- Attachment #1: Type: text/plain, Size: 4147 bytes --]

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
head:   30df74d67d48949da87e3a5b57c381763e8fd526
commit: 140777a3d8dfdb3d3f20ea7707c0f1c0ce1b0aa5 [17/110] powerpc/fadump: consider reserved ranges while reserving memory
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 140777a3d8dfdb3d3f20ea7707c0f1c0ce1b0aa5
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>, old ones prefixed by <<):

arch/powerpc/kernel/fadump.c:71:12: warning: no previous prototype for 'fadump_cma_init' [-Wmissing-prototypes]
71 | int __init fadump_cma_init(void)
|            ^~~~~~~~~~~~~~~
arch/powerpc/kernel/fadump.c: In function 'fadump_reserve_mem':
>> arch/powerpc/kernel/fadump.c:513:18: warning: variable 'mem_boundary' set but not used [-Wunused-but-set-variable]
513 |  u64 base, size, mem_boundary, bootmem_min;
|                  ^~~~~~~~~~~~
arch/powerpc/kernel/fadump.c: In function 'fadump_update_elfcore_header':
arch/powerpc/kernel/fadump.c:731:17: warning: variable 'elf' set but not used [-Wunused-but-set-variable]
731 |  struct elfhdr *elf;
|                 ^~~

vim +/mem_boundary +513 arch/powerpc/kernel/fadump.c

   510	
   511	int __init fadump_reserve_mem(void)
   512	{
 > 513		u64 base, size, mem_boundary, bootmem_min;
   514		int ret = 1;
   515	
   516		if (!fw_dump.fadump_enabled)
   517			return 0;
   518	
   519		if (!fw_dump.fadump_supported) {
   520			pr_info("Firmware-Assisted Dump is not supported on this hardware\n");
   521			goto error_out;
   522		}
   523	
   524		/*
   525		 * Initialize boot memory size
   526		 * If dump is active then we have already calculated the size during
   527		 * first kernel.
   528		 */
   529		if (!fw_dump.dump_active) {
   530			fw_dump.boot_memory_size =
   531				PAGE_ALIGN(fadump_calculate_reserve_size());
   532	#ifdef CONFIG_CMA
   533			if (!fw_dump.nocma) {
   534				fw_dump.boot_memory_size =
   535					ALIGN(fw_dump.boot_memory_size,
   536					      FADUMP_CMA_ALIGNMENT);
   537			}
   538	#endif
   539	
   540			bootmem_min = fw_dump.ops->fadump_get_bootmem_min();
   541			if (fw_dump.boot_memory_size < bootmem_min) {
   542				pr_err("Can't enable fadump with boot memory size (0x%lx) less than 0x%llx\n",
   543				       fw_dump.boot_memory_size, bootmem_min);
   544				goto error_out;
   545			}
   546	
   547			if (!fadump_get_boot_mem_regions()) {
   548				pr_err("Too many holes in boot memory area to enable fadump\n");
   549				goto error_out;
   550			}
   551		}
   552	
   553		/*
   554		 * Calculate the memory boundary.
   555		 * If memory_limit is less than actual memory boundary then reserve
   556		 * the memory for fadump beyond the memory_limit and adjust the
   557		 * memory_limit accordingly, so that the running kernel can run with
   558		 * specified memory_limit.
   559		 */
   560		if (memory_limit && memory_limit < memblock_end_of_DRAM()) {
   561			size = get_fadump_area_size();
   562			if ((memory_limit + size) < memblock_end_of_DRAM())
   563				memory_limit += size;
   564			else
   565				memory_limit = memblock_end_of_DRAM();
   566			printk(KERN_INFO "Adjusted memory_limit for firmware-assisted"
   567					" dump, now %#016llx\n", memory_limit);
   568		}
   569		if (memory_limit)
   570			mem_boundary = memory_limit;
   571		else
   572			mem_boundary = memblock_end_of_DRAM();
   573	
   574		base = fw_dump.boot_mem_top;
   575		size = get_fadump_area_size();
   576		fw_dump.reserve_dump_area_size = size;
   577		if (fw_dump.dump_active) {
   578			pr_info("Firmware-assisted dump is active.\n");
   579	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 66042 bytes --]

^ permalink raw reply

* linux-next: manual merge of the rcu tree with the powerpc tree
From: Stephen Rothwell @ 2020-05-19  7:23 UTC (permalink / raw)
  To: Paul E. McKenney, Michael Ellerman, PowerPC
  Cc: Peter Zijlstra, Thomas Gleixner, Linux Next Mailing List,
	Linux Kernel Mailing List, Nicholas Piggin

[-- Attachment #1: Type: text/plain, Size: 765 bytes --]

Hi all,

Today's linux-next merge of the rcu tree got a conflict in:

  arch/powerpc/kernel/traps.c

between commit:

  116ac378bb3f ("powerpc/64s: machine check interrupt update NMI accounting")

from the powerpc tree and commit:

  187416eeb388 ("hardirq/nmi: Allow nested nmi_enter()")

from the rcu tree.

I fixed it up (I used the powerpc tree version for now) and can carry the
fix as necessary. This is now fixed as far as linux-next is concerned,
but any non trivial conflicts should be mentioned to your upstream
maintainer when your tree is submitted for merging.  You may also want
to consider cooperating with the maintainer of the conflicting tree to
minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH v2 3/5] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
From: Dan Williams @ 2020-05-19  7:09 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: alistair, linuxppc-dev, linux-nvdimm
In-Reply-To: <87v9kspk3x.fsf@linux.ibm.com>

On Mon, May 18, 2020 at 10:30 PM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
>
> Hi Dan,
>
> Apologies for the delay in response. I was waiting for feedback from
> hardware team before responding to this email.
>
>
> Dan Williams <dan.j.williams@intel.com> writes:
>
> > On Tue, May 12, 2020 at 8:47 PM Aneesh Kumar K.V
> > <aneesh.kumar@linux.ibm.com> wrote:
> >>
> >> Architectures like ppc64 provide persistent memory specific barriers
> >> that will ensure that all stores for which the modifications are
> >> written to persistent storage by preceding dcbfps and dcbstps
> >> instructions have updated persistent storage before any data
> >> access or data transfer caused by subsequent instructions is initiated.
> >> This is in addition to the ordering done by wmb()
> >>
> >> Update nvdimm core such that architecture can use barriers other than
> >> wmb to ensure all previous writes are architecturally visible for
> >> the platform buffer flush.
> >
> > This seems like an exceedingly bad idea, maybe I'm missing something.
> > This implies that the deployed base of DAX applications using the old
> > instruction sequence are going to regress on new hardware that
> > requires the new instructions to be deployed.
>
>
> pmdk support for ppc64 is still work in progress and there is pull
> request to switch pmdk to use new instruction.

Ok.

>
> https://github.com/tuliom/pmdk/commit/fix-flush
>
> All userspace applications will be switched to use the new
> instructions. The new instructions are designed such that when running on P8
> and P9 they behave as 'dcbf' and 'hwsync'.

Sure, makes sense.

> Applications using new instructions will behave as expected when running
> on P8 and P9. Only future hardware will differentiate between 'dcbf' and
> 'dcbfps'

Right, this is the problem. Applications using new instructions behave
as expected, the kernel has been shipping of_pmem and papr_scm for
several cycles now, you're saying that the DAX applications written
against those platforms are going to be broken on P8 and P9?

> > I'm thinking the kernel
> > should go as far as to disable DAX operation by default on new
> > hardware until userspace asserts that it is prepared to switch to the
> > new implementation. Is there any other way to ensure the forward
> > compatibility of deployed ppc64 DAX applications?
>
> AFAIU there is no released persistent memory hardware on ppc64 platform
> and we need to make sure before applications get enabled to use these
> persistent memory devices, they should switch to use the new
> instruction?

Right, I want the kernel to offer some level of safety here because
everything you are describing sounds like a flag day conversion. Am I
misreading? Is there some other gate that prevents existing users of
of_pmem and papr_scm from having their expectations violated when
running on P8 / P9 hardware? Maybe there's tighter ecosystem control
that I'm just not familiar with, I'm only going off the fact that the
kernel has shipped a non-zero number of NVDIMM drivers that build with
ARCH=ppc64 for several cycles.

^ permalink raw reply

* [PATCH V2] powerpc/perf: Add support for outputting extended regs in perf intr_regs
From: Athira Rajeev @ 2020-05-19  6:15 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: ravi.bangoria, atrajeev, maddy, linux-kernel, acme, anju, jolsa

From: Anju T Sudhakar <anju@linux.vnet.ibm.com>

Add support for perf extended register capability in powerpc.
The capability flag PERF_PMU_CAP_EXTENDED_REGS, is used to indicate the
PMU which support extended registers. The generic code define the mask
of extended registers as 0 for non supported architectures.

Patch adds extended regs support for power9 platform by
exposing MMCR0, MMCR1 and MMCR2 registers.

REG_RESERVED mask needs update to include extended regs.
`PERF_REG_EXTENDED_MASK`, contains mask value of the supported registers,
is defined at runtime in the kernel based on platform since the supported
registers may differ from one processor version to another and hence the
MASK value.

Perf tools side uses extended mask to display the platform
supported register names (with -I? option) to the user and also
send this mask to the kernel to capture the extended registers
in each sample. Hence decide the mask value based on the processor
version.

with patch
----------

available registers: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11
r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26
r27 r28 r29 r30 r31 nip msr orig_r3 ctr link xer ccr softe
trap dar dsisr sier mmcra mmcr0 mmcr1 mmcr2

PERF_RECORD_SAMPLE(IP, 0x1): 4784/4784: 0 period: 1 addr: 0
... intr regs: mask 0xffffffffffff ABI 64-bit
.... r0    0xc00000000012b77c
.... r1    0xc000003fe5e03930
.... r2    0xc000000001b0e000
.... r3    0xc000003fdcddf800
.... r4    0xc000003fc7880000
.... r5    0x9c422724be
.... r6    0xc000003fe5e03908
.... r7    0xffffff63bddc8706
.... r8    0x9e4
.... r9    0x0
.... r10   0x1
.... r11   0x0
.... r12   0xc0000000001299c0
.... r13   0xc000003ffffc4800
.... r14   0x0
.... r15   0x7fffdd8b8b00
.... r16   0x0
.... r17   0x7fffdd8be6b8
.... r18   0x7e7076607730
.... r19   0x2f
.... r20   0xc00000001fc26c68
.... r21   0xc0002041e4227e00
.... r22   0xc00000002018fb60
.... r23   0x1
.... r24   0xc000003ffec4d900
.... r25   0x80000000
.... r26   0x0
.... r27   0x1
.... r28   0x1
.... r29   0xc000000001be1260
.... r30   0x6008010
.... r31   0xc000003ffebb7218
.... nip   0xc00000000012b910
.... msr   0x9000000000009033
.... orig_r3 0xc00000000012b86c
.... ctr   0xc0000000001299c0
.... link  0xc00000000012b77c
.... xer   0x0
.... ccr   0x28002222
.... softe 0x1
.... trap  0xf00
.... dar   0x0
.... dsisr 0x80000000000
.... sier  0x0
.... mmcra 0x80000000000
.... mmcr0 0x82008090
.... mmcr1 0x1e000000
.... mmcr2 0x0
 ... thread: perf:4784

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
[Defined PERF_REG_EXTENDED_MASK at run time to add support for different platforms ]
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
Changes from v1 -> v2

- PERF_REG_EXTENDED_MASK` is defined at runtime in the kernel
based on platform. This will give flexibility in using extended
regs for all processor versions where the supported registers may differ.
- removed PERF_REG_EXTENDED_MASK from the perf tools side. Based on the
processor version(from PVR value), tool side will return the appropriate
extended mask
- Since tool changes can handle without a "PERF_REG_EXTENDED_MASK" macro,
dropped patch to set NO_AUXTRACE.
- Addressed review comments from Ravi Bangoria for V1

---

 arch/powerpc/include/asm/perf_event_server.h    |  8 ++++
 arch/powerpc/include/uapi/asm/perf_regs.h       | 14 ++++++-
 arch/powerpc/perf/core-book3s.c                 |  1 +
 arch/powerpc/perf/perf_regs.c                   | 34 ++++++++++++++--
 arch/powerpc/perf/power9-pmu.c                  |  6 +++
 tools/arch/powerpc/include/uapi/asm/perf_regs.h | 14 ++++++-
 tools/perf/arch/powerpc/include/perf_regs.h     |  5 ++-
 tools/perf/arch/powerpc/util/perf_regs.c        | 54 +++++++++++++++++++++++++
 8 files changed, 130 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 3e9703f..1458e1a 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -15,6 +15,9 @@
 #define MAX_EVENT_ALTERNATIVES	8
 #define MAX_LIMITED_HWCOUNTERS	2
 
+extern u64 mask_var;
+#define PERF_REG_EXTENDED_MASK          mask_var
+
 struct perf_event;
 
 /*
@@ -55,6 +58,11 @@ struct power_pmu {
 	int 		*blacklist_ev;
 	/* BHRB entries in the PMU */
 	int		bhrb_nr;
+	/*
+	 * set this flag with `PERF_PMU_CAP_EXTENDED_REGS` if
+	 * the pmu supports extended perf regs capability
+	 */
+	int		capabilities;
 };
 
 /*
diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
index f599064..485b1d5 100644
--- a/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -48,6 +48,18 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_DSISR,
 	PERF_REG_POWERPC_SIER,
 	PERF_REG_POWERPC_MMCRA,
-	PERF_REG_POWERPC_MAX,
+	/* Extended registers */
+	PERF_REG_POWERPC_MMCR0,
+	PERF_REG_POWERPC_MMCR1,
+	PERF_REG_POWERPC_MMCR2,
+	/* Max regs without the extended regs */
+	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };
+
+#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
+
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
+#define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
+				- PERF_REG_PMU_MASK)
+
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 3dcfecf..f56b778 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2276,6 +2276,7 @@ int register_power_pmu(struct power_pmu *pmu)
 
 	power_pmu.attr_groups = ppmu->attr_groups;
 
+	power_pmu.capabilities |= (ppmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS);
 #ifdef MSR_HV
 	/*
 	 * Use FCHV to ignore kernel events if MSR.HV is set.
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index a213a0a..f1dbbc5 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -13,9 +13,11 @@
 #include <asm/ptrace.h>
 #include <asm/perf_regs.h>
 
+u64 mask_var;
+
 #define PT_REGS_OFFSET(id, r) [id] = offsetof(struct pt_regs, r)
 
-#define REG_RESERVED (~((1ULL << PERF_REG_POWERPC_MAX) - 1))
+#define REG_RESERVED (~(PERF_REG_EXTENDED_MASK | PERF_REG_PMU_MASK))
 
 static unsigned int pt_regs_offset[PERF_REG_POWERPC_MAX] = {
 	PT_REGS_OFFSET(PERF_REG_POWERPC_R0,  gpr[0]),
@@ -69,10 +71,26 @@
 	PT_REGS_OFFSET(PERF_REG_POWERPC_MMCRA, dsisr),
 };
 
+/* Function to return the extended register values */
+static u64 get_ext_regs_value(int idx)
+{
+	switch (idx) {
+	case PERF_REG_POWERPC_MMCR0:
+			return mfspr(SPRN_MMCR0);
+	case PERF_REG_POWERPC_MMCR1:
+			return mfspr(SPRN_MMCR1);
+	case PERF_REG_POWERPC_MMCR2:
+			return mfspr(SPRN_MMCR2);
+	default: return 0;
+	}
+}
+
 u64 perf_reg_value(struct pt_regs *regs, int idx)
 {
-	if (WARN_ON_ONCE(idx >= PERF_REG_POWERPC_MAX))
-		return 0;
+	u64 PERF_REG_EXTENDED_MAX;
+
+	if (cpu_has_feature(CPU_FTR_ARCH_300))
+		PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_MMCR2 + 1;
 
 	if (idx == PERF_REG_POWERPC_SIER &&
 	   (IS_ENABLED(CONFIG_FSL_EMB_PERF_EVENT) ||
@@ -85,6 +103,16 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
 	    IS_ENABLED(CONFIG_PPC32)))
 		return 0;
 
+	if (idx >= PERF_REG_POWERPC_MAX && idx < PERF_REG_EXTENDED_MAX)
+		return get_ext_regs_value(idx);
+
+	/*
+	 * If the idx is referring to value beyond the
+	 * supported registers, return 0 with a warning
+	 */
+	if (WARN_ON_ONCE(idx >= PERF_REG_EXTENDED_MAX))
+		return 0;
+
 	return regs_get_register(regs, pt_regs_offset[idx]);
 }
 
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 08c3ef7..4525090 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -90,6 +90,8 @@ enum {
 #define POWER9_MMCRA_IFM3		0x00000000C0000000UL
 #define POWER9_MMCRA_BHRB_MASK		0x00000000C0000000UL
 
+extern u64 mask_var;
+
 /* Nasty Power9 specific hack */
 #define PVR_POWER9_CUMULUS		0x00002000
 
@@ -434,6 +436,7 @@ static void power9_config_bhrb(u64 pmu_bhrb_filter)
 	.cache_events		= &power9_cache_events,
 	.attr_groups		= power9_pmu_attr_groups,
 	.bhrb_nr		= 32,
+	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
 };
 
 int init_power9_pmu(void)
@@ -457,6 +460,9 @@ int init_power9_pmu(void)
 		}
 	}
 
+	/* Set the PERF_REG_EXTENDED_MASK here */
+	mask_var = PERF_REG_PMU_MASK_300;
+
 	rc = register_power_pmu(&power9_pmu);
 	if (rc)
 		return rc;
diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
index f599064..485b1d5 100644
--- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -48,6 +48,18 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_DSISR,
 	PERF_REG_POWERPC_SIER,
 	PERF_REG_POWERPC_MMCRA,
-	PERF_REG_POWERPC_MAX,
+	/* Extended registers */
+	PERF_REG_POWERPC_MMCR0,
+	PERF_REG_POWERPC_MMCR1,
+	PERF_REG_POWERPC_MMCR2,
+	/* Max regs without the extended regs */
+	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };
+
+#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
+
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
+#define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
+				- PERF_REG_PMU_MASK)
+
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
index e18a355..46ed00d 100644
--- a/tools/perf/arch/powerpc/include/perf_regs.h
+++ b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -64,7 +64,10 @@
 	[PERF_REG_POWERPC_DAR] = "dar",
 	[PERF_REG_POWERPC_DSISR] = "dsisr",
 	[PERF_REG_POWERPC_SIER] = "sier",
-	[PERF_REG_POWERPC_MMCRA] = "mmcra"
+	[PERF_REG_POWERPC_MMCRA] = "mmcra",
+	[PERF_REG_POWERPC_MMCR0] = "mmcr0",
+	[PERF_REG_POWERPC_MMCR1] = "mmcr1",
+	[PERF_REG_POWERPC_MMCR2] = "mmcr2",
 };
 
 static inline const char *perf_reg_name(int id)
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index 0a52429..b78f81f 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -6,9 +6,15 @@
 
 #include "../../../util/perf_regs.h"
 #include "../../../util/debug.h"
+#include "../../../util/event.h"
+#include "../../../util/header.h"
+#include "../../../perf-sys.h"
 
+#include <api/fs/fs.h>
 #include <linux/kernel.h>
 
+#define PVR_POWER9		0x004E
+
 const struct sample_reg sample_reg_masks[] = {
 	SMPL_REG(r0, PERF_REG_POWERPC_R0),
 	SMPL_REG(r1, PERF_REG_POWERPC_R1),
@@ -55,6 +61,9 @@
 	SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
 	SMPL_REG(sier, PERF_REG_POWERPC_SIER),
 	SMPL_REG(mmcra, PERF_REG_POWERPC_MMCRA),
+	SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0),
+	SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1),
+	SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2),
 	SMPL_REG_END
 };
 
@@ -163,3 +172,48 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
 
 	return SDT_ARG_VALID;
 }
+
+uint64_t arch__intr_reg_mask(void)
+{
+	struct perf_event_attr attr = {
+		.type                   = PERF_TYPE_HARDWARE,
+		.config                 = PERF_COUNT_HW_CPU_CYCLES,
+		.sample_type            = PERF_SAMPLE_REGS_INTR,
+		.precise_ip             = 1,
+		.disabled               = 1,
+		.exclude_kernel         = 1,
+	};
+	int fd, ret;
+	char buffer[64];
+	u32 version;
+	u64 extended_mask = 0;
+
+	/* Get the PVR value to set the extended
+	 * mask specific to platform
+	 */
+	get_cpuid(buffer, sizeof(buffer));
+	ret = sscanf(buffer, "%u,", &version);
+
+	if (ret != 1) {
+		pr_debug("Failed to get the processor version, unable to output extended registers\n");
+		return PERF_REGS_MASK;
+	}
+
+	if (version == PVR_POWER9)
+		extended_mask = PERF_REG_PMU_MASK_300;
+
+	attr.sample_regs_intr = extended_mask;
+	attr.sample_period = 1;
+	event_attr_init(&attr);
+
+	/*
+	 * check if the pmu supports perf extended regs, before
+	 * returning the register mask to sample.
+	 */
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	if (fd != -1) {
+		close(fd);
+		return (extended_mask | PERF_REGS_MASK);
+	}
+	return PERF_REGS_MASK;
+}
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH v3 7/7] powerpc/book3s/pmem: Add WARN_ONCE to catch the wrong usage of pmem flush functions.
From: Aneesh Kumar K.V @ 2020-05-19  5:55 UTC (permalink / raw)
  To: linuxppc-dev, mpe, linux-nvdimm
  Cc: alistair, dan.j.williams, oohall, Aneesh Kumar K.V
In-Reply-To: <20200519055502.128318-1-aneesh.kumar@linux.ibm.com>

We only support persistent memory on P8 and above. This is enforced by the
firmware and further checked on virtualzied platform during platform init.
Add WARN_ONCE in pmem flush routines to catch the wrong usage of these.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/include/asm/cacheflush.h | 2 ++
 arch/powerpc/lib/pmem.c               | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index bc3ea009cf14..865fae8a226e 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -125,6 +125,8 @@ static inline void  arch_pmem_flush_barrier(void)
 {
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		asm volatile(PPC_PHWSYNC ::: "memory");
+	else
+		WARN_ONCE(1, "Using pmem flush on older hardware.");
 }
 #endif /* __KERNEL__ */
 
diff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c
index 21210fa676e5..f40bd908d28d 100644
--- a/arch/powerpc/lib/pmem.c
+++ b/arch/powerpc/lib/pmem.c
@@ -37,12 +37,14 @@ static inline void clean_pmem_range(unsigned long start, unsigned long stop)
 {
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		return __clean_pmem_range(start, stop);
+	WARN_ONCE(1, "Using pmem flush on older hardware.");
 }
 
 static inline void flush_pmem_range(unsigned long start, unsigned long stop)
 {
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		return __flush_pmem_range(start, stop);
+	WARN_ONCE(1, "Using pmem flush on older hardware.");
 }
 
 /*
-- 
2.26.2


^ permalink raw reply related

* [PATCH v3 6/7] powerpc/pmem: Avoid the barrier in flush routines
From: Aneesh Kumar K.V @ 2020-05-19  5:55 UTC (permalink / raw)
  To: linuxppc-dev, mpe, linux-nvdimm
  Cc: alistair, dan.j.williams, oohall, Aneesh Kumar K.V
In-Reply-To: <20200519055502.128318-1-aneesh.kumar@linux.ibm.com>

nvdimm expect the flush routines to just mark the cache clean. The barrier
that mark the store globally visible is done in nvdimm_flush().

Update the papr_scm driver to a simplified nvdim_flush callback that do
only the required barrier.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/lib/pmem.c                   |  6 ------
 arch/powerpc/platforms/pseries/papr_scm.c | 13 +++++++++++++
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c
index 5a61aaeb6930..21210fa676e5 100644
--- a/arch/powerpc/lib/pmem.c
+++ b/arch/powerpc/lib/pmem.c
@@ -19,9 +19,6 @@ static inline void __clean_pmem_range(unsigned long start, unsigned long stop)
 
 	for (i = 0; i < size >> shift; i++, addr += bytes)
 		asm volatile(PPC_DCBSTPS(%0, %1): :"i"(0), "r"(addr): "memory");
-
-
-	asm volatile(PPC_PHWSYNC ::: "memory");
 }
 
 static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
@@ -34,9 +31,6 @@ static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
 
 	for (i = 0; i < size >> shift; i++, addr += bytes)
 		asm volatile(PPC_DCBFPS(%0, %1): :"i"(0), "r"(addr): "memory");
-
-
-	asm volatile(PPC_PHWSYNC ::: "memory");
 }
 
 static inline void clean_pmem_range(unsigned long start, unsigned long stop)
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f35592423380..ad506e7003c9 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -285,6 +285,18 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 
 	return 0;
 }
+/*
+ * We have made sure the pmem writes are done such that before calling this
+ * all the caches are flushed/clean. We use dcbf/dcbfps to ensure this. Here
+ * we just need to add the necessary barrier to make sure the above flushes
+ * are have updated persistent storage before any data access or data transfer
+ * caused by subsequent instructions is initiated.
+ */
+static int papr_scm_flush_sync(struct nd_region *nd_region, struct bio *bio)
+{
+	arch_pmem_flush_barrier();
+	return 0;
+}
 
 static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 {
@@ -340,6 +352,7 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 	ndr_desc.mapping = &mapping;
 	ndr_desc.num_mappings = 1;
 	ndr_desc.nd_set = &p->nd_set;
+	ndr_desc.flush = papr_scm_flush_sync;
 
 	if (p->is_volatile)
 		p->region = nvdimm_volatile_region_create(p->bus, &ndr_desc);
-- 
2.26.2


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox