LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* RE: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
From: Williams, Dan J @ 2020-06-04  0:24 UTC (permalink / raw)
  To: Vaibhav Jain, linuxppc-dev@lists.ozlabs.org,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
  Cc: Santosh Sivaraj, Aneesh Kumar K . V, Steven Rostedt,
	Oliver O'Halloran, Weiny, Ira
In-Reply-To: <20200602101438.73929-5-vaibhav@linux.ibm.com>

[ forgive formatting I'm temporarily stuck using Outlook this week... ]

> From: Vaibhav Jain <vaibhav@linux.ibm.com>
[..]
> 
> Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
> module and add the command family NVDIMM_FAMILY_PAPR to the white
> list of NVDIMM command sets. Also advertise support for ND_CMD_CALL for
> the nvdimm command mask and implement necessary scaffolding in the
> module to handle ND_CMD_CALL ioctl and PDSM requests that we receive.
> 
> The layout of the PDSM request as we expect from libnvdimm/libndctl is
> described in newly introduced uapi header 'papr_pdsm.h' which defines a
> new 'struct nd_pdsm_cmd_pkg' header. This header is used to communicate
> the PDSM request via member 'nd_cmd_pkg.nd_command' and size of
> payload that need to be sent/received for servicing the PDSM.
> 
> A new function is_cmd_valid() is implemented that reads the args to
> papr_scm_ndctl() and performs sanity tests on them. A new function
> papr_scm_service_pdsm() is introduced and is called from
> papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
> command from libnvdimm.
> 
> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
> Changelog:
> 
> Resend:
> * Added ack from Aneesh.
> 
> v8..v9:
> * Reduced the usage of term SCM replacing it with appropriate
>   replacement [ Dan Williams, Aneesh ]
> * Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
> * s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
> * s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
> * Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
> * Minor update to patch description.
> 
> v7..v8:
> * Removed the 'payload_offset' field from 'struct
>   nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
>   at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
> * To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
>   'reserved' field of 10-bytes is introduced. [ Aneesh ]
> * Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
>   [ Ira ]
> 
> Resend:
> * None
> 
> v6..v7 :
> * Removed the re-definitions of __packed macro from papr_scm_pdsm.h
>   [Mpe].
> * Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
> * Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
>   [Mpe].
> * Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]
> 
> v5..v6 :
> * Changed the usage of the term DSM to PDSM to distinguish it from the
>   ACPI term [ Dan Williams ]
> * Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various
> struct
>   to reflect the new terminology.
> * Updated the patch description and title to reflect the new terminology.
> * Squashed patch to introduce new command family in 'ndctl.h' with
>   this patch [ Dan Williams ]
> * Updated the papr_scm_pdsm method starting index from 0x10000 to 0x0
>   [ Dan Williams ]
> * Removed redundant license text from the papr_scm_psdm.h file.
>   [ Dan Williams ]
> * s/envelop/envelope/ at various places [ Dan Williams ]
> * Added '__packed' attribute to command package header to gaurd
>   against different compiler adding paddings between the fields.
>   [ Dan Williams]
> * Converted various pr_debug to dev_debug [ Dan Williams ]
> 
> v4..v5 :
> * None
> 
> v3..v4 :
> * None
> 
> v2..v3 :
> * Updated the patch prefix to 'ndctl/uapi' [Aneesh]
> 
> v1..v2 :
> * None
> ---
>  arch/powerpc/include/uapi/asm/papr_pdsm.h | 136
> ++++++++++++++++++++++  arch/powerpc/platforms/pseries/papr_scm.c |
> 101 +++++++++++++++-
>  include/uapi/linux/ndctl.h                |   1 +
>  3 files changed, 232 insertions(+), 6 deletions(-)  create mode 100644
> arch/powerpc/include/uapi/asm/papr_pdsm.h
> 
> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h
> b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> new file mode 100644
> index 000000000000..6407fefcc007
> --- /dev/null
> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> @@ -0,0 +1,136 @@
> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> +/*
> + * PAPR nvDimm Specific Methods (PDSM) and structs for libndctl
> + *
> + * (C) Copyright IBM 2020
> + *
> + * Author: Vaibhav Jain <vaibhav at linux.ibm.com>  */
> +
> +#ifndef _UAPI_ASM_POWERPC_PAPR_PDSM_H_
> +#define _UAPI_ASM_POWERPC_PAPR_PDSM_H_
> +
> +#include <linux/types.h>
> +
> +/*
> + * PDSM Envelope:
> + *
> + * The ioctl ND_CMD_CALL transfers data between user-space and kernel
> +via
> + * envelope which consists of a header and user-defined payload sections.
> + * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
> + * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload' field.
> + * There is reserved field that can used to introduce new fields to the
> + * structure in future. It also tries to ensure that
> 'nd_pdsm_cmd_pkg.payload'
> + * lies at a 8-byte boundary.
> + *
> + *  +-------------+---------------------+---------------------------+
> + *  |   64-Bytes  |       16-Bytes      |       Max 176-Bytes       |
> + *  +-------------+---------------------+---------------------------+
> + *  |               nd_pdsm_cmd_pkg     |                           |
> + *  |-------------+                     |                           |
> + *  |  nd_cmd_pkg |                     |                           |
> + *  +-------------+---------------------+---------------------------+
> + *  | nd_family   |                     |                           |
> + *  | nd_size_out | cmd_status          |                           |
> + *  | nd_size_in  | payload_version     |     payload               |
> + *  | nd_command  | reserved            |                           |
> + *  | nd_fw_size  |                     |                           |
> + *  +-------------+---------------------+---------------------------+
> + *
> + * PDSM Header:
> + *
> + * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
> + * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to
> member
> + * 'nd_cmd_pkg.nd_command'. Apart from size information of the
> envelope
> +which is
> + * contained in 'struct nd_cmd_pkg', the header also has members
> +following
> + * members:
> + *
> + * 'cmd_status'		: (Out) Errors if any encountered while
> servicing PDSM.
> + * 'payload_version'	: (In/Out) Version number associated with the
> payload.
> + * 'reserved'		: Not used and reserved for future.
> + *
> + * PDSM Payload:
> + *
> + * The layout of the PDSM Payload is defined by various structs shared
> +between
> + * papr_scm and libndctl so that contents of payload can be
> +interpreted. During
> + * servicing of a PDSM the papr_scm module will read input args from
> +the payload
> + * field by casting its contents to an appropriate struct pointer based
> +on the
> + * PDSM command. Similarly the output of servicing the PDSM command
> +will be
> + * copied to the payload field using the same struct.
> + *
> + * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope size,
> +which
> + * leaves around 176 bytes for the envelope payload (ignoring any
> +padding that
> + * the compiler may silently introduce).
> + *
> + * Payload Version:
> + *
> + * A 'payload_version' field is present in PDSM header that indicates a
> +specific
> + * version of the structure present in PDSM Payload for a given PDSM
> command.
> + * This provides backward compatibility in case the PDSM Payload
> +structure
> + * evolves and different structures are supported by 'papr_scm' and
> 'libndctl'.
> + *
> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send
> +the version
> + * of the payload struct it supports via 'payload_version' field. The
> 'papr_scm'
> + * module when servicing the PDSM envelope checks the 'payload_version'
> +and then
> + * uses 'payload struct version' == MIN('payload_version field',
> + * 'max payload-struct-version supported by papr_scm') to service the
> PDSM.
> + * After servicing the PDSM, 'papr_scm' put the negotiated version of
> +payload
> + * struct in returned 'payload_version' field.
> + *
> + * Libndctl on receiving the envelope back from papr_scm again checks
> +the
> + * 'payload_version' field and based on it use the appropriate version
> +dsm
> + * struct to parse the results.
> + *
> + * Backward Compatibility:
> + *
> + * Above scheme of exchanging different versioned PDSM struct between
> +libndctl
> + * and papr_scm should provide backward compatibility until following
> +two
> + * assumptions/conditions when defining new PDSM structs hold:
> + *
> + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
> + *
> + * 1. T(X) is a proper subset of T(Y) if Y > X.
> + *    i.e Each new version of PDSM struct should retain existing struct
> + *    attributes from previous version
> + *
> + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
> + *    it should also support T(1), T(2)...T(X - 1).
> + *    i.e When adding support for new version of a PDSM struct, libndctl
> + *    and papr_scm should retain support of the existing PDSM struct
> + *    version they support.
> + */
> +
> +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from
> libnvdimm
> +*/ struct nd_pdsm_cmd_pkg {
> +	struct nd_cmd_pkg hdr;	/* Package header containing sub-
> cmd */
> +	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
> +	__u16 reserved[5];	/* Ignored and to be used in future */
> +	__u16 payload_version;	/* In/Out: version of the payload */
> +	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
> +} __packed;
> +
> +/*
> + * Methods to be embedded in ND_CMD_CALL request. These are sent to
> the
> +kernel
> + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct  */
> +enum papr_pdsm {
> +	PAPR_PDSM_MIN = 0x0,
> +	PAPR_PDSM_MAX,
> +};
> +
> +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */ static inline
> +struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg
> *cmd) {
> +	return (struct nd_pdsm_cmd_pkg *) cmd; }
> +
> +/* Return the payload pointer for a given pcmd */ static inline void
> +*pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd) {
> +	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
> +		return NULL;
> +	else
> +		return (void *)(pcmd->payload);
> +}
> +
> +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c
> b/arch/powerpc/platforms/pseries/papr_scm.c
> index 149431594839..5e2237e7ec08 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -15,13 +15,15 @@
>  #include <linux/seq_buf.h>
> 
>  #include <asm/plpar_wrappers.h>
> +#include <asm/papr_pdsm.h>
> 
>  #define BIND_ANY_ADDR (~0ul)
> 
>  #define PAPR_SCM_DIMM_CMD_MASK \
>  	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
>  	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
> -	 (1ul << ND_CMD_SET_CONFIG_DATA))
> +	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
> +	 (1ul << ND_CMD_CALL))
> 
>  /* DIMM health bitmap bitmap indicators */
>  /* SCM device is unable to persist memory contents */ @@ -350,16 +352,97
> @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
>  	return 0;
>  }
> 
> +/*
> + * Validate the inputs args to dimm-control function and return '0' if valid.
> + * This also does initial sanity validation to ND_CMD_CALL sub-command
> packages.
> + */
> +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void
> *buf,
> +		       unsigned int buf_len)
> +{
> +	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
> +	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
> +	struct papr_scm_priv *p;
> +
> +	/* Only dimm-specific calls are supported atm */
> +	if (!nvdimm)
> +		return -EINVAL;
> +
> +	/* get the provider date from struct nvdimm */
> +	p = nvdimm_provider_data(nvdimm);
> +
> +	if (!test_bit(cmd, &cmd_mask)) {
> +		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
> +		return -EINVAL;
> +	} else if (cmd == ND_CMD_CALL) {
> +
> +		/* Verify the envelope package */
> +		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
> +			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
> +				buf_len);
> +			return -EINVAL;
> +		}
> +
> +		/* Verify that the PDSM family is valid */
> +		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
> +			dev_dbg(&p->pdev->dev, "Invalid pkg
> family=0x%llx\n",
> +				pkg->hdr.nd_family);
> +			return -EINVAL;
> +
> +		}
> +
> +		/* We except a payload with all PDSM commands */
> +		if (pdsm_cmd_to_payload(pkg) == NULL) {
> +			dev_dbg(&p->pdev->dev,
> +				"Empty payload for sub-command=0x%llx\n",
> +				pkg->hdr.nd_command);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	/* Command looks valid */

So this is where I would expect the kernel to validate the command vs a known list of supported commands / payloads. One of the goals of requiring public documentation of any commands that libnvdimm might support for the ioctl path is to give the kernel the ability to gate future enabling on consideration of a common kernel front-end interface. I believe this would also address questions about the versioning scheme because userspace would be actively prevented from sending command payloads that were not first explicitly enabled in the kernel. This interface as it stands in this patch set seems to be a very thin / "anything goes" passthrough with no consideration for that policy.

As an example of the utility of this policy, consider the recent support for nvdimm security commands that allow a passphrase to be set and issue commands like "unlock" and "secure erase". The kernel actively prevents those commands from being sent from userspace. See acpi_nfit_clear_to_send() and nd_cmd_clear_to_send(). The reasoning is that it enforces the kernel's nvdimm security model that uses encrypted/trusted keys to protect key material (clear text keys only-ever exist in kernel-space). Yes, that restriction is painful for people that don't want the kernel's security model and just want the simplicity of passing clear-text keys around, but it's necessary for the kernel to have any chance to provide a common abstraction across vendors. The pain of negotiating every single command with what the kernel will support is useful for the long term health of the kernel. It forces ongoing conversations across vendors to consolidate interfaces and reuse kernel best practices like encrypted/trusted keys. Code acceptance is the only real gate the kernel has to enforce cooperation across vendors.

The expectation is that the kernel does not allow any command to pass that is not explicitly listed in a bitmap of known commands. I would expect that if you changed the payload of an existing command that would likely require a new entry in this bitmap. The goal is to give the kernel a chance to constrain the passthrough interface to afford a chance to have a discussion of what might done in a common implementation. Another example is the label-area read-write commands. The kernel needs explicit control to ensure that it owns the label area and that userspace is not able to corrupt it (write it behind the kernel's back).

Now that said, I have battle scars with some OEMs that just want a generic passthrough interface so they never need to work with the kernel community again and can just write their custom validation tooling and be done. I've mostly been successful in that fight outside of the gaping hole of ND_CMD_VENDOR. That's the path that ipmctl has used to issue commands that have not made it into the public specification on docs.pmem.io. My warning shot for that is the "disable_vendor_specific" module option that administrators can set to only allow commands that the kernel explicitly knows the effects of to be issued. The result is only tooling / enabling that submits to this auditing regime is guaranteed to work everywhere.

So, that long explanation out of the way, what does that mean for this patch set? I'd like to understand if you still see a need for a versioning scheme if the implementation is required to explicitly list all the commands it supports? I.e. that the kernel need not worry about userspace sending future unknown payloads because unknown payloads are blocked. Also if your interface has anything similar to a "vendor specific" passthrough I would like to require that go through the ND_CMD_VENDOR ioctl, so that the kernel still has a common check point to prevent vendor specific "I don't want to talk to the kernel community" shenanigans, but even better if ND_CMD_VENDOR is something the kernel can eventually jettison because nobody is using it.

I feel like this is a conversation that will take a few days to resolve, which does not leave time to push this for v5.8. That said, I do think the health flags patches at the beginning of this series are low risk and uncontentious. How about I merge those for v5.8 and circle back to get this ioctl path queued early in v5.8-rc? Apologies for the late feedback on this relative to v5.8.


^ permalink raw reply

* Re: [PATCH v3 2/7] documentation for stats_fs
From: Randy Dunlap @ 2020-06-04  0:23 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito, kvm
  Cc: linux-s390, linux-doc, netdev, Emanuele Giuseppe Esposito,
	linux-kernel, kvm-ppc, Jonathan Adams, Christian Borntraeger,
	Alexander Viro, David Rientjes, linux-fsdevel, Paolo Bonzini,
	linux-mips, linuxppc-dev, linux-arm-kernel, Jim Mattson
In-Reply-To: <20200526110318.69006-3-eesposit@redhat.com>

Hi--

Here are a few comments for you.


On 5/26/20 4:03 AM, Emanuele Giuseppe Esposito wrote:
> Html docs for a complete documentation of the stats_fs API,
> filesystem and usage.
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>  Documentation/filesystems/index.rst    |   1 +
>  Documentation/filesystems/stats_fs.rst | 222 +++++++++++++++++++++++++
>  2 files changed, 223 insertions(+)
>  create mode 100644 Documentation/filesystems/stats_fs.rst

> diff --git a/Documentation/filesystems/stats_fs.rst b/Documentation/filesystems/stats_fs.rst
> new file mode 100644
> index 000000000000..292c689ffb98
> --- /dev/null
> +++ b/Documentation/filesystems/stats_fs.rst
> @@ -0,0 +1,222 @@
> +========
> +Stats_FS
> +========
> +
> +Stats_fs is a synthetic ram-based virtual filesystem that takes care of

                           RAM-based

> +gathering and displaying statistics for the Linux kernel subsystems.
> +
> +The motivation for stats_fs comes from the fact that there is no common
> +way for Linux kernel subsystems to expose statistics to userspace shared
> +throughout the Linux kernel; subsystems have to take care of gathering and
> +displaying statistics by themselves, for example in the form of files in
> +debugfs.
> +
> +Allowing each subsystem of the kernel to do so has two disadvantages.
> +First, it will introduce redundant code. Second, debugfs is anyway not the
> +right place for statistics (for example it is affected by lockdown).
> +
> +Stats_fs offers a generic and stable API, allowing any kind of
> +directory/file organization and supporting multiple kind of aggregations

                                                       kinds

> +(not only sum, but also average, max, min and count_zero) and data types
> +(boolean, all unsigned/signed and custom types). The implementation takes
> +care of gathering and displaying information at run time; users only need
> +to specify the values to be included in each source. Optionally, users can
> +also provide a display function for each value, that will take care of
> +displaying the provided value in a custom format.
> +
> +Its main function is to display each statistics as a file in the desired

                                        statistic

> +folder hierarchy defined through the API. Stats_fs files can be read, and
> +possibly cleared if their file mode allows it.
> +
> +Stats_fs is typically mounted with a command like::
> +
> +    mount -t stats_fs stats_fs /sys/kernel/stats_fs
> +
> +(Or an equivalent /etc/fstab line).
> +
> +Stats_fs has two main components: the public API defined by
> +include/linux/stats_fs.h, and the virtual file system in
> +/sys/kernel/stats.
> +
> +The API has two main elements, values and sources. Kernel
> +subsystems will create a source, add child
> +sources/values/aggregates and register it to the root source (that on the
> +virtual fs would be /sys/kernel/stats).
> +
> +The stats_fs API is defined in ``<linux/stats_fs.h>``.
> +
> +    Sources
> +        Sources are created via ``stats_fs_source_create()``, and each
> +        source becomes a directory in the file system. Sources form a
> +        parent-child relationship; root sources are added to the file
> +        system via ``stats_fs_source_register()``. Therefore each Linux
> +        subsystem will add its own entry to the root, filesystem similar

                                                   root filesystem

> +        to what it is done in debugfs. Every other source is added to or
> +        removed from a parent through the
> +        ``stats_fs_source_add_subordinate()`` and
> +        ``stats_fs_source_remove_subordinate()`` APIs. Once a source is
> +        created and added to the tree (via add_subordinate), it will be
> +        used to compute aggregate values in the parent source. A source
> +        can optionally be hidden from the filesystem but still considered
> +        in the aggregation operations if the corresponding flag is set
> +        during initialization.
> +
> +    Values
> +        Values represent quantites that are gathered by the stats_fs user.

                            quantities

> +        Examples of values include the number of vm exits of a given kind,

                                                    VM

> +        the amount of memory used by some data structure, the length of
> +        the longest hash table chain, or anything like that. Values are
> +        defined with the stats_fs_source_add_values function. Each value
> +        is defined by a ``struct stats_fs_value``; the same
> +        ``stats_fs_value`` can be added to many different sources. A value
> +        can be considered "simple" if it fetches data from a user-provided
> +        location, or "aggregate" if it groups all values in the
> +        subordinate sources that include the same ``stats_fs_value``.
> +        Values by default are considered to be cumulative, meaning the
> +        value they represent never decreases, but can also be defined as
> +        floating if they exibith a different behavior. The main difference

                            exhibit

> +        between these two is reflected into the file permission, since a
> +        floating value file does not allow the user to clear it. Each
> +        value has a ``stats_fs_type`` pointer in order to allow the user
> +        to provide custom get and clear functions. The library, however,
> +        also exports default ``stats_fs_type`` structs for the standard
> +        types (all unsigned and signed types plus boolean). A value can
> +        also provide a show function that takes care of displaying the
> +        value in a custom string format. This can be especially useful
> +        when displaying enums.
> +
> +Because stats_fs is a different mountpoint than debugfs, it is not affected
> +by security lockdown.
> +
> +Using Stats_fs
> +================
> +
> +Define a value::
> +
> +        struct statistics{
> +                uint64_t exit;
> +                ...
> +        };
> +
> +        struct kvm {
> +                ...
> +                struct statistics stat;
> +        };
> +
> +        struct stats_fs_value kvm_stats[] = {
> +                { "exit_vm", offsetof(struct kvm, stat.exit), &stats_fs_type_u64,
> +                  STATS_FS_SUM },
> +                { NULL }
> +        };
> +
> +The same ``struct stats_fs_value`` is used for both simple and aggregate
> +values, though the type and offset are only used for simple values.
> +Aggregates merge all values that use the same ``struct stats_fs_value``.
> +
> +Create the parent source::
> +
> +        struct stats_fs_source parent_source = stats_fs_source_create(0, "parent");
> +
> +Register it (files and folders
> +will only be visible after this function is called)::
> +
> +        stats_fs_source_register(parent_source);
> +
> +Create and add a child::
> +
> +        struct stats_fs_source child_source = stats_fs_source_create(STATS_FS_HIDDEN, "child");
> +
> +        stats_fs_source_add_subordinate(parent_source, child_source);
> +
> +The STATS_FS_HIDDEN attribute won't affect the aggregation, it will only
> +block the creation of the files.

Why does HIDDEN block the creation of files?  instead of their visibility?

> +
> +Add values to parent and child (also here order doesn't matter)::
> +
> +        struct kvm *base_ptr = kmalloc(..., sizeof(struct kvm));
> +        ...
> +        stats_fs_source_add_values(child_source, kvm_stats, base_ptr, 0);
> +        stats_fs_source_add_values(parent_source, kvm_stats, NULL, STATS_FS_HIDDEN);
> +
> +``child_source`` will be a simple value, since it has a non-NULL base
> +pointer, while ``parent_source`` will be an aggregate. During the adding
> +phase, also values can optionally be marked as hidden, so that the folder
> +and other values can be still shown.
> +
> +Of course the same ``struct stats_fs_value`` array can be also passed with a
> +different base pointer, to represent the same value but in another instance
> +of the kvm struct.
> +
> +Search:
> +
> +Fetch a value from the child source, returning the value
> +pointed by ``(uint64_t *) base_ptr + kvm_stats[0].offset``::
> +
> +        uint64_t ret_child, ret_parent;
> +
> +        stats_fs_source_get_value(child_source, &kvm_stats[0], &ret_child);
> +
> +Fetch an aggregate value, searching all subsources of ``parent_source`` for
> +the specified ``struct stats_fs_value``::
> +
> +        stats_fs_source_get_value(parent_source, &kvm_stats[0], &ret_parent);
> +
> +        assert(ret_child == ret_parent); // check expected result
> +
> +To make it more interesting, add another child::
> +
> +        struct stats_fs_source child_source2 = stats_fs_source_create(0, "child2");
> +
> +        stats_fs_source_add_subordinate(parent_source, child_source2);
> +        // now  the structure is parent -> child1
> +        //                              -> child2

Is that the same as                 parent -> child1 -> child2
?  It could almost be read as
                                    parent -> child1
                                    parent -> child2

Whichever it is, can you make it more explicit, please?


> +
> +        struct kvm *other_base_ptr = kmalloc(..., sizeof(struct kvm));
> +        ...
> +        stats_fs_source_add_values(child_source2, kvm_stats, other_base_ptr, 0);
> +
> +Note that other_base_ptr points to another instance of kvm, so the struct
> +stats_fs_value is the same but the address at which they point is not.
> +
> +Now get the aggregate value::
> +
> +        uint64_t ret_child, ret_child2, ret_parent;
> +
> +        stats_fs_source_get_value(child_source, &kvm_stats[0], &ret_child);
> +        stats_fs_source_get_value(parent_source, &kvm_stats[0], &ret_parent);
> +        stats_fs_source_get_value(child_source2, &kvm_stats[0], &ret_child2);
> +
> +        assert((ret_child + ret_child2) == ret_parent);
> +
> +Cleanup::
> +
> +        stats_fs_source_remove_subordinate(parent_source, child_source);
> +        stats_fs_source_revoke(child_source);
> +        stats_fs_source_put(child_source);
> +
> +        stats_fs_source_remove_subordinate(parent_source, child_source2);
> +        stats_fs_source_revoke(child_source2);
> +        stats_fs_source_put(child_source2);
> +
> +        stats_fs_source_put(parent_source);
> +        kfree(other_base_ptr);
> +        kfree(base_ptr);
> +
> +Calling stats_fs_source_revoke is very important, because it will ensure

           stats_fs_source_revoke()

> +that stats_fs will not access the data that were passed to
> +stats_fs_source_add_value for this source.
> +
> +Because open files increase the reference count for a stats_fs_source, the
> +source can end up living longer than the data that provides the values for
> +the source.  Calling stats_fs_source_revoke just before the backing data

                        stats_fs_source_revoke()

> +is freed avoids accesses to freed data structures. The sources will return
> +0.
> +
> +This is not needed for the parent_source, since it just contains
> +aggregates that would be 0 anyways if no matching child value exist.
> +
> +API Documentation
> +=================
> +
> +.. kernel-doc:: include/linux/stats_fs.h
> +   :export: fs/stats_fs/*.c
> \ No newline at end of file

Please fix that. ^^^^^


Thanks for the documentation.

-- 
~Randy


^ permalink raw reply

* Re: [PATCH] cxl: Fix kobject memleak
From: Andrew Donnellan @ 2020-06-04  0:08 UTC (permalink / raw)
  To: wanghai (M), fbarrat, arnd, gregkh; +Cc: linuxppc-dev, imunsie, linux-kernel
In-Reply-To: <a9ecd617-c541-aeb1-2e94-93abba475279@huawei.com>

On 3/6/20 9:57 pm, wanghai (M) wrote:
> kfree(cr) can be called when 
> kobject_put()-->kobject_release()-->kobject_cleanup()-->kobj_type->release() 
> is called.  The kobj_type here is afu_config_record_type

Of course, I missed that.

In that case

Acked-by: Andrew Donnellan <ajd@linux.ibm.com>

Thanks for the fix!

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply

* Re: [PATCH 3/3] ASoC: fsl_easrc: Fix "Function parameter not described" warnings
From: Nicolin Chen @ 2020-06-04  0:07 UTC (permalink / raw)
  To: Shengjiu Wang
  Cc: alsa-devel, timur, Xiubo.Lee, linuxppc-dev, tiwai, lgirdwood,
	perex, broonie, festevam, linux-kernel
In-Reply-To: <d166b868e6d294de47a89857be03758ec82a0a61.1591155860.git.shengjiu.wang@nxp.com>

On Wed, Jun 03, 2020 at 11:39:41AM +0800, Shengjiu Wang wrote:
> Obtained with:
> $ make W=1
> 
> sound/soc/fsl/fsl_easrc.c:403: warning: Function parameter or member 'easrc' not described in 'fsl_easrc_normalize_filter'
> sound/soc/fsl/fsl_easrc.c:403: warning: Function parameter or member 'infilter' not described in 'fsl_easrc_normalize_filter'
> sound/soc/fsl/fsl_easrc.c:403: warning: Function parameter or member 'outfilter' not described in 'fsl_easrc_normalize_filter'
> sound/soc/fsl/fsl_easrc.c:403: warning: Function parameter or member 'shift' not described in 'fsl_easrc_normalize_filter'
> 
> Fixes: 955ac624058f ("ASoC: fsl_easrc: Add EASRC ASoC CPU DAI drivers")
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> Reported-by: kbuild test robot <lkp@intel.com>

Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply

* Re: [PATCH 2/3] ASoC: fsl_easrc: Fix -Wunused-but-set-variable
From: Nicolin Chen @ 2020-06-04  0:07 UTC (permalink / raw)
  To: Shengjiu Wang
  Cc: alsa-devel, timur, Xiubo.Lee, linuxppc-dev, tiwai, lgirdwood,
	perex, broonie, festevam, linux-kernel
In-Reply-To: <91ceb59e3bce31c9e93abba06f5156692ff5c71e.1591155860.git.shengjiu.wang@nxp.com>

On Wed, Jun 03, 2020 at 11:39:40AM +0800, Shengjiu Wang wrote:
> Obtained with:
> $ make W=1
> 
> sound/soc/fsl/fsl_easrc.c: In function 'fsl_easrc_set_rs_ratio':
> sound/soc/fsl/fsl_easrc.c:182:15: warning: variable 'int_bits' set but not used [-Wunused-but-set-variable]
>   unsigned int int_bits;
>                ^
> sound/soc/fsl/fsl_easrc.c: In function 'fsl_easrc_set_ctx_organziation':
> sound/soc/fsl/fsl_easrc.c:1204:17: warning: variable 'dev' set but not used [-Wunused-but-set-variable]
>   struct device *dev;
>                  ^
> sound/soc/fsl/fsl_easrc.c: In function 'fsl_easrc_release_context':
> sound/soc/fsl/fsl_easrc.c:1294:17: warning: variable 'dev' set but not used [-Wunused-but-set-variable]
>   struct device *dev;
>                  ^
> Fixes: 955ac624058f ("ASoC: fsl_easrc: Add EASRC ASoC CPU DAI drivers")
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> Reported-by: kbuild test robot <lkp@intel.com>

Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply

* Re: [PATCH 1/3] ASoC: fsl_easrc: Fix -Wmissing-prototypes warning
From: Nicolin Chen @ 2020-06-04  0:05 UTC (permalink / raw)
  To: Shengjiu Wang
  Cc: alsa-devel, timur, Xiubo.Lee, linuxppc-dev, tiwai, lgirdwood,
	perex, broonie, festevam, linux-kernel
In-Reply-To: <ab1b83a56c71f4159a98e6da5602c2c36fe59f4d.1591155860.git.shengjiu.wang@nxp.com>

On Wed, Jun 03, 2020 at 11:39:39AM +0800, Shengjiu Wang wrote:
> Obtained with:
> $ make W=1
> 
> sound/soc/fsl/fsl_easrc.c:967:5: warning: no previous prototype for function 'fsl_easrc_config_context' [-Wmissing-prototypes]
> int fsl_easrc_config_context(struct fsl_asrc *easrc, unsigned int ctx_id)
>     ^
> sound/soc/fsl/fsl_easrc.c:967:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_config_context(struct fsl_asrc *easrc, unsigned int ctx_id)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1128:5: warning: no previous prototype for function 'fsl_easrc_set_ctx_format' [-Wmissing-prototypes]
> int fsl_easrc_set_ctx_format(struct fsl_asrc_pair *ctx,
>     ^
> sound/soc/fsl/fsl_easrc.c:1128:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_set_ctx_format(struct fsl_asrc_pair *ctx,
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1201:5: warning: no previous prototype for function 'fsl_easrc_set_ctx_organziation' [-Wmissing-prototypes]
> int fsl_easrc_set_ctx_organziation(struct fsl_asrc_pair *ctx)
>     ^
> sound/soc/fsl/fsl_easrc.c:1201:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_set_ctx_organziation(struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1245:5: warning: no previous prototype for function 'fsl_easrc_request_context' [-Wmissing-prototypes]
> int fsl_easrc_request_context(int channels, struct fsl_asrc_pair *ctx)
>     ^
> sound/soc/fsl/fsl_easrc.c:1245:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_request_context(int channels, struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1290:6: warning: no previous prototype for function 'fsl_easrc_release_context' [-Wmissing-prototypes]
> void fsl_easrc_release_context(struct fsl_asrc_pair *ctx)
>      ^
> sound/soc/fsl/fsl_easrc.c:1290:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> void fsl_easrc_release_context(struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1317:5: warning: no previous prototype for function 'fsl_easrc_start_context' [-Wmissing-prototypes]
> int fsl_easrc_start_context(struct fsl_asrc_pair *ctx)
>     ^
> sound/soc/fsl/fsl_easrc.c:1317:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_start_context(struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1335:5: warning: no previous prototype for function 'fsl_easrc_stop_context' [-Wmissing-prototypes]
> int fsl_easrc_stop_context(struct fsl_asrc_pair *ctx)
>     ^
> sound/soc/fsl/fsl_easrc.c:1335:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> int fsl_easrc_stop_context(struct fsl_asrc_pair *ctx)
> ^
> static
> sound/soc/fsl/fsl_easrc.c:1382:18: warning: no previous prototype for function 'fsl_easrc_get_dma_channel' [-Wmissing-prototypes]
> struct dma_chan *fsl_easrc_get_dma_channel(struct fsl_asrc_pair *ctx,
>                  ^
> sound/soc/fsl/fsl_easrc.c:1382:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
> struct dma_chan *fsl_easrc_get_dma_channel(struct fsl_asrc_pair *ctx,
> ^
> static
> 
> Fixes: 955ac624058f ("ASoC: fsl_easrc: Add EASRC ASoC CPU DAI drivers")
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
> Reported-by: kbuild test robot <lkp@intel.com>

Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply

* Re: [PATCH] powerpc/vdso32: mark __kernel_datapage_offset as STV_PROTECTED
From: Nick Desaulniers @ 2020-06-03 23:50 UTC (permalink / raw)
  To: Nathan Chancellor, Christophe Leroy, Fangrui Song,
	Michael Ellerman
  Cc: clang-built-linux, Paul Mackerras, linuxppc-dev, LKML
In-Reply-To: <20200207064210.GA13125@ubuntu-x2-xlarge-x86>

On Thu, Feb 6, 2020 at 10:42 PM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> On Wed, Feb 05, 2020 at 07:25:59AM +0100, Christophe Leroy wrote:
> >
> >
> > Le 05/02/2020 à 01:50, Fangrui Song a écrit :
> > > A PC-relative relocation (R_PPC_REL16_LO in this case) referencing a
> > > preemptible symbol in a -shared link is not allowed.  GNU ld's powerpc
> > > port is permissive and allows it [1], but lld will report an error after
> > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=ec0895f08f99515194e9fcfe1338becf6f759d38
> >
> > Note that there is a series whose first two patches aim at dropping
> > __kernel_datapage_offset . See
> > https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=156045 and
> > especially patches https://patchwork.ozlabs.org/patch/1231467/ and
> > https://patchwork.ozlabs.org/patch/1231461/
> >
> > Those patches can be applied independentely of the rest.
> >
> > Christophe
>
> If that is the case, it would be nice if those could be fast tracked to
> 5.6 because as it stands now, all PowerPC builds that were working with
> ld.lld are now broken. Either that or take this patch and rebase that
> series on this one.

So do we still need Fangrui's patch or is it moot?  I'm doing a scrub
of our bug tracker and this issue is still open:
https://github.com/ClangBuiltLinux/linux/issues/851
but it looks like all of our ppc LE targets are linking with LLD just fine
https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/builds/169379039
though it sounds like
https://github.com/ClangBuiltLinux/linux/issues/774
may be a blocker?
Though I don't see Cristophe's
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/5f97f7c921ffc2113ada0f32924e409bccc8277a.1580399657.git.christophe.leroy@c-s.fr/
in mainline or -next.  Was the series not accepted?


>
> Cheers,
> Nathan
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200207064210.GA13125%40ubuntu-x2-xlarge-x86.



--
Thanks,
~Nick Desaulniers

^ permalink raw reply

* Re: [PATCH] arch/{mips,sparc,microblaze,powerpc}: Don't enable pagefault/preempt twice
From: Guenter Roeck @ 2020-06-03 23:44 UTC (permalink / raw)
  To: Ira Weiny, Andrew Morton, Mike Rapoport
  Cc: Peter Zijlstra, Dave Hansen, dri-devel, linux-mips,
	James E.J. Bottomley, Max Filippov, Paul Mackerras,
	H. Peter Anvin, sparclinux, Thomas Gleixner, Helge Deller, x86,
	linux-csky, Christoph Hellwig, Ingo Molnar, linux-snps-arc,
	linux-xtensa, Borislav Petkov, Al Viro, Andy Lutomirski,
	Dan Williams, linux-arm-kernel, Chris Zankel, Thomas Bogendoerfer,
	linux-parisc, linux-kernel, Christian Koenig, linuxppc-dev,
	David S. Miller
In-Reply-To: <20200603211416.GA1740285@iweiny-DESK2.sc.intel.com>

On 6/3/20 2:14 PM, Ira Weiny wrote:
> On Wed, Jun 03, 2020 at 01:57:36PM -0700, Andrew Morton wrote:
>> On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:
>>
>>>>>
>>>>> Actually it occurs to me that the patch consolidating kmap_prot is odd for
>>>>> sparc 32 bit...
>>>>>
>>>>> Its a long shot but could you try reverting this patch?
>>>>>
>>>>> 4ea7d2419e3f kmap: consolidate kmap_prot definitions
>>>>>
>>>>
>>>> That is not easy to revert, unfortunately, due to several follow-up patches.
>>>
>>> I have gotten your sparc tests to run and they all pass...
>>>
>>> 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
>>> Build reference: v5.7-rc4-17-g852b6f2edc0f
>>>
>>> Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
>>> Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
>>> Building sparc32:SS-4:nosmp:initrd ... running ......... passed
>>> Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
>>> Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
>>> Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
>>> Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
>>> Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
>>>
>>> Is there another test I need to run?
>>
>> This all petered out, but as I understand it, this patchset still might
>> have issues on various architectures.
>>
>> Can folks please provide an update on the testing status?
> 
> I believe the tests were failing for Guenter due to another patch set...[1]
> 
> My tests with just this series are working.
> 
>>From my understanding the other failures were unrelated.[2]
> 
> 	<quote Mike Rapoport>
> 	I've checked the patch above on top of the mmots which already has
> 	Ira's patches and it booted fine. I've used sparc32_defconfig to build
> 	the kernel and qemu-system-sparc with default machine and CPU.
> 	</quote>
> 
> Mike, am I wrong?  Do you think the kmap() patches are still causing issues?
> 

For my part, all I can say is that -next is in pretty bad shape right now.
The summary of my tests says:

Build results:
	total: 151 pass: 130 fail: 21
Qemu test results:
	total: 430 pass: 375 fail: 55

sparc32 smp images in next-20200603 still crash for me with a spinlock
recursion. s390 images hang early in boot. Several others (alpha, arm64,
various ppc) don't even compile. I can run some more bisects over time,
but this is becoming a full-time job :-(.

Guenter

^ permalink raw reply

* Re: [PATCH] pwm: Add missing "CONFIG_" prefix
From: Kees Cook @ 2020-06-03 23:38 UTC (permalink / raw)
  To: Joe Perches
  Cc: linux-pwm, Uwe Kleine-König, linux-kernel, Thierry Reding,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <b08611018fdb6d88757c6008a5c02fa0e07b32fb.camel@perches.com>

On Wed, Jun 03, 2020 at 04:04:31PM -0700, Joe Perches wrote:
> more odd uses (mostly in comments)
> 
> $ git grep -P -oh '\bIS_ENABLED\s*\(\s*\w+\s*\)'| \
>   sed -r 's/\s+//g'| \
>   grep -v '(CONFIG_' | \
>   sort | uniq -c | sort -rn

I think a missed a bunch because my grep was messy. :) This is much
easier to scan.

>       7 IS_ENABLED(DEBUG)
>       4 IS_ENABLED(cfg)
>       2 IS_ENABLED(opt_name)
>       2 IS_ENABLED(DEBUG_PRINT_TRIE_GRAPHVIZ)
>       2 IS_ENABLED(config)
>       2 IS_ENABLED(cond)
>       2 IS_ENABLED(__BIG_ENDIAN)
>       1 IS_ENABLED(x)
>       1 IS_ENABLED(PWM_DEBUG)
>       1 IS_ENABLED(option)
>       1 IS_ENABLED(DEBUG_RANDOM_TRIE)
>       1 IS_ENABLED(DEBUG_CHACHA20POLY1305_SLOW_CHUNK_TEST)

These seem to be "as expected".

>       4 IS_ENABLED(DRM_I915_SELFTEST)
>       1 IS_ENABLED(STRICT_KERNEL_RWX)
>       1 IS_ENABLED(ETHTOOL_NETLINK)

But these are not.

> 
> STRICT_KERNEL_RWX is misused here in ppc
> 
> ---
> 
> Fix pr_warn without newline too.
> 
>  arch/powerpc/mm/book3s64/hash_utils.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 51e3c15f7aff..dd60c5f2b991 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -660,11 +660,10 @@ static void __init htab_init_page_sizes(void)
>  		 * Pick a size for the linear mapping. Currently, we only
>  		 * support 16M, 1M and 4K which is the default
>  		 */
> -		if (IS_ENABLED(STRICT_KERNEL_RWX) &&
> +		if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) &&
>  		    (unsigned long)_stext % 0x1000000) {
>  			if (mmu_psize_defs[MMU_PAGE_16M].shift)
> -				pr_warn("Kernel not 16M aligned, "
> -					"disabling 16M linear map alignment");
> +				pr_warn("Kernel not 16M aligned, disabling 16M linear map alignment\n");
>  			aligned = false;
>  		}

Reviewed-by: Kees Cook <keescook@chromium.org>


-- 
Kees Cook

^ permalink raw reply

* Re: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
From: Ira Weiny @ 2020-06-03 23:22 UTC (permalink / raw)
  To: Vaibhav Jain
  Cc: Aneesh Kumar K . V, linuxppc-dev, linux-kernel, Steven Rostedt,
	linux-nvdimm
In-Reply-To: <87lfl3hp2n.fsf@linux.ibm.com>

On Thu, Jun 04, 2020 at 01:58:00AM +0530, Vaibhav Jain wrote:
> Hi Ira,
> 
> Thanks again for reviewing this patch. My Response below:
> 
> Ira Weiny <ira.weiny@intel.com> writes:
> 
> > On Tue, Jun 02, 2020 at 01:51:49PM -0700, 'Ira Weiny' wrote:
> >> On Tue, Jun 02, 2020 at 03:44:37PM +0530, Vaibhav Jain wrote:
> >
> > ...
> >
> >> > +
> >> > +/*
> >> > + * PDSM Envelope:
> >> > + *
> >> > + * The ioctl ND_CMD_CALL transfers data between user-space and kernel via
> >> > + * envelope which consists of a header and user-defined payload sections.
> >> > + * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
> >> > + * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload' field.
> >> > + * There is reserved field that can used to introduce new fields to the
> >> > + * structure in future. It also tries to ensure that 'nd_pdsm_cmd_pkg.payload'
> >> > + * lies at a 8-byte boundary.
> >> > + *
> >> > + *  +-------------+---------------------+---------------------------+
> >> > + *  |   64-Bytes  |       16-Bytes      |       Max 176-Bytes       |
> >> > + *  +-------------+---------------------+---------------------------+
> >> > + *  |               nd_pdsm_cmd_pkg     |                           |
> >> > + *  |-------------+                     |                           |
> >> > + *  |  nd_cmd_pkg |                     |                           |
> >> > + *  +-------------+---------------------+---------------------------+
> >> > + *  | nd_family   |                     |                           |
> >> > + *  | nd_size_out | cmd_status          |                           |
> >> > + *  | nd_size_in  | payload_version     |     payload               |
> >> > + *  | nd_command  | reserved            |                           |
> >> > + *  | nd_fw_size  |                     |                           |
> >> > + *  +-------------+---------------------+---------------------------+
> >
> > One more comment WRT nd_size_[in|out].  I know that it is defined as the size
> > of the FW payload but normally when you nest headers 'size' in Header A
> > represents everything after Header A, including Header B.  In this case that
> > would be including nd_pdsm_cmd_pkg...
> >
> > It looks like that is not what you have done?  Or perhaps I missed it?
> >
> Not sure if I understand the question correctly.
> 'struct nd_pdsm_cmd_pkg' contains 'struct nd_cmd_pkg' at its head and
> its size_[in|out] are populated by the libndctl in userspace, setting
> them to data following the 'struct nd_cmd_pkg'.
> 
> Copying of 'struct nd_cmd_pkg' to the input/out envelop is implicitly
> done in __nd_ioctl via the command descriptor array
> __nd_cmd_bus_descs. So I dont need to add the size of 'struct
> nd_cmd_pkg' to nd_size_[in|out].

Yea I see that now...  Coming from a networking background I find that odd...
:-/  Usually 'size' in a header includes all data after that header.  Because
header A knows nothing of the rest of the 'payload'...

FWIW you could define nd_size_in anyway you want because you are not really
sending any payload back from firmware directly.  But I suppose I can live with
it.

Ira

> 
> > Ira
> >
> >> > + *
> >> > + * PDSM Header:
> >> > + *
> >> > + * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
> >> > + * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to member
> >> > + * 'nd_cmd_pkg.nd_command'. Apart from size information of the envelope which is
> >> > + * contained in 'struct nd_cmd_pkg', the header also has members following 
> >>                                                              ^^^^^
> >>                                                         ...  the  ...
> >> 
> >> > + * members:
> >> > + *
> >> > + * 'cmd_status'		: (Out) Errors if any encountered while servicing PDSM.
> >> > + * 'payload_version'	: (In/Out) Version number associated with the payload.
> >> > + * 'reserved'		: Not used and reserved for future.
> >> > + *
> >> > + * PDSM Payload:
> >> > + *
> >> > + * The layout of the PDSM Payload is defined by various structs shared between
> >> > + * papr_scm and libndctl so that contents of payload can be interpreted. During
> >> > + * servicing of a PDSM the papr_scm module will read input args from the payload
> >> > + * field by casting its contents to an appropriate struct pointer based on the
> >> > + * PDSM command. Similarly the output of servicing the PDSM command will be
> >> > + * copied to the payload field using the same struct.
> >> > + *
> >> > + * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope size, which
> >> > + * leaves around 176 bytes for the envelope payload (ignoring any padding that
> >> > + * the compiler may silently introduce).
> >> > + *
> >> > + * Payload Version:
> >> > + *
> >> > + * A 'payload_version' field is present in PDSM header that indicates a specific
> >> > + * version of the structure present in PDSM Payload for a given PDSM command.
> >> > + * This provides backward compatibility in case the PDSM Payload structure
> >> > + * evolves and different structures are supported by 'papr_scm' and 'libndctl'.
> >> > + *
> >> > + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the version
> >> > + * of the payload struct it supports via 'payload_version' field. The 'papr_scm'
> >> > + * module when servicing the PDSM envelope checks the 'payload_version' and then
> >> > + * uses 'payload struct version' == MIN('payload_version field',
> >> > + * 'max payload-struct-version supported by papr_scm') to service the PDSM.
> >> > + * After servicing the PDSM, 'papr_scm' put the negotiated version of payload
> >> > + * struct in returned 'payload_version' field.
> >> > + *
> >> > + * Libndctl on receiving the envelope back from papr_scm again checks the
> >> > + * 'payload_version' field and based on it use the appropriate version dsm
> >> > + * struct to parse the results.
> >> > + *
> >> > + * Backward Compatibility:
> >> > + *
> >> > + * Above scheme of exchanging different versioned PDSM struct between libndctl
> >> > + * and papr_scm should provide backward compatibility until following two
> >> > + * assumptions/conditions when defining new PDSM structs hold:
> >> > + *
> >> > + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
> >> > + *
> >> > + * 1. T(X) is a proper subset of T(Y) if Y > X.
> >> > + *    i.e Each new version of PDSM struct should retain existing struct
> >> > + *    attributes from previous version
> >> > + *
> >> > + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
> >> > + *    it should also support T(1), T(2)...T(X - 1).
> >> > + *    i.e When adding support for new version of a PDSM struct, libndctl
> >> > + *    and papr_scm should retain support of the existing PDSM struct
> >> > + *    version they support.
> >> 
> >> Please see this thread for an example why versions are a bad idea in UAPIs:
> >> 
> >> https://lkml.org/lkml/2020/3/26/213
> >> 
> >> While the use of version is different in that thread the fundamental issues are
> >> the same.  You end up with some weird matrix of supported features and
> >> structure definitions.  For example, you are opening up the possibility of
> >> changing structures with a different version for no good reason.
> >> 
> >> Also having the user query with version Z and get back version X (older) is
> >> odd.  Generally if the kernel does not know about a feature (ie version Z of
> >> the structure) it should return -EINVAL and let the user figure out what to do.
> >> The user may just give up or they could try a different query.
> >> 
> >> > + */
> >> > +
> >> > +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm */
> >> > +struct nd_pdsm_cmd_pkg {
> >> > +	struct nd_cmd_pkg hdr;	/* Package header containing sub-cmd */
> >> > +	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
> >> > +	__u16 reserved[5];	/* Ignored and to be used in future */
> >> 
> >> How do you know when reserved is used for something else in the future?  Is
> >> reserved guaranteed (and checked by the code) to be 0?
> >> 
> >> > +	__u16 payload_version;	/* In/Out: version of the payload */
> >> 
> >> Why is payload_version after reserved?
> >> 
> >> > +	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
> >> > +} __packed;
> >> > +
> >> > +/*
> >> > + * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
> >> > + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
> >> > + */
> >> > +enum papr_pdsm {
> >> > +	PAPR_PDSM_MIN = 0x0,
> >> > +	PAPR_PDSM_MAX,
> >> > +};
> >> > +
> >> > +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */
> >> > +static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg *cmd)
> >> > +{
> >> > +	return (struct nd_pdsm_cmd_pkg *) cmd;
> >> > +}
> >> > +
> >> > +/* Return the payload pointer for a given pcmd */
> >> > +static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
> >> > +{
> >> > +	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
> >> > +		return NULL;
> >> > +	else
> >> > +		return (void *)(pcmd->payload);
> >> > +}
> >> > +
> >> > +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> >> > diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> >> > index 149431594839..5e2237e7ec08 100644
> >> > --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> > +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> > @@ -15,13 +15,15 @@
> >> >  #include <linux/seq_buf.h>
> >> >  
> >> >  #include <asm/plpar_wrappers.h>
> >> > +#include <asm/papr_pdsm.h>
> >> >  
> >> >  #define BIND_ANY_ADDR (~0ul)
> >> >  
> >> >  #define PAPR_SCM_DIMM_CMD_MASK \
> >> >  	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
> >> >  	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
> >> > -	 (1ul << ND_CMD_SET_CONFIG_DATA))
> >> > +	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
> >> > +	 (1ul << ND_CMD_CALL))
> >> >  
> >> >  /* DIMM health bitmap bitmap indicators */
> >> >  /* SCM device is unable to persist memory contents */
> >> > @@ -350,16 +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
> >> >  	return 0;
> >> >  }
> >> >  
> >> > +/*
> >> > + * Validate the inputs args to dimm-control function and return '0' if valid.
> >> > + * This also does initial sanity validation to ND_CMD_CALL sub-command packages.
> >> > + */
> >> > +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >> > +		       unsigned int buf_len)
> >> > +{
> >> > +	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
> >> > +	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
> >> > +	struct papr_scm_priv *p;
> >> > +
> >> > +	/* Only dimm-specific calls are supported atm */
> >> > +	if (!nvdimm)
> >> > +		return -EINVAL;
> >> > +
> >> > +	/* get the provider date from struct nvdimm */
> >> 
> >> s/date/data
> >> 
> >> > +	p = nvdimm_provider_data(nvdimm);
> >> > +
> >> > +	if (!test_bit(cmd, &cmd_mask)) {
> >> > +		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
> >> > +		return -EINVAL;
> >> > +	} else if (cmd == ND_CMD_CALL) {
> >> > +
> >> > +		/* Verify the envelope package */
> >> > +		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
> >> > +			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
> >> > +				buf_len);
> >> > +			return -EINVAL;
> >> > +		}
> >> > +
> >> > +		/* Verify that the PDSM family is valid */
> >> > +		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
> >> > +			dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n",
> >> > +				pkg->hdr.nd_family);
> >> > +			return -EINVAL;
> >> > +
> >> > +		}
> >> > +
> >> > +		/* We except a payload with all PDSM commands */
> >> > +		if (pdsm_cmd_to_payload(pkg) == NULL) {
> >> > +			dev_dbg(&p->pdev->dev,
> >> > +				"Empty payload for sub-command=0x%llx\n",
> >> > +				pkg->hdr.nd_command);
> >> > +			return -EINVAL;
> >> > +		}
> >> > +	}
> >> > +
> >> > +	/* Command looks valid */
> >> 
> >> I assume the first command to be implemented also checks the { nd_command,
> >> payload_version, payload length } for correctness?
> >> 
> >> > +	return 0;
> >> > +}
> >> > +
> >> > +static int papr_scm_service_pdsm(struct papr_scm_priv *p,
> >> > +				struct nd_pdsm_cmd_pkg *call_pkg)
> >> > +{
> >> > +	/* unknown subcommands return error in packages */
> >> > +	if (call_pkg->hdr.nd_command <= PAPR_PDSM_MIN ||
> >> > +	    call_pkg->hdr.nd_command >= PAPR_PDSM_MAX) {
> >> > +		dev_dbg(&p->pdev->dev, "Invalid PDSM request 0x%llx\n",
> >> > +			call_pkg->hdr.nd_command);
> >> > +		call_pkg->cmd_status = -EINVAL;
> >> > +		return 0;
> >> > +	}
> >> > +
> >> > +	/* Depending on the DSM command call appropriate service routine */
> >> > +	switch (call_pkg->hdr.nd_command) {
> >> > +	default:
> >> > +		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
> >> > +			call_pkg->hdr.nd_command);
> >> > +		call_pkg->cmd_status = -ENOENT;
> >> > +		return 0;
> >> > +	}
> >> > +}
> >> > +
> >> >  static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >> >  			  struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >> >  			  unsigned int buf_len, int *cmd_rc)
> >> >  {
> >> >  	struct nd_cmd_get_config_size *get_size_hdr;
> >> >  	struct papr_scm_priv *p;
> >> > +	struct nd_pdsm_cmd_pkg *call_pkg = NULL;
> >> > +	int rc;
> >> >  
> >> > -	/* Only dimm-specific calls are supported atm */
> >> > -	if (!nvdimm)
> >> > -		return -EINVAL;
> >> > +	/* Use a local variable in case cmd_rc pointer is NULL */
> >> > +	if (cmd_rc == NULL)
> >> > +		cmd_rc = &rc;
> >> 
> >> Why is this needed?  AFAICT The caller of papr_scm_ndctl does not specify null
> >> and you did not change it.
> >> 
> >> > +
> >> > +	*cmd_rc = is_cmd_valid(nvdimm, cmd, buf, buf_len);
> >> > +	if (*cmd_rc) {
> >> > +		pr_debug("Invalid cmd=0x%x. Err=%d\n", cmd, *cmd_rc);
> >> > +		return *cmd_rc;
> >> > +	}
> >> >  
> >> >  	p = nvdimm_provider_data(nvdimm);
> >> >  
> >> > @@ -381,13 +464,19 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >> >  		*cmd_rc = papr_scm_meta_set(p, buf);
> >> >  		break;
> >> >  
> >> > +	case ND_CMD_CALL:
> >> > +		call_pkg = nd_to_pdsm_cmd_pkg(buf);
> >> > +		*cmd_rc = papr_scm_service_pdsm(p, call_pkg);
> >> > +		break;
> >> > +
> >> >  	default:
> >> > -		return -EINVAL;
> >> > +		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
> >> > +		*cmd_rc = -EINVAL;
> >> 
> >> Is this change related?  If there is a bug where there is a caller of
> >> papr_scm_ndctl() with cmd_rc == NULL this should be a separate patch to fix
> >> that issue.
> >> 
> >> Ira
> >> 
> >> >  	}
> >> >  
> >> >  	dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
> >> >  
> >> > -	return 0;
> >> > +	return *cmd_rc;
> >> >  }
> >> >  
> >> >  static ssize_t flags_show(struct device *dev,
> >> > diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
> >> > index de5d90212409..0e09dc5cec19 100644
> >> > --- a/include/uapi/linux/ndctl.h
> >> > +++ b/include/uapi/linux/ndctl.h
> >> > @@ -244,6 +244,7 @@ struct nd_cmd_pkg {
> >> >  #define NVDIMM_FAMILY_HPE2 2
> >> >  #define NVDIMM_FAMILY_MSFT 3
> >> >  #define NVDIMM_FAMILY_HYPERV 4
> >> > +#define NVDIMM_FAMILY_PAPR 5
> >> >  
> >> >  #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
> >> >  					struct nd_cmd_pkg)
> >> > -- 
> >> > 2.26.2
> >> > 
> >> _______________________________________________
> >> Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
> >> To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
> 
> -- 
> Cheers
> ~ Vaibhav

^ permalink raw reply

* Re: [RESEND PATCH v9 5/5] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
From: Ira Weiny @ 2020-06-03 23:18 UTC (permalink / raw)
  To: Vaibhav Jain
  Cc: Santosh Sivaraj, linux-nvdimm, linux-kernel, Steven Rostedt,
	Oliver O'Halloran, Aneesh Kumar K . V, Dan Williams,
	linuxppc-dev
In-Reply-To: <87pnaggee3.fsf@linux.ibm.com>

On Thu, Jun 04, 2020 at 12:34:04AM +0530, Vaibhav Jain wrote:
> Hi Ira,
> 
> Thanks for reviewing this patch. My responses below:
> 
> Ira Weiny <ira.weiny@intel.com> writes:
> 
> > On Tue, Jun 02, 2020 at 03:44:38PM +0530, Vaibhav Jain wrote:
> >> This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
> >> that returns a newly introduced 'struct nd_papr_pdsm_health' instance
> >> containing dimm health information back to user space in response to
> >> ND_CMD_CALL. This functionality is implemented in newly introduced
> >> papr_pdsm_health() that queries the nvdimm health information and
> >> then copies this information to the package payload whose layout is
> >> defined by 'struct nd_papr_pdsm_health'.
> >> 
> >> The patch also introduces a new member 'struct papr_scm_priv.health'
> >> thats an instance of 'struct nd_papr_pdsm_health' to cache the health
> >> information of a nvdimm. As a result functions drc_pmem_query_health()
> >> and flags_show() are updated to populate and use this new struct
> >> instead of a u64 integer that was earlier used.
> >> 
> >> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
> >> Cc: Dan Williams <dan.j.williams@intel.com>
> >> Cc: Michael Ellerman <mpe@ellerman.id.au>
> >> Cc: Ira Weiny <ira.weiny@intel.com>
> >> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> >> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> >> ---
> >> Changelog:
> >> 
> >> Resend:
> >> * Added ack from Aneesh.
> >> 
> >> v8..v9:
> >> * s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
> >> * s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
> >> * Renamed papr_scm_get_health() to papr_psdm_health()
> >> * Updated patch description to replace papr-scm dimm with nvdimm.
> >> 
> >> v7..v8:
> >> * None
> >> 
> >> Resend:
> >> * None
> >> 
> >> v6..v7:
> >> * Updated flags_show() to use seq_buf_printf(). [Mpe]
> >> * Updated papr_scm_get_health() to use newly introduced
> >>   __drc_pmem_query_health() bypassing the cache [Mpe].
> >> 
> >> v5..v6:
> >> * Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
> >>   gaurd against possibility of different compilers adding different
> >>   paddings to the struct [ Dan Williams ]
> >> 
> >> * Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
> >>   'bool' and also updated drc_pmem_query_health() to take this into
> >>   account. [ Dan Williams ]
> >> 
> >> v4..v5:
> >> * None
> >> 
> >> v3..v4:
> >> * Call the DSM_PAPR_SCM_HEALTH service function from
> >>   papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]
> >> 
> >> v2..v3:
> >> * Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
> >>   as its exported to the userspace [Aneesh]
> >> * Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
> >>   from enum to #defines [Aneesh]
> >> 
> >> v1..v2:
> >> * New patch in the series
> >> ---
> >>  arch/powerpc/include/uapi/asm/papr_pdsm.h |  39 +++++++
> >>  arch/powerpc/platforms/pseries/papr_scm.c | 125 +++++++++++++++++++---
> >>  2 files changed, 147 insertions(+), 17 deletions(-)
> >> 
> >> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> >> index 6407fefcc007..411725a91591 100644
> >> --- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
> >> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
> >> @@ -115,6 +115,7 @@ struct nd_pdsm_cmd_pkg {
> >>   */
> >>  enum papr_pdsm {
> >>  	PAPR_PDSM_MIN = 0x0,
> >> +	PAPR_PDSM_HEALTH,
> >>  	PAPR_PDSM_MAX,
> >>  };
> >>  
> >> @@ -133,4 +134,42 @@ static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
> >>  		return (void *)(pcmd->payload);
> >>  }
> >>  
> >> +/* Various nvdimm health indicators */
> >> +#define PAPR_PDSM_DIMM_HEALTHY       0
> >> +#define PAPR_PDSM_DIMM_UNHEALTHY     1
> >> +#define PAPR_PDSM_DIMM_CRITICAL      2
> >> +#define PAPR_PDSM_DIMM_FATAL         3
> >> +
> >> +/*
> >> + * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
> >> + * Various flags indicate the health status of the dimm.
> >> + *
> >> + * dimm_unarmed		: Dimm not armed. So contents wont persist.
> >> + * dimm_bad_shutdown	: Previous shutdown did not persist contents.
> >> + * dimm_bad_restore	: Contents from previous shutdown werent restored.
> >> + * dimm_scrubbed	: Contents of the dimm have been scrubbed.
> >> + * dimm_locked		: Contents of the dimm cant be modified until CEC reboot
> >> + * dimm_encrypted	: Contents of dimm are encrypted.
> >> + * dimm_health		: Dimm health indicator. One of PAPR_PDSM_DIMM_XXXX
> >> + */
> >> +struct nd_papr_pdsm_health_v1 {
> >> +	__u8 dimm_unarmed;
> >> +	__u8 dimm_bad_shutdown;
> >> +	__u8 dimm_bad_restore;
> >> +	__u8 dimm_scrubbed;
> >> +	__u8 dimm_locked;
> >> +	__u8 dimm_encrypted;
> >> +	__u16 dimm_health;
> >> +} __packed;
> >> +
> >> +/*
> >> + * Typedef the current struct for dimm_health so that any application
> >> + * or kernel recompiled after introducing a new version automatically
> >> + * supports the new version.
> >> + */
> >> +#define nd_papr_pdsm_health nd_papr_pdsm_health_v1
> >> +
> >> +/* Current version number for the dimm health struct */
> >
> > This can't be the 'current' version.  You will need a list of versions you
> > support.  Because if the user passes in an old version you need to be able to
> > respond with that old version.  Also if you plan to support 'return X for a Y
> > query' then the user will need both X and Y defined to interpret X.
> Yes, and that change will be introduced with addition of version-2 of
> nd_papr_pdsm_health. Earlier version of the patchset[1] had such a table
> implemented. But to simplify the patchset, as we are only dealing with
> version-1 of the structs right now, it was dropped.
> 
> [1] :
> https://lore.kernel.org/linuxppc-dev/20200220095805.197229-9-vaibhav@linux.ibm.com/

I'm not sure I follow that comment.

I feel like there is some confusion about what firmware can return vs the UAPI
structure.  You have already marshaled the data between the 2.  We can define
whatever we want for the UAPI structures throwing away data the kernel does not
understand from the firmware.

> 
> >
> >> +#define ND_PAPR_PDSM_HEALTH_VERSION 1
> >> +
> >>  #endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> >> index 5e2237e7ec08..c0606c0c659c 100644
> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> @@ -88,7 +88,7 @@ struct papr_scm_priv {
> >>  	unsigned long lasthealth_jiffies;
> >>  
> >>  	/* Health information for the dimm */
> >> -	u64 health_bitmap;
> >> +	struct nd_papr_pdsm_health health;
> >
> > ok so we are throwing away all the #defs from patch 1?  Are they still valid?
> >
> > I'm confused that patch 3 added this and we are throwing it away
> > here...
> The #defines are still valid, only the usage moved to a __drc_pmem_query_health().
> 
> >
> >>  };
> >>  
> >>  static int drc_pmem_bind(struct papr_scm_priv *p)
> >> @@ -201,6 +201,7 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
> >>  static int __drc_pmem_query_health(struct papr_scm_priv *p)
> >>  {
> >>  	unsigned long ret[PLPAR_HCALL_BUFSIZE];
> >> +	u64 health;
> >>  	long rc;
> >>  
> >>  	/* issue the hcall */
> >> @@ -208,18 +209,46 @@ static int __drc_pmem_query_health(struct papr_scm_priv *p)
> >>  	if (rc != H_SUCCESS) {
> >>  		dev_err(&p->pdev->dev,
> >>  			 "Failed to query health information, Err:%ld\n", rc);
> >> -		rc = -ENXIO;
> >> -		goto out;
> >> +		return -ENXIO;
> >
> > I missed this...  probably did not need the goto in the first patch?
> Yes, will get rid of the goto from patch-1.

Cool.

> 
> >
> >>  	}
> >>  
> >>  	p->lasthealth_jiffies = jiffies;
> >> -	p->health_bitmap = ret[0] & ret[1];
> >> +	health = ret[0] & ret[1];
> >>  
> >>  	dev_dbg(&p->pdev->dev,
> >>  		"Queried dimm health info. Bitmap:0x%016lx Mask:0x%016lx\n",
> >>  		ret[0], ret[1]);
> >> -out:
> >> -	return rc;
> >> +
> >> +	memset(&p->health, 0, sizeof(p->health));
> >> +
> >> +	/* Check for various masks in bitmap and set the buffer */
> >> +	if (health & PAPR_PMEM_UNARMED_MASK)
> >
> > Oh ok...  odd.  (don't add code then just take it away in a series)
> > You could have lead with the user structure and put this code in patch
> > 3.
> The struct nd_papr_pdsm_health in only introduced this patch in header
> 'papr_pdsm.h' as means of exchanging nvdimm health information with
> userspace. Introducing this struct without introducing the necessary
> scafolding in 'papr_pdsm.h' would have been very counter-intutive.

I respectfully disagree.  You intended to use a copy of this structure in
kernel to store the data.  Just do that.

> 
> >
> > Why does the user need u8 to represent a single bit?  Does this help protect
> > against endian issues?
> This was 'bool' earlier but since type 'bool' isnt suitable for ioctl abi
> and I wanted to avoid bit fields here as not sure if their packing may
> differ across compilers hence replaced with u8.
> 

ok works for me...

> >
> >> +		p->health.dimm_unarmed = 1;
> >> +
> >> +	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
> >> +		p->health.dimm_bad_shutdown = 1;
> >> +
> >> +	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
> >> +		p->health.dimm_bad_restore = 1;
> >> +
> >> +	if (health & PAPR_PMEM_ENCRYPTED)
> >> +		p->health.dimm_encrypted = 1;
> >> +
> >> +	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED) {
> >> +		p->health.dimm_locked = 1;
> >> +		p->health.dimm_scrubbed = 1;
> >> +	}
> >> +
> >> +	if (health & PAPR_PMEM_HEALTH_UNHEALTHY)
> >> +		p->health.dimm_health = PAPR_PDSM_DIMM_UNHEALTHY;
> >> +
> >> +	if (health & PAPR_PMEM_HEALTH_CRITICAL)
> >> +		p->health.dimm_health = PAPR_PDSM_DIMM_CRITICAL;
> >> +
> >> +	if (health & PAPR_PMEM_HEALTH_FATAL)
> >> +		p->health.dimm_health = PAPR_PDSM_DIMM_FATAL;
> >> +
> >> +	return 0;
> >>  }
> >>  
> >>  /* Min interval in seconds for assuming stable dimm health */
> >> @@ -403,6 +432,58 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >>  	return 0;
> >>  }
> >>  
> >> +/* Fetch the DIMM health info and populate it in provided package. */
> >> +static int papr_pdsm_health(struct papr_scm_priv *p,
> >> +			       struct nd_pdsm_cmd_pkg *pkg)
> >> +{
> >> +	int rc;
> >> +	size_t copysize = sizeof(p->health);
> >> +
> >> +	/* Ensure dimm health mutex is taken preventing concurrent access */
> >> +	rc = mutex_lock_interruptible(&p->health_mutex);
> >> +	if (rc)
> >> +		goto out;
> >> +
> >> +	/* Always fetch upto date dimm health data ignoring cached values */
> >> +	rc = __drc_pmem_query_health(p);
> >> +	if (rc)
> >> +		goto out_unlock;
> >> +	/*
> >> +	 * If the requested payload version is greater than one we know
> >> +	 * about, return the payload version we know about and let
> >> +	 * caller/userspace handle.
> >> +	 */
> >> +	if (pkg->payload_version > ND_PAPR_PDSM_HEALTH_VERSION)
> >> +		pkg->payload_version = ND_PAPR_PDSM_HEALTH_VERSION;
> >
> > I know this seems easy now but I do think you will run into trouble later.
> 
> I did addressed this in an earlier iteration of this patchset[1] and
> dropped it in favour of simplicity.
> 
> [1] :
> https://lore.kernel.org/linuxppc-dev/20200220095805.197229-9-vaibhav@linux.ibm.com/
 
I don't see how that addresses this?  See my other email.

Ira

> 
> > Ira
> >
> >> +
> >> +	if (pkg->hdr.nd_size_out < copysize) {
> >> +		dev_dbg(&p->pdev->dev, "Truncated payload (%u). Expected (%lu)",
> >> +			pkg->hdr.nd_size_out, copysize);
> >> +		rc = -ENOSPC;
> >> +		goto out_unlock;
> >> +	}
> >> +
> >> +	dev_dbg(&p->pdev->dev, "Copying payload size=%lu version=0x%x\n",
> >> +		copysize, pkg->payload_version);
> >> +
> >> +	/* Copy the health struct to the payload */
> >> +	memcpy(pdsm_cmd_to_payload(pkg), &p->health, copysize);
> >> +	pkg->hdr.nd_fw_size = copysize;
> >> +
> >> +out_unlock:
> >> +	mutex_unlock(&p->health_mutex);
> >> +
> >> +out:
> >> +	/*
> >> +	 * Put the error in out package and return success from function
> >> +	 * so that errors if any are propogated back to userspace.
> >> +	 */
> >> +	pkg->cmd_status = rc;
> >> +	dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>  static int papr_scm_service_pdsm(struct papr_scm_priv *p,
> >>  				struct nd_pdsm_cmd_pkg *call_pkg)
> >>  {
> >> @@ -417,6 +498,9 @@ static int papr_scm_service_pdsm(struct papr_scm_priv *p,
> >>  
> >>  	/* Depending on the DSM command call appropriate service routine */
> >>  	switch (call_pkg->hdr.nd_command) {
> >> +	case PAPR_PDSM_HEALTH:
> >> +		return papr_pdsm_health(p, call_pkg);
> >> +
> >>  	default:
> >>  		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
> >>  			call_pkg->hdr.nd_command);
> >> @@ -485,34 +569,41 @@ static ssize_t flags_show(struct device *dev,
> >>  	struct nvdimm *dimm = to_nvdimm(dev);
> >>  	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
> >>  	struct seq_buf s;
> >> -	u64 health;
> >>  	int rc;
> >>  
> >>  	rc = drc_pmem_query_health(p);
> >>  	if (rc)
> >>  		return rc;
> >>  
> >> -	/* Copy health_bitmap locally, check masks & update out buffer */
> >> -	health = READ_ONCE(p->health_bitmap);
> >> -
> >>  	seq_buf_init(&s, buf, PAGE_SIZE);
> >> -	if (health & PAPR_PMEM_UNARMED_MASK)
> >> +
> >> +	/* Protect concurrent modifications to papr_scm_priv */
> >> +	rc = mutex_lock_interruptible(&p->health_mutex);
> >> +	if (rc)
> >> +		return rc;
> >> +
> >> +	if (p->health.dimm_unarmed)
> >>  		seq_buf_printf(&s, "not_armed ");
> >>  
> >> -	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
> >> +	if (p->health.dimm_bad_shutdown)
> >>  		seq_buf_printf(&s, "flush_fail ");
> >>  
> >> -	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
> >> +	if (p->health.dimm_bad_restore)
> >>  		seq_buf_printf(&s, "restore_fail ");
> >>  
> >> -	if (health & PAPR_PMEM_ENCRYPTED)
> >> +	if (p->health.dimm_encrypted)
> >>  		seq_buf_printf(&s, "encrypted ");
> >>  
> >> -	if (health & PAPR_PMEM_SMART_EVENT_MASK)
> >> +	if (p->health.dimm_health)
> >>  		seq_buf_printf(&s, "smart_notify ");
> >>  
> >> -	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED)
> >> -		seq_buf_printf(&s, "scrubbed locked ");
> >> +	if (p->health.dimm_scrubbed)
> >> +		seq_buf_printf(&s, "scrubbed ");
> >> +
> >> +	if (p->health.dimm_locked)
> >> +		seq_buf_printf(&s, "locked ");
> >> +
> >> +	mutex_unlock(&p->health_mutex);
> >>  
> >>  	if (seq_buf_used(&s))
> >>  		seq_buf_printf(&s, "\n");
> >> -- 
> >> 2.26.2
> >> 
> 
> -- 
> Cheers
> ~ Vaibhav

^ permalink raw reply

* Re: [PATCH v1 3/4] KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in H_SVM_INIT_DONE
From: Ram Pai @ 2020-06-03 23:10 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: rcampbell, ldufour, cclaudio, kvm-ppc, aneesh.kumar, sukadev,
	linuxppc-dev, bauerman, david
In-Reply-To: <20200602100639.GB31382@in.ibm.com>

On Tue, Jun 02, 2020 at 03:36:39PM +0530, Bharata B Rao wrote:
> On Mon, Jun 01, 2020 at 12:05:35PM -0700, Ram Pai wrote:
> > On Mon, Jun 01, 2020 at 05:25:18PM +0530, Bharata B Rao wrote:
> > > On Sat, May 30, 2020 at 07:27:50PM -0700, Ram Pai wrote:
> > > > H_SVM_INIT_DONE incorrectly assumes that the Ultravisor has explicitly
> > > > called H_SVM_PAGE_IN for all secure pages.
> > > 
> > > I don't think that is quite true. HV doesn't assume anything about
> > > secure pages by itself.
> > 
> > Yes. Currently, it does not assume anything about secure pages.  But I am
> > proposing that it should consider all pages (except the shared pages) as
> > secure pages, when H_SVM_INIT_DONE is called.
> 
> Ok, then may be also add the proposed changes to H_SVM_INIT_DONE
> documentation.

ok.

> 
> > 
> > In other words, HV should treat all pages; except shared pages, as
> > secure pages once H_SVM_INIT_DONE is called. And this includes pages
> > added subsequently through memory hotplug.
> 
> So after H_SVM_INIT_DONE, if HV touches a secure page for any
> reason and gets encrypted contents via page-out, HV drops the
> device pfn at that time. So what state we would be in that case? We
> have completed H_SVM_INIT_DONE, but still have a normal (but encrypted)
> page in HV?

Good point.

The corresponding GFN will continue to be a secure GFN. Just that its
backing PFN is not a device-PFN, but a memory-PFN. Also that backing
memory-PFN contains encrypted content.

I will clarify this in the patch; about secure-GFN state.

> 
> > 
> > Yes. the Ultravisor can explicitly request the HV to move the pages
> > individually.  But that will slow down the transition too significantly.
> > It takes above 20min to transition them, for a SVM of size 100G.
> > 
> > With this proposed enhancement, the switch completes in a few seconds.
> 
> I think, many pages during initial switch and most pages for hotplugged
> memory are zero pages, for which we don't anyway issue UV page-in calls.
> So the 20min saving you are observing is purely due to hcall overhead?

Apparently, that seems to be the case.

> 
> How about extending H_SVM_PAGE_IN interface or a new hcall to request
> multiple pages in one request?
> 
> Also, how about requesting for bigger page sizes (2M)? Ralph Campbell
> had patches that added THP support for migrate_vma_* calls.

yes. that should give further boost. I think the API does not stop us
from using that feature. Its the support on the Ultravisor side.
Hopefully we will have contributions to the ultravisor once it is
opensourced.

> 
> > 
> > > 
> > > > These GFNs continue to be
> > > > normal GFNs associated with normal PFNs; when infact, these GFNs should
> > > > have been secure GFNs associated with device PFNs.
> > > 
> > > Transition to secure state is driven by SVM/UV and HV just responds to
> > > hcalls by issuing appropriate uvcalls. SVM/UV is in the best position to
> > > determine the required pages that need to be moved into secure side.
> > > HV just responds to it and tracks such pages as device private pages.
> > > 
> > > If SVM/UV doesn't get in all the pages to secure side by the time
> > > of H_SVM_INIT_DONE, the remaining pages are just normal (shared or
> > > otherwise) pages as far as HV is concerned.  Why should HV assume that
> > > SVM/UV didn't ask for a few pages and hence push those pages during
> > > H_SVM_INIT_DONE?
> > 
> > By definition, SVM is a VM backed by secure pages.
> > Hence all pages(except shared) must turn secure when a VM switches to SVM.
> > 
> > UV is interested in only a certain pages for the VM, which it will
> > request explicitly through H_SVM_PAGE_IN.  All other pages, need not
> > be paged-in through UV_PAGE_IN.  They just need to be switched to
> > device-pages.
> > 
> > > 
> > > I think UV should drive the movement of pages into secure side both
> > > of boot-time SVM memory and hot-plugged memory. HV does memslot
> > > registration uvcall when new memory is plugged in, UV should explicitly
> > > get the required pages in at that time instead of expecting HV to drive
> > > the same.
> > > 
> > > > +static int uv_migrate_mem_slot(struct kvm *kvm,
> > > > +		const struct kvm_memory_slot *memslot)
> > > > +{
> > > > +	unsigned long gfn = memslot->base_gfn;
> > > > +	unsigned long end;
> > > > +	bool downgrade = false;
> > > > +	struct vm_area_struct *vma;
> > > > +	int i, ret = 0;
> > > > +	unsigned long start = gfn_to_hva(kvm, gfn);
> > > > +
> > > > +	if (kvm_is_error_hva(start))
> > > > +		return H_STATE;
> > > > +
> > > > +	end = start + (memslot->npages << PAGE_SHIFT);
> > > > +
> > > > +	down_write(&kvm->mm->mmap_sem);
> > > > +
> > > > +	mutex_lock(&kvm->arch.uvmem_lock);
> > > > +	vma = find_vma_intersection(kvm->mm, start, end);
> > > > +	if (!vma || vma->vm_start > start || vma->vm_end < end) {
> > > > +		ret = H_STATE;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > > +	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
> > > > +			  MADV_UNMERGEABLE, &vma->vm_flags);
> > > > +	downgrade_write(&kvm->mm->mmap_sem);
> > > > +	downgrade = true;
> > > > +	if (ret) {
> > > > +		ret = H_STATE;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > > +	for (i = 0; i < memslot->npages; i++, ++gfn) {
> > > > +		/* skip paged-in pages and shared pages */
> > > > +		if (kvmppc_gfn_is_uvmem_pfn(gfn, kvm, NULL) ||
> > > > +			kvmppc_gfn_is_uvmem_shared(gfn, kvm))
> > > > +			continue;
> > > > +
> > > > +		start = gfn_to_hva(kvm, gfn);
> > > > +		end = start + (1UL << PAGE_SHIFT);
> > > > +		ret = kvmppc_svm_migrate_page(vma, start, end,
> > > > +			(gfn << PAGE_SHIFT), kvm, PAGE_SHIFT, false);
> > > > +
> > > > +		if (ret)
> > > > +			goto out_unlock;
> > > > +	}
> > > 
> > > Is there a guarantee that the vma you got for the start address remains
> > > valid for all the addresses till end in a memslot? If not, you should
> > > re-get the vma for the current address in each iteration I suppose.
> > 
> > 
> > mm->mmap_sem  is the semaphore that guards the vma. right?  If that
> > semaphore is held, can the vma change?
> 
> I am not sure if the vma you obtained would span the entire address range
> in the memslot.

will check.

Thanks,
RP

^ permalink raw reply

* Re: [PATCH] pwm: Add missing "CONFIG_" prefix
From: Joe Perches @ 2020-06-03 23:04 UTC (permalink / raw)
  To: Kees Cook, Uwe Kleine-König, Thierry Reding
  Cc: linux-pwm, linux-kernel, Paul Mackerras, linuxppc-dev
In-Reply-To: <202006031539.4198EA6@keescook>

On Wed, 2020-06-03 at 15:40 -0700, Kees Cook wrote:
> The IS_ENABLED() use was missing the CONFIG_ prefix which would have
> lead to skipping this code.
> 
> Fixes: 3ad1f3a33286 ("pwm: Implement some checks for lowlevel drivers")
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  drivers/pwm/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pwm/core.c b/drivers/pwm/core.c
> index 9973c442b455..6b3cbc0490c6 100644
> --- a/drivers/pwm/core.c
> +++ b/drivers/pwm/core.c
> @@ -121,7 +121,7 @@ static int pwm_device_request(struct pwm_device *pwm, const char *label)
>  		pwm->chip->ops->get_state(pwm->chip, pwm, &pwm->state);
>  		trace_pwm_get(pwm, &pwm->state);
>  
> -		if (IS_ENABLED(PWM_DEBUG))
> +		if (IS_ENABLED(CONFIG_PWM_DEBUG))
>  			pwm->last = pwm->state;
>  	}
>  
> -- 
> 2.25.1
> 

more odd uses (mostly in comments)

$ git grep -P -oh '\bIS_ENABLED\s*\(\s*\w+\s*\)'| \
  sed -r 's/\s+//g'| \
  grep -v '(CONFIG_' | \
  sort | uniq -c | sort -rn
      7 IS_ENABLED(DEBUG)
      4 IS_ENABLED(DRM_I915_SELFTEST)
      4 IS_ENABLED(cfg)
      2 IS_ENABLED(opt_name)
      2 IS_ENABLED(DEBUG_PRINT_TRIE_GRAPHVIZ)
      2 IS_ENABLED(config)
      2 IS_ENABLED(cond)
      2 IS_ENABLED(__BIG_ENDIAN)
      1 IS_ENABLED(x)
      1 IS_ENABLED(STRICT_KERNEL_RWX)
      1 IS_ENABLED(PWM_DEBUG)
      1 IS_ENABLED(option)
      1 IS_ENABLED(ETHTOOL_NETLINK)
      1 IS_ENABLED(DEBUG_RANDOM_TRIE)
      1 IS_ENABLED(DEBUG_CHACHA20POLY1305_SLOW_CHUNK_TEST)

STRICT_KERNEL_RWX is misused here in ppc

---

Fix pr_warn without newline too.

 arch/powerpc/mm/book3s64/hash_utils.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 51e3c15f7aff..dd60c5f2b991 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -660,11 +660,10 @@ static void __init htab_init_page_sizes(void)
 		 * Pick a size for the linear mapping. Currently, we only
 		 * support 16M, 1M and 4K which is the default
 		 */
-		if (IS_ENABLED(STRICT_KERNEL_RWX) &&
+		if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) &&
 		    (unsigned long)_stext % 0x1000000) {
 			if (mmu_psize_defs[MMU_PAGE_16M].shift)
-				pr_warn("Kernel not 16M aligned, "
-					"disabling 16M linear map alignment");
+				pr_warn("Kernel not 16M aligned, disabling 16M linear map alignment\n");
 			aligned = false;
 		}
 




^ permalink raw reply related

* Re: [RESEND PATCH v9 4/5] ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
From: Ira Weiny @ 2020-06-03 22:46 UTC (permalink / raw)
  To: Vaibhav Jain
  Cc: Santosh Sivaraj, linux-nvdimm, linux-kernel, Steven Rostedt,
	Oliver O'Halloran, Aneesh Kumar K . V, Dan Williams,
	linuxppc-dev
In-Reply-To: <87tuzsggtd.fsf@linux.ibm.com>

On Wed, Jun 03, 2020 at 11:41:42PM +0530, Vaibhav Jain wrote:
> Hi Ira,
> 
> Thanks for reviewing this patch. My responses below:
> 
> Ira Weiny <ira.weiny@intel.com> writes:
> 

...

> >> + *
> >> + * Payload Version:
> >> + *
> >> + * A 'payload_version' field is present in PDSM header that indicates a specific
> >> + * version of the structure present in PDSM Payload for a given PDSM command.
> >> + * This provides backward compatibility in case the PDSM Payload structure
> >> + * evolves and different structures are supported by 'papr_scm' and 'libndctl'.
> >> + *
> >> + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the version
> >> + * of the payload struct it supports via 'payload_version' field. The 'papr_scm'
> >> + * module when servicing the PDSM envelope checks the 'payload_version' and then
> >> + * uses 'payload struct version' == MIN('payload_version field',
> >> + * 'max payload-struct-version supported by papr_scm') to service the PDSM.
> >> + * After servicing the PDSM, 'papr_scm' put the negotiated version of payload
> >> + * struct in returned 'payload_version' field.
> >> + *
> >> + * Libndctl on receiving the envelope back from papr_scm again checks the
> >> + * 'payload_version' field and based on it use the appropriate version dsm
> >> + * struct to parse the results.
> >> + *
> >> + * Backward Compatibility:
> >> + *
> >> + * Above scheme of exchanging different versioned PDSM struct between libndctl
> >> + * and papr_scm should provide backward compatibility until following two
> >> + * assumptions/conditions when defining new PDSM structs hold:
> >> + *
> >> + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
> >> + *
> >> + * 1. T(X) is a proper subset of T(Y) if Y > X.
> >> + *    i.e Each new version of PDSM struct should retain existing struct
> >> + *    attributes from previous version
> >> + *
> >> + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
> >> + *    it should also support T(1), T(2)...T(X - 1).
> >> + *    i.e When adding support for new version of a PDSM struct, libndctl
> >> + *    and papr_scm should retain support of the existing PDSM struct
> >> + *    version they support.
> >
> > Please see this thread for an example why versions are a bad idea in UAPIs:
> >
> > https://lkml.org/lkml/2020/3/26/213
> >
> 
> > While the use of version is different in that thread the fundamental issues are
> > the same.  You end up with some weird matrix of supported features and
> > structure definitions.  For example, you are opening up the possibility of
> > changing structures with a different version for no good reason.
> 
> Not really sure I understand the statement correctly "you are opening up
> the possibility of changing structures with a different version for no
> good reason."

What I mean is:

struct v1 {
	u32 x;
	u32 y;
};

struct v2 {
	u32 y;
	u32 x;
};

x and y are the same data but you have now redefined the order of the struct.
You don't need that flexibility/complexity.

Generally I think you are defining:

struct v1 {
	u32 x;
	u32 y;
};

struct v2 {
	u32 x;
	u32 y;
	u32 z;
	u32 a;
};

Which becomes 2 structures...  There is no need.

The easiest thing to do is:

struct user_data {
	u32 x;
	u32 y;
};

And later you modify user_data to:

struct user_data {
	u32 x;
	u32 y;
	u32 z;
	u32 a;
};

libndctl always passes sizeof(struct user_data) to the call. [Do ensure
structures are 64bit aligned for this to work.]

The kernel sees the size and returns the amount of data up to that size.

Therefore, older kernels automatically fill in x and y,  newer kernels fill in
z/a if the buffer was big enough.  libndctl only uses the fields it knows about.

It is _much_ easier this way.  Almost nothing needs to get changed as versions
roll forward.  The only big issue is if libndctl _needs_ z then it has to check
if z is returned.

In that case add a cap_mask with bit fields which the kernel can fill in for
which fields are valid.

struct user_data {
	u64 cap_mask;  /* where bits define extra future capabilities */
	u32 x;
	u32 y;
};

IFF you need to add data within fields which are reserved you can use
capability flags to indicate which fields are requested and which are returned
by the kernel.

But I _think_ for what you want libndctl must survive if z/a are not available
right?  So just adding to the structure should be fine.

> We want to return more data in the struct in future iterations. So
> 'changing structure with different version' is something we are
> expecting. 
> 
> With the backward compatibility constraints 1 & 2 above, it will ensure
> that support matrix looks like a lower traingular matrix with each
> successive version supporting previous version attributes. So supporting
> future versions is relatively simplified.

But you end up with weird switch/if's etc to deal with the multiple structures.

With the size method the kernel simply returns the same size data as the user
requested and everything is done.  No logic required at all.  Literally it can
just copy the data it has (truncating if necessary).

> 
> >
> > Also having the user query with version Z and get back version X (older) is
> > odd.  Generally if the kernel does not know about a feature (ie version Z of
> > the structure) it should return -EINVAL and let the user figure out what to do.
> > The user may just give up or they could try a different query.
> >
> Considering the flow of ndctl/libndctl this is needed. libndctl will
> usually issues only one CMD_CALL ioctl to kernel and if that fails then
> an error is reported and ndctl will exit loosing state.
> 
> Adding mechanism in libndctl to reissue CMD_CALL ioctl to fetch a
> appropriate version of pdsm struct is going to be considerably more
> work.
> 
> This version fall-back mechanism, ensures that libndctl will receive
> usable data without having to reissue a more CMD_CALL ioctls.

Define usable?

What happens if libndctl does not get 'z' in my example above?  What does it
do?  If I understand correctly it does not _need_ z.  So why have a check on
the version from the kernel?

What if we change to:

struct v3 {
	u32 x;
	u32 y;
	u32 z;
	u32 a;
	u32 b;
	u32 c;
};

Now it has to

	if(version 2)
		z/a valid do something()

	if(version 3)
		b/c valid do something else()

if z, a, b, c are all 0 does it matter?

If not, the logic above disappears.

If so, then you need a cap mask.  Then the kernel can say c and a are valid
(but c is 0) or other flexible stuff like that.

> 
> >> + */
> >> +
> >> +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm */
> >> +struct nd_pdsm_cmd_pkg {
> >> +	struct nd_cmd_pkg hdr;	/* Package header containing sub-cmd */
> >> +	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
> >> +	__u16 reserved[5];	/* Ignored and to be used in future */
> >
> > How do you know when reserved is used for something else in the future?  Is
> > reserved guaranteed (and checked by the code) to be 0?
> 
> For current set of pdsm requests ignore these reserved fields. However a
> future pdsm request can leverage these reserved fields. So papr_scm
> just bind the usage of these fields with the value of
> 'nd_cmd_pkg.nd_command' that indicates the pdsm request.
> 
> That being said checking if the reserved fields are set to 0 will be a
> good measure. Will add this check in next iteration.

Exactly, if you don't check them now you will end up with an older libndctl
which passes in garbage and breaks future users...  Basically rendering the
reserved fields useless.

> 
> >
> >> +	__u16 payload_version;	/* In/Out: version of the payload */
> >
> > Why is payload_version after reserved?
> Want to place the payload version field just before the payload data so
> that it can be accessed with simple pointer arithmetic.

I did not see that in the patch.  I thought you were using
nd_pdsm_cmd_pkg->payload_version?

> 
> >
> >> +	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
> >> +} __packed;
> >> +
> >> +/*
> >> + * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
> >> + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
> >> + */
> >> +enum papr_pdsm {
> >> +	PAPR_PDSM_MIN = 0x0,
> >> +	PAPR_PDSM_MAX,
> >> +};
> >> +
> >> +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */
> >> +static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg *cmd)
> >> +{
> >> +	return (struct nd_pdsm_cmd_pkg *) cmd;
> >> +}
> >> +
> >> +/* Return the payload pointer for a given pcmd */
> >> +static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
> >> +{
> >> +	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
> >> +		return NULL;
> >> +	else
> >> +		return (void *)(pcmd->payload);
> >> +}
> >> +
> >> +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> >> index 149431594839..5e2237e7ec08 100644
> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> @@ -15,13 +15,15 @@
> >>  #include <linux/seq_buf.h>
> >>  
> >>  #include <asm/plpar_wrappers.h>
> >> +#include <asm/papr_pdsm.h>
> >>  
> >>  #define BIND_ANY_ADDR (~0ul)
> >>  
> >>  #define PAPR_SCM_DIMM_CMD_MASK \
> >>  	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
> >>  	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
> >> -	 (1ul << ND_CMD_SET_CONFIG_DATA))
> >> +	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
> >> +	 (1ul << ND_CMD_CALL))
> >>  
> >>  /* DIMM health bitmap bitmap indicators */
> >>  /* SCM device is unable to persist memory contents */
> >> @@ -350,16 +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
> >>  	return 0;
> >>  }
> >>  
> >> +/*
> >> + * Validate the inputs args to dimm-control function and return '0' if valid.
> >> + * This also does initial sanity validation to ND_CMD_CALL sub-command packages.
> >> + */
> >> +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >> +		       unsigned int buf_len)
> >> +{
> >> +	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
> >> +	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
> >> +	struct papr_scm_priv *p;
> >> +
> >> +	/* Only dimm-specific calls are supported atm */
> >> +	if (!nvdimm)
> >> +		return -EINVAL;
> >> +
> >> +	/* get the provider date from struct nvdimm */
> >
> > s/date/data
> Thanks for point this out. Will fix this in next iteration.
> 
> >
> >> +	p = nvdimm_provider_data(nvdimm);
> >> +
> >> +	if (!test_bit(cmd, &cmd_mask)) {
> >> +		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
> >> +		return -EINVAL;
> >> +	} else if (cmd == ND_CMD_CALL) {
> >> +
> >> +		/* Verify the envelope package */
> >> +		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
> >> +			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
> >> +				buf_len);
> >> +			return -EINVAL;
> >> +		}
> >> +
> >> +		/* Verify that the PDSM family is valid */
> >> +		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
> >> +			dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n",
> >> +				pkg->hdr.nd_family);
> >> +			return -EINVAL;
> >> +
> >> +		}
> >> +
> >> +		/* We except a payload with all PDSM commands */
> >> +		if (pdsm_cmd_to_payload(pkg) == NULL) {
> >> +			dev_dbg(&p->pdev->dev,
> >> +				"Empty payload for sub-command=0x%llx\n",
> >> +				pkg->hdr.nd_command);
> >> +			return -EINVAL;
> >> +		}
> >> +	}
> >> +
> >> +	/* Command looks valid */
> >
> > I assume the first command to be implemented also checks the { nd_command,
> > payload_version, payload length } for correctness?
> Yes the pdsm service functions do check the payload_version and
> payload_length. Please see the papr_pdsm_health() that services the
> PAPR_PDSM_HEALTH pdsm in Patch-5
> 

cool.

> >
> >> +	return 0;
> >> +}
> >> +
> >> +static int papr_scm_service_pdsm(struct papr_scm_priv *p,
> >> +				struct nd_pdsm_cmd_pkg *call_pkg)
> >> +{
> >> +	/* unknown subcommands return error in packages */
> >> +	if (call_pkg->hdr.nd_command <= PAPR_PDSM_MIN ||
> >> +	    call_pkg->hdr.nd_command >= PAPR_PDSM_MAX) {
> >> +		dev_dbg(&p->pdev->dev, "Invalid PDSM request 0x%llx\n",
> >> +			call_pkg->hdr.nd_command);
> >> +		call_pkg->cmd_status = -EINVAL;
> >> +		return 0;
> >> +	}
> >> +
> >> +	/* Depending on the DSM command call appropriate service routine */
> >> +	switch (call_pkg->hdr.nd_command) {
> >> +	default:
> >> +		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
> >> +			call_pkg->hdr.nd_command);
> >> +		call_pkg->cmd_status = -ENOENT;
> >> +		return 0;
> >> +	}
> >> +}
> >> +
> >>  static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >>  			  struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >>  			  unsigned int buf_len, int *cmd_rc)
> >>  {
> >>  	struct nd_cmd_get_config_size *get_size_hdr;
> >>  	struct papr_scm_priv *p;
> >> +	struct nd_pdsm_cmd_pkg *call_pkg = NULL;
> >> +	int rc;
> >>  
> >> -	/* Only dimm-specific calls are supported atm */
> >> -	if (!nvdimm)
> >> -		return -EINVAL;
> >> +	/* Use a local variable in case cmd_rc pointer is NULL */
> >> +	if (cmd_rc == NULL)
> >> +		cmd_rc = &rc;
> >
> > Why is this needed?  AFAICT The caller of papr_scm_ndctl does not specify null
> > and you did not change it.
> This pointer is coming from outside the papr_scm code hence need to be
> defensive here. Also as per[1] cmd_rc is "translation of firmware status"
> and not every caller would need it hence making this pointer optional.
> 
> This is evident in acpi_nfit_blk_get_flags() where the 'nd_desc->ndctl'
> is called with 'cmd_rc == NULL'.
> 
> [1] https://lore.kernel.org/linux-nvdimm/CAPcyv4hE_FG0YZXJVA1G=CBq8b9e0K54jxk5Sq5UKU-dnWT2Kg@mail.gmail.com/

Ah... Ok.  So this is a bug fix which needs to happen regardless of the status
of this patch...

> 
> >
> >> +
> >> +	*cmd_rc = is_cmd_valid(nvdimm, cmd, buf, buf_len);
> >> +	if (*cmd_rc) {
> >> +		pr_debug("Invalid cmd=0x%x. Err=%d\n", cmd, *cmd_rc);
> >> +		return *cmd_rc;
> >> +	}
> >>  
> >>  	p = nvdimm_provider_data(nvdimm);
> >>  
> >> @@ -381,13 +464,19 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >>  		*cmd_rc = papr_scm_meta_set(p, buf);

... Because this will break here. even without this new code...  right?

Lets get this fix in as a prelim-patch.

> >>  		break;
> >>  
> >> +	case ND_CMD_CALL:
> >> +		call_pkg = nd_to_pdsm_cmd_pkg(buf);
> >> +		*cmd_rc = papr_scm_service_pdsm(p, call_pkg);
> >> +		break;
> >> +
> >>  	default:
> >> -		return -EINVAL;
> >> +		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
> >> +		*cmd_rc = -EINVAL;
> >
> > Is this change related?  If there is a bug where there is a caller of
> > papr_scm_ndctl() with cmd_rc == NULL this should be a separate patch to fix
> > that issue.
> This simplifies a bit debugging of errors reported in
> papr_scm_ndctl() as it ensures that subsequest dev_dbg "Returned with
> cmd_rc" is always logged.
> 
> I think, this is a too small change to be carved out as an independent
> patch. Also this doesnt change the behaviour of the code except logging
> some more error info.
> 
> However, If you feel too strongly about it I will spin a separate patch
> in this patch series for this.

This can go in as part of a 'protect against cmd_rc == NULL' preliminary patch.

I flagged this because at first I could not figure out what this had to do with
the ND_CMD_CALL...

For reviewers you want to make your patches concise to what you are
fixing/adding...

Also, based on acpi_nfit_blk_get_flags() using cmd_rc == NULL it looks like we
have a bug which needs to get fixed regardless of the this patch.  And if that
bug exists in earlier kernels you will need a separate patch to backport as a
fix.

So lets get that in first and separate...  :-D

Ira

> 
> >
> > Ira
> >
> >>  	}
> >>  
> >>  	dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
> >>  
> >> -	return 0;
> >> +	return *cmd_rc;
> >>  }
> >>  
> >>  static ssize_t flags_show(struct device *dev,
> >> diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
> >> index de5d90212409..0e09dc5cec19 100644
> >> --- a/include/uapi/linux/ndctl.h
> >> +++ b/include/uapi/linux/ndctl.h
> >> @@ -244,6 +244,7 @@ struct nd_cmd_pkg {
> >>  #define NVDIMM_FAMILY_HPE2 2
> >>  #define NVDIMM_FAMILY_MSFT 3
> >>  #define NVDIMM_FAMILY_HYPERV 4
> >> +#define NVDIMM_FAMILY_PAPR 5
> >>  
> >>  #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
> >>  					struct nd_cmd_pkg)
> >> -- 
> >> 2.26.2
> >> 
> 
> -- 
> Cheers
> ~ Vaibhav

^ permalink raw reply

* Re: Re: [RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
From: Scott Branden @ 2020-06-03 21:32 UTC (permalink / raw)
  To: Bhupesh Sharma
  Cc: Mark Rutland, x86, Linux Doc Mailing List, Catalin Marinas,
	Ard Biesheuvel, kexec mailing list, Linux Kernel Mailing List,
	linuxppc-dev, Kazuhito Hagio, James Morse, Dave Anderson,
	Bhupesh SHARMA, Will Deacon, linux-arm-kernel, Steve Capper
In-Reply-To: <CACi5LpPXdcJ7AmFWiSyM8rG_+7C=wTqiP0oCa9QAPe0Y0_wH=Q@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1113 bytes --]

Hi Bhupesh,

On Wed, 3 Jun 2020 at 13:39, Bhupesh Sharma <bhsharma@redhat.com> wrote:

> Hello Scott,
>
> On Thu, Jun 4, 2020 at 12:17 AM Scott Branden
> <scott.branden@broadcom.com> wrote:
> >
> > Hi Bhupesh,
> >
> > Would be great to get this patch series upstreamed?
> >
> > On 2019-12-25 10:49 a.m., Bhupesh Sharma wrote:
> > > Hi James,
> > >
> > > On 12/12/2019 04:02 PM, James Morse wrote:
> > >> Hi Bhupesh,
> > >
> > > I am sorry this review mail skipped my attention due to holidays and
> > > focus on other urgent issues.
> > >
>
<SNIP>

> > >
> > > Ok, got it. Fixed this in v6, which should be on its way shortly.
> > I can't seem to find v6?
>
> Oops. I remember Cc'ing you to the v6 patchset (may be my email client
> messed up), anyways here is the v6 patchset for your reference:
> <http://lists.infradead.org/pipermail/kexec/2020-May/025095.html>
>
> Do share your review/test comments on the same.
>
I found the cover letter but didn't get the patches.  Thanks for sending
link.
v5 worked fine for us. Will get someone to try out v6 and let you know.

>
> Thanks,
> Bhupesh
>
> Regards,
 Scott

[-- Attachment #2: Type: text/html, Size: 2119 bytes --]

^ permalink raw reply

* [PATCH] net: ethernet: freescale: remove unneeded include for ucc_geth
From: Valentin Longchamp @ 2020-06-03 21:28 UTC (permalink / raw)
  To: netdev; +Cc: kuba, Valentin Longchamp, linuxppc-dev, leoyang.li

net/sch_generic.h does not need to be included, remove it.

Signed-off-by: Valentin Longchamp <valentin@longchamp.me>
---
 drivers/net/ethernet/freescale/ucc_geth.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index 552e7554a9f8..db791f60b884 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -42,7 +42,6 @@
 #include <soc/fsl/qe/ucc.h>
 #include <soc/fsl/qe/ucc_fast.h>
 #include <asm/machdep.h>
-#include <net/sch_generic.h>
 
 #include "ucc_geth.h"
 
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH] arch/{mips,sparc,microblaze,powerpc}: Don't enable pagefault/preempt twice
From: Ira Weiny @ 2020-06-03 21:14 UTC (permalink / raw)
  To: Andrew Morton, Mike Rapoport, Guenter Roeck
  Cc: Peter Zijlstra, Dave Hansen, dri-devel, linux-mips,
	James E.J. Bottomley, Max Filippov, Paul Mackerras,
	H. Peter Anvin, sparclinux, Thomas Gleixner, Helge Deller, x86,
	linux-csky, Christoph Hellwig, Ingo Molnar, linux-snps-arc,
	linux-xtensa, Borislav Petkov, Al Viro, Andy Lutomirski,
	Dan Williams, linux-arm-kernel, Chris Zankel, Thomas Bogendoerfer,
	linux-parisc, linux-kernel, Christian Koenig, linuxppc-dev,
	David S. Miller
In-Reply-To: <20200603135736.e7b5ded0082a81ae6d9067a0@linux-foundation.org>

On Wed, Jun 03, 2020 at 01:57:36PM -0700, Andrew Morton wrote:
> On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:
> 
> > > > 
> > > > Actually it occurs to me that the patch consolidating kmap_prot is odd for
> > > > sparc 32 bit...
> > > > 
> > > > Its a long shot but could you try reverting this patch?
> > > > 
> > > > 4ea7d2419e3f kmap: consolidate kmap_prot definitions
> > > > 
> > > 
> > > That is not easy to revert, unfortunately, due to several follow-up patches.
> > 
> > I have gotten your sparc tests to run and they all pass...
> > 
> > 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
> > Build reference: v5.7-rc4-17-g852b6f2edc0f
> > 
> > Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
> > Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
> > Building sparc32:SS-4:nosmp:initrd ... running ......... passed
> > Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
> > Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
> > Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
> > Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
> > Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
> > Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
> > Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
> > Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
> > 
> > Is there another test I need to run?
> 
> This all petered out, but as I understand it, this patchset still might
> have issues on various architectures.
> 
> Can folks please provide an update on the testing status?

I believe the tests were failing for Guenter due to another patch set...[1]

My tests with just this series are working.

From my understanding the other failures were unrelated.[2]

	<quote Mike Rapoport>
	I've checked the patch above on top of the mmots which already has
	Ira's patches and it booted fine. I've used sparc32_defconfig to build
	the kernel and qemu-system-sparc with default machine and CPU.
	</quote>

Mike, am I wrong?  Do you think the kmap() patches are still causing issues?

Ira

[1] https://lore.kernel.org/lkml/2807E5FD2F6FDA4886F6618EAC48510E92EAB1DE@CRSMSX101.amr.corp.intel.com/
[2] https://lore.kernel.org/lkml/20200520195110.GH1118872@kernel.org/


^ permalink raw reply

* Re: [PATCH] arch/{mips,sparc,microblaze,powerpc}: Don't enable pagefault/preempt twice
From: Andrew Morton @ 2020-06-03 20:57 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Peter Zijlstra, Dave Hansen, dri-devel, linux-mips,
	James E.J. Bottomley, Max Filippov, Paul Mackerras,
	H. Peter Anvin, sparclinux, Thomas Gleixner, Helge Deller, x86,
	linux-csky, Christoph Hellwig, Ingo Molnar, linux-snps-arc,
	Guenter Roeck, linux-xtensa, Borislav Petkov, Al Viro,
	Andy Lutomirski, Dan Williams, linux-arm-kernel, Chris Zankel,
	Thomas Bogendoerfer, linux-parisc, linux-kernel, Christian Koenig,
	linuxppc-dev, David S. Miller
In-Reply-To: <20200521174250.GB176262@iweiny-DESK2.sc.intel.com>

On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:

> > > 
> > > Actually it occurs to me that the patch consolidating kmap_prot is odd for
> > > sparc 32 bit...
> > > 
> > > Its a long shot but could you try reverting this patch?
> > > 
> > > 4ea7d2419e3f kmap: consolidate kmap_prot definitions
> > > 
> > 
> > That is not easy to revert, unfortunately, due to several follow-up patches.
> 
> I have gotten your sparc tests to run and they all pass...
> 
> 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
> Build reference: v5.7-rc4-17-g852b6f2edc0f
> 
> Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
> Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
> Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
> Building sparc32:SS-4:nosmp:initrd ... running ......... passed
> Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
> Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
> Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
> Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
> Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
> Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
> Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
> Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
> Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
> Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
> Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
> 
> Is there another test I need to run?

This all petered out, but as I understand it, this patchset still might
have issues on various architectures.

Can folks please provide an update on the testing status?

^ permalink raw reply

* Re: Re: [RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
From: Bhupesh Sharma @ 2020-06-03 20:38 UTC (permalink / raw)
  To: Scott Branden
  Cc: Mark Rutland, x86, Linux Doc Mailing List, Catalin Marinas,
	Ard Biesheuvel, kexec mailing list, Linux Kernel Mailing List,
	linuxppc-dev, Kazuhito Hagio, James Morse, Dave Anderson,
	Bhupesh SHARMA, Will Deacon, linux-arm-kernel, Steve Capper
In-Reply-To: <51606585-77a3-265f-1d4e-19f25a90697d@broadcom.com>

Hello Scott,

On Thu, Jun 4, 2020 at 12:17 AM Scott Branden
<scott.branden@broadcom.com> wrote:
>
> Hi Bhupesh,
>
> Would be great to get this patch series upstreamed?
>
> On 2019-12-25 10:49 a.m., Bhupesh Sharma wrote:
> > Hi James,
> >
> > On 12/12/2019 04:02 PM, James Morse wrote:
> >> Hi Bhupesh,
> >
> > I am sorry this review mail skipped my attention due to holidays and
> > focus on other urgent issues.
> >
> >> On 29/11/2019 19:59, Bhupesh Sharma wrote:
> >>> Add documentation for TCR_EL1.T1SZ variable being added to
> >>> vmcoreinfo.
> >>>
> >>> It indicates the size offset of the memory region addressed by
> >>> TTBR1_EL1
> >>
> >>> and hence can be used for determining the vabits_actual value.
> >>
> >> used for determining random-internal-kernel-variable, that might not
> >> exist tomorrow.
> >>
> >> Could you describe how this is useful/necessary if a debugger wants
> >> to walk the page
> >> tables from the core file? I think this is a better argument.
> >>
> >> Wouldn't the documentation be better as part of the patch that adds
> >> the export?
> >> (... unless these have to go via different trees? ..)
> >
> > Ok, will fix the same in v6 version.
> >
> >>> diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> index 447b64314f56..f9349f9d3345 100644
> >>> --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> @@ -398,6 +398,12 @@ KERNELOFFSET
> >>>   The kernel randomization offset. Used to compute the page offset. If
> >>>   KASLR is disabled, this value is zero.
> >>>   +TCR_EL1.T1SZ
> >>> +------------
> >>> +
> >>> +Indicates the size offset of the memory region addressed by TTBR1_EL1
> >>
> >>> +and hence can be used for determining the vabits_actual value.
> >>
> >> 'vabits_actual' may not exist when the next person comes to read this
> >> documentation (its
> >> going to rot really quickly).
> >>
> >> I think the first half of this text is enough to say what this is
> >> for. You should include
> >> words to the effect that its the hardware value that goes with
> >> swapper_pg_dir. You may
> >> want to point readers to the arm-arm for more details on what the
> >> value means.
> >
> > Ok, got it. Fixed this in v6, which should be on its way shortly.
> I can't seem to find v6?

Oops. I remember Cc'ing you to the v6 patchset (may be my email client
messed up), anyways here is the v6 patchset for your reference:
<http://lists.infradead.org/pipermail/kexec/2020-May/025095.html>

Do share your review/test comments on the same.

Thanks,
Bhupesh


^ permalink raw reply

* [PATCH] ibmvscsi: don't send host info in adapter info MAD after LPM
From: Tyrel Datwyler @ 2020-06-03 20:36 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, brking, linuxppc-dev, linux-scsi, martin.petersen

The adatper info MAD is used to send the client info and receive the
host info as a response. A peristent buffer is used and as such the
client info is overwritten after the response. During the course of
a normal adapter reset the client info is refreshed in the buffer in
preparation for sending the adapter info MAD.

However, in the special case of LPM where we reenable the CRQ instead
of a full CRQ teardown and reset we fail to refresh the client info in
the adapter info buffer. As a result after Live Partition Migration
(LPM) we erroneously report the hosts info as our own.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 59f0f1030c54..c5711c659b51 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -415,6 +415,8 @@ static int ibmvscsi_reenable_crq_queue(struct crq_queue *queue,
 	int rc = 0;
 	struct vio_dev *vdev = to_vio_dev(hostdata->dev);
 
+	set_adapter_info(hostdata);
+
 	/* Re-enable the CRQ */
 	do {
 		if (rc)
-- 
2.26.1


^ permalink raw reply related

* Re: [RESEND PATCH v9 4/5] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
From: Vaibhav Jain @ 2020-06-03 20:28 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Aneesh Kumar K . V, linuxppc-dev, linux-kernel, Steven Rostedt,
	linux-nvdimm
In-Reply-To: <20200602210553.GG1505637@iweiny-DESK2.sc.intel.com>

Hi Ira,

Thanks again for reviewing this patch. My Response below:

Ira Weiny <ira.weiny@intel.com> writes:

> On Tue, Jun 02, 2020 at 01:51:49PM -0700, 'Ira Weiny' wrote:
>> On Tue, Jun 02, 2020 at 03:44:37PM +0530, Vaibhav Jain wrote:
>
> ...
>
>> > +
>> > +/*
>> > + * PDSM Envelope:
>> > + *
>> > + * The ioctl ND_CMD_CALL transfers data between user-space and kernel via
>> > + * envelope which consists of a header and user-defined payload sections.
>> > + * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
>> > + * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload' field.
>> > + * There is reserved field that can used to introduce new fields to the
>> > + * structure in future. It also tries to ensure that 'nd_pdsm_cmd_pkg.payload'
>> > + * lies at a 8-byte boundary.
>> > + *
>> > + *  +-------------+---------------------+---------------------------+
>> > + *  |   64-Bytes  |       16-Bytes      |       Max 176-Bytes       |
>> > + *  +-------------+---------------------+---------------------------+
>> > + *  |               nd_pdsm_cmd_pkg     |                           |
>> > + *  |-------------+                     |                           |
>> > + *  |  nd_cmd_pkg |                     |                           |
>> > + *  +-------------+---------------------+---------------------------+
>> > + *  | nd_family   |                     |                           |
>> > + *  | nd_size_out | cmd_status          |                           |
>> > + *  | nd_size_in  | payload_version     |     payload               |
>> > + *  | nd_command  | reserved            |                           |
>> > + *  | nd_fw_size  |                     |                           |
>> > + *  +-------------+---------------------+---------------------------+
>
> One more comment WRT nd_size_[in|out].  I know that it is defined as the size
> of the FW payload but normally when you nest headers 'size' in Header A
> represents everything after Header A, including Header B.  In this case that
> would be including nd_pdsm_cmd_pkg...
>
> It looks like that is not what you have done?  Or perhaps I missed it?
>
Not sure if I understand the question correctly.
'struct nd_pdsm_cmd_pkg' contains 'struct nd_cmd_pkg' at its head and
its size_[in|out] are populated by the libndctl in userspace, setting
them to data following the 'struct nd_cmd_pkg'.

Copying of 'struct nd_cmd_pkg' to the input/out envelop is implicitly
done in __nd_ioctl via the command descriptor array
__nd_cmd_bus_descs. So I dont need to add the size of 'struct
nd_cmd_pkg' to nd_size_[in|out].

> Ira
>
>> > + *
>> > + * PDSM Header:
>> > + *
>> > + * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
>> > + * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to member
>> > + * 'nd_cmd_pkg.nd_command'. Apart from size information of the envelope which is
>> > + * contained in 'struct nd_cmd_pkg', the header also has members following 
>>                                                              ^^^^^
>>                                                         ...  the  ...
>> 
>> > + * members:
>> > + *
>> > + * 'cmd_status'		: (Out) Errors if any encountered while servicing PDSM.
>> > + * 'payload_version'	: (In/Out) Version number associated with the payload.
>> > + * 'reserved'		: Not used and reserved for future.
>> > + *
>> > + * PDSM Payload:
>> > + *
>> > + * The layout of the PDSM Payload is defined by various structs shared between
>> > + * papr_scm and libndctl so that contents of payload can be interpreted. During
>> > + * servicing of a PDSM the papr_scm module will read input args from the payload
>> > + * field by casting its contents to an appropriate struct pointer based on the
>> > + * PDSM command. Similarly the output of servicing the PDSM command will be
>> > + * copied to the payload field using the same struct.
>> > + *
>> > + * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope size, which
>> > + * leaves around 176 bytes for the envelope payload (ignoring any padding that
>> > + * the compiler may silently introduce).
>> > + *
>> > + * Payload Version:
>> > + *
>> > + * A 'payload_version' field is present in PDSM header that indicates a specific
>> > + * version of the structure present in PDSM Payload for a given PDSM command.
>> > + * This provides backward compatibility in case the PDSM Payload structure
>> > + * evolves and different structures are supported by 'papr_scm' and 'libndctl'.
>> > + *
>> > + * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the version
>> > + * of the payload struct it supports via 'payload_version' field. The 'papr_scm'
>> > + * module when servicing the PDSM envelope checks the 'payload_version' and then
>> > + * uses 'payload struct version' == MIN('payload_version field',
>> > + * 'max payload-struct-version supported by papr_scm') to service the PDSM.
>> > + * After servicing the PDSM, 'papr_scm' put the negotiated version of payload
>> > + * struct in returned 'payload_version' field.
>> > + *
>> > + * Libndctl on receiving the envelope back from papr_scm again checks the
>> > + * 'payload_version' field and based on it use the appropriate version dsm
>> > + * struct to parse the results.
>> > + *
>> > + * Backward Compatibility:
>> > + *
>> > + * Above scheme of exchanging different versioned PDSM struct between libndctl
>> > + * and papr_scm should provide backward compatibility until following two
>> > + * assumptions/conditions when defining new PDSM structs hold:
>> > + *
>> > + * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
>> > + *
>> > + * 1. T(X) is a proper subset of T(Y) if Y > X.
>> > + *    i.e Each new version of PDSM struct should retain existing struct
>> > + *    attributes from previous version
>> > + *
>> > + * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
>> > + *    it should also support T(1), T(2)...T(X - 1).
>> > + *    i.e When adding support for new version of a PDSM struct, libndctl
>> > + *    and papr_scm should retain support of the existing PDSM struct
>> > + *    version they support.
>> 
>> Please see this thread for an example why versions are a bad idea in UAPIs:
>> 
>> https://lkml.org/lkml/2020/3/26/213
>> 
>> While the use of version is different in that thread the fundamental issues are
>> the same.  You end up with some weird matrix of supported features and
>> structure definitions.  For example, you are opening up the possibility of
>> changing structures with a different version for no good reason.
>> 
>> Also having the user query with version Z and get back version X (older) is
>> odd.  Generally if the kernel does not know about a feature (ie version Z of
>> the structure) it should return -EINVAL and let the user figure out what to do.
>> The user may just give up or they could try a different query.
>> 
>> > + */
>> > +
>> > +/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm */
>> > +struct nd_pdsm_cmd_pkg {
>> > +	struct nd_cmd_pkg hdr;	/* Package header containing sub-cmd */
>> > +	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
>> > +	__u16 reserved[5];	/* Ignored and to be used in future */
>> 
>> How do you know when reserved is used for something else in the future?  Is
>> reserved guaranteed (and checked by the code) to be 0?
>> 
>> > +	__u16 payload_version;	/* In/Out: version of the payload */
>> 
>> Why is payload_version after reserved?
>> 
>> > +	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
>> > +} __packed;
>> > +
>> > +/*
>> > + * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
>> > + * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
>> > + */
>> > +enum papr_pdsm {
>> > +	PAPR_PDSM_MIN = 0x0,
>> > +	PAPR_PDSM_MAX,
>> > +};
>> > +
>> > +/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */
>> > +static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg *cmd)
>> > +{
>> > +	return (struct nd_pdsm_cmd_pkg *) cmd;
>> > +}
>> > +
>> > +/* Return the payload pointer for a given pcmd */
>> > +static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
>> > +{
>> > +	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
>> > +		return NULL;
>> > +	else
>> > +		return (void *)(pcmd->payload);
>> > +}
>> > +
>> > +#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
>> > diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>> > index 149431594839..5e2237e7ec08 100644
>> > --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> > +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> > @@ -15,13 +15,15 @@
>> >  #include <linux/seq_buf.h>
>> >  
>> >  #include <asm/plpar_wrappers.h>
>> > +#include <asm/papr_pdsm.h>
>> >  
>> >  #define BIND_ANY_ADDR (~0ul)
>> >  
>> >  #define PAPR_SCM_DIMM_CMD_MASK \
>> >  	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
>> >  	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
>> > -	 (1ul << ND_CMD_SET_CONFIG_DATA))
>> > +	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
>> > +	 (1ul << ND_CMD_CALL))
>> >  
>> >  /* DIMM health bitmap bitmap indicators */
>> >  /* SCM device is unable to persist memory contents */
>> > @@ -350,16 +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
>> >  	return 0;
>> >  }
>> >  
>> > +/*
>> > + * Validate the inputs args to dimm-control function and return '0' if valid.
>> > + * This also does initial sanity validation to ND_CMD_CALL sub-command packages.
>> > + */
>> > +static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
>> > +		       unsigned int buf_len)
>> > +{
>> > +	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
>> > +	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
>> > +	struct papr_scm_priv *p;
>> > +
>> > +	/* Only dimm-specific calls are supported atm */
>> > +	if (!nvdimm)
>> > +		return -EINVAL;
>> > +
>> > +	/* get the provider date from struct nvdimm */
>> 
>> s/date/data
>> 
>> > +	p = nvdimm_provider_data(nvdimm);
>> > +
>> > +	if (!test_bit(cmd, &cmd_mask)) {
>> > +		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
>> > +		return -EINVAL;
>> > +	} else if (cmd == ND_CMD_CALL) {
>> > +
>> > +		/* Verify the envelope package */
>> > +		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
>> > +			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
>> > +				buf_len);
>> > +			return -EINVAL;
>> > +		}
>> > +
>> > +		/* Verify that the PDSM family is valid */
>> > +		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
>> > +			dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n",
>> > +				pkg->hdr.nd_family);
>> > +			return -EINVAL;
>> > +
>> > +		}
>> > +
>> > +		/* We except a payload with all PDSM commands */
>> > +		if (pdsm_cmd_to_payload(pkg) == NULL) {
>> > +			dev_dbg(&p->pdev->dev,
>> > +				"Empty payload for sub-command=0x%llx\n",
>> > +				pkg->hdr.nd_command);
>> > +			return -EINVAL;
>> > +		}
>> > +	}
>> > +
>> > +	/* Command looks valid */
>> 
>> I assume the first command to be implemented also checks the { nd_command,
>> payload_version, payload length } for correctness?
>> 
>> > +	return 0;
>> > +}
>> > +
>> > +static int papr_scm_service_pdsm(struct papr_scm_priv *p,
>> > +				struct nd_pdsm_cmd_pkg *call_pkg)
>> > +{
>> > +	/* unknown subcommands return error in packages */
>> > +	if (call_pkg->hdr.nd_command <= PAPR_PDSM_MIN ||
>> > +	    call_pkg->hdr.nd_command >= PAPR_PDSM_MAX) {
>> > +		dev_dbg(&p->pdev->dev, "Invalid PDSM request 0x%llx\n",
>> > +			call_pkg->hdr.nd_command);
>> > +		call_pkg->cmd_status = -EINVAL;
>> > +		return 0;
>> > +	}
>> > +
>> > +	/* Depending on the DSM command call appropriate service routine */
>> > +	switch (call_pkg->hdr.nd_command) {
>> > +	default:
>> > +		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
>> > +			call_pkg->hdr.nd_command);
>> > +		call_pkg->cmd_status = -ENOENT;
>> > +		return 0;
>> > +	}
>> > +}
>> > +
>> >  static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
>> >  			  struct nvdimm *nvdimm, unsigned int cmd, void *buf,
>> >  			  unsigned int buf_len, int *cmd_rc)
>> >  {
>> >  	struct nd_cmd_get_config_size *get_size_hdr;
>> >  	struct papr_scm_priv *p;
>> > +	struct nd_pdsm_cmd_pkg *call_pkg = NULL;
>> > +	int rc;
>> >  
>> > -	/* Only dimm-specific calls are supported atm */
>> > -	if (!nvdimm)
>> > -		return -EINVAL;
>> > +	/* Use a local variable in case cmd_rc pointer is NULL */
>> > +	if (cmd_rc == NULL)
>> > +		cmd_rc = &rc;
>> 
>> Why is this needed?  AFAICT The caller of papr_scm_ndctl does not specify null
>> and you did not change it.
>> 
>> > +
>> > +	*cmd_rc = is_cmd_valid(nvdimm, cmd, buf, buf_len);
>> > +	if (*cmd_rc) {
>> > +		pr_debug("Invalid cmd=0x%x. Err=%d\n", cmd, *cmd_rc);
>> > +		return *cmd_rc;
>> > +	}
>> >  
>> >  	p = nvdimm_provider_data(nvdimm);
>> >  
>> > @@ -381,13 +464,19 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
>> >  		*cmd_rc = papr_scm_meta_set(p, buf);
>> >  		break;
>> >  
>> > +	case ND_CMD_CALL:
>> > +		call_pkg = nd_to_pdsm_cmd_pkg(buf);
>> > +		*cmd_rc = papr_scm_service_pdsm(p, call_pkg);
>> > +		break;
>> > +
>> >  	default:
>> > -		return -EINVAL;
>> > +		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
>> > +		*cmd_rc = -EINVAL;
>> 
>> Is this change related?  If there is a bug where there is a caller of
>> papr_scm_ndctl() with cmd_rc == NULL this should be a separate patch to fix
>> that issue.
>> 
>> Ira
>> 
>> >  	}
>> >  
>> >  	dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
>> >  
>> > -	return 0;
>> > +	return *cmd_rc;
>> >  }
>> >  
>> >  static ssize_t flags_show(struct device *dev,
>> > diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
>> > index de5d90212409..0e09dc5cec19 100644
>> > --- a/include/uapi/linux/ndctl.h
>> > +++ b/include/uapi/linux/ndctl.h
>> > @@ -244,6 +244,7 @@ struct nd_cmd_pkg {
>> >  #define NVDIMM_FAMILY_HPE2 2
>> >  #define NVDIMM_FAMILY_MSFT 3
>> >  #define NVDIMM_FAMILY_HYPERV 4
>> > +#define NVDIMM_FAMILY_PAPR 5
>> >  
>> >  #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
>> >  					struct nd_cmd_pkg)
>> > -- 
>> > 2.26.2
>> > 
>> _______________________________________________
>> Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
>> To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

-- 
Cheers
~ Vaibhav

^ permalink raw reply

* [powerpc:topic/ppc-kvm] BUILD SUCCESS bf8036a4098d1548cdccf9ed5c523ef4e83e3c68
From: kernel test robot @ 2020-06-03 19:11 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  topic/ppc-kvm
branch HEAD: bf8036a4098d1548cdccf9ed5c523ef4e83e3c68  powerpc/book3s64/kvm: Fix secondary page table walk warning during migration

elapsed time: 7946m

configs tested: 246
configs skipped: 17

The following configs have been built successfully.
More configs may be tested in the coming days.

arm                                 defconfig
arm                              allyesconfig
arm                              allmodconfig
arm                               allnoconfig
arm64                            allyesconfig
arm64                               defconfig
arm64                            allmodconfig
arm64                             allnoconfig
sparc                       sparc64_defconfig
mips                  decstation_64_defconfig
mips                          ath79_defconfig
powerpc                      pasemi_defconfig
arm                         cm_x300_defconfig
arm64                            alldefconfig
arm                   milbeaut_m10v_defconfig
arm                          ep93xx_defconfig
x86_64                              defconfig
s390                                defconfig
sh                           cayman_defconfig
mips                     loongson1b_defconfig
sh                        sh7763rdp_defconfig
ia64                              allnoconfig
sh                           se7619_defconfig
mips                         tb0287_defconfig
arm                       mainstone_defconfig
mips                      bmips_stb_defconfig
parisc                generic-64bit_defconfig
mips                         tb0226_defconfig
h8300                               defconfig
arm                            hisi_defconfig
powerpc                     mpc83xx_defconfig
m68k                          multi_defconfig
m68k                             allyesconfig
nds32                            alldefconfig
sh                         microdev_defconfig
powerpc                       holly_defconfig
powerpc                mpc7448_hpc2_defconfig
arm                        shmobile_defconfig
mips                           xway_defconfig
arm                          ixp4xx_defconfig
sh                           se7780_defconfig
m68k                            q40_defconfig
sh                           se7724_defconfig
mips                malta_qemu_32r6_defconfig
mips                            e55_defconfig
mips                       lemote2f_defconfig
h8300                       h8s-sim_defconfig
m68k                       m5475evb_defconfig
arm                      pxa255-idp_defconfig
arm                     am200epdkit_defconfig
mips                      maltaaprp_defconfig
powerpc                     pseries_defconfig
arm                            dove_defconfig
h8300                            alldefconfig
arm                            pleb_defconfig
sh                             espt_defconfig
arm                           omap1_defconfig
arm                       spear13xx_defconfig
sparc64                          allyesconfig
microblaze                    nommu_defconfig
arm                           corgi_defconfig
arm                         bcm2835_defconfig
arm                            u300_defconfig
arm                          simpad_defconfig
arm                         s3c6400_defconfig
m68k                        mvme16x_defconfig
powerpc                       ppc64_defconfig
mips                      pistachio_defconfig
c6x                         dsk6455_defconfig
riscv                    nommu_virt_defconfig
sh                        apsh4ad0a_defconfig
sh                           se7343_defconfig
nios2                            alldefconfig
nds32                               defconfig
mips                      pic32mzda_defconfig
xtensa                          iss_defconfig
arm                           efm32_defconfig
nios2                         3c120_defconfig
m68k                       m5208evb_defconfig
sh                         apsh4a3a_defconfig
sh                          lboxre2_defconfig
arm                    vt8500_v6_v7_defconfig
arm                         hackkit_defconfig
mips                     cu1000-neo_defconfig
arc                        vdk_hs38_defconfig
mips                       capcella_defconfig
arm                        trizeps4_defconfig
powerpc                      tqm8xx_defconfig
sh                           se7750_defconfig
xtensa                         virt_defconfig
sh                            shmin_defconfig
arm                        multi_v5_defconfig
arm                         lpc18xx_defconfig
arm                              zx_defconfig
ia64                             alldefconfig
mips                        nlm_xlp_defconfig
i386                              allnoconfig
i386                             allyesconfig
i386                                defconfig
i386                              debian-10.3
ia64                             allmodconfig
ia64                                defconfig
ia64                             allyesconfig
m68k                             allmodconfig
m68k                              allnoconfig
m68k                           sun3_defconfig
m68k                                defconfig
nios2                               defconfig
nios2                            allyesconfig
openrisc                            defconfig
c6x                              allyesconfig
c6x                               allnoconfig
openrisc                         allyesconfig
nds32                             allnoconfig
csky                             allyesconfig
csky                                defconfig
alpha                               defconfig
alpha                            allyesconfig
xtensa                           allyesconfig
h8300                            allyesconfig
h8300                            allmodconfig
xtensa                              defconfig
arc                                 defconfig
arc                              allyesconfig
sh                               allmodconfig
sh                                allnoconfig
microblaze                        allnoconfig
mips                             allyesconfig
mips                              allnoconfig
mips                             allmodconfig
parisc                            allnoconfig
parisc                              defconfig
parisc                           allyesconfig
parisc                           allmodconfig
powerpc                          allyesconfig
powerpc                          rhel-kconfig
powerpc                          allmodconfig
powerpc                           allnoconfig
powerpc                             defconfig
x86_64               randconfig-a002-20200529
x86_64               randconfig-a006-20200529
x86_64               randconfig-a005-20200529
x86_64               randconfig-a001-20200529
x86_64               randconfig-a004-20200529
x86_64               randconfig-a003-20200529
x86_64               randconfig-a002-20200601
x86_64               randconfig-a006-20200601
x86_64               randconfig-a001-20200601
x86_64               randconfig-a003-20200601
x86_64               randconfig-a004-20200601
x86_64               randconfig-a005-20200601
i386                 randconfig-a001-20200601
i386                 randconfig-a006-20200601
i386                 randconfig-a002-20200601
i386                 randconfig-a005-20200601
i386                 randconfig-a003-20200601
i386                 randconfig-a004-20200601
i386                 randconfig-a001-20200603
i386                 randconfig-a006-20200603
i386                 randconfig-a002-20200603
i386                 randconfig-a005-20200603
i386                 randconfig-a003-20200603
i386                 randconfig-a004-20200603
i386                 randconfig-a001-20200602
i386                 randconfig-a006-20200602
i386                 randconfig-a002-20200602
i386                 randconfig-a005-20200602
i386                 randconfig-a003-20200602
i386                 randconfig-a004-20200602
i386                 randconfig-a004-20200529
i386                 randconfig-a001-20200529
i386                 randconfig-a002-20200529
i386                 randconfig-a006-20200529
i386                 randconfig-a003-20200529
i386                 randconfig-a005-20200529
i386                 randconfig-a004-20200531
i386                 randconfig-a003-20200531
i386                 randconfig-a006-20200531
i386                 randconfig-a002-20200531
i386                 randconfig-a005-20200531
i386                 randconfig-a001-20200531
x86_64               randconfig-a002-20200603
x86_64               randconfig-a006-20200603
x86_64               randconfig-a001-20200603
x86_64               randconfig-a003-20200603
x86_64               randconfig-a004-20200603
x86_64               randconfig-a005-20200603
x86_64               randconfig-a011-20200531
x86_64               randconfig-a016-20200531
x86_64               randconfig-a012-20200531
x86_64               randconfig-a014-20200531
x86_64               randconfig-a013-20200531
x86_64               randconfig-a015-20200531
x86_64               randconfig-a011-20200602
x86_64               randconfig-a016-20200602
x86_64               randconfig-a013-20200602
x86_64               randconfig-a012-20200602
x86_64               randconfig-a014-20200602
x86_64               randconfig-a015-20200602
i386                 randconfig-a014-20200602
i386                 randconfig-a015-20200602
i386                 randconfig-a011-20200602
i386                 randconfig-a016-20200602
i386                 randconfig-a012-20200602
i386                 randconfig-a013-20200602
i386                 randconfig-a014-20200603
i386                 randconfig-a015-20200603
i386                 randconfig-a011-20200603
i386                 randconfig-a016-20200603
i386                 randconfig-a012-20200603
i386                 randconfig-a013-20200603
i386                 randconfig-a013-20200529
i386                 randconfig-a011-20200529
i386                 randconfig-a012-20200529
i386                 randconfig-a015-20200529
i386                 randconfig-a016-20200529
i386                 randconfig-a014-20200529
i386                 randconfig-a013-20200531
i386                 randconfig-a012-20200531
i386                 randconfig-a015-20200531
i386                 randconfig-a011-20200531
i386                 randconfig-a016-20200531
i386                 randconfig-a014-20200531
riscv                            allyesconfig
riscv                             allnoconfig
riscv                               defconfig
riscv                            allmodconfig
s390                             allyesconfig
s390                              allnoconfig
s390                             allmodconfig
sparc                            allyesconfig
sparc                               defconfig
sparc64                             defconfig
sparc64                           allnoconfig
sparc64                          allmodconfig
um                               allmodconfig
um                                allnoconfig
um                                  defconfig
um                               allyesconfig
x86_64                                   rhel
x86_64                               rhel-7.6
x86_64                    rhel-7.6-kselftests
x86_64                         rhel-7.2-clear
x86_64                                    lkp
x86_64                              fedora-25
x86_64                                  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply

* Re: [PATCH v4 08/14] powerpc: add support for folded p4d page tables
From: Andrew Morton @ 2020-06-03 19:05 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh, linux-mm,
	Paul Mackerras, linux-hexagon, Will Deacon, kvmarm, Jonas Bonn,
	linux-arch, Brian Cain, Marc Zyngier, Russell King, Ley Foon Tan,
	Mike Rapoport, Catalin Marinas, Julien Thierry, uclinux-h8-devel,
	Fenghua Yu, Arnd Bergmann, Suzuki K Poulose, kvm-ppc,
	Stefan Kristiansson, openrisc, Stafford Horne, Guan Xuetao,
	linux-arm-kernel, Christophe Leroy, Tony Luck, Yoshinori Sato,
	linux-kernel, James Morse, nios2-dev, linuxppc-dev
In-Reply-To: <20200414153455.21744-9-rppt@kernel.org>

On Tue, 14 Apr 2020 18:34:49 +0300 Mike Rapoport <rppt@kernel.org> wrote:

> Implement primitives necessary for the 4th level folding, add walks of p4d
> level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.

A bunch of new material just landed in linux-next/powerpc.

The timing is awkward!  I trust this will be going into mainline during
this merge window?  If not, please drop it and repull after -rc1.

arch/powerpc/mm/ptdump/ptdump.c:walk_pagetables() was a problem. 
Here's what I ended up with - please check.

static void walk_pagetables(struct pg_state *st)
{
	unsigned int i;
	unsigned long addr = st->start_address & PGDIR_MASK;
	pgd_t *pgd = pgd_offset_k(addr);

	/*
	 * Traverse the linux pagetable structure and dump pages that are in
	 * the hash pagetable.
	 */
	for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
		p4d_t *p4d = p4d_offset(pgd, 0);

		if (pgd_none(*pgd) || pgd_is_leaf(*pgd))
			note_page(st, addr, 1, p4d_val(*p4d), PGDIR_SIZE);
		else if (is_hugepd(__hugepd(p4d_val(*p4d))))
			walk_hugepd(st, (hugepd_t *)pgd, addr, PGDIR_SHIFT, 1);
		else
			/* pgd exists */
			walk_pud(st, p4d, addr);
	}
}

Mike's series "mm: consolidate definitions of page table accessors"
took quite a lot of damage as well.  Patches which needed rework as a
result of this were:

powerpc-add-support-for-folded-p4d-page-tables-fix.patch
mm-introduce-include-linux-pgtableh.patch
mm-reorder-includes-after-introduction-of-linux-pgtableh.patch
mm-pgtable-add-shortcuts-for-accessing-kernel-pmd-and-pte.patch
mm-pgtable-add-shortcuts-for-accessing-kernel-pmd-and-pte-fix-2.patch
mm-consolidate-pte_index-and-pte_offset_-definitions.patch
mm-consolidate-pmd_index-and-pmd_offset-definitions.patch
mm-consolidate-pud_index-and-pud_offset-definitions.patch
mm-consolidate-pgd_index-and-pgd_offset_k-definitions.patch


^ permalink raw reply

* Re: [RESEND PATCH v9 5/5] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
From: Vaibhav Jain @ 2020-06-03 19:04 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Santosh Sivaraj, linux-nvdimm, linux-kernel, Steven Rostedt,
	Oliver O'Halloran, Aneesh Kumar K . V, Dan Williams,
	linuxppc-dev
In-Reply-To: <20200602211901.GA1676657@iweiny-DESK2.sc.intel.com>

Hi Ira,

Thanks for reviewing this patch. My responses below:

Ira Weiny <ira.weiny@intel.com> writes:

> On Tue, Jun 02, 2020 at 03:44:38PM +0530, Vaibhav Jain wrote:
>> This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
>> that returns a newly introduced 'struct nd_papr_pdsm_health' instance
>> containing dimm health information back to user space in response to
>> ND_CMD_CALL. This functionality is implemented in newly introduced
>> papr_pdsm_health() that queries the nvdimm health information and
>> then copies this information to the package payload whose layout is
>> defined by 'struct nd_papr_pdsm_health'.
>> 
>> The patch also introduces a new member 'struct papr_scm_priv.health'
>> thats an instance of 'struct nd_papr_pdsm_health' to cache the health
>> information of a nvdimm. As a result functions drc_pmem_query_health()
>> and flags_show() are updated to populate and use this new struct
>> instead of a u64 integer that was earlier used.
>> 
>> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> ---
>> Changelog:
>> 
>> Resend:
>> * Added ack from Aneesh.
>> 
>> v8..v9:
>> * s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
>> * s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
>> * Renamed papr_scm_get_health() to papr_psdm_health()
>> * Updated patch description to replace papr-scm dimm with nvdimm.
>> 
>> v7..v8:
>> * None
>> 
>> Resend:
>> * None
>> 
>> v6..v7:
>> * Updated flags_show() to use seq_buf_printf(). [Mpe]
>> * Updated papr_scm_get_health() to use newly introduced
>>   __drc_pmem_query_health() bypassing the cache [Mpe].
>> 
>> v5..v6:
>> * Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
>>   gaurd against possibility of different compilers adding different
>>   paddings to the struct [ Dan Williams ]
>> 
>> * Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
>>   'bool' and also updated drc_pmem_query_health() to take this into
>>   account. [ Dan Williams ]
>> 
>> v4..v5:
>> * None
>> 
>> v3..v4:
>> * Call the DSM_PAPR_SCM_HEALTH service function from
>>   papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]
>> 
>> v2..v3:
>> * Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
>>   as its exported to the userspace [Aneesh]
>> * Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
>>   from enum to #defines [Aneesh]
>> 
>> v1..v2:
>> * New patch in the series
>> ---
>>  arch/powerpc/include/uapi/asm/papr_pdsm.h |  39 +++++++
>>  arch/powerpc/platforms/pseries/papr_scm.c | 125 +++++++++++++++++++---
>>  2 files changed, 147 insertions(+), 17 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
>> index 6407fefcc007..411725a91591 100644
>> --- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
>> +++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
>> @@ -115,6 +115,7 @@ struct nd_pdsm_cmd_pkg {
>>   */
>>  enum papr_pdsm {
>>  	PAPR_PDSM_MIN = 0x0,
>> +	PAPR_PDSM_HEALTH,
>>  	PAPR_PDSM_MAX,
>>  };
>>  
>> @@ -133,4 +134,42 @@ static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
>>  		return (void *)(pcmd->payload);
>>  }
>>  
>> +/* Various nvdimm health indicators */
>> +#define PAPR_PDSM_DIMM_HEALTHY       0
>> +#define PAPR_PDSM_DIMM_UNHEALTHY     1
>> +#define PAPR_PDSM_DIMM_CRITICAL      2
>> +#define PAPR_PDSM_DIMM_FATAL         3
>> +
>> +/*
>> + * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
>> + * Various flags indicate the health status of the dimm.
>> + *
>> + * dimm_unarmed		: Dimm not armed. So contents wont persist.
>> + * dimm_bad_shutdown	: Previous shutdown did not persist contents.
>> + * dimm_bad_restore	: Contents from previous shutdown werent restored.
>> + * dimm_scrubbed	: Contents of the dimm have been scrubbed.
>> + * dimm_locked		: Contents of the dimm cant be modified until CEC reboot
>> + * dimm_encrypted	: Contents of dimm are encrypted.
>> + * dimm_health		: Dimm health indicator. One of PAPR_PDSM_DIMM_XXXX
>> + */
>> +struct nd_papr_pdsm_health_v1 {
>> +	__u8 dimm_unarmed;
>> +	__u8 dimm_bad_shutdown;
>> +	__u8 dimm_bad_restore;
>> +	__u8 dimm_scrubbed;
>> +	__u8 dimm_locked;
>> +	__u8 dimm_encrypted;
>> +	__u16 dimm_health;
>> +} __packed;
>> +
>> +/*
>> + * Typedef the current struct for dimm_health so that any application
>> + * or kernel recompiled after introducing a new version automatically
>> + * supports the new version.
>> + */
>> +#define nd_papr_pdsm_health nd_papr_pdsm_health_v1
>> +
>> +/* Current version number for the dimm health struct */
>
> This can't be the 'current' version.  You will need a list of versions you
> support.  Because if the user passes in an old version you need to be able to
> respond with that old version.  Also if you plan to support 'return X for a Y
> query' then the user will need both X and Y defined to interpret X.
Yes, and that change will be introduced with addition of version-2 of
nd_papr_pdsm_health. Earlier version of the patchset[1] had such a table
implemented. But to simplify the patchset, as we are only dealing with
version-1 of the structs right now, it was dropped.

[1] :
https://lore.kernel.org/linuxppc-dev/20200220095805.197229-9-vaibhav@linux.ibm.com/

>
>> +#define ND_PAPR_PDSM_HEALTH_VERSION 1
>> +
>>  #endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>> index 5e2237e7ec08..c0606c0c659c 100644
>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> @@ -88,7 +88,7 @@ struct papr_scm_priv {
>>  	unsigned long lasthealth_jiffies;
>>  
>>  	/* Health information for the dimm */
>> -	u64 health_bitmap;
>> +	struct nd_papr_pdsm_health health;
>
> ok so we are throwing away all the #defs from patch 1?  Are they still valid?
>
> I'm confused that patch 3 added this and we are throwing it away
> here...
The #defines are still valid, only the usage moved to a __drc_pmem_query_health().

>
>>  };
>>  
>>  static int drc_pmem_bind(struct papr_scm_priv *p)
>> @@ -201,6 +201,7 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
>>  static int __drc_pmem_query_health(struct papr_scm_priv *p)
>>  {
>>  	unsigned long ret[PLPAR_HCALL_BUFSIZE];
>> +	u64 health;
>>  	long rc;
>>  
>>  	/* issue the hcall */
>> @@ -208,18 +209,46 @@ static int __drc_pmem_query_health(struct papr_scm_priv *p)
>>  	if (rc != H_SUCCESS) {
>>  		dev_err(&p->pdev->dev,
>>  			 "Failed to query health information, Err:%ld\n", rc);
>> -		rc = -ENXIO;
>> -		goto out;
>> +		return -ENXIO;
>
> I missed this...  probably did not need the goto in the first patch?
Yes, will get rid of the goto from patch-1.

>
>>  	}
>>  
>>  	p->lasthealth_jiffies = jiffies;
>> -	p->health_bitmap = ret[0] & ret[1];
>> +	health = ret[0] & ret[1];
>>  
>>  	dev_dbg(&p->pdev->dev,
>>  		"Queried dimm health info. Bitmap:0x%016lx Mask:0x%016lx\n",
>>  		ret[0], ret[1]);
>> -out:
>> -	return rc;
>> +
>> +	memset(&p->health, 0, sizeof(p->health));
>> +
>> +	/* Check for various masks in bitmap and set the buffer */
>> +	if (health & PAPR_PMEM_UNARMED_MASK)
>
> Oh ok...  odd.  (don't add code then just take it away in a series)
> You could have lead with the user structure and put this code in patch
> 3.
The struct nd_papr_pdsm_health in only introduced this patch in header
'papr_pdsm.h' as means of exchanging nvdimm health information with
userspace. Introducing this struct without introducing the necessary
scafolding in 'papr_pdsm.h' would have been very counter-intutive.


>
> Why does the user need u8 to represent a single bit?  Does this help protect
> against endian issues?
This was 'bool' earlier but since type 'bool' isnt suitable for ioctl abi
and I wanted to avoid bit fields here as not sure if their packing may
differ across compilers hence replaced with u8.

>
>> +		p->health.dimm_unarmed = 1;
>> +
>> +	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
>> +		p->health.dimm_bad_shutdown = 1;
>> +
>> +	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
>> +		p->health.dimm_bad_restore = 1;
>> +
>> +	if (health & PAPR_PMEM_ENCRYPTED)
>> +		p->health.dimm_encrypted = 1;
>> +
>> +	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED) {
>> +		p->health.dimm_locked = 1;
>> +		p->health.dimm_scrubbed = 1;
>> +	}
>> +
>> +	if (health & PAPR_PMEM_HEALTH_UNHEALTHY)
>> +		p->health.dimm_health = PAPR_PDSM_DIMM_UNHEALTHY;
>> +
>> +	if (health & PAPR_PMEM_HEALTH_CRITICAL)
>> +		p->health.dimm_health = PAPR_PDSM_DIMM_CRITICAL;
>> +
>> +	if (health & PAPR_PMEM_HEALTH_FATAL)
>> +		p->health.dimm_health = PAPR_PDSM_DIMM_FATAL;
>> +
>> +	return 0;
>>  }
>>  
>>  /* Min interval in seconds for assuming stable dimm health */
>> @@ -403,6 +432,58 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
>>  	return 0;
>>  }
>>  
>> +/* Fetch the DIMM health info and populate it in provided package. */
>> +static int papr_pdsm_health(struct papr_scm_priv *p,
>> +			       struct nd_pdsm_cmd_pkg *pkg)
>> +{
>> +	int rc;
>> +	size_t copysize = sizeof(p->health);
>> +
>> +	/* Ensure dimm health mutex is taken preventing concurrent access */
>> +	rc = mutex_lock_interruptible(&p->health_mutex);
>> +	if (rc)
>> +		goto out;
>> +
>> +	/* Always fetch upto date dimm health data ignoring cached values */
>> +	rc = __drc_pmem_query_health(p);
>> +	if (rc)
>> +		goto out_unlock;
>> +	/*
>> +	 * If the requested payload version is greater than one we know
>> +	 * about, return the payload version we know about and let
>> +	 * caller/userspace handle.
>> +	 */
>> +	if (pkg->payload_version > ND_PAPR_PDSM_HEALTH_VERSION)
>> +		pkg->payload_version = ND_PAPR_PDSM_HEALTH_VERSION;
>
> I know this seems easy now but I do think you will run into trouble later.

I did addressed this in an earlier iteration of this patchset[1] and
dropped it in favour of simplicity.

[1] :
https://lore.kernel.org/linuxppc-dev/20200220095805.197229-9-vaibhav@linux.ibm.com/


> Ira
>
>> +
>> +	if (pkg->hdr.nd_size_out < copysize) {
>> +		dev_dbg(&p->pdev->dev, "Truncated payload (%u). Expected (%lu)",
>> +			pkg->hdr.nd_size_out, copysize);
>> +		rc = -ENOSPC;
>> +		goto out_unlock;
>> +	}
>> +
>> +	dev_dbg(&p->pdev->dev, "Copying payload size=%lu version=0x%x\n",
>> +		copysize, pkg->payload_version);
>> +
>> +	/* Copy the health struct to the payload */
>> +	memcpy(pdsm_cmd_to_payload(pkg), &p->health, copysize);
>> +	pkg->hdr.nd_fw_size = copysize;
>> +
>> +out_unlock:
>> +	mutex_unlock(&p->health_mutex);
>> +
>> +out:
>> +	/*
>> +	 * Put the error in out package and return success from function
>> +	 * so that errors if any are propogated back to userspace.
>> +	 */
>> +	pkg->cmd_status = rc;
>> +	dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
>> +
>> +	return 0;
>> +}
>> +
>>  static int papr_scm_service_pdsm(struct papr_scm_priv *p,
>>  				struct nd_pdsm_cmd_pkg *call_pkg)
>>  {
>> @@ -417,6 +498,9 @@ static int papr_scm_service_pdsm(struct papr_scm_priv *p,
>>  
>>  	/* Depending on the DSM command call appropriate service routine */
>>  	switch (call_pkg->hdr.nd_command) {
>> +	case PAPR_PDSM_HEALTH:
>> +		return papr_pdsm_health(p, call_pkg);
>> +
>>  	default:
>>  		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
>>  			call_pkg->hdr.nd_command);
>> @@ -485,34 +569,41 @@ static ssize_t flags_show(struct device *dev,
>>  	struct nvdimm *dimm = to_nvdimm(dev);
>>  	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
>>  	struct seq_buf s;
>> -	u64 health;
>>  	int rc;
>>  
>>  	rc = drc_pmem_query_health(p);
>>  	if (rc)
>>  		return rc;
>>  
>> -	/* Copy health_bitmap locally, check masks & update out buffer */
>> -	health = READ_ONCE(p->health_bitmap);
>> -
>>  	seq_buf_init(&s, buf, PAGE_SIZE);
>> -	if (health & PAPR_PMEM_UNARMED_MASK)
>> +
>> +	/* Protect concurrent modifications to papr_scm_priv */
>> +	rc = mutex_lock_interruptible(&p->health_mutex);
>> +	if (rc)
>> +		return rc;
>> +
>> +	if (p->health.dimm_unarmed)
>>  		seq_buf_printf(&s, "not_armed ");
>>  
>> -	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
>> +	if (p->health.dimm_bad_shutdown)
>>  		seq_buf_printf(&s, "flush_fail ");
>>  
>> -	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
>> +	if (p->health.dimm_bad_restore)
>>  		seq_buf_printf(&s, "restore_fail ");
>>  
>> -	if (health & PAPR_PMEM_ENCRYPTED)
>> +	if (p->health.dimm_encrypted)
>>  		seq_buf_printf(&s, "encrypted ");
>>  
>> -	if (health & PAPR_PMEM_SMART_EVENT_MASK)
>> +	if (p->health.dimm_health)
>>  		seq_buf_printf(&s, "smart_notify ");
>>  
>> -	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED)
>> -		seq_buf_printf(&s, "scrubbed locked ");
>> +	if (p->health.dimm_scrubbed)
>> +		seq_buf_printf(&s, "scrubbed ");
>> +
>> +	if (p->health.dimm_locked)
>> +		seq_buf_printf(&s, "locked ");
>> +
>> +	mutex_unlock(&p->health_mutex);
>>  
>>  	if (seq_buf_used(&s))
>>  		seq_buf_printf(&s, "\n");
>> -- 
>> 2.26.2
>> 

-- 
Cheers
~ Vaibhav

^ permalink raw reply

* Re: Re: [RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
From: Scott Branden @ 2020-06-03 18:47 UTC (permalink / raw)
  To: Bhupesh Sharma, James Morse, linux-kernel
  Cc: Mark Rutland, Ard Biesheuvel, linux-doc, Catalin Marinas, x86,
	kexec, linuxppc-dev, Kazuhito Hagio, Dave Anderson, bhupesh.linux,
	Will Deacon, linux-arm-kernel, Steve Capper
In-Reply-To: <b7d8d603-d9fe-3e18-c754-baf2015acd16@redhat.com>

Hi Bhupesh,

Would be great to get this patch series upstreamed?

On 2019-12-25 10:49 a.m., Bhupesh Sharma wrote:
> Hi James,
>
> On 12/12/2019 04:02 PM, James Morse wrote:
>> Hi Bhupesh,
>
> I am sorry this review mail skipped my attention due to holidays and 
> focus on other urgent issues.
>
>> On 29/11/2019 19:59, Bhupesh Sharma wrote:
>>> Add documentation for TCR_EL1.T1SZ variable being added to
>>> vmcoreinfo.
>>>
>>> It indicates the size offset of the memory region addressed by 
>>> TTBR1_EL1
>>
>>> and hence can be used for determining the vabits_actual value.
>>
>> used for determining random-internal-kernel-variable, that might not 
>> exist tomorrow.
>>
>> Could you describe how this is useful/necessary if a debugger wants 
>> to walk the page
>> tables from the core file? I think this is a better argument.
>>
>> Wouldn't the documentation be better as part of the patch that adds 
>> the export?
>> (... unless these have to go via different trees? ..)
>
> Ok, will fix the same in v6 version.
>
>>> diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst 
>>> b/Documentation/admin-guide/kdump/vmcoreinfo.rst
>>> index 447b64314f56..f9349f9d3345 100644
>>> --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
>>> +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
>>> @@ -398,6 +398,12 @@ KERNELOFFSET
>>>   The kernel randomization offset. Used to compute the page offset. If
>>>   KASLR is disabled, this value is zero.
>>>   +TCR_EL1.T1SZ
>>> +------------
>>> +
>>> +Indicates the size offset of the memory region addressed by TTBR1_EL1
>>
>>> +and hence can be used for determining the vabits_actual value.
>>
>> 'vabits_actual' may not exist when the next person comes to read this 
>> documentation (its
>> going to rot really quickly).
>>
>> I think the first half of this text is enough to say what this is 
>> for. You should include
>> words to the effect that its the hardware value that goes with 
>> swapper_pg_dir. You may
>> want to point readers to the arm-arm for more details on what the 
>> value means.
>
> Ok, got it. Fixed this in v6, which should be on its way shortly.
I can't seem to find v6?
>
> Thanks,
> Bhupesh
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox