Linux userland API discussions
* [PATCH V32 04/27] kexec_load: Disable at runtime if the kernel is locked down
From: Matthew Garrett @ 2019-04-04  0:32 UTC
  To: jmorris
  Cc: linux-security-module, linux-kernel, dhowells, linux-api, luto,
	Matthew Garrett, Matthew Garrett, Dave Young, kexec
In-Reply-To: <20190404003249.14356-1-matthewgarrett@google.com>

From: Matthew Garrett <mjg59@srcf.ucam.org>

The kexec_load() syscall permits the loading and execution of arbitrary
code in ring 0, which is something that lock-down is meant to prevent. It
makes sense to disable kexec_load() in this situation.

This does not affect the kexec_file_load() syscall, which can check for a
signature on the image to be booted.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Dave Young <dyoung@redhat.com>
cc: kexec@lists.infradead.org
---
 kernel/kexec.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index 68559808fdfa..57047acc9a36 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -207,6 +207,14 @@ static inline int kexec_load_check(unsigned long nr_segments,
 	if (result < 0)
 		return result;
 
+	/*
+	 * kexec can be used to circumvent module loading restrictions, so
+	 * prevent loading in that case
+	 */
+	if (kernel_is_locked_down("kexec of unsigned images",
+				  LOCKDOWN_INTEGRITY))
+		return -EPERM;
+
 	/*
 	 * Verify we have a legal set of flags
 	 * This leaves us room for future extensions.
-- 
2.21.0.392.gf8f6787159e-goog
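
The hunk above is one call site of the kernel_is_locked_down() macro
that patch 01/27 of this series adds to include/linux/kernel.h. The
following is a minimal userspace sketch of how that macro limits the
notice to one per call site, assuming GNU C statement expressions. All
model_* names are hypothetical stand-ins, and printf() replaces
pr_notice(); this is an illustration, not the kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

enum lockdown_level {
	LOCKDOWN_NONE,
	LOCKDOWN_INTEGRITY,
	LOCKDOWN_CONFIDENTIALITY,
};

/* Model state; in the series this lives in security/lock_down.c. */
enum lockdown_level model_level = LOCKDOWN_NONE;
int model_notices;	/* counts notices so a caller can inspect them */

/* Stand-in for __kernel_is_locked_down(): log only when 'first' is set. */
bool model_is_locked_down(const char *what, enum lockdown_level level,
			  bool first)
{
	if (model_level >= level && what && first) {
		model_notices++;
		printf("Lockdown: %s is restricted\n", what);
	}
	return model_level >= level;
}

/* Same shape as the series' macro: one static flag per expansion site. */
#define model_kernel_is_locked_down(what, level)			\
	({								\
		static bool message_given;				\
		bool locked_down =					\
			model_is_locked_down(what, level, !message_given); \
		message_given = true;					\
		locked_down;						\
	})

/* Stand-in for the kexec_load_check() hunk: one call site, one flag. */
int model_kexec_load_check(void)
{
	if (model_kernel_is_locked_down("kexec of unsigned images",
					LOCKDOWN_INTEGRITY))
		return -EPERM;
	return 0;
}
```

In this model, as in the macro, the flag is set on the first evaluation
whether or not lockdown was active at that moment, so only the first
pass through a given call site can ever produce the notice.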


* [PATCH V32 03/27] Restrict /dev/{mem,kmem,port} when the kernel is locked down
From: Matthew Garrett @ 2019-04-04  0:32 UTC
  To: jmorris
  Cc: linux-security-module, linux-kernel, dhowells, linux-api, luto,
	Matthew Garrett, Matthew Garrett, x86
In-Reply-To: <20190404003249.14356-1-matthewgarrett@google.com>

From: Matthew Garrett <mjg59@srcf.ucam.org>

Allowing users to read and write to core kernel memory makes it possible
for the kernel to be subverted, avoiding module loading restrictions, and
also to steal cryptographic information.

Disallow /dev/mem and /dev/kmem from being opened when the kernel has
been locked down to prevent this.

Also disallow /dev/port from being opened to prevent raw ioport access and
thus DMA from being used to accomplish the same thing.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Cc: x86@kernel.org
---
 drivers/char/mem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index b08dc50f9f26..67b85939b1bd 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -786,6 +786,8 @@ static loff_t memory_lseek(struct file *file, loff_t offset, int orig)
 
 static int open_port(struct inode *inode, struct file *filp)
 {
+	if (kernel_is_locked_down("/dev/mem,kmem,port", LOCKDOWN_INTEGRITY))
+		return -EPERM;
 	return capable(CAP_SYS_RAWIO) ? 0 : -EPERM;
 }
 
-- 
2.21.0.392.gf8f6787159e-goog
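
A minimal userspace sketch of the decision open_port() makes after
this change. In mem.c of this era open_mem and open_kmem appear to be
aliases of open_port, which would explain why a two-line change covers
all three devices. The function name and both boolean parameters are
hypothetical stand-ins for kernel_is_locked_down(...,
LOCKDOWN_INTEGRITY) and capable(CAP_SYS_RAWIO).

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Model of open_port() with the lockdown check in front of the
 * capability check. Returns 0 when the open would be allowed.
 */
int open_port_model(bool locked_down, bool has_cap_sys_rawio)
{
	if (locked_down)
		return -EPERM;	/* lockdown refuses, whatever the caps */
	return has_cap_sys_rawio ? 0 : -EPERM;
}
```

Both refusals use EPERM, so a caller cannot tell from the error code
alone which of the two checks rejected the open; the kernel log notice
is the distinguishing signal.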


* [PATCH V32 02/27] Enforce module signatures if the kernel is locked down
From: Matthew Garrett @ 2019-04-04  0:32 UTC
  To: jmorris
  Cc: linux-security-module, linux-kernel, dhowells, linux-api, luto,
	Matthew Garrett, Jessica Yu
In-Reply-To: <20190404003249.14356-1-matthewgarrett@google.com>

From: David Howells <dhowells@redhat.com>

If the kernel is locked down, require that all modules have valid
signatures that we can verify.

I have adjusted the errors generated:

 (1) If there's no signature (ENODATA) or we can't check it (ENOPKG,
     ENOKEY), then:

     (a) If signatures are enforced then EKEYREJECTED is returned.

     (b) Otherwise, if the kernel is locked down then EPERM is returned
	 (this is then consistent with other lockdown cases).

     (c) Otherwise the error is ignored and the module load proceeds.

 (2) If the signature is unparseable (EBADMSG, EINVAL), the signature fails
     the check (EKEYREJECTED) or a system error occurs (eg. ENOMEM), we
     return the error we got.

Note that the X.509 code doesn't check for key expiry as the RTC might not
be valid or might not have been transferred to the kernel's clock yet.

 [Modified by Matthew Garrett to remove the IMA integration. This will
  be replaced with integration with the IMA architecture policy
  patchset.]

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <matthewgarrett@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
---
 kernel/module.c | 39 ++++++++++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 2ad1b5239910..deea9d2763f8 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2767,8 +2767,9 @@ static inline void kmemleak_load_module(const struct module *mod,
 #ifdef CONFIG_MODULE_SIG
 static int module_sig_check(struct load_info *info, int flags)
 {
-	int err = -ENOKEY;
+	int err = -ENODATA;
 	const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
+	const char *reason;
 	const void *mod = info->hdr;
 
 	/*
@@ -2783,16 +2784,40 @@ static int module_sig_check(struct load_info *info, int flags)
 		err = mod_verify_sig(mod, info);
 	}
 
-	if (!err) {
+	switch (err) {
+	case 0:
 		info->sig_ok = true;
 		return 0;
-	}
 
-	/* Not having a signature is only an error if we're strict. */
-	if (err == -ENOKEY && !is_module_sig_enforced())
-		err = 0;
+		/* We don't permit modules to be loaded into trusted kernels
+		 * without a valid signature on them, but if we're not
+		 * enforcing, certain errors are non-fatal.
+		 */
+	case -ENODATA:
+		reason = "Loading of unsigned module";
+		goto decide;
+	case -ENOPKG:
+		reason = "Loading of module with unsupported crypto";
+		goto decide;
+	case -ENOKEY:
+		reason = "Loading of module with unavailable key";
+	decide:
+		if (is_module_sig_enforced()) {
+			pr_notice("%s is rejected\n", reason);
+			return -EKEYREJECTED;
+		}
 
-	return err;
+		if (kernel_is_locked_down(reason, LOCKDOWN_INTEGRITY))
+			return -EPERM;
+		return 0;
+
+		/* All other errors are fatal, including nomem, unparseable
+		 * signatures and signature check failures - even if signatures
+		 * aren't required.
+		 */
+	default:
+		return err;
+	}
 }
 #else /* !CONFIG_MODULE_SIG */
 static int module_sig_check(struct load_info *info, int flags)
-- 
2.21.0.392.gf8f6787159e-goog
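
The error mapping described in the commit message can be sketched as a
pure function. This is a userspace model of the switch in
module_sig_check() above, assuming Linux errno values; the function
name and the two boolean parameters are hypothetical stand-ins for
is_module_sig_enforced() and kernel_is_locked_down().

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Map the raw verification result 'err' to the value the patched
 * module_sig_check() would return.
 */
int module_sig_result(int err, bool sig_enforced, bool locked_down)
{
	switch (err) {
	case 0:			/* valid signature */
		return 0;
	case -ENODATA:		/* no signature present */
	case -ENOPKG:		/* unsupported crypto */
	case -ENOKEY:		/* key not available */
		if (sig_enforced)
			return -EKEYREJECTED;
		if (locked_down)
			return -EPERM;
		return 0;	/* non-fatal when nothing requires it */
	default:		/* EBADMSG, EKEYREJECTED, ENOMEM, ... */
		return err;
	}
}
```

Note the ordering: when signatures are enforced and the kernel is also
locked down, the caller sees EKEYREJECTED rather than EPERM.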


* [PATCH V32 01/27] Add the ability to lock down access to the running kernel image
From: Matthew Garrett @ 2019-04-04  0:32 UTC
  To: jmorris
  Cc: linux-security-module, linux-kernel, dhowells, linux-api, luto,
	Matthew Garrett
In-Reply-To: <20190404003249.14356-1-matthewgarrett@google.com>

From: David Howells <dhowells@redhat.com>

Provide a single call to allow kernel code to determine whether the system
should be locked down, thereby disallowing various accesses that might
allow the running kernel image to be changed, including the loading of
modules that aren't validly signed with a key we recognise, fiddling with
MSR registers and hibernation.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <matthewgarrett@google.com>
---
 Documentation/ABI/testing/lockdown            |  19 +++
 .../admin-guide/kernel-parameters.txt         |   9 ++
 Documentation/admin-guide/lockdown.rst        |  60 +++++++
 include/linux/kernel.h                        |  28 ++++
 include/linux/security.h                      |   9 +-
 init/main.c                                   |   1 +
 security/Kconfig                              |  39 +++++
 security/Makefile                             |   3 +
 security/lock_down.c                          | 147 ++++++++++++++++++
 9 files changed, 314 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/ABI/testing/lockdown
 create mode 100644 Documentation/admin-guide/lockdown.rst
 create mode 100644 security/lock_down.c

diff --git a/Documentation/ABI/testing/lockdown b/Documentation/ABI/testing/lockdown
new file mode 100644
index 000000000000..5bd51e20917a
--- /dev/null
+++ b/Documentation/ABI/testing/lockdown
@@ -0,0 +1,19 @@
+What:		security/lockdown
+Date:		March 2019
+Contact:	Matthew Garrett <mjg59@google.com>
+Description:
+		If CONFIG_LOCK_DOWN_KERNEL is enabled, the kernel can be
+		moved to a more locked down state at runtime by writing to
+		this attribute. Valid values are:
+
+		integrity:
+			The kernel will disable functionality that allows
+			userland to modify the running kernel image, other
+			than through the loading or execution of appropriately
+			signed objects.
+
+		confidentiality:
+			The kernel will disable all functionality disabled by
+			the integrity mode, but additionally will disable
+			features that potentially permit userland to obtain
+			confidential information stored within the kernel.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91c0251fdb86..594d268d92ba 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2213,6 +2213,15 @@
 	lockd.nlm_udpport=M	[NFS] Assign UDP port.
 			Format: <integer>
 
+	lockdown=	[SECURITY]
+			{ integrity | confidentiality }
+			Enable the kernel lockdown feature. If set to
+			integrity, kernel features that allow userland to
+			modify the running kernel are disabled. If set to
+			confidentiality, kernel features that allow userland
+			to extract confidential information from the kernel
+			are also disabled.
+
 	locktorture.nreaders_stress= [KNL]
 			Set the number of locking read-acquisition kthreads.
 			Defaults to being automatically set based on the
diff --git a/Documentation/admin-guide/lockdown.rst b/Documentation/admin-guide/lockdown.rst
new file mode 100644
index 000000000000..d05dcedd20d1
--- /dev/null
+++ b/Documentation/admin-guide/lockdown.rst
@@ -0,0 +1,60 @@
+Kernel lockdown functionality
+-----------------------------
+
+.. CONTENTS
+..
+.. - Overview.
+.. - Enabling Lockdown.
+
+========
+Overview
+========
+
+Traditionally Linux systems have been run with the presumption that a
+process running with full capabilities is effectively equivalent in
+privilege to the kernel itself. The lockdown feature attempts to draw
+a stronger boundary between privileged processes and the kernel,
+increasing the level of trust that can be placed in the kernel even in
+the face of hostile processes.
+
+Lockdown can be run in two modes - integrity and confidentiality. In
+integrity mode, kernel features that allow arbitrary modification of
+the running kernel image are disabled. Confidentiality mode behaves in
+the same way as integrity mode, but also blocks features that
+potentially allow a hostile userland process to extract secret
+information from the kernel.
+
+Note that lockdown depends upon the correct behaviour of the
+kernel. Exploitable vulnerabilities in the kernel may still permit
+arbitrary modification of the kernel or make it possible to disable
+lockdown features.
+
+=================
+Enabling Lockdown
+=================
+
+Lockdown can be enabled in multiple ways.
+
+Kernel configuration
+====================
+
+The kernel can be statically configured by setting either
+CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY or
+CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY. A kernel configured
+with CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY may be booted into
+confidentiality mode using one of the other mechanisms, but otherwise
+the kernel will always boot into the configured mode.
+
+Kernel command line
+===================
+
+Passing lockdown=integrity or lockdown=confidentiality on the kernel
+command line will configure lockdown into the appropriate mode.
+
+Runtime configuration
+=====================
+
+/sys/kernel/security/lockdown will indicate the current lockdown
+state. The system state may be made stricter by writing either
+"integrity" or "confidentiality" into this file, but any attempts to
+make it less strict will fail.
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 8f0e68e250a7..30cf695719d5 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -340,6 +340,34 @@ static inline void refcount_error_report(struct pt_regs *regs, const char *err)
 { }
 #endif
 
+enum lockdown_level {
+	LOCKDOWN_NONE,
+	LOCKDOWN_INTEGRITY,
+	LOCKDOWN_CONFIDENTIALITY,
+	LOCKDOWN_MAX,
+};
+
+#ifdef CONFIG_LOCK_DOWN_KERNEL
+extern bool __kernel_is_locked_down(const char *what,
+				    enum lockdown_level level,
+				    bool first);
+#else
+static inline bool __kernel_is_locked_down(const char *what,
+					   enum lockdown_level level,
+					   bool first)
+{
+	return false;
+}
+#endif
+
+#define kernel_is_locked_down(what, level)				\
+	({								\
+		static bool message_given;				\
+		bool locked_down = __kernel_is_locked_down(what, level, !message_given); \
+		message_given = true;					\
+		locked_down;						\
+	})
+
 /* Internal, do not use. */
 int __must_check _kstrtoul(const char *s, unsigned int base, unsigned long *res);
 int __must_check _kstrtol(const char *s, unsigned int base, long *res);
diff --git a/include/linux/security.h b/include/linux/security.h
index 13537a49ae97..b290946341a4 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1798,5 +1798,12 @@ static inline void security_bpf_prog_free(struct bpf_prog_aux *aux)
 #endif /* CONFIG_SECURITY */
 #endif /* CONFIG_BPF_SYSCALL */
 
-#endif /* ! __LINUX_SECURITY_H */
+#ifdef CONFIG_LOCK_DOWN_KERNEL
+extern void __init init_lockdown(void);
+#else
+static inline void __init init_lockdown(void)
+{
+}
+#endif
 
+#endif /* ! __LINUX_SECURITY_H */
diff --git a/init/main.c b/init/main.c
index e2e80ca3165a..4c6cca9681c7 100644
--- a/init/main.c
+++ b/init/main.c
@@ -555,6 +555,7 @@ asmlinkage __visible void __init start_kernel(void)
 	boot_cpu_init();
 	page_address_init();
 	pr_notice("%s", linux_banner);
+	init_lockdown();
 	setup_arch(&command_line);
 	/*
 	 * Set up the the initial canary and entropy after arch
diff --git a/security/Kconfig b/security/Kconfig
index 1d6463fb1450..593ff231eac6 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -229,6 +229,45 @@ config STATIC_USERMODEHELPER_PATH
 	  If you wish for all usermode helper programs to be disabled,
 	  specify an empty string here (i.e. "").
 
+config LOCK_DOWN_KERNEL
+	bool "Allow the kernel to be 'locked down'"
+	help
+	  Allow the kernel to be locked down. If lockdown support is enabled
+	  and activated, the kernel will impose additional restrictions
+	  intended to prevent uid 0 from being able to modify the running
+	  kernel. This may break userland applications that rely on low-level
+	  access to hardware.
+
+choice
+	prompt "Kernel default lockdown mode"
+	default LOCK_DOWN_KERNEL_FORCE_NONE
+	depends on LOCK_DOWN_KERNEL
+	help
+	  The kernel can be configured to default to differing levels of
+	  lockdown.
+
+config LOCK_DOWN_KERNEL_FORCE_NONE
+       bool "None"
+       help
+          No lockdown functionality is enabled by default. Lockdown may be
+	  enabled via the kernel commandline or /sys/kernel/security/lockdown.
+
+config LOCK_DOWN_KERNEL_FORCE_INTEGRITY
+       bool "Integrity"
+       help
+         The kernel runs in integrity mode by default. Features that allow
+	 the kernel to be modified at runtime are disabled.
+
+config LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY
+       bool "Confidentiality"
+       help
+         The kernel runs in confidentiality mode by default. Features that
+	 allow the kernel to be modified at runtime or that permit userland
+	 code to read confidential material held inside the kernel are
+	 disabled.
+
+endchoice
+
 source "security/selinux/Kconfig"
 source "security/smack/Kconfig"
 source "security/tomoyo/Kconfig"
diff --git a/security/Makefile b/security/Makefile
index c598b904938f..5ff090149c88 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -32,3 +32,6 @@ obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 # Object integrity file lists
 subdir-$(CONFIG_INTEGRITY)		+= integrity
 obj-$(CONFIG_INTEGRITY)			+= integrity/
+
+# Allow the kernel to be locked down
+obj-$(CONFIG_LOCK_DOWN_KERNEL)		+= lock_down.o
diff --git a/security/lock_down.c b/security/lock_down.c
new file mode 100644
index 000000000000..9913fff09ad0
--- /dev/null
+++ b/security/lock_down.c
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Lock down the kernel
+ *
+ * Copyright (C) 2016 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include <linux/security.h>
+#include <linux/export.h>
+
+static enum lockdown_level kernel_locked_down;
+
+char *lockdown_levels[LOCKDOWN_MAX] = {"none", "integrity", "confidentiality"};
+
+/*
+ * Put the kernel into lock-down mode.
+ */
+static int lock_kernel_down(const char *where, enum lockdown_level level)
+{
+	if (kernel_locked_down >= level)
+		return -EPERM;
+
+	kernel_locked_down = level;
+	pr_notice("Kernel is locked down from %s; see man kernel_lockdown.7\n",
+		  where);
+	return 0;
+}
+
+static int __init lockdown_param(char *level)
+{
+	if (!level)
+		return -EINVAL;
+
+	if (strcmp(level, "integrity") == 0)
+		lock_kernel_down("command line", LOCKDOWN_INTEGRITY);
+	else if (strcmp(level, "confidentiality") == 0)
+		lock_kernel_down("command line", LOCKDOWN_CONFIDENTIALITY);
+	else
+		return -EINVAL;
+
+	return 0;
+}
+
+early_param("lockdown", lockdown_param);
+
+/*
+ * This must be called before arch setup code in order to ensure that the
+ * appropriate default can be applied without being overridden by the command
+ * line option.
+ */
+void __init init_lockdown(void)
+{
+#if defined(CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY)
+	lock_kernel_down("Kernel configuration", LOCKDOWN_INTEGRITY);
+#elif defined(CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY)
+	lock_kernel_down("Kernel configuration", LOCKDOWN_CONFIDENTIALITY);
+#endif
+}
+
+/**
+ * kernel_is_locked_down - Find out if the kernel is locked down
+ * @what: Tag to use in notice generated if lockdown is in effect
+ */
+bool __kernel_is_locked_down(const char *what, enum lockdown_level level,
+			     bool first)
+{
+	if ((kernel_locked_down >= level) && what && first)
+		pr_notice("Lockdown: %s is restricted; see man kernel_lockdown.7\n",
+			  what);
+	return (kernel_locked_down >= level);
+}
+EXPORT_SYMBOL(__kernel_is_locked_down);
+
+static ssize_t lockdown_read(struct file *filp, char __user *buf, size_t count,
+			     loff_t *ppos)
+{
+	char temp[80];
+	int i, offset=0;
+
+	for (i = LOCKDOWN_NONE; i < LOCKDOWN_MAX; i++) {
+		if (lockdown_levels[i]) {
+			const char *label = lockdown_levels[i];
+
+			if (kernel_locked_down == i)
+				offset += sprintf(temp+offset, "[%s] ", label);
+			else
+				offset += sprintf(temp+offset, "%s ", label);
+		}
+	}
+
+	/* Convert the last space to a newline if needed. */
+	if (offset > 0)
+		temp[offset-1] = '\n';
+
+	return simple_read_from_buffer(buf, count, ppos, temp, strlen(temp));
+}
+
+static ssize_t lockdown_write(struct file *file, const char __user *buf,
+			      size_t n, loff_t *ppos)
+{
+	char *state;
+	int i, len, err = -EINVAL;
+
+	state = memdup_user_nul(buf, n);
+	if (IS_ERR(state))
+		return PTR_ERR(state);
+
+	len = strlen(state);
+	if (len && state[len-1] == '\n') {
+		state[len-1] = '\0';
+		len--;
+	}
+
+	for (i = 0; i < LOCKDOWN_MAX; i++) {
+		const char *label = lockdown_levels[i];
+
+		if (label && !strcmp(state, label))
+			err = lock_kernel_down("securityfs", i);
+	}
+
+	kfree(state);
+	return err ? err : n;
+}
+
+static const struct file_operations lockdown_ops = {
+	.read  = lockdown_read,
+	.write = lockdown_write,
+};
+
+static int __init lockdown_secfs_init(void)
+{
+	struct dentry *dentry;
+
+	dentry = securityfs_create_file("lockdown", 0600, NULL, NULL,
+					&lockdown_ops);
+	if (IS_ERR(dentry))
+		return PTR_ERR(dentry);
+
+	return 0;
+}
+
+core_initcall(lockdown_secfs_init);
-- 
2.21.0.392.gf8f6787159e-goog
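
A minimal userspace sketch of the state handling in lock_down.c above:
the level can only move strictly upwards, and the securityfs file
lists every level with the current one in brackets. The model_* names
and current_level are hypothetical; the logic follows
lock_kernel_down(), lockdown_write() and lockdown_read() as posted,
with snprintf() used in place of sprintf().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

enum lockdown_level {
	LOCKDOWN_NONE,
	LOCKDOWN_INTEGRITY,
	LOCKDOWN_CONFIDENTIALITY,
	LOCKDOWN_MAX,
};

static const char *const level_names[LOCKDOWN_MAX] = {
	"none", "integrity", "confidentiality",
};

enum lockdown_level current_level = LOCKDOWN_NONE;

/* Like lock_kernel_down(): refuse anything not stricter than now. */
int model_lock_down(enum lockdown_level level)
{
	if (current_level >= level)
		return -EPERM;
	current_level = level;
	return 0;
}

/* Like lockdown_write(): match the text against the level names. */
int model_write(const char *text)
{
	int i, err = -EINVAL;

	for (i = 0; i < LOCKDOWN_MAX; i++)
		if (strcmp(text, level_names[i]) == 0)
			err = model_lock_down((enum lockdown_level)i);
	return err;
}

/* Like lockdown_read(): list all levels, current one in brackets. */
void model_read(char *buf, size_t size)
{
	size_t off = 0;
	int i, n;

	if (size == 0)
		return;
	buf[0] = '\0';
	for (i = 0; i < LOCKDOWN_MAX; i++) {
		if ((int)current_level == i)
			n = snprintf(buf + off, size - off, "[%s] ",
				     level_names[i]);
		else
			n = snprintf(buf + off, size - off, "%s ",
				     level_names[i]);
		if (n < 0 || (size_t)n >= size - off) {
			buf[off] = '\0';	/* out of room: keep what fits */
			break;
		}
		off += (size_t)n;
	}
	if (off > 0)
		buf[off - 1] = '\n';	/* trailing space becomes newline */
}
```

In this model, writing the level that is already in effect fails with
EPERM in the same way as an attempt to relax it, because the
comparison is ">=" rather than ">".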


* [PATCH V32 0/27] Lockdown patches for 5.2
From: Matthew Garrett @ 2019-04-04  0:32 UTC
  To: jmorris; +Cc: linux-security-module, linux-kernel, dhowells, linux-api, luto

Fairly minimal changes since the last set: tracefs is restricted at
Steven's suggestion (but could do with a once-over, I'm very much not a
vfs person), debugfs is back to Dave's original implementation. I've
also fixed up a malformed patch that resulted from me getting confused
during rebase, and added some further documentation to the initial patch
in order to give a reference for the design goals.


* Re: [PATCH 14/17] fpga: dfl: fme: add thermal management support
From: Wu Hao @ 2019-04-03 23:43 UTC
  To: Moritz Fischer
  Cc: atull, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Russ Weight, Xu Yilun
In-Reply-To: <20190403180909.GD5752@archbook>

On Wed, Apr 03, 2019 at 11:09:09AM -0700, Moritz Fischer wrote:
> Hi Hao,
> 
> On Thu, Apr 04, 2019 at 12:31:47AM +0800, Wu Hao wrote:
> > On Tue, Apr 02, 2019 at 07:59:25AM -0700, Moritz Fischer wrote:
> > > Hi Wu,
> > > 
> > > On Mon, Mar 25, 2019 at 11:07:41AM +0800, Wu Hao wrote:
> > > > This patch adds support to thermal management private feature for DFL
> > > > FPGA Management Engine (FME). As thermal throttling is handled by
> > > > hardware automatically per pre-defined thresholds, this private
> > > > feature driver only provides read-only sysfs interfaces for user
> > > > to read temperature, thresholds, threshold policy and other info.
> > > > 
> > > > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > > > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > > > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > > > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > > > ---
> > > >  Documentation/ABI/testing/sysfs-platform-dfl-fme |  56 +++++++
> > > >  drivers/fpga/dfl-fme-main.c                      | 202 +++++++++++++++++++++++
> > > >  2 files changed, 258 insertions(+)
> > > > 
> > > > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > > index b8327e9..d3aeb88 100644
> > > > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > > @@ -44,3 +44,59 @@ Description:	Read-only. It returns socket_id to indicate which socket
> > > >  		this FPGA belongs to, only valid for integrated solution.
> > > >  		User only needs this information, in case standard numa node
> > > >  		can't provide correct information.
> > > > +
> > > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/temperature
> > > > +Date:		March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > > +Description:	Read-only. It returns temperature (in Celsius) of this FPGA
> > > > +		device.
> > > > +
> > > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1
> > > > +Date:		March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > > +Description:	Read-only. Read this file to get the temperature threshold1
> > > > +		(in Celsius).
> > > > +
> > > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold2
> > > > +Date:		March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > > +Description:	Read-only. Read this file to get the temperature threshold2
> > > > +		(in Celsius).
> > > > +
> > > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/trip_threshold
> > > > +Date:		March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > > +Description:	Read-only. It returns trip threshold (in Celsius), once FPGA
> > > > +		temperature reaches trip threshold, it triggers a fatal event
> > > > +		to board management controller (BMC) to shutdown FPGA.
> > > > +
> > > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1_status
> > > > +Date:		March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > > +Description:	Read-only. It returns 1 if temperature reaches threshold1,
> > > > +		otherwise 0. Once temperature reaches threshold1, hardware
> > > > +		will automatically enter throttling state (AP1 - 50%
> > > > +		or AP2 - 90% throttling, see 'threshold1_policy').
> > > > +
> > > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold2_status
> > > > +Date:		March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > > +Description:	Read-only. It returns 1 if temperature reaches threshold2,
> > > > +		otherwise 0. Once temperature reaches threshold2, hardware
> > > > +		will automatically enter the deepest throttling state (AP6
> > > > +		- 100% throttling).
> > > > +
> > > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1_policy
> > > > +Date:		March 2019
> > > > +KernelVersion:  5.2
> > > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > > +Description:	Read-only. Read this file to get the policy of temperature
> > > > +		threshold1. It only supports two value (policy):
> > > > +		    0 - AP2 state (90% throttling)
> > > > +		    1 - AP1 state (50% throttling)
> > > 
> > > These look like they could directly map to the linux thermal framework,
> > > any reason you can't use the thermal framework?
> > > 
> > > The trip stuff literally maps 1:1 to what a thermal driver does, I think
> > > that's something you'd wanna consider.
> > > 
> > 
> > Hi Moritz,
> > 
> > Thanks a lot for the suggestion, actually I feel that the trip points in thermal
> > zone are used to indicate cooling actions required for thermal software either
> > in kernel or userspace. But in this case, such FPGA hardware handles cooling
> > automatically (yes, driver only expose Read-only sysfs for information), so
> > software doesn't need to take care of this at all. For this purpose, it seems
> > that we don't have to put these thresholds as trip points. And per my
> > understanding, if people use such FPGA device, then they may need to know
> > what's the current hardware throttling behavior, e.g. 50% vs 90%. This
> > information can't be provided by standard thermal zone sysfs, so anyway user
> > needs these sysfs interfaces to know it. But it seems that we still could
> > create a thermal zone without trip points, it could help if user wants to
> > connect some external cooling devices via userspace thermal daemon, they can
> > define whatever trip points they like to activate the external cooling 
> > device. I will consider this further more and come up with a new patch in
> > v2 patchset.
> 
> Generally speaking extending an existing framework with the
> functionality you want is preferable over rolling 100% your own.
> 
> So please look into this.

Yes, agree, will look into this and try to fix this in next version.

Thanks for the comments.

Hao

> 
> Thanks,
> Moritz
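
For readers following the ABI text quoted above, a minimal userspace
sketch of decoding threshold1_policy as proposed in this version of
the patch (0 = AP2, 90% throttling; 1 = AP1, 50% throttling). The
helper name is hypothetical, and it assumes the attribute prints a
decimal value followed by a newline, which the quoted text does not
state. The interface may also change in v2 if the driver moves to the
thermal framework as discussed.

```c
#include <stdlib.h>

/*
 * Map the text read from .../thermal_mgmt/threshold1_policy to the
 * throttling percentage given in the proposed ABI description.
 * Returns -1 for anything other than "0" or "1".
 */
int threshold1_policy_percent(const char *sysfs_text)
{
	char *end;
	long v = strtol(sysfs_text, &end, 10);

	/* Reject empty input and trailing junk other than a newline. */
	if (end == sysfs_text || (*end != '\0' && *end != '\n'))
		return -1;

	switch (v) {
	case 0:
		return 90;	/* AP2 state */
	case 1:
		return 50;	/* AP1 state */
	default:
		return -1;
	}
}
```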


* Re: [PATCH v4 07/17] fs/dcache.c: add shrink_dcache_inode()
From: Eric Biggers @ 2019-04-03 20:36 UTC
  To: Al Viro
  Cc: Satya Tangirala, linux-api, linux-f2fs-devel, linux-fscrypt,
	keyrings, linux-mtd, linux-crypto, linux-fsdevel, linux-ext4,
	Paul Crowley
In-Reply-To: <20190403183411.GV2217@ZenIV.linux.org.uk>

On Wed, Apr 03, 2019 at 07:34:12PM +0100, Al Viro wrote:
> On Tue, Apr 02, 2019 at 08:45:50AM -0700, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> > 
> > When a filesystem encryption key is removed, we need all files which had
> > been "unlocked" (had ->i_crypt_info set up) with it to appear "locked"
> > again.  This is most easily done by evicting the inodes.  This can
> > currently be done using 'echo 2 > /proc/sys/vm/drop_caches'; however,
> > that is overkill and not usable by non-root users.
> > 
> > To evict just the needed inodes we also need the ability to evict those
> > inodes' dentries, since an inode is pinned by its dentries.  Therefore,
> > add a function shrink_dcache_inode() which iterates through an inode's
> > dentries and evicts any unused ones as well as any unused descendants
> > (since there may be negative dentries pinning the inode's dentries).
> 
> Huh?
> 
> > + * Evict all unused aliases of the specified inode from the dcache.  This is
> > + * intended to be used when trying to evict a specific inode, since inodes are
> > + * pinned by their dentries.  We also have to descend to ->d_subdirs for each
> > + * alias, since aliases may be pinned by negative child dentries.
> > + */
> > +void shrink_dcache_inode(struct inode *inode)
> > +{
> > +	for (;;) {
> > +		struct select_data data;
> > +		struct dentry *dentry;
> > +
> > +		INIT_LIST_HEAD(&data.dispose);
> > +		data.start = NULL;
> > +		data.found = 0;
> > +
> > +		spin_lock(&inode->i_lock);
> > +		hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias)
> > +			d_walk(dentry, &data, select_collect);
> > +		spin_unlock(&inode->i_lock);
> > +
> > +		if (!data.found)
> > +			break;
> > +
> > +		shrink_dentry_list(&data.dispose);
> > +		cond_resched();
> 
> This is... odd.  What's wrong with
> 	if (S_ISDIR(inode->i_mode)) {
> 		dentry = d_find_any_alias(inode);
> 		if (dentry) {
> 			shrink_dcache_parent(dentry);
> 			dput(dentry);
> 		}
> 	}
> 	d_prune_aliases(inode);
> instead of that thing?

That works, as far as I can tell, so I'll do that instead.

I don't think I noticed that d_prune_aliases() existed when I wrote this.

Thanks for the suggestion!

- Eric


* Re: [PATCH v4 07/17] fs/dcache.c: add shrink_dcache_inode()
From: Al Viro @ 2019-04-03 18:34 UTC
  To: Eric Biggers
  Cc: Satya Tangirala, linux-api, linux-f2fs-devel, linux-fscrypt,
	keyrings, linux-mtd, linux-crypto, linux-fsdevel, linux-ext4,
	Paul Crowley
In-Reply-To: <20190402154600.32432-8-ebiggers@kernel.org>

On Tue, Apr 02, 2019 at 08:45:50AM -0700, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> When a filesystem encryption key is removed, we need all files which had
> been "unlocked" (had ->i_crypt_info set up) with it to appear "locked"
> again.  This is most easily done by evicting the inodes.  This can
> currently be done using 'echo 2 > /proc/sys/vm/drop_caches'; however,
> that is overkill and not usable by non-root users.
> 
> To evict just the needed inodes we also need the ability to evict those
> inodes' dentries, since an inode is pinned by its dentries.  Therefore,
> add a function shrink_dcache_inode() which iterates through an inode's
> dentries and evicts any unused ones as well as any unused descendants
> (since there may be negative dentries pinning the inode's dentries).

Huh?

> + * Evict all unused aliases of the specified inode from the dcache.  This is
> + * intended to be used when trying to evict a specific inode, since inodes are
> + * pinned by their dentries.  We also have to descend to ->d_subdirs for each
> + * alias, since aliases may be pinned by negative child dentries.
> + */
> +void shrink_dcache_inode(struct inode *inode)
> +{
> +	for (;;) {
> +		struct select_data data;
> +		struct dentry *dentry;
> +
> +		INIT_LIST_HEAD(&data.dispose);
> +		data.start = NULL;
> +		data.found = 0;
> +
> +		spin_lock(&inode->i_lock);
> +		hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias)
> +			d_walk(dentry, &data, select_collect);
> +		spin_unlock(&inode->i_lock);
> +
> +		if (!data.found)
> +			break;
> +
> +		shrink_dentry_list(&data.dispose);
> +		cond_resched();

This is... odd.  What's wrong with
	if (S_ISDIR(inode->i_mode)) {
		dentry = d_find_any_alias(inode);
		if (dentry) {
			shrink_dcache_parent(dentry);
			dput(dentry);
		}
	}
	d_prune_aliases(inode);
instead of that thing?

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply

* Re: [PATCH 14/17] fpga: dfl: fme: add thermal management support
From: Moritz Fischer @ 2019-04-03 18:09 UTC (permalink / raw)
  To: Wu Hao
  Cc: Moritz Fischer, atull, linux-fpga, linux-kernel, linux-api,
	Luwei Kang, Russ Weight, Xu Yilun
In-Reply-To: <20190403163147.GA28570@hao-dev>

Hi Hao,

On Thu, Apr 04, 2019 at 12:31:47AM +0800, Wu Hao wrote:
> On Tue, Apr 02, 2019 at 07:59:25AM -0700, Moritz Fischer wrote:
> > Hi Wu,
> > 
> > On Mon, Mar 25, 2019 at 11:07:41AM +0800, Wu Hao wrote:
> > > This patch adds support to thermal management private feature for DFL
> > > FPGA Management Engine (FME). As thermal throttling is handled by
> > > hardware automatically per pre-defined thresholds, this private
> > > feature driver only provides read-only sysfs interfaces for user
> > > to read temperature, thresholds, threshold policy and other info.
> > > 
> > > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > > ---
> > >  Documentation/ABI/testing/sysfs-platform-dfl-fme |  56 +++++++
> > >  drivers/fpga/dfl-fme-main.c                      | 202 +++++++++++++++++++++++
> > >  2 files changed, 258 insertions(+)
> > > 
> > > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > index b8327e9..d3aeb88 100644
> > > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > @@ -44,3 +44,59 @@ Description:	Read-only. It returns socket_id to indicate which socket
> > >  		this FPGA belongs to, only valid for integrated solution.
> > >  		User only needs this information, in case standard numa node
> > >  		can't provide correct information.
> > > +
> > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/temperature
> > > +Date:		March 2019
> > > +KernelVersion:  5.2
> > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > +Description:	Read-only. It returns temperature (in Celsius) of this FPGA
> > > +		device.
> > > +
> > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1
> > > +Date:		March 2019
> > > +KernelVersion:  5.2
> > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > +Description:	Read-only. Read this file to get the temperature threshold1
> > > +		(in Celsius).
> > > +
> > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold2
> > > +Date:		March 2019
> > > +KernelVersion:  5.2
> > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > +Description:	Read-only. Read this file to get the temperature threshold2
> > > +		(in Celsius).
> > > +
> > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/trip_threshold
> > > +Date:		March 2019
> > > +KernelVersion:  5.2
> > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > +Description:	Read-only. It returns trip threshold (in Celsius), once FPGA
> > > +		temperature reaches trip threshold, it triggers a fatal event
> > > +		to board management controller (BMC) to shutdown FPGA.
> > > +
> > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1_status
> > > +Date:		March 2019
> > > +KernelVersion:  5.2
> > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > +Description:	Read-only. It returns 1 if temperature reaches threshold1,
> > > +		otherwise 0. Once temperature reaches threshold1, hardware
> > > +		will automatically enter throttling state (AP1 - 50%
> > > +		or AP2 - 90% throttling, see 'threshold1_policy').
> > > +
> > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold2_status
> > > +Date:		March 2019
> > > +KernelVersion:  5.2
> > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > +Description:	Read-only. It returns 1 if temperature reaches threshold2,
> > > +		otherwise 0. Once temperature reaches threshold2, hardware
> > > +		will automatically enter the deepest throttling state (AP6
> > > +		- 100% throttling).
> > > +
> > > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1_policy
> > > +Date:		March 2019
> > > +KernelVersion:  5.2
> > > +Contact:	Wu Hao <hao.wu@intel.com>
> > > +Description:	Read-only. Read this file to get the policy of temperature
> > > +		threshold1. It only supports two value (policy):
> > > +		    0 - AP2 state (90% throttling)
> > > +		    1 - AP1 state (50% throttling)
> > 
> > These look like they could directly map to the linux thermal framework,
> > any reason you can't use the thermal framework?
> > 
> > The trip stuff literally maps 1:1 to what a thermal driver does, I think
> > that's something you'd wanna consider.
> > 
> 
> Hi Moritz,
> 
> Thanks a lot for the suggestion. As I understand it, the trip points in a
> thermal zone indicate cooling actions that thermal software, either in the
> kernel or in userspace, is required to take. In this case the FPGA hardware
> handles cooling automatically (yes, the driver only exposes read-only sysfs
> files for information), so software doesn't need to take care of this at all.
> For that reason it seems we don't have to expose these thresholds as trip
> points. Also, as I understand it, people who use such an FPGA device may need
> to know the current hardware throttling behavior, e.g. 50% vs 90%. This
> information can't be provided by the standard thermal zone sysfs, so users
> need these sysfs interfaces anyway. That said, we could still create a thermal
> zone without trip points. That would help users who want to connect external
> cooling devices via a userspace thermal daemon: they can define whatever trip
> points they like to activate the external cooling device. I will consider
> this further and come up with a new patch in the v2 patchset.

Generally speaking, extending an existing framework with the
functionality you want is preferable to rolling 100% your own.

So please look into this.

Thanks,
Moritz

^ permalink raw reply

* Re: [PATCH v4 1/6] arm64: HWCAP: add support for AT_HWCAP2
From: Dave Martin @ 2019-04-03 16:33 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Catalin Marinas, Will Deacon, Szabolcs Nagy, linux-arm-kernel,
	Mark Rutland, Phil Blundell, libc-alpha, linux-api,
	Suzuki K Poulose
In-Reply-To: <20190403160622.GM53702@e119886-lin.cambridge.arm.com>

On Wed, Apr 03, 2019 at 05:06:23PM +0100, Andrew Murray wrote:
> On Wed, Apr 03, 2019 at 02:21:12PM +0100, Dave Martin wrote:
> > On Wed, Apr 03, 2019 at 11:56:23AM +0100, Andrew Murray wrote:
> > > As we will exhaust the first 32 bits of AT_HWCAP let's start
> > > exposing AT_HWCAP2 to userspace to give us up to 64 caps.
> > > 
> > > Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we
> > > prefer to expand into AT_HWCAP2 in order to provide a consistent
> > > view to userspace between ILP32 and LP64. However internal to the
> > > kernel we prefer to continue to use the full space of elf_hwcap.
> > > 
> > > To reduce complexity and allow for future expansion, we now
> > > represent hwcaps in the kernel as ordinals and use a
> > > KERNEL_HWCAP_ prefix. This allows us to support automatic feature
> > > based module loading for all our hwcaps.
> > > 
> > > We introduce cpu_set_feature to set hwcaps which complements the
> > > existing cpu_have_feature helper. These helpers allow us to clean
> > > up existing direct uses of elf_hwcap and reduce any future effort
> > > required to move beyond 64 caps.
> > > 
> > > For convenience we also introduce cpu_{have,set}_named_feature which
> > > makes use of the cpu_feature macro to allow providing a hwcap name
> > > without a {KERNEL_}HWCAP_ prefix.
> > > 
> > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > 
> > [...]
> > 
> > > diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
> > > index 400b80b49595..1f38a2740f7a 100644
> > > --- a/arch/arm64/include/asm/hwcap.h
> > > +++ b/arch/arm64/include/asm/hwcap.h
> > > @@ -39,12 +39,61 @@
> > >  #define COMPAT_HWCAP2_SHA2	(1 << 3)
> > >  #define COMPAT_HWCAP2_CRC32	(1 << 4)
> > >  
> > > +/*
> > > + * For userspace we represent hwcaps as a collection of HWCAP{,2}_x bitfields
> > > + * as described in uapi/asm/hwcap.h. For the kernel we represent hwcaps as
> > > + * natural numbers (in a single range of size MAX_CPU_FEATURES) defined here
> > > + * with prefix KERNEL_HWCAP_ mapped to their HWCAP{,2}_x counterpart.
> > > + *
> > > + * Hwcaps should be set and tested within the kernel via the
> > > + * cpu_{set,have}_named_feature(feature) where feature is the unique suffix
> > > + * of KERNEL_HWCAP_{feature}.
> > > + */
> > > +#define __khwcap_feature(x)		ilog2(HWCAP_ ## x)
> > 
> > Hmm, I didn't spot this before, but we should probably include
> > <linux/log2.h>.  This isn't asm-friendly however.
> 
> Doh!
> 
> > 
> > <asm/hwcap.h> gets included (unnecessarily?) by arch/arm64/mm/proc.S and
> > arch/arm64/include/uapi/asm/ptrace.h.
> 
> I also can't see any reason why either of these files includes hwcap.h...

Maybe we could just drop that include from proc.S.

> > Rather than risk breaking a UAPI header, can we remove the ilog2() here
> > and add it back into cpu_feature() where it was originally?
> 
> No I don't think we can. 

Agreed: userspace may be relying (however unwisely) on getting the
hwcaps as a side-effect of <uapi/asm/ptrace.h>, so we can't do much
about that one without taking a risk.

> > There may be a reason why this didn't work that I've forgotten...
> 
> We need the UAPI HWCAP_xx's to be bitfields and we've decided that we should
> limit them to 32 bits. Thus UAPI HWCAP2_xx's will also live within the first
> 32 bits meaning that we can't distinguish between them based on their value.
> 
> This isn't ideal within the kernel, as it means if we store the value
> anywhere (such as struct arm64_cpu_capabilities) then we need to also store
> some additional information to identify if it's AT_HWCAP or AT_HWCAP2.

But we could keep shadow kernel #defines that (for hwcap2) are shifted
up by 32 bits?  This would require anything that deals with hwcap numbers
to cope with them being giant numbers that fit in an unsigned long, not
just small integers (which possibly doesn't work without core changes?)

> In some cases (automatic hwcap based module loading) it's not possible to work
> around this - which is why arm32 can only support this for their elf_hwcap2.
> The approach this series takes allows automatic module loading to work based
> on any hwcap.
> 
> The solutions I can come up with at the moment are:
> 
>  - hard code the mapping without ilog2, as follows, though this is error
>    prone
> 
> #define KERNEL_HWCAP_ASIMD              1
> 
>  - Move the #ifndef __ASSEMBLY__ in include/asm/hwcap.h above the definitions
>    of KERNEL_HWCAP_xx and include <linux/log2.h> inside that #ifndef
>    __ASSEMBLY__ block. This works but we can't test for hwcaps in assembly -
>    maybe this isn't a problem?

Since this is a kernel header, this is probably OK: if asm needs the
hwcaps, sooner or later someone will need to fix it.

Possibly there are out-of-tree drivers relying on using the hwcaps from
assembly, but that's probably their own problem.

So, either move the #ifndef for simplicity, or introduce the ordinals
into <uapi/asm/hwcap.h>:

#define __HWCAP_NR_FP		0
#define __HWCAP_NR_ASIMD	1
#define __HWCAP_NR_EVTSTRM	2

...

#define HWCAP_FP	(1UL << __HWCAP_NR_FP)
#define HWCAP_ASIMD	(1UL << __HWCAP_NR_ASIMD)
#define HWCAP_EVTSTRM	(1UL << __HWCAP_NR_EVTSTRM)

...

#define __HWCAP2_NR_DCPODP	0

#define HWCAP2_DCPODP	(1UL << __HWCAP2_NR_DCPODP)

...

then use the __HWCAP{,2}_NR_ constants directly in place of the
KERNEL_HWCAP_ #defines, or define the KERNEL_HWCAP_ #defines in terms of
them.
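
For illustration only, here is a minimal userspace sketch of the second
variant (KERNEL_HWCAP_ #defines expressed in terms of the ordinals). The
"+ 32" offset for AT_HWCAP2 and the two helper functions are assumptions
made for this sketch, not part of any posted patch:

```c
#include <stdbool.h>

/* Ordinals: bit position of each flag within its own auxv word. */
#define __HWCAP_NR_FP		0
#define __HWCAP_NR_ASIMD	1
#define __HWCAP_NR_EVTSTRM	2
#define __HWCAP2_NR_DCPODP	0

/* UAPI masks, derived from the ordinals. */
#define HWCAP_FP	(1UL << __HWCAP_NR_FP)
#define HWCAP_ASIMD	(1UL << __HWCAP_NR_ASIMD)
#define HWCAP_EVTSTRM	(1UL << __HWCAP_NR_EVTSTRM)
#define HWCAP2_DCPODP	(1UL << __HWCAP2_NR_DCPODP)

/*
 * Assumed kernel-side numbering: AT_HWCAP ordinals occupy 0..31 and
 * AT_HWCAP2 ordinals follow at 32..63. No ilog2() is involved.
 */
#define KERNEL_HWCAP_FP		__HWCAP_NR_FP
#define KERNEL_HWCAP_ASIMD	__HWCAP_NR_ASIMD
#define KERNEL_HWCAP_EVTSTRM	__HWCAP_NR_EVTSTRM
#define KERNEL_HWCAP_DCPODP	(__HWCAP2_NR_DCPODP + 32)

/* Hypothetical helper: does this kernel ordinal belong to AT_HWCAP2? */
static inline bool khwcap_is_hwcap2(unsigned int nr)
{
	return nr >= 32;
}

/* Hypothetical helper: UAPI mask within whichever word nr belongs to. */
static inline unsigned long khwcap_uapi_mask(unsigned int nr)
{
	return 1UL << (nr % 32);
}
```

Because every KERNEL_HWCAP_ value here is a plain integer expression, the
#defines themselves stay usable from assembly; only the C helpers would
need to sit behind #ifndef __ASSEMBLY__.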

This is a noisy approach though, and I'm not totally convinced it's
better.

What do you think?

Cheers
---Dave

^ permalink raw reply

* Re: [PATCH 14/17] fpga: dfl: fme: add thermal management support
From: Wu Hao @ 2019-04-03 16:31 UTC (permalink / raw)
  To: Moritz Fischer
  Cc: atull, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Russ Weight, Xu Yilun
In-Reply-To: <20190402145925.GA15773@archbook>

On Tue, Apr 02, 2019 at 07:59:25AM -0700, Moritz Fischer wrote:
> Hi Wu,
> 
> On Mon, Mar 25, 2019 at 11:07:41AM +0800, Wu Hao wrote:
> > This patch adds support to thermal management private feature for DFL
> > FPGA Management Engine (FME). As thermal throttling is handled by
> > hardware automatically per pre-defined thresholds, this private
> > feature driver only provides read-only sysfs interfaces for user
> > to read temperature, thresholds, threshold policy and other info.
> > 
> > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > ---
> >  Documentation/ABI/testing/sysfs-platform-dfl-fme |  56 +++++++
> >  drivers/fpga/dfl-fme-main.c                      | 202 +++++++++++++++++++++++
> >  2 files changed, 258 insertions(+)
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > index b8327e9..d3aeb88 100644
> > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > @@ -44,3 +44,59 @@ Description:	Read-only. It returns socket_id to indicate which socket
> >  		this FPGA belongs to, only valid for integrated solution.
> >  		User only needs this information, in case standard numa node
> >  		can't provide correct information.
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/temperature
> > +Date:		March 2019
> > +KernelVersion:  5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-only. It returns temperature (in Celsius) of this FPGA
> > +		device.
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1
> > +Date:		March 2019
> > +KernelVersion:  5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-only. Read this file to get the temperature threshold1
> > +		(in Celsius).
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold2
> > +Date:		March 2019
> > +KernelVersion:  5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-only. Read this file to get the temperature threshold2
> > +		(in Celsius).
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/trip_threshold
> > +Date:		March 2019
> > +KernelVersion:  5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-only. It returns trip threshold (in Celsius), once FPGA
> > +		temperature reaches trip threshold, it triggers a fatal event
> > +		to board management controller (BMC) to shutdown FPGA.
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1_status
> > +Date:		March 2019
> > +KernelVersion:  5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-only. It returns 1 if temperature reaches threshold1,
> > +		otherwise 0. Once temperature reaches threshold1, hardware
> > +		will automatically enter throttling state (AP1 - 50%
> > +		or AP2 - 90% throttling, see 'threshold1_policy').
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold2_status
> > +Date:		March 2019
> > +KernelVersion:  5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-only. It returns 1 if temperature reaches threshold2,
> > +		otherwise 0. Once temperature reaches threshold2, hardware
> > +		will automatically enter the deepest throttling state (AP6
> > +		- 100% throttling).
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1_policy
> > +Date:		March 2019
> > +KernelVersion:  5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-only. Read this file to get the policy of temperature
> > +		threshold1. It only supports two value (policy):
> > +		    0 - AP2 state (90% throttling)
> > +		    1 - AP1 state (50% throttling)
> 
> These look like they could directly map to the linux thermal framework,
> any reason you can't use the thermal framework?
> 
> The trip stuff literally maps 1:1 to what a thermal driver does, I think
> that's something you'd wanna consider.
> 

Hi Moritz,

Thanks a lot for the suggestion. As I understand it, the trip points in a
thermal zone indicate cooling actions that thermal software, either in the
kernel or in userspace, is required to take. In this case the FPGA hardware
handles cooling automatically (yes, the driver only exposes read-only sysfs
files for information), so software doesn't need to take care of this at all.
For that reason it seems we don't have to expose these thresholds as trip
points. Also, as I understand it, people who use such an FPGA device may need
to know the current hardware throttling behavior, e.g. 50% vs 90%. This
information can't be provided by the standard thermal zone sysfs, so users
need these sysfs interfaces anyway. That said, we could still create a thermal
zone without trip points. That would help users who want to connect external
cooling devices via a userspace thermal daemon: they can define whatever trip
points they like to activate the external cooling device. I will consider
this further and come up with a new patch in the v2 patchset.
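
As a rough illustration of how userspace might consume the proposed
threshold1_policy file, here is a minimal sketch. The 0/1 meanings are
taken from the ABI text quoted above; the helper names are made up for
this example, and the sysfs path is only the one proposed in this patch,
so it may differ in whatever finally gets merged:

```c
#include <stdio.h>

/* Map the proposed threshold1_policy value to a readable description. */
static const char *policy_name(int policy)
{
	switch (policy) {
	case 0:
		return "AP2 state (90% throttling)";
	case 1:
		return "AP1 state (50% throttling)";
	default:
		return "unknown";
	}
}

/*
 * Read one decimal integer from a sysfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_sysfs_int(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would pass something like
"/sys/bus/platform/devices/dfl-fme.0/thermal_mgmt/threshold1_policy" to
read_sysfs_int() and feed the result to policy_name().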

Thanks
Hao

> Cheers,
> Moritz

^ permalink raw reply

* Re: [PATCH v4 1/6] arm64: HWCAP: add support for AT_HWCAP2
From: Andrew Murray @ 2019-04-03 16:06 UTC (permalink / raw)
  To: Dave Martin
  Cc: Mark Rutland, libc-alpha, Suzuki K Poulose, Szabolcs Nagy,
	Catalin Marinas, Will Deacon, Phil Blundell, linux-api,
	linux-arm-kernel
In-Reply-To: <20190403132112.GP3567@e103592.cambridge.arm.com>

On Wed, Apr 03, 2019 at 02:21:12PM +0100, Dave Martin wrote:
> On Wed, Apr 03, 2019 at 11:56:23AM +0100, Andrew Murray wrote:
> > As we will exhaust the first 32 bits of AT_HWCAP let's start
> > exposing AT_HWCAP2 to userspace to give us up to 64 caps.
> > 
> > Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we
> > prefer to expand into AT_HWCAP2 in order to provide a consistent
> > view to userspace between ILP32 and LP64. However internal to the
> > kernel we prefer to continue to use the full space of elf_hwcap.
> > 
> > To reduce complexity and allow for future expansion, we now
> > represent hwcaps in the kernel as ordinals and use a
> > KERNEL_HWCAP_ prefix. This allows us to support automatic feature
> > based module loading for all our hwcaps.
> > 
> > We introduce cpu_set_feature to set hwcaps which complements the
> > existing cpu_have_feature helper. These helpers allow us to clean
> > up existing direct uses of elf_hwcap and reduce any future effort
> > required to move beyond 64 caps.
> > 
> > For convenience we also introduce cpu_{have,set}_named_feature which
> > makes use of the cpu_feature macro to allow providing a hwcap name
> > without a {KERNEL_}HWCAP_ prefix.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> 
> [...]
> 
> > diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
> > index 400b80b49595..1f38a2740f7a 100644
> > --- a/arch/arm64/include/asm/hwcap.h
> > +++ b/arch/arm64/include/asm/hwcap.h
> > @@ -39,12 +39,61 @@
> >  #define COMPAT_HWCAP2_SHA2	(1 << 3)
> >  #define COMPAT_HWCAP2_CRC32	(1 << 4)
> >  
> > +/*
> > + * For userspace we represent hwcaps as a collection of HWCAP{,2}_x bitfields
> > + * as described in uapi/asm/hwcap.h. For the kernel we represent hwcaps as
> > + * natural numbers (in a single range of size MAX_CPU_FEATURES) defined here
> > + * with prefix KERNEL_HWCAP_ mapped to their HWCAP{,2}_x counterpart.
> > + *
> > + * Hwcaps should be set and tested within the kernel via the
> > + * cpu_{set,have}_named_feature(feature) where feature is the unique suffix
> > + * of KERNEL_HWCAP_{feature}.
> > + */
> > +#define __khwcap_feature(x)		ilog2(HWCAP_ ## x)
> 
> Hmm, I didn't spot this before, but we should probably include
> <linux/log2.h>.  This isn't asm-friendly however.

Doh!

> 
> <asm/hwcap.h> gets included (unnecessarily?) by arch/arm64/mm/proc.S and
> arch/arm64/include/uapi/asm/ptrace.h.

I also can't see any reason why either of these files includes hwcap.h...

> 
> Rather than risk breaking a UAPI header, can we remove the ilog2() here
> and add it back into cpu_feature() where it was originally?

No I don't think we can. 

> 
> There may be a reason why this didn't work that I've forgotten...

We need the UAPI HWCAP_xx's to be bitfields and we've decided that we should
limit them to 32 bits. Thus UAPI HWCAP2_xx's will also live within the first
32 bits meaning that we can't distinguish between them based on their value.

This isn't ideal within the kernel, as it means if we store the value
anywhere (such as struct arm64_cpu_capabilities) then we need to also store
some additional information to identify if it's AT_HWCAP or AT_HWCAP2.

In some cases (automatic hwcap based module loading) it's not possible to work
around this - which is why arm32 can only support this for their elf_hwcap2.
The approach this series takes allows automatic module loading to work based
on any hwcap.
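
To make the numbering concrete, here is a small userspace sketch of the
idea (not the code from the series). bit_index() stands in for the
kernel's ilog2(), elf_hwcap_bits stands in for elf_hwcap, and the helper
bodies are assumptions for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* UAPI-style flags: both words only use their low 32 bits. */
#define HWCAP_FP	(1UL << 0)
#define HWCAP_ASIMD	(1UL << 1)
#define HWCAP2_DCPODP	(1UL << 0)	/* same value as HWCAP_FP */

/* Stand-in for ilog2(); only meaningful for single-bit masks. */
static inline unsigned int bit_index(unsigned long mask)
{
	unsigned int n = 0;

	while (mask >>= 1)
		n++;
	return n;
}

/* Kernel-side ordinals: one range, HWCAP2 entries offset by 32. */
#define KERNEL_HWCAP_FP		bit_index(HWCAP_FP)
#define KERNEL_HWCAP_ASIMD	bit_index(HWCAP_ASIMD)
#define KERNEL_HWCAP_DCPODP	(bit_index(HWCAP2_DCPODP) + 32)

static uint64_t elf_hwcap_bits;	/* stand-in for the kernel's elf_hwcap */

static void cpu_set_feature(unsigned int nr)
{
	elf_hwcap_bits |= UINT64_C(1) << nr;
}

static bool cpu_have_feature(unsigned int nr)
{
	return (elf_hwcap_bits & (UINT64_C(1) << nr)) != 0;
}

/* What userspace would see: low word as AT_HWCAP, high as AT_HWCAP2. */
static uint32_t at_hwcap(void)
{
	return (uint32_t)elf_hwcap_bits;
}

static uint32_t at_hwcap2(void)
{
	return (uint32_t)(elf_hwcap_bits >> 32);
}
```

HWCAP_FP and HWCAP2_DCPODP have the same UAPI value, but their ordinals
(0 and 32) differ, so a single stored number is enough to identify the
capability.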

The solutions I can come up with at the moment are:

 - hard code the mapping without ilog2, as follows, though this is error
   prone

#define KERNEL_HWCAP_ASIMD              1

 - Move the #ifndef __ASSEMBLY__ in include/asm/hwcap.h above the definitions
   of KERNEL_HWCAP_xx and include <linux/log2.h> inside that #ifndef
   __ASSEMBLY__ block. This works but we can't test for hwcaps in assembly -
   maybe this isn't a problem?

Thanks,

Andrew Murray

> 
> cpufeatures is the only place where we use the KERNEL_HWCAP_foo flags
> directly.
> 
> > +#define KERNEL_HWCAP_FP			__khwcap_feature(FP)
> > +#define KERNEL_HWCAP_ASIMD		__khwcap_feature(ASIMD)
> > +#define KERNEL_HWCAP_EVTSTRM		__khwcap_feature(EVTSTRM)
> 

> [...]
> 
> Otherwise, looks OK to me.

Thanks for the review.

Andrew Murray

> 
> Cheers
> ---Dave

^ permalink raw reply

* Re: [PATCH v4 6/6] arm64: Advertise ARM64_HAS_DCPODP cpu feature
From: Suzuki K Poulose @ 2019-04-03 13:48 UTC (permalink / raw)
  To: andrew.murray, catalin.marinas, will.deacon
  Cc: Szabolcs.Nagy, dave.martin, linux-arm-kernel, mark.rutland, pb,
	libc-alpha, linux-api
In-Reply-To: <20190403105628.39798-7-andrew.murray@arm.com>

On 04/03/2019 11:56 AM, Andrew Murray wrote:
> Advertise ARM64_HAS_DCPODP when both DC CVAP and DC CVADP are supported.
> 
> Even though we don't use this feature now, we provide it for consistency
> with DCPOP and anticipate it being used in the future.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>   arch/arm64/include/asm/cpucaps.h | 3 ++-
>   arch/arm64/kernel/cpufeature.c   | 9 +++++++++
>   2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index f6a76e43f39e..defdc67d9ab4 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -61,7 +61,8 @@
>   #define ARM64_HAS_GENERIC_AUTH_ARCH		40
>   #define ARM64_HAS_GENERIC_AUTH_IMP_DEF		41
>   #define ARM64_HAS_IRQ_PRIO_MASKING		42
> +#define ARM64_HAS_DCPODP			43
>   
> -#define ARM64_NCAPS				43
> +#define ARM64_NCAPS				44
>   
>   #endif /* __ASM_CPUCAPS_H */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index f8b682a3a9f4..4ee5d63281ae 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1340,6 +1340,15 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>   		.field_pos = ID_AA64ISAR1_DPB_SHIFT,
>   		.min_field_value = 1,
>   	},
> +	{
> +		.desc = "Data cache clean to Point of Deep Persistence",
> +		.capability = ARM64_HAS_DCPODP,
> +		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> +		.matches = has_cpuid_feature,
> +		.sys_reg = SYS_ID_AA64ISAR1_EL1,
> +		.field_pos = ID_AA64ISAR1_DPB_SHIFT,
> +		.min_field_value = 2,

Missing .sign field. Otherwise:

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
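
For readers wondering why the signedness of an ID register field matters,
here is a small standalone sketch. It assumes 4-bit fields and uses
made-up helper names; it is not the kernel's extraction code. In C, an
omitted .sign member is zero-initialized like any other omitted member;
how the kernel's matching code treats that zero is not shown in this
thread, so spelling the field out avoids relying on the default:

```c
#include <stdbool.h>
#include <stdint.h>

/* Interpret the 4-bit field at 'shift' as an unsigned value (0..15). */
static int extract_unsigned_field(uint64_t reg, unsigned int shift)
{
	return (int)((reg >> shift) & 0xf);
}

/* Interpret the same 4 bits as a signed value (-8..7). */
static int extract_signed_field(uint64_t reg, unsigned int shift)
{
	int v = (int)((reg >> shift) & 0xf);

	return (v & 0x8) ? v - 16 : v;
}

/* Hypothetical matcher: is the field at least min_value? */
static bool field_at_least(uint64_t reg, unsigned int shift,
			   bool is_signed, int min_value)
{
	int val = is_signed ? extract_signed_field(reg, shift)
			    : extract_unsigned_field(reg, shift);

	return val >= min_value;
}
```

With a raw field value of 0xf, the unsigned reading (15) satisfies a
minimum of 2 while the signed reading (-1) does not, so the two
interpretations can disagree on whether a feature is present.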

^ permalink raw reply

* Re: [PATCH v4 2/6] arm64: HWCAP: encapsulate elf_hwcap
From: Suzuki K Poulose @ 2019-04-03 13:42 UTC (permalink / raw)
  To: andrew.murray, catalin.marinas, will.deacon
  Cc: Szabolcs.Nagy, dave.martin, linux-arm-kernel, mark.rutland, pb,
	libc-alpha, linux-api
In-Reply-To: <20190403105628.39798-3-andrew.murray@arm.com>

On 04/03/2019 11:56 AM, Andrew Murray wrote:
> The introduction of AT_HWCAP2 introduced accessors which ensure that
> hwcap features are set and tested appropriately.
> 
> Let's now mandate access to elf_hwcap via these accessors by making
> elf_hwcap static within cpufeature.c.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

^ permalink raw reply

* Re: [PATCH v4 6/6] arm64: Advertise ARM64_HAS_DCPODP cpu feature
From: Dave Martin @ 2019-04-03 13:21 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Catalin Marinas, Will Deacon, Szabolcs Nagy, linux-arm-kernel,
	Mark Rutland, Phil Blundell, libc-alpha, linux-api,
	Suzuki K Poulose
In-Reply-To: <20190403105628.39798-7-andrew.murray@arm.com>

On Wed, Apr 03, 2019 at 11:56:28AM +0100, Andrew Murray wrote:
> Advertise ARM64_HAS_DCPODP when both DC CVAP and DC CVADP are supported.
> 
> Even though we don't use this feature now, we provide it for consistency
> with DCPOP and anticipate it being used in the future.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>

Reviewed-by: Dave Martin <Dave.Martin@arm.com>

> ---
>  arch/arm64/include/asm/cpucaps.h | 3 ++-
>  arch/arm64/kernel/cpufeature.c   | 9 +++++++++
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index f6a76e43f39e..defdc67d9ab4 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -61,7 +61,8 @@
>  #define ARM64_HAS_GENERIC_AUTH_ARCH		40
>  #define ARM64_HAS_GENERIC_AUTH_IMP_DEF		41
>  #define ARM64_HAS_IRQ_PRIO_MASKING		42
> +#define ARM64_HAS_DCPODP			43
>  
> -#define ARM64_NCAPS				43
> +#define ARM64_NCAPS				44
>  
>  #endif /* __ASM_CPUCAPS_H */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index f8b682a3a9f4..4ee5d63281ae 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1340,6 +1340,15 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.field_pos = ID_AA64ISAR1_DPB_SHIFT,
>  		.min_field_value = 1,
>  	},
> +	{
> +		.desc = "Data cache clean to Point of Deep Persistence",
> +		.capability = ARM64_HAS_DCPODP,
> +		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> +		.matches = has_cpuid_feature,
> +		.sys_reg = SYS_ID_AA64ISAR1_EL1,
> +		.field_pos = ID_AA64ISAR1_DPB_SHIFT,
> +		.min_field_value = 2,
> +	},
>  #endif
>  #ifdef CONFIG_ARM64_SVE
>  	{
> -- 
> 2.21.0
> 

^ permalink raw reply

* Re: [PATCH v4 5/6] arm64: add CVADP support to the cache maintenance helper
From: Dave Martin @ 2019-04-03 13:21 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Catalin Marinas, Will Deacon, Szabolcs Nagy, linux-arm-kernel,
	Mark Rutland, Phil Blundell, libc-alpha, linux-api,
	Suzuki K Poulose
In-Reply-To: <20190403105628.39798-6-andrew.murray@arm.com>

On Wed, Apr 03, 2019 at 11:56:27AM +0100, Andrew Murray wrote:
> Allow users of dcache_by_line_op to specify cvadp as an op.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> ---
>  arch/arm64/include/asm/assembler.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index c5308d01e228..d50caf0e6b64 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -407,10 +407,14 @@ alternative_endif
>  	.ifc	\op, cvap
>  	sys	3, c7, c12, 1, \kaddr	// dc cvap
>  	.else
> +	.ifc	\op, cvadp
> +	sys	3, c7, c13, 1, \kaddr	// dc cvadp
> +	.else
>  	dc	\op, \kaddr
>  	.endif
>  	.endif
>  	.endif
> +	.endif

This is a bit annoying, but short of moving this .if chain into a
separate macro and doing something like:

	.ifc	\op, cvap
	sys	3, c7, c12, 1, \kaddr	// dc cvap
	.exitm
	.endif

	.ifc	\op, cvadp
	sys	3, c7, c13, 1, \kaddr	// dc cvadp
	.exitm
	.endif

	// ...

I don't see an obvious fix.  For now, this seems like overkill...


Anyway, with the patch as-is:

Reviewed-by: Dave Martin <Dave.Martin@arm.com>


It's logical to have dcache_by_line_op understand cvadp, even if we
don't use it yet.

Cheers
---Dave

^ permalink raw reply

* Re: [PATCH v4 4/6] arm64: Expose DC CVADP to userspace
From: Dave Martin @ 2019-04-03 13:21 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Catalin Marinas, Will Deacon, Szabolcs Nagy, linux-arm-kernel,
	Mark Rutland, Phil Blundell, libc-alpha, linux-api,
	Suzuki K Poulose
In-Reply-To: <20190403105628.39798-5-andrew.murray@arm.com>

On Wed, Apr 03, 2019 at 11:56:26AM +0100, Andrew Murray wrote:
> ARMv8.5 builds upon the ARMv8.2 DC CVAP instruction by introducing a DC
> CVADP instruction which cleans the data cache to the point of deep
> persistence. Let's expose this support via the arm64 ELF hwcaps.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>

Reviewed-by: Dave Martin <Dave.Martin@arm.com>

> ---
>  Documentation/arm64/elf_hwcaps.txt  | 4 ++++
>  arch/arm64/include/asm/hwcap.h      | 1 +
>  arch/arm64/include/uapi/asm/hwcap.h | 5 +++++
>  arch/arm64/kernel/cpufeature.c      | 1 +
>  arch/arm64/kernel/cpuinfo.c         | 1 +
>  5 files changed, 12 insertions(+)
> 
> diff --git a/Documentation/arm64/elf_hwcaps.txt b/Documentation/arm64/elf_hwcaps.txt
> index c04f8e87bab8..7b591c1dcb53 100644
> --- a/Documentation/arm64/elf_hwcaps.txt
> +++ b/Documentation/arm64/elf_hwcaps.txt
> @@ -135,6 +135,10 @@ HWCAP_DCPOP
>  
>      Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0001.
>  
> +HWCAP2_DCPODP
> +
> +    Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0010.
> +
>  HWCAP_SHA3
>  
>      Functionality implied by ID_AA64ISAR0_EL1.SHA3 == 0b0001.
> diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
> index 509ee0d2fa0f..732bdd7286c8 100644
> --- a/arch/arm64/include/asm/hwcap.h
> +++ b/arch/arm64/include/asm/hwcap.h
> @@ -85,6 +85,7 @@
>  #define KERNEL_HWCAP_PACG		__khwcap_feature(PACG)
>  
>  #define __khwcap2_feature(x)		(ilog2(HWCAP2_ ## x) + 32)
> +#define KERNEL_HWCAP_DCPODP		__khwcap2_feature(DCPODP)
>  
>  #ifndef __ASSEMBLY__
>  
> diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
> index 453b45af80b7..d64af3913a9e 100644
> --- a/arch/arm64/include/uapi/asm/hwcap.h
> +++ b/arch/arm64/include/uapi/asm/hwcap.h
> @@ -53,4 +53,9 @@
>  #define HWCAP_PACA		(1 << 30)
>  #define HWCAP_PACG		(1UL << 31)
>  
> +/*
> + * HWCAP2 flags - for AT_HWCAP2
> + */
> +#define HWCAP2_DCPODP		(1 << 0)
> +
>  #endif /* _UAPI__ASM_HWCAP_H */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index a655d1bb1186..f8b682a3a9f4 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1591,6 +1591,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
>  	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDHP),
>  	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_DIT_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DIT),
>  	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DCPOP),
> +	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_DCPODP),
>  	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_JSCVT_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_JSCVT),
>  	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FCMA_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FCMA),
>  	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_LRCPC),
> diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
> index 810db95f293f..093ca53ce1d1 100644
> --- a/arch/arm64/kernel/cpuinfo.c
> +++ b/arch/arm64/kernel/cpuinfo.c
> @@ -85,6 +85,7 @@ static const char *const hwcap_str[] = {
>  	"sb",
>  	"paca",
>  	"pacg",
> +	"dcpodp",
>  	NULL
>  };
>  
> -- 
> 2.21.0
> 


* Re: [PATCH v4 3/6] arm64: Handle trapped DC CVADP
From: Dave Martin @ 2019-04-03 13:21 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Catalin Marinas, Will Deacon, Szabolcs Nagy, linux-arm-kernel,
	Mark Rutland, Phil Blundell, libc-alpha, linux-api,
	Suzuki K Poulose
In-Reply-To: <20190403105628.39798-4-andrew.murray@arm.com>

On Wed, Apr 03, 2019 at 11:56:25AM +0100, Andrew Murray wrote:
> The ARMv8.5 DC CVADP instruction may be trapped to EL1 via
> SCTLR_EL1.UCI therefore let's provide a handler for it.
> 
> Just like the CVAP instruction we use a 'sys' instruction instead of
> the 'dc' alias to avoid build issues with older toolchains.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Reviewed-by: Dave Martin <Dave.Martin@arm.com>

> ---
>  arch/arm64/include/asm/esr.h | 3 ++-
>  arch/arm64/kernel/traps.c    | 3 +++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
> index 52233f00d53d..07d5c026a0b3 100644
> --- a/arch/arm64/include/asm/esr.h
> +++ b/arch/arm64/include/asm/esr.h
> @@ -198,9 +198,10 @@
>  /*
>   * User space cache operations have the following sysreg encoding
>   * in System instructions.
> - * op0=1, op1=3, op2=1, crn=7, crm={ 5, 10, 11, 12, 14 }, WRITE (L=0)
> + * op0=1, op1=3, op2=1, crn=7, crm={ 5, 10, 11, 12, 13, 14 }, WRITE (L=0)
>   */
>  #define ESR_ELx_SYS64_ISS_CRM_DC_CIVAC	14
> +#define ESR_ELx_SYS64_ISS_CRM_DC_CVADP	13
>  #define ESR_ELx_SYS64_ISS_CRM_DC_CVAP	12
>  #define ESR_ELx_SYS64_ISS_CRM_DC_CVAU	11
>  #define ESR_ELx_SYS64_ISS_CRM_DC_CVAC	10
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 8ad119c3f665..f66e1ddbe4a7 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -459,6 +459,9 @@ static void user_cache_maint_handler(unsigned int esr, struct pt_regs *regs)
>  	case ESR_ELx_SYS64_ISS_CRM_DC_CVAC:	/* DC CVAC, gets promoted */
>  		__user_cache_maint("dc civac", address, ret);
>  		break;
> +	case ESR_ELx_SYS64_ISS_CRM_DC_CVADP:	/* DC CVADP */
> +		__user_cache_maint("sys 3, c7, c13, 1", address, ret);
> +		break;
>  	case ESR_ELx_SYS64_ISS_CRM_DC_CVAP:	/* DC CVAP */
>  		__user_cache_maint("sys 3, c7, c12, 1", address, ret);
>  		break;
> -- 
> 2.21.0
> 


* Re: [PATCH v4 2/6] arm64: HWCAP: encapsulate elf_hwcap
From: Dave Martin @ 2019-04-03 13:21 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Catalin Marinas, Will Deacon, Szabolcs Nagy, linux-arm-kernel,
	Mark Rutland, Phil Blundell, libc-alpha, linux-api,
	Suzuki K Poulose
In-Reply-To: <20190403105628.39798-3-andrew.murray@arm.com>

On Wed, Apr 03, 2019 at 11:56:24AM +0100, Andrew Murray wrote:
> The introduction of AT_HWCAP2 introduced accessors which ensure that
> hwcap features are set and tested appropriately.
> 
> Let's now mandate access to elf_hwcap via these accessors by making
> elf_hwcap static within cpufeature.c.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>

Reviewed-by: Dave Martin <Dave.Martin@arm.com>

> ---
>  arch/arm64/include/asm/cpufeature.h | 15 ++++---------
>  arch/arm64/include/asm/hwcap.h      |  7 +++---
>  arch/arm64/kernel/cpufeature.c      | 33 +++++++++++++++++++++++++++--
>  3 files changed, 38 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 347c17046668..a3f028f82def 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -392,19 +392,12 @@ extern DECLARE_BITMAP(boot_capabilities, ARM64_NPATCHABLE);
>  	for_each_set_bit(cap, cpu_hwcaps, ARM64_NCAPS)
>  
>  bool this_cpu_has_cap(unsigned int cap);
> +void cpu_set_feature(unsigned int num);
> +bool cpu_have_feature(unsigned int num);
> +unsigned long cpu_get_elf_hwcap(void);
> +unsigned long cpu_get_elf_hwcap2(void);
>  
> -static inline void cpu_set_feature(unsigned int num)
> -{
> -	WARN_ON(num >= MAX_CPU_FEATURES);
> -	elf_hwcap |= BIT(num);
> -}
>  #define cpu_set_named_feature(name) cpu_set_feature(cpu_feature(name))
> -
> -static inline bool cpu_have_feature(unsigned int num)
> -{
> -	WARN_ON(num >= MAX_CPU_FEATURES);
> -	return elf_hwcap & BIT(num);
> -}
>  #define cpu_have_named_feature(name) cpu_have_feature(cpu_feature(name))
>  
>  /* System capability check for constant caps */
> diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
> index 1f38a2740f7a..509ee0d2fa0f 100644
> --- a/arch/arm64/include/asm/hwcap.h
> +++ b/arch/arm64/include/asm/hwcap.h
> @@ -17,6 +17,7 @@
>  #define __ASM_HWCAP_H
>  
>  #include <uapi/asm/hwcap.h>
> +#include <asm/cpufeature.h>
>  
>  #define COMPAT_HWCAP_HALF	(1 << 1)
>  #define COMPAT_HWCAP_THUMB	(1 << 2)
> @@ -86,14 +87,13 @@
>  #define __khwcap2_feature(x)		(ilog2(HWCAP2_ ## x) + 32)
>  
>  #ifndef __ASSEMBLY__
> -#include <linux/kernel.h>
>  
>  /*
>   * This yields a mask that user programs can use to figure out what
>   * instruction set this cpu supports.
>   */
> -#define ELF_HWCAP		lower_32_bits(elf_hwcap)
> -#define ELF_HWCAP2		upper_32_bits(elf_hwcap)
> +#define ELF_HWCAP		cpu_get_elf_hwcap()
> +#define ELF_HWCAP2		cpu_get_elf_hwcap2()
>  
>  #ifdef CONFIG_COMPAT
>  #define COMPAT_ELF_HWCAP	(compat_elf_hwcap)
> @@ -109,6 +109,5 @@ enum {
>  #endif
>  };
>  
> -extern unsigned long elf_hwcap;
>  #endif
>  #endif
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 986ceeacd19f..a655d1bb1186 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -35,8 +35,8 @@
>  #include <asm/traps.h>
>  #include <asm/virt.h>
>  
> -unsigned long elf_hwcap __read_mostly;
> -EXPORT_SYMBOL_GPL(elf_hwcap);
> +/* Kernel representation of AT_HWCAP and AT_HWCAP2 */
> +static unsigned long elf_hwcap __read_mostly;
>  
>  #ifdef CONFIG_COMPAT
>  #define COMPAT_ELF_HWCAP_DEFAULT	\
> @@ -1947,6 +1947,35 @@ bool this_cpu_has_cap(unsigned int n)
>  	return false;
>  }
>  
> +void cpu_set_feature(unsigned int num)
> +{
> +	WARN_ON(num >= MAX_CPU_FEATURES);
> +	elf_hwcap |= BIT(num);
> +}
> +EXPORT_SYMBOL_GPL(cpu_set_feature);
> +
> +bool cpu_have_feature(unsigned int num)
> +{
> +	WARN_ON(num >= MAX_CPU_FEATURES);
> +	return elf_hwcap & BIT(num);
> +}
> +EXPORT_SYMBOL_GPL(cpu_have_feature);
> +
> +unsigned long cpu_get_elf_hwcap(void)
> +{
> +	/*
> +	 * We currently only populate the first 32 bits of AT_HWCAP. Please
> +	 * note that for userspace compatibility we guarantee that bits 62
> +	 * and 63 will always be returned as 0.
> +	 */
> +	return lower_32_bits(elf_hwcap);
> +}
> +
> +unsigned long cpu_get_elf_hwcap2(void)
> +{
> +	return upper_32_bits(elf_hwcap);
> +}
> +
>  static void __init setup_system_capabilities(void)
>  {
>  	/*
> -- 
> 2.21.0
> 
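
The encapsulation this patch introduces can be illustrated with a small userspace sketch. It is not the kernel code: the demo_ names are hypothetical, the 64-bit store stands in for the now file-local elf_hwcap, and out-of-range ordinals are silently ignored here, whereas the kernel helpers WARN_ON and carry on.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_CPU_FEATURES 64	/* hypothetical stand-in for MAX_CPU_FEATURES */

/* File-local storage: callers can only reach it through the accessors. */
static uint64_t demo_elf_hwcap;

/* Set one capability by ordinal (0..63). Out-of-range values are ignored. */
static inline void demo_cpu_set_feature(unsigned int num)
{
	if (num < DEMO_MAX_CPU_FEATURES)
		demo_elf_hwcap |= UINT64_C(1) << num;
}

/* Test one capability by ordinal. Out-of-range values report false. */
static inline bool demo_cpu_have_feature(unsigned int num)
{
	if (num >= DEMO_MAX_CPU_FEATURES)
		return false;
	return (demo_elf_hwcap >> num) & 1;
}

/* Lower 32 bits: what userspace would see as AT_HWCAP. */
static inline uint32_t demo_get_elf_hwcap(void)
{
	return (uint32_t)demo_elf_hwcap;
}

/* Upper 32 bits: what userspace would see as AT_HWCAP2. */
static inline uint32_t demo_get_elf_hwcap2(void)
{
	return (uint32_t)(demo_elf_hwcap >> 32);
}
```

Setting ordinal 32 in this sketch shows up as bit 0 of the AT_HWCAP2 value, which is the split the ELF_HWCAP and ELF_HWCAP2 macros rely on after this change.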


* Re: [PATCH v4 1/6] arm64: HWCAP: add support for AT_HWCAP2
From: Dave Martin @ 2019-04-03 13:21 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Mark Rutland, libc-alpha, Suzuki K Poulose, Szabolcs Nagy,
	Catalin Marinas, Will Deacon, Phil Blundell, linux-api,
	linux-arm-kernel
In-Reply-To: <20190403105628.39798-2-andrew.murray@arm.com>

On Wed, Apr 03, 2019 at 11:56:23AM +0100, Andrew Murray wrote:
> As we will exhaust the first 32 bits of AT_HWCAP let's start
> exposing AT_HWCAP2 to userspace to give us up to 64 caps.
> 
> Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we
> prefer to expand into AT_HWCAP2 in order to provide a consistent
> view to userspace between ILP32 and LP64. However internal to the
> kernel we prefer to continue to use the full space of elf_hwcap.
> 
> To reduce complexity and allow for future expansion, we now
> represent hwcaps in the kernel as ordinals and use a
> KERNEL_HWCAP_ prefix. This allows us to support automatic feature
> based module loading for all our hwcaps.
> 
> We introduce cpu_set_feature to set hwcaps which complements the
> existing cpu_have_feature helper. These helpers allow us to clean
> up existing direct uses of elf_hwcap and reduce any future effort
> required to move beyond 64 caps.
> 
> For convenience we also introduce cpu_{have,set}_named_feature which
> makes use of the cpu_feature macro to allow providing a hwcap name
> without a {KERNEL_}HWCAP_ prefix.
> 
> Signed-off-by: Andrew Murray <andrew.murray@arm.com>

[...]

> diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
> index 400b80b49595..1f38a2740f7a 100644
> --- a/arch/arm64/include/asm/hwcap.h
> +++ b/arch/arm64/include/asm/hwcap.h
> @@ -39,12 +39,61 @@
>  #define COMPAT_HWCAP2_SHA2	(1 << 3)
>  #define COMPAT_HWCAP2_CRC32	(1 << 4)
>  
> +/*
> + * For userspace we represent hwcaps as a collection of HWCAP{,2}_x bitfields
> + * as described in uapi/asm/hwcap.h. For the kernel we represent hwcaps as
> + * natural numbers (in a single range of size MAX_CPU_FEATURES) defined here
> + * with prefix KERNEL_HWCAP_ mapped to their HWCAP{,2}_x counterpart.
> + *
> + * Hwcaps should be set and tested within the kernel via the
> + * cpu_{set,have}_named_feature(feature) where feature is the unique suffix
> + * of KERNEL_HWCAP_{feature}.
> + */
> +#define __khwcap_feature(x)		ilog2(HWCAP_ ## x)

Hmm, I didn't spot this before, but we should probably include
<linux/log2.h>.  This isn't asm-friendly however.

<asm/hwcap.h> gets included (unnecessarily?) by arch/arm64/mm/proc.S and
arch/arm64/include/uapi/asm/ptrace.h.

Rather than risk breaking a UAPI header, can we remove the ilog2() here
and add it back into cpu_feature() where it was originally?

There may be a reason why this didn't work that I've forgotten...

cpufeatures is the only place where we use the KERNEL_HWCAP_foo flags
directly.

> +#define KERNEL_HWCAP_FP			__khwcap_feature(FP)
> +#define KERNEL_HWCAP_ASIMD		__khwcap_feature(ASIMD)
> +#define KERNEL_HWCAP_EVTSTRM		__khwcap_feature(EVTSTRM)

[...]

Otherwise, looks OK to me.

Cheers
---Dave
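
For readers following the ordinal scheme discussed above, here is a minimal userspace sketch of how a HWCAP{,2}_x bit mask maps to a kernel-internal ordinal. The DEMO_ names are hypothetical, DEMO_ILOG2 is a stand-in for the kernel's ilog2() built on a GCC/Clang builtin, and only two sample flags are shown, with the values used in the quoted UAPI header.

```c
#include <stdint.h>

/* Stand-in for ilog2(): index of the highest set bit. v must be non-zero. */
#define DEMO_ILOG2(v)	(63 - __builtin_clzll((unsigned long long)(v)))

/* Two sample userspace-visible bit masks. */
#define DEMO_HWCAP_PACG		(1UL << 31)
#define DEMO_HWCAP2_DCPODP	(1UL << 0)

/*
 * Kernel-internal ordinals: AT_HWCAP flags occupy 0..31 and
 * AT_HWCAP2 flags occupy 32..63 of a single range.
 */
#define DEMO_KHWCAP_FEATURE(x)	DEMO_ILOG2(DEMO_HWCAP_ ## x)
#define DEMO_KHWCAP2_FEATURE(x)	(DEMO_ILOG2(DEMO_HWCAP2_ ## x) + 32)

/* Convert an ordinal back into its bit in a combined 64-bit word. */
static inline uint64_t demo_khwcap_mask(unsigned int ordinal)
{
	return UINT64_C(1) << ordinal;
}
```

Because the ordinal is computed from the mask by a C macro, any header that uses it also needs a log2 helper in scope, which is the asm-friendliness concern raised in the review.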


* Re: [RESEND PATCH v1] moduleparam: Save information about built-in modules in separate file
From: Masahiro Yamada @ 2019-04-03 11:30 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: Jessica Yu, Michal Marek, Linux Kernel Mailing List,
	Linux Kbuild mailing list, linux-api, Kirill A . Shutemov,
	Gleb Fotengauer-Malinovskiy, Dmitry V. Levin, Dmitry Torokhov,
	Rusty Russell, Lucas De Marchi
In-Reply-To: <20190327160440.GE15936@Legion-PC.fortress>

On Thu, Mar 28, 2019 at 1:04 AM Alexey Gladkov <gladkov.alexey@gmail.com> wrote:
>
> On Wed, Mar 27, 2019 at 04:40:25PM +0100, Jessica Yu wrote:
> > +++ Alexey Gladkov [26/03/19 18:24 +0100]:
> > >On Fri, Mar 22, 2019 at 02:34:12PM +0900, Masahiro Yamada wrote:
> > >> Hi.
> > >>
> > >> (added some people to CC)
> >
> > (Thanks Masahiro for the CC!)
> >
> > >>
> > >> On Fri, Mar 15, 2019 at 7:10 PM Alexey Gladkov <gladkov.alexey@gmail.com> wrote:
> > >> >
> > >> > Problem:
> > >> >
> > >> > When a kernel module is compiled as a separate module, some important
> > >> > information about the kernel module is available via .modinfo section of
> > >> > the module.  In contrast, when the kernel module is compiled into the
> > >> > kernel, that information is not available.
> > >>
> > >>
> > >> I might be missing something, but
> > >> vmlinux provides info of builtin modules
> > >> in /sys/module/.
> > >
> > >No. Definitely not all modules are there. I have a builtin sha256_generic,
> > >but I can't find it in /sys/module.
> >
> > Yeah, you'll only find builtin modules under /sys/module/ if it has any module
> > parameters, otherwise you won't find it there. As Masahiro already mentioned,
> > if a builtin module has any parameters, they would be accessible under /sys/module/.
> >
> > >> (Looks like currently only module_param and MODULE_VERSION)
> > >>
> > >> This patch is not exactly the same, but I see a kind of overlap.
> > >> I'd like to be sure if we want this new scheme.
> > >
> > >/sys/module only describes the running kernel. One of my use cases is
> > >to create an initrd for a new kernel.
> > >
> > >>
> > >> > Information about built-in modules is necessary in the following cases:
> > >> >
> > >> > 1. When it is necessary to find out what additional parameters can be
> > >> > passed to the kernel at boot time.
> > >>
> > >>
> > >> Actually, /sys/module/<module>/parameters/
> > >> exposes this information.
> > >>
> > >> Doesn't it work for your purpose?
> > >
> > >No, since creating an initrd requires knowing all the modaliases before
> > >the new kernel's sysfs is available. Also, sysfs has no modaliases at all.
> > >
> > >> > 2. When you need to know which module names and their aliases are in
> > >> > the kernel. This is very useful for creating an initrd image.
> > >> >
> >
> > Hm, I do see one possible additional use-case for preserving module alias
> > information for built-in modules - modprobe will currently error (I think,
> > correct me if I'm wrong) if we try invoking modprobe with an alias of a
> > built-in module, simply because this information is not in modules.builtin or
> > modules.alias.
>
> Yes. Patch for modprobe in my todo list. The reason I didn’t do it was
> because I wasn’t sure that the file format was final.
>
> > Since kbuild already outputs modules.builtin, I would suggest outputting
> > something like a modules.builtin.alias file (and I guess maybe a modules.builtin.param
> > file too if that's deemed useful), in a format that is consumable by kmod/modprobe,
> > so that modprobing an alias of a built-in module doesn't produce an error. I
> > think this should be easy to do if we keep and parse the resulting .modinfo for
> > builtin modules. This is just an idea, opinions welcome. I've added Lucas to CC
> > in case he has any thoughts.
>
> You don't like kernel.builtin.modinfo ?


Naming is often the most difficult thing. :)

IMHO, 'kernel' and 'builtin' have a similar meaning here.
Isn't 'kernel.builtin' unnecessarily long?


Perhaps, another idea is:

'builtin.alias' instead of 'modules.builtin.alias'
if we want a separate file in the same format.


In hindsight, 'modules.builtin' should have been
'builtin.order', I think.




> It is much easier to create and it has almost the same format as the
> modules. So I think it will be easier to parse in kmod.

-- 
Best Regards
Masahiro Yamada


* [PATCH v4 6/6] arm64: Advertise ARM64_HAS_DCPODP cpu feature
From: Andrew Murray @ 2019-04-03 10:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Szabolcs Nagy, dave.martin, linux-arm-kernel, Mark Rutland,
	Phil Blundell, libc-alpha, linux-api, Suzuki K Poulose
In-Reply-To: <20190403105628.39798-1-andrew.murray@arm.com>

Advertise ARM64_HAS_DCPODP when both DC CVAP and DC CVADP are supported.

Even though we don't use this feature now, we provide it for consistency
with DCPOP and anticipate it being used in the future.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm64/include/asm/cpucaps.h | 3 ++-
 arch/arm64/kernel/cpufeature.c   | 9 +++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index f6a76e43f39e..defdc67d9ab4 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -61,7 +61,8 @@
 #define ARM64_HAS_GENERIC_AUTH_ARCH		40
 #define ARM64_HAS_GENERIC_AUTH_IMP_DEF		41
 #define ARM64_HAS_IRQ_PRIO_MASKING		42
+#define ARM64_HAS_DCPODP			43
 
-#define ARM64_NCAPS				43
+#define ARM64_NCAPS				44
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index f8b682a3a9f4..4ee5d63281ae 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1340,6 +1340,15 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.field_pos = ID_AA64ISAR1_DPB_SHIFT,
 		.min_field_value = 1,
 	},
+	{
+		.desc = "Data cache clean to Point of Deep Persistence",
+		.capability = ARM64_HAS_DCPODP,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64ISAR1_EL1,
+		.field_pos = ID_AA64ISAR1_DPB_SHIFT,
+		.min_field_value = 2,
+	},
 #endif
 #ifdef CONFIG_ARM64_SVE
 	{
-- 
2.21.0
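
The new capability entry differs from the DCPOP one only in min_field_value. A minimal sketch of that comparison follows, assuming the matcher extracts the unsigned 4-bit DPB field (bits [3:0] of ID_AA64ISAR1_EL1) and checks that it is at least the minimum. The demo_ names are hypothetical and this is not the kernel's has_cpuid_feature().

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_DPB_SHIFT	0	/* DPB occupies bits [3:0] of ID_AA64ISAR1_EL1 */

/* Extract an unsigned 4-bit ID register field starting at 'shift'. */
static inline unsigned int demo_id_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/* A feature matches when its field value is at least 'min'. */
static inline bool demo_feature_matches(uint64_t reg, unsigned int shift,
					unsigned int min)
{
	return demo_id_field(reg, shift) >= min;
}

/* DC CVAP needs DPB >= 1. */
static inline bool demo_has_dcpop(uint64_t isar1)
{
	return demo_feature_matches(isar1, DEMO_DPB_SHIFT, 1);
}

/* DC CVADP needs DPB >= 2, which also implies DC CVAP. */
static inline bool demo_has_dcpodp(uint64_t isar1)
{
	return demo_feature_matches(isar1, DEMO_DPB_SHIFT, 2);
}
```

Under this reading, a DPB value of 2 satisfies both entries, which matches the commit message's "both DC CVAP and DC CVADP are supported".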


* [PATCH v4 5/6] arm64: add CVADP support to the cache maintenance helper
From: Andrew Murray @ 2019-04-03 10:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Szabolcs Nagy, dave.martin, linux-arm-kernel, Mark Rutland,
	Phil Blundell, libc-alpha, linux-api, Suzuki K Poulose
In-Reply-To: <20190403105628.39798-1-andrew.murray@arm.com>

Allow users of dcache_by_line_op to specify cvadp as an op.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 arch/arm64/include/asm/assembler.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index c5308d01e228..d50caf0e6b64 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -407,10 +407,14 @@ alternative_endif
 	.ifc	\op, cvap
 	sys	3, c7, c12, 1, \kaddr	// dc cvap
 	.else
+	.ifc	\op, cvadp
+	sys	3, c7, c13, 1, \kaddr	// dc cvadp
+	.else
 	dc	\op, \kaddr
 	.endif
 	.endif
 	.endif
+	.endif
 	add	\kaddr, \kaddr, \tmp1
 	cmp	\kaddr, \size
 	b.lo	9998b
-- 
2.21.0
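
The generic 'sys' form used above yields the same instruction word an assembler would emit for the 'dc' alias. A minimal sketch, assuming the standard A64 SYS encoding (base word 0xD5080000 with op1, CRn, CRm, op2 and Rt packed into the low 19 bits); demo_a64_sys_insn is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/*
 * Encode "SYS #op1, Cn, Cm, #op2, Xt".
 * Field positions: op1 [18:16], CRn [15:12], CRm [11:8], op2 [7:5], Rt [4:0].
 */
static inline uint32_t demo_a64_sys_insn(unsigned int op1, unsigned int crn,
					 unsigned int crm, unsigned int op2,
					 unsigned int rt)
{
	return UINT32_C(0xD5080000) |
	       ((uint32_t)(op1 & 0x7) << 16) |
	       ((uint32_t)(crn & 0xf) << 12) |
	       ((uint32_t)(crm & 0xf) << 8) |
	       ((uint32_t)(op2 & 0x7) << 5) |
	       (uint32_t)(rt & 0x1f);
}
```

With x0 as the address register, "sys 3, c7, c12, 1" (dc cvap) and "sys 3, c7, c13, 1" (dc cvadp) differ only in the CRm field.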


* [PATCH v4 4/6] arm64: Expose DC CVADP to userspace
From: Andrew Murray @ 2019-04-03 10:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Szabolcs Nagy, dave.martin, linux-arm-kernel, Mark Rutland,
	Phil Blundell, libc-alpha, linux-api, Suzuki K Poulose
In-Reply-To: <20190403105628.39798-1-andrew.murray@arm.com>

ARMv8.5 builds upon the ARMv8.2 DC CVAP instruction by introducing a DC
CVADP instruction which cleans the data cache to the point of deep
persistence. Let's expose this support via the arm64 ELF hwcaps.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
 Documentation/arm64/elf_hwcaps.txt  | 4 ++++
 arch/arm64/include/asm/hwcap.h      | 1 +
 arch/arm64/include/uapi/asm/hwcap.h | 5 +++++
 arch/arm64/kernel/cpufeature.c      | 1 +
 arch/arm64/kernel/cpuinfo.c         | 1 +
 5 files changed, 12 insertions(+)

diff --git a/Documentation/arm64/elf_hwcaps.txt b/Documentation/arm64/elf_hwcaps.txt
index c04f8e87bab8..7b591c1dcb53 100644
--- a/Documentation/arm64/elf_hwcaps.txt
+++ b/Documentation/arm64/elf_hwcaps.txt
@@ -135,6 +135,10 @@ HWCAP_DCPOP
 
     Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0001.
 
+HWCAP2_DCPODP
+
+    Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0010.
+
 HWCAP_SHA3
 
     Functionality implied by ID_AA64ISAR0_EL1.SHA3 == 0b0001.
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 509ee0d2fa0f..732bdd7286c8 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -85,6 +85,7 @@
 #define KERNEL_HWCAP_PACG		__khwcap_feature(PACG)
 
 #define __khwcap2_feature(x)		(ilog2(HWCAP2_ ## x) + 32)
+#define KERNEL_HWCAP_DCPODP		__khwcap2_feature(DCPODP)
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 453b45af80b7..d64af3913a9e 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -53,4 +53,9 @@
 #define HWCAP_PACA		(1 << 30)
 #define HWCAP_PACG		(1UL << 31)
 
+/*
+ * HWCAP2 flags - for AT_HWCAP2
+ */
+#define HWCAP2_DCPODP		(1 << 0)
+
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a655d1bb1186..f8b682a3a9f4 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1591,6 +1591,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDHP),
 	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_DIT_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DIT),
 	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DCPOP),
+	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_DCPODP),
 	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_JSCVT_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_JSCVT),
 	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FCMA_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FCMA),
 	HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_LRCPC),
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 810db95f293f..093ca53ce1d1 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -85,6 +85,7 @@ static const char *const hwcap_str[] = {
 	"sb",
 	"paca",
 	"pacg",
+	"dcpodp",
 	NULL
 };
 
-- 
2.21.0
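
From userspace, the new flag would be tested through the auxiliary vector. A minimal sketch, assuming a glibc that provides getauxval() and AT_HWCAP2; the fallback define uses the value proposed in this patch, and the two helper names are hypothetical.

```c
#include <stdbool.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP2 */

/* Value proposed by this patch; defined here only if the headers lack it. */
#ifndef HWCAP2_DCPODP
#define HWCAP2_DCPODP	(1UL << 0)
#endif

/* Pure helper: test the DCPODP bit in an AT_HWCAP2 word. */
static inline bool demo_hwcap2_has_dcpodp(unsigned long hwcap2)
{
	return (hwcap2 & HWCAP2_DCPODP) != 0;
}

/*
 * Query the running process. The result is only meaningful on arm64;
 * on other architectures bit 0 of AT_HWCAP2 means something else.
 */
static inline bool demo_cpu_supports_dc_cvadp(void)
{
	return demo_hwcap2_has_dcpodp(getauxval(AT_HWCAP2));
}
```

A caller would check demo_cpu_supports_dc_cvadp() once at startup and fall back to DC CVAP (or another flush strategy) when it returns false.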


* [PATCH v4 3/6] arm64: Handle trapped DC CVADP
From: Andrew Murray @ 2019-04-03 10:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Szabolcs Nagy, dave.martin, linux-arm-kernel, Mark Rutland,
	Phil Blundell, libc-alpha, linux-api, Suzuki K Poulose
In-Reply-To: <20190403105628.39798-1-andrew.murray@arm.com>

The ARMv8.5 DC CVADP instruction may be trapped to EL1 via
SCTLR_EL1.UCI therefore let's provide a handler for it.

Just like the CVAP instruction we use a 'sys' instruction instead of
the 'dc' alias to avoid build issues with older toolchains.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/include/asm/esr.h | 3 ++-
 arch/arm64/kernel/traps.c    | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 52233f00d53d..07d5c026a0b3 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -198,9 +198,10 @@
 /*
  * User space cache operations have the following sysreg encoding
  * in System instructions.
- * op0=1, op1=3, op2=1, crn=7, crm={ 5, 10, 11, 12, 14 }, WRITE (L=0)
+ * op0=1, op1=3, op2=1, crn=7, crm={ 5, 10, 11, 12, 13, 14 }, WRITE (L=0)
  */
 #define ESR_ELx_SYS64_ISS_CRM_DC_CIVAC	14
+#define ESR_ELx_SYS64_ISS_CRM_DC_CVADP	13
 #define ESR_ELx_SYS64_ISS_CRM_DC_CVAP	12
 #define ESR_ELx_SYS64_ISS_CRM_DC_CVAU	11
 #define ESR_ELx_SYS64_ISS_CRM_DC_CVAC	10
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 8ad119c3f665..f66e1ddbe4a7 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -459,6 +459,9 @@ static void user_cache_maint_handler(unsigned int esr, struct pt_regs *regs)
 	case ESR_ELx_SYS64_ISS_CRM_DC_CVAC:	/* DC CVAC, gets promoted */
 		__user_cache_maint("dc civac", address, ret);
 		break;
+	case ESR_ELx_SYS64_ISS_CRM_DC_CVADP:	/* DC CVADP */
+		__user_cache_maint("sys 3, c7, c13, 1", address, ret);
+		break;
 	case ESR_ELx_SYS64_ISS_CRM_DC_CVAP:	/* DC CVAP */
 		__user_cache_maint("sys 3, c7, c12, 1", address, ret);
 		break;
-- 
2.21.0
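
The handler dispatches on the CRm field of the trapped instruction's syndrome. A minimal sketch of that decode, assuming the architectural ISS layout for trapped system instructions (CRm in bits [4:1]); the demo_ names are hypothetical and the table returns the name of the trapped operation, not what the kernel executes on its behalf.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ISS_CRM_SHIFT	1
#define DEMO_ISS_CRM_MASK	(UINT32_C(0xf) << DEMO_ISS_CRM_SHIFT)

/* Extract CRm from the syndrome of a trapped system instruction. */
static inline unsigned int demo_iss_crm(uint32_t esr)
{
	return (esr & DEMO_ISS_CRM_MASK) >> DEMO_ISS_CRM_SHIFT;
}

/* Map CRm to the user cache operation, using the CRm values from esr.h. */
static inline const char *demo_user_cache_op(uint32_t esr)
{
	switch (demo_iss_crm(esr)) {
	case 14: return "dc civac";
	case 13: return "dc cvadp";	/* added by this patch */
	case 12: return "dc cvap";
	case 11: return "dc cvau";
	case 10: return "dc cvac";
	default: return NULL;		/* not handled in this sketch */
	}
}
```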


