* Re: [PATCH 6/9] Add a general, global device notification watch list [ver #5]
From: Greg Kroah-Hartman @ 2019-07-05 8:44 UTC
To: David Howells
Cc: viro, Casey Schaufler, Stephen Smalley, nicolas.dichtel, raven,
Christian Brauner, keyrings, linux-usb, linux-security-module,
linux-fsdevel, linux-api, linux-block, linux-kernel
In-Reply-To: <12946.1562313857@warthog.procyon.org.uk>
On Fri, Jul 05, 2019 at 09:04:17AM +0100, David Howells wrote:
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
>
> > Hm, good point, but there should be some way to test this to verify it
> > works. Maybe for the other types of events?
>
> Keyrings is the simplest. keyutils's testsuite will handle that. I'm trying
> to work out if I can simply make every macro in there that does a modification
> perform a watch automatically to make sure the appropriate events happen.
That should be good enough to test the basic functionality. After this
gets merged I'll see if I can come up with a way to test the USB
stuff...
thanks,
greg k-h
* Re: [PATCH 6/9] Add a general, global device notification watch list [ver #5]
From: David Howells @ 2019-07-05 8:04 UTC
To: Greg Kroah-Hartman
Cc: dhowells, viro, Casey Schaufler, Stephen Smalley, nicolas.dichtel,
raven, Christian Brauner, keyrings, linux-usb,
linux-security-module, linux-fsdevel, linux-api, linux-block,
linux-kernel
In-Reply-To: <20190705051733.GA15821@kroah.com>
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> Hm, good point, but there should be some way to test this to verify it
> works. Maybe for the other types of events?
Keyrings is the simplest. keyutils's testsuite will handle that. I'm trying
to work out if I can simply make every macro in there that does a modification
perform a watch automatically to make sure the appropriate events happen.
David
* Re: [PATCH 6/9] Add a general, global device notification watch list [ver #5]
From: Greg Kroah-Hartman @ 2019-07-05 5:17 UTC
To: David Howells
Cc: viro, Casey Schaufler, Stephen Smalley, nicolas.dichtel, raven,
Christian Brauner, keyrings, linux-usb, linux-security-module,
linux-fsdevel, linux-api, linux-block, linux-kernel
In-Reply-To: <10295.1562256260@warthog.procyon.org.uk>
On Thu, Jul 04, 2019 at 05:04:20PM +0100, David Howells wrote:
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
>
> > Don't we need a manpage and a kselftest for it?
>
> I've got part of a manpage, but it needs more work.
>
> How do you do a kselftest for this when it does nothing unless hardware events
> happen?
Hm, good point, but there should be some way to test this to verify it
works. Maybe for the other types of events?
thanks,
greg k-h
* [PATCH v2 11/11] fpga: dfl: fme: add global error reporting support
From: Wu Hao @ 2019-07-05 0:23 UTC
To: gregkh, mdf, linux-fpga
Cc: linux-kernel, linux-api, atull, Wu Hao, Luwei Kang, Ananda Ravuri,
Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
This patch adds global error reporting support for the FPGA
Management Engine (FME). It introduces sysfs interfaces that
report the different errors detected by the hardware and allow
the user to clear errors or inject errors for testing purposes.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: switch to device_add/remove_groups for sysfs.
---
Documentation/ABI/testing/sysfs-platform-dfl-fme | 75 +++++
drivers/fpga/Makefile | 2 +-
drivers/fpga/dfl-fme-error.c | 385 +++++++++++++++++++++++
drivers/fpga/dfl-fme-main.c | 4 +
drivers/fpga/dfl-fme.h | 2 +
drivers/fpga/dfl.h | 2 +
6 files changed, 469 insertions(+), 1 deletion(-)
create mode 100644 drivers/fpga/dfl-fme-error.c
diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
index 99cd3b2..86eef83 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
@@ -44,3 +44,78 @@ Description: Read-only. It returns socket_id to indicate which socket
this FPGA belongs to, only valid for integrated solution.
User only needs this information, in case standard numa node
can't provide correct information.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/revision
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get the revision of this global
+ error reporting private feature.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/pcie0_errors
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-Write. Read this file for errors detected on pcie0 link.
+ Write this file to clear errors logged in pcie0_errors. Write
+ fails with -EINVAL if input parsing fails or input error code
+ doesn't match.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/pcie1_errors
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-Write. Read this file for errors detected on pcie1 link.
+ Write this file to clear errors logged in pcie1_errors. Write
+ fails with -EINVAL if input parsing fails or input error code
+ doesn't match.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/nonfatal_errors
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. It returns non-fatal errors detected.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/catfatal_errors
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. It returns catastrophic and fatal errors detected.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/inject_error
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-Write. Read this file to check injected errors. Write this
+ file to inject errors for testing purposes. Write fails with
+ -EINVAL if input parsing fails or the input inject error code
+ isn't supported.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/fme-errors/errors
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get errors detected by hardware.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/fme-errors/first_error
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get the first error detected by
+ hardware.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/fme-errors/next_error
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get the second error detected by
+ hardware.
+
+What: /sys/bus/platform/devices/dfl-fme.0/errors/fme-errors/clear
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Write-only. Write error code to this file to clear all errors
+ logged in errors, first_error and next_error. Write fails with
+ -EINVAL if input parsing fails or input error code doesn't
+ match.
diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
index 7255891..4865b74 100644
--- a/drivers/fpga/Makefile
+++ b/drivers/fpga/Makefile
@@ -39,7 +39,7 @@ obj-$(CONFIG_FPGA_DFL_FME_BRIDGE) += dfl-fme-br.o
obj-$(CONFIG_FPGA_DFL_FME_REGION) += dfl-fme-region.o
obj-$(CONFIG_FPGA_DFL_AFU) += dfl-afu.o
-dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o
+dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o dfl-fme-error.o
dfl-afu-objs := dfl-afu-main.o dfl-afu-region.o dfl-afu-dma-region.o
dfl-afu-objs += dfl-afu-error.o
diff --git a/drivers/fpga/dfl-fme-error.c b/drivers/fpga/dfl-fme-error.c
new file mode 100644
index 0000000..6b5605d
--- /dev/null
+++ b/drivers/fpga/dfl-fme-error.c
@@ -0,0 +1,385 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for FPGA Management Engine Error Management
+ *
+ * Copyright 2019 Intel Corporation, Inc.
+ *
+ * Authors:
+ * Kang Luwei <luwei.kang@intel.com>
+ * Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ * Wu Hao <hao.wu@intel.com>
+ * Joseph Grecco <joe.grecco@intel.com>
+ * Enno Luebbers <enno.luebbers@intel.com>
+ * Tim Whisonant <tim.whisonant@intel.com>
+ * Ananda Ravuri <ananda.ravuri@intel.com>
+ * Mitchel, Henry <henry.mitchel@intel.com>
+ */
+
+#include <linux/uaccess.h>
+
+#include "dfl.h"
+#include "dfl-fme.h"
+
+#define FME_ERROR_MASK 0x8
+#define FME_ERROR 0x10
+#define MBP_ERROR BIT_ULL(6)
+#define PCIE0_ERROR_MASK 0x18
+#define PCIE0_ERROR 0x20
+#define PCIE1_ERROR_MASK 0x28
+#define PCIE1_ERROR 0x30
+#define FME_FIRST_ERROR 0x38
+#define FME_NEXT_ERROR 0x40
+#define RAS_NONFAT_ERROR_MASK 0x48
+#define RAS_NONFAT_ERROR 0x50
+#define RAS_CATFAT_ERROR_MASK 0x58
+#define RAS_CATFAT_ERROR 0x60
+#define RAS_ERROR_INJECT 0x68
+#define INJECT_ERROR_MASK GENMASK_ULL(2, 0)
+
+static ssize_t revision_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ return sprintf(buf, "%u\n", dfl_feature_revision(base));
+}
+static DEVICE_ATTR_RO(revision);
+
+static ssize_t pcie0_errors_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ return sprintf(buf, "0x%llx\n",
+ (unsigned long long)readq(base + PCIE0_ERROR));
+}
+
+static ssize_t pcie0_errors_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+ int ret = 0;
+ u64 v, val;
+
+ if (kstrtou64(buf, 0, &val))
+ return -EINVAL;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ mutex_lock(&pdata->lock);
+ writeq(GENMASK_ULL(63, 0), base + PCIE0_ERROR_MASK);
+
+ v = readq(base + PCIE0_ERROR);
+ if (val == v)
+ writeq(v, base + PCIE0_ERROR);
+ else
+ ret = -EINVAL;
+
+ writeq(0ULL, base + PCIE0_ERROR_MASK);
+ mutex_unlock(&pdata->lock);
+ return ret ? ret : count;
+}
+static DEVICE_ATTR_RW(pcie0_errors);
+
+static ssize_t pcie1_errors_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ return sprintf(buf, "0x%llx\n",
+ (unsigned long long)readq(base + PCIE1_ERROR));
+}
+
+static ssize_t pcie1_errors_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+ int ret = 0;
+ u64 v, val;
+
+ if (kstrtou64(buf, 0, &val))
+ return -EINVAL;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ mutex_lock(&pdata->lock);
+ writeq(GENMASK_ULL(63, 0), base + PCIE1_ERROR_MASK);
+
+ v = readq(base + PCIE1_ERROR);
+ if (val == v)
+ writeq(v, base + PCIE1_ERROR);
+ else
+ ret = -EINVAL;
+
+ writeq(0ULL, base + PCIE1_ERROR_MASK);
+ mutex_unlock(&pdata->lock);
+ return ret ? ret : count;
+}
+static DEVICE_ATTR_RW(pcie1_errors);
+
+static ssize_t nonfatal_errors_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ return sprintf(buf, "0x%llx\n",
+ (unsigned long long)readq(base + RAS_NONFAT_ERROR));
+}
+static DEVICE_ATTR_RO(nonfatal_errors);
+
+static ssize_t catfatal_errors_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ return sprintf(buf, "0x%llx\n",
+ (unsigned long long)readq(base + RAS_CATFAT_ERROR));
+}
+static DEVICE_ATTR_RO(catfatal_errors);
+
+static ssize_t inject_error_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+ u64 v;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ v = readq(base + RAS_ERROR_INJECT);
+
+ return sprintf(buf, "0x%llx\n",
+ (unsigned long long)FIELD_GET(INJECT_ERROR_MASK, v));
+}
+
+static ssize_t inject_error_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+ u8 inject_error;
+ u64 v;
+
+ if (kstrtou8(buf, 0, &inject_error))
+ return -EINVAL;
+
+ if (inject_error & ~INJECT_ERROR_MASK)
+ return -EINVAL;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ mutex_lock(&pdata->lock);
+ v = readq(base + RAS_ERROR_INJECT);
+ v &= ~INJECT_ERROR_MASK;
+ v |= FIELD_PREP(INJECT_ERROR_MASK, inject_error);
+ writeq(v, base + RAS_ERROR_INJECT);
+ mutex_unlock(&pdata->lock);
+
+ return count;
+}
+static DEVICE_ATTR_RW(inject_error);
+
+static struct attribute *errors_attrs[] = {
+ &dev_attr_revision.attr,
+ &dev_attr_pcie0_errors.attr,
+ &dev_attr_pcie1_errors.attr,
+ &dev_attr_nonfatal_errors.attr,
+ &dev_attr_catfatal_errors.attr,
+ &dev_attr_inject_error.attr,
+ NULL,
+};
+
+static struct attribute_group errors_attr_group = {
+ .attrs = errors_attrs,
+};
+
+static ssize_t errors_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ return sprintf(buf, "0x%llx\n",
+ (unsigned long long)readq(base + FME_ERROR));
+}
+static DEVICE_ATTR_RO(errors);
+
+static ssize_t first_error_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ return sprintf(buf, "0x%llx\n",
+ (unsigned long long)readq(base + FME_FIRST_ERROR));
+}
+static DEVICE_ATTR_RO(first_error);
+
+static ssize_t next_error_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ return sprintf(buf, "0x%llx\n",
+ (unsigned long long)readq(base + FME_NEXT_ERROR));
+}
+static DEVICE_ATTR_RO(next_error);
+
+static ssize_t clear_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
+ struct device *err_dev = dev->parent;
+ void __iomem *base;
+ u64 v, val;
+ int ret = 0;
+
+ if (kstrtou64(buf, 0, &val))
+ return -EINVAL;
+
+ base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+ mutex_lock(&pdata->lock);
+ writeq(GENMASK_ULL(63, 0), base + FME_ERROR_MASK);
+
+ v = readq(base + FME_ERROR);
+ if (val == v) {
+ writeq(v, base + FME_ERROR);
+ v = readq(base + FME_FIRST_ERROR);
+ writeq(v, base + FME_FIRST_ERROR);
+ v = readq(base + FME_NEXT_ERROR);
+ writeq(v, base + FME_NEXT_ERROR);
+ } else {
+ ret = -EINVAL;
+ }
+
+ /* Workaround: disable MBP_ERROR if feature revision is 0 */
+ writeq(dfl_feature_revision(base) ? 0ULL : MBP_ERROR,
+ base + FME_ERROR_MASK);
+ mutex_unlock(&pdata->lock);
+ return ret ? ret : count;
+}
+static DEVICE_ATTR_WO(clear);
+
+static struct attribute *fme_errors_attrs[] = {
+ &dev_attr_errors.attr,
+ &dev_attr_first_error.attr,
+ &dev_attr_next_error.attr,
+ &dev_attr_clear.attr,
+ NULL,
+};
+
+static struct attribute_group fme_errors_attr_group = {
+ .attrs = fme_errors_attrs,
+ .name = "fme-errors",
+};
+
+static const struct attribute_group *error_groups[] = {
+ &fme_errors_attr_group,
+ &errors_attr_group,
+ NULL
+};
+
+static void fme_error_enable(struct dfl_feature *feature)
+{
+ void __iomem *base = feature->ioaddr;
+
+ /* Workaround: disable MBP_ERROR if revision is 0 */
+ writeq(dfl_feature_revision(feature->ioaddr) ? 0ULL : MBP_ERROR,
+ base + FME_ERROR_MASK);
+ writeq(0ULL, base + PCIE0_ERROR_MASK);
+ writeq(0ULL, base + PCIE1_ERROR_MASK);
+ writeq(0ULL, base + RAS_NONFAT_ERROR_MASK);
+ writeq(0ULL, base + RAS_CATFAT_ERROR_MASK);
+}
+
+static void err_dev_release(struct device *dev)
+{
+ kfree(dev);
+}
+
+static int fme_global_err_init(struct platform_device *pdev,
+ struct dfl_feature *feature)
+{
+ struct device *dev;
+ int ret = 0;
+
+ dev_dbg(&pdev->dev, "FME Global Error Reporting Init.\n");
+
+ dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+ if (!dev)
+ return -ENOMEM;
+
+ dev->parent = &pdev->dev;
+ dev->release = err_dev_release;
+ dev_set_name(dev, "errors");
+
+ fme_error_enable(feature);
+
+ ret = device_register(dev);
+ if (ret) {
+ put_device(dev);
+ return ret;
+ }
+
+ ret = device_add_groups(dev, error_groups);
+ if (ret) {
+ device_unregister(dev);
+ return ret;
+ }
+
+ feature->priv = dev;
+
+ return ret;
+}
+
+static void fme_global_err_uinit(struct platform_device *pdev,
+ struct dfl_feature *feature)
+{
+ struct device *dev = feature->priv;
+
+ dev_dbg(&pdev->dev, "FME Global Error Reporting UInit.\n");
+
+ device_remove_groups(dev, error_groups);
+ device_unregister(dev);
+}
+
+const struct dfl_feature_id fme_global_err_id_table[] = {
+ {.id = FME_FEATURE_ID_GLOBAL_ERR,},
+ {0,}
+};
+
+const struct dfl_feature_ops fme_global_err_ops = {
+ .init = fme_global_err_init,
+ .uinit = fme_global_err_uinit,
+};
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index c8703c4..a09f75b 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -202,6 +202,10 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
.ops = &fme_pr_mgmt_ops,
},
{
+ .id_table = fme_global_err_id_table,
+ .ops = &fme_global_err_ops,
+ },
+ {
.ops = NULL,
},
};
diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
index 7a021c4..5fbe3f5 100644
--- a/drivers/fpga/dfl-fme.h
+++ b/drivers/fpga/dfl-fme.h
@@ -37,5 +37,7 @@ struct dfl_fme {
extern const struct dfl_feature_ops fme_pr_mgmt_ops;
extern const struct dfl_feature_id fme_pr_mgmt_id_table[];
+extern const struct dfl_feature_ops fme_global_err_ops;
+extern const struct dfl_feature_id fme_global_err_id_table[];
#endif /* __DFL_FME_H */
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index f44fda1..64eae77 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -197,12 +197,14 @@ struct dfl_feature_driver {
* feature dev (platform device)'s reources.
* @ioaddr: mapped mmio resource address.
* @ops: ops of this sub feature.
+ * @priv: priv data of this feature.
*/
struct dfl_feature {
u64 id;
int resource_index;
void __iomem *ioaddr;
const struct dfl_feature_ops *ops;
+ void *priv;
};
#define DEV_STATUS_IN_USE 0
--
1.8.3.1
* [PATCH v2 10/11] fpga: dfl: fme: add capability sysfs interfaces
From: Wu Hao @ 2019-07-05 0:23 UTC
To: gregkh, mdf, linux-fpga
Cc: linux-kernel, linux-api, atull, Wu Hao, Luwei Kang, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
This patch adds three read-only sysfs interfaces to the FPGA Management
Engine (FME) block that expose its capabilities: cache_size,
fabric_version and socket_id.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: rebased.
---
Documentation/ABI/testing/sysfs-platform-dfl-fme | 23 ++++++++++++
drivers/fpga/dfl-fme-main.c | 48 ++++++++++++++++++++++++
2 files changed, 71 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
index 8fa4feb..99cd3b2 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
@@ -21,3 +21,26 @@ Contact: Wu Hao <hao.wu@intel.com>
Description: Read-only. It returns Bitstream (static FPGA region) meta
data, which includes the synthesis date, seed and other
information of this static FPGA region.
+
+What: /sys/bus/platform/devices/dfl-fme.0/cache_size
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. It returns cache size of this FPGA device.
+
+What: /sys/bus/platform/devices/dfl-fme.0/fabric_version
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. It returns fabric version of this FPGA device.
+ Userspace applications need this information to select
+ best data channels per different fabric design.
+
+What: /sys/bus/platform/devices/dfl-fme.0/socket_id
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. It returns socket_id to indicate which socket
+ this FPGA belongs to, only valid for integrated solution.
+ User only needs this information, in case standard numa node
+ can't provide correct information.
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index e333f19..c8703c4 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -73,10 +73,58 @@ static ssize_t bitstream_metadata_show(struct device *dev,
}
static DEVICE_ATTR_RO(bitstream_metadata);
+static ssize_t cache_size_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ void __iomem *base;
+ u64 v;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, FME_FEATURE_ID_HEADER);
+
+ v = readq(base + FME_HDR_CAP);
+
+ return sprintf(buf, "%u\n",
+ (unsigned int)FIELD_GET(FME_CAP_CACHE_SIZE, v));
+}
+static DEVICE_ATTR_RO(cache_size);
+
+static ssize_t fabric_version_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ void __iomem *base;
+ u64 v;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, FME_FEATURE_ID_HEADER);
+
+ v = readq(base + FME_HDR_CAP);
+
+ return sprintf(buf, "%u\n",
+ (unsigned int)FIELD_GET(FME_CAP_FABRIC_VERID, v));
+}
+static DEVICE_ATTR_RO(fabric_version);
+
+static ssize_t socket_id_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ void __iomem *base;
+ u64 v;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, FME_FEATURE_ID_HEADER);
+
+ v = readq(base + FME_HDR_CAP);
+
+ return sprintf(buf, "%u\n",
+ (unsigned int)FIELD_GET(FME_CAP_SOCKET_ID, v));
+}
+static DEVICE_ATTR_RO(socket_id);
+
static struct attribute *fme_hdr_attrs[] = {
&dev_attr_ports_num.attr,
&dev_attr_bitstream_id.attr,
&dev_attr_bitstream_metadata.attr,
+ &dev_attr_cache_size.attr,
+ &dev_attr_fabric_version.attr,
+ &dev_attr_socket_id.attr,
NULL,
};
ATTRIBUTE_GROUPS(fme_hdr);
--
1.8.3.1
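
Each of the three show functions above reads FME_HDR_CAP once and pulls one bit field out of it with FIELD_GET(). A minimal userspace sketch of that extraction follows. The helper names genmask_u64(), field_get_u64() and field_prep_u64() are made up for this example and only approximate the kernel's GENMASK_ULL(), FIELD_GET() and FIELD_PREP() macros. The bit positions used with them are illustrative; the real layout of FME_HDR_CAP is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Contiguous mask covering bits h..l inclusive, similar to GENMASK_ULL(h, l). */
static uint64_t genmask_u64(unsigned int h, unsigned int l)
{
	assert(h < 64 && l <= h);
	return (UINT64_MAX >> (63 - h)) & (UINT64_MAX << l);
}

/*
 * Extract the field selected by mask from reg.
 * mask & (~mask + 1) isolates the lowest set bit of the mask; dividing
 * by that power of two is the same as shifting right to bit 0.
 */
static uint64_t field_get_u64(uint64_t mask, uint64_t reg)
{
	assert(mask != 0);
	return (reg & mask) / (mask & (~mask + 1));
}

/* Place val into the field selected by mask; bits that do not fit are dropped. */
static uint64_t field_prep_u64(uint64_t mask, uint64_t val)
{
	assert(mask != 0);
	return (val * (mask & (~mask + 1))) & mask;
}
```

Deriving the shift from the mask itself keeps each field definition to a single constant, which is why the show functions need only the mask name and the raw register value.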
* [PATCH v2 09/11] fpga: dfl: afu: add STP (SignalTap) support
From: Wu Hao @ 2019-07-05 0:23 UTC
To: gregkh, mdf, linux-fpga; +Cc: linux-kernel, linux-api, atull, Wu Hao, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
STP (SignalTap) is one of the private features under the port, used
for debugging. This patch adds private feature driver support for it
so that userspace applications can mmap the related MMIO region and
provide the STP service.
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
drivers/fpga/dfl-afu-main.c | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 15dd4cb..395f96e 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -514,6 +514,36 @@ static void port_afu_uinit(struct platform_device *pdev,
.uinit = port_afu_uinit,
};
+static int port_stp_init(struct platform_device *pdev,
+ struct dfl_feature *feature)
+{
+ struct resource *res = &pdev->resource[feature->resource_index];
+
+ dev_dbg(&pdev->dev, "PORT STP Init.\n");
+
+ return afu_mmio_region_add(dev_get_platdata(&pdev->dev),
+ DFL_PORT_REGION_INDEX_STP,
+ resource_size(res), res->start,
+ DFL_PORT_REGION_MMAP | DFL_PORT_REGION_READ |
+ DFL_PORT_REGION_WRITE);
+}
+
+static void port_stp_uinit(struct platform_device *pdev,
+ struct dfl_feature *feature)
+{
+ dev_dbg(&pdev->dev, "PORT STP UInit.\n");
+}
+
+static const struct dfl_feature_id port_stp_id_table[] = {
+ {.id = PORT_FEATURE_ID_STP,},
+ {0,}
+};
+
+static const struct dfl_feature_ops port_stp_ops = {
+ .init = port_stp_init,
+ .uinit = port_stp_uinit,
+};
+
static struct dfl_feature_driver port_feature_drvs[] = {
{
.id_table = port_hdr_id_table,
@@ -528,6 +558,10 @@ static void port_afu_uinit(struct platform_device *pdev,
.ops = &port_err_ops,
},
{
+ .id_table = port_stp_id_table,
+ .ops = &port_stp_ops,
+ },
+ {
.ops = NULL,
}
};
--
1.8.3.1
* [PATCH v2 08/11] fpga: dfl: afu: add error reporting support.
From: Wu Hao @ 2019-07-05 0:23 UTC
To: gregkh, mdf, linux-fpga; +Cc: linux-kernel, linux-api, atull, Wu Hao, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
Error reporting is an important private feature: it reports errors
detected on the port and the accelerated function unit (AFU). This
patch introduces several sysfs interfaces that allow userspace to
check and clear errors detected by the hardware.
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: switch to device_add/remove_group for sysfs.
---
Documentation/ABI/testing/sysfs-platform-dfl-port | 39 ++++
drivers/fpga/Makefile | 1 +
drivers/fpga/dfl-afu-error.c | 225 ++++++++++++++++++++++
drivers/fpga/dfl-afu-main.c | 4 +
drivers/fpga/dfl-afu.h | 4 +
5 files changed, 273 insertions(+)
create mode 100644 drivers/fpga/dfl-afu-error.c
diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-port b/Documentation/ABI/testing/sysfs-platform-dfl-port
index 04ea7f2..4aeca94 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-port
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-port
@@ -79,3 +79,42 @@ KernelVersion: 5.3
Contact: Wu Hao <hao.wu@intel.com>
Description: Read-only. Read this file to get the status of issued command
to userclck_freqcntrcmd.
+
+What: /sys/bus/platform/devices/dfl-port.0/errors/revision
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get the revision of this error
+ reporting private feature.
+
+What: /sys/bus/platform/devices/dfl-port.0/errors/errors
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get errors detected on port and
+ Accelerated Function Unit (AFU).
+
+What: /sys/bus/platform/devices/dfl-port.0/errors/first_error
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get the first error detected by
+ hardware.
+
+What: /sys/bus/platform/devices/dfl-port.0/errors/first_malformed_req
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get the first malformed request
+ captured by hardware.
+
+What: /sys/bus/platform/devices/dfl-port.0/errors/clear
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Write-only. Write error code to this file to clear errors.
+ Write fails with -EINVAL if input parsing fails or input error
+ code doesn't match.
+ Write fails with -EBUSY or -ETIMEDOUT if error can't be cleared
+ as hardware is in low power state (-EBUSY) or not responding
+ (-ETIMEDOUT).
diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
index 312b937..7255891 100644
--- a/drivers/fpga/Makefile
+++ b/drivers/fpga/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_FPGA_DFL_AFU) += dfl-afu.o
dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o
dfl-afu-objs := dfl-afu-main.o dfl-afu-region.o dfl-afu-dma-region.o
+dfl-afu-objs += dfl-afu-error.o
# Drivers for FPGAs which implement DFL
obj-$(CONFIG_FPGA_DFL_PCI) += dfl-pci.o
diff --git a/drivers/fpga/dfl-afu-error.c b/drivers/fpga/dfl-afu-error.c
new file mode 100644
index 0000000..9649da8
--- /dev/null
+++ b/drivers/fpga/dfl-afu-error.c
@@ -0,0 +1,225 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for FPGA Accelerated Function Unit (AFU) Error Reporting
+ *
+ * Copyright 2019 Intel Corporation, Inc.
+ *
+ * Authors:
+ * Wu Hao <hao.wu@linux.intel.com>
+ * Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ * Joseph Grecco <joe.grecco@intel.com>
+ * Enno Luebbers <enno.luebbers@intel.com>
+ * Tim Whisonant <tim.whisonant@intel.com>
+ * Ananda Ravuri <ananda.ravuri@intel.com>
+ * Mitchel Henry <henry.mitchel@intel.com>
+ */
+
+#include <linux/uaccess.h>
+
+#include "dfl-afu.h"
+
+#define PORT_ERROR_MASK 0x8
+#define PORT_ERROR 0x10
+#define PORT_FIRST_ERROR 0x18
+#define PORT_MALFORMED_REQ0 0x20
+#define PORT_MALFORMED_REQ1 0x28
+
+#define ERROR_MASK GENMASK_ULL(63, 0)
+
+/* mask or unmask port errors by the error mask register. */
+static void __port_err_mask(struct device *dev, bool mask)
+{
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+ writeq(mask ? ERROR_MASK : 0, base + PORT_ERROR_MASK);
+}
+
+/* clear port errors. */
+static int __port_err_clear(struct device *dev, u64 err)
+{
+ struct platform_device *pdev = to_platform_device(dev);
+ void __iomem *base_err, *base_hdr;
+ int ret;
+ u64 v;
+
+ base_err = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+ base_hdr = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ /*
+ * clear Port Errors
+ *
+ * - Check for AP6 State
+ * - Halt Port by keeping Port in reset
+ * - Set PORT Error mask to all 1 to mask errors
+ * - Clear all errors
+ * - Set Port mask to all 0 to enable errors
+ * - All errors start capturing new errors
+ * - Enable Port by pulling the port out of reset
+ */
+
+ /* if device is still in AP6 power state, can not clear any error. */
+ v = readq(base_hdr + PORT_HDR_STS);
+ if (FIELD_GET(PORT_STS_PWR_STATE, v) == PORT_STS_PWR_STATE_AP6) {
+ dev_err(dev, "Could not clear errors, device in AP6 state.\n");
+ return -EBUSY;
+ }
+
+ /* Halt Port by keeping Port in reset */
+ ret = __port_disable(pdev);
+ if (ret)
+ return ret;
+
+ /* Mask all errors */
+ __port_err_mask(dev, true);
+
+ /* Clear errors if the err input matches the current port errors. */
+ v = readq(base_err + PORT_ERROR);
+
+ if (v == err) {
+ writeq(v, base_err + PORT_ERROR);
+
+ v = readq(base_err + PORT_FIRST_ERROR);
+ writeq(v, base_err + PORT_FIRST_ERROR);
+ } else {
+ ret = -EINVAL;
+ }
+
+ /* Clear mask */
+ __port_err_mask(dev, false);
+
+ /* Enable the Port by clearing the reset */
+ __port_enable(pdev);
+
+ return ret;
+}
+
+static ssize_t revision_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+ return sprintf(buf, "%u\n", dfl_feature_revision(base));
+}
+static DEVICE_ATTR_RO(revision);
+
+static ssize_t errors_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u64 error;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+ mutex_lock(&pdata->lock);
+ error = readq(base + PORT_ERROR);
+ mutex_unlock(&pdata->lock);
+
+ return sprintf(buf, "0x%llx\n", (unsigned long long)error);
+}
+static DEVICE_ATTR_RO(errors);
+
+static ssize_t first_error_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u64 error;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+ mutex_lock(&pdata->lock);
+ error = readq(base + PORT_FIRST_ERROR);
+ mutex_unlock(&pdata->lock);
+
+ return sprintf(buf, "0x%llx\n", (unsigned long long)error);
+}
+static DEVICE_ATTR_RO(first_error);
+
+static ssize_t first_malformed_req_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u64 req0, req1;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+ mutex_lock(&pdata->lock);
+ req0 = readq(base + PORT_MALFORMED_REQ0);
+ req1 = readq(base + PORT_MALFORMED_REQ1);
+ mutex_unlock(&pdata->lock);
+
+ return sprintf(buf, "0x%016llx%016llx\n",
+ (unsigned long long)req1, (unsigned long long)req0);
+}
+static DEVICE_ATTR_RO(first_malformed_req);
+
+static ssize_t clear_store(struct device *dev, struct device_attribute *attr,
+ const char *buff, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ u64 value;
+ int ret;
+
+ if (kstrtou64(buff, 0, &value))
+ return -EINVAL;
+
+ mutex_lock(&pdata->lock);
+ ret = __port_err_clear(dev, value);
+ mutex_unlock(&pdata->lock);
+
+ return ret ? ret : count;
+}
+static DEVICE_ATTR_WO(clear);
+
+static struct attribute *port_err_attrs[] = {
+ &dev_attr_revision.attr,
+ &dev_attr_errors.attr,
+ &dev_attr_first_error.attr,
+ &dev_attr_first_malformed_req.attr,
+ &dev_attr_clear.attr,
+ NULL,
+};
+
+static struct attribute_group port_err_attr_group = {
+ .attrs = port_err_attrs,
+ .name = "errors",
+};
+
+static int port_err_init(struct platform_device *pdev,
+ struct dfl_feature *feature)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
+
+ dev_dbg(&pdev->dev, "PORT ERR Init.\n");
+
+ mutex_lock(&pdata->lock);
+ __port_err_mask(&pdev->dev, false);
+ mutex_unlock(&pdata->lock);
+
+ return device_add_group(&pdev->dev, &port_err_attr_group);
+}
+
+static void port_err_uinit(struct platform_device *pdev,
+ struct dfl_feature *feature)
+{
+ dev_dbg(&pdev->dev, "PORT ERR UInit.\n");
+
+ device_remove_group(&pdev->dev, &port_err_attr_group);
+}
+
+const struct dfl_feature_id port_err_id_table[] = {
+ {.id = PORT_FEATURE_ID_ERROR,},
+ {0,}
+};
+
+const struct dfl_feature_ops port_err_ops = {
+ .init = port_err_init,
+ .uinit = port_err_uinit,
+};
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 22d294b..15dd4cb 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -524,6 +524,10 @@ static void port_afu_uinit(struct platform_device *pdev,
.ops = &port_afu_ops,
},
{
+ .id_table = port_err_id_table,
+ .ops = &port_err_ops,
+ },
+ {
.ops = NULL,
}
};
diff --git a/drivers/fpga/dfl-afu.h b/drivers/fpga/dfl-afu.h
index 35e60c5..c3182a2 100644
--- a/drivers/fpga/dfl-afu.h
+++ b/drivers/fpga/dfl-afu.h
@@ -100,4 +100,8 @@ int afu_dma_map_region(struct dfl_feature_platform_data *pdata,
struct dfl_afu_dma_region *
afu_dma_region_find(struct dfl_feature_platform_data *pdata,
u64 iova, u64 size);
+
+extern const struct dfl_feature_ops port_err_ops;
+extern const struct dfl_feature_id port_err_id_table[];
+
#endif /* __DFL_AFU_H */
--
1.8.3.1
* [PATCH v2 07/11] fpga: dfl: afu: export __port_enable/disable function.
From: Wu Hao @ 2019-07-05 0:23 UTC (permalink / raw)
To: gregkh, mdf, linux-fpga; +Cc: linux-kernel, linux-api, atull, Wu Hao, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
Export these two functions, as they are used by other private features.
For example, the error reporting private feature needs to check the port
status and reset the port in order to clear errors.
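As a rough illustration of why the error reporting feature needs both helpers, the clearing sequence can be modelled in user space as below. This is only a sketch: struct model_port and the model_* functions are hypothetical stand-ins, not driver code, and the write-back-to-clear behaviour of the error registers is an assumption made for the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PWR_STATE_AP6 6

/* Hypothetical user-space stand-in for a port; not the driver's data. */
struct model_port {
	uint64_t error;         /* models PORT_ERROR */
	uint64_t first_error;   /* models PORT_FIRST_ERROR */
	unsigned int pwr_state; /* models the PORT_STS_PWR_STATE field */
	bool in_reset;
	bool errors_masked;
	bool lock_held;         /* stands in for pdata->lock */
};

/* Like __port_disable(): the caller must already hold the lock. */
static int model_port_disable(struct model_port *p)
{
	assert(p->lock_held);
	p->in_reset = true;
	return 0;
}

/* Like __port_enable(): the caller must already hold the lock. */
static void model_port_enable(struct model_port *p)
{
	assert(p->lock_held);
	p->in_reset = false;
}

static int model_port_err_clear(struct model_port *p, uint64_t err)
{
	int ret;

	assert(p->lock_held);

	/* Nothing can be cleared while the port is in AP6. */
	if (p->pwr_state == MODEL_PWR_STATE_AP6)
		return -EBUSY;

	/* Halt the port by keeping it in reset. */
	ret = model_port_disable(p);
	if (ret)
		return ret;

	p->errors_masked = true;

	/* Clear only if the caller saw the same errors that are pending. */
	if (p->error == err) {
		p->error &= ~err;                  /* assumed write-back-to-clear */
		p->first_error &= ~p->first_error; /* assumed write-back-to-clear */
	} else {
		ret = -EINVAL;
	}

	p->errors_masked = false;

	/* Take the port out of reset again, even on a mismatch. */
	model_port_enable(p);

	return ret;
}
```

The model keeps the same ordering as the error clearing code in this series: a mismatch between the requested and pending errors still unmasks and re-enables the port before returning -EINVAL.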
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: rebased.
---
drivers/fpga/dfl-afu-main.c | 25 ++++++++++++++-----------
drivers/fpga/dfl-afu.h | 3 +++
2 files changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index fbd9553..22d294b 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -22,14 +22,16 @@
#include "dfl-afu.h"
/**
- * port_enable - enable a port
+ * __port_enable - enable a port
* @pdev: port platform device.
*
* Enable Port by clear the port soft reset bit, which is set by default.
* The AFU is unable to respond to any MMIO access while in reset.
- * port_enable function should only be used after port_disable function.
+ * __port_enable function should only be used after __port_disable function.
+ *
+ * The caller needs to hold lock for protection.
*/
-static void port_enable(struct platform_device *pdev)
+void __port_enable(struct platform_device *pdev)
{
struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
void __iomem *base;
@@ -52,13 +54,14 @@ static void port_enable(struct platform_device *pdev)
#define RST_POLL_TIMEOUT 1000 /* us */
/**
- * port_disable - disable a port
+ * __port_disable - disable a port
* @pdev: port platform device.
*
- * Disable Port by setting the port soft reset bit, it puts the port into
- * reset.
+ * Disable Port by setting the port soft reset bit, it puts the port into reset.
+ *
+ * The caller needs to hold lock for protection.
*/
-static int port_disable(struct platform_device *pdev)
+int __port_disable(struct platform_device *pdev)
{
struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
void __iomem *base;
@@ -104,9 +107,9 @@ static int __port_reset(struct platform_device *pdev)
{
int ret;
- ret = port_disable(pdev);
+ ret = __port_disable(pdev);
if (!ret)
- port_enable(pdev);
+ __port_enable(pdev);
return ret;
}
@@ -806,9 +809,9 @@ static int port_enable_set(struct platform_device *pdev, bool enable)
mutex_lock(&pdata->lock);
if (enable)
- port_enable(pdev);
+ __port_enable(pdev);
else
- ret = port_disable(pdev);
+ ret = __port_disable(pdev);
mutex_unlock(&pdata->lock);
return ret;
diff --git a/drivers/fpga/dfl-afu.h b/drivers/fpga/dfl-afu.h
index 0c7630a..35e60c5 100644
--- a/drivers/fpga/dfl-afu.h
+++ b/drivers/fpga/dfl-afu.h
@@ -79,6 +79,9 @@ struct dfl_afu {
struct dfl_feature_platform_data *pdata;
};
+void __port_enable(struct platform_device *pdev);
+int __port_disable(struct platform_device *pdev);
+
void afu_mmio_region_init(struct dfl_feature_platform_data *pdata);
int afu_mmio_region_add(struct dfl_feature_platform_data *pdata,
u32 region_index, u64 region_size, u64 phys, u32 flags);
--
1.8.3.1
* [PATCH v2 06/11] fpga: dfl: add id_table for dfl private feature driver
From: Wu Hao @ 2019-07-05 0:23 UTC (permalink / raw)
To: gregkh, mdf, linux-fpga; +Cc: linux-kernel, linux-api, atull, Wu Hao, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
This patch adds an id_table to each dfl private feature driver. It
allows the same private feature driver to match and support multiple
dfl private features.
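The matching idea can be sketched in user space as follows. The struct and function names here (feature_id, feature_driver, feature_drv_match, example_ids) are illustrative only and the ids 0x10 and 0x11 are made up; the sketch just shows a zero-terminated table being walked until a match is found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the id and driver structures. */
struct feature_id {
	uint64_t id;
};

struct feature_driver {
	const struct feature_id *id_table; /* terminated by an entry with id 0 */
	const char *name;
};

/* Return true if any entry in the driver's id_table equals feature_id. */
static bool feature_drv_match(uint64_t feature_id,
			      const struct feature_driver *drv)
{
	const struct feature_id *ids;

	assert(drv != NULL);

	ids = drv->id_table;
	if (!ids)
		return false;

	for (; ids->id; ids++) {
		if (ids->id == feature_id)
			return true;
	}

	return false;
}

/* One driver that claims two (made-up) feature ids. */
static const struct feature_id example_ids[] = {
	{ .id = 0x10 },
	{ .id = 0x11 },
	{ 0 }
};

static const struct feature_driver example_drv = {
	.id_table = example_ids,
	.name = "example",
};
```

Because an id of 0 terminates the table, a feature whose id is 0 can never be matched this way. That is presumably why this patch also moves FEATURE_ID_FIU_HEADER from 0x0 to 0xfe.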
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: rebased, remove DRV/MODULE_VERSION modifications
---
drivers/fpga/dfl-afu-main.c | 14 ++++++++++++--
drivers/fpga/dfl-fme-main.c | 11 ++++++++---
drivers/fpga/dfl-fme-pr.c | 7 ++++++-
drivers/fpga/dfl-fme.h | 3 ++-
drivers/fpga/dfl.c | 18 ++++++++++++++++--
drivers/fpga/dfl.h | 21 +++++++++++++++------
6 files changed, 59 insertions(+), 15 deletions(-)
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 9025314..fbd9553 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -435,6 +435,11 @@ static void port_hdr_uinit(struct platform_device *pdev,
return ret;
}
+static const struct dfl_feature_id port_hdr_id_table[] = {
+ {.id = PORT_FEATURE_ID_HEADER,},
+ {0,}
+};
+
static const struct dfl_feature_ops port_hdr_ops = {
.init = port_hdr_init,
.uinit = port_hdr_uinit,
@@ -496,6 +501,11 @@ static void port_afu_uinit(struct platform_device *pdev,
device_remove_groups(&pdev->dev, port_afu_groups);
}
+static const struct dfl_feature_id port_afu_id_table[] = {
+ {.id = PORT_FEATURE_ID_AFU,},
+ {0,}
+};
+
static const struct dfl_feature_ops port_afu_ops = {
.init = port_afu_init,
.uinit = port_afu_uinit,
@@ -503,11 +513,11 @@ static void port_afu_uinit(struct platform_device *pdev,
static struct dfl_feature_driver port_feature_drvs[] = {
{
- .id = PORT_FEATURE_ID_HEADER,
+ .id_table = port_hdr_id_table,
.ops = &port_hdr_ops,
},
{
- .id = PORT_FEATURE_ID_AFU,
+ .id_table = port_afu_id_table,
.ops = &port_afu_ops,
},
{
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index e61e0fe..e333f19 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -133,6 +133,11 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
return -ENODEV;
}
+static const struct dfl_feature_id fme_hdr_id_table[] = {
+ {.id = FME_FEATURE_ID_HEADER,},
+ {0,}
+};
+
static const struct dfl_feature_ops fme_hdr_ops = {
.init = fme_hdr_init,
.uinit = fme_hdr_uinit,
@@ -141,12 +146,12 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
static struct dfl_feature_driver fme_feature_drvs[] = {
{
- .id = FME_FEATURE_ID_HEADER,
+ .id_table = fme_hdr_id_table,
.ops = &fme_hdr_ops,
},
{
- .id = FME_FEATURE_ID_PR_MGMT,
- .ops = &pr_mgmt_ops,
+ .id_table = fme_pr_mgmt_id_table,
+ .ops = &fme_pr_mgmt_ops,
},
{
.ops = NULL,
diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
index cd94ba8..52f1745 100644
--- a/drivers/fpga/dfl-fme-pr.c
+++ b/drivers/fpga/dfl-fme-pr.c
@@ -483,7 +483,12 @@ static long fme_pr_ioctl(struct platform_device *pdev,
return ret;
}
-const struct dfl_feature_ops pr_mgmt_ops = {
+const struct dfl_feature_id fme_pr_mgmt_id_table[] = {
+ {.id = FME_FEATURE_ID_PR_MGMT,},
+ {0}
+};
+
+const struct dfl_feature_ops fme_pr_mgmt_ops = {
.init = pr_mgmt_init,
.uinit = pr_mgmt_uinit,
.ioctl = fme_pr_ioctl,
diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
index de20755..7a021c4 100644
--- a/drivers/fpga/dfl-fme.h
+++ b/drivers/fpga/dfl-fme.h
@@ -35,6 +35,7 @@ struct dfl_fme {
struct dfl_feature_platform_data *pdata;
};
-extern const struct dfl_feature_ops pr_mgmt_ops;
+extern const struct dfl_feature_ops fme_pr_mgmt_ops;
+extern const struct dfl_feature_id fme_pr_mgmt_id_table[];
#endif /* __DFL_FME_H */
diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
index c3a8e1d..3eb67ab 100644
--- a/drivers/fpga/dfl.c
+++ b/drivers/fpga/dfl.c
@@ -281,6 +281,21 @@ static int dfl_feature_instance_init(struct platform_device *pdev,
return ret;
}
+static bool dfl_feature_drv_match(struct dfl_feature *feature,
+ struct dfl_feature_driver *driver)
+{
+ const struct dfl_feature_id *ids = driver->id_table;
+
+ if (ids) {
+ while (ids->id) {
+ if (ids->id == feature->id)
+ return true;
+ ids++;
+ }
+ }
+ return false;
+}
+
/**
* dfl_fpga_dev_feature_init - init for sub features of dfl feature device
* @pdev: feature device.
@@ -301,8 +316,7 @@ int dfl_fpga_dev_feature_init(struct platform_device *pdev,
while (drv->ops) {
dfl_fpga_dev_for_each_feature(pdata, feature) {
- /* match feature and drv using id */
- if (feature->id == drv->id) {
+ if (dfl_feature_drv_match(feature, drv)) {
ret = dfl_feature_instance_init(pdev, pdata,
feature, drv);
if (ret)
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index 77b5137..f44fda1 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -30,8 +30,8 @@
/* plus one for fme device */
#define MAX_DFL_FEATURE_DEV_NUM (MAX_DFL_FPGA_PORT_NUM + 1)
-/* Reserved 0x0 for Header Group Register and 0xff for AFU */
-#define FEATURE_ID_FIU_HEADER 0x0
+/* Reserved 0xfe for Header Group Register and 0xff for AFU */
+#define FEATURE_ID_FIU_HEADER 0xfe
#define FEATURE_ID_AFU 0xff
#define FME_FEATURE_ID_HEADER FEATURE_ID_FIU_HEADER
@@ -169,13 +169,22 @@ struct dfl_fpga_port_ops {
int dfl_fpga_check_port_id(struct platform_device *pdev, void *pport_id);
/**
- * struct dfl_feature_driver - sub feature's driver
+ * struct dfl_feature_id - dfl private feature id
*
- * @id: sub feature id.
- * @ops: ops of this sub feature.
+ * @id: unique dfl private feature id.
*/
-struct dfl_feature_driver {
+struct dfl_feature_id {
u64 id;
+};
+
+/**
+ * struct dfl_feature_driver - dfl private feature driver
+ *
+ * @id_table: id_table for dfl private features supported by this driver.
+ * @ops: ops of this dfl private feature driver.
+ */
+struct dfl_feature_driver {
+ const struct dfl_feature_id *id_table;
const struct dfl_feature_ops *ops;
};
--
1.8.3.1
* [PATCH v2 05/11] fpga: dfl: afu: add userclock sysfs interfaces.
From: Wu Hao @ 2019-07-05 0:23 UTC (permalink / raw)
To: gregkh, mdf, linux-fpga
Cc: linux-kernel, linux-api, atull, Wu Hao, Ananda Ravuri,
Russ Weight, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
This patch introduces userclock sysfs interfaces for AFU, which users
can use to set the AFU clock.
Note that these interfaces only work for the port header feature with
revision 0. In later revisions, userclock setting moves to a separate
private feature, so a revision sysfs interface is also exposed to let
userspace applications tell the two cases apart.
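A userspace tool might interpret the text read from the revision file roughly as below. This is a sketch only: the helper name userclk_in_port_header is hypothetical, and it assumes the file content is a bare hexadecimal number followed by a newline, as produced by the "%x\n" format used for the port header revision in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: given the text read from the port "revision" file,
 * return 1 if the userclk attributes are expected in the port header
 * (revision 0), 0 if they are expected elsewhere (revision > 0), or
 * -EINVAL if the text cannot be parsed.
 */
static int userclk_in_port_header(const char *revision_text)
{
	unsigned long rev;
	char *end;

	assert(revision_text != NULL);

	errno = 0;
	rev = strtoul(revision_text, &end, 16);
	if (end == revision_text || errno)
		return -EINVAL;

	/* Accept only a trailing newline or end of string after the number. */
	if (*end != '\n' && *end != '\0')
		return -EINVAL;

	return rev == 0 ? 1 : 0;
}
```

Reading the sysfs file itself is left out so that the parsing step stays self-contained.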
Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: rebased, and switched to use device_add/remove_groups for sysfs.
---
Documentation/ABI/testing/sysfs-platform-dfl-port | 35 +++++++
drivers/fpga/dfl-afu-main.c | 114 +++++++++++++++++++++-
drivers/fpga/dfl.h | 4 +
3 files changed, 152 insertions(+), 1 deletion(-)
diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-port b/Documentation/ABI/testing/sysfs-platform-dfl-port
index 17b37d1..04ea7f2 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-port
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-port
@@ -44,3 +44,38 @@ Contact: Wu Hao <hao.wu@intel.com>
Description: Read-write. Read and set AFU latency tolerance reporting value.
Set ltr to 1 if the AFU can tolerate latency >= 40us or set it
to 0 if it is latency sensitive.
+
+What: /sys/bus/platform/devices/dfl-port.0/revision
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get the revision of port header
+ feature.
+
+What: /sys/bus/platform/devices/dfl-port.0/userclk_freqcmd
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Write-only. User writes command to this interface to set
+ userclock to AFU.
+
+What: /sys/bus/platform/devices/dfl-port.0/userclk_freqsts
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get the status of issued command
+ to userclk_freqcmd.
+
+What: /sys/bus/platform/devices/dfl-port.0/userclk_freqcntrcmd
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Write-only. User writes command to this interface to set
+ userclock counter.
+
+What: /sys/bus/platform/devices/dfl-port.0/userclk_freqcntrsts
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Read this file to get the status of issued command
+ to userclk_freqcntrcmd.
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index cb3f73e..9025314 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -142,6 +142,17 @@ static int port_get_id(struct platform_device *pdev)
static DEVICE_ATTR_RO(id);
static ssize_t
+revision_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ return sprintf(buf, "%x\n", dfl_feature_revision(base));
+}
+static DEVICE_ATTR_RO(revision);
+
+static ssize_t
ltr_show(struct device *dev, struct device_attribute *attr, char *buf)
{
struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
@@ -276,6 +287,7 @@ static int port_get_id(struct platform_device *pdev)
static struct attribute *port_hdr_attrs[] = {
&dev_attr_id.attr,
+ &dev_attr_revision.attr,
&dev_attr_ltr.attr,
&dev_attr_ap1_event.attr,
&dev_attr_ap2_event.attr,
@@ -284,14 +296,113 @@ static int port_get_id(struct platform_device *pdev)
};
ATTRIBUTE_GROUPS(port_hdr);
+static ssize_t
+userclk_freqcmd_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ u64 userclk_freq_cmd;
+ void __iomem *base;
+
+ if (kstrtou64(buf, 0, &userclk_freq_cmd))
+ return -EINVAL;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ mutex_lock(&pdata->lock);
+ writeq(userclk_freq_cmd, base + PORT_HDR_USRCLK_CMD0);
+ mutex_unlock(&pdata->lock);
+
+ return count;
+}
+static DEVICE_ATTR_WO(userclk_freqcmd);
+
+static ssize_t
+userclk_freqcntrcmd_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ u64 userclk_freqcntr_cmd;
+ void __iomem *base;
+
+ if (kstrtou64(buf, 0, &userclk_freqcntr_cmd))
+ return -EINVAL;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ mutex_lock(&pdata->lock);
+ writeq(userclk_freqcntr_cmd, base + PORT_HDR_USRCLK_CMD1);
+ mutex_unlock(&pdata->lock);
+
+ return count;
+}
+static DEVICE_ATTR_WO(userclk_freqcntrcmd);
+
+static ssize_t
+userclk_freqsts_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ u64 userclk_freqsts;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ userclk_freqsts = readq(base + PORT_HDR_USRCLK_STS0);
+
+ return sprintf(buf, "0x%llx\n", (unsigned long long)userclk_freqsts);
+}
+static DEVICE_ATTR_RO(userclk_freqsts);
+
+static ssize_t
+userclk_freqcntrsts_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ u64 userclk_freqcntrsts;
+ void __iomem *base;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ userclk_freqcntrsts = readq(base + PORT_HDR_USRCLK_STS1);
+
+ return sprintf(buf, "0x%llx\n",
+ (unsigned long long)userclk_freqcntrsts);
+}
+static DEVICE_ATTR_RO(userclk_freqcntrsts);
+
+static struct attribute *port_hdr_userclk_attrs[] = {
+ &dev_attr_userclk_freqcmd.attr,
+ &dev_attr_userclk_freqcntrcmd.attr,
+ &dev_attr_userclk_freqsts.attr,
+ &dev_attr_userclk_freqcntrsts.attr,
+ NULL,
+};
+ATTRIBUTE_GROUPS(port_hdr_userclk);
+
static int port_hdr_init(struct platform_device *pdev,
struct dfl_feature *feature)
{
+ int ret;
+
dev_dbg(&pdev->dev, "PORT HDR Init.\n");
port_reset(pdev);
- return device_add_groups(&pdev->dev, port_hdr_groups);
+ ret = device_add_groups(&pdev->dev, port_hdr_groups);
+ if (ret)
+ return ret;
+
+ /*
+ * If revision > 0, the userclock registers have moved from the port hdr
+ * register region to a separate private feature.
+ */
+ if (dfl_feature_revision(feature->ioaddr) > 0)
+ return 0;
+
+ ret = device_add_groups(&pdev->dev, port_hdr_userclk_groups);
+ if (ret)
+ device_remove_groups(&pdev->dev, port_hdr_groups);
+
+ return ret;
}
static void port_hdr_uinit(struct platform_device *pdev,
@@ -299,6 +410,7 @@ static void port_hdr_uinit(struct platform_device *pdev,
{
dev_dbg(&pdev->dev, "PORT HDR UInit.\n");
+ device_remove_groups(&pdev->dev, port_hdr_userclk_groups);
device_remove_groups(&pdev->dev, port_hdr_groups);
}
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index fe7bca4..77b5137 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -120,6 +120,10 @@
#define PORT_HDR_CAP 0x30
#define PORT_HDR_CTRL 0x38
#define PORT_HDR_STS 0x40
+#define PORT_HDR_USRCLK_CMD0 0x50
+#define PORT_HDR_USRCLK_CMD1 0x58
+#define PORT_HDR_USRCLK_STS0 0x60
+#define PORT_HDR_USRCLK_STS1 0x68
/* Port Capability Register Bitfield */
#define PORT_CAP_PORT_NUM GENMASK_ULL(1, 0) /* ID of this port */
--
1.8.3.1
* [PATCH v2 04/11] fpga: dfl: afu: add AFU state related sysfs interfaces
From: Wu Hao @ 2019-07-05 0:23 UTC (permalink / raw)
To: gregkh, mdf, linux-fpga
Cc: linux-kernel, linux-api, atull, Wu Hao, Ananda Ravuri, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
This patch introduces more sysfs interfaces for Accelerated
Function Unit (AFU). These interfaces allow users to read
current AFU Power State (APx), read / clear AFU Power (APx)
events which are sticky to identify transient APx state,
and manage AFU's LTR (latency tolerance reporting).
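For reference, decoding a raw Port Status Register value into the fields these interfaces report could look like the user-space sketch below. The bit positions follow the Port Status Register definitions added by this patch; the macro, struct and function names are illustrative and not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the Port Status Register bitfield in this patch. */
#define STS_AP2_EVT         (UINT64_C(1) << 13)
#define STS_AP1_EVT         (UINT64_C(1) << 12)
#define STS_PWR_STATE_SHIFT 8
#define STS_PWR_STATE_MASK  (UINT64_C(0xf) << STS_PWR_STATE_SHIFT)

/* Illustrative container for the decoded fields. */
struct port_status {
	unsigned int power_state; /* 0 normal, 1 AP1, 2 AP2, 6 AP6 */
	bool ap1_event;
	bool ap2_event;
};

static struct port_status decode_port_status(uint64_t v)
{
	struct port_status s;

	s.power_state = (unsigned int)((v & STS_PWR_STATE_MASK) >>
				       STS_PWR_STATE_SHIFT);
	s.ap1_event = (v & STS_AP1_EVT) != 0;
	s.ap2_event = (v & STS_AP2_EVT) != 0;

	return s;
}

/* Map a power state value to the names used in the ABI description. */
static const char *power_state_name(unsigned int state)
{
	assert(state <= 0xf);

	switch (state) {
	case 0:
		return "Normal";
	case 1:
		return "AP1";
	case 2:
		return "AP2";
	case 6:
		return "AP6";
	default:
		return "unknown";
	}
}
```

In the driver itself the same extraction is done with FIELD_GET() on the value read from PORT_HDR_STS.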
Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: rebased, and remove DRV/MODULE_VERSION modifications
---
Documentation/ABI/testing/sysfs-platform-dfl-port | 30 +++++
drivers/fpga/dfl-afu-main.c | 137 ++++++++++++++++++++++
drivers/fpga/dfl.h | 11 ++
3 files changed, 178 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-port b/Documentation/ABI/testing/sysfs-platform-dfl-port
index 6a92dda..17b37d1 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-port
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-port
@@ -14,3 +14,33 @@ Description: Read-only. User can program different PR bitstreams to FPGA
Accelerator Function Unit (AFU) for different functions. It
returns uuid which could be used to identify which PR bitstream
is programmed in this AFU.
+
+What: /sys/bus/platform/devices/dfl-port.0/power_state
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. It reports the APx (AFU Power) state, different APx
+ means different throttling level. When reading this file, it
+ returns "0" - Normal / "1" - AP1 / "2" - AP2 / "6" - AP6.
+
+What: /sys/bus/platform/devices/dfl-port.0/ap1_event
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-write. Read this file to get the AP1 (AFU Power State 1)
+ event, or write 1 to clear it. It's used to indicate transient AP1 state.
+
+What: /sys/bus/platform/devices/dfl-port.0/ap2_event
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-write. Read this file to get the AP2 (AFU Power State 2)
+ event, or write 1 to clear it. It's used to indicate transient AP2 state.
+
+What: /sys/bus/platform/devices/dfl-port.0/ltr
+Date: June 2019
+KernelVersion: 5.3
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-write. Read and set AFU latency tolerance reporting value.
+ Set ltr to 1 if the AFU can tolerate latency >= 40us or set it
+ to 0 if it is latency sensitive.
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 68b4d08..cb3f73e 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -141,8 +141,145 @@ static int port_get_id(struct platform_device *pdev)
}
static DEVICE_ATTR_RO(id);
+static ssize_t
+ltr_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u64 v;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ mutex_lock(&pdata->lock);
+ v = readq(base + PORT_HDR_CTRL);
+ mutex_unlock(&pdata->lock);
+
+ return sprintf(buf, "%x\n", (u8)FIELD_GET(PORT_CTRL_LATENCY, v));
+}
+
+static ssize_t
+ltr_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u8 ltr;
+ u64 v;
+
+ if (kstrtou8(buf, 0, &ltr) || ltr > 1)
+ return -EINVAL;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ mutex_lock(&pdata->lock);
+ v = readq(base + PORT_HDR_CTRL);
+ v &= ~PORT_CTRL_LATENCY;
+ v |= FIELD_PREP(PORT_CTRL_LATENCY, ltr);
+ writeq(v, base + PORT_HDR_CTRL);
+ mutex_unlock(&pdata->lock);
+
+ return count;
+}
+static DEVICE_ATTR_RW(ltr);
+
+static ssize_t
+ap1_event_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u64 v;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ mutex_lock(&pdata->lock);
+ v = readq(base + PORT_HDR_STS);
+ mutex_unlock(&pdata->lock);
+
+ return sprintf(buf, "%x\n", (u8)FIELD_GET(PORT_STS_AP1_EVT, v));
+}
+
+static ssize_t
+ap1_event_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u8 ap1_event;
+
+ if (kstrtou8(buf, 0, &ap1_event) || ap1_event != 1)
+ return -EINVAL;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ mutex_lock(&pdata->lock);
+ writeq(PORT_STS_AP1_EVT, base + PORT_HDR_STS);
+ mutex_unlock(&pdata->lock);
+
+ return count;
+}
+static DEVICE_ATTR_RW(ap1_event);
+
+static ssize_t
+ap2_event_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u64 v;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ mutex_lock(&pdata->lock);
+ v = readq(base + PORT_HDR_STS);
+ mutex_unlock(&pdata->lock);
+
+ return sprintf(buf, "%x\n", (u8)FIELD_GET(PORT_STS_AP2_EVT, v));
+}
+
+static ssize_t
+ap2_event_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u8 ap2_event;
+
+ if (kstrtou8(buf, 0, &ap2_event) || ap2_event != 1)
+ return -EINVAL;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ mutex_lock(&pdata->lock);
+ writeq(PORT_STS_AP2_EVT, base + PORT_HDR_STS);
+ mutex_unlock(&pdata->lock);
+
+ return count;
+}
+static DEVICE_ATTR_RW(ap2_event);
+
+static ssize_t
+power_state_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+ void __iomem *base;
+ u64 v;
+
+ base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+ mutex_lock(&pdata->lock);
+ v = readq(base + PORT_HDR_STS);
+ mutex_unlock(&pdata->lock);
+
+ return sprintf(buf, "0x%x\n", (u8)FIELD_GET(PORT_STS_PWR_STATE, v));
+}
+static DEVICE_ATTR_RO(power_state);
+
static struct attribute *port_hdr_attrs[] = {
&dev_attr_id.attr,
+ &dev_attr_ltr.attr,
+ &dev_attr_ap1_event.attr,
+ &dev_attr_ap2_event.attr,
+ &dev_attr_power_state.attr,
NULL,
};
ATTRIBUTE_GROUPS(port_hdr);
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index 061ccd4..fe7bca4 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -119,6 +119,7 @@
#define PORT_HDR_NEXT_AFU NEXT_AFU
#define PORT_HDR_CAP 0x30
#define PORT_HDR_CTRL 0x38
+#define PORT_HDR_STS 0x40
/* Port Capability Register Bitfield */
#define PORT_CAP_PORT_NUM GENMASK_ULL(1, 0) /* ID of this port */
@@ -130,6 +131,16 @@
/* Latency tolerance reporting. '1' >= 40us, '0' < 40us.*/
#define PORT_CTRL_LATENCY BIT_ULL(2)
#define PORT_CTRL_SFTRST_ACK BIT_ULL(4) /* HW ack for reset */
+
+/* Port Status Register Bitfield */
+#define PORT_STS_AP2_EVT BIT_ULL(13) /* AP2 event detected */
+#define PORT_STS_AP1_EVT BIT_ULL(12) /* AP1 event detected */
+#define PORT_STS_PWR_STATE GENMASK_ULL(11, 8) /* AFU power states */
+#define PORT_STS_PWR_STATE_NORM 0
+#define PORT_STS_PWR_STATE_AP1 1 /* 50% throttling */
+#define PORT_STS_PWR_STATE_AP2 2 /* 90% throttling */
+#define PORT_STS_PWR_STATE_AP6 6 /* 100% throttling */
+
/**
* struct dfl_fpga_port_ops - port ops
*
--
1.8.3.1
* [PATCH v2 03/11] fpga: dfl: pci: enable SRIOV support.
From: Wu Hao @ 2019-07-05 0:23 UTC (permalink / raw)
To: gregkh, mdf, linux-fpga
Cc: linux-kernel, linux-api, atull, Wu Hao, Zhang Yi Z, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
This patch enables standard SRIOV support. It allows the user to
enable SRIOV (and VFs), and then either pass accelerators (VFs)
through to a virtual machine or use the VFs directly in the host.
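The decision logic of the configure callback can be modelled in user space as below. This is a sketch under stated assumptions: struct model_cdev and model_sriov_configure are hypothetical names, the enable_fails flag stands in for a failing pci_enable_sriov(), and -EIO is an arbitrary error code chosen for the simulated failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the container device state. */
struct model_cdev {
	int released_port_num;  /* ports released from the PF beforehand */
	bool ports_in_vf_mode;  /* access mode of the released ports */
	int enabled_vfs;        /* VFs currently enabled */
};

static int model_sriov_configure(struct model_cdev *cdev, int num_vfs,
				 bool enable_fails)
{
	int ret = 0;

	assert(cdev != NULL);

	if (num_vfs == 0) {
		/* Disable SRIOV, then put released ports back to PF mode. */
		cdev->enabled_vfs = 0;
		cdev->ports_in_vf_mode = false;
	} else if (cdev->released_port_num == num_vfs) {
		/* Switch released ports to VF mode before enabling SRIOV. */
		cdev->ports_in_vf_mode = true;

		if (enable_fails) {
			/* Roll back the access mode if enabling failed. */
			ret = -EIO;
			cdev->ports_in_vf_mode = false;
		} else {
			cdev->enabled_vfs = num_vfs;
		}
	} else {
		/* The VF count must equal the number of released ports. */
		ret = -EINVAL;
	}

	return ret;
}
```

The model returns 0 on success, mirroring the return value used by the callback in this version of the patch.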
Signed-off-by: Zhang Yi Z <yi.z.zhang@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Acked-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: remove DRV/MODULE_VERSION modifications.
---
drivers/fpga/dfl-pci.c | 39 +++++++++++++++++++++++++++++++++++++++
drivers/fpga/dfl.c | 41 +++++++++++++++++++++++++++++++++++++++++
drivers/fpga/dfl.h | 1 +
3 files changed, 81 insertions(+)
diff --git a/drivers/fpga/dfl-pci.c b/drivers/fpga/dfl-pci.c
index 66b5720..0e65d81 100644
--- a/drivers/fpga/dfl-pci.c
+++ b/drivers/fpga/dfl-pci.c
@@ -223,8 +223,46 @@ int cci_pci_probe(struct pci_dev *pcidev, const struct pci_device_id *pcidevid)
return ret;
}
+static int cci_pci_sriov_configure(struct pci_dev *pcidev, int num_vfs)
+{
+ struct cci_drvdata *drvdata = pci_get_drvdata(pcidev);
+ struct dfl_fpga_cdev *cdev = drvdata->cdev;
+ int ret = 0;
+
+ mutex_lock(&cdev->lock);
+
+ if (!num_vfs) {
+ /*
+ * disable SRIOV and then put released ports back to default
+ * PF access mode.
+ */
+ pci_disable_sriov(pcidev);
+
+ __dfl_fpga_cdev_config_port_vf(cdev, false);
+
+ } else if (cdev->released_port_num == num_vfs) {
+ /*
+ * only enable SRIOV if the number of released ports matches
+ * num_vfs; put the released ports into VF access mode first.
+ */
+ __dfl_fpga_cdev_config_port_vf(cdev, true);
+
+ ret = pci_enable_sriov(pcidev, num_vfs);
+ if (ret)
+ __dfl_fpga_cdev_config_port_vf(cdev, false);
+ } else {
+ ret = -EINVAL;
+ }
+
+ mutex_unlock(&cdev->lock);
+ return ret;
+}
+
static void cci_pci_remove(struct pci_dev *pcidev)
{
+ if (dev_is_pf(&pcidev->dev))
+ cci_pci_sriov_configure(pcidev, 0);
+
cci_remove_feature_devs(pcidev);
pci_disable_pcie_error_reporting(pcidev);
}
@@ -234,6 +272,7 @@ static void cci_pci_remove(struct pci_dev *pcidev)
.id_table = cci_pcie_id_tbl,
.probe = cci_pci_probe,
.remove = cci_pci_remove,
+ .sriov_configure = cci_pci_sriov_configure,
};
module_pci_driver(cci_pci_driver);
diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
index e04ed45..c3a8e1d 100644
--- a/drivers/fpga/dfl.c
+++ b/drivers/fpga/dfl.c
@@ -1112,6 +1112,47 @@ int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev, int port_id,
}
EXPORT_SYMBOL_GPL(dfl_fpga_cdev_config_port);
+static void config_port_vf(struct device *fme_dev, int port_id, bool is_vf)
+{
+ void __iomem *base;
+ u64 v;
+
+ base = dfl_get_feature_ioaddr_by_id(fme_dev, FME_FEATURE_ID_HEADER);
+
+ v = readq(base + FME_HDR_PORT_OFST(port_id));
+
+ v &= ~FME_PORT_OFST_ACC_CTRL;
+ v |= FIELD_PREP(FME_PORT_OFST_ACC_CTRL,
+ is_vf ? FME_PORT_OFST_ACC_VF : FME_PORT_OFST_ACC_PF);
+
+ writeq(v, base + FME_HDR_PORT_OFST(port_id));
+}
+
+/**
+ * __dfl_fpga_cdev_config_port_vf - configure port to VF access mode
+ *
+ * @cdev: parent container device.
+ * @is_vf: true for VF access mode, and false for PF access mode
+ *
+ * This function is needed in the sriov configuration routine. It configures
+ * the access mode of the released ports to VF or PF.
+ * The caller needs to hold lock for protection.
+ */
+void __dfl_fpga_cdev_config_port_vf(struct dfl_fpga_cdev *cdev, bool is_vf)
+{
+ struct dfl_feature_platform_data *pdata;
+
+ list_for_each_entry(pdata, &cdev->port_dev_list, node) {
+ if (device_is_registered(&pdata->dev->dev))
+ continue;
+
+ config_port_vf(cdev->fme_dev, pdata->id, is_vf);
+ }
+}
+EXPORT_SYMBOL_GPL(__dfl_fpga_cdev_config_port_vf);
+
static int __init dfl_fpga_init(void)
{
int ret;
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index d700ee9..061ccd4 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -421,5 +421,6 @@ struct platform_device *
int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
int port_id, bool release);
+void __dfl_fpga_cdev_config_port_vf(struct dfl_fpga_cdev *cdev, bool is_vf);
#endif /* __FPGA_DFL_H */
--
1.8.3.1
^ permalink raw reply related
* [PATCH v2 02/11] fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support.
From: Wu Hao @ 2019-07-05 0:23 UTC (permalink / raw)
To: gregkh, mdf, linux-fpga
Cc: linux-kernel, linux-api, atull, Wu Hao, Zhang Yi Z, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
To support virtualization via PCIe SRIOV, this patch adds two ioctls
under the FPGA Management Engine (FME) to release a port device and
assign it back. To safely turn a port from PF into VF and enable PCIe
SRIOV, the user must first invoke the PORT_RELEASE ioctl, which
removes the port's userspace interfaces, and then configure the PF/VF
access register in FME. After disabling SRIOV, the user must invoke
the PORT_ASSIGN ioctl to attach the port back to the PF.
Ioctl interfaces:
* DFL_FPGA_FME_PORT_RELEASE
Release the platform device of the given port. This deletes the port
platform device to remove the related userspace interfaces on the PF,
then configures the PF/VF access mode to VF.
* DFL_FPGA_FME_PORT_ASSIGN
Assign the platform device of the given port back to the PF. This
configures the PF/VF access mode to PF, then adds the port platform
device back to re-enable the related userspace interfaces on the PF.
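As a rough userspace illustration of the two ioctls described above (a
sketch, not part of the patch): the wrappers below pass a pointer to an
int port ID, matching the get_user() call in fme_hdr_ioctl_config_port().
The DFL_FPGA_MAGIC and DFL_FME_BASE values are assumed here; the
authoritative definitions live in include/uapi/linux/fpga-dfl.h. The
helper names fme_port_release() and fme_port_assign() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <linux/ioctl.h>
#include <sys/ioctl.h>

/* Assumed values; see include/uapi/linux/fpga-dfl.h for the real ones. */
#define DFL_FPGA_MAGIC 0xB6
#define DFL_FME_BASE   0x80

/* Ioctl numbers as defined by this patch. */
#define DFL_FPGA_FME_PORT_RELEASE _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 1, int)
#define DFL_FPGA_FME_PORT_ASSIGN  _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 2, int)

/* Hypothetical helper: release a port. Returns 0 or -errno. */
static int fme_port_release(int fme_fd, int port_id)
{
	/* The kernel side reads an int from the user pointer. */
	if (ioctl(fme_fd, DFL_FPGA_FME_PORT_RELEASE, &port_id) < 0)
		return -errno;
	return 0;
}

/* Hypothetical helper: assign a port back to the PF. Returns 0 or -errno. */
static int fme_port_assign(int fme_fd, int port_id)
{
	if (ioctl(fme_fd, DFL_FPGA_FME_PORT_ASSIGN, &port_id) < 0)
		return -errno;
	return 0;
}
```

A caller would open the FME character device (assumed to be something
like /dev/dfl-fme.0), call fme_port_release(fd, port_id) before enabling
SRIOV, and call fme_port_assign(fd, port_id) after disabling it.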
Signed-off-by: Zhang Yi Z <yi.z.zhang@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Acked-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: remove argsz from ioctls.
---
drivers/fpga/dfl-fme-main.c | 30 ++++++++++++
drivers/fpga/dfl.c | 107 +++++++++++++++++++++++++++++++++++++-----
drivers/fpga/dfl.h | 10 ++++
include/uapi/linux/fpga-dfl.h | 19 ++++++++
4 files changed, 154 insertions(+), 12 deletions(-)
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index 0be4635..e61e0fe 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -16,6 +16,7 @@
#include <linux/kernel.h>
#include <linux/module.h>
+#include <linux/uaccess.h>
#include <linux/fpga-dfl.h>
#include "dfl.h"
@@ -104,9 +105,38 @@ static void fme_hdr_uinit(struct platform_device *pdev,
device_remove_groups(&pdev->dev, fme_hdr_groups);
}
+static long fme_hdr_ioctl_config_port(struct dfl_feature_platform_data *pdata,
+ unsigned long arg, bool release)
+{
+ struct dfl_fpga_cdev *cdev = pdata->dfl_cdev;
+ int port_id;
+
+ if (get_user(port_id, (int __user *)arg))
+ return -EFAULT;
+
+ return dfl_fpga_cdev_config_port(cdev, port_id, release);
+}
+
+static long fme_hdr_ioctl(struct platform_device *pdev,
+ struct dfl_feature *feature,
+ unsigned int cmd, unsigned long arg)
+{
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
+
+ switch (cmd) {
+ case DFL_FPGA_FME_PORT_RELEASE:
+ return fme_hdr_ioctl_config_port(pdata, arg, true);
+ case DFL_FPGA_FME_PORT_ASSIGN:
+ return fme_hdr_ioctl_config_port(pdata, arg, false);
+ }
+
+ return -ENODEV;
+}
+
static const struct dfl_feature_ops fme_hdr_ops = {
.init = fme_hdr_init,
.uinit = fme_hdr_uinit,
+ .ioctl = fme_hdr_ioctl,
};
static struct dfl_feature_driver fme_feature_drvs[] = {
diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
index 4b66aaa..e04ed45 100644
--- a/drivers/fpga/dfl.c
+++ b/drivers/fpga/dfl.c
@@ -231,16 +231,20 @@ void dfl_fpga_port_ops_del(struct dfl_fpga_port_ops *ops)
*/
int dfl_fpga_check_port_id(struct platform_device *pdev, void *pport_id)
{
- struct dfl_fpga_port_ops *port_ops = dfl_fpga_port_ops_get(pdev);
- int port_id;
+ struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
+ struct dfl_fpga_port_ops *port_ops;
+
+ if (pdata->id != FEATURE_DEV_ID_UNUSED)
+ return pdata->id == *(int *)pport_id;
+ port_ops = dfl_fpga_port_ops_get(pdev);
if (!port_ops || !port_ops->get_id)
return 0;
- port_id = port_ops->get_id(pdev);
+ pdata->id = port_ops->get_id(pdev);
dfl_fpga_port_ops_put(port_ops);
- return port_id == *(int *)pport_id;
+ return pdata->id == *(int *)pport_id;
}
EXPORT_SYMBOL_GPL(dfl_fpga_check_port_id);
@@ -474,6 +478,7 @@ static int build_info_commit_dev(struct build_feature_devs_info *binfo)
pdata->dev = fdev;
pdata->num = binfo->feature_num;
pdata->dfl_cdev = binfo->cdev;
+ pdata->id = FEATURE_DEV_ID_UNUSED;
mutex_init(&pdata->lock);
lockdep_set_class_and_name(&pdata->lock, &dfl_pdata_keys[type],
dfl_pdata_key_strings[type]);
@@ -973,25 +978,27 @@ void dfl_fpga_feature_devs_remove(struct dfl_fpga_cdev *cdev)
{
struct dfl_feature_platform_data *pdata, *ptmp;
- remove_feature_devs(cdev);
-
mutex_lock(&cdev->lock);
- if (cdev->fme_dev) {
- /* the fme should be unregistered. */
- WARN_ON(device_is_registered(cdev->fme_dev));
+ if (cdev->fme_dev)
put_device(cdev->fme_dev);
- }
list_for_each_entry_safe(pdata, ptmp, &cdev->port_dev_list, node) {
struct platform_device *port_dev = pdata->dev;
- /* the port should be unregistered. */
- WARN_ON(device_is_registered(&port_dev->dev));
+ /* remove released ports */
+ if (!device_is_registered(&port_dev->dev)) {
+ dfl_id_free(feature_dev_id_type(port_dev),
+ port_dev->id);
+ platform_device_put(port_dev);
+ }
+
list_del(&pdata->node);
put_device(&port_dev->dev);
}
mutex_unlock(&cdev->lock);
+ remove_feature_devs(cdev);
+
fpga_region_unregister(cdev->region);
devm_kfree(cdev->parent, cdev);
}
@@ -1029,6 +1036,82 @@ struct platform_device *
}
EXPORT_SYMBOL_GPL(__dfl_fpga_cdev_find_port);
+static int attach_port_dev(struct dfl_fpga_cdev *cdev, int port_id)
+{
+ struct platform_device *port_pdev;
+ int ret = -ENODEV;
+
+ mutex_lock(&cdev->lock);
+ port_pdev = __dfl_fpga_cdev_find_port(cdev, &port_id,
+ dfl_fpga_check_port_id);
+ if (!port_pdev)
+ goto unlock_exit;
+
+ if (device_is_registered(&port_pdev->dev)) {
+ ret = -EBUSY;
+ goto put_dev_exit;
+ }
+
+ ret = platform_device_add(port_pdev);
+ if (ret)
+ goto put_dev_exit;
+
+ dfl_feature_dev_use_end(dev_get_platdata(&port_pdev->dev));
+ cdev->released_port_num--;
+put_dev_exit:
+ put_device(&port_pdev->dev);
+unlock_exit:
+ mutex_unlock(&cdev->lock);
+ return ret;
+}
+
+static int detach_port_dev(struct dfl_fpga_cdev *cdev, int port_id)
+{
+ struct platform_device *port_pdev;
+ int ret = -ENODEV;
+
+ mutex_lock(&cdev->lock);
+ port_pdev = __dfl_fpga_cdev_find_port(cdev, &port_id,
+ dfl_fpga_check_port_id);
+ if (!port_pdev)
+ goto unlock_exit;
+
+ if (!device_is_registered(&port_pdev->dev)) {
+ ret = -EBUSY;
+ goto put_dev_exit;
+ }
+
+ ret = dfl_feature_dev_use_begin(dev_get_platdata(&port_pdev->dev));
+ if (ret)
+ goto put_dev_exit;
+
+ platform_device_del(port_pdev);
+ cdev->released_port_num++;
+put_dev_exit:
+ put_device(&port_pdev->dev);
+unlock_exit:
+ mutex_unlock(&cdev->lock);
+ return ret;
+}
+
+/**
+ * dfl_fpga_cdev_config_port - configure a port feature dev
+ * @cdev: parent container device.
+ * @port_id: id of the port feature device.
+ * @release: release port or assign port back.
+ *
+ * This function releases a port platform device or assigns it back. E.g. to
+ * safely turn a port from PF into VF for PCI device SRIOV support, releasing
+ * the port platform device is a necessary step.
+ */
+int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev, int port_id,
+ bool release)
+{
+ return release ? detach_port_dev(cdev, port_id) :
+ attach_port_dev(cdev, port_id);
+}
+EXPORT_SYMBOL_GPL(dfl_fpga_cdev_config_port);
+
static int __init dfl_fpga_init(void)
{
int ret;
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index 8851c6c..d700ee9 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -183,6 +183,8 @@ struct dfl_feature {
#define DEV_STATUS_IN_USE 0
+#define FEATURE_DEV_ID_UNUSED (-1)
+
/**
* struct dfl_feature_platform_data - platform data for feature devices
*
@@ -191,6 +193,7 @@ struct dfl_feature {
* @cdev: cdev of feature dev.
* @dev: ptr to platform device linked with this platform data.
* @dfl_cdev: ptr to container device.
+ * @id: id used for this feature device.
* @disable_count: count for port disable.
* @num: number for sub features.
* @dev_status: dev status (e.g. DEV_STATUS_IN_USE).
@@ -203,6 +206,7 @@ struct dfl_feature_platform_data {
struct cdev cdev;
struct platform_device *dev;
struct dfl_fpga_cdev *dfl_cdev;
+ int id;
unsigned int disable_count;
unsigned long dev_status;
void *private;
@@ -378,6 +382,7 @@ int dfl_fpga_enum_info_add_dfl(struct dfl_fpga_enum_info *info,
* @fme_dev: FME feature device under this container device.
* @lock: mutex lock to protect the port device list.
* @port_dev_list: list of all port feature devices under this container device.
+ * @released_port_num: released port number under this container device.
*/
struct dfl_fpga_cdev {
struct device *parent;
@@ -385,6 +390,7 @@ struct dfl_fpga_cdev {
struct device *fme_dev;
struct mutex lock;
struct list_head port_dev_list;
+ int released_port_num;
};
struct dfl_fpga_cdev *
@@ -412,4 +418,8 @@ struct platform_device *
return pdev;
}
+
+int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
+ int port_id, bool release);
+
#endif /* __FPGA_DFL_H */
diff --git a/include/uapi/linux/fpga-dfl.h b/include/uapi/linux/fpga-dfl.h
index 2e324e5..72f11fd 100644
--- a/include/uapi/linux/fpga-dfl.h
+++ b/include/uapi/linux/fpga-dfl.h
@@ -176,4 +176,23 @@ struct dfl_fpga_fme_port_pr {
#define DFL_FPGA_FME_PORT_PR _IO(DFL_FPGA_MAGIC, DFL_FME_BASE + 0)
+/**
+ * DFL_FPGA_FME_PORT_RELEASE - _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 1,
+ * int port_id)
+ *
+ * Driver releases the port per Port ID provided by caller.
+ * Return: 0 on success, -errno on failure.
+ */
+#define DFL_FPGA_FME_PORT_RELEASE _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 1, int)
+
+/**
+ * DFL_FPGA_FME_PORT_ASSIGN - _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 2,
+ * int port_id)
+ *
+ * Driver assigns the port back per Port ID provided by caller.
+ * Return: 0 on success, -errno on failure.
+ */
+
+#define DFL_FPGA_FME_PORT_ASSIGN _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 2, int)
+
#endif /* _UAPI_LINUX_FPGA_DFL_H */
--
1.8.3.1
^ permalink raw reply related
* [PATCH v2 01/11] fpga: dfl: fme: support 512bit data width PR
From: Wu Hao @ 2019-07-05 0:23 UTC (permalink / raw)
To: gregkh, mdf, linux-fpga
Cc: linux-kernel, linux-api, atull, Wu Hao, Ananda Ravuri, Xu Yilun
In-Reply-To: <1562286238-11413-1-git-send-email-hao.wu@intel.com>
Early revisions of the partial reconfiguration (PR) private feature
only support a 32bit data width when writing data to hardware for
PR. 512bit data width PR support is an important optimization for
some specific solutions (e.g. XEON with integrated FPGA): it allows
the driver to use AVX512 instructions to improve the performance of
partial reconfiguration. For example, per test results, programming
one 100MB bitstream image via the 512bit data width PR hardware takes
only ~300ms, while the 32bit revision requires ~3s.
Please note that this optimization is currently only done on revision
2 of the PR private feature, which is only used in the integrated
solution, where AVX512 is always supported. Revision 2 hardware
doesn't support 32bit PR.
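As a rough illustration of why the wider data path helps, the sketch
below counts the PR data writes needed for a given bitstream size at
each data width, using the same round-up that the driver applies with
ALIGN(). The helper names pr_align() and pr_write_count() are made up
for this example and are not part of the patch. Taking 100MB as
100 * 1024 * 1024 bytes, it gives 26214400 writes at 4 bytes per write
versus 1638400 at 64 bytes per write, a 16-fold reduction.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round size up to a multiple of width, like the kernel's ALIGN().
 * width must be a non-zero power of two (4 or 64 in this patch).
 */
static size_t pr_align(size_t size, size_t width)
{
	assert(width != 0 && (width & (width - 1)) == 0);
	return (size + width - 1) & ~(width - 1);
}

/* Number of data-register writes needed to push size bytes. */
static size_t pr_write_count(size_t size, size_t width)
{
	return pr_align(size, width) / width;
}
```

Each write also consumes one PR credit, so fewer writes means fewer
credit polls as well as fewer MMIO transactions.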
Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---
v2: remove DRV/MODULE_VERSION modifications
---
drivers/fpga/dfl-fme-mgr.c | 110 ++++++++++++++++++++++++++++++++++++++-------
drivers/fpga/dfl-fme-pr.c | 43 +++++++++++-------
drivers/fpga/dfl-fme.h | 2 +
drivers/fpga/dfl.h | 5 +++
4 files changed, 129 insertions(+), 31 deletions(-)
diff --git a/drivers/fpga/dfl-fme-mgr.c b/drivers/fpga/dfl-fme-mgr.c
index b3f7eee..46e17f0 100644
--- a/drivers/fpga/dfl-fme-mgr.c
+++ b/drivers/fpga/dfl-fme-mgr.c
@@ -22,6 +22,7 @@
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/fpga/fpga-mgr.h>
+#include "dfl.h"
#include "dfl-fme-pr.h"
/* FME Partial Reconfiguration Sub Feature Register Set */
@@ -30,6 +31,7 @@
#define FME_PR_STS 0x10
#define FME_PR_DATA 0x18
#define FME_PR_ERR 0x20
+#define FME_PR_512_DATA 0x40 /* Data Register for 512bit datawidth PR */
#define FME_PR_INTFC_ID_L 0xA8
#define FME_PR_INTFC_ID_H 0xB0
@@ -67,8 +69,43 @@
#define PR_WAIT_TIMEOUT 8000000
#define PR_HOST_STATUS_IDLE 0
+#if defined(CONFIG_X86) && defined(CONFIG_AS_AVX512)
+
+#include <linux/cpufeature.h>
+#include <asm/fpu/api.h>
+
+static inline int is_cpu_avx512_enabled(void)
+{
+ return cpu_feature_enabled(X86_FEATURE_AVX512F);
+}
+
+static inline void copy512(const void *src, void __iomem *dst)
+{
+ kernel_fpu_begin();
+
+ asm volatile("vmovdqu64 (%0), %%zmm0;"
+ "vmovntdq %%zmm0, (%1);"
+ :
+ : "r"(src), "r"(dst)
+ : "memory");
+
+ kernel_fpu_end();
+}
+#else
+static inline int is_cpu_avx512_enabled(void)
+{
+ return 0;
+}
+
+static inline void copy512(const void *src, void __iomem *dst)
+{
+ WARN_ON_ONCE(1);
+}
+#endif
+
struct fme_mgr_priv {
void __iomem *ioaddr;
+ unsigned int pr_datawidth;
u64 pr_error;
};
@@ -169,7 +206,7 @@ static int fme_mgr_write(struct fpga_manager *mgr,
struct fme_mgr_priv *priv = mgr->priv;
void __iomem *fme_pr = priv->ioaddr;
u64 pr_ctrl, pr_status, pr_data;
- int delay = 0, pr_credit, i = 0;
+ int ret = 0, delay = 0, pr_credit;
dev_dbg(dev, "start request\n");
@@ -181,9 +218,9 @@ static int fme_mgr_write(struct fpga_manager *mgr,
/*
* driver can push data to PR hardware using PR_DATA register once HW
- * has enough pr_credit (> 1), pr_credit reduces one for every 32bit
- * pr data write to PR_DATA register. If pr_credit <= 1, driver needs
- * to wait for enough pr_credit from hardware by polling.
+ * has enough pr_credit (> 1), pr_credit reduces one for every pr data
+ * width write to PR_DATA register. If pr_credit <= 1, driver needs to
+ * wait for enough pr_credit from hardware by polling.
*/
pr_status = readq(fme_pr + FME_PR_STS);
pr_credit = FIELD_GET(FME_PR_STS_PR_CREDIT, pr_status);
@@ -192,7 +229,8 @@ static int fme_mgr_write(struct fpga_manager *mgr,
while (pr_credit <= 1) {
if (delay++ > PR_WAIT_TIMEOUT) {
dev_err(dev, "PR_CREDIT timeout\n");
- return -ETIMEDOUT;
+ ret = -ETIMEDOUT;
+ goto done;
}
udelay(1);
@@ -200,21 +238,27 @@ static int fme_mgr_write(struct fpga_manager *mgr,
pr_credit = FIELD_GET(FME_PR_STS_PR_CREDIT, pr_status);
}
- if (count < 4) {
- dev_err(dev, "Invalid PR bitstream size\n");
- return -EINVAL;
+ WARN_ON(count < priv->pr_datawidth);
+
+ switch (priv->pr_datawidth) {
+ case 4:
+ pr_data = FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
+ *(u32 *)buf);
+ writeq(pr_data, fme_pr + FME_PR_DATA);
+ break;
+ case 64:
+ copy512(buf, fme_pr + FME_PR_512_DATA);
+ break;
+ default:
+ WARN_ON_ONCE(1);
}
-
- pr_data = 0;
- pr_data |= FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
- *(((u32 *)buf) + i));
- writeq(pr_data, fme_pr + FME_PR_DATA);
- count -= 4;
+ buf += priv->pr_datawidth;
+ count -= priv->pr_datawidth;
pr_credit--;
- i++;
}
- return 0;
+done:
+ return ret;
}
static int fme_mgr_write_complete(struct fpga_manager *mgr,
@@ -279,6 +323,36 @@ static void fme_mgr_get_compat_id(void __iomem *fme_pr,
id->id_h = readq(fme_pr + FME_PR_INTFC_ID_H);
}
+static u8 fme_mgr_get_pr_datawidth(struct device *dev, void __iomem *fme_pr)
+{
+ u8 revision = dfl_feature_revision(fme_pr);
+
+ if (revision < 2) {
+ /*
+ * revision 0 and 1 only support 32bit data width partial
+ * reconfiguration, so pr_datawidth is 4 (Byte).
+ */
+ return 4;
+ } else if (revision == 2) {
+ /*
+ * revision 2 hardware has optimization to support 512bit data
+ * width partial reconfiguration with AVX512 instructions. So
+ * pr_datawidth is 64 (Byte). As revision 2 hardware is only
+ * used in integrated solution, CPU supports AVX512 instructions
+ * for sure, but it still needs to check here as AVX512 could be
+ * disabled in kernel (e.g. using clearcpuid boot option).
+ */
+ if (is_cpu_avx512_enabled())
+ return 64;
+
+ dev_err(dev, "revision 2: AVX512 is disabled\n");
+ return 0;
+ }
+
+ dev_err(dev, "revision %d is not supported yet\n", revision);
+ return 0;
+}
+
static int fme_mgr_probe(struct platform_device *pdev)
{
struct dfl_fme_mgr_pdata *pdata = dev_get_platdata(&pdev->dev);
@@ -302,6 +376,10 @@ static int fme_mgr_probe(struct platform_device *pdev)
return PTR_ERR(priv->ioaddr);
}
+ priv->pr_datawidth = fme_mgr_get_pr_datawidth(dev, priv->ioaddr);
+ if (!priv->pr_datawidth)
+ return -ENODEV;
+
compat_id = devm_kzalloc(dev, sizeof(*compat_id), GFP_KERNEL);
if (!compat_id)
return -ENOMEM;
diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
index 3c71dc3..cd94ba8 100644
--- a/drivers/fpga/dfl-fme-pr.c
+++ b/drivers/fpga/dfl-fme-pr.c
@@ -83,7 +83,7 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
if (copy_from_user(&port_pr, argp, minsz))
return -EFAULT;
- if (port_pr.argsz < minsz || port_pr.flags)
+ if (port_pr.argsz < minsz || port_pr.flags || !port_pr.buffer_size)
return -EINVAL;
/* get fme header region */
@@ -101,15 +101,25 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
port_pr.buffer_size))
return -EFAULT;
+ mutex_lock(&pdata->lock);
+ fme = dfl_fpga_pdata_get_private(pdata);
+ /* fme device has been unregistered. */
+ if (!fme) {
+ ret = -EINVAL;
+ goto unlock_exit;
+ }
+
/*
* align PR buffer per PR bandwidth, as HW ignores the extra padding
* data automatically.
*/
- length = ALIGN(port_pr.buffer_size, 4);
+ length = ALIGN(port_pr.buffer_size, fme->pr_datawidth);
buf = vmalloc(length);
- if (!buf)
- return -ENOMEM;
+ if (!buf) {
+ ret = -ENOMEM;
+ goto unlock_exit;
+ }
if (copy_from_user(buf,
(void __user *)(unsigned long)port_pr.buffer_address,
@@ -127,18 +137,10 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
info->flags |= FPGA_MGR_PARTIAL_RECONFIG;
- mutex_lock(&pdata->lock);
- fme = dfl_fpga_pdata_get_private(pdata);
- /* fme device has been unregistered. */
- if (!fme) {
- ret = -EINVAL;
- goto unlock_exit;
- }
-
region = dfl_fme_region_find(fme, port_pr.port_id);
if (!region) {
ret = -EINVAL;
- goto unlock_exit;
+ goto free_exit;
}
fpga_image_info_free(region->info);
@@ -159,10 +161,10 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
fpga_bridges_put(&region->bridge_list);
put_device(&region->dev);
-unlock_exit:
- mutex_unlock(&pdata->lock);
free_exit:
vfree(buf);
+unlock_exit:
+ mutex_unlock(&pdata->lock);
return ret;
}
@@ -388,6 +390,17 @@ static int pr_mgmt_init(struct platform_device *pdev,
mutex_lock(&pdata->lock);
priv = dfl_fpga_pdata_get_private(pdata);
+ /*
+ * Initialize PR data width.
+ * Only revision 2 supports 512bit datawidth for better performance,
+ * other revisions use default 32bit datawidth. This is used for
+ * buffer alignment.
+ */
+ if (dfl_feature_revision(feature->ioaddr) == 2)
+ priv->pr_datawidth = 64;
+ else
+ priv->pr_datawidth = 4;
+
/* Initialize the region and bridge sub device list */
INIT_LIST_HEAD(&priv->region_list);
INIT_LIST_HEAD(&priv->bridge_list);
diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
index 5394a21..de20755 100644
--- a/drivers/fpga/dfl-fme.h
+++ b/drivers/fpga/dfl-fme.h
@@ -21,12 +21,14 @@
/**
* struct dfl_fme - dfl fme private data
*
+ * @pr_datawidth: data width for partial reconfiguration.
* @mgr: FME's FPGA manager platform device.
* @region_list: linked list of FME's FPGA regions.
* @bridge_list: linked list of FME's FPGA bridges.
* @pdata: fme platform device's pdata.
*/
struct dfl_fme {
+ int pr_datawidth;
struct platform_device *mgr;
struct list_head region_list;
struct list_head bridge_list;
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index a8b869e..8851c6c 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -331,6 +331,11 @@ static inline bool dfl_feature_is_port(void __iomem *base)
(FIELD_GET(DFH_ID, v) == DFH_ID_FIU_PORT);
}
+static inline u8 dfl_feature_revision(void __iomem *base)
+{
+ return (u8)FIELD_GET(DFH_REVISION, readq(base + DFH));
+}
+
/**
* struct dfl_fpga_enum_info - DFL FPGA enumeration information
*
--
1.8.3.1
^ permalink raw reply related
* [PATCH v2 00/11] FPGA DFL updates
From: Wu Hao @ 2019-07-05 0:23 UTC (permalink / raw)
To: gregkh, mdf, linux-fpga; +Cc: linux-kernel, linux-api, atull, Wu Hao
Hi Greg / Moritz
This is the v2 patchset, which adds more features to FPGA DFL. It is
based on patch [1] and the char-misc-next tree. The documentation patch
for DFL has been dropped from this patchset and will be resubmitted
later to avoid conflicts.
Main changes from v1:
- remove DRV/MODULE_VERSION modifications. (patch #1, #3, #4, #6)
- remove argsz from new ioctls. (patch #2)
- replace sysfs_create/remove_* with device_add/remove_* for sysfs entries.
(patch #5, #8, #11)
[1] [PATCH] fpga: dfl: use driver core functions, not sysfs ones.
https://lkml.org/lkml/2019/7/4/36
Wu Hao (11):
fpga: dfl: fme: support 512bit data width PR
fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support.
fpga: dfl: pci: enable SRIOV support.
fpga: dfl: afu: add AFU state related sysfs interfaces
fpga: dfl: afu: add userclock sysfs interfaces.
fpga: dfl: add id_table for dfl private feature driver
fpga: dfl: afu: export __port_enable/disable function.
fpga: dfl: afu: add error reporting support.
fpga: dfl: afu: add STP (SignalTap) support
fpga: dfl: fme: add capability sysfs interfaces
fpga: dfl: fme: add global error reporting support
Documentation/ABI/testing/sysfs-platform-dfl-fme | 98 ++++++
Documentation/ABI/testing/sysfs-platform-dfl-port | 104 ++++++
drivers/fpga/Makefile | 3 +-
drivers/fpga/dfl-afu-error.c | 225 +++++++++++++
drivers/fpga/dfl-afu-main.c | 328 +++++++++++++++++-
drivers/fpga/dfl-afu.h | 7 +
drivers/fpga/dfl-fme-error.c | 385 ++++++++++++++++++++++
drivers/fpga/dfl-fme-main.c | 93 +++++-
drivers/fpga/dfl-fme-mgr.c | 110 ++++++-
drivers/fpga/dfl-fme-pr.c | 50 ++-
drivers/fpga/dfl-fme.h | 7 +-
drivers/fpga/dfl-pci.c | 39 +++
drivers/fpga/dfl.c | 166 +++++++++-
drivers/fpga/dfl.h | 54 ++-
include/uapi/linux/fpga-dfl.h | 19 ++
15 files changed, 1617 insertions(+), 71 deletions(-)
create mode 100644 drivers/fpga/dfl-afu-error.c
create mode 100644 drivers/fpga/dfl-fme-error.c
--
1.8.3.1
^ permalink raw reply
* Re: [RFC PATCH] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Pavel Machek @ 2019-07-04 19:50 UTC (permalink / raw)
To: Jann Horn
Cc: Yu-cheng Yu, the arch/x86 maintainers, H. Peter Anvin,
Thomas Gleixner, Ingo Molnar, kernel list, linux-doc, Linux-MM,
linux-arch, Linux API, Arnd Bergmann, Andy Lutomirski,
Balbir Singh, Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
Eugene Syromiatnikov, Florian Weimer, H.J. Lu, Jonathan Corbet,
Kees Cook
In-Reply-To: <CAG48ez0rHHfcRgiVZf5FP0YOzxsXigvpg6ci790cmiN6PBwkhQ@mail.gmail.com>
Hi!
> > +static int scan(u8 *buf, u32 buf_size, int item_size, test_item_fn test_item,
> > + next_item_fn next_item, u32 *arg, u32 type, u32 *pos)
> > +{
> > + int found = 0;
> > + u8 *p, *max;
> > +
> > + max = buf + buf_size;
> > + if (max < buf)
> > + return 0;
>
> How can this ever legitimately happen? If it can't, perhaps you meant
> to put a WARN_ON_ONCE() or something like that here?
> Also, computing out-of-bounds pointers is UB (section 6.5.6 of C99:
> "If both the pointer operand and the result point to elements of the
> same array object, or one past the last element of the array object,
> the evaluation shall not produce an overflow; otherwise, the behavior
> is undefined."), and if the addition makes the pointer wrap, that's
> certainly out of bounds; so I don't think this condition can trigger
> without UB.
Kernel assumes sane compiler. We pass flags to get it... C99 does not
quite apply here.
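One way to avoid relying on pointer-wraparound behaviour at all is to
validate offsets and lengths as unsigned integers before forming any
pointer. The helper below is only a sketch of that idea; the name
item_in_bounds() and its parameters are hypothetical and do not appear
in the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the byte range [offset, offset + item_size) lies
 * within a buffer of buf_size bytes. Only unsigned integers are
 * compared, and the subtraction is guarded by the first test, so
 * nothing here can wrap around.
 */
static bool item_in_bounds(uint32_t buf_size, uint32_t offset,
			   uint32_t item_size)
{
	return offset <= buf_size && item_size <= buf_size - offset;
}
```

With a check of this shape, buf + offset is only computed after the
range is known to be inside the buffer.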
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply
* Re: [PATCH 01/13] vfs: verify param type in vfs_parse_sb_flag()
From: David Howells @ 2019-07-04 16:19 UTC (permalink / raw)
To: Miklos Szeredi
Cc: dhowells, Al Viro, Miklos Szeredi, Ian Kent, Linux API,
linux-fsdevel, linux-kernel
In-Reply-To: <CAJfpegv_ezsXOLV2f7yd07=T3MenJoMKhu=MBac1-80s0BFg9A@mail.gmail.com>
Miklos Szeredi <miklos@szeredi.hu> wrote:
> Ping? Have you had a chance of looking at this series?
Yeah, though due to time pressure, I haven't managed to do much with it.
I don't agree with all your changes, and also I'd like them to wait till after
the branch of mount API filesystem conversions that I've given to Al has had a
chance to hopefully go in in this merge window, along with whatever changes Al
has made to it.
Bocsánat,
David
^ permalink raw reply
* Re: [PATCH 6/9] Add a general, global device notification watch list [ver #5]
From: David Howells @ 2019-07-04 16:04 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: dhowells, viro, Casey Schaufler, Stephen Smalley, nicolas.dichtel,
raven, Christian Brauner, keyrings, linux-usb,
linux-security-module, linux-fsdevel, linux-api, linux-block,
linux-kernel
In-Reply-To: <20190703190846.GA15663@kroah.com>
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> Don't we need a manpage and a kselftest for it?
I've got part of a manpage, but it needs more work.
How do you do a kselftest for this when it does nothing unless hardware events
happen?
> > + u64 id = 0; /* Might want to allow dev# here. */
>
> I don't understand the comment here, what does "dev#" refer to?
This is really for mount subtree watches, so I'm removing it for now.
The reason it's there is because a mount object may have multiple watches, but
each watch is set on a dentry within that mount, and it doesn't have to be the
same dentry each time. The queue is shared between all the dentries, and the
ID is used (a) to label them so that they can be manually removed, (b) to
match them to each dentry when the notification is being propagated rootwards
along the tree and (c) to avoid adding another field to struct dentry.
David
^ permalink raw reply
* Re: [PATCH v6 10/17] fs-verity: implement FS_IOC_ENABLE_VERITY ioctl
From: Eric Biggers @ 2019-07-03 20:14 UTC (permalink / raw)
To: linux-fscrypt
Cc: Theodore Y . Ts'o, Darrick J . Wong, linux-api, Dave Chinner,
linux-f2fs-devel, linux-fsdevel, Jaegeuk Kim, linux-integrity,
linux-ext4, Linus Torvalds, Christoph Hellwig, Victor Hsieh
In-Reply-To: <20190701153237.1777-11-ebiggers@kernel.org>
On Mon, Jul 01, 2019 at 08:32:30AM -0700, Eric Biggers wrote:
> + err = mnt_want_write_file(filp);
> + if (err) /* -EROFS */
> + return err;
> +
> + err = deny_write_access(filp);
> + if (err) /* -ETXTBSY */
> + goto out_drop_write;
> +
> + inode_lock(inode);
> +
> + if (IS_VERITY(inode)) {
> + err = -EEXIST;
> + goto out_unlock;
> + }
> +
> + err = enable_verity(filp, &arg);
> + if (err)
> + goto out_unlock;
> +
> + /*
> + * Some pages of the file may have been evicted from pagecache after
> + * being used in the Merkle tree construction, then read into pagecache
> + * again by another process reading from the file concurrently. Since
> + * these pages didn't undergo verification against the file measurement
> + * which fs-verity now claims to be enforcing, we have to wipe the
> + * pagecache to ensure that all future reads are verified.
> + */
> + filemap_write_and_wait(inode->i_mapping);
> + invalidate_inode_pages2(inode->i_mapping);
> +
> + /*
> + * allow_write_access() is needed to pair with deny_write_access().
> + * Regardless, the filesystem won't allow writing to verity files.
> + */
> +out_unlock:
> + inode_unlock(inode);
> + allow_write_access(filp);
> +out_drop_write:
> + mnt_drop_write_file(filp);
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(fsverity_ioctl_enable);
FYI, I've been thinking about the use of inode_lock() here. I don't think it's
good to hold it during the whole Merkle tree construction, since it means that
all syscalls that take the inode lock (e.g. chown(), chmod(), utimes()) will
block uninterruptibly. E.g. 'touch file' hangs in the following:
dd bs=1 count=0 seek=$((1<<40)) of=file
fsverity enable file &
touch file
It will proceed if you kill 'fsverity enable', but it's not ideal.
But AFAICS, it's safe to not hold the inode lock as long as we (a) keep using
deny_write_access() so that writes and truncates are not allowed (this is also
how the kernel handles files being executed), and (b) still take the inode lock
temporarily when beginning and ending enabling verity and enforce that only one
thread can build the Merkle tree at a time, and any other threads get EBUSY.
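To make the scheme in (a) and (b) concrete, here is a small userspace
model of it (a sketch under assumptions, not the kernel code): a pthread
mutex stands in for the inode lock, and two booleans stand in for
IS_VERITY() and the filesystem's "verity in progress" state. All names
in it are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

struct model_inode {
	pthread_mutex_t lock;      /* stands in for the inode lock */
	bool verity_enabled;       /* stands in for IS_VERITY() */
	bool verity_in_progress;   /* stands in for the fs in-progress flag */
};

/* Take the lock only briefly to claim the "builder" role. */
static int model_begin_enable(struct model_inode *inode)
{
	int err = 0;

	pthread_mutex_lock(&inode->lock);
	if (inode->verity_enabled)
		err = -EEXIST;
	else if (inode->verity_in_progress)
		err = -EBUSY;
	else
		inode->verity_in_progress = true;
	pthread_mutex_unlock(&inode->lock);
	return err;
}

/* The (long) tree build would run here, without the lock held. */

/* Take the lock briefly again to commit or roll back. */
static void model_end_enable(struct model_inode *inode, bool success)
{
	pthread_mutex_lock(&inode->lock);
	inode->verity_in_progress = false;
	if (success)
		inode->verity_enabled = true;
	pthread_mutex_unlock(&inode->lock);
}
```

In this model a second caller gets -EBUSY while a build is in flight,
and -EEXIST once a build has completed successfully; a rolled-back
build leaves the file available for another attempt.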
Does anyone have any objection to doing it that way instead? I.e. basically the
following incremental patch:
diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst
index 3a7a44ba7bb771..395f299ce25ea5 100644
--- a/Documentation/filesystems/fsverity.rst
+++ b/Documentation/filesystems/fsverity.rst
@@ -147,6 +147,7 @@ FS_IOC_ENABLE_VERITY can fail with the following errors:
- ``EACCES``: the process does not have write access to the file
- ``EBADMSG``: the signature is malformed
+- ``EBUSY``: this ioctl is already running on the file
- ``EEXIST``: the file already has verity enabled
- ``EFAULT``: the caller provided inaccessible memory
- ``EINTR``: the operation was interrupted by a fatal signal
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index dd0d1093e362cb..bb0a3b8e6ea71e 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
handle_t *handle;
int err;
+ if (ext4_verity_in_progress(inode))
+ return -EBUSY;
+
/*
* Since the file was opened readonly, we have to initialize the jbd
* inode and quotas here and not rely on ->open() doing it. This must
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index 91184cecbade1c..2a33c765a56860 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -123,6 +123,9 @@ static int f2fs_begin_enable_verity(struct file *filp)
struct inode *inode = file_inode(filp);
int err;
+ if (f2fs_verity_in_progress(inode))
+ return -EBUSY;
+
if (f2fs_is_atomic_file(inode) || f2fs_is_volatile_file(inode))
return -EOPNOTSUPP;
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index e9dca76fe5104f..a8430283a52a44 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -187,8 +187,6 @@ static int enable_verity(struct file *filp,
desc->data_size = cpu_to_le64(inode->i_size);
- pr_debug("Building Merkle tree...\n");
-
/* Prepare the Merkle tree parameters */
err = fsverity_init_merkle_tree_params(&params, inode,
arg->hash_algorithm,
@@ -197,12 +195,29 @@ static int enable_verity(struct file *filp,
if (err)
goto out;
- /* Tell the filesystem that verity is being enabled on the file */
- err = vops->begin_enable_verity(filp);
+ /*
+ * Start enabling verity on this file, serialized by the inode lock.
+ * Fail if verity is already enabled or is already being enabled.
+ */
+ inode_lock(inode);
+ if (IS_VERITY(inode))
+ err = -EEXIST;
+ else
+ err = vops->begin_enable_verity(filp);
+ inode_unlock(inode);
if (err)
goto out;
- /* Build the Merkle tree */
+ /*
+ * Build the Merkle tree. Don't hold the inode lock during this, since
+ * on huge files it may take a very long time and we don't want to force
+ * unrelated syscalls like chown() to block forever. We don't need the
+ * inode lock because deny_write_access() already prevents the file from
+ * being written to or truncated, and we still serialize
+ * ->begin_enable_verity() and ->end_enable_verity() with the inode lock
+ * and only allow one process to be here at a time.
+ */
+ pr_debug("Building Merkle tree...\n");
BUILD_BUG_ON(sizeof(desc->root_hash) < FS_VERITY_MAX_DIGEST_SIZE);
err = build_merkle_tree(inode, &params, desc->root_hash);
if (err) {
@@ -229,8 +244,13 @@ static int enable_verity(struct file *filp,
pr_debug("Storing a %u-byte PKCS#7 signature alongside the file\n",
arg->sig_size);
- /* Tell the filesystem to finish enabling verity on the file */
+ /*
+ * Tell the filesystem to finish enabling verity on the file. The
+ * inode_lock() serializes this with ->begin_enable_verity().
+ */
+ inode_lock(inode);
err = vops->end_enable_verity(filp, desc, desc_size, params.tree_size);
+ inode_unlock(inode);
if (err) {
fsverity_err(inode, "%ps() failed with err %d",
vops->end_enable_verity, err);
@@ -254,7 +274,9 @@ static int enable_verity(struct file *filp,
return err;
rollback:
+ inode_lock(inode);
(void)vops->end_enable_verity(filp, NULL, 0, params.tree_size);
+ inode_unlock(inode);
goto out;
}
@@ -319,16 +341,9 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg)
if (err) /* -ETXTBSY */
goto out_drop_write;
- inode_lock(inode);
-
- if (IS_VERITY(inode)) {
- err = -EEXIST;
- goto out_unlock;
- }
-
err = enable_verity(filp, &arg);
if (err)
- goto out_unlock;
+ goto out_allow_write_access;
/*
* Some pages of the file may have been evicted from pagecache after
@@ -345,8 +360,7 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg)
* allow_write_access() is needed to pair with deny_write_access().
* Regardless, the filesystem won't allow writing to verity files.
*/
-out_unlock:
- inode_unlock(inode);
+out_allow_write_access:
allow_write_access(filp);
out_drop_write:
mnt_drop_write_file(filp);
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 9ebb97c174c7c4..e31a6b974ab0ef 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -23,7 +23,8 @@ struct fsverity_operations {
* @filp: a readonly file descriptor for the file
*
* The filesystem must do any needed filesystem-specific preparations
- * for enabling verity, e.g. evicting inline data.
+ * for enabling verity, e.g. evicting inline data. It also must return
+ * -EBUSY if verity is already being enabled on the given file.
*
* i_rwsem is held for write.
*
@@ -46,7 +47,8 @@ struct fsverity_operations {
* inode, e.g. setting a bit in the on-disk inode. The filesystem is
* also responsible for setting the S_VERITY flag in the VFS inode.
*
- * i_rwsem is held for write.
+ * i_rwsem is held for write, but it may have been dropped between the
+ * calls to ->begin_enable_verity() and ->end_enable_verity().
*
* Return: 0 on success, -errno on failure
*/
@@ -96,7 +98,7 @@ struct fsverity_operations {
* @log_blocksize: log base 2 of the Merkle tree block size
*
* This is only called between ->begin_enable_verity() and
- * ->end_enable_verity(). i_rwsem is held for write.
+ * ->end_enable_verity().
*
* Return: 0 on success, -errno on failure
*/
^ permalink raw reply related
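The locking change in the fs/verity/enable.c hunks above can be pictured as a "claim, work unlocked, commit" pattern. The following is only a userspace sketch under stated assumptions: `struct demo_inode`, `demo_begin_enable()` and `demo_end_enable()` are hypothetical names, a pthread mutex stands in for the inode lock, and the two boolean fields stand in for IS_VERITY() and f2fs_verity_in_progress(). In the patch itself the -EEXIST check lives in enable_verity() and the -EBUSY check in the filesystem's ->begin_enable_verity(); the sketch folds both into one function for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the inode state the patch cares about. */
struct demo_inode {
    pthread_mutex_t lock;   /* plays the role of the inode lock (i_rwsem) */
    bool verity;            /* plays the role of IS_VERITY(inode) */
    bool in_progress;       /* plays the role of "verity is being enabled" */
};

/* Short critical section: claim the file, or report why we cannot. */
static int demo_begin_enable(struct demo_inode *inode)
{
    int err = 0;

    pthread_mutex_lock(&inode->lock);
    if (inode->verity)
        err = -EEXIST;              /* verity already enabled */
    else if (inode->in_progress)
        err = -EBUSY;               /* another caller is enabling it */
    else
        inode->in_progress = true;  /* we own the enable operation now */
    pthread_mutex_unlock(&inode->lock);
    return err;
}

/*
 * Short critical section: commit (success) or roll back (failure).
 * The long-running tree build is expected to happen between
 * demo_begin_enable() and this call, with the lock NOT held.
 */
static void demo_end_enable(struct demo_inode *inode, bool success)
{
    pthread_mutex_lock(&inode->lock);
    assert(inode->in_progress);     /* only valid after a successful begin */
    inode->in_progress = false;
    if (success)
        inode->verity = true;
    pthread_mutex_unlock(&inode->lock);
}
```

The point of the design is that the lock only protects the two brief state transitions, so unrelated operations that need the same lock are not blocked for the duration of the build.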
* Re: [PATCH 6/9] Add a general, global device notification watch list [ver #5]
From: Greg Kroah-Hartman @ 2019-07-03 19:08 UTC (permalink / raw)
To: David Howells
Cc: viro, Casey Schaufler, Stephen Smalley, nicolas.dichtel, raven,
Christian Brauner, keyrings, linux-usb, linux-security-module,
linux-fsdevel, linux-api, linux-block, linux-kernel
In-Reply-To: <156173697086.15137.9549379251509621554.stgit@warthog.procyon.org.uk>
On Fri, Jun 28, 2019 at 04:49:30PM +0100, David Howells wrote:
> Create a general, global watch list that can be used for the posting of
> device notification events, for such things as device attachment,
> detachment and errors on sources such as block devices and USB devices.
> This can be enabled with:
>
> CONFIG_DEVICE_NOTIFICATIONS
>
> To add a watch on this list, an event queue must be created and configured:
>
> fd = open("/dev/event_queue", O_RDWR);
> ioctl(fd, IOC_WATCH_QUEUE_SET_SIZE, page_size << n);
>
> and then a watch can be placed upon it using a system call:
>
> watch_devices(fd, 12, 0);
>
> Unless the application wants to receive all events, it should employ
> appropriate filters.
Ok, as discussed off-list, this is needed by the other patches
afterward, i.e. the USB and block ones, which makes more sense.
Some tiny nits:
> diff --git a/drivers/base/watch.c b/drivers/base/watch.c
> new file mode 100644
> index 000000000000..00336607dc73
> --- /dev/null
> +++ b/drivers/base/watch.c
> @@ -0,0 +1,90 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Event notifications.
> + *
> + * Copyright (C) 2019 Red Hat, Inc. All Rights Reserved.
> + * Written by David Howells (dhowells@redhat.com)
> + */
> +
> +#include <linux/watch_queue.h>
> +#include <linux/syscalls.h>
> +#include <linux/init_task.h>
> +#include <linux/security.h>
You forgot to include device.h which has the prototype for your global
function :)
> +
> +/*
> + * Global queue for watching for device layer events.
> + */
> +static struct watch_list device_watchers = {
> + .watchers = HLIST_HEAD_INIT,
> + .lock = __SPIN_LOCK_UNLOCKED(&device_watchers.lock),
> +};
> +
> +static DEFINE_SPINLOCK(device_watchers_lock);
> +
> +/**
> + * post_device_notification - Post notification of a device event
> + * @n - The notification to post
> + * @id - The device ID
> + *
> + * Note that there's only a global queue to which all events are posted. Might
> + * want to provide per-dev queues also.
> + */
> +void post_device_notification(struct watch_notification *n, u64 id)
> +{
> + post_watch_notification(&device_watchers, n, &init_cred, id);
> +}
Don't you need to export this symbol?
> +
> +/**
> + * sys_watch_devices - Watch for device events.
> + * @watch_fd: The watch queue to send notifications to.
> + * @watch_id: The watch ID to be placed in the notification (-1 to remove watch)
> + * @flags: Flags (reserved for future)
> + */
> +SYSCALL_DEFINE3(watch_devices, int, watch_fd, int, watch_id, unsigned int, flags)
Finally, the driver core gets a syscall! :)
Don't we need a manpage and a kselftest for it?
> +{
> + struct watch_queue *wqueue;
> + struct watch_list *wlist = &device_watchers;
No real need for wlist, right? You just set it to this value and then
it never changes?
> + struct watch *watch;
> + long ret = -ENOMEM;
> + u64 id = 0; /* Might want to allow dev# here. */
I don't understand the comment here, what does "dev#" refer to?
> +
> + if (watch_id < -1 || watch_id > 0xff || flags)
> + return -EINVAL;
> +
> + wqueue = get_watch_queue(watch_fd);
> + if (IS_ERR(wqueue)) {
> + ret = PTR_ERR(wqueue);
> + goto err;
> + }
> +
> + if (watch_id >= 0) {
> + watch = kzalloc(sizeof(*watch), GFP_KERNEL);
> + if (!watch)
> + goto err_wqueue;
> +
> + init_watch(watch, wqueue);
> + watch->id = id;
> + watch->info_id = (u32)watch_id << WATCH_INFO_ID__SHIFT;
> +
> + ret = security_watch_devices(watch);
> + if (ret < 0)
> + goto err_watch;
> +
> + spin_lock(&device_watchers_lock);
> + ret = add_watch_to_object(watch, wlist);
> + spin_unlock(&device_watchers_lock);
> + if (ret == 0)
> + watch = NULL;
> + } else {
> + spin_lock(&device_watchers_lock);
> + ret = remove_watch_from_object(wlist, wqueue, id, false);
> + spin_unlock(&device_watchers_lock);
> + }
> +
> +err_watch:
> + kfree(watch);
> +err_wqueue:
> + put_watch_queue(wqueue);
> +err:
> + return ret;
> +}
> diff --git a/include/linux/device.h b/include/linux/device.h
> index e85264fb6616..c947c078b1be 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -26,6 +26,7 @@
> #include <linux/uidgid.h>
> #include <linux/gfp.h>
> #include <linux/overflow.h>
> +#include <linux/watch_queue.h>
No need for this, just do:
struct watch_notification;
so that things build.
thanks,
greg k-h
^ permalink raw reply
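The argument handling in the quoted sys_watch_devices() can be isolated into a small standalone sketch: a watch ID of -1 requests removal, 0 to 0xff adds a watch, and any flags are rejected. Everything named `demo_*` / `DEMO_*` below is hypothetical, and in particular the shift value is an assumption, since the real WATCH_INFO_ID__SHIFT is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* ASSUMPTION: placeholder value; the real WATCH_INFO_ID__SHIFT is not
 * given anywhere in this thread. */
#define DEMO_WATCH_INFO_ID__SHIFT 24

enum demo_action { DEMO_ADD_WATCH, DEMO_REMOVE_WATCH };

/*
 * Mirror of the checks in the quoted syscall body:
 *   - watch_id must be in [-1, 0xff] and flags must be zero;
 *   - watch_id >= 0 adds a watch and encodes the ID into the info word;
 *   - watch_id == -1 removes the watch.
 */
static int demo_check_watch_args(int watch_id, unsigned int flags,
                                 enum demo_action *action, uint32_t *info_id)
{
    assert(action && info_id);

    if (watch_id < -1 || watch_id > 0xff || flags)
        return -EINVAL;

    if (watch_id >= 0) {
        *action = DEMO_ADD_WATCH;
        *info_id = (uint32_t)watch_id << DEMO_WATCH_INFO_ID__SHIFT;
    } else {
        *action = DEMO_REMOVE_WATCH;
        *info_id = 0;
    }
    return 0;
}
```

With these assumptions, the `watch_devices(fd, 12, 0)` call from the patch description would take the "add" branch with ID 12 shifted into the upper bits of the info word.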
* Re: [PATCH 6/9] Add a general, global device notification watch list [ver #5]
From: Greg Kroah-Hartman @ 2019-07-03 17:16 UTC (permalink / raw)
To: David Howells
Cc: viro, Casey Schaufler, Stephen Smalley, nicolas.dichtel, raven,
Christian Brauner, keyrings, linux-usb, linux-security-module,
linux-fsdevel, linux-api, linux-block, linux-kernel
In-Reply-To: <156173697086.15137.9549379251509621554.stgit@warthog.procyon.org.uk>
On Fri, Jun 28, 2019 at 04:49:30PM +0100, David Howells wrote:
> Create a general, global watch list that can be used for the posting of
> device notification events, for such things as device attachment,
> detachment and errors on sources such as block devices and USB devices.
> This can be enabled with:
>
> CONFIG_DEVICE_NOTIFICATIONS
>
> To add a watch on this list, an event queue must be created and configured:
>
> fd = open("/dev/event_queue", O_RDWR);
> ioctl(fd, IOC_WATCH_QUEUE_SET_SIZE, page_size << n);
>
> and then a watch can be placed upon it using a system call:
>
> watch_devices(fd, 12, 0);
>
> Unless the application wants to receive all events, it should employ
> appropriate filters.
What "filter"? Who is going to use this and why a new system call for
this? You can do this today with udev/netlink/hotplug/whatever so why
create yet-another-way?
I don't think this is a good idea unless we really nail down the api and
who is going to be using it.
thanks,
greg k-h
^ permalink raw reply
* Re: [PATCH 4/9] General notification queue with user mmap()'able ring buffer [ver #5]
From: Greg Kroah-Hartman @ 2019-07-03 17:11 UTC (permalink / raw)
To: David Howells
Cc: viro, Casey Schaufler, Stephen Smalley, nicolas.dichtel, raven,
Christian Brauner, keyrings, linux-usb, linux-security-module,
linux-fsdevel, linux-api, linux-block, linux-kernel
In-Reply-To: <156173695061.15137.17196611619288074120.stgit@warthog.procyon.org.uk>
On Fri, Jun 28, 2019 at 04:49:10PM +0100, David Howells wrote:
> Implement a misc device that implements a general notification queue as a
> ring buffer that can be mmap()'d from userspace.
>
> The way this is done is:
>
> (1) An application opens the device and indicates the size of the ring
> buffer that it wants to reserve in pages (this can only be set once):
>
> fd = open("/dev/watch_queue", O_RDWR);
> ioctl(fd, IOC_WATCH_QUEUE_NR_PAGES, nr_of_pages);
>
> (2) The application should then map the pages that the device has
> reserved. Each instance of the device created by open() allocates
> separate pages so that maps of different fds don't interfere with one
> another. Multiple mmap() calls on the same fd, however, will all work
> together.
>
> page_size = sysconf(_SC_PAGESIZE);
> mapping_size = nr_of_pages * page_size;
> char *buf = mmap(NULL, mapping_size, PROT_READ|PROT_WRITE,
> MAP_SHARED, fd, 0);
>
> The ring is divided into 8-byte slots. Entries written into the ring are
> variable size and can use between 1 and 63 slots. A special entry is
> maintained in the first two slots of the ring that contains the head and
> tail pointers. This is skipped when the ring wraps round. Note that
> multislot entries, therefore, aren't allowed to be broken over the end of
> the ring, but instead "skip" entries are inserted to pad out the buffer.
>
> Each entry has a 1-slot header that describes it:
>
> struct watch_notification {
> __u32 type:24;
> __u32 subtype:8;
> __u32 info;
> };
>
> The type indicates the source (eg. mount tree changes, superblock events,
> keyring changes, block layer events) and the subtype indicates the event
> type (eg. mount, unmount; EIO, EDQUOT; link, unlink). The info field
> indicates a number of things, including the entry length, an ID assigned to
> a watchpoint contributing to this buffer, type-specific flags and meta
> flags, such as an overrun indicator.
>
> Supplementary data, such as the key ID that generated an event, are
> attached in additional slots.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
I don't know if I mentioned this before, but your naming seems a bit
"backwards" from other subsystems. Should "watch_queue" always be the
prefix, instead of a mix of prefix/suffix usage?
Anyway, your call, it's your code :)
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
^ permalink raw reply
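The rule described in the patch 4/9 commit message, that a multislot entry is never split over the end of the ring and a "skip" entry pads out the remainder instead, can be sketched roughly as follows. This is a simplified model and not the kernel's producer: the ring size, the `DEMO_SKIP` marker, and the use of the header slot to hold only a length are all assumptions made for illustration, and the sketch does not track the tail pointer or detect a full ring.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_SLOTS 16u  /* hypothetical ring size, in 8-byte slots */
#define DEMO_META_SLOTS  2u  /* first two slots hold the head/tail pointers */
#define DEMO_MAX_ENTRY  63u  /* entries use between 1 and 63 slots */
#define DEMO_SKIP UINT64_C(0x8000000000000000) /* hypothetical skip marker */

struct demo_ring {
    uint64_t slot[DEMO_RING_SLOTS];
    unsigned int head;       /* next slot the producer will write */
};

/*
 * Reserve len_slots contiguous slots and return the index of the entry's
 * header slot. If the entry would straddle the end of the ring, a skip
 * entry covering the remaining slots is written first and the producer
 * wraps to just past the metadata slots.
 */
static unsigned int demo_post(struct demo_ring *r, unsigned int len_slots)
{
    unsigned int room = DEMO_RING_SLOTS - r->head;
    unsigned int at;

    assert(len_slots >= 1 && len_slots <= DEMO_MAX_ENTRY);
    assert(len_slots <= DEMO_RING_SLOTS - DEMO_META_SLOTS);

    if (len_slots > room) {
        /* room is at most 14 here, so one skip entry is enough. */
        r->slot[r->head] = DEMO_SKIP | room;
        r->head = DEMO_META_SLOTS;
    }

    at = r->head;
    r->slot[at] = len_slots;         /* header slot records the length */
    r->head += len_slots;
    if (r->head == DEMO_RING_SLOTS)  /* exact fit: wrap past the metadata */
        r->head = DEMO_META_SLOTS;
    return at;
}
```

A consumer following this layout can treat a skip entry like any other record: read its length from the header and step over it.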
* Re: [PATCH 1/9] uapi: General notification ring definitions [ver #5]
From: Greg Kroah-Hartman @ 2019-07-03 17:08 UTC (permalink / raw)
To: David Howells
Cc: viro, Casey Schaufler, Stephen Smalley, nicolas.dichtel, raven,
Christian Brauner, keyrings, linux-usb, linux-security-module,
linux-fsdevel, linux-api, linux-block, linux-kernel
In-Reply-To: <156173691411.15137.2073887155273175167.stgit@warthog.procyon.org.uk>
On Fri, Jun 28, 2019 at 04:48:34PM +0100, David Howells wrote:
> Add UAPI definitions for the general notification ring, including the
> following pieces:
>
> (1) struct watch_notification.
>
> This is the metadata header for each entry in the ring. It includes a
> type and subtype that indicate the source of the message
> (eg. WATCH_TYPE_MOUNT_NOTIFY) and the kind of the message
> (eg. NOTIFY_MOUNT_NEW_MOUNT).
>
> The header also contains an information field that conveys the
> following information:
>
> - WATCH_INFO_LENGTH. The size of the entry (entries are variable
> length).
>
> - WATCH_INFO_ID. The watch ID specified when the watchpoint was
> set.
>
> - WATCH_INFO_TYPE_INFO. (Sub)type-specific information.
>
> - WATCH_INFO_FLAG_*. Flag bits overlain on the type-specific
> information. For use by the type.
>
> All the information in the header can be used in filtering messages at
> the point of writing into the buffer.
>
> (2) struct watch_queue_buffer.
>
> This describes the layout of the ring. Note that the first slots in
> the ring contain a special metadata entry that contains the ring
> pointers. The producer in the kernel knows to skip this and it has a
> proper header (WATCH_TYPE_META, WATCH_META_SKIP_NOTIFICATION) that
> indicates the size so that the ring consumer can handle it the same as
> any other record and just skip it.
>
> Note that this means that ring entries can never be split over the end
> of the ring, so if an entry would need to be split, a skip record is
> inserted to wrap the ring first; this is also WATCH_TYPE_META,
> WATCH_META_SKIP_NOTIFICATION.
>
> (3) WATCH_INFO_NOTIFICATIONS_LOST.
>
> This is a flag that can be set in the metadata header by the kernel to
> indicate that at least one message was lost since it was last cleared
> by userspace.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
^ permalink raw reply
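The statement that header information "can be used in filtering messages at the point of writing into the buffer" might look something like the check below. This is a sketch under assumptions: `struct demo_type_filter` is a hypothetical, simplified layout (one type plus a 256-bit subtype bitmap, chosen because the subtype is 8 bits wide), not the UAPI structure from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified filter entry: one notification type plus a
 * bitmap with one bit per possible 8-bit subtype. Not the real UAPI layout.
 */
struct demo_type_filter {
    uint32_t type;
    uint32_t subtype_filter[8];   /* 8 * 32 = 256 subtype bits */
};

/*
 * Return true if any of the nr filter entries accepts a notification
 * with the given type and subtype.
 */
static bool demo_filter_allows(const struct demo_type_filter *f,
                               unsigned int nr,
                               uint32_t type, uint8_t subtype)
{
    unsigned int i;

    assert(f || nr == 0);

    for (i = 0; i < nr; i++) {
        if (f[i].type != type)
            continue;
        if (f[i].subtype_filter[subtype / 32] &
            (UINT32_C(1) << (subtype % 32)))
            return true;
    }
    return false;
}
```

Under this layout, setting only the first bitmap word to all ones would accept subtypes 0 to 31 of the named type and nothing else.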
* Re: [PATCH 8/9] usb: Add USB subsystem notifications [ver #5]
From: Greg Kroah-Hartman @ 2019-07-03 17:07 UTC (permalink / raw)
To: David Howells
Cc: viro, Casey Schaufler, Stephen Smalley, nicolas.dichtel, raven,
Christian Brauner, keyrings, linux-usb, linux-security-module,
linux-fsdevel, linux-api, linux-block, linux-kernel
In-Reply-To: <156173698939.15137.11150923486478934112.stgit@warthog.procyon.org.uk>
On Fri, Jun 28, 2019 at 04:49:49PM +0100, David Howells wrote:
> Add a USB subsystem notification mechanism whereby notifications about
> hardware events such as device connection, disconnection, reset and I/O
> errors, can be reported to a monitoring process asynchronously.
>
> Firstly, an event queue needs to be created:
>
> fd = open("/dev/event_queue", O_RDWR);
> ioctl(fd, IOC_WATCH_QUEUE_SET_SIZE, page_size << n);
>
> then a notification can be set up to report USB notifications via that
> queue:
>
> struct watch_notification_filter filter = {
> .nr_filters = 1,
> .filters = {
> [0] = {
> .type = WATCH_TYPE_USB_NOTIFY,
> .subtype_filter[0] = UINT_MAX,
> },
> },
> };
> ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, &filter);
> watch_devices(fd, 12, 0);
>
> After that, records will be placed into the queue when events occur on a
> USB device or bus. Records are of the following format:
>
> struct usb_notification {
> struct watch_notification watch;
> __u32 error;
> __u32 reserved;
> __u8 name_len;
> __u8 name[0];
> } *n;
>
> Where:
>
> n->watch.type will be WATCH_TYPE_USB_NOTIFY
>
> n->watch.subtype will be the type of notification, such as
> NOTIFY_USB_DEVICE_ADD.
>
> n->watch.info & WATCH_INFO_LENGTH will indicate the length of the
> record.
>
> n->watch.info & WATCH_INFO_ID will be the second argument to
> watch_devices(), shifted.
>
> n->error and n->reserved are intended to convey information such as
> error codes, but are currently not used
>
> n->name_len and n->name convey the USB device name as an
> unterminated string. This may be truncated - it is currently
> limited to a maximum 63 chars.
>
> Note that it is permissible for event records to be of variable length -
> or, at least, the length may be dependent on the subtype.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> cc: linux-usb@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
^ permalink raw reply
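Because n->name is described as an unterminated string of n->name_len bytes, a consumer has to add its own terminator before treating it as a C string. A minimal sketch of that step, assuming a hypothetical helper name `demo_copy_usb_name()` and taking the 63-character limit from the commit message:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_USB_NAME_MAX 63  /* "limited to a maximum 63 chars" */

/*
 * Copy an unterminated device name of name_len bytes into dst and
 * NUL-terminate it. The copy is clamped both to the documented maximum
 * and to the space available in dst. Returns the number of name bytes
 * copied (excluding the terminator).
 */
static size_t demo_copy_usb_name(char *dst, size_t dst_size,
                                 const uint8_t *name, uint8_t name_len)
{
    size_t n = name_len;

    assert(dst && name && dst_size > 0);

    if (n > DEMO_USB_NAME_MAX)
        n = DEMO_USB_NAME_MAX;
    if (n > dst_size - 1)
        n = dst_size - 1;

    memcpy(dst, name, n);
    dst[n] = '\0';
    return n;
}
```

A caller would pass n->name and n->name_len from a received record, after first checking that the record length reported in the header actually covers name_len bytes.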
* Re: [PATCH 4/6] vfs: Allow mount information to be queried by fsinfo() [ver #15]
From: Ian Kent @ 2019-07-03 1:42 UTC (permalink / raw)
To: christian, David Howells, viro
Cc: mszeredi, linux-api, linux-fsdevel, linux-kernel
In-Reply-To: <2daf229272884deaf139be510f5842f0689c18a6.camel@themaw.net>
On Wed, 2019-07-03 at 09:24 +0800, Ian Kent wrote:
> On Wed, 2019-07-03 at 09:09 +0800, Ian Kent wrote:
> > Hi Christian,
> >
> > About the propagation attributes you mentioned ...
>
> Umm ... how did you work out if a mount is unbindable from proc
> mountinfo?
>
> I didn't notice anything that could be used for that when I was
> looking at this.
Oh wait, fs/proc_namespace.c:show_mountinfo() has:
if (IS_MNT_UNBINDABLE(r))
seq_puts(m, " unbindable");
I missed that, probably because I didn't have any unbindable mounts
at the time I was looking at it, oops!
That's missing and probably should be added too.
>
> > On Fri, 2019-06-28 at 16:47 +0100, David Howells wrote:
> >
> > snip ...
> >
> > > +
> > > +#ifdef CONFIG_FSINFO
> > > +int fsinfo_generic_mount_info(struct path *path, struct fsinfo_kparams
> > > *params)
> > > +{
> > > + struct fsinfo_mount_info *p = params->buffer;
> > > + struct super_block *sb;
> > > + struct mount *m;
> > > + struct path root;
> > > + unsigned int flags;
> > > +
> > > + if (!path->mnt)
> > > + return -ENODATA;
> > > +
> > > + m = real_mount(path->mnt);
> > > + sb = m->mnt.mnt_sb;
> > > +
> > > + p->f_sb_id = sb->s_unique_id;
> > > + p->mnt_id = m->mnt_id;
> > > + p->parent_id = m->mnt_parent->mnt_id;
> > > + p->change_counter = atomic_read(&m->mnt_change_counter);
> > > +
> > > + get_fs_root(current->fs, &root);
> > > + if (path->mnt == root.mnt) {
> > > + p->parent_id = p->mnt_id;
> > > + } else {
> > > + rcu_read_lock();
> > > + if (!are_paths_connected(&root, path))
> > > + p->parent_id = p->mnt_id;
> > > + rcu_read_unlock();
> > > + }
> > > + if (IS_MNT_SHARED(m))
> > > + p->group_id = m->mnt_group_id;
> > > + if (IS_MNT_SLAVE(m)) {
> > > + int master = m->mnt_master->mnt_group_id;
> > > + int dom = get_dominating_id(m, &root);
> > > + p->master_id = master;
> > > + if (dom && dom != master)
> > > + p->from_id = dom;
> >
> > This provides information about mount propagation (well mostly).
> >
> > My understanding of this was that:
> > "If a mount is propagation private (or slave) the group_id will
> > be zero otherwise it's propagation shared and its group id will
> > be non-zero.
> >
> > If a mount is propagation slave and propagation peers exist then
> > the mount field mnt_master will be non-NULL. Then mnt_master
> > (slave's master) can be used to set master_id. If the group id
> > of the propagation source is not that of the master then set
> > the from_id group as well."
> >
> > This parallels the way in which these values are reported in
> > the proc pseudo file system.
> >
> > Perhaps adding flags as well as setting the fields would be
> > useful too, since interpreting the meaning of the structure
> > fields isn't obvious, ;)
> >
> > David, Al, thoughts?
> >
> > Ian
^ permalink raw reply
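The propagation logic in the quoted fsinfo_generic_mount_info() fragment can be restated as a small pure function. The sketch below follows the quoted code rather than the prose summary; `struct demo_mount`, `struct demo_mount_info` and `demo_fill_propagation()` are hypothetical names, and the input fields stand in for IS_MNT_SHARED(), IS_MNT_SLAVE(), mnt_group_id, mnt_master->mnt_group_id and the result of get_dominating_id().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the mount state the quoted code reads. */
struct demo_mount {
    bool shared;          /* stands in for IS_MNT_SHARED(m) */
    bool slave;           /* stands in for IS_MNT_SLAVE(m) */
    int group_id;         /* m->mnt_group_id */
    int master_group_id;  /* m->mnt_master->mnt_group_id */
    int dominating_id;    /* get_dominating_id(m, &root), 0 if none */
};

/* Hypothetical subset of the reported fields. */
struct demo_mount_info {
    int group_id;
    int master_id;
    int from_id;
};

/*
 * group_id is set only for shared mounts; master_id only for slaves;
 * from_id only when the dominating peer group exists and differs from
 * the immediate master's group.
 */
static void demo_fill_propagation(const struct demo_mount *m,
                                  struct demo_mount_info *p)
{
    assert(m && p);

    p->group_id = 0;
    p->master_id = 0;
    p->from_id = 0;

    if (m->shared)
        p->group_id = m->group_id;
    if (m->slave) {
        p->master_id = m->master_group_id;
        if (m->dominating_id && m->dominating_id != m->master_group_id)
            p->from_id = m->dominating_id;
    }
}
```

Written this way, a zero in any of the three output fields means "not applicable", which is the interpretation burden the reply above suggests explicit flags could remove.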