[PATCH v2 00/15] cxl: add Type2 device support

* [PATCH v2 00/15] cxl: add Type2 device support
@ 2024-07-15 17:28 alejandro.lucero-palau
  2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
                   ` (14 more replies)
  0 siblings, 15 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alejandro.lucero-palau@amd.com>

This is the second version of the patchset adding CXL Type2 support, with
changes from the first RFC patchset.

I have removed the introduction about the concerns with BIOS/UEFI, since the
discussion confirmed the need for the functionality implemented here, at
least in some scenarios.

There are two main changes from the RFC:

1) Following concerns about drivers using the CXL core without restrictions, the
CXL struct they work with is now opaque to those drivers, so functions are
provided for reading or modifying it indirectly.

2) The driver using the added functionality is no longer a test driver but a
real one: the SFC ethernet network driver. It uses the CXL region mapped for PIO
buffers instead of regions inside PCIe BARs.

Current CXL kernel code is focused on supporting Type3 CXL devices, aka memory
expanders. Type2 CXL devices, aka device accelerators, share some functionality
but require some special handling.

First of all, Type2 devices are by definition tied to drivers that do more than
expose memory, so such a driver is expected to handle the CXL specifics. This
implies the CXL setup needs to be done by that driver instead of by a generic
CXL PCI driver, as is the case for memory expanders. Most of that setup needs to
use current CXL core code, which therefore needs to be accessible to those
vendor drivers. This is accomplished by exporting opaque CXL structs and by
adding and exporting functions for working with those structs indirectly.
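
As an illustration of the opaque-struct approach described above, here is a
minimal userspace sketch in plain C. It is not the kernel code from this
series: every demo_* name is hypothetical, the getter has no counterpart in
the patches shown here, and allocation uses calloc() where the kernel code
uses devm_kzalloc(). The point is only that a client holding the handle can
change state solely through the exported functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Public side: what a client driver would see (hypothetical names). */
struct demo_state;                          /* layout is hidden from clients */
typedef struct demo_state demo_accel_state;

demo_accel_state *demo_state_create(void);
void demo_set_serial(demo_accel_state *s, uint64_t serial);
uint64_t demo_get_serial(const demo_accel_state *s);
void demo_state_destroy(demo_accel_state *s);

/* Private side: only the "core" knows the struct layout. */
struct demo_state {
	uint64_t serial;
	uint16_t dvsec;
};

/* Allocate a zeroed state; returns NULL if the allocation fails. */
demo_accel_state *demo_state_create(void)
{
	return calloc(1, sizeof(struct demo_state));
}

/* Setter: the only way for a client to modify the serial number. */
void demo_set_serial(demo_accel_state *s, uint64_t serial)
{
	assert(s != NULL);
	s->serial = serial;
}

/* Getter (assumed for illustration; not part of the patches shown here). */
uint64_t demo_get_serial(const demo_accel_state *s)
{
	assert(s != NULL);
	return s->serial;
}

void demo_state_destroy(demo_accel_state *s)
{
	free(s);
}
```

In a real split, the struct definition would live in a private header so that
a client dereferencing the handle fails to compile, which is the restriction
the opaque typedef is meant to enforce.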

Some of the patches are based on a patchset sent by Dan Williams [1], which was
only partially integrated: the parts merged mostly prepared things for Type2,
but none added specific Type2 support. The patches based on Dan's work carry
Dan's Co-developed-by tag and a link to the original patch.

A final note about CXL.cache is needed. This patchset does not cover it at all,
although the emulated Type2 device advertises it. From the kernel point of view,
supporting CXL.cache will require making sure the CXL path supports what the
Type2 device needs. A device accelerator will likely be connected to a Root
Switch, but other configurations cannot be ruled out. Therefore the kernel will
need to check not just HPA, DPA, interleave and granularity, but also the
available CXL.cache support and resources in each switch in the CXL path to the
Type2 device. I expect to contribute to this support in the following months,
and it would be good to discuss it when possible.

[1] https://lore.kernel.org/linux-cxl/98b1f61a-e6c2-71d4-c368-50d958501b0c@intel.com/T/

Alejandro Lucero (15):
  cxl: add type2 device basic support
  cxl: add function for type2 cxl regs setup
  cxl: add function for type2 resource request
  cxl: add capabilities field to cxl_dev_state
  cxl: fix use of resource_contains
  cxl: add function for setting media ready by an accelerator
  cxl: support type2 memdev creation
  cxl: indicate probe deferral
  cxl: define a driver interface for HPA free space enumeration
  cxl: define a driver interface for DPA allocation
  cxl: make region type based on endpoint type
  cxl: allow region creation by type2 drivers
  cxl: preclude device memory to be used for dax
  cxl: add function for obtaining params from a region
  efx: support pio mapping based on cxl

 drivers/cxl/core/cdat.c               |   3 +
 drivers/cxl/core/core.h               |   1 +
 drivers/cxl/core/hdm.c                | 160 +++++++--
 drivers/cxl/core/mbox.c               |   1 +
 drivers/cxl/core/memdev.c             | 122 +++++++
 drivers/cxl/core/port.c               |   4 +-
 drivers/cxl/core/region.c             | 459 ++++++++++++++++++++++----
 drivers/cxl/core/regs.c               |  11 +-
 drivers/cxl/cxl.h                     |   9 +-
 drivers/cxl/cxlmem.h                  |  11 +
 drivers/cxl/mem.c                     |  24 +-
 drivers/cxl/pci.c                     |  39 ++-
 drivers/net/ethernet/sfc/Makefile     |   2 +-
 drivers/net/ethernet/sfc/ef10.c       |  25 +-
 drivers/net/ethernet/sfc/efx.c        |   6 +
 drivers/net/ethernet/sfc/efx_cxl.c    | 134 ++++++++
 drivers/net/ethernet/sfc/efx_cxl.h    |  30 ++
 drivers/net/ethernet/sfc/mcdi_pcol.h  |   3 +
 drivers/net/ethernet/sfc/net_driver.h |   4 +
 drivers/net/ethernet/sfc/nic.h        |   1 +
 include/linux/cxl_accel_mem.h         |  58 ++++
 include/linux/cxl_accel_pci.h         |  23 ++
 22 files changed, 1021 insertions(+), 109 deletions(-)
 create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
 create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
 create mode 100644 include/linux/cxl_accel_mem.h
 create mode 100644 include/linux/cxl_accel_pci.h

-- 
2.17.1


* [PATCH v2 01/15] cxl: add type2 device basic support
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-15 18:48   ` Andrew Lunn
                     ` (4 more replies)
  2024-07-15 17:28 ` [PATCH v2 02/15] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
                   ` (13 subsequent siblings)
  14 siblings, 5 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Differentiate Type3, aka memory expanders, from Type2, aka device
accelerators, with a new function for initializing cxl_dev_state.

Create an opaque struct to be used by accelerators, which rely on the new
access functions added in the following patches.

Add SFC ethernet network driver as the client.

Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c             | 52 ++++++++++++++++++++++++++
 drivers/net/ethernet/sfc/Makefile     |  2 +-
 drivers/net/ethernet/sfc/efx.c        |  4 ++
 drivers/net/ethernet/sfc/efx_cxl.c    | 53 +++++++++++++++++++++++++++
 drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++++++++
 drivers/net/ethernet/sfc/net_driver.h |  4 ++
 include/linux/cxl_accel_mem.h         | 22 +++++++++++
 include/linux/cxl_accel_pci.h         | 23 ++++++++++++
 8 files changed, 188 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
 create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
 create mode 100644 include/linux/cxl_accel_mem.h
 create mode 100644 include/linux/cxl_accel_pci.h

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 0277726afd04..61b5d35b49e7 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -8,6 +8,7 @@
 #include <linux/idr.h>
 #include <linux/pci.h>
 #include <cxlmem.h>
+#include <linux/cxl_accel_mem.h>
 #include "trace.h"
 #include "core.h"
 
@@ -615,6 +616,25 @@ static void detach_memdev(struct work_struct *work)
 
 static struct lock_class_key cxl_memdev_key;
 
+struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
+{
+	struct cxl_dev_state *cxlds;
+
+	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
+	if (!cxlds)
+		return ERR_PTR(-ENOMEM);
+
+	cxlds->dev = dev;
+	cxlds->type = CXL_DEVTYPE_DEVMEM;
+
+	cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
+	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
+	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
+
+	return cxlds;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
+
 static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
 					   const struct file_operations *fops)
 {
@@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
+
+void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
+{
+	cxlds->cxl_dvsec = dvsec;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_set_dvsec, CXL);
+
+void cxl_accel_set_serial(struct cxl_dev_state *cxlds, u64 serial)
+{
+	cxlds->serial = serial;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_set_serial, CXL);
+
+void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
+			    enum accel_resource type)
+{
+	switch (type) {
+	case CXL_ACCEL_RES_DPA:
+		cxlds->dpa_res = res;
+		return;
+	case CXL_ACCEL_RES_RAM:
+		cxlds->ram_res = res;
+		return;
+	case CXL_ACCEL_RES_PMEM:
+		cxlds->pmem_res = res;
+		return;
+	default:
+		dev_err(cxlds->dev, "unknown resource type (%u)\n", type);
+	}
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
+
 static int cxl_memdev_release_file(struct inode *inode, struct file *file)
 {
 	struct cxl_memdev *cxlmd =
diff --git a/drivers/net/ethernet/sfc/Makefile b/drivers/net/ethernet/sfc/Makefile
index 8f446b9bd5ee..e80c713c3b0c 100644
--- a/drivers/net/ethernet/sfc/Makefile
+++ b/drivers/net/ethernet/sfc/Makefile
@@ -7,7 +7,7 @@ sfc-y			+= efx.o efx_common.o efx_channels.o nic.o \
 			   mcdi_functions.o mcdi_filters.o mcdi_mon.o \
 			   ef100.o ef100_nic.o ef100_netdev.o \
 			   ef100_ethtool.o ef100_rx.o ef100_tx.o \
-			   efx_devlink.o
+			   efx_devlink.o efx_cxl.o
 sfc-$(CONFIG_SFC_MTD)	+= mtd.o
 sfc-$(CONFIG_SFC_SRIOV)	+= sriov.o ef10_sriov.o ef100_sriov.o ef100_rep.o \
                            mae.o tc.o tc_bindings.o tc_counters.o \
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index e9d9de8e648a..cb3f74d30852 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -33,6 +33,7 @@
 #include "selftest.h"
 #include "sriov.h"
 #include "efx_devlink.h"
+#include "efx_cxl.h"
 
 #include "mcdi_port_common.h"
 #include "mcdi_pcol.h"
@@ -899,6 +900,7 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
 	efx_pci_remove_main(efx);
 
 	efx_fini_io(efx);
+
 	pci_dbg(efx->pci_dev, "shutdown successful\n");
 
 	efx_fini_devlink_and_unlock(efx);
@@ -1109,6 +1111,8 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
 	if (rc)
 		goto fail2;
 
+	efx_cxl_init(efx);
+
 	rc = efx_pci_probe_post_io(efx);
 	if (rc) {
 		/* On failure, retry once immediately.
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
new file mode 100644
index 000000000000..4554dd7cca76
--- /dev/null
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/****************************************************************************
+ * Driver for AMD network controllers and boards
+ * Copyright (C) 2024, Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation, incorporated herein by reference.
+ */
+
+
+#include <linux/pci.h>
+#include <linux/cxl_accel_mem.h>
+#include <linux/cxl_accel_pci.h>
+
+#include "net_driver.h"
+#include "efx_cxl.h"
+
+#define EFX_CTPIO_BUFFER_SIZE	(1024*1024*256)
+
+void efx_cxl_init(struct efx_nic *efx)
+{
+	struct pci_dev *pci_dev = efx->pci_dev;
+	struct efx_cxl *cxl = efx->cxl;
+	struct resource res;
+	u16 dvsec;
+
+	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
+					  CXL_DVSEC_PCIE_DEVICE);
+
+	if (!dvsec)
+		return;
+
+	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");
+
+	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
+	if (IS_ERR(cxl->cxlds)) {
+		pci_info(pci_dev, "CXL accel device state failed");
+		return;
+	}
+
+	cxl_accel_set_dvsec(cxl->cxlds, dvsec);
+	cxl_accel_set_serial(cxl->cxlds, pci_dev->dev.id);
+
+	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
+	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_DPA);
+
+	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
+	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
+}
+
+
+MODULE_IMPORT_NS(CXL);
diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
new file mode 100644
index 000000000000..76c6794c20d8
--- /dev/null
+++ b/drivers/net/ethernet/sfc/efx_cxl.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/****************************************************************************
+ * Driver for AMD network controllers and boards
+ * Copyright (C) 2024, Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation, incorporated herein by reference.
+ */
+
+#ifndef EFX_CXL_H
+#define EFX_CXL_H
+
+#include <linux/cxl_accel_mem.h>
+
+struct efx_nic;
+
+struct efx_cxl {
+	cxl_accel_state *cxlds;
+	struct cxl_memdev *cxlmd;
+	struct cxl_root_decoder *cxlrd;
+	struct cxl_port *endpoint;
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_region *efx_region;
+	void __iomem *ctpio_cxl;
+};
+
+void efx_cxl_init(struct efx_nic *efx);
+#endif
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index f2dd7feb0e0c..58b7517afea4 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -814,6 +814,8 @@ enum efx_xdp_tx_queues_mode {
 
 struct efx_mae;
 
+struct efx_cxl;
+
 /**
  * struct efx_nic - an Efx NIC
  * @name: Device name (net device name or bus id before net device registered)
@@ -962,6 +964,7 @@ struct efx_mae;
  * @tc: state for TC offload (EF100).
  * @devlink: reference to devlink structure owned by this device
  * @dl_port: devlink port associated with the PF
+ * @cxl: details of related cxl objects
  * @mem_bar: The BAR that is mapped into membase.
  * @reg_base: Offset from the start of the bar to the function control window.
  * @monitor_work: Hardware monitor workitem
@@ -1148,6 +1151,7 @@ struct efx_nic {
 
 	struct devlink *devlink;
 	struct devlink_port *dl_port;
+	struct efx_cxl *cxl;
 	unsigned int mem_bar;
 	u32 reg_base;
 
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
new file mode 100644
index 000000000000..daf46d41f59c
--- /dev/null
+++ b/include/linux/cxl_accel_mem.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
+
+#include <linux/cdev.h>
+
+#ifndef __CXL_ACCEL_MEM_H
+#define __CXL_ACCEL_MEM_H
+
+enum accel_resource{
+	CXL_ACCEL_RES_DPA,
+	CXL_ACCEL_RES_RAM,
+	CXL_ACCEL_RES_PMEM,
+};
+
+typedef struct cxl_dev_state cxl_accel_state;
+cxl_accel_state *cxl_accel_state_create(struct device *dev);
+
+void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
+void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
+void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
+			    enum accel_resource);
+#endif
diff --git a/include/linux/cxl_accel_pci.h b/include/linux/cxl_accel_pci.h
new file mode 100644
index 000000000000..c337ae8797e6
--- /dev/null
+++ b/include/linux/cxl_accel_pci.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
+
+#ifndef __CXL_ACCEL_PCI_H
+#define __CXL_ACCEL_PCI_H
+
+/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
+#define CXL_DVSEC_PCIE_DEVICE					0
+#define   CXL_DVSEC_CAP_OFFSET		0xA
+#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
+#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
+#define   CXL_DVSEC_CTRL_OFFSET		0xC
+#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
+#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
+#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
+#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
+#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
+#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
+#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
+#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
+#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
+
+#endif
-- 
2.17.1



* [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
  2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-16  6:26   ` Li, Ming4
                     ` (2 more replies)
  2024-07-15 17:28 ` [PATCH v2 03/15] cxl: add function for type2 resource request alejandro.lucero-palau
                   ` (12 subsequent siblings)
  14 siblings, 3 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Create a new function that lets a type2 driver set up and map the CXL
registers through the opaque cxl_dev_state struct.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/pci.c                  | 28 ++++++++++++++++++++++++++++
 drivers/net/ethernet/sfc/efx_cxl.c |  3 +++
 include/linux/cxl_accel_mem.h      |  1 +
 3 files changed, 32 insertions(+)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index e53646e9f2fb..b34d6259faf4 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -11,6 +11,7 @@
 #include <linux/pci.h>
 #include <linux/aer.h>
 #include <linux/io.h>
+#include <linux/cxl_accel_mem.h>
 #include "cxlmem.h"
 #include "cxlpci.h"
 #include "cxl.h"
@@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 	return cxl_setup_regs(map);
 }
 
+int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
+{
+	struct cxl_register_map map;
+	int rc;
+
+	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
+	if (rc)
+		return rc;
+
+	rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
+	if (rc)
+		return rc;
+
+	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
+				&cxlds->reg_map);
+	if (rc)
+		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
+
+	rc = cxl_map_component_regs(&cxlds->reg_map, &cxlds->regs.component,
+				    BIT(CXL_CM_CAP_CAP_ID_RAS));
+	if (rc)
+		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_pci_accel_setup_regs, CXL);
+
 static int cxl_pci_ras_unmask(struct pci_dev *pdev)
 {
 	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 4554dd7cca76..10c4fb915278 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -47,6 +47,9 @@ void efx_cxl_init(struct efx_nic *efx)
 
 	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
 	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
+
+	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
+		pci_info(pci_dev, "CXL accel setup regs failed");
 }
 
 
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index daf46d41f59c..ca7af4a9cefc 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -19,4 +19,5 @@ void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
 void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
 void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
 			    enum accel_resource);
+int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
 #endif
-- 
2.17.1



* [PATCH v2 03/15] cxl: add function for type2 resource request
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
  2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
  2024-07-15 17:28 ` [PATCH v2 02/15] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-18 23:36   ` Dave Jiang
                     ` (2 more replies)
  2024-07-15 17:28 ` [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state alejandro.lucero-palau
                   ` (11 subsequent siblings)
  14 siblings, 3 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Create a new function that lets a type2 driver request a resource,
passing in the opaque struct to work with.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/memdev.c          | 13 +++++++++++++
 drivers/net/ethernet/sfc/efx_cxl.c |  7 ++++++-
 include/linux/cxl_accel_mem.h      |  1 +
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 61b5d35b49e7..04c3a0f8bc2e 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -744,6 +744,19 @@ void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
 
+int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)
+{
+	int rc;
+
+	if (is_ram)
+		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
+	else
+		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
+
 static int cxl_memdev_release_file(struct inode *inode, struct file *file)
 {
 	struct cxl_memdev *cxlmd =
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 10c4fb915278..9cefcaf3caca 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -48,8 +48,13 @@ void efx_cxl_init(struct efx_nic *efx)
 	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
 	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
 
-	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
+	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds)) {
 		pci_info(pci_dev, "CXL accel setup regs failed");
+		return;
+	}
+
+	if (cxl_accel_request_resource(cxl->cxlds, true))
+		pci_info(pci_dev, "CXL accel resource request failed");
 }
 
 
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index ca7af4a9cefc..c7b254edc096 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -20,4 +20,5 @@ void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
 void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
 			    enum accel_resource);
 int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
+int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
 #endif
-- 
2.17.1



* [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (2 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 03/15] cxl: add function for type2 resource request alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-19 19:01   ` Dave Jiang
                     ` (2 more replies)
  2024-07-15 17:28 ` [PATCH v2 05/15] cxl: fix use of resource_contains alejandro.lucero-palau
                   ` (10 subsequent siblings)
  14 siblings, 3 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

For Type2 devices, some Type3 functionality, such as a mailbox or an HDM
decoder, is optional, and the CXL core needs a way to know what a CXL
accelerator implements.

Add a new field holding the device capabilities, to be initialised by
Type2 drivers. Advertise all of those capabilities for Type3.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/mbox.c            |  1 +
 drivers/cxl/core/memdev.c          |  4 +++-
 drivers/cxl/core/port.c            |  2 +-
 drivers/cxl/core/regs.c            | 11 ++++++-----
 drivers/cxl/cxl.h                  |  2 +-
 drivers/cxl/cxlmem.h               |  4 ++++
 drivers/cxl/pci.c                  | 15 +++++++++------
 drivers/net/ethernet/sfc/efx_cxl.c |  3 ++-
 include/linux/cxl_accel_mem.h      |  5 ++++-
 9 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 2626f3fff201..2ba7d36e3f38 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1424,6 +1424,7 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
 	mds->cxlds.reg_map.host = dev;
 	mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
 	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
+	mds->cxlds.capabilities = CXL_DRIVER_CAP_HDM | CXL_DRIVER_CAP_MBOX;
 	mds->ram_perf.qos_class = CXL_QOS_CLASS_INVALID;
 	mds->pmem_perf.qos_class = CXL_QOS_CLASS_INVALID;
 
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 04c3a0f8bc2e..b4205ecca365 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -616,7 +616,7 @@ static void detach_memdev(struct work_struct *work)
 
 static struct lock_class_key cxl_memdev_key;
 
-struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
+struct cxl_dev_state *cxl_accel_state_create(struct device *dev, uint8_t caps)
 {
 	struct cxl_dev_state *cxlds;
 
@@ -631,6 +631,8 @@ struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
 	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
 	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
 
+	cxlds->capabilities = caps;
+
 	return cxlds;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 887ed6e358fb..d66c6349ed2d 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -763,7 +763,7 @@ static int cxl_setup_comp_regs(struct device *host, struct cxl_register_map *map
 	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
 	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
 
-	return cxl_setup_regs(map);
+	return cxl_setup_regs(map, 0);
 }
 
 static int cxl_port_setup_regs(struct cxl_port *port,
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index e1082e749c69..9d218ebe180d 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -421,7 +421,7 @@ static void cxl_unmap_regblock(struct cxl_register_map *map)
 	map->base = NULL;
 }
 
-static int cxl_probe_regs(struct cxl_register_map *map)
+static int cxl_probe_regs(struct cxl_register_map *map, uint8_t caps)
 {
 	struct cxl_component_reg_map *comp_map;
 	struct cxl_device_reg_map *dev_map;
@@ -437,11 +437,12 @@ static int cxl_probe_regs(struct cxl_register_map *map)
 	case CXL_REGLOC_RBI_MEMDEV:
 		dev_map = &map->device_map;
 		cxl_probe_device_regs(host, base, dev_map);
-		if (!dev_map->status.valid || !dev_map->mbox.valid ||
+		if (!dev_map->status.valid ||
+		    ((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ||
 		    !dev_map->memdev.valid) {
 			dev_err(host, "registers not found: %s%s%s\n",
 				!dev_map->status.valid ? "status " : "",
-				!dev_map->mbox.valid ? "mbox " : "",
+				((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ? "mbox " : "",
 				!dev_map->memdev.valid ? "memdev " : "");
 			return -ENXIO;
 		}
@@ -455,7 +456,7 @@ static int cxl_probe_regs(struct cxl_register_map *map)
 	return 0;
 }
 
-int cxl_setup_regs(struct cxl_register_map *map)
+int cxl_setup_regs(struct cxl_register_map *map, uint8_t caps)
 {
 	int rc;
 
@@ -463,7 +464,7 @@ int cxl_setup_regs(struct cxl_register_map *map)
 	if (rc)
 		return rc;
 
-	rc = cxl_probe_regs(map);
+	rc = cxl_probe_regs(map, caps);
 	cxl_unmap_regblock(map);
 
 	return rc;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index a6613a6f8923..9973430d975f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -300,7 +300,7 @@ int cxl_find_regblock_instance(struct pci_dev *pdev, enum cxl_regloc_type type,
 			       struct cxl_register_map *map, int index);
 int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
 		      struct cxl_register_map *map);
-int cxl_setup_regs(struct cxl_register_map *map);
+int cxl_setup_regs(struct cxl_register_map *map, uint8_t caps);
 struct cxl_dport;
 resource_size_t cxl_rcd_component_reg_phys(struct device *dev,
 					   struct cxl_dport *dport);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index af8169ccdbc0..8f2a820bd92d 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -405,6 +405,9 @@ struct cxl_dpa_perf {
 	int qos_class;
 };
 
+#define CXL_DRIVER_CAP_HDM	0x1
+#define CXL_DRIVER_CAP_MBOX	0x2
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -438,6 +441,7 @@ struct cxl_dev_state {
 	struct resource ram_res;
 	u64 serial;
 	enum cxl_devtype type;
+	uint8_t capabilities;
 };
 
 /**
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index b34d6259faf4..e2a978312281 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -502,7 +502,8 @@ static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
 }
 
 static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
-			      struct cxl_register_map *map)
+			      struct cxl_register_map *map,
+			      uint8_t cxl_dev_caps)
 {
 	int rc;
 
@@ -519,7 +520,7 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 	if (rc)
 		return rc;
 
-	return cxl_setup_regs(map);
+	return cxl_setup_regs(map, cxl_dev_caps);
 }
 
 int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
@@ -527,7 +528,8 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
 	struct cxl_register_map map;
 	int rc;
 
-	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
+	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
+				cxlds->capabilities);
 	if (rc)
 		return rc;
 
@@ -536,7 +538,7 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
 		return rc;
 
 	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
-				&cxlds->reg_map);
+				&cxlds->reg_map, cxlds->capabilities);
 	if (rc)
 		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
 
@@ -850,7 +852,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		dev_warn(&pdev->dev,
 			 "Device DVSEC not present, skip CXL.mem init\n");
 
-	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
+	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
+				cxlds->capabilities);
 	if (rc)
 		return rc;
 
@@ -863,7 +866,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	 * still be useful for management functions so don't return an error.
 	 */
 	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
-				&cxlds->reg_map);
+				&cxlds->reg_map, cxlds->capabilities);
 	if (rc)
 		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
 	else if (!cxlds->reg_map.component_map.ras.valid)
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 9cefcaf3caca..37d8bfdef517 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -33,7 +33,8 @@ void efx_cxl_init(struct efx_nic *efx)
 
 	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");
 
-	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
+	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev,
+					    CXL_ACCEL_DRIVER_CAP_HDM);
 	if (IS_ERR(cxl->cxlds)) {
 		pci_info(pci_dev, "CXL accel device state failed");
 		return;
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index c7b254edc096..0ba2195b919b 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -12,8 +12,11 @@ enum accel_resource{
 	CXL_ACCEL_RES_PMEM,
 };
 
+#define CXL_ACCEL_DRIVER_CAP_HDM	0x1
+#define CXL_ACCEL_DRIVER_CAP_MBOX	0x2
+
 typedef struct cxl_dev_state cxl_accel_state;
-cxl_accel_state *cxl_accel_state_create(struct device *dev);
+cxl_accel_state *cxl_accel_state_create(struct device *dev, uint8_t caps);
 
 void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
 void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
-- 
2.17.1



* [PATCH v2 05/15] cxl: fix use of resource_contains
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (3 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-24 21:25   ` fan
                     ` (2 more replies)
  2024-07-15 17:28 ` [PATCH v2 06/15] cxl: add function for setting media ready by an accelerator alejandro.lucero-palau
                   ` (9 subsequent siblings)
  14 siblings, 3 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

For a resource defined with size zero, resource_contains() also
returns true.

Add a resource size check before calling it.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/hdm.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
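
Note: the behaviour described above can be reproduced with a small user-space
model. This is only a sketch, not kernel code: `struct range_res` and the
`res_*` helpers are hypothetical stand-ins for `struct resource`,
resource_size() and resource_contains(), and it assumes a zero-sized resource
is encoded as `end == start - 1`, which is what the kernel's DEFINE_RES_*
initializers produce for a size of zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct resource (inclusive range). */
struct range_res {
	uint64_t start;
	uint64_t end;
};

/* Assumed encoding: a zero-sized resource ends one byte before it starts. */
static struct range_res make_res(uint64_t start, uint64_t size)
{
	struct range_res r = { .start = start, .end = start + size - 1 };

	return r;
}

/* Same arithmetic as resource_size(): wraps around to 0 for an empty range. */
static uint64_t res_size(const struct range_res *r)
{
	return r->end - r->start + 1;
}

/* Plain bounds comparison; the kernel helper also compares resource types. */
static bool res_contains(const struct range_res *outer,
			 const struct range_res *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/* Guarded form, as in the patch: an empty resource contains nothing. */
static bool res_contains_nonempty(const struct range_res *outer,
				  const struct range_res *inner)
{
	return res_size(outer) != 0 && res_contains(outer, inner);
}
```

With an empty resource at offset 0, `end` wraps to the maximum value, so the
unguarded comparison matches any range; the size check is what rejects it.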

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 3df10517a327..4af9225d4b59 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -327,10 +327,13 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 	cxled->dpa_res = res;
 	cxled->skip = skipped;
 
-	if (resource_contains(&cxlds->pmem_res, res))
+	if ((resource_size(&cxlds->pmem_res)) && (resource_contains(&cxlds->pmem_res, res))) {
+		printk("%s: resource_contains CXL_DECODER_PMEM\n", __func__);
 		cxled->mode = CXL_DECODER_PMEM;
-	else if (resource_contains(&cxlds->ram_res, res))
+	} else if ((resource_size(&cxlds->ram_res)) && (resource_contains(&cxlds->ram_res, res))) {
+		printk("%s: resource_contains CXL_DECODER_RAM\n", __func__);
 		cxled->mode = CXL_DECODER_RAM;
+	}
 	else {
 		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
 			 port->id, cxled->cxld.id, cxled->dpa_res);
-- 
2.17.1



* [PATCH v2 06/15] cxl: add function for setting media ready by an accelerator
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (4 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 05/15] cxl: fix use of resource_contains alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-08-04 17:26   ` Jonathan Cameron
  2024-07-15 17:28 ` [PATCH v2 07/15] cxl: support type2 memdev creation alejandro.lucero-palau
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

A Type-2 driver may need to set the memory availability explicitly.

Add a function to the exported CXL API for accelerator drivers.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/memdev.c          | 7 ++++++-
 drivers/net/ethernet/sfc/efx_cxl.c | 5 +++++
 include/linux/cxl_accel_mem.h      | 2 ++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index b4205ecca365..58a51e7fd37f 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -714,7 +714,6 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
-
 void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
 {
 	cxlds->cxl_dvsec = dvsec;
@@ -759,6 +758,12 @@ int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
 
+void cxl_accel_set_media_ready(struct cxl_dev_state *cxlds)
+{
+	cxlds->media_ready = true;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_set_media_ready, CXL);
+
 static int cxl_memdev_release_file(struct inode *inode, struct file *file)
 {
 	struct cxl_memdev *cxlmd =
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 37d8bfdef517..a84fe7992c53 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -56,6 +56,11 @@ void efx_cxl_init(struct efx_nic *efx)
 
 	if (cxl_accel_request_resource(cxl->cxlds, true))
 		pci_info(pci_dev, "CXL accel resource request failed");
+
+	if (!cxl_await_media_ready(cxl->cxlds))
+		cxl_accel_set_media_ready(cxl->cxlds);
+	else
+		pci_info(pci_dev, "CXL accel media not active");
 }
 
 
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index 0ba2195b919b..b883c438a132 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -24,4 +24,6 @@ void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
 			    enum accel_resource);
 int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
 int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
+void cxl_accel_set_media_ready(struct cxl_dev_state *cxlds);
+int cxl_await_media_ready(struct cxl_dev_state *cxlds);
 #endif
-- 
2.17.1



* [PATCH v2 07/15] cxl: support type2 memdev creation
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (5 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 06/15] cxl: add function for setting media ready by an accelerator alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-24 21:32   ` fan
  2024-08-04 17:31   ` Jonathan Cameron
  2024-07-15 17:28 ` [PATCH v2 08/15] cxl: indicate probe deferral alejandro.lucero-palau
                   ` (7 subsequent siblings)
  14 siblings, 2 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Add memdev creation from sfc driver.

The current cxl core relies on a CXL_DEVTYPE_CLASSMEM type device when
creating a memdev, which leads to problems when obtaining cxl_memdev_state
references from a CXL_DEVTYPE_DEVMEM type. The latter device type is
managed by a specific vendor driver and does not need the same sysfs files,
since no userspace intervention is expected. This patch checks for the
right device type in those functions using cxl_memdev_state.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/cdat.c            |  3 +++
 drivers/cxl/core/memdev.c          |  9 +++++++++
 drivers/cxl/mem.c                  | 17 +++++++++++------
 drivers/net/ethernet/sfc/efx_cxl.c | 10 ++++++++--
 include/linux/cxl_accel_mem.h      |  3 +++
 5 files changed, 34 insertions(+), 8 deletions(-)
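
Note: the pattern behind the added `if (!mds)` checks can be sketched in user
space. All names below are hypothetical models, not the kernel definitions.
The sketch assumes that the lookup helper returns NULL for anything that is
not a class-memory device, which is what the NULL checks in this patch rely
on.

```c
#include <stddef.h>

/* Hypothetical model of the two device types discussed above. */
enum dev_type { DEVTYPE_DEVMEM, DEVTYPE_CLASSMEM };

struct dev_state {
	enum dev_type type;
};

/* Only class-memory (Type3) devices embed their dev_state in this wrapper. */
struct memdev_state {
	int qos_class;
	struct dev_state cxlds;
};

#define QOS_CLASS_INVALID (-1)

/* Assumed behaviour: no memdev_state exists for accelerator (DEVMEM) types. */
static struct memdev_state *to_memdev_state(struct dev_state *cxlds)
{
	if (cxlds->type != DEVTYPE_CLASSMEM)
		return NULL;
	/* Open-coded container_of(): step back to the enclosing structure. */
	return (struct memdev_state *)((char *)cxlds -
				       offsetof(struct memdev_state, cxlds));
}

/* Shape of a *_visible() callback: 0 hides the attribute, 1 shows it. */
static int qos_attr_visible(struct dev_state *cxlds)
{
	struct memdev_state *mds = to_memdev_state(cxlds);

	if (!mds)
		return 0;
	return mds->qos_class != QOS_CLASS_INVALID;
}
```

Without the NULL check, the accelerator case would dereference a pointer that
does not refer to a real `memdev_state`.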

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index bb83867d9fec..0d4679c137d4 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -558,6 +558,9 @@ void cxl_region_perf_data_calculate(struct cxl_region *cxlr,
 	};
 	struct cxl_dpa_perf *perf;
 
+	if (!mds)
+		return;
+
 	switch (cxlr->mode) {
 	case CXL_DECODER_RAM:
 		perf = &mds->ram_perf;
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 58a51e7fd37f..b902948b121f 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -468,6 +468,9 @@ static umode_t cxl_ram_visible(struct kobject *kobj, struct attribute *a, int n)
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 
+	if (!mds)
+		return 0;
+
 	if (a == &dev_attr_ram_qos_class.attr)
 		if (mds->ram_perf.qos_class == CXL_QOS_CLASS_INVALID)
 			return 0;
@@ -487,6 +490,9 @@ static umode_t cxl_pmem_visible(struct kobject *kobj, struct attribute *a, int n
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 
+	if (!mds)
+		return 0;
+
 	if (a == &dev_attr_pmem_qos_class.attr)
 		if (mds->pmem_perf.qos_class == CXL_QOS_CLASS_INVALID)
 			return 0;
@@ -507,6 +513,9 @@ static umode_t cxl_memdev_security_visible(struct kobject *kobj,
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 
+	if (!mds)
+		return 0;
+
 	if (a == &dev_attr_security_sanitize.attr &&
 	    !test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
 		return 0;
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 2f1b49bfe162..f76af75a87b7 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -131,12 +131,14 @@ static int cxl_mem_probe(struct device *dev)
 	dentry = cxl_debugfs_create_dir(dev_name(dev));
 	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
 
-	if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
-		debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
-				    &cxl_poison_inject_fops);
-	if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
-		debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
-				    &cxl_poison_clear_fops);
+	if (mds) {
+		if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
+			debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
+					    &cxl_poison_inject_fops);
+		if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
+			debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
+					    &cxl_poison_clear_fops);
+	}
 
 	rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
 	if (rc)
@@ -222,6 +224,9 @@ static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 
+	if (!mds)
+		return 0;
+
 	if (a == &dev_attr_trigger_poison_list.attr)
 		if (!test_bit(CXL_POISON_ENABLED_LIST,
 			      mds->poison.enabled_cmds))
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index a84fe7992c53..0abe66490ef5 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -57,10 +57,16 @@ void efx_cxl_init(struct efx_nic *efx)
 	if (cxl_accel_request_resource(cxl->cxlds, true))
 		pci_info(pci_dev, "CXL accel resource request failed");
 
-	if (!cxl_await_media_ready(cxl->cxlds))
+	if (!cxl_await_media_ready(cxl->cxlds)) {
 		cxl_accel_set_media_ready(cxl->cxlds);
-	else
+	} else {
 		pci_info(pci_dev, "CXL accel media not active");
+		return;
+	}
+
+	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
+	if (IS_ERR(cxl->cxlmd))
+		pci_info(pci_dev, "CXL accel memdev creation failed");
 }
 
 
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index b883c438a132..442ed9862292 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -26,4 +26,7 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
 int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
 void cxl_accel_set_media_ready(struct cxl_dev_state *cxlds);
 int cxl_await_media_ready(struct cxl_dev_state *cxlds);
+
+struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
+				       struct cxl_dev_state *cxlds);
 #endif
-- 
2.17.1



* [PATCH v2 08/15] cxl: indicate probe deferral
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (6 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 07/15] cxl: support type2 memdev creation alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-16  5:52   ` Li, Ming4
                     ` (4 more replies)
  2024-07-15 17:28 ` [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration alejandro.lucero-palau
                   ` (6 subsequent siblings)
  14 siblings, 5 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

The first stop for a CXL accelerator driver that wants to establish new
CXL.mem regions is to register a 'struct cxl_memdev'. That kicks off
cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
topology up to the root.

If the root driver has not attached yet the expectation is that the
driver waits until that link is established. The common cxl_pci_driver
has reason to keep the 'struct cxl_memdev' device attached to the bus
until the root driver attaches. An accelerator may want to instead defer
probing until CXL resources can be acquired.

Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
accelerator driver probing should be deferred vs failed. Provide that
indication via a new cxl_acquire_endpoint() API that can retrieve the
probe status of the memdev.

The first consumer of this API is a test driver that exercises the CXL
Type-2 flow.

Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c          | 41 ++++++++++++++++++++++++++++++
 drivers/cxl/core/port.c            |  2 +-
 drivers/cxl/mem.c                  |  7 +++--
 drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
 include/linux/cxl_accel_mem.h      |  3 +++
 5 files changed, 59 insertions(+), 4 deletions(-)
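
Note: the acquire/release contract described above can be modelled in user
space as follows. This is a sketch under stated assumptions: the `struct port`
and `struct memdev` types, the lock counters and `try_setup()` are
hypothetical, and ERR_PTR()/PTR_ERR()/IS_ERR() are minimal re-implementations
of the kernel helpers of the same name. EPROBE_DEFER is defined by hand
because it is a kernel-internal errno value.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal, absent from user-space errno.h */
#endif
#define MAX_ERRNO 4095

/* Minimal user-space versions of the kernel's error-pointer helpers. */
static void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical models: "locked" counts stand in for device_lock(). */
struct port { int bound; int locked; };
struct memdev { struct port *endpoint; int locked; };

static struct port *acquire_endpoint(struct memdev *md)
{
	struct port *ep;
	long rc = -ENXIO;

	md->locked++;
	ep = md->endpoint;
	if (!ep)
		goto err;		/* memdev probe has not run yet */
	if (IS_ERR(ep)) {
		rc = PTR_ERR(ep);	/* e.g. -EPROBE_DEFER left by probe */
		goto err;
	}
	ep->locked++;
	if (!ep->bound)
		goto err_endpoint;	/* endpoint has no driver attached */
	return ep;			/* success: both locks remain held */

err_endpoint:
	ep->locked--;
err:
	md->locked--;
	return ERR_PTR(rc);
}

static void release_endpoint(struct memdev *md, struct port *ep)
{
	ep->locked--;
	md->locked--;
}

/* Caller pattern: release only after a successful acquire. */
static long try_setup(struct memdev *md)
{
	struct port *ep = acquire_endpoint(md);

	if (IS_ERR(ep))
		return PTR_ERR(ep);
	/* ... work that needs a stable port topology goes here ... */
	release_endpoint(md, ep);
	return 0;
}
```

On failure the acquire path has already dropped every lock it took, so a
caller that sees an error pointer should return (or defer) without calling
the release helper.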

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index b902948b121f..d51c8bfb32e3 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
 }
 EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
 
+/*
+ * Try to get a locked reference on a memdev's CXL port topology
+ * connection. Be careful to observe when cxl_mem_probe() has deposited
+ * a probe deferral awaiting the arrival of the CXL root driver.
+ */
+struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
+{
+	struct cxl_port *endpoint;
+	int rc = -ENXIO;
+
+	device_lock(&cxlmd->dev);
+	endpoint = cxlmd->endpoint;
+	if (!endpoint)
+		goto err;
+
+	if (IS_ERR(endpoint)) {
+		rc = PTR_ERR(endpoint);
+		goto err;
+	}
+
+	device_lock(&endpoint->dev);
+	if (!endpoint->dev.driver)
+		goto err_endpoint;
+
+	return endpoint;
+
+err_endpoint:
+	device_unlock(&endpoint->dev);
+err:
+	device_unlock(&cxlmd->dev);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
+
+void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
+{
+	device_unlock(&endpoint->dev);
+	device_unlock(&cxlmd->dev);
+}
+EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
+
 static void sanitize_teardown_notifier(void *data)
 {
 	struct cxl_memdev_state *mds = data;
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index d66c6349ed2d..3c6b896c5f65 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
 		 */
 		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
 			dev_name(dport_dev));
-		return -ENXIO;
+		return -EPROBE_DEFER;
 	}
 
 	parent_port = find_cxl_port(dparent, &parent_dport);
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index f76af75a87b7..383a6f4829d3 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
 		return rc;
 
 	rc = devm_cxl_enumerate_ports(cxlmd);
-	if (rc)
+	if (rc) {
+		cxlmd->endpoint = ERR_PTR(rc);
 		return rc;
+	}
 
 	parent_port = cxl_mem_find_port(cxlmd, &dport);
 	if (!parent_port) {
 		dev_err(dev, "CXL port topology not found\n");
-		return -ENXIO;
+		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
+		return -EPROBE_DEFER;
 	}
 
 	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 0abe66490ef5..2cf4837ddfc1 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -65,8 +65,16 @@ void efx_cxl_init(struct efx_nic *efx)
 	}
 
 	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
-	if (IS_ERR(cxl->cxlmd))
+	if (IS_ERR(cxl->cxlmd)) {
 		pci_info(pci_dev, "CXL accel memdev creation failed");
+		return;
+	}
+
+	cxl->endpoint = cxl_acquire_endpoint(cxl->cxlmd);
+	if (IS_ERR(cxl->endpoint))
+		pci_info(pci_dev, "CXL accel acquire endpoint failed");
+
+	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
 }
 
 
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index 442ed9862292..701910021df8 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -29,4 +29,7 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds);
 
 struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
 				       struct cxl_dev_state *cxlds);
+
+struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
+void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
 #endif
-- 
2.17.1



* [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (7 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 08/15] cxl: indicate probe deferral alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-16  0:53   ` kernel test robot
                     ` (2 more replies)
  2024-07-15 17:28 ` [PATCH v2 10/15] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
                   ` (5 subsequent siblings)
  14 siblings, 3 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

CXL region creation involves allocating capacity from device DPA
(device-physical-address space) and assigning it to decode a given HPA
(host-physical-address space). Before determining how much DPA to
allocate the amount of available HPA must be determined. Also, not all
HPA is created equal: some specifically targets RAM, some targets PMEM,
some is prepared for device-memory flows like HDM-D and HDM-DB, and some
is host-only (HDM-H).

Wrap all of those concerns into an API that retrieves a root decoder
(platform CXL window) that fits the specified constraints and the
capacity available for a new region.

Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m6fbe775541da3cd477d65fa95c8acdc347345b4f

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c          | 161 +++++++++++++++++++++++++++++
 drivers/cxl/cxl.h                  |   3 +
 drivers/cxl/cxlmem.h               |   5 +
 drivers/net/ethernet/sfc/efx_cxl.c |  14 +++
 include/linux/cxl_accel_mem.h      |   9 ++
 5 files changed, 192 insertions(+)
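
Note: the free-space search described above boils down to finding the largest
gap between the already allocated child ranges of a window. The following
user-space sketch shows that calculation with inclusive end addresses. The
`struct span` type and both helpers are hypothetical; the sketch assumes the
used ranges are sorted, do not overlap, lie inside the window, and that the
window does not end at the top of the 64-bit address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range: [start, end]. */
struct span {
	uint64_t start;
	uint64_t end;
};

/* A window qualifies only if it carries every requested flag. */
static int flags_match(unsigned long have, unsigned long want)
{
	return (have & want) == want;
}

/* Return the size in bytes of the largest unused gap inside the window. */
static uint64_t largest_free_gap(const struct span *window,
				 const struct span *used, size_t n)
{
	uint64_t cursor = window->start;	/* first byte not yet accounted for */
	uint64_t best = 0;

	for (size_t i = 0; i < n; i++) {
		/* Gap between the previous range (or window start) and this one. */
		if (used[i].start > cursor && used[i].start - cursor > best)
			best = used[i].start - cursor;
		cursor = used[i].end + 1;
	}

	/* Trailing gap between the last used range and the window end. */
	if (window->end + 1 > cursor && window->end + 1 - cursor > best)
		best = window->end + 1 - cursor;

	return best;
}
```

Because `end` is inclusive, a gap between two neighbours is
`next.start - (prev.end + 1)`; an empty window yields its full size.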

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 538ebd5a64fd..ca464bfef77b 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -702,6 +702,167 @@ static int free_hpa(struct cxl_region *cxlr)
 	return 0;
 }
 
+
+struct cxlrd_max_context {
+	struct device * const *host_bridges;
+	int interleave_ways;
+	unsigned long flags;
+	resource_size_t max_hpa;
+	struct cxl_root_decoder *cxlrd;
+};
+
+static int find_max_hpa(struct device *dev, void *data)
+{
+	struct cxlrd_max_context *ctx = data;
+	struct cxl_switch_decoder *cxlsd;
+	struct cxl_root_decoder *cxlrd;
+	struct resource *res, *prev;
+	struct cxl_decoder *cxld;
+	resource_size_t max;
+	int found;
+
+	if (!is_root_decoder(dev))
+		return 0;
+
+	cxlrd = to_cxl_root_decoder(dev);
+	cxld = &cxlrd->cxlsd.cxld;
+	if ((cxld->flags & ctx->flags) != ctx->flags) {
+		dev_dbg(dev, "find_max_hpa, flags not matching: %08lx vs %08lx\n",
+			      cxld->flags, ctx->flags);
+		return 0;
+	}
+
+	/* A Host bridge could have more interleave ways than an
+	 * endpoint, couldn't it?
+	 *
+	 * What does interleave ways mean here in terms of the requestor?
+	 * Why does the CFMWS have 0 interleave ways but the root port has 1?
+	 */
+	if (cxld->interleave_ways != ctx->interleave_ways) {
+		dev_dbg(dev, "find_max_hpa, interleave_ways not matching\n");
+		return 0;
+	}
+
+	cxlsd = &cxlrd->cxlsd;
+
+	guard(rwsem_read)(&cxl_region_rwsem);
+	found = 0;
+	for (int i = 0; i < ctx->interleave_ways; i++)
+		for (int j = 0; j < ctx->interleave_ways; j++)
+			if (ctx->host_bridges[i] ==
+					cxlsd->target[j]->dport_dev) {
+				found++;
+				break;
+			}
+
+	if (found != ctx->interleave_ways) {
+		dev_dbg(dev, "find_max_hpa, no interleave_ways found\n");
+		return 0;
+	}
+
+	/*
+	 * Walk the root decoder resource range relying on cxl_region_rwsem to
+	 * preclude sibling arrival/departure and find the largest free space
+	 * gap.
+	 */
+	lockdep_assert_held_read(&cxl_region_rwsem);
+	max = 0;
+	res = cxlrd->res->child;
+	if (!res)
+		max = resource_size(cxlrd->res);
+	else
+		max = 0;
+
+	for (prev = NULL; res; prev = res, res = res->sibling) {
+		struct resource *next = res->sibling;
+		resource_size_t free = 0;
+
+		if (!prev && res->start > cxlrd->res->start) {
+			free = res->start - cxlrd->res->start;
+			max = max(free, max);
+		}
+		if (prev && res->start > prev->end + 1) {
+			free = res->start - (prev->end + 1);
+			max = max(free, max);
+		}
+		if (next && res->end + 1 < next->start) {
+			free = next->start - (res->end + 1);
+			max = max(free, max);
+		}
+		if (!next && res->end + 1 < cxlrd->res->end + 1) {
+			free = cxlrd->res->end + 1 - (res->end + 1);
+			max = max(free, max);
+		}
+	}
+
+	if (max > ctx->max_hpa) {
+		if (ctx->cxlrd)
+			put_device(CXLRD_DEV(ctx->cxlrd));
+		get_device(CXLRD_DEV(cxlrd));
+		ctx->cxlrd = cxlrd;
+		ctx->max_hpa = max;
+		dev_info(CXLRD_DEV(cxlrd), "found %pa bytes of free space\n", &max);
+	}
+	return 0;
+}
+
+/**
+ * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
+ * @endpoint: an endpoint that is mapped by the returned decoder
+ * @interleave_ways: number of entries in @host_bridges
+ * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
+ * @max: output parameter of bytes available in the returned decoder
+ *
+ * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
+ * is a point in time snapshot. If by the time the caller goes to use this root
+ * decoder's capacity the capacity is reduced then caller needs to loop and
+ * retry.
+ *
+ * The returned root decoder has an elevated reference count that needs to be
+ * put with put_device(CXLRD_DEV(cxlrd)). Locking context is with
+ * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
+ * does not race.
+ */
+struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
+					       int interleave_ways,
+					       unsigned long flags,
+					       resource_size_t *max)
+{
+
+	struct cxlrd_max_context ctx = {
+		.host_bridges = &endpoint->host_bridge,
+		.interleave_ways = interleave_ways,
+		.flags = flags,
+	};
+	struct cxl_port *root_port;
+	struct cxl_root *root;
+
+	if (!is_cxl_endpoint(endpoint)) {
+		dev_dbg(&endpoint->dev, "hpa requestor is not an endpoint\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	root = find_cxl_root(endpoint);
+	if (!root) {
+		dev_dbg(&endpoint->dev, "endpoint can not be related to a root port\n");
+		return ERR_PTR(-ENXIO);
+	}
+
+	root_port = &root->port;
+	down_read(&cxl_region_rwsem);
+	device_for_each_child(&root_port->dev, &ctx, find_max_hpa);
+	up_read(&cxl_region_rwsem);
+	put_device(&root_port->dev);
+
+	if (!ctx.cxlrd)
+		return ERR_PTR(-ENOMEM);
+
+	*max = ctx.max_hpa;
+	return ctx.cxlrd;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_get_hpa_freespace, CXL);
+
+
 static ssize_t size_store(struct device *dev, struct device_attribute *attr,
 			  const char *buf, size_t len)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 9973430d975f..d3fdd2c1e066 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -770,6 +770,9 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
 struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
 struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
 struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
+
+#define CXLRD_DEV(cxlrd) (&(cxlrd)->cxlsd.cxld.dev)
+
 bool is_root_decoder(struct device *dev);
 bool is_switch_decoder(struct device *dev);
 bool is_endpoint_decoder(struct device *dev);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 8f2a820bd92d..a0e0795ec064 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -877,4 +877,9 @@ struct cxl_hdm {
 struct seq_file;
 struct dentry *cxl_debugfs_create_dir(const char *dir);
 void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
+struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
+					       int interleave_ways,
+					       unsigned long flags,
+					       resource_size_t *max);
+
 #endif /* __CXL_MEM_H__ */
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 2cf4837ddfc1..6d49571ccff7 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -22,6 +22,7 @@ void efx_cxl_init(struct efx_nic *efx)
 {
 	struct pci_dev *pci_dev = efx->pci_dev;
 	struct efx_cxl *cxl = efx->cxl;
+	resource_size_t max = 0;
 	struct resource res;
 	u16 dvsec;
 
@@ -74,6 +75,19 @@ void efx_cxl_init(struct efx_nic *efx)
 	if (IS_ERR(cxl->endpoint))
 		pci_info(pci_dev, "CXL accel acquire endpoint failed");
 
+	cxl->cxlrd = cxl_get_hpa_freespace(cxl->endpoint, 1,
+					    CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
+					    &max);
+
+	if (IS_ERR(cxl->cxlrd)) {
+		pci_info(pci_dev, "CXL accel get HPA failed");
+		goto out;
+	}
+
+	if (max < EFX_CTPIO_BUFFER_SIZE)
+		pci_info(pci_dev, "CXL accel not enough free HPA space %pa < %u\n",
+				  &max, EFX_CTPIO_BUFFER_SIZE);
+out:
 	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
 }
 
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index 701910021df8..f3e77688ffe0 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -6,6 +6,10 @@
 #ifndef __CXL_ACCEL_MEM_H
 #define __CXL_ACCEL_MEM_H
 
+#define CXL_DECODER_F_RAM   BIT(0)
+#define CXL_DECODER_F_PMEM  BIT(1)
+#define CXL_DECODER_F_TYPE2 BIT(2)
+
 enum accel_resource{
 	CXL_ACCEL_RES_DPA,
 	CXL_ACCEL_RES_RAM,
@@ -32,4 +36,9 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
 
 struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
 void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
+
+struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
+					       int interleave_ways,
+					       unsigned long flags,
+					       resource_size_t *max);
 #endif
-- 
2.17.1



* [PATCH v2 10/15] cxl: define a driver interface for DPA allocation
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (8 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-16  3:32   ` kernel test robot
                     ` (2 more replies)
  2024-07-15 17:28 ` [PATCH v2 11/15] cxl: make region type based on endpoint type alejandro.lucero-palau
                   ` (4 subsequent siblings)
  14 siblings, 3 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Region creation involves finding available DPA (device-physical-address)
capacity to map into HPA (host-physical-address) space. Given the HPA
capacity constraint, define an API, cxl_request_dpa(), that has the
flexibility to map the minimum amount of memory the driver needs to
operate vs the total possible that can be mapped given HPA availability.

Factor out the core of cxl_dpa_alloc(), which does the free space scanning,
into a cxl_dpa_freespace() helper, and use that to balance the capacity
available to map vs the @min and @max arguments to cxl_request_dpa().

Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m4271ee49a91615c8af54e3ab20679f8be3099393

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/core.h            |   1 +
 drivers/cxl/core/hdm.c             | 153 +++++++++++++++++++++++++----
 drivers/net/ethernet/sfc/efx.c     |   2 +
 drivers/net/ethernet/sfc/efx_cxl.c |  18 +++-
 drivers/net/ethernet/sfc/efx_cxl.h |   1 +
 include/linux/cxl_accel_mem.h      |   7 ++
 6 files changed, 161 insertions(+), 21 deletions(-)
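
Note: one way to read the @min/@max balancing described above is sketched
below in user space. The helper `pick_dpa_size()` and its error codes are
assumptions made for illustration only; the body of cxl_request_dpa() may
apply the limits differently.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose how much DPA to allocate given the free
 * capacity (avail), the least the driver can work with (min) and the most
 * that can usefully be mapped (max).
 */
static int pick_dpa_size(uint64_t avail, uint64_t min, uint64_t max,
			 uint64_t *out)
{
	uint64_t alloc;

	if (min > max)
		return -EINVAL;		/* contradictory request */

	/* Never take more than is free or more than the caller can use. */
	alloc = avail < max ? avail : max;

	if (alloc < min)
		return -ENOSPC;		/* not enough capacity to operate */

	*out = alloc;
	return 0;
}
```

In this reading @max would typically come from the HPA free space found
earlier, while @min is the smallest buffer the driver can operate with.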

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 625394486459..a243ff12c0f4 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -76,6 +76,7 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 		     enum cxl_decoder_mode mode);
 int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size);
 int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
+int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
 resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
 resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled);
 
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 4af9225d4b59..3e53ae222d40 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -3,6 +3,7 @@
 #include <linux/seq_file.h>
 #include <linux/device.h>
 #include <linux/delay.h>
+#include <linux/cxl_accel_mem.h>
 
 #include "cxlmem.h"
 #include "core.h"
@@ -420,6 +421,7 @@ int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
 	up_write(&cxl_dpa_rwsem);
 	return rc;
 }
+EXPORT_SYMBOL_NS_GPL(cxl_dpa_free, CXL);
 
 int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 		     enum cxl_decoder_mode mode)
@@ -467,30 +469,17 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 	return rc;
 }
 
-int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
+static resource_size_t cxl_dpa_freespace(struct cxl_endpoint_decoder *cxled,
+					 resource_size_t *start_out,
+					 resource_size_t *skip_out)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	resource_size_t free_ram_start, free_pmem_start;
-	struct cxl_port *port = cxled_to_port(cxled);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
-	struct device *dev = &cxled->cxld.dev;
 	resource_size_t start, avail, skip;
 	struct resource *p, *last;
-	int rc;
-
-	down_write(&cxl_dpa_rwsem);
-	if (cxled->cxld.region) {
-		dev_dbg(dev, "decoder attached to %s\n",
-			dev_name(&cxled->cxld.region->dev));
-		rc = -EBUSY;
-		goto out;
-	}
 
-	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
-		dev_dbg(dev, "decoder enabled\n");
-		rc = -EBUSY;
-		goto out;
-	}
+	lockdep_assert_held(&cxl_dpa_rwsem);
 
 	for (p = cxlds->ram_res.child, last = NULL; p; p = p->sibling)
 		last = p;
@@ -528,14 +517,45 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
 			skip_end = start - 1;
 		skip = skip_end - skip_start + 1;
 	} else {
-		dev_dbg(dev, "mode not set\n");
-		rc = -EINVAL;
+		avail = 0;
+	}
+
+	if (!avail)
+		return 0;
+	if (start_out)
+		*start_out = start;
+	if (skip_out)
+		*skip_out = skip;
+	return avail;
+}
+
+int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
+{
+	struct cxl_port *port = cxled_to_port(cxled);
+	struct device *dev = &cxled->cxld.dev;
+	resource_size_t start, avail, skip;
+	int rc;
+
+	down_write(&cxl_dpa_rwsem);
+	if (cxled->cxld.region) {
+		dev_dbg(dev, "EBUSY, decoder attached to %s\n",
+			     dev_name(&cxled->cxld.region->dev));
+		rc = -EBUSY;
 		goto out;
 	}
 
+	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
+		dev_dbg(dev, "EBUSY, decoder enabled\n");
+		rc = -EBUSY;
+		goto out;
+	}
+
+	avail = cxl_dpa_freespace(cxled, &start, &skip);
+
 	if (size > avail) {
 		dev_dbg(dev, "%pa exceeds available %s capacity: %pa\n", &size,
-			cxl_decoder_mode_name(cxled->mode), &avail);
+			     cxled->mode == CXL_DECODER_RAM ? "ram" : "pmem",
+			     &avail);
 		rc = -ENOSPC;
 		goto out;
 	}
@@ -550,6 +570,99 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
 	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
 }
 
+static int find_free_decoder(struct device *dev, void *data)
+{
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_port *port;
+
+	if (!is_endpoint_decoder(dev))
+		return 0;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	port = cxled_to_port(cxled);
+
+	if (cxled->cxld.id != port->hdm_end + 1) {
+		return 0;
+	}
+	return 1;
+}
+
+/**
+ * cxl_request_dpa - search and reserve DPA given input constraints
+ * @endpoint: an endpoint port with available decoders
+ * @is_ram: DPA operation mode (true for ram, false for pmem)
+ * @min: the minimum amount of capacity the call needs
+ * @max: extra capacity to allocate after min is satisfied
+ *
+ * Given that a region needs to allocate from limited HPA capacity it
+ * may be the case that a device has more mappable DPA capacity than
+ * available HPA. So, the expectation is that @min is a driver known
+ * value for how much capacity is needed, and @max is based on the limit of
+ * how much HPA space is available for a new region.
+ *
+ * Returns a pinned cxl_decoder with at least @min bytes of capacity
+ * reserved, or an error pointer. The caller is also expected to own the
+ * lifetime of the memdev registration associated with the endpoint to
+ * pin the decoder registered as well.
+ */
+struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
+					     bool is_ram,
+					     resource_size_t min,
+					     resource_size_t max)
+{
+	struct cxl_endpoint_decoder *cxled;
+	enum cxl_decoder_mode mode;
+	struct device *cxled_dev;
+	resource_size_t alloc;
+	int rc;
+
+	if (!IS_ALIGNED(min | max, SZ_256M))
+		return ERR_PTR(-EINVAL);
+
+	down_read(&cxl_dpa_rwsem);
+
+	cxled_dev = device_find_child(&endpoint->dev, NULL, find_free_decoder);
+	if (!cxled_dev)
+		cxled = ERR_PTR(-ENXIO);
+	else
+		cxled = to_cxl_endpoint_decoder(cxled_dev);
+
+	up_read(&cxl_dpa_rwsem);
+
+	if (IS_ERR(cxled))
+		return cxled;
+
+	if (is_ram)
+		mode = CXL_DECODER_RAM;
+	else
+		mode = CXL_DECODER_PMEM;
+
+	rc = cxl_dpa_set_mode(cxled, mode);
+	if (rc)
+		goto err;
+
+	down_read(&cxl_dpa_rwsem);
+	alloc = cxl_dpa_freespace(cxled, NULL, NULL);
+	up_read(&cxl_dpa_rwsem);
+
+	if (max)
+		alloc = min(max, alloc);
+	if (alloc < min) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	rc = cxl_dpa_alloc(cxled, alloc);
+	if (rc)
+		goto err;
+
+	return cxled;
+err:
+	put_device(cxled_dev);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_request_dpa, CXL);
+
 static void cxld_set_interleave(struct cxl_decoder *cxld, u32 *ctrl)
 {
 	u16 eig;
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index cb3f74d30852..9cfe29002d98 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -901,6 +901,8 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
 
 	efx_fini_io(efx);
 
+	efx_cxl_exit(efx);
+
 	pci_dbg(efx->pci_dev, "shutdown successful\n");
 
 	efx_fini_devlink_and_unlock(efx);
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 6d49571ccff7..b5626d724b52 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -84,12 +84,28 @@ void efx_cxl_init(struct efx_nic *efx)
 		goto out;
 	}
 
-	if (max < EFX_CTPIO_BUFFER_SIZE)
+	if (max < EFX_CTPIO_BUFFER_SIZE) {
 		pci_info(pci_dev, "CXL accel not enough free HPA space %llu < %u\n",
 				  max, EFX_CTPIO_BUFFER_SIZE);
+		goto out;
+	}
+
+	cxl->cxled = cxl_request_dpa(cxl->endpoint, true, EFX_CTPIO_BUFFER_SIZE,
+				     EFX_CTPIO_BUFFER_SIZE);
+	if (IS_ERR(cxl->cxled))
+		pci_info(pci_dev, "CXL accel request DPA failed");
 out:
 	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
 }
 
+void efx_cxl_exit(struct efx_nic *efx)
+{
+	struct efx_cxl *cxl = efx->cxl;
+
+	if (cxl->cxled)
+		cxl_dpa_free(cxl->cxled);
+ 
+ 	return;
+ }
 
 MODULE_IMPORT_NS(CXL);
diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
index 76c6794c20d8..59d5217a684c 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.h
+++ b/drivers/net/ethernet/sfc/efx_cxl.h
@@ -26,4 +26,5 @@ struct efx_cxl {
 };
 
 void efx_cxl_init(struct efx_nic *efx);
+void efx_cxl_exit(struct efx_nic *efx);
 #endif
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index f3e77688ffe0..d4ecb5bb4fc8 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -2,6 +2,7 @@
 /* Copyright(c) 2024 Advanced Micro Devices, Inc. */
 
 #include <linux/cdev.h>
+#include <linux/pci.h>
 
 #ifndef __CXL_ACCEL_MEM_H
 #define __CXL_ACCEL_MEM_H
@@ -41,4 +42,10 @@ struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
 					       int interleave_ways,
 					       unsigned long flags,
 					       resource_size_t *max);
+
+struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
+					     bool is_ram,
+					     resource_size_t min,
+					     resource_size_t max);
+int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 114+ messages in thread
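
A minimal userspace sketch, not part of the posted series, of how
cxl_request_dpa() in patch 10/15 balances the free capacity against @min
and @max. The names pick_dpa_size(), res_size_t and SKETCH_ALIGN are
hypothetical stand-ins; only the order of the checks follows the posted
code, and the free-space scan itself is assumed to have run already.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's resource_size_t. */
typedef uint64_t res_size_t;

/* 256 MiB, mirroring the SZ_256M alignment check in the patch. */
#define SKETCH_ALIGN (256ULL << 20)

/*
 * Decide how much capacity to reserve given the free space found.
 * Returns 0 and stores the amount in *alloc, or a negative errno.
 */
static int pick_dpa_size(res_size_t avail, res_size_t min, res_size_t max,
			 res_size_t *alloc)
{
	res_size_t want = avail;

	/* Both bounds must be multiples of the alignment. */
	if ((min | max) % SKETCH_ALIGN)
		return -EINVAL;

	/* A non-zero max caps the request; zero means "take what is free". */
	if (max && max < want)
		want = max;

	/* Fail if even the minimum cannot be satisfied. */
	if (want < min)
		return -ENOMEM;

	*alloc = want;
	return 0;
}
```

With 1 GiB free, min = 256 MiB and max = 512 MiB, the sketch reserves
512 MiB; with max = 0 it would reserve the full 1 GiB.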

* [PATCH v2 11/15] cxl: make region type based on endpoint type
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (9 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 10/15] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-16  7:14   ` Li, Ming4
  2024-07-15 17:28 ` [PATCH v2 12/15] cxl: allow region creation by type2 drivers alejandro.lucero-palau
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Current code is expecting Type3 or CXL_DECODER_HOSTONLYMEM devices only.
Support for Type2 implies region type needs to be based on the endpoint
type instead.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/region.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index ca464bfef77b..5cc71b8868bc 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2645,7 +2645,8 @@ static ssize_t create_ram_region_show(struct device *dev,
 }
 
 static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
-					  enum cxl_decoder_mode mode, int id)
+					  enum cxl_decoder_mode mode, int id,
+					  enum cxl_decoder_type target_type)
 {
 	int rc;
 
@@ -2667,7 +2668,7 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
 		return ERR_PTR(-EBUSY);
 	}
 
-	return devm_cxl_add_region(cxlrd, id, mode, CXL_DECODER_HOSTONLYMEM);
+	return devm_cxl_add_region(cxlrd, id, mode, target_type);
 }
 
 static ssize_t create_pmem_region_store(struct device *dev,
@@ -2682,7 +2683,8 @@ static ssize_t create_pmem_region_store(struct device *dev,
 	if (rc != 1)
 		return -EINVAL;
 
-	cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id);
+	cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id,
+			       CXL_DECODER_HOSTONLYMEM);
 	if (IS_ERR(cxlr))
 		return PTR_ERR(cxlr);
 
@@ -2702,7 +2704,8 @@ static ssize_t create_ram_region_store(struct device *dev,
 	if (rc != 1)
 		return -EINVAL;
 
-	cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id);
+	cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id,
+			       CXL_DECODER_HOSTONLYMEM);
 	if (IS_ERR(cxlr))
 		return PTR_ERR(cxlr);
 
@@ -3364,7 +3367,8 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 
 	do {
 		cxlr = __create_region(cxlrd, cxled->mode,
-				       atomic_read(&cxlrd->region_id));
+				       atomic_read(&cxlrd->region_id),
+				       cxled->cxld.target_type);
 	} while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
 
 	if (IS_ERR(cxlr)) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 12/15] cxl: allow region creation by type2 drivers
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (10 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 11/15] cxl: make region type based on endpoint type alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-08-04 18:29   ` Jonathan Cameron
  2024-08-22 13:12   ` Zhi Wang
  2024-07-15 17:28 ` [PATCH v2 13/15] cxl: preclude device memory to be used for dax alejandro.lucero-palau
                   ` (2 subsequent siblings)
  14 siblings, 2 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Creating a CXL region requires userspace intervention through the cxl
sysfs files. Type2 support should allow accelerator drivers to create
such a CXL region from kernel code.

Add that functionality and integrate it with the current support for
memory expanders.

Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m84598b534cc5664f5bb31521ba6e41c7bc213758
Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c          | 265 ++++++++++++++++++++++-------
 drivers/cxl/cxl.h                  |   1 +
 drivers/cxl/cxlmem.h               |   4 +-
 drivers/net/ethernet/sfc/efx_cxl.c |  15 +-
 include/linux/cxl_accel_mem.h      |   5 +
 5 files changed, 231 insertions(+), 59 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5cc71b8868bc..697c8df83a4b 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -479,22 +479,14 @@ static ssize_t interleave_ways_show(struct device *dev,
 
 static const struct attribute_group *get_cxl_region_target_group(void);
 
-static ssize_t interleave_ways_store(struct device *dev,
-				     struct device_attribute *attr,
-				     const char *buf, size_t len)
+static int set_interleave_ways(struct cxl_region *cxlr, int val)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
 	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
-	struct cxl_region *cxlr = to_cxl_region(dev);
 	struct cxl_region_params *p = &cxlr->params;
-	unsigned int val, save;
-	int rc;
+	int save, rc;
 	u8 iw;
 
-	rc = kstrtouint(buf, 0, &val);
-	if (rc)
-		return rc;
-
 	rc = ways_to_eiw(val, &iw);
 	if (rc)
 		return rc;
@@ -509,25 +501,42 @@ static ssize_t interleave_ways_store(struct device *dev,
 		return -EINVAL;
 	}
 
-	rc = down_write_killable(&cxl_region_rwsem);
-	if (rc)
-		return rc;
-	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
-		rc = -EBUSY;
-		goto out;
-	}
+	lockdep_assert_held_write(&cxl_region_rwsem);
+	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
+		return -EBUSY;
 
 	save = p->interleave_ways;
 	p->interleave_ways = val;
 	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
 	if (rc)
 		p->interleave_ways = save;
-out:
+
+	return rc;
+}
+
+static ssize_t interleave_ways_store(struct device *dev,
+				     struct device_attribute *attr,
+				     const char *buf, size_t len)
+{
+	struct cxl_region *cxlr = to_cxl_region(dev);
+	unsigned int val;
+	int rc;
+
+	rc = kstrtouint(buf, 0, &val);
+	if (rc)
+		return rc;
+
+	rc = down_write_killable(&cxl_region_rwsem);
+	if (rc)
+		return rc;
+
+	rc = set_interleave_ways(cxlr, val);
 	up_write(&cxl_region_rwsem);
 	if (rc)
 		return rc;
 	return len;
 }
+
 static DEVICE_ATTR_RW(interleave_ways);
 
 static ssize_t interleave_granularity_show(struct device *dev,
@@ -547,21 +556,14 @@ static ssize_t interleave_granularity_show(struct device *dev,
 	return rc;
 }
 
-static ssize_t interleave_granularity_store(struct device *dev,
-					    struct device_attribute *attr,
-					    const char *buf, size_t len)
+static int set_interleave_granularity(struct cxl_region *cxlr, int val)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
 	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
-	struct cxl_region *cxlr = to_cxl_region(dev);
 	struct cxl_region_params *p = &cxlr->params;
-	int rc, val;
+	int rc;
 	u16 ig;
 
-	rc = kstrtoint(buf, 0, &val);
-	if (rc)
-		return rc;
-
 	rc = granularity_to_eig(val, &ig);
 	if (rc)
 		return rc;
@@ -577,21 +579,36 @@ static ssize_t interleave_granularity_store(struct device *dev,
 	if (cxld->interleave_ways > 1 && val != cxld->interleave_granularity)
 		return -EINVAL;
 
+	lockdep_assert_held_write(&cxl_region_rwsem);
+	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
+		return -EBUSY;
+
+	p->interleave_granularity = val;
+	return 0;
+}
+
+static ssize_t interleave_granularity_store(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t len)
+{
+	struct cxl_region *cxlr = to_cxl_region(dev);
+	int rc, val;
+
+	rc = kstrtoint(buf, 0, &val);
+	if (rc)
+		return rc;
+
 	rc = down_write_killable(&cxl_region_rwsem);
 	if (rc)
 		return rc;
-	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
-		rc = -EBUSY;
-		goto out;
-	}
 
-	p->interleave_granularity = val;
-out:
+	rc = set_interleave_granularity(cxlr, val);
 	up_write(&cxl_region_rwsem);
 	if (rc)
 		return rc;
 	return len;
 }
+
 static DEVICE_ATTR_RW(interleave_granularity);
 
 static ssize_t resource_show(struct device *dev, struct device_attribute *attr,
@@ -2193,7 +2210,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 	return 0;
 }
 
-static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
+int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
 {
 	struct cxl_port *iter, *ep_port = cxled_to_port(cxled);
 	struct cxl_region *cxlr = cxled->cxld.region;
@@ -2252,6 +2269,7 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
 	put_device(&cxlr->dev);
 	return rc;
 }
+EXPORT_SYMBOL_NS_GPL(cxl_region_detach, CXL);
 
 void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
 {
@@ -2746,6 +2764,14 @@ cxl_find_region_by_name(struct cxl_root_decoder *cxlrd, const char *name)
 	return to_cxl_region(region_dev);
 }
 
+static void drop_region(struct cxl_region *cxlr)
+{
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_port *port = cxlrd_to_port(cxlrd);
+
+	devm_release_action(port->uport_dev, unregister_region, cxlr);
+}
+
 static ssize_t delete_region_store(struct device *dev,
 				   struct device_attribute *attr,
 				   const char *buf, size_t len)
@@ -3353,17 +3379,18 @@ static int match_region_by_range(struct device *dev, void *data)
 	return rc;
 }
 
-/* Establish an empty region covering the given HPA range */
-static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
-					   struct cxl_endpoint_decoder *cxled)
+static void construct_region_end(void)
+{
+	up_write(&cxl_region_rwsem);
+}
+
+static struct cxl_region *construct_region_begin(struct cxl_root_decoder *cxlrd,
+						 struct cxl_endpoint_decoder *cxled)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
-	struct cxl_port *port = cxlrd_to_port(cxlrd);
-	struct range *hpa = &cxled->cxld.hpa_range;
 	struct cxl_region_params *p;
 	struct cxl_region *cxlr;
-	struct resource *res;
-	int rc;
+	int err = 0;
 
 	do {
 		cxlr = __create_region(cxlrd, cxled->mode,
@@ -3372,8 +3399,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 	} while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
 
 	if (IS_ERR(cxlr)) {
-		dev_err(cxlmd->dev.parent,
-			"%s:%s: %s failed assign region: %ld\n",
+		dev_err(cxlmd->dev.parent, "%s:%s: %s failed assign region: %ld\n",
 			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
 			__func__, PTR_ERR(cxlr));
 		return cxlr;
@@ -3383,23 +3409,47 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 	p = &cxlr->params;
 	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
 		dev_err(cxlmd->dev.parent,
-			"%s:%s: %s autodiscovery interrupted\n",
+			"%s:%s: %s region setup interrupted\n",
 			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
 			__func__);
-		rc = -EBUSY;
-		goto err;
+		err = -EBUSY;
+	}
+
+	if (err) {
+		construct_region_end();
+		drop_region(cxlr);
+		return ERR_PTR(err);
 	}
+	return cxlr;
+}
+
+
+/* Establish an empty region covering the given HPA range */
+static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
+					   struct cxl_endpoint_decoder *cxled)
+{
+	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+	struct range *hpa = &cxled->cxld.hpa_range;
+	struct cxl_region_params *p;
+	struct cxl_region *cxlr;
+	struct resource *res;
+	int rc;
+
+	cxlr = construct_region_begin(cxlrd, cxled);
+	if (IS_ERR(cxlr))
+		return cxlr;
 
 	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
 
 	res = kmalloc(sizeof(*res), GFP_KERNEL);
 	if (!res) {
 		rc = -ENOMEM;
-		goto err;
+		goto out;
 	}
 
 	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
 				    dev_name(&cxlr->dev));
+
 	rc = insert_resource(cxlrd->res, res);
 	if (rc) {
 		/*
@@ -3412,6 +3462,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 			 __func__, dev_name(&cxlr->dev));
 	}
 
+	p = &cxlr->params;
 	p->res = res;
 	p->interleave_ways = cxled->cxld.interleave_ways;
 	p->interleave_granularity = cxled->cxld.interleave_granularity;
@@ -3419,24 +3470,124 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 
 	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
 	if (rc)
-		goto err;
+		goto out;
 
 	dev_dbg(cxlmd->dev.parent, "%s:%s: %s %s res: %pr iw: %d ig: %d\n",
-		dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), __func__,
-		dev_name(&cxlr->dev), p->res, p->interleave_ways,
-		p->interleave_granularity);
+				   dev_name(&cxlmd->dev),
+				   dev_name(&cxled->cxld.dev), __func__,
+				   dev_name(&cxlr->dev), p->res,
+				   p->interleave_ways,
+				   p->interleave_granularity);
 
 	/* ...to match put_device() in cxl_add_to_region() */
 	get_device(&cxlr->dev);
 	up_write(&cxl_region_rwsem);
+out:
+	construct_region_end();
+	if (rc) {
+		drop_region(cxlr);
+		return ERR_PTR(rc);
+	}
+	return cxlr;
+}
+
+static struct cxl_region *
+__construct_new_region(struct cxl_root_decoder *cxlrd,
+		       struct cxl_endpoint_decoder **cxled, int ways)
+{
+	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
+	struct cxl_region_params *p;
+	resource_size_t size = 0;
+	struct cxl_region *cxlr;
+	int rc, i;
+
+	/* If interleaving is not supported, why does ways need to be at least 1? */
+	if (ways < 1)
+		return ERR_PTR(-EINVAL);
+
+	cxlr = construct_region_begin(cxlrd, cxled[0]);
+	if (IS_ERR(cxlr))
+		return cxlr;
+
+	rc = set_interleave_ways(cxlr, ways);
+	if (rc)
+		goto out;
+
+	rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
+	if (rc)
+		goto out;
+
+	down_read(&cxl_dpa_rwsem);
+	for (i = 0; i < ways; i++) {
+		if (!cxled[i]->dpa_res)
+			break;
+		size += resource_size(cxled[i]->dpa_res);
+	}
+	up_read(&cxl_dpa_rwsem);
+
+	if (i < ways)
+		goto out;
+
+	rc = alloc_hpa(cxlr, size);
+	if (rc)
+		goto out;
+
+	down_read(&cxl_dpa_rwsem);
+	for (i = 0; i < ways; i++) {
+		rc = cxl_region_attach(cxlr, cxled[i], i);
+		if (rc)
+			break;
+	}
+	up_read(&cxl_dpa_rwsem);
+
+	if (rc)
+		goto out;
+
+	rc = cxl_region_decode_commit(cxlr);
+	if (rc)
+		goto out;
 
+	p = &cxlr->params;
+	p->state = CXL_CONFIG_COMMIT;
+out:
+	construct_region_end();
+	if (rc) {
+		drop_region(cxlr);
+		return ERR_PTR(rc);
+	}
 	return cxlr;
+}
 
-err:
-	up_write(&cxl_region_rwsem);
-	devm_release_action(port->uport_dev, unregister_region, cxlr);
-	return ERR_PTR(rc);
+/**
+ * cxl_create_region - Establish a region given an array of endpoint decoders
+ * @cxlrd: root decoder to allocate HPA
+ * @cxled: array of endpoint decoders with reserved DPA capacity
+ * @ways: size of @cxled array
+ *
+ * Returns a fully formed region in the commit state and attached to the
+ * cxl_region driver.
+ */
+struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
+				     struct cxl_endpoint_decoder **cxled,
+				     int ways)
+{
+	struct cxl_region *cxlr;
+
+	mutex_lock(&cxlrd->range_lock);
+	cxlr = __construct_new_region(cxlrd, cxled, ways);
+	mutex_unlock(&cxlrd->range_lock);
+
+	if (IS_ERR(cxlr))
+		return cxlr;
+
+	if (device_attach(&cxlr->dev) <= 0) {
+		dev_err(&cxlr->dev, "failed to create region\n");
+		drop_region(cxlr);
+		return ERR_PTR(-ENODEV);
+	}
+	return cxlr;
 }
+EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
 
 int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index d3fdd2c1e066..1bf3b74ff959 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -905,6 +905,7 @@ void cxl_coordinates_combine(struct access_coordinate *out,
 
 bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
 
+int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index a0e0795ec064..377bb3cd2d47 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -881,5 +881,7 @@ struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
 					       int interleave_ways,
 					       unsigned long flags,
 					       resource_size_t *max);
-
+struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
+				     struct cxl_endpoint_decoder **cxled,
+				     int ways);
 #endif /* __CXL_MEM_H__ */
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index b5626d724b52..4012e3faa298 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -92,8 +92,18 @@ void efx_cxl_init(struct efx_nic *efx)
 
 	cxl->cxled = cxl_request_dpa(cxl->endpoint, true, EFX_CTPIO_BUFFER_SIZE,
 				     EFX_CTPIO_BUFFER_SIZE);
-	if (IS_ERR(cxl->cxled))
+	if (IS_ERR(cxl->cxled)) {
 		pci_info(pci_dev, "CXL accel request DPA failed");
+		return;
+	}
+
+	cxl->efx_region = cxl_create_region(cxl->cxlrd, &cxl->cxled, 1);
+	if (IS_ERR(cxl->efx_region)) {
+		pci_info(pci_dev, "CXL accel create region failed");
+		cxl_dpa_free(cxl->cxled);
+		return;
+	}
+
 out:
 	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
 }
@@ -102,6 +112,9 @@ void efx_cxl_exit(struct efx_nic *efx)
 {
 	struct efx_cxl *cxl = efx->cxl;
 
+	if (cxl->efx_region)
+		cxl_region_detach(cxl->cxled);
+
 	if (cxl->cxled)
 		cxl_dpa_free(cxl->cxled);
  
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index d4ecb5bb4fc8..a5f9ffc24509 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -48,4 +48,9 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
 					     resource_size_t min,
 					     resource_size_t max);
 int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
+struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
+				     struct cxl_endpoint_decoder **cxled,
+				     int ways);
+
+int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 114+ messages in thread
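
A minimal userspace sketch, not part of the posted series, of the
error-pointer convention that cxl_create_region() in patch 12/15 follows:
failure is reported as an errno encoded in the pointer value, never as
NULL, so a caller has to test with IS_ERR() rather than compare against
NULL. The sketch_* helpers below are hypothetical re-implementations
modelled on the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); the region
type and constructor are placeholders.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that is encoded in a pointer (as in the kernel). */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno as a pointer value. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the negative errno from an error pointer. */
static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True when the pointer lies in the top SKETCH_MAX_ERRNO addresses. */
static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Placeholder for an opaque region object. */
struct sketch_region {
	int id;
};

/* Hypothetical constructor: rejects ways < 1 with an error pointer. */
static struct sketch_region *sketch_create_region(int ways)
{
	static struct sketch_region region = { .id = 0 };

	if (ways < 1)
		return sketch_err_ptr(-EINVAL);
	return &region;
}
```

A failed call returns a non-NULL pointer here, so a NULL test would treat
it as success; sketch_is_err() catches it and sketch_ptr_err() yields the
errno to report.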

* [PATCH v2 13/15] cxl: preclude device memory to be used for dax
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (11 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 12/15] cxl: allow region creation by type2 drivers alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-07-15 17:28 ` [PATCH v2 14/15] cxl: add function for obtaining params from a region alejandro.lucero-palau
  2024-07-15 17:28 ` [PATCH v2 15/15] efx: support pio mapping based on cxl alejandro.lucero-palau
  14 siblings, 0 replies; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero, Alejandro Lucero

From: Alejandro Lucero <alucero@os3sl.com>

By definition a type2 cxl device will use the host managed memory for
specific functionality, therefore it should not be available to other
uses.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/region.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 697c8df83a4b..c8fc14ac437e 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3704,6 +3704,9 @@ static int cxl_region_probe(struct device *dev)
 	case CXL_DECODER_PMEM:
 		return devm_cxl_add_pmem_region(cxlr);
 	case CXL_DECODER_RAM:
+		if (cxlr->type != CXL_DECODER_HOSTONLYMEM)
+			return 0;
+
 		/*
 		 * The region can not be manged by CXL if any portion of
 		 * it is already online as 'System RAM'
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 14/15] cxl: add function for obtaining params from a region
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (12 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 13/15] cxl: preclude device memory to be used for dax alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-08-09 15:24   ` Zhi Wang
  2024-07-15 17:28 ` [PATCH v2 15/15] efx: support pio mapping based on cxl alejandro.lucero-palau
  14 siblings, 1 reply; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

A CXL region struct contains the physical address range to work with.

Add a function that, given an opaque CXL region struct, returns the
params to be used for mapping that memory range.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/region.c     | 16 ++++++++++++++++
 drivers/cxl/cxl.h             |  3 +++
 include/linux/cxl_accel_mem.h |  2 ++
 3 files changed, 21 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c8fc14ac437e..9ff10923e9fc 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3345,6 +3345,22 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
 	return rc;
 }
 
+int cxl_accel_get_region_params(struct cxl_region *region,
+				resource_size_t *start, resource_size_t *end)
+{
+	if (!region)
+		return -ENODEV;
+
+	if (!region->params.res) {
+		return -ENODEV;
+	}
+	*start = region->params.res->start;
+	*end = region->params.res->end;
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_get_region_params, CXL);
+
 static int match_root_decoder_by_range(struct device *dev, void *data)
 {
 	struct range *r1, *r2 = data;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 1bf3b74ff959..b4c4c4455ef1 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -906,6 +906,9 @@ void cxl_coordinates_combine(struct access_coordinate *out,
 bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
 
 int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
+
+int cxl_accel_get_region_params(struct cxl_region *region,
+				resource_size_t *start, resource_size_t *end);
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.
diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
index a5f9ffc24509..5d715eea6e91 100644
--- a/include/linux/cxl_accel_mem.h
+++ b/include/linux/cxl_accel_mem.h
@@ -53,4 +53,6 @@ struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
 				     int ways);
 
 int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
+int cxl_accel_get_region_params(struct cxl_region *region,
+				resource_size_t *start, resource_size_t *end);
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 114+ messages in thread
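
A minimal userspace sketch, not part of the posted series, of how a
caller might turn the start and end values returned by an accessor like
cxl_accel_get_region_params() in patch 14/15 into a mapping length. It
assumes the end address is inclusive, as it is for a kernel struct
resource. The sketch_* types and functions are hypothetical stand-ins.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's resource_size_t. */
typedef uint64_t res_size_t;

/* Address range with an inclusive end, like struct resource. */
struct sketch_res {
	res_size_t start;
	res_size_t end;
};

/* Placeholder for an opaque region that may not have a range yet. */
struct sketch_region {
	const struct sketch_res *res;
};

/* Copy out the range, or fail when the region has none assigned. */
static int sketch_get_region_params(const struct sketch_region *region,
				    res_size_t *start, res_size_t *end)
{
	if (!region || !region->res)
		return -ENODEV;

	*start = region->res->start;
	*end = region->res->end;
	return 0;
}

/* Number of bytes covered by an inclusive [start, end] range. */
static res_size_t sketch_range_len(res_size_t start, res_size_t end)
{
	return end - start + 1;
}
```

Checking the return value before using start and end avoids computing a
length from uninitialised values when the region has no range assigned.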

* [PATCH v2 15/15] efx: support pio mapping based on cxl
  2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
                   ` (13 preceding siblings ...)
  2024-07-15 17:28 ` [PATCH v2 14/15] cxl: add function for obtaining params from a region alejandro.lucero-palau
@ 2024-07-15 17:28 ` alejandro.lucero-palau
  2024-08-04 18:13   ` Jonathan Cameron
  14 siblings, 1 reply; 114+ messages in thread
From: alejandro.lucero-palau @ 2024-07-15 17:28 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

With a device supporting CXL and successfully initialised, use the cxl
region to map the memory range and use this mapping for PIO buffers.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/net/ethernet/sfc/ef10.c      | 25 +++++++++++++++++++++----
 drivers/net/ethernet/sfc/efx_cxl.c   | 12 +++++++++++-
 drivers/net/ethernet/sfc/mcdi_pcol.h |  3 +++
 drivers/net/ethernet/sfc/nic.h       |  1 +
 4 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 8fa6c0e9195b..3924076d2628 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -24,6 +24,7 @@
 #include <linux/wait.h>
 #include <linux/workqueue.h>
 #include <net/udp_tunnel.h>
+#include "efx_cxl.h"
 
 /* Hardware control for EF10 architecture including 'Huntington'. */
 
@@ -177,6 +178,12 @@ static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
 			  efx->num_mac_stats);
 	}
 
+	if (outlen < MC_CMD_GET_CAPABILITIES_V7_OUT_LEN)
+		nic_data->datapath_caps3 = 0;
+	else
+		nic_data->datapath_caps3 = MCDI_DWORD(outbuf,
+						      GET_CAPABILITIES_V7_OUT_FLAGS3);
+
 	return 0;
 }
 
@@ -1275,10 +1282,20 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
 			return -ENOMEM;
 		}
 		nic_data->pio_write_vi_base = pio_write_vi_base;
-		nic_data->pio_write_base =
-			nic_data->wc_membase +
-			(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
-			 uc_mem_map_size);
+
+		if ((nic_data->datapath_caps3 &
+		     (1 << MC_CMD_GET_CAPABILITIES_V10_OUT_CXL_CONFIG_ENABLE_LBN)) &&
+		    efx->cxl->ctpio_cxl) {
+			nic_data->pio_write_base =
+				efx->cxl->ctpio_cxl +
+				(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
+				 uc_mem_map_size);
+		} else {
+			nic_data->pio_write_base =
+				nic_data->wc_membase +
+				(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
+				 uc_mem_map_size);
+		}
 
 		rc = efx_ef10_link_piobufs(efx);
 		if (rc)
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 4012e3faa298..8e65ef42a572 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -21,8 +21,8 @@
 void efx_cxl_init(struct efx_nic *efx)
 {
 	struct pci_dev *pci_dev = efx->pci_dev;
+	resource_size_t start, end, max = 0;
 	struct efx_cxl *cxl = efx->cxl;
-	resource_size_t max = 0;
 	struct resource res;
 	u16 dvsec;
 
@@ -104,6 +104,13 @@ void efx_cxl_init(struct efx_nic *efx)
 		return;
 	}
 
+	cxl_accel_get_region_params(cxl->efx_region, &start, &end);
+
+	cxl->ctpio_cxl = ioremap(start, end - start);
+	if (!cxl->ctpio_cxl) {
+		pci_info(pci_dev, "CXL accel region ioremap failed");
+		cxl_dpa_free(cxl->cxled);
+	}
 out:
 	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
 }
@@ -112,6 +119,9 @@ void efx_cxl_exit(struct efx_nic *efx)
 {
 	struct efx_cxl *cxl = efx->cxl;
 
+	if (cxl->ctpio_cxl)
+		iounmap(cxl->ctpio_cxl);
+
 	if (cxl->efx_region)
 		cxl_region_detach(cxl->cxled);
 
diff --git a/drivers/net/ethernet/sfc/mcdi_pcol.h b/drivers/net/ethernet/sfc/mcdi_pcol.h
index cd297e19cddc..05fd5e021142 100644
--- a/drivers/net/ethernet/sfc/mcdi_pcol.h
+++ b/drivers/net/ethernet/sfc/mcdi_pcol.h
@@ -18374,6 +18374,9 @@
 #define        MC_CMD_GET_CAPABILITIES_V10_OUT_DYNAMIC_MPORT_JOURNAL_OFST 148
 #define        MC_CMD_GET_CAPABILITIES_V10_OUT_DYNAMIC_MPORT_JOURNAL_LBN 14
 #define        MC_CMD_GET_CAPABILITIES_V10_OUT_DYNAMIC_MPORT_JOURNAL_WIDTH 1
+#define        MC_CMD_GET_CAPABILITIES_V10_OUT_CXL_CONFIG_ENABLE_OFST 148
+#define        MC_CMD_GET_CAPABILITIES_V10_OUT_CXL_CONFIG_ENABLE_LBN 16
+#define        MC_CMD_GET_CAPABILITIES_V10_OUT_CXL_CONFIG_ENABLE_WIDTH 1
 /* These bits are reserved for communicating test-specific capabilities to
  * host-side test software. All production drivers should treat this field as
  * opaque.
diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 1db64fc6e909..cd635f4f7f94 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -186,6 +186,7 @@ struct efx_ef10_nic_data {
 	bool must_check_datapath_caps;
 	u32 datapath_caps;
 	u32 datapath_caps2;
+	u32 datapath_caps3;
 	unsigned int rx_dpcpu_fw_id;
 	unsigned int tx_dpcpu_fw_id;
 	bool must_probe_vswitching;
-- 
2.17.1


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
@ 2024-07-15 18:48   ` Andrew Lunn
  2024-07-16  8:50     ` Alejandro Lucero Palau
  2024-07-16  1:57   ` kernel test robot
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 114+ messages in thread
From: Andrew Lunn @ 2024-07-15 18:48 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

> +++ b/include/linux/cxl_accel_mem.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> +
> +#include <linux/cdev.h>

That is generally a red flag that something not good is about to be
found. But it does not appear to be used in this patch....

       Andrew

* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-07-15 17:28 ` [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration alejandro.lucero-palau
@ 2024-07-16  0:53   ` kernel test robot
  2024-07-16  6:06   ` Li, Ming4
  2024-08-04 17:57   ` Jonathan Cameron
  2 siblings, 0 replies; 114+ messages in thread
From: kernel test robot @ 2024-07-16  0:53 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: llvm, oe-kbuild-all, Alejandro Lucero

Hi,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.10 next-20240715]
[cannot apply to cxl/next cxl/pending horms-ipvs/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/alejandro-lucero-palau-amd-com/cxl-add-type2-device-basic-support/20240716-015920
base:   linus/master
patch link:    https://lore.kernel.org/r/20240715172835.24757-10-alejandro.lucero-palau%40amd.com
patch subject: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
config: i386-buildonly-randconfig-004-20240716 (https://download.01.org/0day-ci/archive/20240716/202407160818.7GrterxM-lkp@intel.com/config)
compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 617a15a9eac96088ae5e9134248d8236e34b91b1)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240716/202407160818.7GrterxM-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202407160818.7GrterxM-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from drivers/net/ethernet/sfc/efx_cxl.c:17:
   drivers/net/ethernet/sfc/efx_cxl.h:11:9: warning: 'EFX_CXL_H' is used as a header guard here, followed by #define of a different macro [-Wheader-guard]
      11 | #ifndef EFX_CXL_H
         |         ^~~~~~~~~
   drivers/net/ethernet/sfc/efx_cxl.h:12:9: note: 'EFX_CLX_H' is defined here; did you mean 'EFX_CXL_H'?
      12 | #define EFX_CLX_H
         |         ^~~~~~~~~
         |         EFX_CXL_H
>> drivers/net/ethernet/sfc/efx_cxl.c:89:7: warning: format specifies type 'unsigned long long' but the argument has type 'resource_size_t' (aka 'unsigned int') [-Wformat]
      88 |                 pci_info(pci_dev, "CXL accel not enough free HPA space %llu < %u\n",
         |                                                                        ~~~~
         |                                                                        %u
      89 |                                   max, EFX_CTPIO_BUFFER_SIZE);
         |                                   ^~~
   include/linux/pci.h:2683:67: note: expanded from macro 'pci_info'
    2683 | #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
         |                                                                ~~~    ^~~
   include/linux/dev_printk.h:160:67: note: expanded from macro 'dev_info'
     160 |         dev_printk_index_wrap(_dev_info, KERN_INFO, dev, dev_fmt(fmt), ##__VA_ARGS__)
         |                                                                  ~~~     ^~~~~~~~~~~
   include/linux/dev_printk.h:110:23: note: expanded from macro 'dev_printk_index_wrap'
     110 |                 _p_func(dev, fmt, ##__VA_ARGS__);                       \
         |                              ~~~    ^~~~~~~~~~~
   2 warnings generated.
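
The -Wformat warning above arises because resource_size_t is only 32 bits
wide on this i386 config, so %llu does not match the argument. In kernel
code the printk-formats documentation describes %pa, which takes a pointer
to a phys_addr_t/resource_size_t; a portable alternative is an explicit
cast to unsigned long long. What follows is a minimal user-space sketch of
the cast approach only. The typedef and format_hpa_msg() are illustrative
stand-ins and are not part of the patch, and the size macro is written with
unsigned literals so that it matches %u.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel typedef; 32 bits here, as on the i386 config above. */
typedef uint32_t resource_size_t;

#define EFX_CTPIO_BUFFER_SIZE	(1024U * 1024U * 256U)

/*
 * Hypothetical helper: formats the "not enough free HPA space" message.
 * The cast makes %llu correct whatever the width of resource_size_t.
 * In-kernel code could instead pass &max together with the %pa specifier.
 */
static int format_hpa_msg(char *buf, size_t len, resource_size_t max)
{
	return snprintf(buf, len,
			"CXL accel not enough free HPA space %llu < %u\n",
			(unsigned long long)max, EFX_CTPIO_BUFFER_SIZE);
}
```

The same call then builds without the warning whether resource_size_t is
32 or 64 bits wide.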


vim +89 drivers/net/ethernet/sfc/efx_cxl.c

    15	
    16	#include "net_driver.h"
  > 17	#include "efx_cxl.h"
    18	
    19	#define EFX_CTPIO_BUFFER_SIZE	(1024*1024*256)
    20	
    21	void efx_cxl_init(struct efx_nic *efx)
    22	{
    23		struct pci_dev *pci_dev = efx->pci_dev;
    24		struct efx_cxl *cxl = efx->cxl;
    25		resource_size_t max = 0;
    26		struct resource res;
    27		u16 dvsec;
    28	
    29		dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
    30						  CXL_DVSEC_PCIE_DEVICE);
    31	
    32		if (!dvsec)
    33			return;
    34	
    35		pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");
    36	
    37		cxl->cxlds = cxl_accel_state_create(&pci_dev->dev,
    38						    CXL_ACCEL_DRIVER_CAP_HDM);
    39		if (IS_ERR(cxl->cxlds)) {
    40			pci_info(pci_dev, "CXL accel device state failed");
    41			return;
    42		}
    43	
    44		cxl_accel_set_dvsec(cxl->cxlds, dvsec);
    45		cxl_accel_set_serial(cxl->cxlds, pci_dev->dev.id);
    46	
    47		res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
    48		cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_DPA);
    49	
    50		res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
    51		cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
    52	
    53		if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds)) {
    54			pci_info(pci_dev, "CXL accel setup regs failed");
    55			return;
    56		}
    57	
    58		if (cxl_accel_request_resource(cxl->cxlds, true))
    59			pci_info(pci_dev, "CXL accel resource request failed");
    60	
    61		if (!cxl_await_media_ready(cxl->cxlds)) {
    62			cxl_accel_set_media_ready(cxl->cxlds);
    63		} else {
    64			pci_info(pci_dev, "CXL accel media not active");
    65			return;
    66		}
    67	
    68		cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
    69		if (IS_ERR(cxl->cxlmd)) {
    70			pci_info(pci_dev, "CXL accel memdev creation failed");
    71			return;
    72		}
    73	
    74		cxl->endpoint = cxl_acquire_endpoint(cxl->cxlmd);
    75		if (IS_ERR(cxl->endpoint))
    76			pci_info(pci_dev, "CXL accel acquire endpoint failed");
    77	
    78		cxl->cxlrd = cxl_get_hpa_freespace(cxl->endpoint, 1,
    79						    CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
    80						    &max);
    81	
    82		if (IS_ERR(cxl->cxlrd)) {
    83			pci_info(pci_dev, "CXL accel get HPA failed");
    84			goto out;
    85		}
    86	
    87		if (max < EFX_CTPIO_BUFFER_SIZE)
    88			pci_info(pci_dev, "CXL accel not enough free HPA space %llu < %u\n",
  > 89					  max, EFX_CTPIO_BUFFER_SIZE);
    90	out:
    91		cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
    92	}
    93	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
  2024-07-15 18:48   ` Andrew Lunn
@ 2024-07-16  1:57   ` kernel test robot
  2024-07-18 23:12   ` Dave Jiang
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 114+ messages in thread
From: kernel test robot @ 2024-07-16  1:57 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: llvm, oe-kbuild-all, Alejandro Lucero

Hi,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on cxl/pending v6.10 next-20240715]
[cannot apply to cxl/next horms-ipvs/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/alejandro-lucero-palau-amd-com/cxl-add-type2-device-basic-support/20240716-015920
base:   linus/master
patch link:    https://lore.kernel.org/r/20240715172835.24757-2-alejandro.lucero-palau%40amd.com
patch subject: [PATCH v2 01/15] cxl: add type2 device basic support
config: s390-allmodconfig (https://download.01.org/0day-ci/archive/20240716/202407160957.L4mIOUtI-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project a0c6b8aef853eedaa0980f07c0a502a5a8a9740e)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240716/202407160957.L4mIOUtI-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202407160957.L4mIOUtI-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from drivers/net/ethernet/sfc/efx.c:8:
   In file included from include/linux/filter.h:9:
   In file included from include/linux/bpf.h:20:
   In file included from include/linux/module.h:19:
   In file included from include/linux/elf.h:6:
   In file included from arch/s390/include/asm/elf.h:173:
   In file included from arch/s390/include/asm/mmu_context.h:11:
   In file included from arch/s390/include/asm/pgalloc.h:18:
   In file included from include/linux/mm.h:2258:
   include/linux/vmstat.h:500:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     500 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     501 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:507:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     507 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     508 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:514:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     514 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:519:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     519 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     520 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:528:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     528 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     529 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/net/ethernet/sfc/efx.c:8:
   In file included from include/linux/filter.h:12:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:93:
   include/asm-generic/io.h:548:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     548 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:561:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     561 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
         |                                                           ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
     102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
         |                                                      ^
   In file included from drivers/net/ethernet/sfc/efx.c:8:
   In file included from include/linux/filter.h:12:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:93:
   include/asm-generic/io.h:574:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     574 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
         |                                                           ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
     115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
         |                                                      ^
   In file included from drivers/net/ethernet/sfc/efx.c:8:
   In file included from include/linux/filter.h:12:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:93:
   include/asm-generic/io.h:585:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     585 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:595:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     595 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:605:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     605 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:693:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     693 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:701:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     701 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:709:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     709 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:718:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     718 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:727:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     727 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:736:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     736 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   In file included from drivers/net/ethernet/sfc/efx.c:36:
>> drivers/net/ethernet/sfc/efx_cxl.h:11:9: warning: 'EFX_CXL_H' is used as a header guard here, followed by #define of a different macro [-Wheader-guard]
      11 | #ifndef EFX_CXL_H
         |         ^~~~~~~~~
   drivers/net/ethernet/sfc/efx_cxl.h:12:9: note: 'EFX_CLX_H' is defined here; did you mean 'EFX_CXL_H'?
      12 | #define EFX_CLX_H
         |         ^~~~~~~~~
         |         EFX_CXL_H
   18 warnings generated.
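
The -Wheader-guard warning above is triggered because the #ifndef tests
EFX_CXL_H while the #define creates EFX_CLX_H, so the guard never takes
effect. A minimal sketch of a consistent guard is shown here. The two
prototypes are taken from the function definitions quoted elsewhere in this
thread; the remaining contents of the real header are not reproduced.

```c
/* efx_cxl.h (sketch): the #ifndef and the #define must name the same macro. */
#ifndef EFX_CXL_H
#define EFX_CXL_H

struct efx_nic;

void efx_cxl_init(struct efx_nic *efx);
void efx_cxl_exit(struct efx_nic *efx);

#endif /* EFX_CXL_H */
```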


vim +/EFX_CXL_H +11 drivers/net/ethernet/sfc/efx_cxl.h

  > 11	#ifndef EFX_CXL_H
    12	#define EFX_CLX_H
    13	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH v2 10/15] cxl: define a driver interface for DPA allocation
  2024-07-15 17:28 ` [PATCH v2 10/15] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
@ 2024-07-16  3:32   ` kernel test robot
  2024-08-04 18:07   ` Jonathan Cameron
  2024-08-06 17:33   ` Fan Ni
  2 siblings, 0 replies; 114+ messages in thread
From: kernel test robot @ 2024-07-16  3:32 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: llvm, oe-kbuild-all, Alejandro Lucero

Hi,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.10 next-20240715]
[cannot apply to cxl/next cxl/pending horms-ipvs/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/alejandro-lucero-palau-amd-com/cxl-add-type2-device-basic-support/20240716-015920
base:   linus/master
patch link:    https://lore.kernel.org/r/20240715172835.24757-11-alejandro.lucero-palau%40amd.com
patch subject: [PATCH v2 10/15] cxl: define a driver interface for DPA allocation
config: s390-allmodconfig (https://download.01.org/0day-ci/archive/20240716/202407161159.KA2METLk-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project a0c6b8aef853eedaa0980f07c0a502a5a8a9740e)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240716/202407161159.KA2METLk-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202407161159.KA2METLk-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/cxl/core/hdm.c:612: warning: Function parameter or struct member 'is_ram' not described in 'cxl_request_dpa'
>> drivers/cxl/core/hdm.c:612: warning: Excess function parameter 'mode' description in 'cxl_request_dpa'
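
Both kernel-doc warnings above come from the same mismatch: the comment
documents @mode while the prototype takes is_ram. The sketch below shows a
comment whose @name lines match the parameter list. It is a hedged
illustration: dpa_mode_from_flag() is a hypothetical helper that mirrors
only the is_ram branch of cxl_request_dpa(), and the enum is a cut-down
stand-in for the kernel one.

```c
#include <stdbool.h>

/* Stand-in for the kernel enum; only the two values used in the listing. */
enum cxl_decoder_mode {
	CXL_DECODER_RAM,
	CXL_DECODER_PMEM,
};

/**
 * dpa_mode_from_flag - map the boolean argument to a decoder mode
 * @is_ram: true to request volatile (ram) capacity, false for pmem
 *
 * Hypothetical helper. Every parameter in the prototype has a matching
 * @name line and no extra @name lines exist, which is what the
 * kernel-doc check expects.
 */
static enum cxl_decoder_mode dpa_mode_from_flag(bool is_ram)
{
	return is_ram ? CXL_DECODER_RAM : CXL_DECODER_PMEM;
}
```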


vim +612 drivers/cxl/core/hdm.c

   589	
   590	/**
   591	 * cxl_request_dpa - search and reserve DPA given input constraints
   592	 * @endpoint: an endpoint port with available decoders
   593	 * @mode: DPA operation mode (ram vs pmem)
   594	 * @min: the minimum amount of capacity the call needs
   595	 * @max: extra capacity to allocate after min is satisfied
   596	 *
   597	 * Given that a region needs to allocate from limited HPA capacity it
   598	 * may be the case that a device has more mappable DPA capacity than
   599	 * available HPA. So, the expectation is that @min is a driver known
   600	 * value for how much capacity is needed, and @max is based the limit of
   601	 * how much HPA space is available for a new region.
   602	 *
   603	 * Returns a pinned cxl_decoder with at least @min bytes of capacity
   604	 * reserved, or an error pointer. The caller is also expected to own the
   605	 * lifetime of the memdev registration associated with the endpoint to
   606	 * pin the decoder registered as well.
   607	 */
   608	struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
   609						     bool is_ram,
   610						     resource_size_t min,
   611						     resource_size_t max)
 > 612	{
   613		struct cxl_endpoint_decoder *cxled;
   614		enum cxl_decoder_mode mode;
   615		struct device *cxled_dev;
   616		resource_size_t alloc;
   617		int rc;
   618	
   619		if (!IS_ALIGNED(min | max, SZ_256M))
   620			return ERR_PTR(-EINVAL);
   621	
   622		down_read(&cxl_dpa_rwsem);
   623	
   624		cxled_dev = device_find_child(&endpoint->dev, NULL, find_free_decoder);
   625		if (!cxled_dev)
   626			cxled = ERR_PTR(-ENXIO);
   627		else
   628			cxled = to_cxl_endpoint_decoder(cxled_dev);
   629	
   630		up_read(&cxl_dpa_rwsem);
   631	
   632		if (IS_ERR(cxled))
   633			return cxled;
   634	
   635		if (is_ram)
   636			mode = CXL_DECODER_RAM;
   637		else
   638			mode = CXL_DECODER_PMEM;
   639	
   640		rc = cxl_dpa_set_mode(cxled, mode);
   641		if (rc)
   642			goto err;
   643	
   644		down_read(&cxl_dpa_rwsem);
   645		alloc = cxl_dpa_freespace(cxled, NULL, NULL);
   646		up_read(&cxl_dpa_rwsem);
   647	
   648		if (max)
   649			alloc = min(max, alloc);
   650		if (alloc < min) {
   651			rc = -ENOMEM;
   652			goto err;
   653		}
   654	
   655		rc = cxl_dpa_alloc(cxled, alloc);
   656		if (rc)
   657			goto err;
   658	
   659		return cxled;
   660	err:
   661		put_device(cxled_dev);
   662		return ERR_PTR(rc);
   663	}
   664	EXPORT_SYMBOL_NS_GPL(cxl_request_dpa, CXL);
   665	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH v2 08/15] cxl: indicate probe deferral
  2024-07-15 17:28 ` [PATCH v2 08/15] cxl: indicate probe deferral alejandro.lucero-palau
@ 2024-07-16  5:52   ` Li, Ming4
  2024-07-16  8:10     ` Alejandro Lucero Palau
  2024-07-30 16:43   ` Fan Ni
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 114+ messages in thread
From: Li, Ming4 @ 2024-07-16  5:52 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: Alejandro Lucero

On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
>
> The first stop for a CXL accelerator driver that wants to establish new
> CXL.mem regions is to register a 'struct cxl_memdev'. That kicks off
> cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
> topology up to the root.
>
> If the root driver has not attached yet the expectation is that the
> driver waits until that link is established. The common cxl_pci_driver
> has reason to keep the 'struct cxl_memdev' device attached to the bus
> until the root driver attaches. An accelerator may want to instead defer
> probing until CXL resources can be acquired.
>
> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
> accelerator driver probing should be deferred vs failed. Provide that
> indication via a new cxl_acquire_endpoint() API that can retrieve the
> probe status of the memdev.
>
> The first consumer of this API is a test driver that exercises the CXL
> Type-2 flow.
>
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2
>
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c          | 41 ++++++++++++++++++++++++++++++
>  drivers/cxl/core/port.c            |  2 +-
>  drivers/cxl/mem.c                  |  7 +++--
>  drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
>  include/linux/cxl_accel_mem.h      |  3 +++
>  5 files changed, 59 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index b902948b121f..d51c8bfb32e3 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>  }
>  EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>  
> +/*
> + * Try to get a locked reference on a memdev's CXL port topology
> + * connection. Be careful to observe when cxl_mem_probe() has deposited
> + * a probe deferral awaiting the arrival of the CXL root driver
> + */
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_port *endpoint;
> +	int rc = -ENXIO;
> +
> +	device_lock(&cxlmd->dev);
> +	endpoint = cxlmd->endpoint;
> +	if (!endpoint)
> +		goto err;
> +
> +	if (IS_ERR(endpoint)) {
> +		rc = PTR_ERR(endpoint);
> +		goto err;
> +	}
> +
> +	device_lock(&endpoint->dev);
> +	if (!endpoint->dev.driver)
> +		goto err_endpoint;
> +
> +	return endpoint;
> +
> +err_endpoint:
> +	device_unlock(&endpoint->dev);
> +err:
> +	device_unlock(&cxlmd->dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
> +
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
> +{
> +	device_unlock(&endpoint->dev);
> +	device_unlock(&cxlmd->dev);
> +}
> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
> +
>  static void sanitize_teardown_notifier(void *data)
>  {
>  	struct cxl_memdev_state *mds = data;
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index d66c6349ed2d..3c6b896c5f65 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>  		 */
>  		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>  			dev_name(dport_dev));
> -		return -ENXIO;
> +		return -EPROBE_DEFER;
>  	}
>  
>  	parent_port = find_cxl_port(dparent, &parent_dport);
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index f76af75a87b7..383a6f4829d3 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
>  		return rc;
>  
>  	rc = devm_cxl_enumerate_ports(cxlmd);
> -	if (rc)
> +	if (rc) {
> +		cxlmd->endpoint = ERR_PTR(rc);
>  		return rc;
> +	}
>  
>  	parent_port = cxl_mem_find_port(cxlmd, &dport);
>  	if (!parent_port) {
>  		dev_err(dev, "CXL port topology not found\n");
> -		return -ENXIO;
> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
> +		return -EPROBE_DEFER;
>  	}
>  
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 0abe66490ef5..2cf4837ddfc1 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -65,8 +65,16 @@ void efx_cxl_init(struct efx_nic *efx)
>  	}
>  
>  	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
> -	if (IS_ERR(cxl->cxlmd))
> +	if (IS_ERR(cxl->cxlmd)) {
>  		pci_info(pci_dev, "CXL accel memdev creation failed");
> +		return;
> +	}
> +
> +	cxl->endpoint = cxl_acquire_endpoint(cxl->cxlmd);
> +	if (IS_ERR(cxl->endpoint))
> +		pci_info(pci_dev, "CXL accel acquire endpoint failed");
> +
> +	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);

There is no need to invoke cxl_release_endpoint() if cxl_acquire_endpoint() failed, right?
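
A minimal userspace sketch of the calling pattern in question, assuming mock types in place of the real structures. mock_memdev, mock_port and the mock_* helpers are invented for illustration and are not the CXL core API. It only shows that when the acquire step returns an error pointer no lock is left held, so the caller has nothing to release:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_ERRNO 4095

/* Userspace imitations of the kernel's ERR_PTR()/IS_ERR() helpers. */
static inline void *mock_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

/* Hypothetical stand-ins; "locked" counts how many times a lock is held. */
struct mock_port { int locked; };
struct mock_memdev { int locked; struct mock_port *endpoint; };

/* Same shape as the quoted acquire helper: on failure, every lock is dropped. */
static struct mock_port *mock_acquire_endpoint(struct mock_memdev *md)
{
	struct mock_port *ep;

	md->locked++;
	ep = md->endpoint;
	if (!ep || mock_is_err(ep)) {
		md->locked--;
		return ep ? ep : mock_err_ptr(-ENXIO);
	}
	ep->locked++;
	return ep;
}

static void mock_release_endpoint(struct mock_memdev *md, struct mock_port *ep)
{
	ep->locked--;
	md->locked--;
}

/* Caller pattern: release only what was successfully acquired. */
static int mock_init(struct mock_memdev *md)
{
	struct mock_port *ep = mock_acquire_endpoint(md);

	if (mock_is_err(ep))
		return -1;	/* nothing is held, so nothing to release */

	/* ... use the endpoint while both locks are held ... */
	mock_release_endpoint(md, ep);
	return 0;
}
```

Calling the release helper with an error pointer would dereference it, which is why the early return matters.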


>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index 442ed9862292..701910021df8 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -29,4 +29,7 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds);
>  
>  struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>  				       struct cxl_dev_state *cxlds);
> +
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
>  #endif




* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-07-15 17:28 ` [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration alejandro.lucero-palau
  2024-07-16  0:53   ` kernel test robot
@ 2024-07-16  6:06   ` Li, Ming4
  2024-07-24  8:24     ` Alejandro Lucero Palau
  2024-08-04 17:57   ` Jonathan Cameron
  2 siblings, 1 reply; 114+ messages in thread
From: Li, Ming4 @ 2024-07-16  6:06 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: Alejandro Lucero

On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
>
> CXL region creation involves allocating capacity from device DPA
> (device-physical-address space) and assigning it to decode a given HPA
> (host-physical-address space). Before determining how much DPA to
> allocate the amount of available HPA must be determined. Also, not all
> HPA is created equal: some specifically targets RAM, some targets PMEM,
> some is prepared for device-memory flows like HDM-D and HDM-DB, and some
> is host-only (HDM-H).
>
> Wrap all of those concerns into an API that retrieves a root decoder
> (platform CXL window) that fits the specified constraints and the
> capacity available for a new region.
>
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m6fbe775541da3cd477d65fa95c8acdc347345b4f
>
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/region.c          | 161 +++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h                  |   3 +
>  drivers/cxl/cxlmem.h               |   5 +
>  drivers/net/ethernet/sfc/efx_cxl.c |  14 +++
>  include/linux/cxl_accel_mem.h      |   9 ++
>  5 files changed, 192 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 538ebd5a64fd..ca464bfef77b 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -702,6 +702,167 @@ static int free_hpa(struct cxl_region *cxlr)
>  	return 0;
>  }
>  
> +
> +struct cxlrd_max_context {
> +	struct device * const *host_bridges;
> +	int interleave_ways;
> +	unsigned long flags;
> +	resource_size_t max_hpa;
> +	struct cxl_root_decoder *cxlrd;
> +};
> +
> +static int find_max_hpa(struct device *dev, void *data)
> +{
> +	struct cxlrd_max_context *ctx = data;
> +	struct cxl_switch_decoder *cxlsd;
> +	struct cxl_root_decoder *cxlrd;
> +	struct resource *res, *prev;
> +	struct cxl_decoder *cxld;
> +	resource_size_t max;
> +	int found;
> +
> +	if (!is_root_decoder(dev))
> +		return 0;
> +
> +	cxlrd = to_cxl_root_decoder(dev);
> +	cxld = &cxlrd->cxlsd.cxld;
> +	if ((cxld->flags & ctx->flags) != ctx->flags) {
> +		dev_dbg(dev, "find_max_hpa, flags not matching: %08lx vs %08lx\n",
> +			      cxld->flags, ctx->flags);
> +		return 0;
> +	}
> +
> +	/* A Host bridge could have more interleave ways than an
> +	 * endpoint, couldn't it?
> +	 *
> +	 * What does interleave ways mean here in terms of the requestor?
> +	 * Why the FFMWS has 0 interleave ways but root port has 1?
> +	 */
> +	if (cxld->interleave_ways != ctx->interleave_ways) {
> +		dev_dbg(dev, "find_max_hpa, interleave_ways  not matching\n");
> +		return 0;
> +	}
> +
> +	cxlsd = &cxlrd->cxlsd;
> +
> +	guard(rwsem_read)(&cxl_region_rwsem);
> +	found = 0;
> +	for (int i = 0; i < ctx->interleave_ways; i++)
> +		for (int j = 0; j < ctx->interleave_ways; j++)
> +			if (ctx->host_bridges[i] ==
> +					cxlsd->target[j]->dport_dev) {
> +				found++;
> +				break;
> +			}
> +
> +	if (found != ctx->interleave_ways) {
> +		dev_dbg(dev, "find_max_hpa, no interleave_ways found\n");
> +		return 0;
> +	}
> +
> +	/*
> +	 * Walk the root decoder resource range relying on cxl_region_rwsem to
> +	 * preclude sibling arrival/departure and find the largest free space
> +	 * gap.
> +	 */
> +	lockdep_assert_held_read(&cxl_region_rwsem);
> +	max = 0;
> +	res = cxlrd->res->child;
> +	if (!res)
> +		max = resource_size(cxlrd->res);
> +	else
> +		max = 0;
> +
> +	for (prev = NULL; res; prev = res, res = res->sibling) {
> +		struct resource *next = res->sibling;
> +		resource_size_t free = 0;
> +
> +		if (!prev && res->start > cxlrd->res->start) {
> +			free = res->start - cxlrd->res->start;
> +			max = max(free, max);
> +		}
> +		if (prev && res->start > prev->end + 1) {
> +			free = res->start - prev->end + 1;
> +			max = max(free, max);
> +		}
> +		if (next && res->end + 1 < next->start) {
> +			free = next->start - res->end + 1;
> +			max = max(free, max);
> +		}
> +		if (!next && res->end + 1 < cxlrd->res->end + 1) {
> +			free = cxlrd->res->end + 1 - res->end + 1;
> +			max = max(free, max);
> +		}
> +	}
> +
> +	if (max > ctx->max_hpa) {
> +		if (ctx->cxlrd)
> +			put_device(CXLRD_DEV(ctx->cxlrd));
> +		get_device(CXLRD_DEV(cxlrd));
> +		ctx->cxlrd = cxlrd;
> +		ctx->max_hpa = max;
> +		dev_info(CXLRD_DEV(cxlrd), "found %pa bytes of free space\n", &max);
> +	}
> +	return 0;
> +}
> +
> +/**
> + * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
> + * @endpoint: an endpoint that is mapped by the returned decoder
> + * @interleave_ways: number of entries in @host_bridges
> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
> + * @max: output parameter of bytes available in the returned decoder
> + *
> + * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
> + * is a point in time snapshot. If by the time the caller goes to use this root
> + * decoder's capacity the capacity is reduced then caller needs to loop and
> + * retry.
> + *
> + * The returned root decoder has an elevated reference count that needs to be
> + * put with put_device(cxlrd_dev(cxlrd)). Locking context is with
> + * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
> + * does not race.
> + */
> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
> +					       int interleave_ways,
> +					       unsigned long flags,
> +					       resource_size_t *max)
> +{
> +
> +	struct cxlrd_max_context ctx = {
> +		.host_bridges = &endpoint->host_bridge,
> +		.interleave_ways = interleave_ways,
> +		.flags = flags,
> +	};
> +	struct cxl_port *root_port;
> +	struct cxl_root *root;
> +
> +	if (!is_cxl_endpoint(endpoint)) {
> +		dev_dbg(&endpoint->dev, "hpa requestor is not an endpoint\n");
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	root = find_cxl_root(endpoint);

You could use scope-based resource management with __free() here and drop the put_device(&root_port->dev) below.

e.g. struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(endpoint);
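
For reference, the kernel's __free() is built on the compiler's cleanup attribute. Below is a userspace sketch of the same idea, assuming GCC or Clang. mock_root, mock_find_root() and mock_put_root() are stand-ins made up for this example, not the real find_cxl_root() or put_cxl_root:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for struct cxl_root. */
struct mock_root { int refcount; };

/* Cleanup handler: receives a pointer to the annotated variable. */
static void mock_put_root(struct mock_root **rootp)
{
	if (*rootp)
		(*rootp)->refcount--;
}

/* Rough analogue of __free(put_cxl_root), spelled out with the raw attribute. */
#define mock_free_root __attribute__((cleanup(mock_put_root)))

/* Takes a reference on success, like a find_*() helper would. */
static struct mock_root *mock_find_root(struct mock_root *candidate)
{
	if (candidate)
		candidate->refcount++;
	return candidate;
}

static int mock_get_freespace(struct mock_root *candidate)
{
	struct mock_root *root mock_free_root = mock_find_root(candidate);

	if (!root)
		return -1;	/* handler runs, sees NULL, does nothing */

	/* ... walk the root's decoders here ... */

	return 0;		/* reference dropped automatically on return */
}
```

With this shape every return path drops the reference, so an explicit put before each return is no longer needed.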


> +	if (!root) {
> +		dev_dbg(&endpoint->dev, "endpoint can not be related to a root port\n");
> +		return ERR_PTR(-ENXIO);
> +	}
> +
> +	root_port = &root->port;
> +	down_read(&cxl_region_rwsem);
> +	device_for_each_child(&root_port->dev, &ctx, find_max_hpa);
> +	up_read(&cxl_region_rwsem);
> +	put_device(&root_port->dev);
> +
> +	if (!ctx.cxlrd)
> +		return ERR_PTR(-ENOMEM);
> +
> +	*max = ctx.max_hpa;
> +	return ctx.cxlrd;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_get_hpa_freespace, CXL);
> +
> +
>  static ssize_t size_store(struct device *dev, struct device_attribute *attr,
>  			  const char *buf, size_t len)
>  {
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 9973430d975f..d3fdd2c1e066 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -770,6 +770,9 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
>  struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
>  struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
> +
> +#define CXLRD_DEV(cxlrd) &cxlrd->cxlsd.cxld.dev
> +
>  bool is_root_decoder(struct device *dev);
>  bool is_switch_decoder(struct device *dev);
>  bool is_endpoint_decoder(struct device *dev);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 8f2a820bd92d..a0e0795ec064 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -877,4 +877,9 @@ struct cxl_hdm {
>  struct seq_file;
>  struct dentry *cxl_debugfs_create_dir(const char *dir);
>  void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
> +					       int interleave_ways,
> +					       unsigned long flags,
> +					       resource_size_t *max);
> +
>  #endif /* __CXL_MEM_H__ */
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 2cf4837ddfc1..6d49571ccff7 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -22,6 +22,7 @@ void efx_cxl_init(struct efx_nic *efx)
>  {
>  	struct pci_dev *pci_dev = efx->pci_dev;
>  	struct efx_cxl *cxl = efx->cxl;
> +	resource_size_t max = 0;
>  	struct resource res;
>  	u16 dvsec;
>  
> @@ -74,6 +75,19 @@ void efx_cxl_init(struct efx_nic *efx)
>  	if (IS_ERR(cxl->endpoint))
>  		pci_info(pci_dev, "CXL accel acquire endpoint failed");
>  
> +	cxl->cxlrd = cxl_get_hpa_freespace(cxl->endpoint, 1,
> +					    CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
> +					    &max);
> +
> +	if (IS_ERR(cxl->cxlrd)) {
> +		pci_info(pci_dev, "CXL accel get HPA failed");
> +		goto out;
> +	}
> +
> +	if (max < EFX_CTPIO_BUFFER_SIZE)
> +		pci_info(pci_dev, "CXL accel not enough free HPA space %llu < %u\n",
> +				  max, EFX_CTPIO_BUFFER_SIZE);
> +out:
>  	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>  }
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index 701910021df8..f3e77688ffe0 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -6,6 +6,10 @@
>  #ifndef __CXL_ACCEL_MEM_H
>  #define __CXL_ACCEL_MEM_H
>  
> +#define CXL_DECODER_F_RAM   BIT(0)
> +#define CXL_DECODER_F_PMEM  BIT(1)
> +#define CXL_DECODER_F_TYPE2 BIT(2)
> +
>  enum accel_resource{
>  	CXL_ACCEL_RES_DPA,
>  	CXL_ACCEL_RES_RAM,
> @@ -32,4 +36,9 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>  
>  struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
>  void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
> +
> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
> +					       int interleave_ways,
> +					       unsigned long flags,
> +					       resource_size_t *max);
>  #endif
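
On the free-space walk in find_max_hpa() above: resource ranges are inclusive, so the gap between a range ending at prev->end and one starting at res->start is res->start - prev->end - 1 bytes. A minimal userspace sketch of that calculation, assuming a sorted, non-overlapping list of used ranges inside a window smaller than the full 64-bit space. mock_range and mock_max_free_gap() are hypothetical names, not kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, like struct resource's start/end. */
struct mock_range { uint64_t start, end; };

/*
 * Return the size in bytes of the largest hole in @window that is not
 * covered by any of the @n sorted, non-overlapping @used ranges.
 */
static uint64_t mock_max_free_gap(struct mock_range window,
				  const struct mock_range *used, size_t n)
{
	uint64_t cursor = window.start;	/* first byte not yet accounted for */
	uint64_t max = 0;

	for (size_t i = 0; i < n; i++) {
		assert(used[i].start >= cursor);
		assert(used[i].end >= used[i].start);
		assert(used[i].end <= window.end);

		/* Hole before this range: [cursor, used[i].start - 1]. */
		if (used[i].start - cursor > max)
			max = used[i].start - cursor;
		cursor = used[i].end + 1;
	}

	/* Tail hole: [cursor, window.end]. */
	if (window.end + 1 - cursor > max)
		max = window.end + 1 - cursor;

	return max;
}
```

Tracking a single cursor covers the head, middle and tail gaps with one expression each, which keeps the off-by-one handling in one place.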




* Re: [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-07-15 17:28 ` [PATCH v2 02/15] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
@ 2024-07-16  6:26   ` Li, Ming4
  2024-08-14  7:46     ` Alejandro Lucero Palau
  2024-07-18 23:27   ` Dave Jiang
  2024-08-04 17:15   ` Jonathan Cameron
  2 siblings, 1 reply; 114+ messages in thread
From: Li, Ming4 @ 2024-07-16  6:26 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: Alejandro Lucero

On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
>
> Create a new function for a type2 device initialising the opaque
> cxl_dev_state struct regarding cxl regs setup and mapping.
>
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/pci.c                  | 28 ++++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.c |  3 +++
>  include/linux/cxl_accel_mem.h      |  1 +
>  3 files changed, 32 insertions(+)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index e53646e9f2fb..b34d6259faf4 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -11,6 +11,7 @@
>  #include <linux/pci.h>
>  #include <linux/aer.h>
>  #include <linux/io.h>
> +#include <linux/cxl_accel_mem.h>
>  #include "cxlmem.h"
>  #include "cxlpci.h"
>  #include "cxl.h"
> @@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  	return cxl_setup_regs(map);
>  }
>  
> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_register_map map;
> +	int rc;
> +
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> +				&cxlds->reg_map);
> +	if (rc)
> +		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
> +
> +	rc = cxl_map_component_regs(&cxlds->reg_map, &cxlds->regs.component,
> +				    BIT(CXL_CM_CAP_CAP_ID_RAS));
> +	if (rc)
> +		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_accel_setup_regs, CXL);
> +

My first impression is that the above function should be provided by cxl_core rather than cxl_pci.

Let's see if Dan has comments on that.


>  static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>  {
>  	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 4554dd7cca76..10c4fb915278 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -47,6 +47,9 @@ void efx_cxl_init(struct efx_nic *efx)
>  
>  	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>  	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
> +
> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
> +		pci_info(pci_dev, "CXL accel setup regs failed");
>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index daf46d41f59c..ca7af4a9cefc 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -19,4 +19,5 @@ void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>  void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>  void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  			    enum accel_resource);
> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>  #endif




* Re: [PATCH v2 11/15] cxl: make region type based on endpoint type
  2024-07-15 17:28 ` [PATCH v2 11/15] cxl: make region type based on endpoint type alejandro.lucero-palau
@ 2024-07-16  7:14   ` Li, Ming4
  2024-07-16  8:13     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Li, Ming4 @ 2024-07-16  7:14 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: Alejandro Lucero

On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
>
> Current code is expecting Type3 or CXL_DECODER_HOSTONLYMEM devices only.
> Support for Type2 implies region type needs to be based on the endpoint
> type instead.
>
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/region.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index ca464bfef77b..5cc71b8868bc 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2645,7 +2645,8 @@ static ssize_t create_ram_region_show(struct device *dev,
>  }
>  
>  static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
> -					  enum cxl_decoder_mode mode, int id)
> +					  enum cxl_decoder_mode mode, int id,
> +					  enum cxl_decoder_type target_type)
>  {
>  	int rc;
>  
> @@ -2667,7 +2668,7 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
>  		return ERR_PTR(-EBUSY);
>  	}
>  
> -	return devm_cxl_add_region(cxlrd, id, mode, CXL_DECODER_HOSTONLYMEM);
> +	return devm_cxl_add_region(cxlrd, id, mode, target_type);
>  }
>  
>  static ssize_t create_pmem_region_store(struct device *dev,
> @@ -2682,7 +2683,8 @@ static ssize_t create_pmem_region_store(struct device *dev,
>  	if (rc != 1)
>  		return -EINVAL;
>  
> -	cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id);
> +	cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id,
> +			       CXL_DECODER_HOSTONLYMEM);
>  	if (IS_ERR(cxlr))
>  		return PTR_ERR(cxlr);
>  
> @@ -2702,7 +2704,8 @@ static ssize_t create_ram_region_store(struct device *dev,
>  	if (rc != 1)
>  		return -EINVAL;
>  
> -	cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id);
> +	cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id,
> +			       CXL_DECODER_HOSTONLYMEM);
>  	if (IS_ERR(cxlr))
>  		return PTR_ERR(cxlr);
>  
> @@ -3364,7 +3367,8 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  
>  	do {
>  		cxlr = __create_region(cxlrd, cxled->mode,
> -				       atomic_read(&cxlrd->region_id));
> +				       atomic_read(&cxlrd->region_id),
> +				       cxled->cxld.target_type);
>  	} while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
>  
>  	if (IS_ERR(cxlr)) {

I think one more check, comparing the root decoder type against the endpoint decoder type, is necessary here. Currently the root decoder type is hard-coded to CXL_DECODER_HOSTONLYMEM, but it should be CXL_DECODER_DEVMEM or CXL_DECODER_HOSTONLYMEM depending on cfmws->restrictions.
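
A rough userspace sketch of what such a check could look like, under the assumption that the window advertises which device types it accepts as a bitmask. The MOCK_* names and bit positions are placeholders for illustration and are not the ACPI CEDT definitions or the kernel's enum values:

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder restriction bits for a fixed memory window. */
#define MOCK_WINDOW_TYPE2 (1u << 0)	/* device (accelerator) memory allowed */
#define MOCK_WINDOW_TYPE3 (1u << 1)	/* host-only memory allowed */

/* Placeholder for the endpoint decoder's target type. */
enum mock_target_type {
	MOCK_DEVMEM,
	MOCK_HOSTONLYMEM,
};

/* True when the window's restrictions permit a region of the given type. */
static bool mock_window_allows(unsigned int restrictions,
			       enum mock_target_type type)
{
	switch (type) {
	case MOCK_DEVMEM:
		return restrictions & MOCK_WINDOW_TYPE2;
	case MOCK_HOSTONLYMEM:
		return restrictions & MOCK_WINDOW_TYPE3;
	}
	return false;
}
```

A window may set both bits, so the test is "does the window permit this type" rather than an equality comparison of two types.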





* Re: [PATCH v2 08/15] cxl: indicate probe deferral
  2024-07-16  5:52   ` Li, Ming4
@ 2024-07-16  8:10     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-07-16  8:10 UTC (permalink / raw)
  To: Li, Ming4, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/16/24 06:52, Li, Ming4 wrote:
> On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> The first stop for a CXL accelerator driver that wants to establish new
>> CXL.mem regions is to register a 'struct cxl_memdev'. That kicks off
>> cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
>> topology up to the root.
>>
>> If the root driver has not attached yet the expectation is that the
>> driver waits until that link is established. The common cxl_pci_driver
>> has reason to keep the 'struct cxl_memdev' device attached to the bus
>> until the root driver attaches. An accelerator may want to instead defer
>> probing until CXL resources can be acquired.
>>
>> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
>> accelerator driver probing should be deferred vs failed. Provide that
>> indication via a new cxl_acquire_endpoint() API that can retrieve the
>> probe status of the memdev.
>>
>> The first consumer of this API is a test driver that exercises the CXL
>> Type-2 flow.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>   drivers/cxl/core/memdev.c          | 41 ++++++++++++++++++++++++++++++
>>   drivers/cxl/core/port.c            |  2 +-
>>   drivers/cxl/mem.c                  |  7 +++--
>>   drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
>>   include/linux/cxl_accel_mem.h      |  3 +++
>>   5 files changed, 59 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index b902948b121f..d51c8bfb32e3 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>   }
>>   EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>>   
>> +/*
>> + * Try to get a locked reference on a memdev's CXL port topology
>> + * connection. Be careful to observe when cxl_mem_probe() has deposited
>> + * a probe deferral awaiting the arrival of the CXL root driver
>> +*/
>> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
>> +{
>> +	struct cxl_port *endpoint;
>> +	int rc = -ENXIO;
>> +
>> +	device_lock(&cxlmd->dev);
>> +	endpoint = cxlmd->endpoint;
>> +	if (!endpoint)
>> +		goto err;
>> +
>> +	if (IS_ERR(endpoint)) {
>> +		rc = PTR_ERR(endpoint);
>> +		goto err;
>> +	}
>> +
>> +	device_lock(&endpoint->dev);
>> +	if (!endpoint->dev.driver)
>> +		goto err_endpoint;
>> +
>> +	return endpoint;
>> +
>> +err_endpoint:
>> +	device_unlock(&endpoint->dev);
>> +err:
>> +	device_unlock(&cxlmd->dev);
>> +	return ERR_PTR(rc);
>> +}
>> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
>> +
>> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
>> +{
>> +	device_unlock(&endpoint->dev);
>> +	device_unlock(&cxlmd->dev);
>> +}
>> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
>> +
>>   static void sanitize_teardown_notifier(void *data)
>>   {
>>   	struct cxl_memdev_state *mds = data;
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index d66c6349ed2d..3c6b896c5f65 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>   		 */
>>   		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>   			dev_name(dport_dev));
>> -		return -ENXIO;
>> +		return -EPROBE_DEFER;
>>   	}
>>   
>>   	parent_port = find_cxl_port(dparent, &parent_dport);
>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>> index f76af75a87b7..383a6f4829d3 100644
>> --- a/drivers/cxl/mem.c
>> +++ b/drivers/cxl/mem.c
>> @@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
>>   		return rc;
>>   
>>   	rc = devm_cxl_enumerate_ports(cxlmd);
>> -	if (rc)
>> +	if (rc) {
>> +		cxlmd->endpoint = ERR_PTR(rc);
>>   		return rc;
>> +	}
>>   
>>   	parent_port = cxl_mem_find_port(cxlmd, &dport);
>>   	if (!parent_port) {
>>   		dev_err(dev, "CXL port topology not found\n");
>> -		return -ENXIO;
>> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
>> +		return -EPROBE_DEFER;
>>   	}
>>   
>>   	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 0abe66490ef5..2cf4837ddfc1 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -65,8 +65,16 @@ void efx_cxl_init(struct efx_nic *efx)
>>   	}
>>   
>>   	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
>> -	if (IS_ERR(cxl->cxlmd))
>> +	if (IS_ERR(cxl->cxlmd)) {
>>   		pci_info(pci_dev, "CXL accel memdev creation failed");
>> +		return;
>> +	}
>> +
>> +	cxl->endpoint = cxl_acquire_endpoint(cxl->cxlmd);
>> +	if (IS_ERR(cxl->endpoint))
>> +		pci_info(pci_dev, "CXL accel acquire endpoint failed");
>> +
>> +	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
> There is no need to invoke cxl_release_endpoint() if cxl_acquire_endpoint() failed, right?
>
>

Right. BTW, I do that in a following patch.

I should just add the functions to the CXL core here, and use them in
a subsequent patch where it makes sense.

Thanks


>>   }
>>   
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index 442ed9862292..701910021df8 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -29,4 +29,7 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds);
>>   
>>   struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>   				       struct cxl_dev_state *cxlds);
>> +
>> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
>> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
>>   #endif
>


* Re: [PATCH v2 11/15] cxl: make region type based on endpoint type
  2024-07-16  7:14   ` Li, Ming4
@ 2024-07-16  8:13     ` Alejandro Lucero Palau
  2024-08-28 16:06       ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-07-16  8:13 UTC (permalink / raw)
  To: Li, Ming4, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/16/24 08:14, Li, Ming4 wrote:
> On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Current code is expecting Type3 or CXL_DECODER_HOSTONLYMEM devices only.
>> Support for Type2 implies region type needs to be based on the endpoint
>> type instead.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/region.c | 14 +++++++++-----
>>   1 file changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index ca464bfef77b..5cc71b8868bc 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -2645,7 +2645,8 @@ static ssize_t create_ram_region_show(struct device *dev,
>>   }
>>   
>>   static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
>> -					  enum cxl_decoder_mode mode, int id)
>> +					  enum cxl_decoder_mode mode, int id,
>> +					  enum cxl_decoder_type target_type)
>>   {
>>   	int rc;
>>   
>> @@ -2667,7 +2668,7 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
>>   		return ERR_PTR(-EBUSY);
>>   	}
>>   
>> -	return devm_cxl_add_region(cxlrd, id, mode, CXL_DECODER_HOSTONLYMEM);
>> +	return devm_cxl_add_region(cxlrd, id, mode, target_type);
>>   }
>>   
>>   static ssize_t create_pmem_region_store(struct device *dev,
>> @@ -2682,7 +2683,8 @@ static ssize_t create_pmem_region_store(struct device *dev,
>>   	if (rc != 1)
>>   		return -EINVAL;
>>   
>> -	cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id);
>> +	cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id,
>> +			       CXL_DECODER_HOSTONLYMEM);
>>   	if (IS_ERR(cxlr))
>>   		return PTR_ERR(cxlr);
>>   
>> @@ -2702,7 +2704,8 @@ static ssize_t create_ram_region_store(struct device *dev,
>>   	if (rc != 1)
>>   		return -EINVAL;
>>   
>> -	cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id);
>> +	cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id,
>> +			       CXL_DECODER_HOSTONLYMEM);
>>   	if (IS_ERR(cxlr))
>>   		return PTR_ERR(cxlr);
>>   
>> @@ -3364,7 +3367,8 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>   
>>   	do {
>>   		cxlr = __create_region(cxlrd, cxled->mode,
>> -				       atomic_read(&cxlrd->region_id));
>> +				       atomic_read(&cxlrd->region_id),
>> +				       cxled->cxld.target_type);
>>   	} while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
>>   
>>   	if (IS_ERR(cxlr)) {
> I think one more check, comparing the root decoder type against the endpoint decoder type, is necessary here. Currently the root decoder type is hard-coded to CXL_DECODER_HOSTONLYMEM, but it should be CXL_DECODER_DEVMEM or CXL_DECODER_HOSTONLYMEM depending on cfmws->restrictions.
>

I think you are completely right.

I will work on this and also look for other implications.

Thanks


>


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-07-15 18:48   ` Andrew Lunn
@ 2024-07-16  8:50     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-07-16  8:50 UTC (permalink / raw)
  To: Andrew Lunn, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 7/15/24 19:48, Andrew Lunn wrote:
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -0,0 +1,22 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>> +
>> +#include <linux/cdev.h>
> That is generally a red flag that something not good is about to be
> found. But it does not appear to be used in this patch....
>
>         Andrew
>

I have no explanation for how it ended up there. I suspect it comes
from the V1 --> V2 transition: cxlmem.h includes it, and V1 was moving
that file to include/linux.


Anyway, I'll get rid of it.

Thanks



* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
  2024-07-15 18:48   ` Andrew Lunn
  2024-07-16  1:57   ` kernel test robot
@ 2024-07-18 23:12   ` Dave Jiang
  2024-07-19  6:03     ` Alejandro Lucero Palau
  2024-08-04 17:10   ` Jonathan Cameron
  2024-08-09  8:34   ` Zhi Wang
  4 siblings, 1 reply; 114+ messages in thread
From: Dave Jiang @ 2024-07-18 23:12 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: Alejandro Lucero



On 7/15/24 10:28 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Differentiate Type3, aka memory expanders, from Type2, aka device
> accelerators, with a new function for initializing cxl_dev_state.
> 
> Create opaque struct to be used by accelerators relying on new access
> functions in following patches.
> 
> Add SFC ethernet network driver as the client.
> 
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c             | 52 ++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/Makefile     |  2 +-
>  drivers/net/ethernet/sfc/efx.c        |  4 ++
>  drivers/net/ethernet/sfc/efx_cxl.c    | 53 +++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++++++++
>  drivers/net/ethernet/sfc/net_driver.h |  4 ++
>  include/linux/cxl_accel_mem.h         | 22 +++++++++++
>  include/linux/cxl_accel_pci.h         | 23 ++++++++++++

Maybe create an include/linux/cxl directory and put the headers in there.

>  8 files changed, 188 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>  create mode 100644 include/linux/cxl_accel_mem.h
>  create mode 100644 include/linux/cxl_accel_pci.h
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 0277726afd04..61b5d35b49e7 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -8,6 +8,7 @@
>  #include <linux/idr.h>
>  #include <linux/pci.h>
>  #include <cxlmem.h>
> +#include <linux/cxl_accel_mem.h>
>  #include "trace.h"
>  #include "core.h"
>  
> @@ -615,6 +616,25 @@ static void detach_memdev(struct work_struct *work)
>  
>  static struct lock_class_key cxl_memdev_key;
>  
> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
> +{
> +	struct cxl_dev_state *cxlds;
> +
> +	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);

Naked cxlds. Do you think you'll need an accel_dev_state to wrap around cxl_dev_state similar to cxl_memdev_state in order to store accel related information? I also wonder if 'struct cxl_dev_state' should be a public definition. Need to look at the rest of the patchset to circle back. 

> +	if (!cxlds)
> +		return ERR_PTR(-ENOMEM);
> +
> +	cxlds->dev = dev;
> +	cxlds->type = CXL_DEVTYPE_DEVMEM;
> +
> +	cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
> +	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
> +	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
> +
> +	return cxlds;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);

I do wonder if we should have a common device state init helper function to init all the common bits:
int cxlds_init(struct device *dev, enum cxl_devtype devtype)
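
To make the idea concrete, here is a minimal userspace sketch of what such a common helper could look like. It is not kernel code: the struct definitions are reduced stand-ins, empty_res() is a hypothetical substitute for DEFINE_RES_MEM_NAMED(), and the extra cxlds parameter is an assumption on top of the signature above so that both the Type3 and Type2 paths could share the helper.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct device {
	const char *name;
};

enum cxl_devtype {
	CXL_DEVTYPE_DEVMEM,	/* Type2: accelerator with device memory */
	CXL_DEVTYPE_CLASSMEM,	/* Type3: memory expander */
};

struct resource {
	unsigned long long start;
	unsigned long long end;
	const char *name;
};

struct cxl_dev_state {
	struct device *dev;
	enum cxl_devtype type;
	struct resource dpa_res;
	struct resource ram_res;
	struct resource pmem_res;
};

/* Hypothetical helper: a named, still-empty resource (simplified). */
static struct resource empty_res(const char *name)
{
	struct resource res = { .start = 0, .end = 0, .name = name };

	return res;
}

/*
 * Hypothetical common init: fills in the bits shared by Type2 and Type3.
 * The caller owns the allocation, so an embedded cxl_dev_state works too.
 */
int cxlds_init(struct cxl_dev_state *cxlds, struct device *dev,
	       enum cxl_devtype devtype)
{
	if (!cxlds || !dev)
		return -EINVAL;

	memset(cxlds, 0, sizeof(*cxlds));
	cxlds->dev = dev;
	cxlds->type = devtype;
	cxlds->dpa_res = empty_res("dpa");
	cxlds->ram_res = empty_res("ram");
	cxlds->pmem_res = empty_res("pmem");

	return 0;
}
```

With something along these lines, cxl_accel_state_create() would reduce to an allocation plus cxlds_init(cxlds, dev, CXL_DEVTYPE_DEVMEM), and the Type3 path would pass CXL_DEVTYPE_CLASSMEM.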


> +
>  static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>  					   const struct file_operations *fops)
>  {
> @@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
>  	return 0;
>  }
>  
> +
> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
> +{
> +	cxlds->cxl_dvsec = dvsec;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_dvsec, CXL);
> +
> +void cxl_accel_set_serial(struct cxl_dev_state *cxlds, u64 serial)
> +{
> +	cxlds->serial= serial;

Missing space before '='
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_serial, CXL);
> +
> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> +			    enum accel_resource type)
> +{
> +	switch (type) {
> +	case CXL_ACCEL_RES_DPA:
> +		cxlds->dpa_res = res;
> +		return;
> +	case CXL_ACCEL_RES_RAM:
> +		cxlds->ram_res = res;
> +		return;
> +	case CXL_ACCEL_RES_PMEM:
> +		cxlds->pmem_res = res;
> +		return;
> +	default:
> +		dev_err(cxlds->dev, "unkown resource type (%u)\n", type);
> +	}
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
> +
>  static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>  {
>  	struct cxl_memdev *cxlmd =
> diff --git a/drivers/net/ethernet/sfc/Makefile b/drivers/net/ethernet/sfc/Makefile
> index 8f446b9bd5ee..e80c713c3b0c 100644
> --- a/drivers/net/ethernet/sfc/Makefile
> +++ b/drivers/net/ethernet/sfc/Makefile
> @@ -7,7 +7,7 @@ sfc-y			+= efx.o efx_common.o efx_channels.o nic.o \
>  			   mcdi_functions.o mcdi_filters.o mcdi_mon.o \
>  			   ef100.o ef100_nic.o ef100_netdev.o \
>  			   ef100_ethtool.o ef100_rx.o ef100_tx.o \
> -			   efx_devlink.o
> +			   efx_devlink.o efx_cxl.o
>  sfc-$(CONFIG_SFC_MTD)	+= mtd.o
>  sfc-$(CONFIG_SFC_SRIOV)	+= sriov.o ef10_sriov.o ef100_sriov.o ef100_rep.o \
>                             mae.o tc.o tc_bindings.o tc_counters.o \
> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
> index e9d9de8e648a..cb3f74d30852 100644
> --- a/drivers/net/ethernet/sfc/efx.c
> +++ b/drivers/net/ethernet/sfc/efx.c
> @@ -33,6 +33,7 @@
>  #include "selftest.h"
>  #include "sriov.h"
>  #include "efx_devlink.h"
> +#include "efx_cxl.h"
>  
>  #include "mcdi_port_common.h"
>  #include "mcdi_pcol.h"
> @@ -899,6 +900,7 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>  	efx_pci_remove_main(efx);
>  
>  	efx_fini_io(efx);
> +

stray blank line

>  	pci_dbg(efx->pci_dev, "shutdown successful\n");
>  
>  	efx_fini_devlink_and_unlock(efx);
> @@ -1109,6 +1111,8 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
>  	if (rc)
>  		goto fail2;
>  
> +	efx_cxl_init(efx);

No error checks? Does the device expect to work whether CXL is set up or not?

> +
>  	rc = efx_pci_probe_post_io(efx);
>  	if (rc) {
>  		/* On failure, retry once immediately.
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> new file mode 100644
> index 000000000000..4554dd7cca76
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -0,0 +1,53 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/****************************************************************************
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +
> +#include <linux/pci.h>
> +#include <linux/cxl_accel_mem.h>
> +#include <linux/cxl_accel_pci.h>
> +
> +#include "net_driver.h"
> +#include "efx_cxl.h"
> +
> +#define EFX_CTPIO_BUFFER_SIZE	(1024*1024*256)
> +
> +void efx_cxl_init(struct efx_nic *efx)
> +{
> +	struct pci_dev *pci_dev = efx->pci_dev;
> +	struct efx_cxl *cxl = efx->cxl;
> +	struct resource res;
> +	u16 dvsec;
> +
> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
> +					  CXL_DVSEC_PCIE_DEVICE);
> +
> +	if (!dvsec)
> +		return;
> +
> +	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");

Seems like an unnecessary kernel log emission.

> +
> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
> +	if (IS_ERR(cxl->cxlds)) {
> +		pci_info(pci_dev, "CXL accel device state failed");

pci_err()? or maybe pci_warn() given it's ignoring error returns. 
> +		return;
> +	}
> +
> +	cxl_accel_set_dvsec(cxl->cxlds, dvsec);
> +	cxl_accel_set_serial(cxl->cxlds, pci_dev->dev.id);
> +
> +	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_DPA);
> +
> +	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
> +}
> +
> +
> +MODULE_IMPORT_NS(CXL);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
> new file mode 100644
> index 000000000000..76c6794c20d8
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
> @@ -0,0 +1,29 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/****************************************************************************
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +#ifndef EFX_CXL_H
> +#define EFX_CLX_H
> +
> +#include <linux/cxl_accel_mem.h>
> +
> +struct efx_nic;
> +
> +struct efx_cxl {
> +	cxl_accel_state *cxlds;
> +	struct cxl_memdev *cxlmd;
> +	struct cxl_root_decoder *cxlrd;
> +	struct cxl_port *endpoint;
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_region *efx_region;
> +	void __iomem *ctpio_cxl;
> +};
> +
> +void efx_cxl_init(struct efx_nic *efx);
> +#endif
> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
> index f2dd7feb0e0c..58b7517afea4 100644
> --- a/drivers/net/ethernet/sfc/net_driver.h
> +++ b/drivers/net/ethernet/sfc/net_driver.h
> @@ -814,6 +814,8 @@ enum efx_xdp_tx_queues_mode {
>  
>  struct efx_mae;
>  
> +struct efx_cxl;
> +
>  /**
>   * struct efx_nic - an Efx NIC
>   * @name: Device name (net device name or bus id before net device registered)
> @@ -962,6 +964,7 @@ struct efx_mae;
>   * @tc: state for TC offload (EF100).
>   * @devlink: reference to devlink structure owned by this device
>   * @dl_port: devlink port associated with the PF
> + * @cxl: details of related cxl objects
>   * @mem_bar: The BAR that is mapped into membase.
>   * @reg_base: Offset from the start of the bar to the function control window.
>   * @monitor_work: Hardware monitor workitem
> @@ -1148,6 +1151,7 @@ struct efx_nic {
>  
>  	struct devlink *devlink;
>  	struct devlink_port *dl_port;
> +	struct efx_cxl *cxl;
>  	unsigned int mem_bar;
>  	u32 reg_base;
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> new file mode 100644
> index 000000000000..daf46d41f59c
> --- /dev/null
> +++ b/include/linux/cxl_accel_mem.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> +
> +#include <linux/cdev.h>

Don't think this header is needed?

> +
> +#ifndef __CXL_ACCEL_MEM_H
> +#define __CXL_ACCEL_MEM_H
> +
> +enum accel_resource{
> +	CXL_ACCEL_RES_DPA,
> +	CXL_ACCEL_RES_RAM,
> +	CXL_ACCEL_RES_PMEM,
> +};
> +
> +typedef struct cxl_dev_state cxl_accel_state;
Please use 'struct cxl_dev_state' directly. There's no good reason to hide the type.

> +cxl_accel_state *cxl_accel_state_create(struct device *dev);
> +
> +void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
> +void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> +			    enum accel_resource);
> +#endif
> diff --git a/include/linux/cxl_accel_pci.h b/include/linux/cxl_accel_pci.h
> new file mode 100644
> index 000000000000..c337ae8797e6
> --- /dev/null
> +++ b/include/linux/cxl_accel_pci.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> +
> +#ifndef __CXL_ACCEL_PCI_H
> +#define __CXL_ACCEL_PCI_H
> +
> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
> +#define CXL_DVSEC_PCIE_DEVICE					0
> +#define   CXL_DVSEC_CAP_OFFSET		0xA
> +#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
> +#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
> +#define   CXL_DVSEC_CTRL_OFFSET		0xC
> +#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
> +#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
> +#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
> +#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
> +#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
> +#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
> +#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
> +#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
> +#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)

This looks like a copy/paste of the drivers/cxl/cxlpci.h definitions. I suggest creating an include/linux/cxl/pci.h, putting them there, and deleting the copy in cxlpci.h. Also, if we are going to move them, please update the CXL spec reference to the latest version (3.1) if you don't mind.
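
For anyone checking the definitions while they get moved, here is a small userspace sketch of the arithmetic behind the range register macros. The offsets are the ones in the quoted hunk; the extra parentheses around `i` and the dvsec_range_size() helper are my own additions for illustration, and the mask is written as a literal because GENMASK() is not available outside the kernel.

```c
#include <stdint.h>

/* Offsets as in the quoted hunk, with the macro argument parenthesized. */
#define CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + ((i) * 0x10))
#define CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + ((i) * 0x10))
#define CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + ((i) * 0x10))
#define CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + ((i) * 0x10))

/* Userspace spelling of GENMASK(31, 28). */
#define CXL_DVSEC_MEM_SIZE_LOW_MASK	0xF0000000u

/*
 * Hypothetical helper: combine the two 32-bit size registers of one
 * range into a 64-bit size. Only bits 31:28 of the low register carry
 * size bits; the lower bits hold flags such as MEM_INFO_VALID.
 */
uint64_t dvsec_range_size(uint32_t size_high, uint32_t size_low)
{
	return ((uint64_t)size_high << 32) |
	       (size_low & CXL_DVSEC_MEM_SIZE_LOW_MASK);
}
```

Each range occupies 0x10 bytes, so range 1 starts at 0x28. Parenthesizing `i` only matters when the argument is an expression, but it is cheap to do while the header is being touched anyway.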
> +
> +#endif


* Re: [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-07-15 17:28 ` [PATCH v2 02/15] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
  2024-07-16  6:26   ` Li, Ming4
@ 2024-07-18 23:27   ` Dave Jiang
  2024-08-14  7:49     ` Alejandro Lucero Palau
  2024-08-04 17:15   ` Jonathan Cameron
  2 siblings, 1 reply; 114+ messages in thread
From: Dave Jiang @ 2024-07-18 23:27 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: Alejandro Lucero



On 7/15/24 10:28 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Create a new function for a type2 device initialising the opaque
> cxl_dev_state struct regarding cxl regs setup and mapping.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/pci.c                  | 28 ++++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.c |  3 +++
>  include/linux/cxl_accel_mem.h      |  1 +
>  3 files changed, 32 insertions(+)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index e53646e9f2fb..b34d6259faf4 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -11,6 +11,7 @@
>  #include <linux/pci.h>
>  #include <linux/aer.h>
>  #include <linux/io.h>
> +#include <linux/cxl_accel_mem.h>
>  #include "cxlmem.h"
>  #include "cxlpci.h"
>  #include "cxl.h"
> @@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  	return cxl_setup_regs(map);
>  }
>  
> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)

Function should go into cxl/core/pci.c

> +{
> +	struct cxl_register_map map;
> +	int rc;
> +
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> +				&cxlds->reg_map);
> +	if (rc)
> +		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
> +
> +	rc = cxl_map_component_regs(&cxlds->reg_map, &cxlds->regs.component,
> +				    BIT(CXL_CM_CAP_CAP_ID_RAS));
> +	if (rc)
> +		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");

dev_warn()? Also maybe add the errno to the error message.

> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_accel_setup_regs, CXL);
> +
>  static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>  {
>  	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 4554dd7cca76..10c4fb915278 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -47,6 +47,9 @@ void efx_cxl_init(struct efx_nic *efx)
>  
>  	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>  	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
> +
> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
> +		pci_info(pci_dev, "CXL accel setup regs failed");

pci_warn()? Although it seems unnecessary, since an error is already emitted in cxl_pci_accel_setup_regs().

>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index daf46d41f59c..ca7af4a9cefc 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -19,4 +19,5 @@ void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>  void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>  void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  			    enum accel_resource);
> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>  #endif


* Re: [PATCH v2 03/15] cxl: add function for type2 resource request
  2024-07-15 17:28 ` [PATCH v2 03/15] cxl: add function for type2 resource request alejandro.lucero-palau
@ 2024-07-18 23:36   ` Dave Jiang
  2024-08-04 17:16     ` Jonathan Cameron
  2024-08-14  8:00     ` Alejandro Lucero Palau
  2024-08-09  9:01   ` Zhi Wang
  2024-08-22 13:07   ` Zhi Wang
  2 siblings, 2 replies; 114+ messages in thread
From: Dave Jiang @ 2024-07-18 23:36 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: Alejandro Lucero



On 7/15/24 10:28 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Create a new function for a type2 device requesting a resource
> passing the opaque struct to work with.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/memdev.c          | 13 +++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.c |  7 ++++++-
>  include/linux/cxl_accel_mem.h      |  1 +
>  3 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 61b5d35b49e7..04c3a0f8bc2e 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -744,6 +744,19 @@ void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
>  
> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)
Maybe declare a common enum like cxl_resource_type instead of 'enum accel_resource' and use it here instead of the bool?
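
Roughly what I have in mind, as a userspace sketch rather than a patch. The enum name and its values are hypothetical, the structs are reduced stand-ins, and request_resource_model() is a toy replacement for request_resource() that only checks containment in the parent.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical common enum replacing 'enum accel_resource'. */
enum cxl_resource_type {
	CXL_RES_DPA,
	CXL_RES_RAM,
	CXL_RES_PMEM,
};

/* Reduced stand-ins for the kernel structs. */
struct resource {
	unsigned long long start;
	unsigned long long end;
	const char *name;
	struct resource *parent;
};

struct cxl_dev_state {
	struct resource dpa_res;
	struct resource ram_res;
	struct resource pmem_res;
};

/* Toy model of request_resource(): the child must fit inside the root. */
static int request_resource_model(struct resource *root, struct resource *new)
{
	if (new->end < new->start || new->start < root->start ||
	    new->end > root->end)
		return -EBUSY;

	new->parent = root;
	return 0;
}

/* Same job as the bool version, but the call site names the partition. */
int cxl_accel_request_resource(struct cxl_dev_state *cxlds,
			       enum cxl_resource_type type)
{
	switch (type) {
	case CXL_RES_RAM:
		return request_resource_model(&cxlds->dpa_res, &cxlds->ram_res);
	case CXL_RES_PMEM:
		return request_resource_model(&cxlds->dpa_res, &cxlds->pmem_res);
	default:
		/* DPA is the parent here and cannot be requested from itself. */
		return -EINVAL;
	}
}
```

The caller then reads cxl_accel_request_resource(cxl->cxlds, CXL_RES_RAM) instead of passing a bare 'true', and the same enum could be shared with cxl_accel_set_resource().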

> +{
> +	int rc;
> +
> +	if (is_ram)
> +		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
> +	else
> +		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
> +
>  static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>  {
>  	struct cxl_memdev *cxlmd =
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 10c4fb915278..9cefcaf3caca 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -48,8 +48,13 @@ void efx_cxl_init(struct efx_nic *efx)
>  	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>  	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>  
> -	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds)) {
>  		pci_info(pci_dev, "CXL accel setup regs failed");
> +		return;
> +	}
> +
> +	if (cxl_accel_request_resource(cxl->cxlds, true))
> +		pci_info(pci_dev, "CXL accel resource request failed");

pci_warn()? Also, emitting the errno would be nice.

>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index ca7af4a9cefc..c7b254edc096 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -20,4 +20,5 @@ void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>  void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  			    enum accel_resource);
>  int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
>  #endif


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-07-18 23:12   ` Dave Jiang
@ 2024-07-19  6:03     ` Alejandro Lucero Palau
  2024-08-04 16:44       ` Jonathan Cameron
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-07-19  6:03 UTC (permalink / raw)
  To: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/19/24 00:12, Dave Jiang wrote:
>
> On 7/15/24 10:28 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Differentiate Type3, aka memory expanders, from Type2, aka device
>> accelerators, with a new function for initializing cxl_dev_state.
>>
>> Create opaque struct to be used by accelerators relying on new access
>> functions in following patches.
>>
>> Add SFC ethernet network driver as the client.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>   drivers/cxl/core/memdev.c             | 52 ++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/Makefile     |  2 +-
>>   drivers/net/ethernet/sfc/efx.c        |  4 ++
>>   drivers/net/ethernet/sfc/efx_cxl.c    | 53 +++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++++++++
>>   drivers/net/ethernet/sfc/net_driver.h |  4 ++
>>   include/linux/cxl_accel_mem.h         | 22 +++++++++++
>>   include/linux/cxl_accel_pci.h         | 23 ++++++++++++
> Maybe create an include/linux/cxl and then we can put headers in there.
>
>>   8 files changed, 188 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>>   create mode 100644 include/linux/cxl_accel_mem.h
>>   create mode 100644 include/linux/cxl_accel_pci.h
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 0277726afd04..61b5d35b49e7 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -8,6 +8,7 @@
>>   #include <linux/idr.h>
>>   #include <linux/pci.h>
>>   #include <cxlmem.h>
>> +#include <linux/cxl_accel_mem.h>
>>   #include "trace.h"
>>   #include "core.h"
>>   
>> @@ -615,6 +616,25 @@ static void detach_memdev(struct work_struct *work)
>>   
>>   static struct lock_class_key cxl_memdev_key;
>>   
>> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
>> +{
>> +	struct cxl_dev_state *cxlds;
>> +
>> +	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
> Naked cxlds. Do you think you'll need an accel_dev_state to wrap around cxl_dev_state similar to cxl_memdev_state in order to store accel related information? I also wonder if 'struct cxl_dev_state' should be a public definition. Need to look at the rest of the patchset to circle back.
>

Not sure I understand your concern. Are you saying we need to introduce
a cxl_accel_state struct? From my work, and I guess from Dan's original
patch, it seems it is not needed, although I have already raised my
concern that the current structs may require refactoring due to
the optional capabilities for Type2.

Regarding whether cxl_dev_state needs to be public, this patchset version
defines it as opaque to address the concerns about accel drivers
needing to be "controlled".


>> +	if (!cxlds)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	cxlds->dev = dev;
>> +	cxlds->type = CXL_DEVTYPE_DEVMEM;
>> +
>> +	cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
>> +	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
>> +	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
>> +
>> +	return cxlds;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
> I do wonder if we should have a common device state init helper function to init all the common bits:
> int cxlds_init(struct device *dev, enum cxl_devtype devtype)
>
>
>> +
>>   static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>>   					   const struct file_operations *fops)
>>   {
>> @@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
>>   	return 0;
>>   }
>>   
>> +
>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>> +{
>> +	cxlds->cxl_dvsec = dvsec;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_dvsec, CXL);
>> +
>> +void cxl_accel_set_serial(struct cxl_dev_state *cxlds, u64 serial)
>> +{
>> +	cxlds->serial= serial;
> Missing space before '='
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_serial, CXL);
>> +
>> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>> +			    enum accel_resource type)
>> +{
>> +	switch (type) {
>> +	case CXL_ACCEL_RES_DPA:
>> +		cxlds->dpa_res = res;
>> +		return;
>> +	case CXL_ACCEL_RES_RAM:
>> +		cxlds->ram_res = res;
>> +		return;
>> +	case CXL_ACCEL_RES_PMEM:
>> +		cxlds->pmem_res = res;
>> +		return;
>> +	default:
>> +		dev_err(cxlds->dev, "unkown resource type (%u)\n", type);
>> +	}
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
>> +
>>   static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>>   {
>>   	struct cxl_memdev *cxlmd =
>> diff --git a/drivers/net/ethernet/sfc/Makefile b/drivers/net/ethernet/sfc/Makefile
>> index 8f446b9bd5ee..e80c713c3b0c 100644
>> --- a/drivers/net/ethernet/sfc/Makefile
>> +++ b/drivers/net/ethernet/sfc/Makefile
>> @@ -7,7 +7,7 @@ sfc-y			+= efx.o efx_common.o efx_channels.o nic.o \
>>   			   mcdi_functions.o mcdi_filters.o mcdi_mon.o \
>>   			   ef100.o ef100_nic.o ef100_netdev.o \
>>   			   ef100_ethtool.o ef100_rx.o ef100_tx.o \
>> -			   efx_devlink.o
>> +			   efx_devlink.o efx_cxl.o
>>   sfc-$(CONFIG_SFC_MTD)	+= mtd.o
>>   sfc-$(CONFIG_SFC_SRIOV)	+= sriov.o ef10_sriov.o ef100_sriov.o ef100_rep.o \
>>                              mae.o tc.o tc_bindings.o tc_counters.o \
>> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
>> index e9d9de8e648a..cb3f74d30852 100644
>> --- a/drivers/net/ethernet/sfc/efx.c
>> +++ b/drivers/net/ethernet/sfc/efx.c
>> @@ -33,6 +33,7 @@
>>   #include "selftest.h"
>>   #include "sriov.h"
>>   #include "efx_devlink.h"
>> +#include "efx_cxl.h"
>>   
>>   #include "mcdi_port_common.h"
>>   #include "mcdi_pcol.h"
>> @@ -899,6 +900,7 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>>   	efx_pci_remove_main(efx);
>>   
>>   	efx_fini_io(efx);
>> +
> stray blank line
>
>>   	pci_dbg(efx->pci_dev, "shutdown successful\n");
>>   
>>   	efx_fini_devlink_and_unlock(efx);
>> @@ -1109,6 +1111,8 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
>>   	if (rc)
>>   		goto fail2;
>>   
>> +	efx_cxl_init(efx);
> No error checks? Does the device expect to work whether CXL is set up or not?
>

Right. The netdev functionality will not be jeopardized by CXL
initialization errors. If all goes fine, the PIO buffers will be mapped
using the created CXL region; if not, the PIO buffers will be mapped
at a specific BAR offset.
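
A reduced userspace model of that behaviour, only to illustrate the intent. All names here are hypothetical and this is not the driver code: the init path reports an errno so it can be logged, but probe never fails because of it and simply falls back to the BAR mapping.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced model of the driver state. */
struct efx_nic_model {
	bool has_cxl_dvsec;	/* CXL DVSEC found in config space */
	bool cxl_setup_ok;	/* stand-in for the rest of the CXL setup */
	bool cxl_ready;		/* set once a CXL region is usable */
};

enum pio_backend {
	PIO_VIA_BAR,	/* PIO buffers mapped at a BAR offset */
	PIO_VIA_CXL,	/* PIO buffers mapped through the CXL region */
};

/* Returns 0 or a negative errno; never touches netdev functionality. */
int efx_cxl_init_model(struct efx_nic_model *efx)
{
	efx->cxl_ready = false;

	if (!efx->has_cxl_dvsec)
		return -ENODEV;	/* not a CXL device: nothing to do */

	if (!efx->cxl_setup_ok)
		return -EIO;	/* CXL present but setup failed */

	efx->cxl_ready = true;
	return 0;
}

/* Probe keeps going whatever the CXL result is and picks a backend. */
enum pio_backend efx_probe_model(struct efx_nic_model *efx, int *cxl_rc)
{
	*cxl_rc = efx_cxl_init_model(efx);

	return efx->cxl_ready ? PIO_VIA_CXL : PIO_VIA_BAR;
}
```

Returning the errno from the init function while ignoring it for probe purposes would also cover the request to include it in the warning message.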


>> +
>>   	rc = efx_pci_probe_post_io(efx);
>>   	if (rc) {
>>   		/* On failure, retry once immediately.
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> new file mode 100644
>> index 000000000000..4554dd7cca76
>> --- /dev/null
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -0,0 +1,53 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/****************************************************************************
>> + * Driver for AMD network controllers and boards
>> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation, incorporated herein by reference.
>> + */
>> +
>> +
>> +#include <linux/pci.h>
>> +#include <linux/cxl_accel_mem.h>
>> +#include <linux/cxl_accel_pci.h>
>> +
>> +#include "net_driver.h"
>> +#include "efx_cxl.h"
>> +
>> +#define EFX_CTPIO_BUFFER_SIZE	(1024*1024*256)
>> +
>> +void efx_cxl_init(struct efx_nic *efx)
>> +{
>> +	struct pci_dev *pci_dev = efx->pci_dev;
>> +	struct efx_cxl *cxl = efx->cxl;
>> +	struct resource res;
>> +	u16 dvsec;
>> +
>> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
>> +					  CXL_DVSEC_PCIE_DEVICE);
>> +
>> +	if (!dvsec)
>> +		return;
>> +
>> +	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");
> Seems like an unnecessary kernel log emission.
>

Uhmm, yes, maybe something more closely linked to how the PIO buffers end up
being used at a later time.

>> +
>> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
>> +	if (IS_ERR(cxl->cxlds)) {
>> +		pci_info(pci_dev, "CXL accel device state failed");
> pci_err()? or maybe pci_warn() given it's ignoring error returns.


Right. I will change this and other similar ones.


>> +		return;
>> +	}
>> +
>> +	cxl_accel_set_dvsec(cxl->cxlds, dvsec);
>> +	cxl_accel_set_serial(cxl->cxlds, pci_dev->dev.id);
>> +
>> +	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
>> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_DPA);
>> +
>> +	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>> +}
>> +
>> +
>> +MODULE_IMPORT_NS(CXL);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
>> new file mode 100644
>> index 000000000000..76c6794c20d8
>> --- /dev/null
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
>> @@ -0,0 +1,29 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/****************************************************************************
>> + * Driver for AMD network controllers and boards
>> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation, incorporated herein by reference.
>> + */
>> +
>> +#ifndef EFX_CXL_H
>> +#define EFX_CLX_H
>> +
>> +#include <linux/cxl_accel_mem.h>
>> +
>> +struct efx_nic;
>> +
>> +struct efx_cxl {
>> +	cxl_accel_state *cxlds;
>> +	struct cxl_memdev *cxlmd;
>> +	struct cxl_root_decoder *cxlrd;
>> +	struct cxl_port *endpoint;
>> +	struct cxl_endpoint_decoder *cxled;
>> +	struct cxl_region *efx_region;
>> +	void __iomem *ctpio_cxl;
>> +};
>> +
>> +void efx_cxl_init(struct efx_nic *efx);
>> +#endif
>> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
>> index f2dd7feb0e0c..58b7517afea4 100644
>> --- a/drivers/net/ethernet/sfc/net_driver.h
>> +++ b/drivers/net/ethernet/sfc/net_driver.h
>> @@ -814,6 +814,8 @@ enum efx_xdp_tx_queues_mode {
>>   
>>   struct efx_mae;
>>   
>> +struct efx_cxl;
>> +
>>   /**
>>    * struct efx_nic - an Efx NIC
>>    * @name: Device name (net device name or bus id before net device registered)
>> @@ -962,6 +964,7 @@ struct efx_mae;
>>    * @tc: state for TC offload (EF100).
>>    * @devlink: reference to devlink structure owned by this device
>>    * @dl_port: devlink port associated with the PF
>> + * @cxl: details of related cxl objects
>>    * @mem_bar: The BAR that is mapped into membase.
>>    * @reg_base: Offset from the start of the bar to the function control window.
>>    * @monitor_work: Hardware monitor workitem
>> @@ -1148,6 +1151,7 @@ struct efx_nic {
>>   
>>   	struct devlink *devlink;
>>   	struct devlink_port *dl_port;
>> +	struct efx_cxl *cxl;
>>   	unsigned int mem_bar;
>>   	u32 reg_base;
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> new file mode 100644
>> index 000000000000..daf46d41f59c
>> --- /dev/null
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -0,0 +1,22 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>> +
>> +#include <linux/cdev.h>
> Don't think this header is needed?
>
>> +
>> +#ifndef __CXL_ACCEL_MEM_H
>> +#define __CXL_ACCEL_MEM_H
>> +
>> +enum accel_resource{
>> +	CXL_ACCEL_RES_DPA,
>> +	CXL_ACCEL_RES_RAM,
>> +	CXL_ACCEL_RES_PMEM,
>> +};
>> +
>> +typedef struct cxl_dev_state cxl_accel_state;
> Please use 'struct cxl_dev_state' directly. There's no good reason to hide the type.


That is what I think I was told to do, although not explicitly. There
were concerns in the RFC about accel drivers being too loose in what they do
regarding CXL, and that the CXL core should somehow keep as much control as
possible. I even thought I was being asked to implement auxbus, with
the CXL part of an accel as an auxiliary device bound to
a CXL core driver. Jonathan Cameron was then the only one explicitly offering
the possibility of the opaque approach and advising against the auxbus idea.


Maybe I need an explicit action here.
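
For what it is worth, the two requests may not be in conflict. The userspace sketch below keeps 'struct cxl_dev_state' spelled out, with no typedef, while the driver still only sees an incomplete type and has to go through accessors. The no-argument create function, the destroy function and the getters are hypothetical and exist only to keep the example self-contained.

```c
#include <stdint.h>
#include <stdlib.h>

/* --- What an accel driver would see (public header) --- */
struct cxl_dev_state;	/* incomplete type: fields are not reachable */

struct cxl_dev_state *cxl_accel_state_create(void);
void cxl_accel_state_destroy(struct cxl_dev_state *cxlds);
void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, uint16_t dvsec);
void cxl_accel_set_serial(struct cxl_dev_state *cxlds, uint64_t serial);
uint16_t cxl_accel_get_dvsec(const struct cxl_dev_state *cxlds);
uint64_t cxl_accel_get_serial(const struct cxl_dev_state *cxlds);

/* --- What only the CXL core would see (private definition) --- */
struct cxl_dev_state {
	uint16_t cxl_dvsec;
	uint64_t serial;
};

struct cxl_dev_state *cxl_accel_state_create(void)
{
	/* Zeroed allocation, standing in for devm_kzalloc(). */
	return calloc(1, sizeof(struct cxl_dev_state));
}

void cxl_accel_state_destroy(struct cxl_dev_state *cxlds)
{
	free(cxlds);
}

void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, uint16_t dvsec)
{
	cxlds->cxl_dvsec = dvsec;
}

void cxl_accel_set_serial(struct cxl_dev_state *cxlds, uint64_t serial)
{
	cxlds->serial = serial;
}

uint16_t cxl_accel_get_dvsec(const struct cxl_dev_state *cxlds)
{
	return cxlds->cxl_dvsec;
}

uint64_t cxl_accel_get_serial(const struct cxl_dev_state *cxlds)
{
	return cxlds->serial;
}
```

In a real split the top part would live in the public header and the struct definition would stay in drivers/cxl, so a driver trying cxlds->serial would fail to compile while the type name itself stays visible.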


>> +cxl_accel_state *cxl_accel_state_create(struct device *dev);
>> +
>> +void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>> +void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>> +			    enum accel_resource);
>> +#endif
>> diff --git a/include/linux/cxl_accel_pci.h b/include/linux/cxl_accel_pci.h
>> new file mode 100644
>> index 000000000000..c337ae8797e6
>> --- /dev/null
>> +++ b/include/linux/cxl_accel_pci.h
>> @@ -0,0 +1,23 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>> +
>> +#ifndef __CXL_ACCEL_PCI_H
>> +#define __CXL_ACCEL_PCI_H
>> +
>> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
>> +#define CXL_DVSEC_PCIE_DEVICE					0
>> +#define   CXL_DVSEC_CAP_OFFSET		0xA
>> +#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
>> +#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
>> +#define   CXL_DVSEC_CTRL_OFFSET		0xC
>> +#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
>> +#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
>> +#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
>> +#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
>> +#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
>> +#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
>> +#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
>> +#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
>> +#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
> This looks like a copy/paste of the drivers/cxl/cxlpci.h definitions. I suggest creating an include/linux/cxl/pci.h, putting them there, and deleting the copy in cxlpci.h. Also, if we are going to move them, please update the CXL spec reference to the latest version (3.1) if you don't mind.


That makes sense. I'll do it.

Thanks


>> +
>> +#endif


* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-07-15 17:28 ` [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state alejandro.lucero-palau
@ 2024-07-19 19:01   ` Dave Jiang
  2024-07-23 13:43     ` Alejandro Lucero Palau
  2024-08-04 17:22   ` Jonathan Cameron
  2024-08-09  9:10   ` Zhi Wang
  2 siblings, 1 reply; 114+ messages in thread
From: Dave Jiang @ 2024-07-19 19:01 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes
  Cc: Alejandro Lucero



On 7/15/24 10:28 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Type2 devices have some Type3 functionalities as optional like an mbox
> or an hdm decoder, and CXL core needs a way to know what a CXL accelerator
> implements.
> 
> Add a new field for keeping device capabilities to be initialised by
> Type2 drivers. Advertise all those capabilities for Type3.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/mbox.c            |  1 +
>  drivers/cxl/core/memdev.c          |  4 +++-
>  drivers/cxl/core/port.c            |  2 +-
>  drivers/cxl/core/regs.c            | 11 ++++++-----
>  drivers/cxl/cxl.h                  |  2 +-
>  drivers/cxl/cxlmem.h               |  4 ++++
>  drivers/cxl/pci.c                  | 15 +++++++++------
>  drivers/net/ethernet/sfc/efx_cxl.c |  3 ++-
>  include/linux/cxl_accel_mem.h      |  5 ++++-
>  9 files changed, 31 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 2626f3fff201..2ba7d36e3f38 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -1424,6 +1424,7 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
>  	mds->cxlds.reg_map.host = dev;
>  	mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
>  	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
> +	mds->cxlds.capabilities = CXL_DRIVER_CAP_HDM | CXL_DRIVER_CAP_MBOX;
>  	mds->ram_perf.qos_class = CXL_QOS_CLASS_INVALID;
>  	mds->pmem_perf.qos_class = CXL_QOS_CLASS_INVALID;
>  
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 04c3a0f8bc2e..b4205ecca365 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -616,7 +616,7 @@ static void detach_memdev(struct work_struct *work)
>  
>  static struct lock_class_key cxl_memdev_key;
>  
> -struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev, uint8_t caps)
>  {
>  	struct cxl_dev_state *cxlds;
>  
> @@ -631,6 +631,8 @@ struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
>  	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
>  	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
>  
> +	cxlds->capabilities = caps;
> +
>  	return cxlds;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 887ed6e358fb..d66c6349ed2d 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -763,7 +763,7 @@ static int cxl_setup_comp_regs(struct device *host, struct cxl_register_map *map
>  	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
>  	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
>  
> -	return cxl_setup_regs(map);
> +	return cxl_setup_regs(map, 0);
>  }
>  
>  static int cxl_port_setup_regs(struct cxl_port *port,
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index e1082e749c69..9d218ebe180d 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -421,7 +421,7 @@ static void cxl_unmap_regblock(struct cxl_register_map *map)
>  	map->base = NULL;
>  }
>  
> -static int cxl_probe_regs(struct cxl_register_map *map)
> +static int cxl_probe_regs(struct cxl_register_map *map, uint8_t caps)
>  {
>  	struct cxl_component_reg_map *comp_map;
>  	struct cxl_device_reg_map *dev_map;
> @@ -437,11 +437,12 @@ static int cxl_probe_regs(struct cxl_register_map *map)
>  	case CXL_REGLOC_RBI_MEMDEV:
>  		dev_map = &map->device_map;
>  		cxl_probe_device_regs(host, base, dev_map);
> -		if (!dev_map->status.valid || !dev_map->mbox.valid ||
> +		if (!dev_map->status.valid ||
> +		    ((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ||
>  		    !dev_map->memdev.valid) {
>  			dev_err(host, "registers not found: %s%s%s\n",
>  				!dev_map->status.valid ? "status " : "",
> -				!dev_map->mbox.valid ? "mbox " : "",
> +				((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ? "mbox " : "",

According to the r3.1 8.2.8.2.1, the device status registers and the primary mailbox registers are both mandatory if regloc id=3 block is found. So if the type2 device does not implement a mailbox then it shouldn't be calling cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map) to begin with from the driver init right? If the type2 device defines a regblock with id=3 but without a mailbox, then isn't that a spec violation?

DJ

>  				!dev_map->memdev.valid ? "memdev " : "");
>  			return -ENXIO;
>  		}
> @@ -455,7 +456,7 @@ static int cxl_probe_regs(struct cxl_register_map *map)
>  	return 0;
>  }
>  
> -int cxl_setup_regs(struct cxl_register_map *map)
> +int cxl_setup_regs(struct cxl_register_map *map, uint8_t caps)
>  {
>  	int rc;
>  
> @@ -463,7 +464,7 @@ int cxl_setup_regs(struct cxl_register_map *map)
>  	if (rc)
>  		return rc;
>  
> -	rc = cxl_probe_regs(map);
> +	rc = cxl_probe_regs(map, caps);
>  	cxl_unmap_regblock(map);
>  
>  	return rc;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index a6613a6f8923..9973430d975f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -300,7 +300,7 @@ int cxl_find_regblock_instance(struct pci_dev *pdev, enum cxl_regloc_type type,
>  			       struct cxl_register_map *map, int index);
>  int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>  		      struct cxl_register_map *map);
> -int cxl_setup_regs(struct cxl_register_map *map);
> +int cxl_setup_regs(struct cxl_register_map *map, uint8_t caps);
>  struct cxl_dport;
>  resource_size_t cxl_rcd_component_reg_phys(struct device *dev,
>  					   struct cxl_dport *dport);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index af8169ccdbc0..8f2a820bd92d 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -405,6 +405,9 @@ struct cxl_dpa_perf {
>  	int qos_class;
>  };
>  
> +#define CXL_DRIVER_CAP_HDM	0x1
> +#define CXL_DRIVER_CAP_MBOX	0x2
> +
>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> @@ -438,6 +441,7 @@ struct cxl_dev_state {
>  	struct resource ram_res;
>  	u64 serial;
>  	enum cxl_devtype type;
> +	uint8_t capabilities;
>  };
>  
>  /**
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index b34d6259faf4..e2a978312281 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -502,7 +502,8 @@ static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
>  }
>  
>  static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> -			      struct cxl_register_map *map)
> +			      struct cxl_register_map *map,
> +			      uint8_t cxl_dev_caps)
>  {
>  	int rc;
>  
> @@ -519,7 +520,7 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  	if (rc)
>  		return rc;
>  
> -	return cxl_setup_regs(map);
> +	return cxl_setup_regs(map, cxl_dev_caps);
>  }
>  
>  int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
> @@ -527,7 +528,8 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
>  	struct cxl_register_map map;
>  	int rc;
>  
> -	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
> +				cxlds->capabilities);
>  	if (rc)
>  		return rc;
>  
> @@ -536,7 +538,7 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
>  		return rc;
>  
>  	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> -				&cxlds->reg_map);
> +				&cxlds->reg_map, cxlds->capabilities);
>  	if (rc)
>  		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
>  
> @@ -850,7 +852,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		dev_warn(&pdev->dev,
>  			 "Device DVSEC not present, skip CXL.mem init\n");
>  
> -	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
> +				cxlds->capabilities);
>  	if (rc)
>  		return rc;
>  
> @@ -863,7 +866,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	 * still be useful for management functions so don't return an error.
>  	 */
>  	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> -				&cxlds->reg_map);
> +				&cxlds->reg_map, cxlds->capabilities);
>  	if (rc)
>  		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
>  	else if (!cxlds->reg_map.component_map.ras.valid)
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 9cefcaf3caca..37d8bfdef517 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -33,7 +33,8 @@ void efx_cxl_init(struct efx_nic *efx)
>  
>  	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");
>  
> -	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev,
> +					    CXL_ACCEL_DRIVER_CAP_HDM);
>  	if (IS_ERR(cxl->cxlds)) {
>  		pci_info(pci_dev, "CXL accel device state failed");
>  		return;
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index c7b254edc096..0ba2195b919b 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -12,8 +12,11 @@ enum accel_resource{
>  	CXL_ACCEL_RES_PMEM,
>  };
>  
> +#define CXL_ACCEL_DRIVER_CAP_HDM	0x1
> +#define CXL_ACCEL_DRIVER_CAP_MBOX	0x2
> +
>  typedef struct cxl_dev_state cxl_accel_state;
> -cxl_accel_state *cxl_accel_state_create(struct device *dev);
> +cxl_accel_state *cxl_accel_state_create(struct device *dev, uint8_t caps);
>  
>  void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>  void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);


* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-07-19 19:01   ` Dave Jiang
@ 2024-07-23 13:43     ` Alejandro Lucero Palau
  2024-08-09 10:25       ` Zhi Wang
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-07-23 13:43 UTC (permalink / raw)
  To: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/19/24 20:01, Dave Jiang wrote:
>
>>   
>> -static int cxl_probe_regs(struct cxl_register_map *map)
>> +static int cxl_probe_regs(struct cxl_register_map *map, uint8_t caps)
>>   {
>>   	struct cxl_component_reg_map *comp_map;
>>   	struct cxl_device_reg_map *dev_map;
>> @@ -437,11 +437,12 @@ static int cxl_probe_regs(struct cxl_register_map *map)
>>   	case CXL_REGLOC_RBI_MEMDEV:
>>   		dev_map = &map->device_map;
>>   		cxl_probe_device_regs(host, base, dev_map);
>> -		if (!dev_map->status.valid || !dev_map->mbox.valid ||
>> +		if (!dev_map->status.valid ||
>> +		    ((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ||
>>   		    !dev_map->memdev.valid) {
>>   			dev_err(host, "registers not found: %s%s%s\n",
>>   				!dev_map->status.valid ? "status " : "",
>> -				!dev_map->mbox.valid ? "mbox " : "",
>> +				((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ? "mbox " : "",
> According to the r3.1 8.2.8.2.1, the device status registers and the primary mailbox registers are both mandatory if regloc id=3 block is found. So if the type2 device does not implement a mailbox then it shouldn't be calling cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map) to begin with from the driver init right? If the type2 device defines a regblock with id=3 but without a mailbox, then isn't that a spec violation?
>
> DJ


Right. The code needs to support the possibility of a Type2 device having a 
mailbox, and if the mailbox is not supported, the rest of the dvsec regs 
initialization still needs to be performed. This is not what the code does 
now, so I'll fix this.


The wider explanation is that, for the RFC, I used a test driver based on 
QEMU emulating a Type2 device which had a CXL Device Register Interface 
defined (03h) but no CXL Device Capability with id 2 for the primary 
mailbox register, breaking the spec as you spotted.
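
As a rough illustration of the check under discussion, here is a
userspace model with hypothetical names (DEV_CAP_HDM, DEV_CAP_MBOX,
struct dev_regs, check_device_regs). It only shows the gating logic from
the quoted hunk: status and memdev blocks are always required, the
mailbox only when the driver declared that capability. Whether a regloc
id=3 block may omit the mailbox at all is the spec question raised
above, and the sketch does not settle it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits a driver would pass in. */
#define DEV_CAP_HDM	0x1u
#define DEV_CAP_MBOX	0x2u

/* Which register blocks the probe step found. */
struct dev_regs {
	bool status_valid;
	bool mbox_valid;
	bool memdev_valid;
};

/* Returns 0 if every block required by @caps is present, -ENXIO if not. */
static int check_device_regs(const struct dev_regs *regs, uint8_t caps)
{
	bool need_mbox = (caps & DEV_CAP_MBOX) != 0;

	/* always required in this model */
	if (!regs->status_valid || !regs->memdev_valid)
		return -ENXIO;

	/* required only if the driver said the device has a mailbox */
	if (need_mbox && !regs->mbox_valid)
		return -ENXIO;

	return 0;
}
```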


Thanks.




* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-07-16  6:06   ` Li, Ming4
@ 2024-07-24  8:24     ` Alejandro Lucero Palau
  2024-07-25  5:51       ` Li, Ming4
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-07-24  8:24 UTC (permalink / raw)
  To: Li, Ming4, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/16/24 07:06, Li, Ming4 wrote:
> On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> CXL region creation involves allocating capacity from device DPA
>> (device-physical-address space) and assigning it to decode a given HPA
>> (host-physical-address space). Before determining how much DPA to
>> allocate the amount of available HPA must be determined. Also, not all
>> HPA is created equal, some specifically targets RAM, some targets PMEM,
>> some is prepared for device-memory flows like HDM-D and HDM-DB, and some
>> is host-only (HDM-H).
>>
>> Wrap all of those concerns into an API that retrieves a root decoder
>> (platform CXL window) that fits the specified constraints and the
>> capacity available for a new region.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m6fbe775541da3cd477d65fa95c8acdc347345b4f
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>   drivers/cxl/core/region.c          | 161 +++++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h                  |   3 +
>>   drivers/cxl/cxlmem.h               |   5 +
>>   drivers/net/ethernet/sfc/efx_cxl.c |  14 +++
>>   include/linux/cxl_accel_mem.h      |   9 ++
>>   5 files changed, 192 insertions(+)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 538ebd5a64fd..ca464bfef77b 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -702,6 +702,167 @@ static int free_hpa(struct cxl_region *cxlr)
>>   	return 0;
>>   }
>>   
>> +
>> +struct cxlrd_max_context {
>> +	struct device * const *host_bridges;
>> +	int interleave_ways;
>> +	unsigned long flags;
>> +	resource_size_t max_hpa;
>> +	struct cxl_root_decoder *cxlrd;
>> +};
>> +
>> +static int find_max_hpa(struct device *dev, void *data)
>> +{
>> +	struct cxlrd_max_context *ctx = data;
>> +	struct cxl_switch_decoder *cxlsd;
>> +	struct cxl_root_decoder *cxlrd;
>> +	struct resource *res, *prev;
>> +	struct cxl_decoder *cxld;
>> +	resource_size_t max;
>> +	int found;
>> +
>> +	if (!is_root_decoder(dev))
>> +		return 0;
>> +
>> +	cxlrd = to_cxl_root_decoder(dev);
>> +	cxld = &cxlrd->cxlsd.cxld;
>> +	if ((cxld->flags & ctx->flags) != ctx->flags) {
>> +		dev_dbg(dev, "find_max_hpa, flags not matching: %08lx vs %08lx\n",
>> +			      cxld->flags, ctx->flags);
>> +		return 0;
>> +	}
>> +
>> +	/* A Host bridge could have more interleave ways than an
>> +	 * endpoint, couldn't it?
>> +	 *
>> +	 * What does interleave ways mean here in terms of the requestor?
>> +	 * Why the FFMWS has 0 interleave ways but root port has 1?
>> +	 */
>> +	if (cxld->interleave_ways != ctx->interleave_ways) {
>> +		dev_dbg(dev, "find_max_hpa, interleave_ways  not matching\n");
>> +		return 0;
>> +	}
>> +
>> +	cxlsd = &cxlrd->cxlsd;
>> +
>> +	guard(rwsem_read)(&cxl_region_rwsem);
>> +	found = 0;
>> +	for (int i = 0; i < ctx->interleave_ways; i++)
>> +		for (int j = 0; j < ctx->interleave_ways; j++)
>> +			if (ctx->host_bridges[i] ==
>> +					cxlsd->target[j]->dport_dev) {
>> +				found++;
>> +				break;
>> +			}
>> +
>> +	if (found != ctx->interleave_ways) {
>> +		dev_dbg(dev, "find_max_hpa, no interleave_ways found\n");
>> +		return 0;
>> +	}
>> +
>> +	/*
>> +	 * Walk the root decoder resource range relying on cxl_region_rwsem to
>> +	 * preclude sibling arrival/departure and find the largest free space
>> +	 * gap.
>> +	 */
>> +	lockdep_assert_held_read(&cxl_region_rwsem);
>> +	max = 0;
>> +	res = cxlrd->res->child;
>> +	if (!res)
>> +		max = resource_size(cxlrd->res);
>> +	else
>> +		max = 0;
>> +
>> +	for (prev = NULL; res; prev = res, res = res->sibling) {
>> +		struct resource *next = res->sibling;
>> +		resource_size_t free = 0;
>> +
>> +		if (!prev && res->start > cxlrd->res->start) {
>> +			free = res->start - cxlrd->res->start;
>> +			max = max(free, max);
>> +		}
>> +		if (prev && res->start > prev->end + 1) {
>> +			free = res->start - prev->end + 1;
>> +			max = max(free, max);
>> +		}
>> +		if (next && res->end + 1 < next->start) {
>> +			free = next->start - res->end + 1;
>> +			max = max(free, max);
>> +		}
>> +		if (!next && res->end + 1 < cxlrd->res->end + 1) {
>> +			free = cxlrd->res->end + 1 - res->end + 1;
>> +			max = max(free, max);
>> +		}
>> +	}
>> +
>> +	if (max > ctx->max_hpa) {
>> +		if (ctx->cxlrd)
>> +			put_device(CXLRD_DEV(ctx->cxlrd));
>> +		get_device(CXLRD_DEV(cxlrd));
>> +		ctx->cxlrd = cxlrd;
>> +		ctx->max_hpa = max;
>> +		dev_info(CXLRD_DEV(cxlrd), "found %pa bytes of free space\n", &max);
>> +	}
>> +	return 0;
>> +}
>> +
>> +/**
>> + * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
>> + * @endpoint: an endpoint that is mapped by the returned decoder
>> + * @interleave_ways: number of entries in @host_bridges
>> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
>> + * @max: output parameter of bytes available in the returned decoder
>> + *
>> + * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
>> + * is a point in time snapshot. If by the time the caller goes to use this root
>> + * decoder's capacity the capacity is reduced then caller needs to loop and
>> + * retry.
>> + *
>> + * The returned root decoder has an elevated reference count that needs to be
>> + * put with put_device(cxlrd_dev(cxlrd)). Locking context is with
>> + * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
>> + * does not race.
>> + */
>> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>> +					       int interleave_ways,
>> +					       unsigned long flags,
>> +					       resource_size_t *max)
>> +{
>> +
>> +	struct cxlrd_max_context ctx = {
>> +		.host_bridges = &endpoint->host_bridge,
>> +		.interleave_ways = interleave_ways,
>> +		.flags = flags,
>> +	};
>> +	struct cxl_port *root_port;
>> +	struct cxl_root *root;
>> +
>> +	if (!is_cxl_endpoint(endpoint)) {
>> +		dev_dbg(&endpoint->dev, "hpa requestor is not an endpoint\n");
>> +		return ERR_PTR(-EINVAL);
>> +	}
>> +
>> +	root = find_cxl_root(endpoint);
> Could use scope-based resource management  __free() here to drop below put_device(&root_port->dev);
>
> e.g. struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(endpoint);
>

I must admit I am not yet familiar with the scope-based macros, but I 
think these are different things. The scope of the pointer is limited to 
this function, but the data it references is likely to persist.


The get_device() inside find_cxl_root() is needed to keep the 
device-related data from disappearing while it is referenced by the code 
in this function, and at the time of put_device() the data will be freed 
if the ref counter reaches 0. Am I missing something?
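
For reference, a userspace sketch of what the suggestion amounts to,
assuming __free() is built on the compiler's cleanup attribute (a
GCC/Clang extension). The names here (struct obj, obj_get, obj_put,
auto_put) are hypothetical. The object lifetime is still governed by
the reference count; the attribute only makes the put happen
automatically on every exit path of the scope.

```c
#include <stddef.h>

/* Minimal refcounted object standing in for a struct device. */
struct obj {
	int refcount;
};

static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

static void obj_put(struct obj *o)
{
	if (o)
		o->refcount--;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void obj_put_cleanup(struct obj **p)
{
	obj_put(*p);
}

#define auto_put __attribute__((cleanup(obj_put_cleanup)))

/* Takes a reference for the duration of the call, on all return paths. */
static int use_obj(struct obj *shared)
{
	auto_put struct obj *o = obj_get(shared);

	if (!o)
		return -1;	/* cleanup runs here, put of NULL is a no-op */

	return o->refcount;	/* value is read before the cleanup runs */
}
```

So the reference taken inside the function is dropped when the pointer
goes out of scope, which is the same effect as the explicit put at the
end, without having to repeat it on early returns.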


>> +	if (!root) {
>> +		dev_dbg(&endpoint->dev, "endpoint can not be related to a root port\n");
>> +		return ERR_PTR(-ENXIO);
>> +	}
>> +
>> +	root_port = &root->port;
>> +	down_read(&cxl_region_rwsem);
>> +	device_for_each_child(&root_port->dev, &ctx, find_max_hpa);
>> +	up_read(&cxl_region_rwsem);
>> +	put_device(&root_port->dev);
>> +
>> +	if (!ctx.cxlrd)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	*max = ctx.max_hpa;
>> +	return ctx.cxlrd;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_get_hpa_freespace, CXL);
>> +
>> +
>>   static ssize_t size_store(struct device *dev, struct device_attribute *attr,
>>   			  const char *buf, size_t len)
>>   {
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 9973430d975f..d3fdd2c1e066 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -770,6 +770,9 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
>>   struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
>>   struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
>>   struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
>> +
>> +#define CXLRD_DEV(cxlrd) &cxlrd->cxlsd.cxld.dev
>> +
>>   bool is_root_decoder(struct device *dev);
>>   bool is_switch_decoder(struct device *dev);
>>   bool is_endpoint_decoder(struct device *dev);
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index 8f2a820bd92d..a0e0795ec064 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -877,4 +877,9 @@ struct cxl_hdm {
>>   struct seq_file;
>>   struct dentry *cxl_debugfs_create_dir(const char *dir);
>>   void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
>> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>> +					       int interleave_ways,
>> +					       unsigned long flags,
>> +					       resource_size_t *max);
>> +
>>   #endif /* __CXL_MEM_H__ */
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 2cf4837ddfc1..6d49571ccff7 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -22,6 +22,7 @@ void efx_cxl_init(struct efx_nic *efx)
>>   {
>>   	struct pci_dev *pci_dev = efx->pci_dev;
>>   	struct efx_cxl *cxl = efx->cxl;
>> +	resource_size_t max = 0;
>>   	struct resource res;
>>   	u16 dvsec;
>>   
>> @@ -74,6 +75,19 @@ void efx_cxl_init(struct efx_nic *efx)
>>   	if (IS_ERR(cxl->endpoint))
>>   		pci_info(pci_dev, "CXL accel acquire endpoint failed");
>>   
>> +	cxl->cxlrd = cxl_get_hpa_freespace(cxl->endpoint, 1,
>> +					    CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
>> +					    &max);
>> +
>> +	if (IS_ERR(cxl->cxlrd)) {
>> +		pci_info(pci_dev, "CXL accel get HPA failed");
>> +		goto out;
>> +	}
>> +
>> +	if (max < EFX_CTPIO_BUFFER_SIZE)
>> +		pci_info(pci_dev, "CXL accel not enough free HPA space %llu < %u\n",
>> +				  max, EFX_CTPIO_BUFFER_SIZE);
>> +out:
>>   	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>>   }
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index 701910021df8..f3e77688ffe0 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -6,6 +6,10 @@
>>   #ifndef __CXL_ACCEL_MEM_H
>>   #define __CXL_ACCEL_MEM_H
>>   
>> +#define CXL_DECODER_F_RAM   BIT(0)
>> +#define CXL_DECODER_F_PMEM  BIT(1)
>> +#define CXL_DECODER_F_TYPE2 BIT(2)
>> +
>>   enum accel_resource{
>>   	CXL_ACCEL_RES_DPA,
>>   	CXL_ACCEL_RES_RAM,
>> @@ -32,4 +36,9 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>   
>>   struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
>>   void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
>> +
>> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>> +					       int interleave_ways,
>> +					       unsigned long flags,
>> +					       resource_size_t *max);
>>   #endif
>
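
The free-space walk in the quoted find_max_hpa() can be modelled in
userspace roughly as below. This is a sketch under assumptions, not the
kernel code: struct range and max_free_gap are hypothetical names, the
busy ranges are assumed sorted, non-overlapping and inside the window,
and the window is assumed not to end at the top of the 64-bit space.
Ranges use inclusive ends, as struct resource does, so the gap before a
busy range is start - (previous end + 1).

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive [start, end] range, mirroring struct resource conventions. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Largest contiguous free span of @window not covered by @busy[0..n). */
static uint64_t max_free_gap(struct range window, const struct range *busy,
			     size_t n)
{
	uint64_t cursor = window.start;	/* first address not yet accounted for */
	uint64_t best = 0;

	for (size_t i = 0; i < n; i++) {
		/* gap between the previous allocation and this one */
		if (busy[i].start > cursor && busy[i].start - cursor > best)
			best = busy[i].start - cursor;
		cursor = busy[i].end + 1;
	}

	/* tail gap between the last allocation and the end of the window */
	if (window.end + 1 > cursor && window.end + 1 - cursor > best)
		best = window.end + 1 - cursor;

	return best;
}
```

Tracking an exclusive cursor avoids mixing inclusive and exclusive ends
in the subtraction, which is easy to get wrong by one or two bytes.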


* Re: [PATCH v2 05/15] cxl: fix use of resource_contains
  2024-07-15 17:28 ` [PATCH v2 05/15] cxl: fix use of resource_contains alejandro.lucero-palau
@ 2024-07-24 21:25   ` fan
  2024-08-16 14:43     ` Alejandro Lucero Palau
  2024-08-04 17:25   ` Jonathan Cameron
  2024-08-09  9:14   ` Zhi Wang
  2 siblings, 1 reply; 114+ messages in thread
From: fan @ 2024-07-24 21:25 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, Jul 15, 2024 at 06:28:25PM +0100, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> For a resource defined with size zero, resource contains will also
> return true.

s/resource contains/resource_contains/

Fan
> 
> Add resource size check before using it.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/hdm.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 3df10517a327..4af9225d4b59 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -327,10 +327,13 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>  	cxled->dpa_res = res;
>  	cxled->skip = skipped;
>  
> -	if (resource_contains(&cxlds->pmem_res, res))
> +	if ((resource_size(&cxlds->pmem_res)) && (resource_contains(&cxlds->pmem_res, res))) {
> +		printk("%s: resource_contains CXL_DECODER_PMEM\n", __func__);
>  		cxled->mode = CXL_DECODER_PMEM;
> -	else if (resource_contains(&cxlds->ram_res, res))
> +	} else if ((resource_size(&cxlds->ram_res)) && (resource_contains(&cxlds->ram_res, res))) {
> +		printk("%s: resource_contains CXL_DECODER_RAM\n", __func__);
>  		cxled->mode = CXL_DECODER_RAM;
> +	}
>  	else {
>  		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
>  			 port->id, cxled->cxld.id, cxled->dpa_res);
> -- 
> 2.17.1
> 
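
To illustrate why a zero-sized resource "contains" everything, here is
a simplified userspace model. The names (struct res, RES_INIT, res_size,
res_contains, res_contains_nonempty) are hypothetical, and the model
leaves out the type and flag checks the kernel helper also performs.
With an inclusive end computed as start + size - 1, a resource defined
at 0 with size 0 ends up with end wrapping to all ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive [start, end] resource, simplified. */
struct res {
	uint64_t start;
	uint64_t end;
};

/* end = start + size - 1, so size 0 at start 0 wraps end to UINT64_MAX */
#define RES_INIT(_start, _size) \
	{ (uint64_t)(_start), (uint64_t)(_start) + (uint64_t)(_size) - 1 }

static uint64_t res_size(const struct res *r)
{
	return r->end - r->start + 1;
}

/* Plain range containment check. */
static bool res_contains(const struct res *outer, const struct res *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/* Containment guarded by a size check, as the patch proposes. */
static bool res_contains_nonempty(const struct res *outer,
				  const struct res *inner)
{
	return res_size(outer) != 0 && res_contains(outer, inner);
}
```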


* Re: [PATCH v2 07/15] cxl: support type2 memdev creation
  2024-07-15 17:28 ` [PATCH v2 07/15] cxl: support type2 memdev creation alejandro.lucero-palau
@ 2024-07-24 21:32   ` fan
  2024-08-16 14:57     ` Alejandro Lucero Palau
  2024-08-04 17:31   ` Jonathan Cameron
  1 sibling, 1 reply; 114+ messages in thread
From: fan @ 2024-07-24 21:32 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, Jul 15, 2024 at 06:28:27PM +0100, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Add memdev creation from sfc driver.
> 
> Current cxl core is relying on a CXL_DEVTYPE_CLASSMEM type device when
> creating a memdev leading to problems when obtaining cxl_memdev_state
> references from a CXL_DEVTYPE_DEVMEM type. This last device type is
> managed by a specific vendor driver and does not need same sysfs files
> since not userspace intervention is expected. This patch checks for the
> right device type in those functions using cxl_memdev_state.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/cdat.c            |  3 +++
>  drivers/cxl/core/memdev.c          |  9 +++++++++
>  drivers/cxl/mem.c                  | 17 +++++++++++------
>  drivers/net/ethernet/sfc/efx_cxl.c | 10 ++++++++--
>  include/linux/cxl_accel_mem.h      |  3 +++
>  5 files changed, 34 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index bb83867d9fec..0d4679c137d4 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -558,6 +558,9 @@ void cxl_region_perf_data_calculate(struct cxl_region *cxlr,
>  	};
>  	struct cxl_dpa_perf *perf;
>  
> +	if (!mds)
> +		return;
> +
>  	switch (cxlr->mode) {
>  	case CXL_DECODER_RAM:
>  		perf = &mds->ram_perf;
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 58a51e7fd37f..b902948b121f 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -468,6 +468,9 @@ static umode_t cxl_ram_visible(struct kobject *kobj, struct attribute *a, int n)
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>  
> +	if (!mds)
> +		return 0;
> +
>  	if (a == &dev_attr_ram_qos_class.attr)
>  		if (mds->ram_perf.qos_class == CXL_QOS_CLASS_INVALID)
>  			return 0;
> @@ -487,6 +490,9 @@ static umode_t cxl_pmem_visible(struct kobject *kobj, struct attribute *a, int n
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>  
> +	if (!mds)
> +		return 0;
> +
>  	if (a == &dev_attr_pmem_qos_class.attr)
>  		if (mds->pmem_perf.qos_class == CXL_QOS_CLASS_INVALID)
>  			return 0;
> @@ -507,6 +513,9 @@ static umode_t cxl_memdev_security_visible(struct kobject *kobj,
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>  
> +	if (!mds)
> +		return 0;
> +
>  	if (a == &dev_attr_security_sanitize.attr &&
>  	    !test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
>  		return 0;
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 2f1b49bfe162..f76af75a87b7 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -131,12 +131,14 @@ static int cxl_mem_probe(struct device *dev)
>  	dentry = cxl_debugfs_create_dir(dev_name(dev));
>  	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
>  
> -	if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
> -		debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
> -				    &cxl_poison_inject_fops);
> -	if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
> -		debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
> -				    &cxl_poison_clear_fops);
> +	if (mds) {
> +		if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
> +			debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
> +					    &cxl_poison_inject_fops);
> +		if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
> +			debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
> +					    &cxl_poison_clear_fops);
> +	}
>  
>  	rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
>  	if (rc)
> @@ -222,6 +224,9 @@ static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>  
> +	if (!mds)
> +		return 0;
> +
>  	if (a == &dev_attr_trigger_poison_list.attr)
>  		if (!test_bit(CXL_POISON_ENABLED_LIST,
>  			      mds->poison.enabled_cmds))
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index a84fe7992c53..0abe66490ef5 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -57,10 +57,16 @@ void efx_cxl_init(struct efx_nic *efx)
>  	if (cxl_accel_request_resource(cxl->cxlds, true))
>  		pci_info(pci_dev, "CXL accel resource request failed");
>  
> -	if (!cxl_await_media_ready(cxl->cxlds))
> +	if (!cxl_await_media_ready(cxl->cxlds)) {
>  		cxl_accel_set_media_ready(cxl->cxlds);
> -	else
> +	} else {
>  		pci_info(pci_dev, "CXL accel media not active");
pci_warn() ??
> +		return;
> +	}
> +
> +	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
> +	if (IS_ERR(cxl->cxlmd))
> +		pci_info(pci_dev, "CXL accel memdev creation failed");
pci_err()

Fan
>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index b883c438a132..442ed9862292 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -26,4 +26,7 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>  int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
>  void cxl_accel_set_media_ready(struct cxl_dev_state *cxlds);
>  int cxl_await_media_ready(struct cxl_dev_state *cxlds);
> +
> +struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
> +				       struct cxl_dev_state *cxlds);
>  #endif
> -- 
> 2.17.1
> 


* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-07-24  8:24     ` Alejandro Lucero Palau
@ 2024-07-25  5:51       ` Li, Ming4
  2024-07-25 11:59         ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Li, Ming4 @ 2024-07-25  5:51 UTC (permalink / raw)
  To: Alejandro Lucero Palau, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes

On 7/24/2024 4:24 PM, Alejandro Lucero Palau wrote:
>
> On 7/16/24 07:06, Li, Ming4 wrote:
>> On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
>>> From: Alejandro Lucero <alucerop@amd.com>
>>>
>>> CXL region creation involves allocating capacity from device DPA
>>> (device-physical-address space) and assigning it to decode a given HPA
>>> (host-physical-address space). Before determining how much DPA to
>>> allocate the amount of available HPA must be determined. Also, not all
>>> HPA is create equal, some specifically targets RAM, some target PMEM,
>>> some is prepared for device-memory flows like HDM-D and HDM-DB, and some
>>> is host-only (HDM-H).
>>>
>>> Wrap all of those concerns into an API that retrieves a root decoder
>>> (platform CXL window) that fits the specified constraints and the
>>> capacity available for a new region.
>>>
>>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m6fbe775541da3cd477d65fa95c8acdc347345b4f
>>>
>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>>> ---
>>>   drivers/cxl/core/region.c          | 161 +++++++++++++++++++++++++++++
>>>   drivers/cxl/cxl.h                  |   3 +
>>>   drivers/cxl/cxlmem.h               |   5 +
>>>   drivers/net/ethernet/sfc/efx_cxl.c |  14 +++
>>>   include/linux/cxl_accel_mem.h      |   9 ++
>>>   5 files changed, 192 insertions(+)
>>>
>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>> index 538ebd5a64fd..ca464bfef77b 100644
>>> --- a/drivers/cxl/core/region.c
>>> +++ b/drivers/cxl/core/region.c
>>> @@ -702,6 +702,167 @@ static int free_hpa(struct cxl_region *cxlr)
>>>       return 0;
>>>   }
>>>   +
>>> +struct cxlrd_max_context {
>>> +    struct device * const *host_bridges;
>>> +    int interleave_ways;
>>> +    unsigned long flags;
>>> +    resource_size_t max_hpa;
>>> +    struct cxl_root_decoder *cxlrd;
>>> +};
>>> +
>>> +static int find_max_hpa(struct device *dev, void *data)
>>> +{
>>> +    struct cxlrd_max_context *ctx = data;
>>> +    struct cxl_switch_decoder *cxlsd;
>>> +    struct cxl_root_decoder *cxlrd;
>>> +    struct resource *res, *prev;
>>> +    struct cxl_decoder *cxld;
>>> +    resource_size_t max;
>>> +    int found;
>>> +
>>> +    if (!is_root_decoder(dev))
>>> +        return 0;
>>> +
>>> +    cxlrd = to_cxl_root_decoder(dev);
>>> +    cxld = &cxlrd->cxlsd.cxld;
>>> +    if ((cxld->flags & ctx->flags) != ctx->flags) {
>>> +        dev_dbg(dev, "find_max_hpa, flags not matching: %08lx vs %08lx\n",
>>> +                  cxld->flags, ctx->flags);
>>> +        return 0;
>>> +    }
>>> +
>>> +    /* A Host bridge could have more interleave ways than an
>>> +     * endpoint, couldn´t it?
>>> +     *
>>> +     * What does interleave ways mean here in terms of the requestor?
>>> +     * Why the FFMWS has 0 interleave ways but root port has 1?
>>> +     */
>>> +    if (cxld->interleave_ways != ctx->interleave_ways) {
>>> +        dev_dbg(dev, "find_max_hpa, interleave_ways  not matching\n");
>>> +        return 0;
>>> +    }
>>> +
>>> +    cxlsd = &cxlrd->cxlsd;
>>> +
>>> +    guard(rwsem_read)(&cxl_region_rwsem);
>>> +    found = 0;
>>> +    for (int i = 0; i < ctx->interleave_ways; i++)
>>> +        for (int j = 0; j < ctx->interleave_ways; j++)
>>> +            if (ctx->host_bridges[i] ==
>>> +                    cxlsd->target[j]->dport_dev) {
>>> +                found++;
>>> +                break;
>>> +            }
>>> +
>>> +    if (found != ctx->interleave_ways) {
>>> +        dev_dbg(dev, "find_max_hpa, no interleave_ways found\n");
>>> +        return 0;
>>> +    }
>>> +
>>> +    /*
>>> +     * Walk the root decoder resource range relying on cxl_region_rwsem to
>>> +     * preclude sibling arrival/departure and find the largest free space
>>> +     * gap.
>>> +     */
>>> +    lockdep_assert_held_read(&cxl_region_rwsem);
>>> +    max = 0;
>>> +    res = cxlrd->res->child;
>>> +    if (!res)
>>> +        max = resource_size(cxlrd->res);
>>> +    else
>>> +        max = 0;
>>> +
>>> +    for (prev = NULL; res; prev = res, res = res->sibling) {
>>> +        struct resource *next = res->sibling;
>>> +        resource_size_t free = 0;
>>> +
>>> +        if (!prev && res->start > cxlrd->res->start) {
>>> +            free = res->start - cxlrd->res->start;
>>> +            max = max(free, max);
>>> +        }
>>> +        if (prev && res->start > prev->end + 1) {
>>> +            free = res->start - prev->end + 1;
>>> +            max = max(free, max);
>>> +        }
>>> +        if (next && res->end + 1 < next->start) {
>>> +            free = next->start - res->end + 1;
>>> +            max = max(free, max);
>>> +        }
>>> +        if (!next && res->end + 1 < cxlrd->res->end + 1) {
>>> +            free = cxlrd->res->end + 1 - res->end + 1;
>>> +            max = max(free, max);
>>> +        }
>>> +    }
>>> +
>>> +    if (max > ctx->max_hpa) {
>>> +        if (ctx->cxlrd)
>>> +            put_device(CXLRD_DEV(ctx->cxlrd));
>>> +        get_device(CXLRD_DEV(cxlrd));
>>> +        ctx->cxlrd = cxlrd;
>>> +        ctx->max_hpa = max;
>>> +        dev_info(CXLRD_DEV(cxlrd), "found %pa bytes of free space\n", &max);
>>> +    }
>>> +    return 0;
>>> +}
>>> +
>>> +/**
>>> + * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
>>> + * @endpoint: an endpoint that is mapped by the returned decoder
>>> + * @interleave_ways: number of entries in @host_bridges
>>> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
>>> + * @max: output parameter of bytes available in the returned decoder
>>> + *
>>> + * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
>>> + * is a point in time snapshot. If by the time the caller goes to use this root
>>> + * decoder's capacity the capacity is reduced then caller needs to loop and
>>> + * retry.
>>> + *
>>> + * The returned root decoder has an elevated reference count that needs to be
>>> + * put with put_device(cxlrd_dev(cxlrd)). Locking context is with
>>> + * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
>>> + * does not race.
>>> + */
>>> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>>> +                           int interleave_ways,
>>> +                           unsigned long flags,
>>> +                           resource_size_t *max)
>>> +{
>>> +
>>> +    struct cxlrd_max_context ctx = {
>>> +        .host_bridges = &endpoint->host_bridge,
>>> +        .interleave_ways = interleave_ways,
>>> +        .flags = flags,
>>> +    };
>>> +    struct cxl_port *root_port;
>>> +    struct cxl_root *root;
>>> +
>>> +    if (!is_cxl_endpoint(endpoint)) {
>>> +        dev_dbg(&endpoint->dev, "hpa requestor is not an endpoint\n");
>>> +        return ERR_PTR(-EINVAL);
>>> +    }
>>> +
>>> +    root = find_cxl_root(endpoint);
>> Could use scope-based resource management  __free() here to drop below put_device(&root_port->dev);
>>
>> e.g. struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(endpoint);
>>
>
> I need to admit not familiar yet with scope-based macros, but I think these are different things. The scope of the pointer is inside this function, but the data referenced is likely to persist.
>
>
>  get_device, inside find_cxl_root, is needed to avoid the device-related data disappearing while referenced by the code inside this function, and at the time of put_device, the data will be freed if ref counter reaches 0. Am I missing something?
>
Yes, get_device() keeps the device-related data from disappearing. __free(put_cxl_root) releases the reference on cxl_root->port.dev automatically when cxl_get_hpa_freespace() returns, so you don't need an explicit put_device(&root_port->dev) in the function.

I think your case is similar to this patch:

https://lore.kernel.org/all/170449247353.3779673.5963704495491343135.stgit@djiang5-mobl3/
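
For illustration only, here is a minimal userspace sketch of the mechanism, assuming a compiler with the GCC/Clang cleanup attribute (which is what the kernel's __free() is built on). struct fake_dev, fake_get(), fake_put() and FAKE_AUTO_PUT are made-up stand-ins, not kernel APIs:

```c
#include <stddef.h>

/* Hypothetical refcounted object standing in for a struct device. */
struct fake_dev {
	int refcount;
};

static struct fake_dev *fake_get(struct fake_dev *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

static void fake_put(struct fake_dev *dev)
{
	if (dev)
		dev->refcount--;
}

/* A cleanup handler receives a pointer to the annotated variable. */
static void fake_put_cleanup(struct fake_dev **devp)
{
	fake_put(*devp);
}

/* Hypothetical analogue of __free(put_cxl_root). */
#define FAKE_AUTO_PUT __attribute__((cleanup(fake_put_cleanup)))

/*
 * Takes a reference for the duration of the function. The reference is
 * dropped on every return path without an explicit fake_put() call.
 */
int use_dev_scoped(struct fake_dev *dev, int fail_early)
{
	struct fake_dev *ref FAKE_AUTO_PUT = fake_get(dev);

	if (fail_early)
		return -1;		/* reference dropped here */

	return ref->refcount;		/* value read first, then dropped */
}
```

The point is that the cleanup hook is an arbitrary function, so it can drop a reference rather than free memory, and early returns cannot leak it.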


>
>>> +    if (!root) {
>>> +        dev_dbg(&endpoint->dev, "endpoint can not be related to a root port\n");
>>> +        return ERR_PTR(-ENXIO);
>>> +    }
>>> +
>>> +    root_port = &root->port;
>>> +    down_read(&cxl_region_rwsem);
>>> +    device_for_each_child(&root_port->dev, &ctx, find_max_hpa);
>>> +    up_read(&cxl_region_rwsem);
>>> +    put_device(&root_port->dev);
>>> +
>>> +    if (!ctx.cxlrd)
>>> +        return ERR_PTR(-ENOMEM);
>>> +
>>> +    *max = ctx.max_hpa;
>>> +    return ctx.cxlrd;
>>> +}
>>> +EXPORT_SYMBOL_NS_GPL(cxl_get_hpa_freespace, CXL);
>>> +
>>> +
>>>   static ssize_t size_store(struct device *dev, struct device_attribute *attr,
>>>                 const char *buf, size_t len)
>>>   {
>>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>>> index 9973430d975f..d3fdd2c1e066 100644
>>> --- a/drivers/cxl/cxl.h
>>> +++ b/drivers/cxl/cxl.h
>>> @@ -770,6 +770,9 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
>>>   struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
>>>   struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
>>>   struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
>>> +
>>> +#define CXLRD_DEV(cxlrd) &cxlrd->cxlsd.cxld.dev
>>> +
>>>   bool is_root_decoder(struct device *dev);
>>>   bool is_switch_decoder(struct device *dev);
>>>   bool is_endpoint_decoder(struct device *dev);
>>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>>> index 8f2a820bd92d..a0e0795ec064 100644
>>> --- a/drivers/cxl/cxlmem.h
>>> +++ b/drivers/cxl/cxlmem.h
>>> @@ -877,4 +877,9 @@ struct cxl_hdm {
>>>   struct seq_file;
>>>   struct dentry *cxl_debugfs_create_dir(const char *dir);
>>>   void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
>>> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>>> +                           int interleave_ways,
>>> +                           unsigned long flags,
>>> +                           resource_size_t *max);
>>> +
>>>   #endif /* __CXL_MEM_H__ */
>>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>>> index 2cf4837ddfc1..6d49571ccff7 100644
>>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>>> @@ -22,6 +22,7 @@ void efx_cxl_init(struct efx_nic *efx)
>>>   {
>>>       struct pci_dev *pci_dev = efx->pci_dev;
>>>       struct efx_cxl *cxl = efx->cxl;
>>> +    resource_size_t max = 0;
>>>       struct resource res;
>>>       u16 dvsec;
>>>   @@ -74,6 +75,19 @@ void efx_cxl_init(struct efx_nic *efx)
>>>       if (IS_ERR(cxl->endpoint))
>>>           pci_info(pci_dev, "CXL accel acquire endpoint failed");
>>>   +    cxl->cxlrd = cxl_get_hpa_freespace(cxl->endpoint, 1,
>>> +                        CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
>>> +                        &max);
>>> +
>>> +    if (IS_ERR(cxl->cxlrd)) {
>>> +        pci_info(pci_dev, "CXL accel get HPA failed");
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (max < EFX_CTPIO_BUFFER_SIZE)
>>> +        pci_info(pci_dev, "CXL accel not enough free HPA space %llu < %u\n",
>>> +                  max, EFX_CTPIO_BUFFER_SIZE);
>>> +out:
>>>       cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>>>   }
>>>   diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>>> index 701910021df8..f3e77688ffe0 100644
>>> --- a/include/linux/cxl_accel_mem.h
>>> +++ b/include/linux/cxl_accel_mem.h
>>> @@ -6,6 +6,10 @@
>>>   #ifndef __CXL_ACCEL_MEM_H
>>>   #define __CXL_ACCEL_MEM_H
>>>   +#define CXL_DECODER_F_RAM   BIT(0)
>>> +#define CXL_DECODER_F_PMEM  BIT(1)
>>> +#define CXL_DECODER_F_TYPE2 BIT(2)
>>> +
>>>   enum accel_resource{
>>>       CXL_ACCEL_RES_DPA,
>>>       CXL_ACCEL_RES_RAM,
>>> @@ -32,4 +36,9 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>>     struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
>>>   void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
>>> +
>>> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>>> +                           int interleave_ways,
>>> +                           unsigned long flags,
>>> +                           resource_size_t *max);
>>>   #endif
>>
>



* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-07-25  5:51       ` Li, Ming4
@ 2024-07-25 11:59         ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-07-25 11:59 UTC (permalink / raw)
  To: Li, Ming4, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/25/24 06:51, Li, Ming4 wrote:
> On 7/24/2024 4:24 PM, Alejandro Lucero Palau wrote:
>> On 7/16/24 07:06, Li, Ming4 wrote:
>>> On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
>>>> From: Alejandro Lucero <alucerop@amd.com>
>>>>
>>>>
>>> Could use scope-based resource management  __free() here to drop below put_device(&root_port->dev);
>>>
>>> e.g. struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(endpoint);
>>>
>> I need to admit not familiar yet with scope-based macros, but I think these are different things. The scope of the pointer is inside this function, but the data referenced is likely to persist.
>>
>>
>>   get_device, inside find_cxl_root, is needed to avoid the device-related data disappearing while referenced by the code inside this function, and at the time of put_device, the data will be freed if ref counter reaches 0. Am I missing something?
>>
> Yes, get_device() is to avoid the device-related data disappearing, __free(put_cxl_root) will help to release the reference of cxl_root->port.dev when cxl_get_hpa_freespace() finished, so that you don't need a put_device(&root_port->dev) in the function.
>
> I think that your case is similar to this patch
>
> https://lore.kernel.org/all/170449247353.3779673.5963704495491343135.stgit@djiang5-mobl3/
>

OK, that makes sense. I had wrongly assumed it was only about freeing
memory, but the cleanup function can do other things as well.

I will use it in the next version.

Thanks



* Re: [PATCH v2 08/15] cxl: indicate probe deferral
  2024-07-15 17:28 ` [PATCH v2 08/15] cxl: indicate probe deferral alejandro.lucero-palau
  2024-07-16  5:52   ` Li, Ming4
@ 2024-07-30 16:43   ` Fan Ni
  2024-08-04 17:41   ` Jonathan Cameron
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 114+ messages in thread
From: Fan Ni @ 2024-07-30 16:43 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, Jul 15, 2024 at 06:28:28PM +0100, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> The first stop for a CXL accelerator driver that wants to establish new
> CXL.mem regions is to register a 'struct cxl_memdev. That kicks off
> cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
> topology up to the root.
> 
> If the root driver has not attached yet the expectation is that the
> driver waits until that link is established. The common cxl_pci_driver
> has reason to keep the 'struct cxl_memdev' device attached to the bus
> until the root driver attaches. An accelerator may want to instead defer
> probing until CXL resources can be acquired.
> 
> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
> accelerator driver probing should be defferred vs failed. Provide that
> indication via a new cxl_acquire_endpoint() API that can retrieve the
> probe status of the memdev.
> 
> The first consumer of this API is a test driver that excercises the CXL
> Type-2 flow.
> 
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c          | 41 ++++++++++++++++++++++++++++++
>  drivers/cxl/core/port.c            |  2 +-
>  drivers/cxl/mem.c                  |  7 +++--
>  drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
>  include/linux/cxl_accel_mem.h      |  3 +++
>  5 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index b902948b121f..d51c8bfb32e3 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>  }
>  EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>  
> +/*
> + * Try to get a locked reference on a memdev's CXL port topology
> + * connection. Be careful to observe when cxl_mem_probe() has deposited
> + * a probe deferral awaiting the arrival of the CXL root driver
> +*/
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_port *endpoint;
> +	int rc = -ENXIO;
> +
> +	device_lock(&cxlmd->dev);
> +	endpoint = cxlmd->endpoint;
> +	if (!endpoint)
> +		goto err;
> +
> +	if (IS_ERR(endpoint)) {
> +		rc = PTR_ERR(endpoint);
> +		goto err;
> +	}
> +
> +	device_lock(&endpoint->dev);
> +	if (!endpoint->dev.driver)
> +		goto err_endpoint;
> +
> +	return endpoint;
> +
> +err_endpoint:
> +	device_unlock(&endpoint->dev);
> +err:
> +	device_unlock(&cxlmd->dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
> +
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
> +{
> +	device_unlock(&endpoint->dev);
> +	device_unlock(&cxlmd->dev);
> +}
> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
> +
>  static void sanitize_teardown_notifier(void *data)
>  {
>  	struct cxl_memdev_state *mds = data;
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index d66c6349ed2d..3c6b896c5f65 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>  		 */
>  		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>  			dev_name(dport_dev));
> -		return -ENXIO;
> +		return -EPROBE_DEFER;
>  	}
>  
>  	parent_port = find_cxl_port(dparent, &parent_dport);
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index f76af75a87b7..383a6f4829d3 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
>  		return rc;
>  
>  	rc = devm_cxl_enumerate_ports(cxlmd);
> -	if (rc)
> +	if (rc) {
> +		cxlmd->endpoint = ERR_PTR(rc);
>  		return rc;
> +	}
>  
>  	parent_port = cxl_mem_find_port(cxlmd, &dport);
>  	if (!parent_port) {
>  		dev_err(dev, "CXL port topology not found\n");
> -		return -ENXIO;
> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
> +		return -EPROBE_DEFER;
>  	}
>  
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 0abe66490ef5..2cf4837ddfc1 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -65,8 +65,16 @@ void efx_cxl_init(struct efx_nic *efx)
>  	}
>  
>  	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
> -	if (IS_ERR(cxl->cxlmd))
> +	if (IS_ERR(cxl->cxlmd)) {
>  		pci_info(pci_dev, "CXL accel memdev creation failed");
pci_err()?
> +		return;
> +	}
> +
> +	cxl->endpoint = cxl_acquire_endpoint(cxl->cxlmd);
> +	if (IS_ERR(cxl->endpoint))
> +		pci_info(pci_dev, "CXL accel acquire endpoint failed");
pci_err()?
> +
> +	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index 442ed9862292..701910021df8 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -29,4 +29,7 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds);
>  
>  struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>  				       struct cxl_dev_state *cxlds);
> +
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
>  #endif
> -- 
> 2.17.1
> 


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-07-19  6:03     ` Alejandro Lucero Palau
@ 2024-08-04 16:44       ` Jonathan Cameron
  2024-08-09  7:26         ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 16:44 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes

> >> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> >> new file mode 100644
> >> index 000000000000..daf46d41f59c
> >> --- /dev/null
> >> +++ b/include/linux/cxl_accel_mem.h
> >> @@ -0,0 +1,22 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> >> +
> >> +#include <linux/cdev.h>  
> > Don't think this header is needed?
> >  
> >> +
> >> +#ifndef __CXL_ACCEL_MEM_H
> >> +#define __CXL_ACCEL_MEM_H
> >> +
> >> +enum accel_resource{
> >> +	CXL_ACCEL_RES_DPA,
> >> +	CXL_ACCEL_RES_RAM,
> >> +	CXL_ACCEL_RES_PMEM,
> >> +};
> >> +
> >> +typedef struct cxl_dev_state cxl_accel_state;  
> > Please use 'struct cxl_dev_state' directly. There's no good reason to hide the type.  
> 
> 
> That is what I think I was told to do, although not explicitly. There 
> were concerns in the RFC about accel drivers being too loose in what 
> they do with CXL, and that the CXL core should somehow keep as much 
> control as possible. I even thought I was being asked to implement 
> auxbus, with the CXL part of an accel as an auxiliary device bound to 
> a CXL core driver. Then Jonathan Cameron was the only one explicitly 
> offering the opaque approach and advising against the auxbus idea.

I wasn't thinking of a typedef to hide it.
More of making all the needed state accesses go through accessor functions,
so that the internals become opaque to the accelerator code and
we can radically change how things are structured internally with
no impact on the (hopefully large number of) CXL accelerator drivers.

So here, I'd just expect a
struct cxl_dev_state; forward declaration.

Or potentially one to a different structure after refactors etc.
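
Roughly what I mean, as a standalone userspace sketch rather than real CXL code. The sketch_*() names are invented for the example and the struct body is a cut-down stand-in (only the serial and cxl_dvsec fields from this patch); in the kernel the two halves would live in a public header and in the core respectively:

```c
#include <stdint.h>
#include <stdlib.h>

/* ---- what an accelerator-facing header would expose ---- */

struct cxl_dev_state;	/* forward declaration only, layout stays private */

struct cxl_dev_state *sketch_state_create(void);
void sketch_state_destroy(struct cxl_dev_state *cxlds);
void sketch_set_serial(struct cxl_dev_state *cxlds, uint64_t serial);
uint64_t sketch_get_serial(const struct cxl_dev_state *cxlds);

/* ---- what only the core would see ---- */

struct cxl_dev_state {
	uint64_t serial;
	uint16_t cxl_dvsec;
};

struct cxl_dev_state *sketch_state_create(void)
{
	/* Zeroed allocation; returns NULL on failure. */
	return calloc(1, sizeof(struct cxl_dev_state));
}

void sketch_state_destroy(struct cxl_dev_state *cxlds)
{
	free(cxlds);
}

void sketch_set_serial(struct cxl_dev_state *cxlds, uint64_t serial)
{
	cxlds->serial = serial;
}

uint64_t sketch_get_serial(const struct cxl_dev_state *cxlds)
{
	return cxlds->serial;
}
```

A driver that only includes the first half can hold and pass the pointer around but cannot touch the fields, so the layout can change without touching any accelerator driver.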

> 
> 
> Maybe I need an explicit action here.

J


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
                     ` (2 preceding siblings ...)
  2024-07-18 23:12   ` Dave Jiang
@ 2024-08-04 17:10   ` Jonathan Cameron
  2024-08-12 11:16     ` Alejandro Lucero Palau
  2024-08-09  8:34   ` Zhi Wang
  4 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 17:10 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:21 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Differientiate Type3, aka memory expanders, from Type2, aka device
> accelerators, with a new function for initializing cxl_dev_state.
> 
> Create opaque struct to be used by accelerators relying on new access
> functions in following patches.
> 
> Add SFC ethernet network driver as the client.
> 
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Hi Alejandro,

Various comments inline. Mostly minor detail as I need to get my head
around the whole thing which will take a while yet!

Some will seem very fussy given the stage we are at (and fairly
long way to go), but cleaner code will generally be easier to read
so may help move the bigger stuff forwards quicker.
+ I had my review brain in gear so couldn't ignore things.

Jonathan

> ---
>  drivers/cxl/core/memdev.c             | 52 ++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/Makefile     |  2 +-
>  drivers/net/ethernet/sfc/efx.c        |  4 ++
>  drivers/net/ethernet/sfc/efx_cxl.c    | 53 +++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++++++++
>  drivers/net/ethernet/sfc/net_driver.h |  4 ++
>  include/linux/cxl_accel_mem.h         | 22 +++++++++++
>  include/linux/cxl_accel_pci.h         | 23 ++++++++++++
>  8 files changed, 188 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>  create mode 100644 include/linux/cxl_accel_mem.h
>  create mode 100644 include/linux/cxl_accel_pci.h
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 0277726afd04..61b5d35b49e7 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c

> @@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
>  	return 0;
>  }
>  
> +
> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
> +{
> +	cxlds->cxl_dvsec = dvsec;

Nothing to do with accel. If these make sense promote to cxl
core and a linux/cxl/ header.  Also we may want the type3 driver to
switch to them long term. If nothing else, making that handle the
cxl_dev_state as more opaque will show up what is still directly
accessed and may need to be wrapped up for a future accelerator driver
to use.


> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_dvsec, CXL);
> +
> +void cxl_accel_set_serial(struct cxl_dev_state *cxlds, u64 serial)
> +{
> +	cxlds->serial= serial;

Run checkpatch with --strict over this series before v3 and fix the
warnings. It would probably have spotted the missing space before '='.

Sure it's a series that is kind of RFC ish at the moment but clean
code means you don't get nitpickers like me pointing this stuff out!

> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_serial, CXL);
> +
> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> +			    enum accel_resource type)
> +{
> +	switch (type) {
> +	case CXL_ACCEL_RES_DPA:
> +		cxlds->dpa_res = res;
> +		return;
> +	case CXL_ACCEL_RES_RAM:
> +		cxlds->ram_res = res;
> +		return;
> +	case CXL_ACCEL_RES_PMEM:
> +		cxlds->pmem_res = res;
> +		return;
> +	default:
> +		dev_err(cxlds->dev, "unkown resource type (%u)\n", type);
Typo (unkown -> unknown). Plus I'd let this return an error, as we may
well have more types in future and not handle them all.
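
Something along these lines, as a rough userspace sketch only. struct sketch_res and struct sketch_state are simplified stand-ins for struct resource and struct cxl_dev_state, and sketch_set_resource() is a made-up name; the enum is the one from this patch:

```c
#include <errno.h>

enum accel_resource {
	CXL_ACCEL_RES_DPA,
	CXL_ACCEL_RES_RAM,
	CXL_ACCEL_RES_PMEM,
};

/* Simplified stand-ins for struct resource and struct cxl_dev_state. */
struct sketch_res {
	unsigned long long start;
	unsigned long long end;
};

struct sketch_state {
	struct sketch_res dpa_res;
	struct sketch_res ram_res;
	struct sketch_res pmem_res;
};

/* Returns 0 on success, -EINVAL for a resource type it does not handle. */
int sketch_set_resource(struct sketch_state *st, struct sketch_res res,
			enum accel_resource type)
{
	switch (type) {
	case CXL_ACCEL_RES_DPA:
		st->dpa_res = res;
		return 0;
	case CXL_ACCEL_RES_RAM:
		st->ram_res = res;
		return 0;
	case CXL_ACCEL_RES_PMEM:
		st->pmem_res = res;
		return 0;
	default:
		/* Unknown type: report it instead of silently ignoring it. */
		return -EINVAL;
	}
}
```

The caller can then fail its own setup instead of carrying on with a resource that was never recorded.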

> +	}
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
> +
>  static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>  {
>  	struct cxl_memdev *cxlmd =

> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
> index e9d9de8e648a..cb3f74d30852 100644
> --- a/drivers/net/ethernet/sfc/efx.c
> +++ b/drivers/net/ethernet/sfc/efx.c
> @@ -33,6 +33,7 @@
>  #include "selftest.h"
>  #include "sriov.h"
>  #include "efx_devlink.h"
> +#include "efx_cxl.h"
>  
>  #include "mcdi_port_common.h"
>  #include "mcdi_pcol.h"
> @@ -899,6 +900,7 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>  	efx_pci_remove_main(efx);
>  
>  	efx_fini_io(efx);
> +

Make sure you don't add noisy whitespace changes in v3. Slows down
review and makes a patch set look bigger than it is.

>  	pci_dbg(efx->pci_dev, "shutdown successful\n");
>  
>  	efx_fini_devlink_and_unlock(efx);
> @@ -1109,6 +1111,8 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
>  	if (rc)
>  		goto fail2;
>  
> +	efx_cxl_init(efx);
> +
As below, have an error code. This is not something we want to fail
and have the driver carry on.

>  	rc = efx_pci_probe_post_io(efx);
>  	if (rc) {
>  		/* On failure, retry once immediately.
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> new file mode 100644
> index 000000000000..4554dd7cca76
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -0,0 +1,53 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/****************************************************************************
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +
> +#include <linux/pci.h>
> +#include <linux/cxl_accel_mem.h>
> +#include <linux/cxl_accel_pci.h>
> +
> +#include "net_driver.h"
> +#include "efx_cxl.h"
> +
> +#define EFX_CTPIO_BUFFER_SIZE	(1024*1024*256)
> +
> +void efx_cxl_init(struct efx_nic *efx)
> +{
> +	struct pci_dev *pci_dev = efx->pci_dev;
> +	struct efx_cxl *cxl = efx->cxl;
> +	struct resource res;
> +	u16 dvsec;
> +
> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
> +					  CXL_DVSEC_PCIE_DEVICE);
> +
> +	if (!dvsec)
> +		return;
> +
> +	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");

pci_dbg();  

> +
> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
> +	if (IS_ERR(cxl->cxlds)) {
> +		pci_info(pci_dev, "CXL accel device state failed");
> +		return;

Return an error.  A driver calling CXL stuff that fails is going to
want to know.

> +	}
> +
> +	cxl_accel_set_dvsec(cxl->cxlds, dvsec);
> +	cxl_accel_set_serial(cxl->cxlds, pci_dev->dev.id);
> +
> +	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_DPA);
> +
> +	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
> +}
> +
> +
> +MODULE_IMPORT_NS(CXL);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
> new file mode 100644
> index 000000000000..76c6794c20d8
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
> @@ -0,0 +1,29 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/****************************************************************************
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +#ifndef EFX_CXL_H
> +#define EFX_CLX_H
> +
> +#include <linux/cxl_accel_mem.h>

Maybe, or maybe just some more forward declarations to keep all the headers
nice and separate.

> +
> +struct efx_nic;
> +
> +struct efx_cxl {
> +	cxl_accel_state *cxlds;
There are other ways to keep this opaque that let you embed the structure
into one you do know about.  They usually involve allocating a
cxl_dev_state + your structure and some cxl_devstate_private()
accessors to get to the data placed after the cxlds part.

May not be worth bothering here though, particularly as the CXL-ness
of the device may not be the most important part and you may well be
doing similar tricks anyway to hide some other subsystem specific driver.

So for now this looks like a sensible approach to me.
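
For reference, a minimal userspace sketch of the embedding trick described
above.  Every name in it (core_state, core_state_alloc(),
core_state_private(), struct my_drv) is hypothetical and only illustrates
the pattern; none of this is an existing CXL core API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* "Core" side: the layout is meant to be known only to the core. */
struct core_state {
	uint64_t serial;
	uint16_t dvsec;
};

/* Round the core struct up so the private area is suitably aligned. */
#define CORE_ALIGN	_Alignof(max_align_t)
#define CORE_SIZE	((sizeof(struct core_state) + CORE_ALIGN - 1) & \
			 ~(CORE_ALIGN - 1))

/* Allocate the core state plus priv_size bytes of driver data, zeroed. */
struct core_state *core_state_alloc(size_t priv_size)
{
	return calloc(1, CORE_SIZE + priv_size);
}

/* Accessor: driver data lives directly after the padded core state. */
void *core_state_private(struct core_state *cs)
{
	return (char *)cs + CORE_SIZE;
}

/* "Driver" side: only ever uses the pointer and the accessor. */
struct my_drv {
	int region_id;
	void *mapping;
};
```

The driver gets a single allocation and never needs the core layout, at the
cost of one more accessor to maintain.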

> +	struct cxl_memdev *cxlmd;
> +	struct cxl_root_decoder *cxlrd;
> +	struct cxl_port *endpoint;
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_region *efx_region;
> +	void __iomem *ctpio_cxl;
> +};
> +
> +void efx_cxl_init(struct efx_nic *efx);
> +#endif

> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> new file mode 100644
> index 000000000000..daf46d41f59c
> --- /dev/null
> +++ b/include/linux/cxl_accel_mem.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> +
> +#include <linux/cdev.h>
> +
> +#ifndef __CXL_ACCEL_MEM_H
> +#define __CXL_ACCEL_MEM_H
> +
> +enum accel_resource{
> +	CXL_ACCEL_RES_DPA,
> +	CXL_ACCEL_RES_RAM,
> +	CXL_ACCEL_RES_PMEM,
> +};
> +
> +typedef struct cxl_dev_state cxl_accel_state;

A forward declaration would work, like you do for struct efx_nic in
efx_cxl.h above. It keeps the structure opaque unless code actually needs
to know what is in it. That code can include the header
that defines it.  In many cases it will be an opaque pointer
passed to code in the CXL core.

struct cxl_dev_state;

Then use pointers to that in these functions.
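
A rough userspace sketch of what that looks like, assuming the header carries
only the forward declaration and the core's .c file carries the definition.
The accel_*() helpers and the two-field layout are made up for illustration
and are not the functions from this series.

```c
#include <stdint.h>
#include <stdlib.h>

/* --- what the public header would contain --- */
struct cxl_dev_state;	/* forward declaration only, layout stays hidden */

struct cxl_dev_state *accel_state_create(void);
void accel_set_dvsec(struct cxl_dev_state *cxlds, uint16_t dvsec);
uint16_t accel_get_dvsec(const struct cxl_dev_state *cxlds);

/* --- what only the core's .c file would contain --- */
struct cxl_dev_state {		/* stand-in layout, not the real one */
	uint16_t cxl_dvsec;
	uint64_t serial;
};

struct cxl_dev_state *accel_state_create(void)
{
	return calloc(1, sizeof(struct cxl_dev_state));
}

void accel_set_dvsec(struct cxl_dev_state *cxlds, uint16_t dvsec)
{
	cxlds->cxl_dvsec = dvsec;
}

uint16_t accel_get_dvsec(const struct cxl_dev_state *cxlds)
{
	return cxlds->cxl_dvsec;
}
```

No typedef is needed: callers can hold and pass the pointer but cannot
dereference it without including the defining header.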

> +cxl_accel_state *cxl_accel_state_create(struct device *dev);
> +
> +void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
> +void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> +			    enum accel_resource);
> +#endif
> diff --git a/include/linux/cxl_accel_pci.h b/include/linux/cxl_accel_pci.h
> new file mode 100644
> index 000000000000..c337ae8797e6
> --- /dev/null
> +++ b/include/linux/cxl_accel_pci.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> +
> +#ifndef __CXL_ACCEL_PCI_H
> +#define __CXL_ACCEL_PCI_H
> +
> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
> +#define CXL_DVSEC_PCIE_DEVICE					0
> +#define   CXL_DVSEC_CAP_OFFSET		0xA
> +#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
> +#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
> +#define   CXL_DVSEC_CTRL_OFFSET		0xC
> +#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
> +#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
> +#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
> +#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
> +#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
> +#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
> +#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
> +#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
> +#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)

As I think Dave suggested, pull any defs you need to linux/cxl/pci.h or whatever
makes sense and make the existing code look for them there.

Ideally do that in a patch that does nothing else as simple
moves are easier to review quickly than ones mixed with real changes.


> +
> +#endif



* Re: [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-07-15 17:28 ` [PATCH v2 02/15] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
  2024-07-16  6:26   ` Li, Ming4
  2024-07-18 23:27   ` Dave Jiang
@ 2024-08-04 17:15   ` Jonathan Cameron
  2024-08-14  7:56     ` Alejandro Lucero Palau
  2 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 17:15 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:22 +0100
alejandro.lucero-palau@amd.com wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Create a new function for a type2 device initialising the opaque
> cxl_dev_state struct regarding cxl regs setup and mapping.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/pci.c                  | 28 ++++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.c |  3 +++
>  include/linux/cxl_accel_mem.h      |  1 +
>  3 files changed, 32 insertions(+)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index e53646e9f2fb..b34d6259faf4 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -11,6 +11,7 @@
>  #include <linux/pci.h>
>  #include <linux/aer.h>
>  #include <linux/io.h>
> +#include <linux/cxl_accel_mem.h>
>  #include "cxlmem.h"
>  #include "cxlpci.h"
>  #include "cxl.h"
> @@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  	return cxl_setup_regs(map);
>  }
>  
> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_register_map map;
> +	int rc;
> +
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> +				&cxlds->reg_map);
> +	if (rc)
> +		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);

Not fatal?  If we think it will happen on real devices, then dev_warn
is too strong.

> +
> +	rc = cxl_map_component_regs(&cxlds->reg_map, &cxlds->regs.component,
> +				    BIT(CXL_CM_CAP_CAP_ID_RAS));
> +	if (rc)
> +		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");

pci_err() or similar would make sense here as we have asked for something
that isn't happening. Specification says this is mandatory so
definitely smells like a fatal error to me.


> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_accel_setup_regs, CXL);
> +
>  static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>  {
>  	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 4554dd7cca76..10c4fb915278 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -47,6 +47,9 @@ void efx_cxl_init(struct efx_nic *efx)
>  
>  	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>  	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
> +
> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
> +		pci_info(pci_dev, "CXL accel setup regs failed");
Handle errors fully. That is, report them up to the caller.

>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index daf46d41f59c..ca7af4a9cefc 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -19,4 +19,5 @@ void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>  void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>  void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  			    enum accel_resource);
> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>  #endif



* Re: [PATCH v2 03/15] cxl: add function for type2 resource request
  2024-07-18 23:36   ` Dave Jiang
@ 2024-08-04 17:16     ` Jonathan Cameron
  2024-08-14  8:08       ` Alejandro Lucero Palau
  2024-08-14  8:00     ` Alejandro Lucero Palau
  1 sibling, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 17:16 UTC (permalink / raw)
  To: Dave Jiang
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes, Alejandro Lucero

On Thu, 18 Jul 2024 16:36:00 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> On 7/15/24 10:28 AM, alejandro.lucero-palau@amd.com wrote:
> > From: Alejandro Lucero <alucerop@amd.com>
> > 
> > Create a new function for a type2 device requesting a resource
> > passing the opaque struct to work with.
> > 
> > Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> > ---
> >  drivers/cxl/core/memdev.c          | 13 +++++++++++++
> >  drivers/net/ethernet/sfc/efx_cxl.c |  7 ++++++-
> >  include/linux/cxl_accel_mem.h      |  1 +
> >  3 files changed, 20 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index 61b5d35b49e7..04c3a0f8bc2e 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -744,6 +744,19 @@ void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
> >  
> > +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)  
> Maybe declare a common enum like cxl_resource_type instead of 'enum accel_resource' and use here instead of bool?
> 
> > +{
> > +	int rc;
> > +
> > +	if (is_ram)
> > +		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
> > +	else
> > +		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
> > +
> > +	return rc;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
> > +
> >  static int cxl_memdev_release_file(struct inode *inode, struct file *file)
> >  {
> >  	struct cxl_memdev *cxlmd =
> > diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> > index 10c4fb915278..9cefcaf3caca 100644
> > --- a/drivers/net/ethernet/sfc/efx_cxl.c
> > +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> > @@ -48,8 +48,13 @@ void efx_cxl_init(struct efx_nic *efx)
> >  	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
> >  	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
> >  
> > -	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
> > +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds)) {
> >  		pci_info(pci_dev, "CXL accel setup regs failed");
> > +		return;
> > +	}
> > +
> > +	if (cxl_accel_request_resource(cxl->cxlds, true))
> > +		pci_info(pci_dev, "CXL accel resource request failed");  
> 
> pci_warn()? also emitting the errno would be nice. 
Don't hide it at all.  Fail if this doesn't succeed and let the caller
know. Not to mention, tear down any other state already set up.
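
To illustrate the shape I mean, here is a userspace model of an init path
that returns an errno and unwinds what it already set up.  The fake_*()
steps and struct fake_dev are hypothetical stand-ins, not the sfc or CXL
functions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device model; each setup step can be forced to fail. */
struct fake_dev {
	bool fail_regs;
	bool fail_resource;
	bool state_created;
	bool regs_mapped;
};

int fake_state_create(struct fake_dev *d)
{
	d->state_created = true;
	return 0;
}

void fake_state_destroy(struct fake_dev *d)
{
	d->state_created = false;
}

int fake_setup_regs(struct fake_dev *d)
{
	if (d->fail_regs)
		return -ENODEV;
	d->regs_mapped = true;
	return 0;
}

void fake_unmap_regs(struct fake_dev *d)
{
	d->regs_mapped = false;
}

int fake_request_resource(struct fake_dev *d)
{
	return d->fail_resource ? -EBUSY : 0;
}

/* Returns 0 or a negative errno; on failure nothing is left set up. */
int fake_cxl_init(struct fake_dev *d)
{
	int rc;

	rc = fake_state_create(d);
	if (rc)
		return rc;

	rc = fake_setup_regs(d);
	if (rc)
		goto err_state;

	rc = fake_request_resource(d);
	if (rc)
		goto err_regs;

	return 0;

err_regs:
	fake_unmap_regs(d);
err_state:
	fake_state_destroy(d);
	return rc;
}
```

The caller then decides whether a failure is fatal or whether to carry on
without CXL.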
 
> >  }
> >  
> >  
> > diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> > index ca7af4a9cefc..c7b254edc096 100644
> > --- a/include/linux/cxl_accel_mem.h
> > +++ b/include/linux/cxl_accel_mem.h
> > @@ -20,4 +20,5 @@ void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
> >  void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> >  			    enum accel_resource);
> >  int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
> > +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
> >  #endif  
> 



* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-07-15 17:28 ` [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state alejandro.lucero-palau
  2024-07-19 19:01   ` Dave Jiang
@ 2024-08-04 17:22   ` Jonathan Cameron
  2024-08-15 15:43     ` Alejandro Lucero Palau
  2024-08-09  9:10   ` Zhi Wang
  2 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 17:22 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:24 +0100
alejandro.lucero-palau@amd.com wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Type2 devices have some Type3 functionalities as optional like an mbox
> or an hdm decoder, and CXL core needs a way to know what a CXL accelerator
> implements.
> 
> Add a new field for keeping device capabilities to be initialised by
> Type2 drivers. Advertise all those capabilities for Type3.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
In general seems a reasonable approach, so just minor comments.

> ---
>  drivers/cxl/core/mbox.c            |  1 +
>  drivers/cxl/core/memdev.c          |  4 +++-
>  drivers/cxl/core/port.c            |  2 +-
>  drivers/cxl/core/regs.c            | 11 ++++++-----
>  drivers/cxl/cxl.h                  |  2 +-
>  drivers/cxl/cxlmem.h               |  4 ++++
>  drivers/cxl/pci.c                  | 15 +++++++++------
>  drivers/net/ethernet/sfc/efx_cxl.c |  3 ++-
>  include/linux/cxl_accel_mem.h      |  5 ++++-
>  9 files changed, 31 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 2626f3fff201..2ba7d36e3f38 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -1424,6 +1424,7 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
>  	mds->cxlds.reg_map.host = dev;
>  	mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
>  	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
> +	mds->cxlds.capabilities = CXL_DRIVER_CAP_HDM | CXL_DRIVER_CAP_MBOX;

Add a reference for this perhaps.  Make it clear that a type3 device must
support mailbox and hdm by pointing at requirement for the various structures
in a spec reference.

> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index af8169ccdbc0..8f2a820bd92d 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -405,6 +405,9 @@ struct cxl_dpa_perf {
>  	int qos_class;
>  };
>  
> +#define CXL_DRIVER_CAP_HDM	0x1
> +#define CXL_DRIVER_CAP_MBOX	0x2
> +
Enum and BIT() for the defines.  Avoids someone in future
thinking they can define 0x3 to be something.

Definitely only one definition as well. Seems reasonable for
this to be CXL wide.
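
Something along these lines, as a userspace sketch.  BIT() is redefined
here only so the snippet stands alone, and the enum name and the
cxl_has_cap() helper are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)	(1U << (n))	/* userspace stand-in for the kernel macro */

/* One bit per capability, so the next addition is obviously BIT(2). */
enum cxl_driver_cap {
	CXL_DRIVER_CAP_HDM  = BIT(0),
	CXL_DRIVER_CAP_MBOX = BIT(1),
};

/* True only if every bit of cap is set in caps. */
static inline bool cxl_has_cap(uint8_t caps, enum cxl_driver_cap cap)
{
	return (caps & cap) == cap;
}
```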


>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> @@ -438,6 +441,7 @@ struct cxl_dev_state {
>  	struct resource ram_res;
>  	u64 serial;
>  	enum cxl_devtype type;
> +	uint8_t capabilities;
>  };



* Re: [PATCH v2 05/15] cxl: fix use of resource_contains
  2024-07-15 17:28 ` [PATCH v2 05/15] cxl: fix use of resource_contains alejandro.lucero-palau
  2024-07-24 21:25   ` fan
@ 2024-08-04 17:25   ` Jonathan Cameron
  2024-08-16 14:37     ` Alejandro Lucero Palau
  2024-08-09  9:14   ` Zhi Wang
  2 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 17:25 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:25 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> For a resource defined with size zero, resource contains will also
> return true.
> 
> Add resource size check before using it.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
If this can happen in the existing type 3 case, then add a Fixes tag
and send it separately from this series.

If there is no path due to some external code, then
drop the word fix from the title and call it

cxl: harden resource_contains checks to handle zero size resources

Avoids it getting backported into stable / distros picking it
up if there isn't a real issue before this series.
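
For anyone following along, a userspace model of one way an empty resource
ends up "containing" everything.  It assumes the resource was built from a
start/size pair with the DEFINE_RES_*() convention of end = start + size - 1
and models only the range comparison; the type and flag checks are left out
and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal model of the two fields that matter here. */
struct res {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* end = start + size - 1, so size 0 at start 0 wraps end to UINT64_MAX. */
struct res make_res(uint64_t start, uint64_t size)
{
	struct res r = { .start = start, .end = start + size - 1 };
	return r;
}

uint64_t res_size(const struct res *r)
{
	return r->end - r->start + 1;
}

/* Range part of a containment check only. */
bool res_contains(const struct res *outer, const struct res *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/* Hardened form: an empty resource contains nothing. */
bool res_contains_nonempty(const struct res *outer, const struct res *inner)
{
	return res_size(outer) && res_contains(outer, inner);
}
```

With that model, a zero-size resource at 0 reports size 0 but still passes
the plain range check for any inner range, which is what the added
resource_size() test guards against.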

Thanks,

Jonathan

> ---
>  drivers/cxl/core/hdm.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 3df10517a327..4af9225d4b59 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -327,10 +327,13 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>  	cxled->dpa_res = res;
>  	cxled->skip = skipped;
>  
> -	if (resource_contains(&cxlds->pmem_res, res))
> +	if ((resource_size(&cxlds->pmem_res)) && (resource_contains(&cxlds->pmem_res, res))) {
> +		printk("%s: resource_contains CXL_DECODER_PMEM\n", __func__);
>  		cxled->mode = CXL_DECODER_PMEM;
> -	else if (resource_contains(&cxlds->ram_res, res))
> +	} else if ((resource_size(&cxlds->ram_res)) && (resource_contains(&cxlds->ram_res, res))) {
> +		printk("%s: resource_contains CXL_DECODER_RAM\n", __func__);
>  		cxled->mode = CXL_DECODER_RAM;
> +	}
>  	else {
>  		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
>  			 port->id, cxled->cxld.id, cxled->dpa_res);



* Re: [PATCH v2 06/15] cxl: add function for setting media ready by an accelerator
  2024-07-15 17:28 ` [PATCH v2 06/15] cxl: add function for setting media ready by an accelerator alejandro.lucero-palau
@ 2024-08-04 17:26   ` Jonathan Cameron
  2024-08-16 14:54     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 17:26 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:26 +0100
alejandro.lucero-palau@amd.com wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> A Type-2 driver can require to set the memory availability explicitly.
> 
> Add a function to the exported CXL API for accelerator drivers.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/memdev.c          | 7 ++++++-
>  drivers/net/ethernet/sfc/efx_cxl.c | 5 +++++
>  include/linux/cxl_accel_mem.h      | 2 ++
>  3 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index b4205ecca365..58a51e7fd37f 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -714,7 +714,6 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
>  	return 0;
>  }
>  
> -
Grumpy maintainer time ;)
Scrub for this stuff before posting.  Move the whitespace cleanup to the
earlier patch so we have less noise here.

>  void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>  {
>  	cxlds->cxl_dvsec = dvsec;
> @@ -759,6 +758,12 @@ int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
>  
> +void cxl_accel_set_media_ready(struct cxl_dev_state *cxlds)
> +{
> +	cxlds->media_ready = true;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_media_ready, CXL);
> +
>  static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>  {
>  	struct cxl_memdev *cxlmd =
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 37d8bfdef517..a84fe7992c53 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -56,6 +56,11 @@ void efx_cxl_init(struct efx_nic *efx)
>  
>  	if (cxl_accel_request_resource(cxl->cxlds, true))
>  		pci_info(pci_dev, "CXL accel resource request failed");
> +
> +	if (!cxl_await_media_ready(cxl->cxlds))
> +		cxl_accel_set_media_ready(cxl->cxlds);
> +	else
> +		pci_info(pci_dev, "CXL accel media not active");
Feels fatal. pci_err() and return an error.

>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index 0ba2195b919b..b883c438a132 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -24,4 +24,6 @@ void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  			    enum accel_resource);
>  int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>  int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
> +void cxl_accel_set_media_ready(struct cxl_dev_state *cxlds);
> +int cxl_await_media_ready(struct cxl_dev_state *cxlds);
>  #endif



* Re: [PATCH v2 07/15] cxl: support type2 memdev creation
  2024-07-15 17:28 ` [PATCH v2 07/15] cxl: support type2 memdev creation alejandro.lucero-palau
  2024-07-24 21:32   ` fan
@ 2024-08-04 17:31   ` Jonathan Cameron
  2024-08-16 15:00     ` Alejandro Lucero Palau
  1 sibling, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 17:31 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:27 +0100
alejandro.lucero-palau@amd.com wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Add memdev creation from sfc driver.
> 
> Current cxl core is relying on a CXL_DEVTYPE_CLASSMEM type device when
> creating a memdev leading to problems when obtaining cxl_memdev_state
> references from a CXL_DEVTYPE_DEVMEM type. This last device type is
> managed by a specific vendor driver and does not need same sysfs files
> since not userspace intervention is expected. This patch checks for the
> right device type in those functions using cxl_memdev_state.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Same general comment about treating failure to get things you expect
as proper driver probe errors.  Very unlikely we'd ever want to carry
on if these fail. If we do want to, that should be a high level decision
and the chances are the driver needs to know that the error occurred
so it can take some mitigating measures (using some alternative mechanisms
etc).

> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index a84fe7992c53..0abe66490ef5 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -57,10 +57,16 @@ void efx_cxl_init(struct efx_nic *efx)
>  	if (cxl_accel_request_resource(cxl->cxlds, true))
>  		pci_info(pci_dev, "CXL accel resource request failed");
>  
> -	if (!cxl_await_media_ready(cxl->cxlds))
> +	if (!cxl_await_media_ready(cxl->cxlds)) {
>  		cxl_accel_set_media_ready(cxl->cxlds);
> -	else
> +	} else {
>  		pci_info(pci_dev, "CXL accel media not active");
> +		return;
Once you are returning an error in this path you can just have
		return -ETIMEDOUT; or similar here and avoid
this code changing in this patch.
> +	}
> +
> +	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
> +	if (IS_ERR(cxl->cxlmd))
> +		pci_info(pci_dev, "CXL accel memdev creation failed");

I'd treat this one as fatal as well.

People argue in favor of muddling on to allow firmware upgrade etc.
That is fine, but pass up the errors then decide to ignore them
at the higher levels.

>  }
>  
>  




* Re: [PATCH v2 08/15] cxl: indicate probe deferral
  2024-07-15 17:28 ` [PATCH v2 08/15] cxl: indicate probe deferral alejandro.lucero-palau
  2024-07-16  5:52   ` Li, Ming4
  2024-07-30 16:43   ` Fan Ni
@ 2024-08-04 17:41   ` Jonathan Cameron
  2024-08-19 13:54     ` Alejandro Lucero Palau
  2024-08-09 14:40   ` Zhi Wang
  2024-08-26 17:42   ` Zhi Wang
  4 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 17:41 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:28 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> The first stop for a CXL accelerator driver that wants to establish new
> CXL.mem regions is to register a 'struct cxl_memdev. That kicks off
> cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
> topology up to the root.
> 
> If the root driver has not attached yet the expectation is that the
> driver waits until that link is established. The common cxl_pci_driver
> has reason to keep the 'struct cxl_memdev' device attached to the bus
> until the root driver attaches. An accelerator may want to instead defer
> probing until CXL resources can be acquired.
> 
> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
> accelerator driver probing should be defferred vs failed. Provide that
> indication via a new cxl_acquire_endpoint() API that can retrieve the
> probe status of the memdev.
> 
> The first consumer of this API is a test driver that excercises the CXL
Spell check.
exercises

> Type-2 flow.
> 
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c          | 41 ++++++++++++++++++++++++++++++
>  drivers/cxl/core/port.c            |  2 +-
>  drivers/cxl/mem.c                  |  7 +++--
>  drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
>  include/linux/cxl_accel_mem.h      |  3 +++
>  5 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index b902948b121f..d51c8bfb32e3 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>  }
>  EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>  
> +/*
> + * Try to get a locked reference on a memdev's CXL port topology
> + * connection. Be careful to observe when cxl_mem_probe() has deposited
> + * a probe deferral awaiting the arrival of the CXL root driver

It might have deposited an error that isn't deferral I think.
I would be careful to make that clear in this comment.

> +*/
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_port *endpoint;
> +	int rc = -ENXIO;
> +
> +	device_lock(&cxlmd->dev);

I'd not really expect an 'acquire endpoint' to exit
in the good path with the cxlmd->dev device lock held.
Perhaps that needs a bit more shouting in the naming of
the function?

> +	endpoint = cxlmd->endpoint;
> +	if (!endpoint)
> +		goto err;
> +
> +	if (IS_ERR(endpoint)) {
> +		rc = PTR_ERR(endpoint);
> +		goto err;
> +	}
> +
> +	device_lock(&endpoint->dev);
> +	if (!endpoint->dev.driver)
> +		goto err_endpoint;
> +
> +	return endpoint;
> +
> +err_endpoint:
> +	device_unlock(&endpoint->dev);
> +err:
> +	device_unlock(&cxlmd->dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
> +
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
> +{
> +	device_unlock(&endpoint->dev);
> +	device_unlock(&cxlmd->dev);
> +}
> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
> +
>  static void sanitize_teardown_notifier(void *data)
>  {
>  	struct cxl_memdev_state *mds = data;
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index d66c6349ed2d..3c6b896c5f65 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>  		 */
>  		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>  			dev_name(dport_dev));
> -		return -ENXIO;
> +		return -EPROBE_DEFER;
>  	}
>  
>  	parent_port = find_cxl_port(dparent, &parent_dport);
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index f76af75a87b7..383a6f4829d3 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
>  		return rc;
>  
>  	rc = devm_cxl_enumerate_ports(cxlmd);
> -	if (rc)
> +	if (rc) {
> +		cxlmd->endpoint = ERR_PTR(rc);
>  		return rc;
> +	}
>  
>  	parent_port = cxl_mem_find_port(cxlmd, &dport);
>  	if (!parent_port) {
>  		dev_err(dev, "CXL port topology not found\n");

Hmm. This seems an excessive error print for a deferral path.

> -		return -ENXIO;
> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
> +		return -EPROBE_DEFER;
>  	}
>  
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {


* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-07-15 17:28 ` [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration alejandro.lucero-palau
  2024-07-16  0:53   ` kernel test robot
  2024-07-16  6:06   ` Li, Ming4
@ 2024-08-04 17:57   ` Jonathan Cameron
  2024-08-19 14:47     ` Alejandro Lucero Palau
                       ` (2 more replies)
  2 siblings, 3 replies; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 17:57 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:29 +0100
alejandro.lucero-palau@amd.com wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> CXL region creation involves allocating capacity from device DPA
> (device-physical-address space) and assigning it to decode a given HPA
> (host-physical-address space). Before determining how much DPA to
> allocate the amount of available HPA must be determined. Also, not all
> HPA is create equal, some specifically targets RAM, some target PMEM,
> some is prepared for device-memory flows like HDM-D and HDM-DB, and some
> is host-only (HDM-H).
> 
> Wrap all of those concerns into an API that retrieves a root decoder
> (platform CXL window) that fits the specified constraints and the
> capacity available for a new region.
> 
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m6fbe775541da3cd477d65fa95c8acdc347345b4f
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>

Hi.

This seems a lot more complex than an accelerator would need.
If the plan is to use this in the type3 driver as well, I'd like to
see that done as a precursor to the main series.
If it only matters to accelerator drivers (as in type 3 I think
we make this a userspace problem), then limit the code to handle
interleave ways == 1 only.  Maybe we will care about higher interleave
in the long run, but do you have a multihead accelerator today?

Jonathan

> ---
>  drivers/cxl/core/region.c          | 161 +++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h                  |   3 +
>  drivers/cxl/cxlmem.h               |   5 +
>  drivers/net/ethernet/sfc/efx_cxl.c |  14 +++
>  include/linux/cxl_accel_mem.h      |   9 ++
>  5 files changed, 192 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 538ebd5a64fd..ca464bfef77b 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -702,6 +702,167 @@ static int free_hpa(struct cxl_region *cxlr)
>  	return 0;
>  }
>  
> +
> +struct cxlrd_max_context {
> +	struct device * const *host_bridges;
> +	int interleave_ways;
> +	unsigned long flags;
> +	resource_size_t max_hpa;
> +	struct cxl_root_decoder *cxlrd;
> +};
> +
> +static int find_max_hpa(struct device *dev, void *data)
> +{
> +	struct cxlrd_max_context *ctx = data;
> +	struct cxl_switch_decoder *cxlsd;
> +	struct cxl_root_decoder *cxlrd;
> +	struct resource *res, *prev;
> +	struct cxl_decoder *cxld;
> +	resource_size_t max;
> +	int found;
> +
> +	if (!is_root_decoder(dev))
> +		return 0;
> +
> +	cxlrd = to_cxl_root_decoder(dev);
> +	cxld = &cxlrd->cxlsd.cxld;
> +	if ((cxld->flags & ctx->flags) != ctx->flags) {
> +		dev_dbg(dev, "find_max_hpa, flags not matching: %08lx vs %08lx\n",
> +			      cxld->flags, ctx->flags);
> +		return 0;
> +	}
> +
> +	/* A Host bridge could have more interleave ways than an
> +	 * endpoint, couldn´t it?

EP interleave ways is about working out how the full HPA address (it's
all sent over the wire) is modified to get to the DPA.  So it needs
to know what the overall interleave is.  Host bridge can't interleave
and then have the EP not know about it.  If there are switch HDM decoders
in the path, the host bridge interleave may be less than what the EP needs
to deal with.

Does an accelerator actually cope with interleave? Is the aim here to ensure
that IW is never anything other than 1?  Or is this meant to have
more general use? I guess it is meant to. In which case, I'd like to
see this used in the type3 driver as well.

> +	 *
> +	 * What does interleave ways mean here in terms of the requestor?
> +	 * Why the FFMWS has 0 interleave ways but root port has 1?

FFMWS?

> +	 */
> +	if (cxld->interleave_ways != ctx->interleave_ways) {
> +		dev_dbg(dev, "find_max_hpa, interleave_ways  not matching\n");
> +		return 0;
> +	}
> +
> +	cxlsd = &cxlrd->cxlsd;
> +
> +	guard(rwsem_read)(&cxl_region_rwsem);
> +	found = 0;
> +	for (int i = 0; i < ctx->interleave_ways; i++)
> +		for (int j = 0; j < ctx->interleave_ways; j++)
> +			if (ctx->host_bridges[i] ==
> +					cxlsd->target[j]->dport_dev) {
> +				found++;
> +				break;
> +			}
> +
> +	if (found != ctx->interleave_ways) {
> +		dev_dbg(dev, "find_max_hpa, no interleave_ways found\n");
> +		return 0;
> +	}
> +
> +	/*
> +	 * Walk the root decoder resource range relying on cxl_region_rwsem to
> +	 * preclude sibling arrival/departure and find the largest free space
> +	 * gap.
> +	 */
> +	lockdep_assert_held_read(&cxl_region_rwsem);
> +	max = 0;
> +	res = cxlrd->res->child;
> +	if (!res)
> +		max = resource_size(cxlrd->res);
> +	else
> +		max = 0;
> +
> +	for (prev = NULL; res; prev = res, res = res->sibling) {
> +		struct resource *next = res->sibling;
> +		resource_size_t free = 0;
> +
> +		if (!prev && res->start > cxlrd->res->start) {
> +			free = res->start - cxlrd->res->start;
> +			max = max(free, max);
> +		}
> +		if (prev && res->start > prev->end + 1) {
> +			free = res->start - prev->end + 1;
> +			max = max(free, max);
> +		}
> +		if (next && res->end + 1 < next->start) {
> +			free = next->start - res->end + 1;
> +			max = max(free, max);
> +		}
> +		if (!next && res->end + 1 < cxlrd->res->end + 1) {
> +			free = cxlrd->res->end + 1 - res->end + 1;
> +			max = max(free, max);
> +		}
> +	}
> +
> +	if (max > ctx->max_hpa) {
> +		if (ctx->cxlrd)
> +			put_device(CXLRD_DEV(ctx->cxlrd));
> +		get_device(CXLRD_DEV(cxlrd));
> +		ctx->cxlrd = cxlrd;
> +		ctx->max_hpa = max;
> +		dev_info(CXLRD_DEV(cxlrd), "found %pa bytes of free space\n", &max);

dev_dbg()

> +	}
> +	return 0;
> +}
> +
> +/**
> + * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
> + * @endpoint: an endpoint that is mapped by the returned decoder
> + * @interleave_ways: number of entries in @host_bridges
> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
> + * @max: output parameter of bytes available in the returned decoder

@available_size
or something along those lines. I'd expect max to be the end address of the
available region.

> + *
> + * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
> + * is a point in time snapshot. If by the time the caller goes to use this root
> + * decoder's capacity the capacity is reduced then caller needs to loop and
> + * retry.
> + *
> + * The returned root decoder has an elevated reference count that needs to be
> + * put with put_device(cxlrd_dev(cxlrd)). Locking context is with
> + * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
> + * does not race.
> + */
> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
> +					       int interleave_ways,
> +					       unsigned long flags,
> +					       resource_size_t *max)
> +{
> +
> +	struct cxlrd_max_context ctx = {
> +		.host_bridges = &endpoint->host_bridge,
> +		.interleave_ways = interleave_ways,
> +		.flags = flags,
> +	};
> +	struct cxl_port *root_port;
> +	struct cxl_root *root;
> +
> +	if (!is_cxl_endpoint(endpoint)) {
> +		dev_dbg(&endpoint->dev, "hpa requestor is not an endpoint\n");
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	root = find_cxl_root(endpoint);
> +	if (!root) {
> +		dev_dbg(&endpoint->dev, "endpoint can not be related to a root port\n");
> +		return ERR_PTR(-ENXIO);
> +	}
> +
> +	root_port = &root->port;
> +	down_read(&cxl_region_rwsem);
> +	device_for_each_child(&root_port->dev, &ctx, find_max_hpa);
> +	up_read(&cxl_region_rwsem);
> +	put_device(&root_port->dev);
> +
> +	if (!ctx.cxlrd)
> +		return ERR_PTR(-ENOMEM);
> +
> +	*max = ctx.max_hpa;

Rename max_hpa to available_hpa.
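
To make the naming point concrete: what the walk produces is a byte count
for the largest free gap, not an address. A minimal user-space sketch,
assuming inclusive [start, end] ranges as in struct resource and children
that are sorted and non-overlapping. The names toy_range and
toy_largest_free_gap() are made up for illustration and are not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive [start, end] range, the same convention struct resource uses. */
struct toy_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Size in bytes of the largest free gap inside @parent.
 * Assumes @child is sorted by start, non-overlapping, contained in @parent,
 * and that no end equals UINT64_MAX (so end + 1 cannot wrap).
 */
static uint64_t toy_largest_free_gap(struct toy_range parent,
				     const struct toy_range *child, size_t n)
{
	uint64_t cursor = parent.start;	/* first byte not yet known to be used */
	uint64_t best = 0;

	assert(parent.start <= parent.end);

	for (size_t i = 0; i < n; i++) {
		/* Free bytes between the previous child (or parent start) and this one. */
		if (child[i].start > cursor && child[i].start - cursor > best)
			best = child[i].start - cursor;
		cursor = child[i].end + 1;
	}

	/* Free bytes between the last child and the inclusive end of the parent. */
	if (cursor <= parent.end && parent.end - cursor + 1 > best)
		best = parent.end - cursor + 1;

	return best;
}
```

For a parent of [0x1000, 0x1fff] with children [0x1100, 0x11ff] and
[0x1800, 0x1bff] this returns 0x600. With inclusive ends the gap between two
siblings is next.start - prev.end - 1; the "+ 1" forms in the hunk above
give a different number, so that may be worth double-checking.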

> +	return ctx.cxlrd;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_get_hpa_freespace, CXL);
> +
> +



* Re: [PATCH v2 10/15] cxl: define a driver interface for DPA allocation
  2024-07-15 17:28 ` [PATCH v2 10/15] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
  2024-07-16  3:32   ` kernel test robot
@ 2024-08-04 18:07   ` Jonathan Cameron
  2024-08-19 15:52     ` Alejandro Lucero Palau
  2024-08-06 17:33   ` Fan Ni
  2 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 18:07 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:30 +0100
alejandro.lucero-palau@amd.com wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Region creation involves finding available DPA (device-physical-address)
> capacity to map into HPA (host-physical-address) space. Given the HPA
> capacity constraint, define an API, cxl_request_dpa(), that has the
> flexibility to  map the minimum amount of memory the driver needs to
> operate vs the total possible that can be mapped given HPA availability.
> 
> Factor out the core of cxl_dpa_alloc, that does free space scanning,
> into a cxl_dpa_freespace() helper, and use that to balance the capacity
> available to map vs the @min and @max arguments to cxl_request_dpa.
> 
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m4271ee49a91615c8af54e3ab20679f8be3099393
> 
Use the permalink under these to get shorter links.
https://lore.kernel.org/linux-cxl/168592158743.1948938.7622563891193802610.stgit@dwillia2-xfh.jf.intel.com/
goes to the same patch.


> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>


> +
> +int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
> +{
> +	struct cxl_port *port = cxled_to_port(cxled);
> +	struct device *dev = &cxled->cxld.dev;
> +	resource_size_t start, avail, skip;
> +	int rc;
> +
> +	down_write(&cxl_dpa_rwsem);

Some cleanup.h magic would help here by allowing early returns.
Needs the scoped lock though to ensure it's released before the
devm_add_action_or_reset() as I'd guess we will deadlock otherwise
if that fails.
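
A user-space sketch of that shape, assuming the GCC/Clang cleanup variable
attribute that cleanup.h builds on. Everything here (toy_lock, TOY_GUARD,
toy_dpa_alloc) is a stand-in invented for illustration, not the kernel
helpers: the lock is taken in an inner scope so every early return drops
it, and it is already released before the follow-up step runs.

```c
#include <assert.h>
#include <errno.h>

/* Toy lock: only tracks whether it is held, enough to show the control flow. */
struct toy_lock {
	int held;
};

static struct toy_lock *toy_lock_take(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
	return l;
}

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
static void toy_lock_cleanup(struct toy_lock **l)
{
	assert((*l)->held);
	(*l)->held = 0;
}

/* Take @l now and release it at the end of the enclosing scope. */
#define TOY_GUARD(l) \
	struct toy_lock *toy_guard_ \
		__attribute__((cleanup(toy_lock_cleanup), unused)) = toy_lock_take(l)

struct toy_decoder {
	int in_region;
	int enabled;
	unsigned long long size;
};

static struct toy_lock toy_dpa_lock;

static int toy_dpa_alloc(struct toy_decoder *d, unsigned long long size,
			 unsigned long long avail)
{
	{
		TOY_GUARD(&toy_dpa_lock);

		if (d->in_region || d->enabled)
			return -EBUSY;	/* guard releases the lock on the way out */
		if (size > avail)
			return -ENOSPC;
		d->size = size;
	}	/* lock released here, before any follow-up work */

	/* Stand-in for a step that must not run with the lock held. */
	assert(!toy_dpa_lock.held);
	return 0;
}
```

The inner braces play the role of the scoped lock: no "goto out" label is
needed, and the unlocked follow-up step stays outside the guarded block.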

> +	if (cxled->cxld.region) {
> +		dev_dbg(dev, "EBUSY, decoder attached to %s\n",
> +			     dev_name(&cxled->cxld.region->dev));
> +		rc = -EBUSY;
>  		goto out;
>  	}
>  
> +	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
> +		dev_dbg(dev, "EBUSY, decoder enabled\n");
> +		rc = -EBUSY;
> +		goto out;
> +	}
> +
> +	avail = cxl_dpa_freespace(cxled, &start, &skip);
> +
>  	if (size > avail) {
>  		dev_dbg(dev, "%pa exceeds available %s capacity: %pa\n", &size,
> -			cxl_decoder_mode_name(cxled->mode), &avail);
> +			     cxled->mode == CXL_DECODER_RAM ? "ram" : "pmem",
> +			     &avail);
>  		rc = -ENOSPC;
>  		goto out;
>  	}
> @@ -550,6 +570,99 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>  	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
>  }
>  
> +static int find_free_decoder(struct device *dev, void *data)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_port *port;
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	port = cxled_to_port(cxled);
> +
> +	if (cxled->cxld.id != port->hdm_end + 1) {
> +		return 0;

No brackets

> +	}
> +	return 1;
> +}
> +
> +/**
> + * cxl_request_dpa - search and reserve DPA given input constraints
> + * @endpoint: an endpoint port with available decoders
> + * @mode: DPA operation mode (ram vs pmem)
> + * @min: the minimum amount of capacity the call needs
> + * @max: extra capacity to allocate after min is satisfied
> + *
> + * Given that a region needs to allocate from limited HPA capacity it
> + * may be the case that a device has more mappable DPA capacity than
> + * available HPA. So, the expectation is that @min is a driver known
> + * value for how much capacity is needed, and @max is based the limit of
> + * how much HPA space is available for a new region.
We are going to need a policy control on the max value.
Otherwise, if you have two devices that support huge capacity and
not enough space, who gets it will just be a race.

Not a problem for now though!
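
For reference, the sizing rule can be sketched in user space as below. This
assumes, as in the body of this patch, that both values must be 256M
aligned and that @max acts as an upper bound with 0 meaning "no cap".
toy_pick_dpa_size() is a made-up name, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_SZ_256M (256ULL << 20)

/*
 * Decide how many bytes to reserve given @avail free bytes.
 * Returns 0 and stores the size in @out, or a negative errno.
 */
static int toy_pick_dpa_size(uint64_t avail, uint64_t min, uint64_t max,
			     uint64_t *out)
{
	uint64_t alloc = avail;

	assert(out);

	/* OR-ing the two values lets a single mask test check both alignments. */
	if ((min | max) & (TOY_SZ_256M - 1))
		return -EINVAL;

	/* A non-zero @max caps the allocation; zero means take all that is free. */
	if (max && max < alloc)
		alloc = max;

	/* After capping, the driver's minimum must still be satisfiable. */
	if (alloc < min)
		return -ENOMEM;

	*out = alloc;
	return 0;
}
```

With two devices calling this against the same pool, whichever runs first
gets up to its @max, which is the race a policy control would have to
address.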

> + *
> + * Returns a pinned cxl_decoder with at least @min bytes of capacity
> + * reserved, or an error pointer. The caller is also expected to own the
> + * lifetime of the memdev registration associated with the endpoint to
> + * pin the decoder registered as well.
> + */





* Re: [PATCH v2 15/15] efx: support pio mapping based on cxl
  2024-07-15 17:28 ` [PATCH v2 15/15] efx: support pio mapping based on cxl alejandro.lucero-palau
@ 2024-08-04 18:13   ` Jonathan Cameron
  2024-08-19 16:28     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 18:13 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:35 +0100
alejandro.lucero-palau@amd.com wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> With a device supporting CXL and successfully initialised, use the cxl
> region to map the memory range and use this mapping for PIO buffers.

This explains why you weren't worried about any step of the CXL
code failing and why that wasn't a 'bug' as such.

I'd argue that you should still have the cxl initialization return
an error code and clean up any state if it hits an error.

Then the top level driver can of course elect to use an alternative
path given that failure.  Logically it belongs there rather than relying
on a buffer being mapped or not.
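
As a sketch of that split, in plain user-space C with invented names
(toy_nic, toy_cxl_init, toy_probe; none of these are sfc or CXL core
symbols): the init helper reports success or failure and leaves no partial
state behind, and the caller records the outcome in one explicit flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_nic {
	bool cxl_pio_in_use;	/* explicit "CXL path is active" flag */
	char *cxl_pio;		/* CXL-backed mapping, NULL if unavailable */
	char *bar_pio;		/* fallback mapping inside a PCIe BAR */
	char *pio_base;		/* mapping the datapath actually uses */
};

/* Hypothetical CXL setup: 0 on success, negative errno on failure. */
static int toy_cxl_init(struct toy_nic *nic, char *mapping)
{
	if (!mapping) {
		nic->cxl_pio = NULL;	/* leave no half-initialised state */
		return -ENODEV;
	}
	nic->cxl_pio = mapping;
	return 0;
}

/* The top-level driver decides what a CXL failure means. */
static void toy_probe(struct toy_nic *nic, char *cxl_mapping, char *bar_mapping)
{
	assert(bar_mapping);
	nic->bar_pio = bar_mapping;

	nic->cxl_pio_in_use = (toy_cxl_init(nic, cxl_mapping) == 0);
	nic->pio_base = nic->cxl_pio_in_use ? nic->cxl_pio : nic->bar_pio;
}
```

Later code can then test cxl_pio_in_use rather than inferring the state
from whether a buffer pointer happens to be set.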

> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/net/ethernet/sfc/ef10.c      | 25 +++++++++++++++++++++----
>  drivers/net/ethernet/sfc/efx_cxl.c   | 12 +++++++++++-
>  drivers/net/ethernet/sfc/mcdi_pcol.h |  3 +++
>  drivers/net/ethernet/sfc/nic.h       |  1 +
>  4 files changed, 36 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
> index 8fa6c0e9195b..3924076d2628 100644
> --- a/drivers/net/ethernet/sfc/ef10.c
> +++ b/drivers/net/ethernet/sfc/ef10.c
> @@ -24,6 +24,7 @@
>  #include <linux/wait.h>
>  #include <linux/workqueue.h>
>  #include <net/udp_tunnel.h>
> +#include "efx_cxl.h"
>  
>  /* Hardware control for EF10 architecture including 'Huntington'. */
>  
> @@ -177,6 +178,12 @@ static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
>  			  efx->num_mac_stats);
>  	}
>  
> +	if (outlen < MC_CMD_GET_CAPABILITIES_V7_OUT_LEN)
> +		nic_data->datapath_caps3 = 0;
> +	else
> +		nic_data->datapath_caps3 = MCDI_DWORD(outbuf,
> +						      GET_CAPABILITIES_V7_OUT_FLAGS3);
> +
>  	return 0;
>  }
>  
> @@ -1275,10 +1282,20 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
>  			return -ENOMEM;
>  		}
>  		nic_data->pio_write_vi_base = pio_write_vi_base;
> -		nic_data->pio_write_base =
> -			nic_data->wc_membase +
> -			(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
> -			 uc_mem_map_size);
> +
> +		if ((nic_data->datapath_caps3 &
> +		    (1 << MC_CMD_GET_CAPABILITIES_V10_OUT_CXL_CONFIG_ENABLE_LBN)) &&
> +		    efx->cxl->ctpio_cxl)
As per the comment at the top, I'd prefer to see some clean handling of an
error passed up to the caller of the cxl init that then sets a flag that
we can clearly see is all about whether we have CXL or not.

Using this buffer mapping is a bit too much of a detail in my opinion.

> +		{
> +			nic_data->pio_write_base =
> +				efx->cxl->ctpio_cxl +
> +				(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
> +				 uc_mem_map_size);
> +		} else {
> +			nic_data->pio_write_base =nic_data->wc_membase +
> +				(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
> +				 uc_mem_map_size);
> +		}




* Re: [PATCH v2 12/15] cxl: allow region creation by type2 drivers
  2024-07-15 17:28 ` [PATCH v2 12/15] cxl: allow region creation by type2 drivers alejandro.lucero-palau
@ 2024-08-04 18:29   ` Jonathan Cameron
  2024-08-19 16:11     ` Alejandro Lucero Palau
  2024-08-22 13:12   ` Zhi Wang
  1 sibling, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-04 18:29 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, 15 Jul 2024 18:28:32 +0100
alejandro.lucero-palau@amd.com wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Creating a CXL region requires userspace intervention through the cxl
> sysfs files. Type2 support should allow accelerator drivers to create
> such cxl region from kernel code.
> 
> Adding that functionality and integrating it with current support for
> memory expanders.
> 
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m84598b534cc5664f5bb31521ba6e41c7bc213758
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Needs a Co-developed-by or similar given Dan didn't email this patch
(which this sign-off list suggests he did).

I'll take another look at the locking, but my main comment is
that it is really confusing so I have no idea if it's right.
Consider different ways of breaking up the code you need
to try and keep the locking obvious.

Jonathan

> +
> +static ssize_t interleave_ways_store(struct device *dev,
> +				     struct device_attribute *attr,
> +				     const char *buf, size_t len)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	unsigned int val;
> +	int rc;
> +
> +	rc = kstrtouint(buf, 0, &val);
> +	if (rc)
> +		return rc;
> +
> +	rc = down_write_killable(&cxl_region_rwsem);
> +	if (rc)
> +		return rc;
> +
> +	rc = set_interleave_ways(cxlr, val);
>  	up_write(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
>  	return len;
>  }
> +
The missing blank line was probably intentional. It is common to group a
macro like this with the function it uses by not having a blank line.
>  static DEVICE_ATTR_RW(interleave_ways);
>  
>  static ssize_t interleave_granularity_show(struct device *dev,
> @@ -547,21 +556,14 @@ static ssize_t interleave_granularity_show(struct device *dev,
>  	return rc;
>  }

> +static ssize_t interleave_granularity_store(struct device *dev,
> +					    struct device_attribute *attr,
> +					    const char *buf, size_t len)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	int rc, val;
> +
> +	rc = kstrtoint(buf, 0, &val);
> +	if (rc)
> +		return rc;
> +
>  	rc = down_write_killable(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
> -	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
> -		rc = -EBUSY;
> -		goto out;
> -	}
>  
> -	p->interleave_granularity = val;
> -out:
> +	rc = set_interleave_granularity(cxlr, val);
>  	up_write(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
>  	return len;
>  }
> +

grump.

>  static DEVICE_ATTR_RW(interleave_granularity);

> +/* Establish an empty region covering the given HPA range */
> +static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> +					   struct cxl_endpoint_decoder *cxled)
> +{
> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +	struct range *hpa = &cxled->cxld.hpa_range;
> +	struct cxl_region_params *p;
> +	struct cxl_region *cxlr;
> +	struct resource *res;
> +	int rc;
> +
> +	cxlr = construct_region_begin(cxlrd, cxled);
> +	if (IS_ERR(cxlr))
> +		return cxlr;
>  
>  	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>  
>  	res = kmalloc(sizeof(*res), GFP_KERNEL);
>  	if (!res) {
>  		rc = -ENOMEM;
> -		goto err;
> +		goto out;
>  	}
>  
>  	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
>  				    dev_name(&cxlr->dev));
> +
>  	rc = insert_resource(cxlrd->res, res);
>  	if (rc) {
>  		/*
> @@ -3412,6 +3462,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  			 __func__, dev_name(&cxlr->dev));
>  	}
>  
> +	p = &cxlr->params;
>  	p->res = res;
>  	p->interleave_ways = cxled->cxld.interleave_ways;
>  	p->interleave_granularity = cxled->cxld.interleave_granularity;
> @@ -3419,24 +3470,124 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  
>  	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
>  	if (rc)
> -		goto err;
> +		goto out;
>  
>  	dev_dbg(cxlmd->dev.parent, "%s:%s: %s %s res: %pr iw: %d ig: %d\n",
> -		dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), __func__,
> -		dev_name(&cxlr->dev), p->res, p->interleave_ways,
> -		p->interleave_granularity);
> +				   dev_name(&cxlmd->dev),
> +				   dev_name(&cxled->cxld.dev), __func__,
> +				   dev_name(&cxlr->dev), p->res,
> +				   p->interleave_ways,
> +				   p->interleave_granularity);
>  
>  	/* ...to match put_device() in cxl_add_to_region() */
>  	get_device(&cxlr->dev);
>  	up_write(&cxl_region_rwsem);
> +out:
> +	construct_region_end();

two calls to up_write(&cxl_region_rwsem) next to each other?

> +	if (rc) {
> +		drop_region(cxlr);
> +		return ERR_PTR(rc);
> +	}
> +	return cxlr;
> +}
> +
> +static struct cxl_region *
> +__construct_new_region(struct cxl_root_decoder *cxlrd,
> +		       struct cxl_endpoint_decoder **cxled, int ways)
> +{
> +	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> +	struct cxl_region_params *p;
> +	resource_size_t size = 0;
> +	struct cxl_region *cxlr;
> +	int rc, i;
> +
> +	/* If interleaving is not supported, why does ways need to be at least 1? */

I think 1 means no interleave. It's simpler to do this than have 0 and 1 both 
mean no interleave because 1 works for programmable decoders.
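
A toy model of why ways == 1 is the natural encoding, assuming plain
modulo interleave (the real CXL decode, including 3/6/12-way, is more
involved). toy_interleave_position() is an invented helper, not a kernel
function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Which target (0 .. ways - 1) serves the byte at @hpa_offset, for simple
 * round-robin interleave in chunks of @granularity bytes.
 */
static unsigned int toy_interleave_position(uint64_t hpa_offset,
					    unsigned int ways,
					    unsigned int granularity)
{
	assert(ways >= 1 && granularity > 0);

	/* With ways == 1 the modulo is always 0: every chunk hits the one target. */
	return (unsigned int)((hpa_offset / granularity) % ways);
}
```

The same arithmetic covers the non-interleaved case without a special
value, whereas ways == 0 would need its own branch to avoid a division by
zero.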

> +	if (ways < 1)
> +		return ERR_PTR(-EINVAL);
> +
> +	cxlr = construct_region_begin(cxlrd, cxled[0]);

Rethink how this is broken up.  Taking the cxl_dpa_rwsem
inside this function is really hard to follow.  Ideally
manage it with scoped_guard().


> +	if (IS_ERR(cxlr))
> +		return cxlr;
> +
> +	rc = set_interleave_ways(cxlr, ways);
> +	if (rc)
> +		goto out;
> +
> +	rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
> +	if (rc)
here I think cxl_dpa_rwsem is held.
> +		goto out;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	for (i = 0; i < ways; i++) {
> +		if (!cxled[i]->dpa_res)
> +			break;
> +		size += resource_size(cxled[i]->dpa_res);
> +	}
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (i < ways)

but not here and they go to the same place.

> +		goto out;
> +
> +	rc = alloc_hpa(cxlr, size);
> +	if (rc)
> +		goto out;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	for (i = 0; i < ways; i++) {
> +		rc = cxl_region_attach(cxlr, cxled[i], i);
> +		if (rc)
> +			break;
> +	}
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (rc)
> +		goto out;
> +
> +	rc = cxl_region_decode_commit(cxlr);
> +	if (rc)
> +		goto out;
>  
> +	p = &cxlr->params;
> +	p->state = CXL_CONFIG_COMMIT;
> +out:
> +	construct_region_end();
> +	if (rc) {
> +		drop_region(cxlr);
> +		return ERR_PTR(rc);
> +	}
>  	return cxlr;
> +}

> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index a0e0795ec064..377bb3cd2d47 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -881,5 +881,7 @@ struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>  					       int interleave_ways,
>  					       unsigned long flags,
>  					       resource_size_t *max);
> -
Avoid whitespace noise.

> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
> +				     struct cxl_endpoint_decoder **cxled,
> +				     int ways);
>  #endif /* __CXL_MEM_H__ */


* Re: [PATCH v2 10/15] cxl: define a driver interface for DPA allocation
  2024-07-15 17:28 ` [PATCH v2 10/15] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
  2024-07-16  3:32   ` kernel test robot
  2024-08-04 18:07   ` Jonathan Cameron
@ 2024-08-06 17:33   ` Fan Ni
  2024-08-19 15:57     ` Alejandro Lucero Palau
  2 siblings, 1 reply; 114+ messages in thread
From: Fan Ni @ 2024-08-06 17:33 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero

On Mon, Jul 15, 2024 at 06:28:30PM +0100, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Region creation involves finding available DPA (device-physical-address)
> capacity to map into HPA (host-physical-address) space. Given the HPA
> capacity constraint, define an API, cxl_request_dpa(), that has the
> flexibility to  map the minimum amount of memory the driver needs to
> operate vs the total possible that can be mapped given HPA availability.
> 
> Factor out the core of cxl_dpa_alloc, that does free space scanning,
> into a cxl_dpa_freespace() helper, and use that to balance the capacity
> available to map vs the @min and @max arguments to cxl_request_dpa.
> 
> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m4271ee49a91615c8af54e3ab20679f8be3099393
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/core.h            |   1 +
>  drivers/cxl/core/hdm.c             | 153 +++++++++++++++++++++++++----
>  drivers/net/ethernet/sfc/efx.c     |   2 +
>  drivers/net/ethernet/sfc/efx_cxl.c |  18 +++-
>  drivers/net/ethernet/sfc/efx_cxl.h |   1 +
>  include/linux/cxl_accel_mem.h      |   7 ++
>  6 files changed, 161 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 625394486459..a243ff12c0f4 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -76,6 +76,7 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
>  		     enum cxl_decoder_mode mode);
>  int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size);
>  int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
> +int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);

Function declared twice here.

Fan
>  resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
>  resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled);
>  
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 4af9225d4b59..3e53ae222d40 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -3,6 +3,7 @@
>  #include <linux/seq_file.h>
>  #include <linux/device.h>
>  #include <linux/delay.h>
> +#include <linux/cxl_accel_mem.h>
>  
>  #include "cxlmem.h"
>  #include "core.h"
> @@ -420,6 +421,7 @@ int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
>  	up_write(&cxl_dpa_rwsem);
>  	return rc;
>  }
> +EXPORT_SYMBOL_NS_GPL(cxl_dpa_free, CXL);
>  
>  int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
>  		     enum cxl_decoder_mode mode)
> @@ -467,30 +469,17 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
>  	return rc;
>  }
>  
> -int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
> +static resource_size_t cxl_dpa_freespace(struct cxl_endpoint_decoder *cxled,
> +					 resource_size_t *start_out,
> +					 resource_size_t *skip_out)
>  {
>  	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>  	resource_size_t free_ram_start, free_pmem_start;
> -	struct cxl_port *port = cxled_to_port(cxled);
>  	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> -	struct device *dev = &cxled->cxld.dev;
>  	resource_size_t start, avail, skip;
>  	struct resource *p, *last;
> -	int rc;
> -
> -	down_write(&cxl_dpa_rwsem);
> -	if (cxled->cxld.region) {
> -		dev_dbg(dev, "decoder attached to %s\n",
> -			dev_name(&cxled->cxld.region->dev));
> -		rc = -EBUSY;
> -		goto out;
> -	}
>  
> -	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
> -		dev_dbg(dev, "decoder enabled\n");
> -		rc = -EBUSY;
> -		goto out;
> -	}
> +	lockdep_assert_held(&cxl_dpa_rwsem);
>  
>  	for (p = cxlds->ram_res.child, last = NULL; p; p = p->sibling)
>  		last = p;
> @@ -528,14 +517,45 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>  			skip_end = start - 1;
>  		skip = skip_end - skip_start + 1;
>  	} else {
> -		dev_dbg(dev, "mode not set\n");
> -		rc = -EINVAL;
> +		avail = 0;
> +	}
> +
> +	if (!avail)
> +		return 0;
> +	if (start_out)
> +		*start_out = start;
> +	if (skip_out)
> +		*skip_out = skip;
> +	return avail;
> +}
> +
> +int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
> +{
> +	struct cxl_port *port = cxled_to_port(cxled);
> +	struct device *dev = &cxled->cxld.dev;
> +	resource_size_t start, avail, skip;
> +	int rc;
> +
> +	down_write(&cxl_dpa_rwsem);
> +	if (cxled->cxld.region) {
> +		dev_dbg(dev, "EBUSY, decoder attached to %s\n",
> +			     dev_name(&cxled->cxld.region->dev));
> +		rc = -EBUSY;
>  		goto out;
>  	}
>  
> +	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
> +		dev_dbg(dev, "EBUSY, decoder enabled\n");
> +		rc = -EBUSY;
> +		goto out;
> +	}
> +
> +	avail = cxl_dpa_freespace(cxled, &start, &skip);
> +
>  	if (size > avail) {
>  		dev_dbg(dev, "%pa exceeds available %s capacity: %pa\n", &size,
> -			cxl_decoder_mode_name(cxled->mode), &avail);
> +			     cxled->mode == CXL_DECODER_RAM ? "ram" : "pmem",
> +			     &avail);
>  		rc = -ENOSPC;
>  		goto out;
>  	}
> @@ -550,6 +570,99 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>  	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
>  }
>  
> +static int find_free_decoder(struct device *dev, void *data)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_port *port;
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	port = cxled_to_port(cxled);
> +
> +	if (cxled->cxld.id != port->hdm_end + 1) {
> +		return 0;
> +	}
> +	return 1;
> +}
> +
> +/**
> + * cxl_request_dpa - search and reserve DPA given input constraints
> + * @endpoint: an endpoint port with available decoders
> + * @mode: DPA operation mode (ram vs pmem)
> + * @min: the minimum amount of capacity the call needs
> + * @max: extra capacity to allocate after min is satisfied
> + *
> + * Given that a region needs to allocate from limited HPA capacity it
> + * may be the case that a device has more mappable DPA capacity than
> + * available HPA. So, the expectation is that @min is a driver known
> + * value for how much capacity is needed, and @max is based the limit of
> + * how much HPA space is available for a new region.
> + *
> + * Returns a pinned cxl_decoder with at least @min bytes of capacity
> + * reserved, or an error pointer. The caller is also expected to own the
> + * lifetime of the memdev registration associated with the endpoint to
> + * pin the decoder registered as well.
> + */
> +struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
> +					     bool is_ram,
> +					     resource_size_t min,
> +					     resource_size_t max)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	enum cxl_decoder_mode mode;
> +	struct device *cxled_dev;
> +	resource_size_t alloc;
> +	int rc;
> +
> +	if (!IS_ALIGNED(min | max, SZ_256M))
> +		return ERR_PTR(-EINVAL);
> +
> +	down_read(&cxl_dpa_rwsem);
> +
> +	cxled_dev = device_find_child(&endpoint->dev, NULL, find_free_decoder);
> +	if (!cxled_dev)
> +		cxled = ERR_PTR(-ENXIO);
> +	else
> +		cxled = to_cxl_endpoint_decoder(cxled_dev);
> +
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (IS_ERR(cxled))
> +		return cxled;
> +
> +	if (is_ram)
> +		mode = CXL_DECODER_RAM;
> +	else
> +		mode = CXL_DECODER_PMEM;
> +
> +	rc = cxl_dpa_set_mode(cxled, mode);
> +	if (rc)
> +		goto err;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	alloc = cxl_dpa_freespace(cxled, NULL, NULL);
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (max)
> +		alloc = min(max, alloc);
> +	if (alloc < min) {
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +
> +	rc = cxl_dpa_alloc(cxled, alloc);
> +	if (rc)
> +		goto err;
> +
> +	return cxled;
> +err:
> +	put_device(cxled_dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_request_dpa, CXL);
> +
>  static void cxld_set_interleave(struct cxl_decoder *cxld, u32 *ctrl)
>  {
>  	u16 eig;
> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
> index cb3f74d30852..9cfe29002d98 100644
> --- a/drivers/net/ethernet/sfc/efx.c
> +++ b/drivers/net/ethernet/sfc/efx.c
> @@ -901,6 +901,8 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>  
>  	efx_fini_io(efx);
>  
> +	efx_cxl_exit(efx);
> +
>  	pci_dbg(efx->pci_dev, "shutdown successful\n");
>  
>  	efx_fini_devlink_and_unlock(efx);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 6d49571ccff7..b5626d724b52 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -84,12 +84,28 @@ void efx_cxl_init(struct efx_nic *efx)
>  		goto out;
>  	}
>  
> -	if (max < EFX_CTPIO_BUFFER_SIZE)
> +	if (max < EFX_CTPIO_BUFFER_SIZE) {
>  		pci_info(pci_dev, "CXL accel not enough free HPA space %llu < %u\n",
>  				  max, EFX_CTPIO_BUFFER_SIZE);
> +		goto out;
> +	}
> +
> +	cxl->cxled = cxl_request_dpa(cxl->endpoint, true, EFX_CTPIO_BUFFER_SIZE,
> +				     EFX_CTPIO_BUFFER_SIZE);
> +	if (IS_ERR(cxl->cxled))
> +		pci_info(pci_dev, "CXL accel request DPA failed");
>  out:
>  	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>  }
>  
> +void efx_cxl_exit(struct efx_nic *efx)
> +{
> +	struct efx_cxl *cxl = efx->cxl;
> +
> +	if (cxl->cxled)
> +		cxl_dpa_free(cxl->cxled);
> + 
> + 	return;
> + }
>  
>  MODULE_IMPORT_NS(CXL);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
> index 76c6794c20d8..59d5217a684c 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.h
> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
> @@ -26,4 +26,5 @@ struct efx_cxl {
>  };
>  
>  void efx_cxl_init(struct efx_nic *efx);
> +void efx_cxl_exit(struct efx_nic *efx);
>  #endif
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index f3e77688ffe0..d4ecb5bb4fc8 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -2,6 +2,7 @@
>  /* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>  
>  #include <linux/cdev.h>
> +#include <linux/pci.h>
>  
>  #ifndef __CXL_ACCEL_MEM_H
>  #define __CXL_ACCEL_MEM_H
> @@ -41,4 +42,10 @@ struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>  					       int interleave_ways,
>  					       unsigned long flags,
>  					       resource_size_t *max);
> +
> +struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
> +					     bool is_ram,
> +					     resource_size_t min,
> +					     resource_size_t max);
> +int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
>  #endif
> -- 
> 2.17.1
> 


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-04 16:44       ` Jonathan Cameron
@ 2024-08-09  7:26         ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-09  7:26 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 8/4/24 17:44, Jonathan Cameron wrote:
>>>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>>>> new file mode 100644
>>>> index 000000000000..daf46d41f59c
>>>> --- /dev/null
>>>> +++ b/include/linux/cxl_accel_mem.h
>>>> @@ -0,0 +1,22 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>>> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>>>> +
>>>> +#include <linux/cdev.h>
>>> Don't think this header is needed?
>>>   
>>>> +
>>>> +#ifndef __CXL_ACCEL_MEM_H
>>>> +#define __CXL_ACCEL_MEM_H
>>>> +
>>>> +enum accel_resource{
>>>> +	CXL_ACCEL_RES_DPA,
>>>> +	CXL_ACCEL_RES_RAM,
>>>> +	CXL_ACCEL_RES_PMEM,
>>>> +};
>>>> +
>>>> +typedef struct cxl_dev_state cxl_accel_state;
>>> Please use 'struct cxl_dev_state' directly. There's no good reason to hide the type.
>>
>> That is what I think I was told to do although not explicitly. There
>> were concerns in the RFC about accel drivers too loose for doing things
>> regarding CXL and somehow CXL core should keep control as much as
>> possible.  I was even thought I was being asked to implement auxbus with
>> the CXL part of an accel as an auxiliar device which should be bound to
>> a CXL core driver. Then Jonathan Cameron the only one explicitly giving
>> the possibility of the opaque approach and disadvising the auxbus idea.
> I wasn't thinking a typedef to hide it.
> More making all state accesses that are needed through accessor functions so
> that from the 'internals' become opaque to the accelerator code and
> we can radically change how things are structured internally with
> no impact to the (hopefully large number of) CXL accelerator drivers.
>
> So here, I'd just expect a
> struct cxl_device_state; forwards declaration.
>
> Or potentially one to a a different structure after refactors etc.


OK, that makes sense. I thought the concern was about external driver
modules using the internal cxl structs.

This is the main point of this second version of the patchset, so if no
one objects in the next few days, I will take it as the right way
forward and send a v3 soon.
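
To make the accessor idea concrete, here is a minimal userspace sketch,
not the kernel code. The public side only forward-declares
struct cxl_dev_state and exposes functions; the layout is known to the
core alone. The two-field struct, the argument-less
cxl_accel_state_create() and the cxl_accel_get_serial() getter are
simplifications or hypothetical names assumed for illustration.

```c
/* Userspace sketch only: names mirror the patch, bodies are stand-ins. */
#include <stdint.h>
#include <stdlib.h>

/* --- what an accelerator driver would see (public header) --- */
struct cxl_dev_state;	/* forward declaration: layout stays private */

struct cxl_dev_state *cxl_accel_state_create(void);	/* simplified: no struct device */
void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, uint16_t dvsec);
void cxl_accel_set_serial(struct cxl_dev_state *cxlds, uint64_t serial);
uint64_t cxl_accel_get_serial(const struct cxl_dev_state *cxlds);	/* hypothetical getter */

/* --- what only the CXL core would see --- */
struct cxl_dev_state {
	uint16_t cxl_dvsec;
	uint64_t serial;
};

struct cxl_dev_state *cxl_accel_state_create(void)
{
	/* calloc() stands in for devm_kzalloc(); the caller frees it here */
	return calloc(1, sizeof(struct cxl_dev_state));
}

void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, uint16_t dvsec)
{
	cxlds->cxl_dvsec = dvsec;
}

void cxl_accel_set_serial(struct cxl_dev_state *cxlds, uint64_t serial)
{
	cxlds->serial = serial;
}

uint64_t cxl_accel_get_serial(const struct cxl_dev_state *cxlds)
{
	return cxlds->serial;
}
```

With this split, the fields of struct cxl_dev_state could be moved or
renamed without touching accelerator drivers, which is the point raised
in the review.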

Thank you


>>
>> Maybe I need an explicit action here.
> J


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
                     ` (3 preceding siblings ...)
  2024-08-04 17:10   ` Jonathan Cameron
@ 2024-08-09  8:34   ` Zhi Wang
  2024-08-12 11:34     ` Alejandro Lucero Palau
  4 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-09  8:34 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero,
	targupta

On Mon, 15 Jul 2024 18:28:21 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Differientiate Type3, aka memory expanders, from Type2, aka device
> accelerators, with a new function for initializing cxl_dev_state.
> 
> Create opaque struct to be used by accelerators relying on new access
> functions in following patches.
> 
> Add SFC ethernet network driver as the client.
> 
> Based on
> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c             | 52 ++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/Makefile     |  2 +-
>  drivers/net/ethernet/sfc/efx.c        |  4 ++
>  drivers/net/ethernet/sfc/efx_cxl.c    | 53 +++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++++++++
>  drivers/net/ethernet/sfc/net_driver.h |  4 ++
>  include/linux/cxl_accel_mem.h         | 22 +++++++++++
>  include/linux/cxl_accel_pci.h         | 23 ++++++++++++
>  8 files changed, 188 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>  create mode 100644 include/linux/cxl_accel_mem.h
>  create mode 100644 include/linux/cxl_accel_pci.h
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 0277726afd04..61b5d35b49e7 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -8,6 +8,7 @@
>  #include <linux/idr.h>
>  #include <linux/pci.h>
>  #include <cxlmem.h>
> +#include <linux/cxl_accel_mem.h>

Let's keep the header includes in alphabetical order. The same applies
to efx_cxl.c.

>  #include "trace.h"
>  #include "core.h"
>  
> @@ -615,6 +616,25 @@ static void detach_memdev(struct work_struct
> *work) 
>  static struct lock_class_key cxl_memdev_key;
>  
> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
> +{
> +	struct cxl_dev_state *cxlds;
> +
> +	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
> +	if (!cxlds)
> +		return ERR_PTR(-ENOMEM);
> +
> +	cxlds->dev = dev;
> +	cxlds->type = CXL_DEVTYPE_DEVMEM;
> +
> +	cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
> +	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
> +	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
> +
> +	return cxlds;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
> +
>  static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state
> *cxlds, const struct file_operations *fops)
>  {
> @@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode,
> struct file *file) return 0;
>  }
> 
> +
> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
> +{
> +	cxlds->cxl_dvsec = dvsec;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_dvsec, CXL);
> +
> +void cxl_accel_set_serial(struct cxl_dev_state *cxlds, u64 serial)
> +{
> +	cxlds->serial= serial;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_serial, CXL);
> +

It would be nice to explain in the patch comments how the CXL core uses
these, since this state has just been promoted into the core.

> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> +			    enum accel_resource type)
> +{
> +	switch (type) {
> +	case CXL_ACCEL_RES_DPA:
> +		cxlds->dpa_res = res;
> +		return;
> +	case CXL_ACCEL_RES_RAM:
> +		cxlds->ram_res = res;
> +		return;
> +	case CXL_ACCEL_RES_PMEM:
> +		cxlds->pmem_res = res;
> +		return;
> +	default:
> +		dev_err(cxlds->dev, "unkown resource type (%u)\n", type);
> +	}
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
> +

I wonder in which situations this error can be triggered. One is a
newer out-of-tree type-2 driver trying to work on an older kernel. Any
other situation would be a coding problem in an in-tree driver.

I would prefer WARN_ONCE() here.
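
As a rough illustration of that suggestion, here is a userspace sketch
of the switch with a WARN_ONCE()-style default branch. The WARN_ONCE()
macro below is a simplified stand-in for the kernel one (it prints to
stderr at most once per call site), and struct resource and
struct cxl_dev_state are reduced to what the example needs; none of
this is the actual kernel definition.

```c
#include <stdio.h>

/* Simplified userspace stand-in for the kernel's WARN_ONCE(). */
#define WARN_ONCE(cond, ...)				\
	do {						\
		static int warned;			\
		if ((cond) && !warned) {		\
			warned = 1;			\
			fprintf(stderr, __VA_ARGS__);	\
		}					\
	} while (0)

enum accel_resource {
	CXL_ACCEL_RES_DPA,
	CXL_ACCEL_RES_RAM,
	CXL_ACCEL_RES_PMEM,
};

/* Reduced stand-ins for the kernel structures. */
struct resource {
	unsigned long long start;
	unsigned long long end;
};

struct cxl_dev_state {
	struct resource dpa_res;
	struct resource ram_res;
	struct resource pmem_res;
};

void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
			    enum accel_resource type)
{
	switch (type) {
	case CXL_ACCEL_RES_DPA:
		cxlds->dpa_res = res;
		return;
	case CXL_ACCEL_RES_RAM:
		cxlds->ram_res = res;
		return;
	case CXL_ACCEL_RES_PMEM:
		cxlds->pmem_res = res;
		return;
	default:
		/* Only reachable through a driver bug: warn loudly, but once. */
		WARN_ONCE(1, "unknown resource type (%d)\n", (int)type);
	}
}
```

An unknown type leaves the device state untouched and is reported a
single time, rather than logging an error on every call.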

>  static int cxl_memdev_release_file(struct inode *inode, struct file
> *file) {
>  	struct cxl_memdev *cxlmd =
> diff --git a/drivers/net/ethernet/sfc/Makefile
> b/drivers/net/ethernet/sfc/Makefile index 8f446b9bd5ee..e80c713c3b0c
> 100644 --- a/drivers/net/ethernet/sfc/Makefile
> +++ b/drivers/net/ethernet/sfc/Makefile
> @@ -7,7 +7,7 @@ sfc-y			+= efx.o efx_common.o
> efx_channels.o nic.o \ mcdi_functions.o mcdi_filters.o mcdi_mon.o \
>  			   ef100.o ef100_nic.o ef100_netdev.o \
>  			   ef100_ethtool.o ef100_rx.o ef100_tx.o \
> -			   efx_devlink.o
> +			   efx_devlink.o efx_cxl.o
>  sfc-$(CONFIG_SFC_MTD)	+= mtd.o
>  sfc-$(CONFIG_SFC_SRIOV)	+= sriov.o ef10_sriov.o ef100_sriov.o
> ef100_rep.o \ mae.o tc.o tc_bindings.o tc_counters.o \
> diff --git a/drivers/net/ethernet/sfc/efx.c
> b/drivers/net/ethernet/sfc/efx.c index e9d9de8e648a..cb3f74d30852
> 100644 --- a/drivers/net/ethernet/sfc/efx.c
> +++ b/drivers/net/ethernet/sfc/efx.c
> @@ -33,6 +33,7 @@
>  #include "selftest.h"
>  #include "sriov.h"
>  #include "efx_devlink.h"
> +#include "efx_cxl.h"
>  
>  #include "mcdi_port_common.h"
>  #include "mcdi_pcol.h"
> @@ -899,6 +900,7 @@ static void efx_pci_remove(struct pci_dev
> *pci_dev) efx_pci_remove_main(efx);
>  
>  	efx_fini_io(efx);
> +
>  	pci_dbg(efx->pci_dev, "shutdown successful\n");
>  
>  	efx_fini_devlink_and_unlock(efx);
> @@ -1109,6 +1111,8 @@ static int efx_pci_probe(struct pci_dev
> *pci_dev, if (rc)
>  		goto fail2;
>  
> +	efx_cxl_init(efx);
> +
>  	rc = efx_pci_probe_post_io(efx);
>  	if (rc) {
>  		/* On failure, retry once immediately.
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c
> b/drivers/net/ethernet/sfc/efx_cxl.c new file mode 100644
> index 000000000000..4554dd7cca76
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -0,0 +1,53 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/****************************************************************************
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> modify it
> + * under the terms of the GNU General Public License version 2 as
> published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +
> +#include <linux/pci.h>
> +#include <linux/cxl_accel_mem.h>
> +#include <linux/cxl_accel_pci.h>
> +

Let's keep them in alphabetical order. :)

> +#include "net_driver.h"
> +#include "efx_cxl.h"
> +
> +#define EFX_CTPIO_BUFFER_SIZE	(1024*1024*256)
> +
> +void efx_cxl_init(struct efx_nic *efx)
> +{
> +	struct pci_dev *pci_dev = efx->pci_dev;
> +	struct efx_cxl *cxl = efx->cxl;
> +	struct resource res;
> +	u16 dvsec;
> +
> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
> +					  CXL_DVSEC_PCIE_DEVICE);
> +
> +	if (!dvsec)
> +		return;
> +
> +	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");
> +
> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
> +	if (IS_ERR(cxl->cxlds)) {
> +		pci_info(pci_dev, "CXL accel device state failed");
> +		return;
> +	}
> +
> +	cxl_accel_set_dvsec(cxl->cxlds, dvsec);
> +	cxl_accel_set_serial(cxl->cxlds, pci_dev->dev.id);
> +
> +	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_DPA);
> +
> +	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
> +}
> +
> +
> +MODULE_IMPORT_NS(CXL);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h
> b/drivers/net/ethernet/sfc/efx_cxl.h new file mode 100644
> index 000000000000..76c6794c20d8
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
> @@ -0,0 +1,29 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/****************************************************************************
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> modify it
> + * under the terms of the GNU General Public License version 2 as
> published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +#ifndef EFX_CXL_H
> +#define EFX_CLX_H
> +
> +#include <linux/cxl_accel_mem.h>
> +
> +struct efx_nic;
> +
> +struct efx_cxl {
> +	cxl_accel_state *cxlds;
> +	struct cxl_memdev *cxlmd;
> +	struct cxl_root_decoder *cxlrd;
> +	struct cxl_port *endpoint;
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_region *efx_region;
> +	void __iomem *ctpio_cxl;
> +};
> +
> +void efx_cxl_init(struct efx_nic *efx);
> +#endif
> diff --git a/drivers/net/ethernet/sfc/net_driver.h
> b/drivers/net/ethernet/sfc/net_driver.h index
> f2dd7feb0e0c..58b7517afea4 100644 ---
> a/drivers/net/ethernet/sfc/net_driver.h +++
> b/drivers/net/ethernet/sfc/net_driver.h @@ -814,6 +814,8 @@ enum
> efx_xdp_tx_queues_mode { 
>  struct efx_mae;
>  
> +struct efx_cxl;
> +
>  /**
>   * struct efx_nic - an Efx NIC
>   * @name: Device name (net device name or bus id before net device
> registered) @@ -962,6 +964,7 @@ struct efx_mae;
>   * @tc: state for TC offload (EF100).
>   * @devlink: reference to devlink structure owned by this device
>   * @dl_port: devlink port associated with the PF
> + * @cxl: details of related cxl objects
>   * @mem_bar: The BAR that is mapped into membase.
>   * @reg_base: Offset from the start of the bar to the function
> control window.
>   * @monitor_work: Hardware monitor workitem
> @@ -1148,6 +1151,7 @@ struct efx_nic {
>  
>  	struct devlink *devlink;
>  	struct devlink_port *dl_port;
> +	struct efx_cxl *cxl;
>  	unsigned int mem_bar;
>  	u32 reg_base;
>  
> diff --git a/include/linux/cxl_accel_mem.h
> b/include/linux/cxl_accel_mem.h new file mode 100644
> index 000000000000..daf46d41f59c
> --- /dev/null
> +++ b/include/linux/cxl_accel_mem.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> +
> +#include <linux/cdev.h>
> +
> +#ifndef __CXL_ACCEL_MEM_H
> +#define __CXL_ACCEL_MEM_H
> +
> +enum accel_resource{
> +	CXL_ACCEL_RES_DPA,
> +	CXL_ACCEL_RES_RAM,
> +	CXL_ACCEL_RES_PMEM,
> +};
> +
> +typedef struct cxl_dev_state cxl_accel_state;

Using typedef in kernel code is very rare (many of those that remain
are there for historical reasons; note that there is only one typedef
in drivers/cxl). Be sure to double check the coding style bible [1]
before deciding to use one. :)

[1] https://www.kernel.org/doc/html/v4.14/process/coding-style.html

> +cxl_accel_state *cxl_accel_state_create(struct device *dev);
> +
> +void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
> +void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct
> resource res,
> +			    enum accel_resource);
> +#endif
> diff --git a/include/linux/cxl_accel_pci.h
> b/include/linux/cxl_accel_pci.h new file mode 100644
> index 000000000000..c337ae8797e6
> --- /dev/null
> +++ b/include/linux/cxl_accel_pci.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> +
> +#ifndef __CXL_ACCEL_PCI_H
> +#define __CXL_ACCEL_PCI_H
> +
> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
> +#define CXL_DVSEC_PCIE_DEVICE					0
> +#define   CXL_DVSEC_CAP_OFFSET		0xA
> +#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
> +#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
> +#define   CXL_DVSEC_CTRL_OFFSET		0xC
> +#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
> +#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
> +#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
> +#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
> +#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
> +#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
> +#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
> +#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
> +#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
> +
> +#endif



* Re: [PATCH v2 03/15] cxl: add function for type2 resource request
  2024-07-15 17:28 ` [PATCH v2 03/15] cxl: add function for type2 resource request alejandro.lucero-palau
  2024-07-18 23:36   ` Dave Jiang
@ 2024-08-09  9:01   ` Zhi Wang
  2024-08-22 13:07   ` Zhi Wang
  2 siblings, 0 replies; 114+ messages in thread
From: Zhi Wang @ 2024-08-09  9:01 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero,
	targupta

On Mon, 15 Jul 2024 18:28:23 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Create a new function for a type2 device requesting a resource
> passing the opaque struct to work with.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/memdev.c          | 13 +++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.c |  7 ++++++-
>  include/linux/cxl_accel_mem.h      |  1 +
>  3 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 61b5d35b49e7..04c3a0f8bc2e 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -744,6 +744,19 @@ void cxl_accel_set_resource(struct cxl_dev_state
> *cxlds, struct resource res, }
>  EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
>  
> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)
> +{
> +	int rc;
> +

PATCH 1 already introduced the resource type enumeration. Let's use it
here instead of a bool.
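
One possible shape for that, as a userspace sketch only: the function
takes enum accel_resource and rejects anything that is not a child
range. The request_resource() below is a toy stand-in that just records
the parent, the structures are reduced, and returning -EINVAL for
CXL_ACCEL_RES_DPA is an assumption of this sketch, not something the
patch defines.

```c
#include <errno.h>
#include <stddef.h>

enum accel_resource {
	CXL_ACCEL_RES_DPA,
	CXL_ACCEL_RES_RAM,
	CXL_ACCEL_RES_PMEM,
};

/* Reduced stand-ins for the kernel structures. */
struct resource {
	unsigned long long start;
	unsigned long long end;
	struct resource *parent;
};

struct cxl_dev_state {
	struct resource dpa_res;
	struct resource ram_res;
	struct resource pmem_res;
};

/* Toy stand-in for the kernel's request_resource(): records the parent only. */
static int request_resource(struct resource *root, struct resource *res)
{
	if (res->parent)
		return -EBUSY;	/* already claimed */
	res->parent = root;
	return 0;
}

int cxl_accel_request_resource(struct cxl_dev_state *cxlds,
			       enum accel_resource type)
{
	switch (type) {
	case CXL_ACCEL_RES_RAM:
		return request_resource(&cxlds->dpa_res, &cxlds->ram_res);
	case CXL_ACCEL_RES_PMEM:
		return request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
	default:
		/* DPA is the parent range here, so it cannot be requested. */
		return -EINVAL;
	}
}
```

The call site then reads
cxl_accel_request_resource(cxl->cxlds, CXL_ACCEL_RES_RAM), which says
what is being requested without looking up the meaning of "true".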

> +	if (is_ram)
> +		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
> +	else
> +		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
> +
>  static int cxl_memdev_release_file(struct inode *inode, struct file
> *file) {
>  	struct cxl_memdev *cxlmd =
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c
> b/drivers/net/ethernet/sfc/efx_cxl.c index 10c4fb915278..9cefcaf3caca
> 100644 --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -48,8 +48,13 @@ void efx_cxl_init(struct efx_nic *efx)
>  	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>  	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>  
> -	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds)) {
>  		pci_info(pci_dev, "CXL accel setup regs failed");
> +		return;
> +	}
> +
> +	if (cxl_accel_request_resource(cxl->cxlds, true))
> +		pci_info(pci_dev, "CXL accel resource request failed");
>  }
>  
> 

The guidelines for error reporting from a driver are mostly written
from the user's perspective. If it is an error, shout, and let the user
know what happened. Otherwise, we usually don't disturb the user beyond
telling them that we are loaded and everything works fine.

Please use pci_err() instead, so the user can spot it when filtering
the kernel log (dmesg) by error level.

> diff --git a/include/linux/cxl_accel_mem.h
> b/include/linux/cxl_accel_mem.h index ca7af4a9cefc..c7b254edc096
> 100644 --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -20,4 +20,5 @@ void cxl_accel_set_serial(cxl_accel_state *cxlds,
> u64 serial); void cxl_accel_set_resource(struct cxl_dev_state *cxlds,
> struct resource res, enum accel_resource);
>  int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct
> cxl_dev_state *cxlds); +int cxl_accel_request_resource(struct
> cxl_dev_state *cxlds, bool is_ram); #endif



* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-07-15 17:28 ` [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state alejandro.lucero-palau
  2024-07-19 19:01   ` Dave Jiang
  2024-08-04 17:22   ` Jonathan Cameron
@ 2024-08-09  9:10   ` Zhi Wang
  2024-08-15 15:20     ` Alejandro Lucero Palau
  2 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-09  9:10 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero,
	targupta

On Mon, 15 Jul 2024 18:28:24 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Type2 devices have some Type3 functionalities as optional like an mbox
> or an hdm decoder, and CXL core needs a way to know what a CXL
> accelerator implements.
> 
> Add a new field for keeping device capabilities to be initialised by
> Type2 drivers. Advertise all those capabilities for Type3.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/mbox.c            |  1 +
>  drivers/cxl/core/memdev.c          |  4 +++-
>  drivers/cxl/core/port.c            |  2 +-
>  drivers/cxl/core/regs.c            | 11 ++++++-----
>  drivers/cxl/cxl.h                  |  2 +-
>  drivers/cxl/cxlmem.h               |  4 ++++
>  drivers/cxl/pci.c                  | 15 +++++++++------
>  drivers/net/ethernet/sfc/efx_cxl.c |  3 ++-
>  include/linux/cxl_accel_mem.h      |  5 ++++-
>  9 files changed, 31 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 2626f3fff201..2ba7d36e3f38 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -1424,6 +1424,7 @@ struct cxl_memdev_state
> *cxl_memdev_state_create(struct device *dev) mds->cxlds.reg_map.host
> = dev; mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
>  	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
> +	mds->cxlds.capabilities = CXL_DRIVER_CAP_HDM |
> CXL_DRIVER_CAP_MBOX; mds->ram_perf.qos_class = CXL_QOS_CLASS_INVALID;
>  	mds->pmem_perf.qos_class = CXL_QOS_CLASS_INVALID;
>  
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 04c3a0f8bc2e..b4205ecca365 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -616,7 +616,7 @@ static void detach_memdev(struct work_struct
> *work) 
>  static struct lock_class_key cxl_memdev_key;
>  
> -struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev,
> uint8_t caps) {
>  	struct cxl_dev_state *cxlds;
>  
> @@ -631,6 +631,8 @@ struct cxl_dev_state
> *cxl_accel_state_create(struct device *dev) cxlds->ram_res =
> DEFINE_RES_MEM_NAMED(0, 0, "ram"); cxlds->pmem_res =
> DEFINE_RES_MEM_NAMED(0, 0, "pmem"); 
> +	cxlds->capabilities = caps;
> +
>  	return cxlds;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 887ed6e358fb..d66c6349ed2d 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -763,7 +763,7 @@ static int cxl_setup_comp_regs(struct device
> *host, struct cxl_register_map *map map->reg_type =
> CXL_REGLOC_RBI_COMPONENT; map->max_size =
> CXL_COMPONENT_REG_BLOCK_SIZE; 
> -	return cxl_setup_regs(map);
> +	return cxl_setup_regs(map, 0);
>  }
>  
>  static int cxl_port_setup_regs(struct cxl_port *port,
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index e1082e749c69..9d218ebe180d 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -421,7 +421,7 @@ static void cxl_unmap_regblock(struct
> cxl_register_map *map) map->base = NULL;
>  }
>  
> -static int cxl_probe_regs(struct cxl_register_map *map)
> +static int cxl_probe_regs(struct cxl_register_map *map, uint8_t caps)
>  {

Can we avoid uintxx_t, like the rest of cxl-core does? u{8,16,...} are
the types generally used in kernel code, and your previous patches use
them nicely.

Let's use u8 for caps.
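
For reference, a userspace sketch of the capability-gated mailbox check
this patch adds to cxl_probe_regs(), written with u8. The typedef
stands in for what the kernel provides through <linux/types.h>, and
struct dev_reg_map and cxl_dev_regs_ok() are hypothetical, flattened
names used only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t u8;	/* in the kernel, u8 comes from <linux/types.h> */

#define CXL_DRIVER_CAP_HDM	0x1
#define CXL_DRIVER_CAP_MBOX	0x2

/* Hypothetical, flattened view of the probed device register blocks. */
struct dev_reg_map {
	bool status_valid;
	bool mbox_valid;
	bool memdev_valid;
};

/*
 * Returns true when every required register block was found. The
 * mailbox is only required if the driver advertised CXL_DRIVER_CAP_MBOX.
 */
bool cxl_dev_regs_ok(const struct dev_reg_map *map, u8 caps)
{
	bool need_mbox = (caps & CXL_DRIVER_CAP_MBOX) != 0;

	if (!map->status_valid || !map->memdev_valid)
		return false;

	return !need_mbox || map->mbox_valid;
}
```

A device without a mailbox passes when its driver sets only
CXL_DRIVER_CAP_HDM, and fails as before when CXL_DRIVER_CAP_MBOX is
set.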

>  	struct cxl_component_reg_map *comp_map;
>  	struct cxl_device_reg_map *dev_map;
> @@ -437,11 +437,12 @@ static int cxl_probe_regs(struct
> cxl_register_map *map) case CXL_REGLOC_RBI_MEMDEV:
>  		dev_map = &map->device_map;
>  		cxl_probe_device_regs(host, base, dev_map);
> -		if (!dev_map->status.valid || !dev_map->mbox.valid ||
> +		if (!dev_map->status.valid ||
> +		    ((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ||
>  		    !dev_map->memdev.valid) {
>  			dev_err(host, "registers not found: %s%s%s\n",
>  				!dev_map->status.valid ? "status " : "",
> -				!dev_map->mbox.valid ? "mbox " : "",
> +				((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ? "mbox " : "",
>  				!dev_map->memdev.valid ? "memdev " : "");
>  			return -ENXIO;
>  		}
> @@ -455,7 +456,7 @@ static int cxl_probe_regs(struct cxl_register_map
> *map) return 0;
>  }
>  
> -int cxl_setup_regs(struct cxl_register_map *map)
> +int cxl_setup_regs(struct cxl_register_map *map, uint8_t caps)
>  {
>  	int rc;
>  
> @@ -463,7 +464,7 @@ int cxl_setup_regs(struct cxl_register_map *map)
>  	if (rc)
>  		return rc;
>  
> -	rc = cxl_probe_regs(map);
> +	rc = cxl_probe_regs(map, caps);
>  	cxl_unmap_regblock(map);
>  
>  	return rc;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index a6613a6f8923..9973430d975f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -300,7 +300,7 @@ int cxl_find_regblock_instance(struct pci_dev
> *pdev, enum cxl_regloc_type type, struct cxl_register_map *map, int
> index); int cxl_find_regblock(struct pci_dev *pdev, enum
> cxl_regloc_type type, struct cxl_register_map *map);
> -int cxl_setup_regs(struct cxl_register_map *map);
> +int cxl_setup_regs(struct cxl_register_map *map, uint8_t caps);
>  struct cxl_dport;
>  resource_size_t cxl_rcd_component_reg_phys(struct device *dev,
>  					   struct cxl_dport *dport);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index af8169ccdbc0..8f2a820bd92d 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -405,6 +405,9 @@ struct cxl_dpa_perf {
>  	int qos_class;
>  };
>  
> +#define CXL_DRIVER_CAP_HDM	0x1
> +#define CXL_DRIVER_CAP_MBOX	0x2
> +
>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> @@ -438,6 +441,7 @@ struct cxl_dev_state {
>  	struct resource ram_res;
>  	u64 serial;
>  	enum cxl_devtype type;
> +	uint8_t capabilities;
>  };
>  
>  /**
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index b34d6259faf4..e2a978312281 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -502,7 +502,8 @@ static int cxl_rcrb_get_comp_regs(struct pci_dev
> *pdev, }
>  
>  static int cxl_pci_setup_regs(struct pci_dev *pdev, enum
> cxl_regloc_type type,
> -			      struct cxl_register_map *map)
> +			      struct cxl_register_map *map,
> +			      uint8_t cxl_dev_caps)
>  {
>  	int rc;
>  
> @@ -519,7 +520,7 @@ static int cxl_pci_setup_regs(struct pci_dev
> *pdev, enum cxl_regloc_type type, if (rc)
>  		return rc;
>  
> -	return cxl_setup_regs(map);
> +	return cxl_setup_regs(map, cxl_dev_caps);
>  }
>  
>  int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct
> cxl_dev_state *cxlds) @@ -527,7 +528,8 @@ int
> cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state
> *cxlds) struct cxl_register_map map; int rc;
>  
> -	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
> +				cxlds->capabilities);
>  	if (rc)
>  		return rc;
>  
> @@ -536,7 +538,7 @@ int cxl_pci_accel_setup_regs(struct pci_dev
> *pdev, struct cxl_dev_state *cxlds) return rc;
>  
>  	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> -				&cxlds->reg_map);
> +				&cxlds->reg_map,
> cxlds->capabilities); if (rc)
>  		dev_warn(&pdev->dev, "No component registers
> (%d)\n", rc); 
> @@ -850,7 +852,8 @@ static int cxl_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *id) dev_warn(&pdev->dev,
>  			 "Device DVSEC not present, skip CXL.mem
> init\n"); 
> -	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
> +				cxlds->capabilities);
>  	if (rc)
>  		return rc;
>  
> @@ -863,7 +866,7 @@ static int cxl_pci_probe(struct pci_dev *pdev,
> const struct pci_device_id *id)
>  	 * still be useful for management functions so don't return
> an error. */
>  	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> -				&cxlds->reg_map);
> +				&cxlds->reg_map,
> cxlds->capabilities); if (rc)
>  		dev_warn(&pdev->dev, "No component registers
> (%d)\n", rc); else if (!cxlds->reg_map.component_map.ras.valid)
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c
> b/drivers/net/ethernet/sfc/efx_cxl.c index 9cefcaf3caca..37d8bfdef517
> 100644 --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -33,7 +33,8 @@ void efx_cxl_init(struct efx_nic *efx)
>  
>  	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability
> found"); 
> -	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev,
> +
> CXL_ACCEL_DRIVER_CAP_HDM); if (IS_ERR(cxl->cxlds)) {
>  		pci_info(pci_dev, "CXL accel device state failed");
>  		return;
> diff --git a/include/linux/cxl_accel_mem.h
> b/include/linux/cxl_accel_mem.h index c7b254edc096..0ba2195b919b
> 100644 --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -12,8 +12,11 @@ enum accel_resource{
>  	CXL_ACCEL_RES_PMEM,
>  };
>  
> +#define CXL_ACCEL_DRIVER_CAP_HDM	0x1
> +#define CXL_ACCEL_DRIVER_CAP_MBOX	0x2
> +
>  typedef struct cxl_dev_state cxl_accel_state;
> -cxl_accel_state *cxl_accel_state_create(struct device *dev);
> +cxl_accel_state *cxl_accel_state_create(struct device *dev, uint8_t
> caps); 
>  void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>  void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);



* Re: [PATCH v2 05/15] cxl: fix use of resource_contains
  2024-07-15 17:28 ` [PATCH v2 05/15] cxl: fix use of resource_contains alejandro.lucero-palau
  2024-07-24 21:25   ` fan
  2024-08-04 17:25   ` Jonathan Cameron
@ 2024-08-09  9:14   ` Zhi Wang
  2024-08-16 14:42     ` Alejandro Lucero Palau
  2 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-09  9:14 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero,
	targupta

On Mon, 15 Jul 2024 18:28:25 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> For a resource defined with size zero, resource contains will also
> return true.
> 
> Add resource size check before using it.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/hdm.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 3df10517a327..4af9225d4b59 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -327,10 +327,13 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>  	cxled->dpa_res = res;
>  	cxled->skip = skipped;
>  
> -	if (resource_contains(&cxlds->pmem_res, res))
> +	if ((resource_size(&cxlds->pmem_res)) && (resource_contains(&cxlds->pmem_res, res))) {
> +		printk("%s: resource_contains CXL_DECODER_PMEM\n", __func__);
>  		cxled->mode = CXL_DECODER_PMEM;
> -	else if (resource_contains(&cxlds->ram_res, res))
> +	} else if ((resource_size(&cxlds->ram_res)) && (resource_contains(&cxlds->ram_res, res))) {
> +		printk("%s: resource_contains CXL_DECODER_RAM\n", __func__);
>  		cxled->mode = CXL_DECODER_RAM;
> +	}
>  	else {
>  		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
>  			 port->id, cxled->cxld.id, cxled->dpa_res);

Also, please clean up your printks before sending them to stable.
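
For anyone wondering why a zero-size resource "contains" everything,
here is a small userspace sketch. It assumes, as the DEFINE_RES_*()
macros do, that end is computed as start + size - 1, so a size of 0
wraps end around to the maximum value. The helpers are simplified
stand-ins for the kernel ones (the type and flag checks are left out),
and define_res() and res_really_contains() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced stand-in for the kernel's struct resource. */
struct resource {
	uint64_t start;
	uint64_t end;
};

/* Mirrors DEFINE_RES_*(): end = start + size - 1, which wraps for size 0. */
static inline struct resource define_res(uint64_t start, uint64_t size)
{
	struct resource r = { start, start + size - 1 };
	return r;
}

static inline uint64_t resource_size(const struct resource *r)
{
	return r->end - r->start + 1;
}

/* Range comparison only; the kernel version also checks type and flags. */
static inline bool resource_contains(const struct resource *r1,
				     const struct resource *r2)
{
	return r1->start <= r2->start && r1->end >= r2->end;
}

/* The guarded form the patch introduces: empty resources contain nothing. */
bool res_really_contains(const struct resource *outer,
			 const struct resource *inner)
{
	return resource_size(outer) && resource_contains(outer, inner);
}
```

With pmem_res defined as (0, 0), the unguarded check matches any
request and the decoder would be marked as PMEM; the size test is what
lets the RAM branch be reached.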


* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-07-23 13:43     ` Alejandro Lucero Palau
@ 2024-08-09 10:25       ` Zhi Wang
  2024-08-15 15:37         ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-09 10:25 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes, targupta

On Tue, 23 Jul 2024 14:43:24 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> 
> On 7/19/24 20:01, Dave Jiang wrote:
> >
> >>   
> >> -static int cxl_probe_regs(struct cxl_register_map *map)
> >> +static int cxl_probe_regs(struct cxl_register_map *map, uint8_t
> >> caps) {
> >>   	struct cxl_component_reg_map *comp_map;
> >>   	struct cxl_device_reg_map *dev_map;
> >> @@ -437,11 +437,12 @@ static int cxl_probe_regs(struct
> >> cxl_register_map *map) case CXL_REGLOC_RBI_MEMDEV:
> >>   		dev_map = &map->device_map;
> >>   		cxl_probe_device_regs(host, base, dev_map);
> >> -		if (!dev_map->status.valid ||
> >> !dev_map->mbox.valid ||
> >> +		if (!dev_map->status.valid ||
> >> +		    ((caps & CXL_DRIVER_CAP_MBOX) &&
> >> !dev_map->mbox.valid) || !dev_map->memdev.valid) {
> >>   			dev_err(host, "registers not found:
> >> %s%s%s\n", !dev_map->status.valid ? "status " : "",
> >> -				!dev_map->mbox.valid ? "mbox " :
> >> "",
> >> +				((caps & CXL_DRIVER_CAP_MBOX) &&
> >> !dev_map->mbox.valid) ? "mbox " : "",
> > According to the r3.1 8.2.8.2.1, the device status registers and
> > the primary mailbox registers are both mandatory if regloc id=3
> > block is found. So if the type2 device does not implement a mailbox
> > then it shouldn't be calling cxl_pci_setup_regs(pdev,
> > CXL_REGLOC_RBI_MEMDEV, &map) to begin with from the driver init
> > right? If the type2 device defines a regblock with id=3 but without
> > a mailbox, then isn't that a spec violation?
> >
> > DJ
> 
> 
> Right. The code needs to support the possibility of a Type2 having a 
> mailbox, and if it is not supported, the rest of the dvsec regs 
> initialization needs to be performed. This is not what the code does 
> now, so I'll fix this.
> 
> 
> A wider explanation is, for the RFC I used a test driver based on
> QEMU emulating a Type2 which had a CXL Device Register Interface
> defined (03h) but not a CXL Device Capability with id 2 for the
> primary mailbox register, breaking the spec as you spotted.
> 
> 

The SFC driver uses the Memory Device Status Register (8.2.8.5.1.1) to
determine whether the memory media is ready (in PATCH 6). That register
should be in a regloc id=3 block.

According to the spec text quoted above, a device that has a regloc id=3
block needs to have both the device status and the mailbox registers.

Curious, does the SFC device have to implement the mailbox in this case
for spec compliance?

Previously, I always thought that "CXL Memory Device" == "CXL Type-3
device" in the CXL spec.

Now I am a little confused about whether a type-2 device that supports
cxl.mem is the "CXL Memory Device" mentioned in the spec.

If the answer is yes, then a regloc id=3 block and a mailbox become
mandatory for a type-2 device that supports cxl.mem, for spec
compliance.

If the answer is no, can a type-2 device use approaches other than the
Memory Device Status Register to determine the readiness of the memory?
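
To make the two readings concrete, here is a minimal userspace sketch of
the check quoted above. It is not the kernel code: the struct layout and
the DRIVER_CAP_MBOX value are stand-ins I made up for illustration, and
only the shape of the condition follows the patch. It models the second
reading, where the mailbox is required only if the driver asks for it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fields of struct cxl_device_reg_map. */
struct reg_block {
	bool valid;
};

struct device_reg_map {
	struct reg_block status;
	struct reg_block mbox;
	struct reg_block memdev;
};

/* Assumed bit value; it only mirrors the idea of CXL_DRIVER_CAP_MBOX. */
#define DRIVER_CAP_MBOX (1u << 0)

/* Return true when every register block the driver relies on was found. */
bool device_regs_ok(const struct device_reg_map *map, uint8_t caps)
{
	/* Status and memdev blocks are always required in this model. */
	if (!map->status.valid || !map->memdev.valid)
		return false;

	/* The mailbox is only required when the driver declares it. */
	if ((caps & DRIVER_CAP_MBOX) && !map->mbox.valid)
		return false;

	return true;
}
```

Under the first reading (mailbox always mandatory for a regloc id=3
block) the caps test would simply go away and the check would reject
any map without a valid mbox block.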

ZW

> Thanks.
> 
> 



* Re: [PATCH v2 08/15] cxl: indicate probe deferral
  2024-07-15 17:28 ` [PATCH v2 08/15] cxl: indicate probe deferral alejandro.lucero-palau
                     ` (2 preceding siblings ...)
  2024-08-04 17:41   ` Jonathan Cameron
@ 2024-08-09 14:40   ` Zhi Wang
  2024-08-26 17:42   ` Zhi Wang
  4 siblings, 0 replies; 114+ messages in thread
From: Zhi Wang @ 2024-08-09 14:40 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero,
	targupta

On Mon, 15 Jul 2024 18:28:28 +0100
<alejandro.lucero-palau@amd.com> wrote:

One more spelling error spotted, in addition to those raised in the
other review threads. I will circle back with more comments once I have
checked the users of the APIs.

> From: Alejandro Lucero <alucerop@amd.com>
> 
> The first stop for a CXL accelerator driver that wants to establish
> new CXL.mem regions is to register a 'struct cxl_memdev. That kicks
> off cxl_mem_probe() to enumerate all 'struct cxl_port' instances in
> the topology up to the root.
> 
> If the root driver has not attached yet the expectation is that the
> driver waits until that link is established. The common cxl_pci_driver
> has reason to keep the 'struct cxl_memdev' device attached to the bus
> until the root driver attaches. An accelerator may want to instead
> defer probing until CXL resources can be acquired.
> 
> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
> accelerator driver probing should be defferred vs failed. Provide that
                                         ^deferred
> indication via a new cxl_acquire_endpoint() API that can retrieve the
> probe status of the memdev.
> 
> The first consumer of this API is a test driver that excercises the
> CXL Type-2 flow.
> 
> Based on
> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c          | 41
> ++++++++++++++++++++++++++++++ drivers/cxl/core/port.c            |
> 2 +- drivers/cxl/mem.c                  |  7 +++--
>  drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
>  include/linux/cxl_accel_mem.h      |  3 +++
>  5 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index b902948b121f..d51c8bfb32e3 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct
> device *host, }
>  EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>  
> +/*
> + * Try to get a locked reference on a memdev's CXL port topology
> + * connection. Be careful to observe when cxl_mem_probe() has
> deposited
> + * a probe deferral awaiting the arrival of the CXL root driver
> +*/
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_port *endpoint;
> +	int rc = -ENXIO;
> +
> +	device_lock(&cxlmd->dev);
> +	endpoint = cxlmd->endpoint;
> +	if (!endpoint)
> +		goto err;
> +
> +	if (IS_ERR(endpoint)) {
> +		rc = PTR_ERR(endpoint);
> +		goto err;
> +	}
> +
> +	device_lock(&endpoint->dev);
> +	if (!endpoint->dev.driver)
> +		goto err_endpoint;
> +
> +	return endpoint;
> +
> +err_endpoint:
> +	device_unlock(&endpoint->dev);
> +err:
> +	device_unlock(&cxlmd->dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
> +
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port
> *endpoint) +{
> +	device_unlock(&endpoint->dev);
> +	device_unlock(&cxlmd->dev);
> +}
> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
> +
>  static void sanitize_teardown_notifier(void *data)
>  {
>  	struct cxl_memdev_state *mds = data;
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index d66c6349ed2d..3c6b896c5f65 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev
> *cxlmd, */
>  		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>  			dev_name(dport_dev));
> -		return -ENXIO;
> +		return -EPROBE_DEFER;
>  	}
>  
>  	parent_port = find_cxl_port(dparent, &parent_dport);
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index f76af75a87b7..383a6f4829d3 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
>  		return rc;
>  
>  	rc = devm_cxl_enumerate_ports(cxlmd);
> -	if (rc)
> +	if (rc) {
> +		cxlmd->endpoint = ERR_PTR(rc);
>  		return rc;
> +	}
>  
>  	parent_port = cxl_mem_find_port(cxlmd, &dport);
>  	if (!parent_port) {
>  		dev_err(dev, "CXL port topology not found\n");
> -		return -ENXIO;
> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
> +		return -EPROBE_DEFER;
>  	}
>  
>  	if (resource_size(&cxlds->pmem_res) &&
> IS_ENABLED(CONFIG_CXL_PMEM)) { diff --git
> a/drivers/net/ethernet/sfc/efx_cxl.c
> b/drivers/net/ethernet/sfc/efx_cxl.c index 0abe66490ef5..2cf4837ddfc1
> 100644 --- a/drivers/net/ethernet/sfc/efx_cxl.c +++
> b/drivers/net/ethernet/sfc/efx_cxl.c @@ -65,8 +65,16 @@ void
> efx_cxl_init(struct efx_nic *efx) }
>  
>  	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
> -	if (IS_ERR(cxl->cxlmd))
> +	if (IS_ERR(cxl->cxlmd)) {
>  		pci_info(pci_dev, "CXL accel memdev creation
> failed");
> +		return;
> +	}
> +
> +	cxl->endpoint = cxl_acquire_endpoint(cxl->cxlmd);
> +	if (IS_ERR(cxl->endpoint))
> +		pci_info(pci_dev, "CXL accel acquire endpoint
> failed"); +
> +	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h
> b/include/linux/cxl_accel_mem.h index 442ed9862292..701910021df8
> 100644 --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -29,4 +29,7 @@ int cxl_await_media_ready(struct cxl_dev_state
> *cxlds); 
>  struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>  				       struct cxl_dev_state *cxlds);
> +
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port
> *endpoint); #endif
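
For readers following the deferral logic, below is a minimal userspace
sketch of how the probe status travels through the endpoint pointer. It
is an illustration under assumptions, not the kernel implementation: the
ERR_PTR helpers are re-implemented locally, the structs are reduced to
one field each, the `bound` flag stands in for `endpoint->dev.driver`,
and the device locking done by the real function is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EPROBE_DEFER 517	/* kernel-internal errno, not in userspace errno.h */
#define MAX_ERRNO    4095

/* Local re-implementations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Reduced stand-ins for the real structs. */
struct cxl_port {
	int bound;		/* models endpoint->dev.driver != NULL */
};

struct cxl_memdev {
	struct cxl_port *endpoint;
};

/* Same decision order as the quoted function, minus the locking. */
struct cxl_port *acquire_endpoint(struct cxl_memdev *cxlmd)
{
	struct cxl_port *endpoint = cxlmd->endpoint;

	if (!endpoint)
		return ERR_PTR(-ENXIO);
	if (IS_ERR(endpoint))
		return endpoint;	/* pass on the stored probe status */
	if (!endpoint->bound)
		return ERR_PTR(-ENXIO);

	return endpoint;
}
```

A caller can then tell "try again later" from "give up" by comparing
PTR_ERR() of the result against -EPROBE_DEFER.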



* Re: [PATCH v2 14/15] cxl: add function for obtaining params from a region
  2024-07-15 17:28 ` [PATCH v2 14/15] cxl: add function for obtaining params from a region alejandro.lucero-palau
@ 2024-08-09 15:24   ` Zhi Wang
  2024-08-19 16:14     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-09 15:24 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero,
	targupta

On Mon, 15 Jul 2024 18:28:34 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> A CXL region struct contains the physical address to work with.
> 
> Add a function for given a opaque cxl region struct returns the params
> to be used for mapping such memory range.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/region.c     | 16 ++++++++++++++++
>  drivers/cxl/cxl.h             |  3 +++
>  include/linux/cxl_accel_mem.h |  2 ++
>  3 files changed, 21 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index c8fc14ac437e..9ff10923e9fc 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3345,6 +3345,22 @@ static int devm_cxl_add_dax_region(struct
> cxl_region *cxlr) return rc;
>  }
>  
> +int cxl_accel_get_region_params(struct cxl_region *region,
> +				resource_size_t *start,
> resource_size_t *end) +{
> +	if (!region)
> +		return -ENODEV;
> +
> +	if (!region->params.res) {
> +		return -ENODEV;
> +	}

Remove the extra {}
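
Something along these lines, as a userspace sketch only. The struct
definitions are cut-down stand-ins so the snippet compiles on its own,
and folding the two NULL checks into one condition is my own
suggestion, not something the patch does. Both checks return -ENODEV,
so the behaviour stays the same.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins for the kernel types, for illustration only. */
typedef uint64_t resource_size_t;

struct resource {
	resource_size_t start;
	resource_size_t end;
};

struct cxl_region_params {
	struct resource *res;
};

struct cxl_region {
	struct cxl_region_params params;
};

int cxl_accel_get_region_params(struct cxl_region *region,
				resource_size_t *start, resource_size_t *end)
{
	/* Single-statement branch, so no braces are needed. */
	if (!region || !region->params.res)
		return -ENODEV;

	*start = region->params.res->start;
	*end = region->params.res->end;

	return 0;
}
```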

> +	*start = region->params.res->start;
> +	*end = region->params.res->end;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_get_region_params, CXL);
> +
>  static int match_root_decoder_by_range(struct device *dev, void
> *data) {
>  	struct range *r1, *r2 = data;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 1bf3b74ff959..b4c4c4455ef1 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -906,6 +906,9 @@ void cxl_coordinates_combine(struct
> access_coordinate *out, bool
> cxl_endpoint_decoder_reset_detected(struct cxl_port *port); 
>  int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
> +
> +int cxl_accel_get_region_params(struct cxl_region *region,
> +				resource_size_t *start,
> resource_size_t *end); /*
>   * Unit test builds overrides this to __weak, find the 'strong'
> version
>   * of these symbols in tools/testing/cxl/.
> diff --git a/include/linux/cxl_accel_mem.h
> b/include/linux/cxl_accel_mem.h index a5f9ffc24509..5d715eea6e91
> 100644 --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -53,4 +53,6 @@ struct cxl_region *cxl_create_region(struct
> cxl_root_decoder *cxlrd, int ways);
>  
>  int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
> +int cxl_accel_get_region_params(struct cxl_region *region,
> +				resource_size_t *start,
> resource_size_t *end); #endif



* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-04 17:10   ` Jonathan Cameron
@ 2024-08-12 11:16     ` Alejandro Lucero Palau
  2024-08-13  8:30       ` Alejandro Lucero Palau
  2024-08-15 16:35       ` Jonathan Cameron
  0 siblings, 2 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-12 11:16 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:10, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:21 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Differientiate Type3, aka memory expanders, from Type2, aka device
>> accelerators, with a new function for initializing cxl_dev_state.
>>
>> Create opaque struct to be used by accelerators relying on new access
>> functions in following patches.
>>
>> Add SFC ethernet network driver as the client.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>

>> +
>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>> +{
>> +	cxlds->cxl_dvsec = dvsec;
> Nothing to do with accel. If these make sense promote to cxl
> core and a linux/cxl/ header.  Also we may want the type3 driver to
> switch to them long term. If nothing else, making that handle the
> cxl_dev_state as more opaque will show up what is still directly
> accessed and may need to be wrapped up for a future accelerator driver
> to use.
>

I will change the function name then, but not sure I follow the comment 
about more opaque ...


>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_dvsec, CXL);
>> +
>> +void cxl_accel_set_serial(struct cxl_dev_state *cxlds, u64 serial)
>> +{
>> +	cxlds->serial= serial;
> Run checkpatch over this series before v3 with --strict and fix the
> warnings. Probably would have spotted missing space before =
>
> Sure it's a series that is kind of RFC ish at the moment but clean
> code means you don't get nitpickers like me pointing this stuff out!
>

Sure. Thanks.

>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_serial, CXL);
>> +
>> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>> +			    enum accel_resource type)
>> +{
>> +	switch (type) {
>> +	case CXL_ACCEL_RES_DPA:
>> +		cxlds->dpa_res = res;
>> +		return;
>> +	case CXL_ACCEL_RES_RAM:
>> +		cxlds->ram_res = res;
>> +		return;
>> +	case CXL_ACCEL_RES_PMEM:
>> +		cxlds->pmem_res = res;
>> +		return;
>> +	default:
>> +		dev_err(cxlds->dev, "unkown resource type (%u)\n", type);
> typo. Plus I'd let this return an error as we may well have more types
> in future and not handle them all.
>

OK.
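
A rough sketch of what the error-returning variant could look like,
written as standalone C so it can be compiled outside the kernel. The
struct definitions are reduced stand-ins, and -EINVAL is only my
assumption for the return value; the thread has not settled on one.

```c
#include <errno.h>

/* Reduced stand-ins for the kernel structs, for illustration only. */
struct resource {
	unsigned long long start;
	unsigned long long end;
	const char *name;
};

enum accel_resource {
	CXL_ACCEL_RES_DPA,
	CXL_ACCEL_RES_RAM,
	CXL_ACCEL_RES_PMEM,
};

struct cxl_dev_state {
	struct resource dpa_res;
	struct resource ram_res;
	struct resource pmem_res;
};

/* Returns 0 on success, -EINVAL for a resource type it does not know. */
int cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
			   enum accel_resource type)
{
	switch (type) {
	case CXL_ACCEL_RES_DPA:
		cxlds->dpa_res = res;
		return 0;
	case CXL_ACCEL_RES_RAM:
		cxlds->ram_res = res;
		return 0;
	case CXL_ACCEL_RES_PMEM:
		cxlds->pmem_res = res;
		return 0;
	default:
		return -EINVAL;
	}
}
```

With this shape a caller that passes a type added later, but not yet
handled here, gets an error back instead of a silent no-op.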


>>   	pci_dbg(efx->pci_dev, "shutdown successful\n");
>>   
>>   	efx_fini_devlink_and_unlock(efx);
>> @@ -1109,6 +1111,8 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
>>   	if (rc)
>>   		goto fail2;
>>   
>> +	efx_cxl_init(efx);
>> +
> As below, have an error code. This is not something we want to fail
> and have the driver carry on.


As you have seen in another patch when CXL initialization is taken into 
account, the driver can keep going if this fails.

Those pci_warn/err inside CXL core should be enough.


>>   	rc = efx_pci_probe_post_io(efx);
>>   	if (rc) {
>>   		/* On failure, retry once immediately.
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> new file mode 100644
>> index 000000000000..4554dd7cca76
>> --- /dev/null
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -0,0 +1,53 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/****************************************************************************
>> + * Driver for AMD network controllers and boards
>> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation, incorporated herein by reference.
>> + */
>> +
>> +
>> +#include <linux/pci.h>
>> +#include <linux/cxl_accel_mem.h>
>> +#include <linux/cxl_accel_pci.h>
>> +
>> +#include "net_driver.h"
>> +#include "efx_cxl.h"
>> +
>> +#define EFX_CTPIO_BUFFER_SIZE	(1024*1024*256)
>> +
>> +void efx_cxl_init(struct efx_nic *efx)
>> +{
>> +	struct pci_dev *pci_dev = efx->pci_dev;
>> +	struct efx_cxl *cxl = efx->cxl;
>> +	struct resource res;
>> +	u16 dvsec;
>> +
>> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
>> +					  CXL_DVSEC_PCIE_DEVICE);
>> +
>> +	if (!dvsec)
>> +		return;
>> +
>> +	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");
> pci_dbg();


Right.


>
>> diff --git a/include/linux/cxl_accel_pci.h b/include/linux/cxl_accel_pci.h
>> new file mode 100644
>> index 000000000000..c337ae8797e6
>> --- /dev/null
>> +++ b/include/linux/cxl_accel_pci.h
>> @@ -0,0 +1,23 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>> +
>> +#ifndef __CXL_ACCEL_PCI_H
>> +#define __CXL_ACCEL_PCI_H
>> +
>> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
>> +#define CXL_DVSEC_PCIE_DEVICE					0
>> +#define   CXL_DVSEC_CAP_OFFSET		0xA
>> +#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
>> +#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
>> +#define   CXL_DVSEC_CTRL_OFFSET		0xC
>> +#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
>> +#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
>> +#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
>> +#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
>> +#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
>> +#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
>> +#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
>> +#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
>> +#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
> As I think Dave suggested, pull any defs you need to linux/cxl/pci.h or whatever
> makes sense and make the existing code look for them there.
>
> Ideally do that in a patch that does nothing else as simple
> moves are easier to review quickly than ones mixed with real changes.


Will do.


Thanks


>
>
>> +
>> +#endif


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-09  8:34   ` Zhi Wang
@ 2024-08-12 11:34     ` Alejandro Lucero Palau
  2024-08-17 20:32       ` Zhi Wang
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-12 11:34 UTC (permalink / raw)
  To: Zhi Wang, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, targupta


On 8/9/24 09:34, Zhi Wang wrote:
> On Mon, 15 Jul 2024 18:28:21 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Differientiate Type3, aka memory expanders, from Type2, aka device
>> accelerators, with a new function for initializing cxl_dev_state.
>>
>> Create opaque struct to be used by accelerators relying on new access
>> functions in following patches.
>>
>> Add SFC ethernet network driver as the client.
>>
>> Based on
>> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>   drivers/cxl/core/memdev.c             | 52 ++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/Makefile     |  2 +-
>>   drivers/net/ethernet/sfc/efx.c        |  4 ++
>>   drivers/net/ethernet/sfc/efx_cxl.c    | 53
>> +++++++++++++++++++++++++++ drivers/net/ethernet/sfc/efx_cxl.h    |
>> 29 +++++++++++++++ drivers/net/ethernet/sfc/net_driver.h |  4 ++
>>   include/linux/cxl_accel_mem.h         | 22 +++++++++++
>>   include/linux/cxl_accel_pci.h         | 23 ++++++++++++
>>   8 files changed, 188 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>>   create mode 100644 include/linux/cxl_accel_mem.h
>>   create mode 100644 include/linux/cxl_accel_pci.h
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 0277726afd04..61b5d35b49e7 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -8,6 +8,7 @@
>>   #include <linux/idr.h>
>>   #include <linux/pci.h>
>>   #include <cxlmem.h>
>> +#include <linux/cxl_accel_mem.h>
> Let's keep the header inclusion in an alphabetical order. The same in
> efx_cxl.c


The headers seem to follow a reverse Christmas tree order here rather 
than an alphabetical one.

Should I rearrange them all?


>>   #include "trace.h"
>>   #include "core.h"
>>   
>> @@ -615,6 +616,25 @@ static void detach_memdev(struct work_struct
>> *work)
>>   static struct lock_class_key cxl_memdev_key;
>>   
>> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
>> +{
>> +	struct cxl_dev_state *cxlds;
>> +
>> +	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
>> +	if (!cxlds)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	cxlds->dev = dev;
>> +	cxlds->type = CXL_DEVTYPE_DEVMEM;
>> +
>> +	cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
>> +	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
>> +	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
>> +
>> +	return cxlds;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
>> +
>>   static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state
>> *cxlds, const struct file_operations *fops)
>>   {
>> @@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode,
>> struct file *file) return 0;
>>   }
>>
>> +
>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>> +{
>> +	cxlds->cxl_dvsec = dvsec;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_dvsec, CXL);
>> +
>> +void cxl_accel_set_serial(struct cxl_dev_state *cxlds, u64 serial)
>> +{
>> +	cxlds->serial= serial;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_serial, CXL);
>> +
> It would be nice to explain about how the cxl core is using these in
> the patch comments, as we just saw the stuff got promoted into the core.


As far as I can see, it is for info/debugging purposes. I will add such 
an explanation in the next version.


>
>> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct
>> resource res,
>> +			    enum accel_resource type)
>> +{
>> +	switch (type) {
>> +	case CXL_ACCEL_RES_DPA:
>> +		cxlds->dpa_res = res;
>> +		return;
>> +	case CXL_ACCEL_RES_RAM:
>> +		cxlds->ram_res = res;
>> +		return;
>> +	case CXL_ACCEL_RES_PMEM:
>> +		cxlds->pmem_res = res;
>> +		return;
>> +	default:
>> +		dev_err(cxlds->dev, "unkown resource type (%u)\n",
>> type);
>> +	}
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
>> +
> I wonder in which situation this error can be triggered.
> One can be a newer out-of-tree type-2 driver tries to work on an older
> kernel. Other situations should be the coding problem of an in-tree
> driver.


I guess that would point to an extension not updating this function.


> I prefer to WARN_ONCE() here.


I agree after your previous concern.


>
>>   
>> diff --git a/include/linux/cxl_accel_mem.h
>> b/include/linux/cxl_accel_mem.h new file mode 100644
>> index 000000000000..daf46d41f59c
>> --- /dev/null
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -0,0 +1,22 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>> +
>> +#include <linux/cdev.h>
>> +
>> +#ifndef __CXL_ACCEL_MEM_H
>> +#define __CXL_ACCEL_MEM_H
>> +
>> +enum accel_resource{
>> +	CXL_ACCEL_RES_DPA,
>> +	CXL_ACCEL_RES_RAM,
>> +	CXL_ACCEL_RES_PMEM,
>> +};
>> +
>> +typedef struct cxl_dev_state cxl_accel_state;
> The case of using typedef in kernel coding is very rare (quite many
> of them are still there due to history reason, you can also spot that
> there is only one typedef in driver/cxl). Be sure to double check the
> coding style bible [1] when deciding to use one. :)
>
> [1] https://www.kernel.org/doc/html/v4.14/process/coding-style.html


Right.

I think there is agreement now on not using the typedef and using struct 
cxl_dev_state directly, so problem solved.


Thanks!




* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-12 11:16     ` Alejandro Lucero Palau
@ 2024-08-13  8:30       ` Alejandro Lucero Palau
  2024-08-15 16:38         ` Jonathan Cameron
  2024-08-15 16:35       ` Jonathan Cameron
  1 sibling, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-13  8:30 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/12/24 12:16, Alejandro Lucero Palau wrote:
>
> On 8/4/24 18:10, Jonathan Cameron wrote:
>> On Mon, 15 Jul 2024 18:28:21 +0100
>> <alejandro.lucero-palau@amd.com> wrote:
>>
>>> From: Alejandro Lucero <alucerop@amd.com>
>>>
>>> Differientiate Type3, aka memory expanders, from Type2, aka device
>>> accelerators, with a new function for initializing cxl_dev_state.
>>>
>>> Create opaque struct to be used by accelerators relying on new access
>>> functions in following patches.
>>>
>>> Add SFC ethernet network driver as the client.
>>>
>>> Based on 
>>> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
>>>
>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>>
>
>>> +
>>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>>> +{
>>> +    cxlds->cxl_dvsec = dvsec;
>> Nothing to do with accel. If these make sense promote to cxl
>> core and a linux/cxl/ header.  Also we may want the type3 driver to
>> switch to them long term. If nothing else, making that handle the
>> cxl_dev_state as more opaque will show up what is still directly
>> accessed and may need to be wrapped up for a future accelerator driver
>> to use.
>>
>
> I will change the function name then, but not sure I follow the 
> comment about more opaque ...
>
>
>

I have second thoughts about this.


I consider this an accessor that, as you said in a previous exchange, 
makes it possible to change the core structs without touching the accel 
drivers using them.

The Type3 driver is part of the CXL core and is easy to change for this 
kind of update, since there will only be one driver supporting all Type3 
devices, so an accessor is not required there.

Let me know what you think.
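
To show what I mean by accessor, here is a small standalone sketch of
the pattern. It is only an illustration: the function names are
hypothetical (accel_get_serial in particular does not exist in the
patchset), the struct is reduced to two fields, and calloc() stands in
for the devm allocation. The point is that code written against the
first half never depends on the struct layout in the second half.

```c
#include <stdint.h>
#include <stdlib.h>

/* What an accel driver would see: declarations only, layout hidden. */
struct cxl_dev_state;

struct cxl_dev_state *accel_state_create(void);
void accel_set_serial(struct cxl_dev_state *cxlds, uint64_t serial);
uint64_t accel_get_serial(const struct cxl_dev_state *cxlds);

/* What only the core would see: the real layout, free to change. */
struct cxl_dev_state {
	uint64_t serial;
	uint16_t cxl_dvsec;
};

struct cxl_dev_state *accel_state_create(void)
{
	/* Zeroed allocation; the caller owns and frees it in this sketch. */
	return calloc(1, sizeof(struct cxl_dev_state));
}

void accel_set_serial(struct cxl_dev_state *cxlds, uint64_t serial)
{
	cxlds->serial = serial;
}

uint64_t accel_get_serial(const struct cxl_dev_state *cxlds)
{
	return cxlds->serial;
}
```

If a field is renamed or moved in the core, only the accessor bodies
change and the accel drivers are untouched.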




* Re: [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-07-16  6:26   ` Li, Ming4
@ 2024-08-14  7:46     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-14  7:46 UTC (permalink / raw)
  To: Li, Ming4, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/16/24 07:26, Li, Ming4 wrote:
> On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Create a new function for a type2 device initialising the opaque
>> cxl_dev_state struct regarding cxl regs setup and mapping.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/pci.c                  | 28 ++++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.c |  3 +++
>>   include/linux/cxl_accel_mem.h      |  1 +
>>   3 files changed, 32 insertions(+)
>>
>> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>> index e53646e9f2fb..b34d6259faf4 100644
>> --- a/drivers/cxl/pci.c
>> +++ b/drivers/cxl/pci.c
>> @@ -11,6 +11,7 @@
>>   #include <linux/pci.h>
>>   #include <linux/aer.h>
>>   #include <linux/io.h>
>> +#include <linux/cxl_accel_mem.h>
>>   #include "cxlmem.h"
>>   #include "cxlpci.h"
>>   #include "cxl.h"
>> @@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>>   	return cxl_setup_regs(map);
>>   }
>>   
>> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
>> +{
>> +	struct cxl_register_map map;
>> +	int rc;
>> +
>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
>> +				&cxlds->reg_map);
>> +	if (rc)
>> +		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
>> +
>> +	rc = cxl_map_component_regs(&cxlds->reg_map, &cxlds->regs.component,
>> +				    BIT(CXL_CM_CAP_CAP_ID_RAS));
>> +	if (rc)
>> +		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
>> +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_pci_accel_setup_regs, CXL);
>> +
> My first feeling is that above function should be provided by cxl_core rather than cxl_pci.
>
> Let's see if Dan has comments on that.


This has also been suggested by another reviewer, so I take it as an 
action for v3.

Thanks


>
>>   static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>>   {
>>   	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 4554dd7cca76..10c4fb915278 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -47,6 +47,9 @@ void efx_cxl_init(struct efx_nic *efx)
>>   
>>   	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>>   	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>> +
>> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
>> +		pci_info(pci_dev, "CXL accel setup regs failed");
>>   }
>>   
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index daf46d41f59c..ca7af4a9cefc 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -19,4 +19,5 @@ void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>>   void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>>   void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>   			    enum accel_resource);
>> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>>   #endif
>


* Re: [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-07-18 23:27   ` Dave Jiang
@ 2024-08-14  7:49     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-14  7:49 UTC (permalink / raw)
  To: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/19/24 00:27, Dave Jiang wrote:
>
> On 7/15/24 10:28 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Create a new function for a type2 device initialising the opaque
>> cxl_dev_state struct regarding cxl regs setup and mapping.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/pci.c                  | 28 ++++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.c |  3 +++
>>   include/linux/cxl_accel_mem.h      |  1 +
>>   3 files changed, 32 insertions(+)
>>
>> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>> index e53646e9f2fb..b34d6259faf4 100644
>> --- a/drivers/cxl/pci.c
>> +++ b/drivers/cxl/pci.c
>> @@ -11,6 +11,7 @@
>>   #include <linux/pci.h>
>>   #include <linux/aer.h>
>>   #include <linux/io.h>
>> +#include <linux/cxl_accel_mem.h>
>>   #include "cxlmem.h"
>>   #include "cxlpci.h"
>>   #include "cxl.h"
>> @@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>>   	return cxl_setup_regs(map);
>>   }
>>   
>> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
> Function should go into cxl/core/pci.c


It will be in v3.


>> +{
>> +	struct cxl_register_map map;
>> +	int rc;
>> +
>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
>> +				&cxlds->reg_map);
>> +	if (rc)
>> +		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
>> +
>> +	rc = cxl_map_component_regs(&cxlds->reg_map, &cxlds->regs.component,
>> +				    BIT(CXL_CM_CAP_CAP_ID_RAS));
>> +	if (rc)
>> +		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
> dev_warn()? Also, maybe add the errno to the error message.


Yes. Thanks


>
>> +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_pci_accel_setup_regs, CXL);
>> +
>>   static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>>   {
>>   	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 4554dd7cca76..10c4fb915278 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -47,6 +47,9 @@ void efx_cxl_init(struct efx_nic *efx)
>>   
>>   	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>>   	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>> +
>> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
>> +		pci_info(pci_dev, "CXL accel setup regs failed");
> pci_warn()? Although it seems unnecessary, since an error is already emitted in cxl_pci_accel_setup_regs().


Right. I think I'll remove it.

Thanks


>>   }
>>   
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index daf46d41f59c..ca7af4a9cefc 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -19,4 +19,5 @@ void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>>   void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>>   void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>   			    enum accel_resource);
>> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>>   #endif

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-08-04 17:15   ` Jonathan Cameron
@ 2024-08-14  7:56     ` Alejandro Lucero Palau
  2024-08-15 16:40       ` Jonathan Cameron
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-14  7:56 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:15, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:22 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Create a new function for a type2 device initialising the opaque
>> cxl_dev_state struct regarding cxl regs setup and mapping.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/pci.c                  | 28 ++++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.c |  3 +++
>>   include/linux/cxl_accel_mem.h      |  1 +
>>   3 files changed, 32 insertions(+)
>>
>> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>> index e53646e9f2fb..b34d6259faf4 100644
>> --- a/drivers/cxl/pci.c
>> +++ b/drivers/cxl/pci.c
>> @@ -11,6 +11,7 @@
>>   #include <linux/pci.h>
>>   #include <linux/aer.h>
>>   #include <linux/io.h>
>> +#include <linux/cxl_accel_mem.h>
>>   #include "cxlmem.h"
>>   #include "cxlpci.h"
>>   #include "cxl.h"
>> @@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>>   	return cxl_setup_regs(map);
>>   }
>>   
>> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
>> +{
>> +	struct cxl_register_map map;
>> +	int rc;
>> +
>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
>> +				&cxlds->reg_map);
>> +	if (rc)
>> +		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
> Not fatal?  If we think it will happen on real devices, then dev_warn
> is too strong.


This is more complex than it seems, and the current code does not handle
it properly.

I will cover it in more detail in another patch, but the fact is that
those calls to cxl_pci_setup_regs need better handling, because some of
these registers are optional for Type2.
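
A minimal userspace sketch of the idea, assuming the caller passes a mask
of the register blocks it treats as mandatory. Every name here
(sketch_regblock, sketch_probe, sketch_check_regs) is hypothetical and is
not part of the CXL core:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical register block ids, for this sketch only. */
enum sketch_regblock {
	SKETCH_REG_MEMDEV,
	SKETCH_REG_COMPONENT,
	SKETCH_REG_MAX,
};

#define SKETCH_BIT(n) (1u << (n))

/* What probing found on the device (filled in by hand in this sketch). */
struct sketch_probe {
	bool found[SKETCH_REG_MAX];
};

/*
 * Return 0 when every block named in @required was found, -ENODEV
 * otherwise. Blocks outside @required are optional: their absence is
 * not an error, so the caller can log it at a lower level instead.
 */
int sketch_check_regs(const struct sketch_probe *p, unsigned int required)
{
	int i;

	for (i = 0; i < SKETCH_REG_MAX; i++) {
		if ((required & SKETCH_BIT(i)) && !p->found[i])
			return -ENODEV;
	}

	return 0;
}
```

With something like this, a Type3 caller would mark both blocks as
required, while a Type2 driver would list only the blocks its device
actually implements.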


>> +
>> +	rc = cxl_map_component_regs(&cxlds->reg_map, &cxlds->regs.component,
>> +				    BIT(CXL_CM_CAP_CAP_ID_RAS));
>> +	if (rc)
>> +		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
> pci_err() or similar would make sense here as we have asked for something
> that isn't happening. Specification says this is mandatory so
> definitely smells like a fatal error to me.
>
>
>> +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_pci_accel_setup_regs, CXL);
>> +
>>   static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>>   {
>>   	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 4554dd7cca76..10c4fb915278 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -47,6 +47,9 @@ void efx_cxl_init(struct efx_nic *efx)
>>   
>>   	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>>   	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>> +
>> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
>> +		pci_info(pci_dev, "CXL accel setup regs failed");
> Handle errors fully. That is report them  up to the caller.
>
>>   }
>>   
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index daf46d41f59c..ca7af4a9cefc 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -19,4 +19,5 @@ void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>>   void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>>   void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>   			    enum accel_resource);
>> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>>   #endif

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 03/15] cxl: add function for type2 resource request
  2024-07-18 23:36   ` Dave Jiang
  2024-08-04 17:16     ` Jonathan Cameron
@ 2024-08-14  8:00     ` Alejandro Lucero Palau
  1 sibling, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-14  8:00 UTC (permalink / raw)
  To: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/19/24 00:36, Dave Jiang wrote:
>
> On 7/15/24 10:28 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Create a new function for a type2 device requesting a resource
>> passing the opaque struct to work with.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/memdev.c          | 13 +++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.c |  7 ++++++-
>>   include/linux/cxl_accel_mem.h      |  1 +
>>   3 files changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 61b5d35b49e7..04c3a0f8bc2e 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -744,6 +744,19 @@ void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
>>   
>> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)
> Maybe declare a common enum like cxl_resource_type instead of 'enum accel_resource' and use here instead of bool?


Yes. Thanks
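
A rough sketch of how the enum-based variant could look. The enum name
cxl_resource_type comes from the suggestion above; the enumerator names
are assumptions, and struct resource, struct cxl_dev_state and
request_resource() below are simplified userspace stand-ins so that the
sketch builds on its own:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical common enum replacing 'enum accel_resource' and the bool. */
enum cxl_resource_type {
	CXL_RES_RAM,
	CXL_RES_PMEM,
};

/* Simplified stand-ins for the kernel types. */
struct resource {
	const char *name;
	struct resource *parent;
};

struct cxl_dev_state {
	struct resource dpa_res;
	struct resource ram_res;
	struct resource pmem_res;
};

/* Stand-in for the kernel's request_resource(): -EBUSY if already claimed. */
static int request_resource(struct resource *root, struct resource *new)
{
	if (new->parent)
		return -EBUSY;
	new->parent = root;
	return 0;
}

int cxl_accel_request_resource(struct cxl_dev_state *cxlds,
			       enum cxl_resource_type type)
{
	switch (type) {
	case CXL_RES_RAM:
		return request_resource(&cxlds->dpa_res, &cxlds->ram_res);
	case CXL_RES_PMEM:
		return request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
	default:
		/* An unknown type is a caller bug; do not guess. */
		return -EINVAL;
	}
}
```

Compared with the bool, a call such as
cxl_accel_request_resource(cxlds, CXL_RES_RAM) documents itself, and an
unknown value is rejected instead of silently falling into the pmem path.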

>> +{
>> +	int rc;
>> +
>> +	if (is_ram)
>> +		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
>> +	else
>> +		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
>> +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
>> +
>>   static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>>   {
>>   	struct cxl_memdev *cxlmd =
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 10c4fb915278..9cefcaf3caca 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -48,8 +48,13 @@ void efx_cxl_init(struct efx_nic *efx)
>>   	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>>   	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>>   
>> -	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
>> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds)) {
>>   		pci_info(pci_dev, "CXL accel setup regs failed");
>> +		return;
>> +	}
>> +
>> +	if (cxl_accel_request_resource(cxl->cxlds, true))
>> +		pci_info(pci_dev, "CXL accel resource request failed");
> pci_warn()? also emitting the errno would be nice.
>
>>   }
>>   
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index ca7af4a9cefc..c7b254edc096 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -20,4 +20,5 @@ void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>>   void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>   			    enum accel_resource);
>>   int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
>>   #endif

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 03/15] cxl: add function for type2 resource request
  2024-08-04 17:16     ` Jonathan Cameron
@ 2024-08-14  8:08       ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-14  8:08 UTC (permalink / raw)
  To: Jonathan Cameron, Dave Jiang
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes


On 8/4/24 18:16, Jonathan Cameron wrote:
> On Thu, 18 Jul 2024 16:36:00 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
>
>> On 7/15/24 10:28 AM, alejandro.lucero-palau@amd.com wrote:
>>> From: Alejandro Lucero <alucerop@amd.com>
>>>
>>> Create a new function for a type2 device requesting a resource
>>> passing the opaque struct to work with.
>>>
>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>> ---
>>>   drivers/cxl/core/memdev.c          | 13 +++++++++++++
>>>   drivers/net/ethernet/sfc/efx_cxl.c |  7 ++++++-
>>>   include/linux/cxl_accel_mem.h      |  1 +
>>>   3 files changed, 20 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>>> index 61b5d35b49e7..04c3a0f8bc2e 100644
>>> --- a/drivers/cxl/core/memdev.c
>>> +++ b/drivers/cxl/core/memdev.c
>>> @@ -744,6 +744,19 @@ void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>>   }
>>>   EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
>>>   
>>> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)
>> Maybe declare a common enum like cxl_resource_type instead of 'enum accel_resource' and use here instead of bool?
>>
>>> +{
>>> +	int rc;
>>> +
>>> +	if (is_ram)
>>> +		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
>>> +	else
>>> +		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
>>> +
>>> +	return rc;
>>> +}
>>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
>>> +
>>>   static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>>>   {
>>>   	struct cxl_memdev *cxlmd =
>>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>>> index 10c4fb915278..9cefcaf3caca 100644
>>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>>> @@ -48,8 +48,13 @@ void efx_cxl_init(struct efx_nic *efx)
>>>   	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>>>   	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>>>   
>>> -	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
>>> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds)) {
>>>   		pci_info(pci_dev, "CXL accel setup regs failed");
>>> +		return;
>>> +	}
>>> +
>>> +	if (cxl_accel_request_resource(cxl->cxlds, true))
>>> +		pci_info(pci_dev, "CXL accel resource request failed");
>> pci_warn()? also emitting the errno would be nice.
> Don't hide it at all.  Fail if this doesn't succeed and let the caller
> know. Not to mention, tear down any other state already set up.
>   


It is obvious I have problems with the way errors are reported,
specifically with deciding what should be considered a serious problem
that is not expected at all.

I think we can expect some unexpected situations given the novelty of CXL
and, indeed, of Type2 support. I guess a good approach could be to be
chatty at this point and refine the way these errors are reported later,
once the maturity of the support and our experience give us the knowledge.
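
For reference, the "fail and tear down" pattern requested in the review
could be sketched as below. This is a userspace illustration only: struct
sketch_cxl, the step_*() helpers and the fail_step knob are hypothetical
stand-ins for the real setup calls, not the sfc driver code:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in state; fail_step is a knob to inject a failure. */
struct sketch_cxl {
	bool state_created;
	bool regs_mapped;
	bool resource_claimed;
	int fail_step;		/* 0 = no failure, 1..3 = step that fails */
};

static int step_create_state(struct sketch_cxl *c)
{
	if (c->fail_step == 1)
		return -ENOMEM;
	c->state_created = true;
	return 0;
}

static int step_setup_regs(struct sketch_cxl *c)
{
	if (c->fail_step == 2)
		return -ENXIO;
	c->regs_mapped = true;
	return 0;
}

static int step_request_resource(struct sketch_cxl *c)
{
	if (c->fail_step == 3)
		return -EBUSY;
	c->resource_claimed = true;
	return 0;
}

/* Returns 0 or a negative errno; on failure nothing is left set up. */
int sketch_cxl_init(struct sketch_cxl *c)
{
	int rc;

	rc = step_create_state(c);
	if (rc)
		return rc;

	rc = step_setup_regs(c);
	if (rc)
		goto err_state;

	rc = step_request_resource(c);
	if (rc)
		goto err_regs;

	return 0;

err_regs:
	c->regs_mapped = false;		/* undo step_setup_regs() */
err_state:
	c->state_created = false;	/* undo step_create_state() */
	return rc;
}
```

The caller then decides how loudly to report the returned errno, which
keeps the logging policy in one place while the error itself is never
hidden.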


>>>   }
>>>   
>>>   
>>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>>> index ca7af4a9cefc..c7b254edc096 100644
>>> --- a/include/linux/cxl_accel_mem.h
>>> +++ b/include/linux/cxl_accel_mem.h
>>> @@ -20,4 +20,5 @@ void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>>>   void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>>   			    enum accel_resource);
>>>   int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>>> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
>>>   #endif

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-08-09  9:10   ` Zhi Wang
@ 2024-08-15 15:20     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-15 15:20 UTC (permalink / raw)
  To: Zhi Wang, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, targupta


On 8/9/24 10:10, Zhi Wang wrote:
> On Mon, 15 Jul 2024 18:28:24 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Type2 devices have some Type3 functionalities as optional like an mbox
>> or an hdm decoder, and CXL core needs a way to know what a CXL
>> accelerator implements.
>>
>> Add a new field for keeping device capabilities to be initialised by
>> Type2 drivers. Advertise all those capabilities for Type3.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/mbox.c            |  1 +
>>   drivers/cxl/core/memdev.c          |  4 +++-
>>   drivers/cxl/core/port.c            |  2 +-
>>   drivers/cxl/core/regs.c            | 11 ++++++-----
>>   drivers/cxl/cxl.h                  |  2 +-
>>   drivers/cxl/cxlmem.h               |  4 ++++
>>   drivers/cxl/pci.c                  | 15 +++++++++------
>>   drivers/net/ethernet/sfc/efx_cxl.c |  3 ++-
>>   include/linux/cxl_accel_mem.h      |  5 ++++-
>>   9 files changed, 31 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index 2626f3fff201..2ba7d36e3f38 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -1424,6 +1424,7 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
>>   	mds->cxlds.reg_map.host = dev;
>>   	mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
>>   	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
>> +	mds->cxlds.capabilities = CXL_DRIVER_CAP_HDM | CXL_DRIVER_CAP_MBOX;
>>   	mds->ram_perf.qos_class = CXL_QOS_CLASS_INVALID;
>>   	mds->pmem_perf.qos_class = CXL_QOS_CLASS_INVALID;
>>   
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 04c3a0f8bc2e..b4205ecca365 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -616,7 +616,7 @@ static void detach_memdev(struct work_struct *work)
>>   static struct lock_class_key cxl_memdev_key;
>>   
>> -struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
>> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev, uint8_t caps)
>>   {
>>   	struct cxl_dev_state *cxlds;
>>   
>> @@ -631,6 +631,8 @@ struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
>>   	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
>>   	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
>> +	cxlds->capabilities = caps;
>> +
>>   	return cxlds;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 887ed6e358fb..d66c6349ed2d 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -763,7 +763,7 @@ static int cxl_setup_comp_regs(struct device *host, struct cxl_register_map *map
>>   	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
>>   	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
>> -	return cxl_setup_regs(map);
>> +	return cxl_setup_regs(map, 0);
>>   }
>>   
>>   static int cxl_port_setup_regs(struct cxl_port *port,
>> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
>> index e1082e749c69..9d218ebe180d 100644
>> --- a/drivers/cxl/core/regs.c
>> +++ b/drivers/cxl/core/regs.c
>> @@ -421,7 +421,7 @@ static void cxl_unmap_regblock(struct cxl_register_map *map)
>>   	map->base = NULL;
>>   }
>>   
>> -static int cxl_probe_regs(struct cxl_register_map *map)
>> +static int cxl_probe_regs(struct cxl_register_map *map, uint8_t caps)
>>   {
> Can we not use uintxx_t? Just like any other one in the
> cxl-core. Generally, u{8,16...} are mostly used for kernel
> programming, and your previous patches use them nicely.
>
> Let's use u8 for caps.
>

Sure.

Thanks


>>   	struct cxl_component_reg_map *comp_map;
>>   	struct cxl_device_reg_map *dev_map;
>> @@ -437,11 +437,12 @@ static int cxl_probe_regs(struct cxl_register_map *map)
>>   	case CXL_REGLOC_RBI_MEMDEV:
>>   		dev_map = &map->device_map;
>>   		cxl_probe_device_regs(host, base, dev_map);
>> -		if (!dev_map->status.valid || !dev_map->mbox.valid ||
>> +		if (!dev_map->status.valid ||
>> +		    ((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ||
>>   		    !dev_map->memdev.valid) {
>>   			dev_err(host, "registers not found: %s%s%s\n",
>>   				!dev_map->status.valid ? "status " : "",
>> -				!dev_map->mbox.valid ? "mbox " : "",
>> +				((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ? "mbox " : "",
>>   				!dev_map->memdev.valid ? "memdev " : "");
>>   			return -ENXIO;
>>   		}
>> @@ -455,7 +456,7 @@ static int cxl_probe_regs(struct cxl_register_map *map)
>>   	return 0;
>>   }
>>   
>> -int cxl_setup_regs(struct cxl_register_map *map)
>> +int cxl_setup_regs(struct cxl_register_map *map, uint8_t caps)
>>   {
>>   	int rc;
>>   
>> @@ -463,7 +464,7 @@ int cxl_setup_regs(struct cxl_register_map *map)
>>   	if (rc)
>>   		return rc;
>>   
>> -	rc = cxl_probe_regs(map);
>> +	rc = cxl_probe_regs(map, caps);
>>   	cxl_unmap_regblock(map);
>>   
>>   	return rc;
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index a6613a6f8923..9973430d975f 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -300,7 +300,7 @@ int cxl_find_regblock_instance(struct pci_dev *pdev, enum cxl_regloc_type type,
>>   			       struct cxl_register_map *map, int index);
>>   int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>>   		      struct cxl_register_map *map);
>> -int cxl_setup_regs(struct cxl_register_map *map);
>> +int cxl_setup_regs(struct cxl_register_map *map, uint8_t caps);
>>   struct cxl_dport;
>>   resource_size_t cxl_rcd_component_reg_phys(struct device *dev,
>>   					   struct cxl_dport *dport);
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index af8169ccdbc0..8f2a820bd92d 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -405,6 +405,9 @@ struct cxl_dpa_perf {
>>   	int qos_class;
>>   };
>>   
>> +#define CXL_DRIVER_CAP_HDM	0x1
>> +#define CXL_DRIVER_CAP_MBOX	0x2
>> +
>>   /**
>>    * struct cxl_dev_state - The driver device state
>>    *
>> @@ -438,6 +441,7 @@ struct cxl_dev_state {
>>   	struct resource ram_res;
>>   	u64 serial;
>>   	enum cxl_devtype type;
>> +	uint8_t capabilities;
>>   };
>>   
>>   /**
>> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>> index b34d6259faf4..e2a978312281 100644
>> --- a/drivers/cxl/pci.c
>> +++ b/drivers/cxl/pci.c
>> @@ -502,7 +502,8 @@ static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
>>   }
>>   
>>   static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>> -			      struct cxl_register_map *map)
>> +			      struct cxl_register_map *map,
>> +			      uint8_t cxl_dev_caps)
>>   {
>>   	int rc;
>>   
>> @@ -519,7 +520,7 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>>   	if (rc)
>>   		return rc;
>>   
>> -	return cxl_setup_regs(map);
>> +	return cxl_setup_regs(map, cxl_dev_caps);
>>   }
>>   
>>   int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
>> @@ -527,7 +528,8 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
>>   	struct cxl_register_map map;
>>   	int rc;
>>   
>> -	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
>> +				cxlds->capabilities);
>>   	if (rc)
>>   		return rc;
>>   
>> @@ -536,7 +538,7 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
>>   		return rc;
>>   
>>   	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
>> -				&cxlds->reg_map);
>> +				&cxlds->reg_map, cxlds->capabilities);
>>   	if (rc)
>>   		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
>> @@ -850,7 +852,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>   		dev_warn(&pdev->dev,
>>   			 "Device DVSEC not present, skip CXL.mem init\n");
>>   
>> -	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
>> +				cxlds->capabilities);
>>   	if (rc)
>>   		return rc;
>>   
>> @@ -863,7 +866,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>   	 * still be useful for management functions so don't return an error.
>>   	 */
>>   	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
>> -				&cxlds->reg_map);
>> +				&cxlds->reg_map, cxlds->capabilities);
>>   	if (rc)
>>   		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
>>   	else if (!cxlds->reg_map.component_map.ras.valid)
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 9cefcaf3caca..37d8bfdef517 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -33,7 +33,8 @@ void efx_cxl_init(struct efx_nic *efx)
>>   
>>   	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");
>>   
>> -	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
>> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev,
>> +					    CXL_ACCEL_DRIVER_CAP_HDM);
>>   	if (IS_ERR(cxl->cxlds)) {
>>   		pci_info(pci_dev, "CXL accel device state failed");
>>   		return;
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index c7b254edc096..0ba2195b919b 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -12,8 +12,11 @@ enum accel_resource{
>>   	CXL_ACCEL_RES_PMEM,
>>   };
>>   
>> +#define CXL_ACCEL_DRIVER_CAP_HDM	0x1
>> +#define CXL_ACCEL_DRIVER_CAP_MBOX	0x2
>> +
>>   typedef struct cxl_dev_state cxl_accel_state;
>> -cxl_accel_state *cxl_accel_state_create(struct device *dev);
>> +cxl_accel_state *cxl_accel_state_create(struct device *dev, uint8_t caps);
>>   void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>>   void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-08-09 10:25       ` Zhi Wang
@ 2024-08-15 15:37         ` Alejandro Lucero Palau
  2024-08-18  6:55           ` Zhi Wang
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-15 15:37 UTC (permalink / raw)
  To: Zhi Wang
  Cc: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes, targupta


On 8/9/24 11:25, Zhi Wang wrote:
> On Tue, 23 Jul 2024 14:43:24 +0100
> Alejandro Lucero Palau <alucerop@amd.com> wrote:
>
>> On 7/19/24 20:01, Dave Jiang wrote:
>>>>    
>>>> -static int cxl_probe_regs(struct cxl_register_map *map)
>>>> +static int cxl_probe_regs(struct cxl_register_map *map, uint8_t caps)
>>>>    {
>>>>    	struct cxl_component_reg_map *comp_map;
>>>>    	struct cxl_device_reg_map *dev_map;
>>>> @@ -437,11 +437,12 @@ static int cxl_probe_regs(struct cxl_register_map *map)
>>>>    	case CXL_REGLOC_RBI_MEMDEV:
>>>>    		dev_map = &map->device_map;
>>>>    		cxl_probe_device_regs(host, base, dev_map);
>>>> -		if (!dev_map->status.valid || !dev_map->mbox.valid ||
>>>> +		if (!dev_map->status.valid ||
>>>> +		    ((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ||
>>>>    		    !dev_map->memdev.valid) {
>>>>    			dev_err(host, "registers not found: %s%s%s\n",
>>>>    				!dev_map->status.valid ? "status " : "",
>>>> -				!dev_map->mbox.valid ? "mbox " : "",
>>>> +				((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ? "mbox " : "",
>>> According to the r3.1 8.2.8.2.1, the device status registers and
>>> the primary mailbox registers are both mandatory if regloc id=3
>>> block is found. So if the type2 device does not implement a mailbox
>>> then it shouldn't be calling cxl_pci_setup_regs(pdev,
>>> CXL_REGLOC_RBI_MEMDEV, &map) to begin with from the driver init
>>> right? If the type2 device defines a regblock with id=3 but without
>>> a mailbox, then isn't that a spec violation?
>>>
>>> DJ
>>
>> Right. The code needs to support the possibility of a Type2 having a
>> mailbox, and if it is not supported, the rest of the dvsec regs
>> initialization needs to be performed. This is not what the code does
>> now, so I'll fix this.
>>
>>
>> A wider explanation is, for the RFC I used a test driver based on
>> QEMU emulating a Type2 which had a CXL Device Register Interface
>> defined (03h) but not a CXL Device Capability with id 2 for the
>> primary mailbox register, breaking the spec as you spotted.
>>
>>
> Because the SFC driver uses the Memory Device Status Register
> (8.2.8.5.1.1) to determine whether the memory media is ready (in PATCH 6),
> that register should be in a regloc id=3 block.


Right. Note that patch 6 first calls cxl_await_media_ready and, if it
returns an error (which is what happens when the register is not found),
sets the media ready field, since it is required later on.

Damn it! I realize the code is wrong, because the manual setting is done
when there is no error. Until recently, testing was a pain with only a
partial emulation, so I had to follow undesired development steps. This is
better now, so v3 will fix some minor bugs like this one.

I also realize that in our case this first call is useless, so I plan to
remove it in the next version.

Thanks!


> According to the spec pasted above, a device that has a regloc block
> id=3 needs to have a device status register and a mailbox.
>
> Curious, does the SFC device have to implement the mailbox in this case
> for spec compliance?


I think it should, but our device has no status register either.


> Previously, I always thought that "CXL Memory Device" == "CXL Type-3
> device" in the CXL spec.
>
> Now I am a little bit confused about whether a type-2 device that
> supports cxl.mem == the "CXL Memory Device" mentioned in the spec.
>
> If the answer == Y, then having regloc id == 3 and a mailbox becomes
> mandatory for a type-2 device that supports cxl.mem, for spec
> compliance.
>
> If the answer == N, then a type-2 device can use approaches other than
> the Memory Device Status Register to determine the readiness of the memory?


Right again. Our device is not advertised as a Memory Device but as an
Ethernet one, so we are not implementing the registers that are mandatory
for a memory device.

Regarding the readiness of the CXL memory, I have been told the memory is
ready once some initial negotiation is performed (I do not know the
details). That is the reason for our driver setting this manually, and
for the accessor added.


> ZW
>
>> Thanks.
>>
>>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-08-04 17:22   ` Jonathan Cameron
@ 2024-08-15 15:43     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-15 15:43 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:22, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:24 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Type2 devices have some Type3 functionalities as optional like an mbox
>> or an hdm decoder, and CXL core needs a way to know what a CXL accelerator
>> implements.
>>
>> Add a new field for keeping device capabilities to be initialised by
>> Type2 drivers. Advertise all those capabilities for Type3.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> In general seems a reasonable approach, so just minor comments.
>
>> ---
>>   drivers/cxl/core/mbox.c            |  1 +
>>   drivers/cxl/core/memdev.c          |  4 +++-
>>   drivers/cxl/core/port.c            |  2 +-
>>   drivers/cxl/core/regs.c            | 11 ++++++-----
>>   drivers/cxl/cxl.h                  |  2 +-
>>   drivers/cxl/cxlmem.h               |  4 ++++
>>   drivers/cxl/pci.c                  | 15 +++++++++------
>>   drivers/net/ethernet/sfc/efx_cxl.c |  3 ++-
>>   include/linux/cxl_accel_mem.h      |  5 ++++-
>>   9 files changed, 31 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index 2626f3fff201..2ba7d36e3f38 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -1424,6 +1424,7 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
>>   	mds->cxlds.reg_map.host = dev;
>>   	mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
>>   	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
>> +	mds->cxlds.capabilities = CXL_DRIVER_CAP_HDM | CXL_DRIVER_CAP_MBOX;
> Add a reference for this perhaps.  Make it clear that a type3 device must
> support mailbox and hdm by pointing at requirement for the various structures
> in a spec reference.
>

I think it would be worth having documentation that resolves the
ambiguities in the specs about which registers are mandatory and which
are optional.


>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index af8169ccdbc0..8f2a820bd92d 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -405,6 +405,9 @@ struct cxl_dpa_perf {
>>   	int qos_class;
>>   };
>>   
>> +#define CXL_DRIVER_CAP_HDM	0x1
>> +#define CXL_DRIVER_CAP_MBOX	0x2
>> +
> Enum and BIT() for the defines.  Avoids someone in future
> thinking they can define 0x3 to be something.
>
> Definitely only one definition as well. Seems reasonable for
> this to be CXL wide.
>

OK.

Thanks!
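
A possible shape for the single, CXL-wide definition, sketched in
userspace. The names cxl_dev_cap, CXL_DEV_CAP_* and cxl_dev_has_cap() are
assumptions for illustration; BIT() and u8 are redefined here only so the
sketch builds outside the kernel:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t u8;		/* userspace stand-in for the kernel's u8 */
#define BIT(nr) (1UL << (nr))	/* the kernel provides BIT() itself */

/* Hypothetical capability list: the enum values are bit positions. */
enum cxl_dev_cap {
	CXL_DEV_CAP_HDM,
	CXL_DEV_CAP_MBOX,
	CXL_DEV_CAP_MAX,	/* keep last */
};

/* True when the bit for @cap is set in the @caps mask. */
bool cxl_dev_has_cap(u8 caps, enum cxl_dev_cap cap)
{
	return caps & BIT(cap);
}
```

Since each enumerator is a bit position rather than a mask value, nobody
can later define "0x3" as a new capability by mistake, and a Type3 device
would simply set BIT(CXL_DEV_CAP_HDM) | BIT(CXL_DEV_CAP_MBOX).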


>>   /**
>>    * struct cxl_dev_state - The driver device state
>>    *
>> @@ -438,6 +441,7 @@ struct cxl_dev_state {
>>   	struct resource ram_res;
>>   	u64 serial;
>>   	enum cxl_devtype type;
>> +	uint8_t capabilities;
>>   };

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-12 11:16     ` Alejandro Lucero Palau
  2024-08-13  8:30       ` Alejandro Lucero Palau
@ 2024-08-15 16:35       ` Jonathan Cameron
  2024-08-19 11:10         ` Alejandro Lucero Palau
  1 sibling, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-15 16:35 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes

On Mon, 12 Aug 2024 12:16:02 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/4/24 18:10, Jonathan Cameron wrote:
> > On Mon, 15 Jul 2024 18:28:21 +0100
> > <alejandro.lucero-palau@amd.com> wrote:
> >  
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> Differentiate Type3, aka memory expanders, from Type2, aka device
> >> accelerators, with a new function for initializing cxl_dev_state.
> >>
> >> Create opaque struct to be used by accelerators relying on new access
> >> functions in following patches.
> >>
> >> Add SFC ethernet network driver as the client.
> >>
> >> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
> >>
> >> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >> Co-developed-by: Dan Williams <dan.j.williams@intel.com>  
> >  
> 
> >> +
> >> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
> >> +{
> >> +	cxlds->cxl_dvsec = dvsec;  
> > Nothing to do with accel. If these make sense promote to cxl
> > core and a linux/cxl/ header.  Also we may want the type3 driver to
> > switch to them long term. If nothing else, making that handle the
> > cxl_dev_state as more opaque will show up what is still directly
> > accessed and may need to be wrapped up for a future accelerator driver
> > to use.
> >  
> 
> I will change the function name then, but not sure I follow the comment 
> about more opaque ...
If most code can't see the internals of cxl_dev_state because it
doesn't include the header that defines it, then we will generally
spot data that may not belong in that state structure in the first place
or where it is appropriate to have an accessor function mediating that
access.
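
A minimal sketch of the opaque-struct pattern described above, assuming a single file stands in for both the public header and the private core file. All names here (accel_dev_state, accel_set_dvsec() and so on) are hypothetical and are not the functions added by the series.

```c
#include <stdint.h>
#include <stdlib.h>

/* ---- What a public header would expose (hypothetical API) ---- */
struct accel_dev_state;		/* incomplete type: layout is hidden */
struct accel_dev_state *accel_state_create(void);
void accel_set_dvsec(struct accel_dev_state *s, uint16_t dvsec);
uint16_t accel_get_dvsec(const struct accel_dev_state *s);
void accel_state_destroy(struct accel_dev_state *s);

/* ---- What only the core's private file would see ---- */
struct accel_dev_state {
	uint16_t dvsec;
	uint64_t serial;
};

struct accel_dev_state *accel_state_create(void)
{
	/* Zeroed allocation, so unset fields read as 0. */
	return calloc(1, sizeof(struct accel_dev_state));
}

void accel_set_dvsec(struct accel_dev_state *s, uint16_t dvsec)
{
	s->dvsec = dvsec;
}

uint16_t accel_get_dvsec(const struct accel_dev_state *s)
{
	return s->dvsec;
}

void accel_state_destroy(struct accel_dev_state *s)
{
	free(s);
}
```

A client that only includes the public part cannot dereference the struct, so every field it needs shows up as an accessor, which is what makes stray direct accesses visible.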

Jonathan




* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-13  8:30       ` Alejandro Lucero Palau
@ 2024-08-15 16:38         ` Jonathan Cameron
  2024-08-19 11:12           ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-15 16:38 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes

On Tue, 13 Aug 2024 09:30:08 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/12/24 12:16, Alejandro Lucero Palau wrote:
> >
> > On 8/4/24 18:10, Jonathan Cameron wrote:  
> >> On Mon, 15 Jul 2024 18:28:21 +0100
> >> <alejandro.lucero-palau@amd.com> wrote:
> >>  
> >>> From: Alejandro Lucero <alucerop@amd.com>
> >>>
> >>> Differentiate Type3, aka memory expanders, from Type2, aka device
> >>> accelerators, with a new function for initializing cxl_dev_state.
> >>>
> >>> Create opaque struct to be used by accelerators relying on new access
> >>> functions in following patches.
> >>>
> >>> Add SFC ethernet network driver as the client.
> >>>
> >>> Based on 
> >>> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
> >>>
> >>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>  
> >>  
> >  
> >>> +
> >>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
> >>> +{
> >>> +    cxlds->cxl_dvsec = dvsec;  
> >> Nothing to do with accel. If these make sense promote to cxl
> >> core and a linux/cxl/ header.  Also we may want the type3 driver to
> >> switch to them long term. If nothing else, making that handle the
> >> cxl_dev_state as more opaque will show up what is still directly
> >> accessed and may need to be wrapped up for a future accelerator driver
> >> to use.
> >>  
> >
> > I will change the function name then, but not sure I follow the 
> > comment about more opaque ...
> >
> >
> >  
> 
> I have second thoughts about this.
> 
> 
> I consider this as an accessor  for, as you said in a previous exchange, 
> facilitating changes to the core structs without touching those accel 
> drivers using it.
> 
> Type3 driver is part of the CXL core and easy to change for these kind 
> of updates since it will only be one driver supporting all Type3, and an 
> accessor is not required then.
> 
> Let me know what you think.

It's less critical, but longer term I'd expect anything that makes
sense for both accelerators and the type 3 driver to use the same
approaches and code paths.  That makes it easier to see where they
are related than open-coding the accesses in the type 3 driver would.
In the very long term, I'd expect the type 3 driver to just be
another CXL driver alongside many others.

Jonathan

> 
> 



* Re: [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-08-14  7:56     ` Alejandro Lucero Palau
@ 2024-08-15 16:40       ` Jonathan Cameron
  2024-08-18  8:07         ` Zhi Wang
  0 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-15 16:40 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes

On Wed, 14 Aug 2024 08:56:35 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/4/24 18:15, Jonathan Cameron wrote:
> > On Mon, 15 Jul 2024 18:28:22 +0100
> > alejandro.lucero-palau@amd.com wrote:
> >  
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> Create a new function for a type2 device initialising the opaque
> >> cxl_dev_state struct regarding cxl regs setup and mapping.
> >>
> >> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >> ---
> >>   drivers/cxl/pci.c                  | 28 ++++++++++++++++++++++++++++
> >>   drivers/net/ethernet/sfc/efx_cxl.c |  3 +++
> >>   include/linux/cxl_accel_mem.h      |  1 +
> >>   3 files changed, 32 insertions(+)
> >>
> >> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> >> index e53646e9f2fb..b34d6259faf4 100644
> >> --- a/drivers/cxl/pci.c
> >> +++ b/drivers/cxl/pci.c
> >> @@ -11,6 +11,7 @@
> >>   #include <linux/pci.h>
> >>   #include <linux/aer.h>
> >>   #include <linux/io.h>
> >> +#include <linux/cxl_accel_mem.h>
> >>   #include "cxlmem.h"
> >>   #include "cxlpci.h"
> >>   #include "cxl.h"
> >> @@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> >>   	return cxl_setup_regs(map);
> >>   }
> >>   
> >> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
> >> +{
> >> +	struct cxl_register_map map;
> >> +	int rc;
> >> +
> >> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> >> +	if (rc)
> >> +		return rc;
> >> +
> >> +	rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
> >> +	if (rc)
> >> +		return rc;
> >> +
> >> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> >> +				&cxlds->reg_map);
> >> +	if (rc)
> >> +		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);  
> > Not fatal?  If we think it will happen on real devices, then dev_warn
> > is too strong.  
> 
> 
> > This is more complex than it seems, and it is not properly handled 
> > with the current code.
> 
> I will cover it in another patch in more detail, but the fact is those 
> calls to cxl_pci_setup_regs need to be handled better, because Type2 has 
> some of these registers as optional.

I'd argue you don't have to support all type 2 devices with your
first code.  Things like optionality of registers can come in when
a device shows up where they aren't present.

Jonathan


* Re: [PATCH v2 05/15] cxl: fix use of resource_contains
  2024-08-04 17:25   ` Jonathan Cameron
@ 2024-08-16 14:37     ` Alejandro Lucero Palau
  2024-08-27 15:12       ` Jonathan Cameron
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-16 14:37 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:25, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:25 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> For a resource defined with size zero, resource contains will also
>> return true.
>>
>> Add resource size check before using it.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> If this can happen in existing type 3 case the fixes tag
> and send it separately from this series.


I have been looking at this possibility and, although not with 100% 
certainty, I would say it cannot happen for Type3.

"Type3 regions" are (usually) created from user space, and:

1) if it is RAM, dax code is invoked for creating the region

2) if it is pmem, pmem region creation code is invoked.

Neither of these paths uses the code affected by this patch.

There are two cases where that code could be used by Type3, and both 
are confusing:

1) regions created during device initialization, but for that the 
decoder needs to be committed, which is not expected for Type3 without 
user space intervention.

2) when emulating an hdm decoder, which I think is not possible for 
Type3 since the hdm decoder is mandatory there.


Finally, there is the code that runs when the sysfs dpa_size file is 
written, which I'm not familiar with.



> If there is no path due to some external code, then
> drop the word fix from the title and call it
>
> cxl: harden resource_contains checks to handle zero size resources


After the explanation above, I will do as you say.
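
For reference, the zero-size case the patch guards against can be modelled in userspace as follows. This is a simplified sketch: it assumes the kernel's encoding of end = start + size - 1 and a 64-bit resource_size_t, and it leaves out the resource type and IORESOURCE_UNSET checks that the real resource_contains() also performs. The struct res and res_*() names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of struct resource; flags and type are omitted. */
struct res {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Same encoding as the kernel macros: end = start + size - 1. */
static struct res res_make(uint64_t start, uint64_t size)
{
	/* For size 0 the subtraction wraps, so end becomes UINT64_MAX. */
	struct res r = { .start = start, .end = start + size - 1 };
	return r;
}

static uint64_t res_size(const struct res *r)
{
	return r->end - r->start + 1;
}

/* Range check only, as in the unhardened code path. */
static bool res_contains(const struct res *outer, const struct res *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/* Hardened check in the spirit of the patch: an empty range contains nothing. */
static bool res_contains_nonempty(const struct res *outer,
				  const struct res *inner)
{
	return res_size(outer) && res_contains(outer, inner);
}
```

In this model a resource created with start 0 and size 0 spans the whole address range, so the plain range check accepts any DPA resource, while the size check rejects it.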

Thanks!


> Avoids it getting backported into stable / distros picking it
> up if there isn't a real issue before this series.
>
> Thanks,
>
> Jonathan
>
>> ---
>>   drivers/cxl/core/hdm.c | 7 +++++--
>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>> index 3df10517a327..4af9225d4b59 100644
>> --- a/drivers/cxl/core/hdm.c
>> +++ b/drivers/cxl/core/hdm.c
>> @@ -327,10 +327,13 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>>   	cxled->dpa_res = res;
>>   	cxled->skip = skipped;
>>   
>> -	if (resource_contains(&cxlds->pmem_res, res))
>> +	if ((resource_size(&cxlds->pmem_res)) && (resource_contains(&cxlds->pmem_res, res))) {
>> +		printk("%s: resource_contains CXL_DECODER_PMEM\n", __func__);
>>   		cxled->mode = CXL_DECODER_PMEM;
>> -	else if (resource_contains(&cxlds->ram_res, res))
>> +	} else if ((resource_size(&cxlds->ram_res)) && (resource_contains(&cxlds->ram_res, res))) {
>> +		printk("%s: resource_contains CXL_DECODER_RAM\n", __func__);
>>   		cxled->mode = CXL_DECODER_RAM;
>> +	}
>>   	else {
>>   		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
>>   			 port->id, cxled->cxld.id, cxled->dpa_res);


* Re: [PATCH v2 05/15] cxl: fix use of resource_contains
  2024-08-09  9:14   ` Zhi Wang
@ 2024-08-16 14:42     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-16 14:42 UTC (permalink / raw)
  To: Zhi Wang, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, targupta


On 8/9/24 10:14, Zhi Wang wrote:
> On Mon, 15 Jul 2024 18:28:25 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> For a resource defined with size zero, resource contains will also
>> return true.
>>
>> Add resource size check before using it.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/hdm.c | 7 +++++--
>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>> index 3df10517a327..4af9225d4b59 100644
>> --- a/drivers/cxl/core/hdm.c
>> +++ b/drivers/cxl/core/hdm.c
>> @@ -327,10 +327,13 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>>   	cxled->dpa_res = res;
>>   	cxled->skip = skipped;
>>   
>> -	if (resource_contains(&cxlds->pmem_res, res))
>> +	if ((resource_size(&cxlds->pmem_res)) && (resource_contains(&cxlds->pmem_res, res))) {
>> +		printk("%s: resource_contains CXL_DECODER_PMEM\n", __func__);
>>   		cxled->mode = CXL_DECODER_PMEM;
>> -	else if (resource_contains(&cxlds->ram_res, res))
>> +	} else if ((resource_size(&cxlds->ram_res)) && (resource_contains(&cxlds->ram_res, res))) {
>> +		printk("%s: resource_contains CXL_DECODER_RAM\n", __func__);
>>   		cxled->mode = CXL_DECODER_RAM;
>> +	}
>>   	else {
>>   		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
>>   			 port->id, cxled->cxld.id, cxled->dpa_res);
> Also, please clean up your printks before sending them to stable.


Sure.

Thanks!



* Re: [PATCH v2 05/15] cxl: fix use of resource_contains
  2024-07-24 21:25   ` fan
@ 2024-08-16 14:43     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-16 14:43 UTC (permalink / raw)
  To: fan, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 7/24/24 22:25, fan wrote:
> On Mon, Jul 15, 2024 at 06:28:25PM +0100, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> For a resource defined with size zero, resource contains will also
>> return true.
> s/resource contains/resource_contains/
>
> Fan


I'll fix it.

Thanks!


>> Add resource size check before using it.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/hdm.c | 7 +++++--
>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>> index 3df10517a327..4af9225d4b59 100644
>> --- a/drivers/cxl/core/hdm.c
>> +++ b/drivers/cxl/core/hdm.c
>> @@ -327,10 +327,13 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>>   	cxled->dpa_res = res;
>>   	cxled->skip = skipped;
>>   
>> -	if (resource_contains(&cxlds->pmem_res, res))
>> +	if ((resource_size(&cxlds->pmem_res)) && (resource_contains(&cxlds->pmem_res, res))) {
>> +		printk("%s: resource_contains CXL_DECODER_PMEM\n", __func__);
>>   		cxled->mode = CXL_DECODER_PMEM;
>> -	else if (resource_contains(&cxlds->ram_res, res))
>> +	} else if ((resource_size(&cxlds->ram_res)) && (resource_contains(&cxlds->ram_res, res))) {
>> +		printk("%s: resource_contains CXL_DECODER_RAM\n", __func__);
>>   		cxled->mode = CXL_DECODER_RAM;
>> +	}
>>   	else {
>>   		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
>>   			 port->id, cxled->cxld.id, cxled->dpa_res);
>> -- 
>> 2.17.1
>>


* Re: [PATCH v2 06/15] cxl: add function for setting media ready by an accelerator
  2024-08-04 17:26   ` Jonathan Cameron
@ 2024-08-16 14:54     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-16 14:54 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:26, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:26 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> A Type-2 driver can require to set the memory availability explicitly.
>>
>> Add a function to the exported CXL API for accelerator drivers.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/memdev.c          | 7 ++++++-
>>   drivers/net/ethernet/sfc/efx_cxl.c | 5 +++++
>>   include/linux/cxl_accel_mem.h      | 2 ++
>>   3 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index b4205ecca365..58a51e7fd37f 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -714,7 +714,6 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
>>   	return 0;
>>   }
>>   
>> -
> Grumpy maintainer time ;)
> Scrub for this stuff before posting.  Move the whitespace cleanup to the
> earlier patch so we have less noise here.
>

I will avoid this kind of thing in v3.


>>   void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>>   {
>>   	cxlds->cxl_dvsec = dvsec;
>> @@ -759,6 +758,12 @@ int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
>>   
>> +void cxl_accel_set_media_ready(struct cxl_dev_state *cxlds)
>> +{
>> +	cxlds->media_ready = true;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_media_ready, CXL);
>> +
>>   static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>>   {
>>   	struct cxl_memdev *cxlmd =
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 37d8bfdef517..a84fe7992c53 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -56,6 +56,11 @@ void efx_cxl_init(struct efx_nic *efx)
>>   
>>   	if (cxl_accel_request_resource(cxl->cxlds, true))
>>   		pci_info(pci_dev, "CXL accel resource request failed");
>> +
>> +	if (!cxl_await_media_ready(cxl->cxlds))
>> +		cxl_accel_set_media_ready(cxl->cxlds);
>> +	else
>> +		pci_info(pci_dev, "CXL accel media not active");
> Feels fatal. pci_err() and return an error.


As I commented yesterday when this patch came up in the review of 
another patch, this is unnecessary in our case and it will be fixed in 
the next version:

cxl_await_media_ready will not be invoked; only the accessor for 
manually setting the media ready will be used.

Thanks


>>   }
>>   
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index 0ba2195b919b..b883c438a132 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -24,4 +24,6 @@ void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>   			    enum accel_resource);
>>   int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>>   int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
>> +void cxl_accel_set_media_ready(struct cxl_dev_state *cxlds);
>> +int cxl_await_media_ready(struct cxl_dev_state *cxlds);
>>   #endif


* Re: [PATCH v2 07/15] cxl: support type2 memdev creation
  2024-07-24 21:32   ` fan
@ 2024-08-16 14:57     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-16 14:57 UTC (permalink / raw)
  To: fan, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 7/24/24 22:32, fan wrote:
> On Mon, Jul 15, 2024 at 06:28:27PM +0100, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Add memdev creation from sfc driver.
>>
>> Current cxl core is relying on a CXL_DEVTYPE_CLASSMEM type device when
>> creating a memdev leading to problems when obtaining cxl_memdev_state
>> references from a CXL_DEVTYPE_DEVMEM type. This last device type is
>> managed by a specific vendor driver and does not need the same sysfs
>> files since no userspace intervention is expected. This patch checks for the
>> right device type in those functions using cxl_memdev_state.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/cdat.c            |  3 +++
>>   drivers/cxl/core/memdev.c          |  9 +++++++++
>>   drivers/cxl/mem.c                  | 17 +++++++++++------
>>   drivers/net/ethernet/sfc/efx_cxl.c | 10 ++++++++--
>>   include/linux/cxl_accel_mem.h      |  3 +++
>>   5 files changed, 34 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>> index bb83867d9fec..0d4679c137d4 100644
>> --- a/drivers/cxl/core/cdat.c
>> +++ b/drivers/cxl/core/cdat.c
>> @@ -558,6 +558,9 @@ void cxl_region_perf_data_calculate(struct cxl_region *cxlr,
>>   	};
>>   	struct cxl_dpa_perf *perf;
>>   
>> +	if (!mds)
>> +		return;
>> +
>>   	switch (cxlr->mode) {
>>   	case CXL_DECODER_RAM:
>>   		perf = &mds->ram_perf;
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 58a51e7fd37f..b902948b121f 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -468,6 +468,9 @@ static umode_t cxl_ram_visible(struct kobject *kobj, struct attribute *a, int n)
>>   	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>>   	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>>   
>> +	if (!mds)
>> +		return 0;
>> +
>>   	if (a == &dev_attr_ram_qos_class.attr)
>>   		if (mds->ram_perf.qos_class == CXL_QOS_CLASS_INVALID)
>>   			return 0;
>> @@ -487,6 +490,9 @@ static umode_t cxl_pmem_visible(struct kobject *kobj, struct attribute *a, int n
>>   	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>>   	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>>   
>> +	if (!mds)
>> +		return 0;
>> +
>>   	if (a == &dev_attr_pmem_qos_class.attr)
>>   		if (mds->pmem_perf.qos_class == CXL_QOS_CLASS_INVALID)
>>   			return 0;
>> @@ -507,6 +513,9 @@ static umode_t cxl_memdev_security_visible(struct kobject *kobj,
>>   	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>>   	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>>   
>> +	if (!mds)
>> +		return 0;
>> +
>>   	if (a == &dev_attr_security_sanitize.attr &&
>>   	    !test_bit(CXL_SEC_ENABLED_SANITIZE, mds->security.enabled_cmds))
>>   		return 0;
>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>> index 2f1b49bfe162..f76af75a87b7 100644
>> --- a/drivers/cxl/mem.c
>> +++ b/drivers/cxl/mem.c
>> @@ -131,12 +131,14 @@ static int cxl_mem_probe(struct device *dev)
>>   	dentry = cxl_debugfs_create_dir(dev_name(dev));
>>   	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
>>   
>> -	if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
>> -		debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
>> -				    &cxl_poison_inject_fops);
>> -	if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
>> -		debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
>> -				    &cxl_poison_clear_fops);
>> +	if (mds) {
>> +		if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
>> +			debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
>> +					    &cxl_poison_inject_fops);
>> +		if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
>> +			debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
>> +					    &cxl_poison_clear_fops);
>> +	}
>>   
>>   	rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
>>   	if (rc)
>> @@ -222,6 +224,9 @@ static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
>>   	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>>   	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>>   
>> +	if (!mds)
>> +		return 0;
>> +
>>   	if (a == &dev_attr_trigger_poison_list.attr)
>>   		if (!test_bit(CXL_POISON_ENABLED_LIST,
>>   			      mds->poison.enabled_cmds))
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index a84fe7992c53..0abe66490ef5 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -57,10 +57,16 @@ void efx_cxl_init(struct efx_nic *efx)
>>   	if (cxl_accel_request_resource(cxl->cxlds, true))
>>   		pci_info(pci_dev, "CXL accel resource request failed");
>>   
>> -	if (!cxl_await_media_ready(cxl->cxlds))
>> +	if (!cxl_await_media_ready(cxl->cxlds)) {
>>   		cxl_accel_set_media_ready(cxl->cxlds);
>> -	else
>> +	} else {
>>   		pci_info(pci_dev, "CXL accel media not active");
> pci_warning() ??


The code will be modified so that no error needs to be handled here.


>> +		return;
>> +	}
>> +
>> +	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
>> +	if (IS_ERR(cxl->cxlmd))
>> +		pci_info(pci_dev, "CXL accel memdev creation failed");
> pci_err()


Yes. I'll fix it.

Thanks



> Fan
>>   }
>>   
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index b883c438a132..442ed9862292 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -26,4 +26,7 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>>   int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
>>   void cxl_accel_set_media_ready(struct cxl_dev_state *cxlds);
>>   int cxl_await_media_ready(struct cxl_dev_state *cxlds);
>> +
>> +struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>> +				       struct cxl_dev_state *cxlds);
>>   #endif
>> -- 
>> 2.17.1
>>


* Re: [PATCH v2 07/15] cxl: support type2 memdev creation
  2024-08-04 17:31   ` Jonathan Cameron
@ 2024-08-16 15:00     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-16 15:00 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:31, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:27 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Add memdev creation from sfc driver.
>>
>> Current cxl core is relying on a CXL_DEVTYPE_CLASSMEM type device when
>> creating a memdev leading to problems when obtaining cxl_memdev_state
>> references from a CXL_DEVTYPE_DEVMEM type. This last device type is
>> managed by a specific vendor driver and does not need the same sysfs
>> files since no userspace intervention is expected. This patch checks for the
>> right device type in those functions using cxl_memdev_state.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Same general comment about treating failure to get things you expect
> as proper driver probe errors.  Very unlikely we'd ever want to carry
> on if these fail. If we do want to, that should be a high level decision
> and the chances are the driver needs to know that the error occurred
> so it can take some mitigating measures (using some alternative mechanisms
> etc).


OK
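
As a rough illustration of the "pass up the errors, decide at the higher level" structure requested above, here is a userspace sketch. Everything in it is hypothetical: struct fake_accel and the fake_*() helpers only stand in for the driver state and the CXL calls, and the fallback policy in fake_probe() is just one possible choice.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; not the sfc driver's real state. */
struct fake_accel {
	bool media_active;	/* would the media-ready wait succeed? */
	bool memdev_ok;		/* would memdev creation succeed? */
	bool cxl_enabled;	/* set only when every step succeeded */
};

static int fake_await_media_ready(const struct fake_accel *a)
{
	return a->media_active ? 0 : -ETIMEDOUT;
}

static int fake_add_memdev(const struct fake_accel *a)
{
	return a->memdev_ok ? 0 : -ENODEV;
}

/* Low level: report every failure to the caller instead of logging and carrying on. */
static int fake_cxl_init(struct fake_accel *a)
{
	int rc;

	rc = fake_await_media_ready(a);
	if (rc)
		return rc;

	rc = fake_add_memdev(a);
	if (rc)
		return rc;

	a->cxl_enabled = true;
	return 0;
}

/* High level: the policy decision lives here, with the error in hand. */
static int fake_probe(struct fake_accel *a)
{
	int rc = fake_cxl_init(a);

	if (rc)
		a->cxl_enabled = false;	/* e.g. fall back to a non-CXL path */

	return 0;	/* probe still succeeds by explicit policy */
}
```

The low-level function never hides a failure; whether to continue without CXL is decided in exactly one place.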


Other comments below already addressed when replying to fan.

Thanks!


>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index a84fe7992c53..0abe66490ef5 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -57,10 +57,16 @@ void efx_cxl_init(struct efx_nic *efx)
>>   	if (cxl_accel_request_resource(cxl->cxlds, true))
>>   		pci_info(pci_dev, "CXL accel resource request failed");
>>   
>> -	if (!cxl_await_media_ready(cxl->cxlds))
>> +	if (!cxl_await_media_ready(cxl->cxlds)) {
>>   		cxl_accel_set_media_ready(cxl->cxlds);
>> -	else
>> +	} else {
>>   		pci_info(pci_dev, "CXL accel media not active");
>> +		return;
> Once you are returning an error in this path you can just have
> 		return -ETIMEDOUT; or similar here and avoid
> this code changing in this patch.
>> +	}
>> +
>> +	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
>> +	if (IS_ERR(cxl->cxlmd))
>> +		pci_info(pci_dev, "CXL accel memdev creation failed");
> I'd treat this one as fatal as well.
>
> People argue in favor of muddling on to allow firmware upgrade etc.
> That is fine, but pass up the errors then decide to ignore them
> at the higher levels.
>
>>   }
>>   
>>   
>


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-12 11:34     ` Alejandro Lucero Palau
@ 2024-08-17 20:32       ` Zhi Wang
  2024-08-19 11:13         ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-17 20:32 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes, targupta, zhiwang

On Mon, 12 Aug 2024 12:34:55 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> 
> On 8/9/24 09:34, Zhi Wang wrote:
> > On Mon, 15 Jul 2024 18:28:21 +0100
> > <alejandro.lucero-palau@amd.com> wrote:
> >
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> > >> Differentiate Type3, aka memory expanders, from Type2, aka device
> >> accelerators, with a new function for initializing cxl_dev_state.
> >>
> >> Create opaque struct to be used by accelerators relying on new
> >> access functions in following patches.
> >>
> >> Add SFC ethernet network driver as the client.
> >>
> >> Based on
> >> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
> >>
> >> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> >> ---
> > >>   drivers/cxl/core/memdev.c             | 52 ++++++++++++++++++++++++++
> > >>   drivers/net/ethernet/sfc/Makefile     |  2 +-
> > >>   drivers/net/ethernet/sfc/efx.c        |  4 ++
> > >>   drivers/net/ethernet/sfc/efx_cxl.c    | 53 +++++++++++++++++++++++++++
> > >>   drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++++++++
> > >>   drivers/net/ethernet/sfc/net_driver.h |  4 ++
> > >>   include/linux/cxl_accel_mem.h         | 22 +++++++++++
> > >>   include/linux/cxl_accel_pci.h         | 23 ++++++++++++
> > >>   8 files changed, 188 insertions(+), 1 deletion(-)
> >>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
> >>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
> >>   create mode 100644 include/linux/cxl_accel_mem.h
> >>   create mode 100644 include/linux/cxl_accel_pci.h
> >>
> >> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> >> index 0277726afd04..61b5d35b49e7 100644
> >> --- a/drivers/cxl/core/memdev.c
> >> +++ b/drivers/cxl/core/memdev.c
> >> @@ -8,6 +8,7 @@
> >>   #include <linux/idr.h>
> >>   #include <linux/pci.h>
> >>   #include <cxlmem.h>
> >> +#include <linux/cxl_accel_mem.h>
> > Let's keep the header inclusion in an alphabetical order. The same
> > in efx_cxl.c
> 
> 
> The headers seem to follow a reverse Christmas tree order here rather 
> than an alphabetical one.
> 
> Should I rearrange them all?
> 

Let's fix them.

> 
> >>   #include "trace.h"
> >>   #include "core.h"
> >>   
> > >> @@ -615,6 +616,25 @@ static void detach_memdev(struct work_struct *work)
> >>   static struct lock_class_key cxl_memdev_key;
> >>   
> >> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
> >> +{
> >> +	struct cxl_dev_state *cxlds;
> >> +
> >> +	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
> >> +	if (!cxlds)
> >> +		return ERR_PTR(-ENOMEM);
> >> +
> >> +	cxlds->dev = dev;
> >> +	cxlds->type = CXL_DEVTYPE_DEVMEM;
> >> +
> >> +	cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
> >> +	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
> >> +	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
> >> +
> >> +	return cxlds;
> >> +}
> >> +EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
> >> +
> > >>   static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> > >>   					   const struct file_operations *fops)
> > >>   {
> > >> @@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
> > >>   	return 0;
> > >>   }
> >>
> >> +
> >> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
> >> +{
> >> +	cxlds->cxl_dvsec = dvsec;
> >> +}
> >> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_dvsec, CXL);
> >> +
> >> +void cxl_accel_set_serial(struct cxl_dev_state *cxlds, u64 serial)
> >> +{
> >> +	cxlds->serial= serial;
> >> +}
> >> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_serial, CXL);
> >> +
> > It would be nice to explain about how the cxl core is using these in
> > the patch comments, as we just saw the stuff got promoted into the
> > core.
> 
> 
> As far as I can see, it is for info/debugging purposes. I will add
> such explanation in next version.
> 
> 
> >
> >> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct
> >> resource res,
> >> +			    enum accel_resource type)
> >> +{
> >> +	switch (type) {
> >> +	case CXL_ACCEL_RES_DPA:
> >> +		cxlds->dpa_res = res;
> >> +		return;
> >> +	case CXL_ACCEL_RES_RAM:
> >> +		cxlds->ram_res = res;
> >> +		return;
> >> +	case CXL_ACCEL_RES_PMEM:
> >> +		cxlds->pmem_res = res;
> >> +		return;
> >> +	default:
> >> +		dev_err(cxlds->dev, "unkown resource type (%u)\n",
> >> type);
> >> +	}
> >> +}
> >> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
> >> +
> > I wonder in which situation this error can be triggered.
> > One can be a newer out-of-tree type-2 driver tries to work on an
> > older kernel. Other situations should be the coding problem of an
> > in-tree driver.
> 
> 
> I guess that would point to an extension not updating this function.
> 
> 
> > I prefer to WARN_ONCE() here.
> 
> 
> I agree after your previous concern.
> 
> 
> >
> >>   
> >> diff --git a/include/linux/cxl_accel_mem.h
> >> b/include/linux/cxl_accel_mem.h new file mode 100644
> >> index 000000000000..daf46d41f59c
> >> --- /dev/null
> >> +++ b/include/linux/cxl_accel_mem.h
> >> @@ -0,0 +1,22 @@
> >> +/* SPDX-License-Identifier: GPL-2.0 */
> >> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> >> +
> >> +#include <linux/cdev.h>
> >> +
> >> +#ifndef __CXL_ACCEL_MEM_H
> >> +#define __CXL_ACCEL_MEM_H
> >> +
> >> +enum accel_resource{
> >> +	CXL_ACCEL_RES_DPA,
> >> +	CXL_ACCEL_RES_RAM,
> >> +	CXL_ACCEL_RES_PMEM,
> >> +};
> >> +
> >> +typedef struct cxl_dev_state cxl_accel_state;
> > The case of using typedef in kernel coding is very rare (quite many
> > of them are still there due to history reason, you can also spot
> > that there is only one typedef in driver/cxl). Be sure to double
> > check the coding style bible [1] when deciding to use one. :)
> >
> > [1] https://www.kernel.org/doc/html/v4.14/process/coding-style.html
> 
> 
> Right.
> 
> I think there is an agreement now in not using typedef but struct 
> cxl_dev_state so problem solved.
> 
> 
> Thanks!
> 
> 
> 


^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-08-15 15:37         ` Alejandro Lucero Palau
@ 2024-08-18  6:55           ` Zhi Wang
  2024-08-19 13:14             ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-18  6:55 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes, targupta, Vikram Sethi, zhiwang

On Thu, 15 Aug 2024 16:37:21 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> 
> On 8/9/24 11:25, Zhi Wang wrote:
> > On Tue, 23 Jul 2024 14:43:24 +0100
> > Alejandro Lucero Palau <alucerop@amd.com> wrote:
> >
> >> On 7/19/24 20:01, Dave Jiang wrote:
> >>>>    
> >>>> -static int cxl_probe_regs(struct cxl_register_map *map)
> >>>> +static int cxl_probe_regs(struct cxl_register_map *map, uint8_t
> >>>> caps) {
> >>>>    	struct cxl_component_reg_map *comp_map;
> >>>>    	struct cxl_device_reg_map *dev_map;
> >>>> @@ -437,11 +437,12 @@ static int cxl_probe_regs(struct
> >>>> cxl_register_map *map) case CXL_REGLOC_RBI_MEMDEV:
> >>>>    		dev_map = &map->device_map;
> >>>>    		cxl_probe_device_regs(host, base, dev_map);
> >>>> -		if (!dev_map->status.valid ||
> >>>> !dev_map->mbox.valid ||
> >>>> +		if (!dev_map->status.valid ||
> >>>> +		    ((caps & CXL_DRIVER_CAP_MBOX) &&
> >>>> !dev_map->mbox.valid) || !dev_map->memdev.valid) {
> >>>>    			dev_err(host, "registers not found:
> >>>> %s%s%s\n", !dev_map->status.valid ? "status " : "",
> >>>> -				!dev_map->mbox.valid ? "mbox " :
> >>>> "",
> >>>> +				((caps & CXL_DRIVER_CAP_MBOX) &&
> >>>> !dev_map->mbox.valid) ? "mbox " : "",
> >>> According to the r3.1 8.2.8.2.1, the device status registers and
> >>> the primary mailbox registers are both mandatory if regloc id=3
> >>> block is found. So if the type2 device does not implement a
> >>> mailbox then it shouldn't be calling cxl_pci_setup_regs(pdev,
> >>> CXL_REGLOC_RBI_MEMDEV, &map) to begin with from the driver init
> >>> right? If the type2 device defines a regblock with id=3 but
> >>> without a mailbox, then isn't that a spec violation?
> >>>
> >>> DJ
> >>
> >> Right. The code needs to support the possibility of a Type2 having
> >> a mailbox, and if it is not supported, the rest of the dvsec regs
> >> initialization needs to be performed. This is not what the code
> >> does now, so I'll fix this.
> >>
> >>
> >> A wider explanation is, for the RFC I used a test driver based on
> >> QEMU emulating a Type2 which had a CXL Device Register Interface
> >> defined (03h) but not a CXL Device Capability with id 2 for the
> >> primary mailbox register, breaking the spec as you spotted.
> >>
> >>
> > Because SFC driver uses (the 8.2.8.5.1.1 Memory Device Status
> > Register) to determine if the memory media is ready or not (in
> > PATCH 6). That register should be in a regloc id=3 block.
> 
> 
> Right. Note patch 6 calls first cxl_await_media_ready and if it
> returns error, what happens if the register is not found, it sets the
> media ready field since it is required later on.
> 
> Damn it! I realize the code is wrong because the manual setting is
> based on no error. The testing has been a pain until recently with a
> partial emulation, so I had to follow undesired development steps.
> This is better now so v3 will fix some minor bugs like this one.
> 
> I also realize in our case this first call is useless, so I plan to 
> remove it in next version.
> 
> Thanks!
>

Hi Alejandro:

No worries. Let's push forward. :)

For a type-2, I think cxl_await_media_ready() is still valuable: it
gives a type-2 vendor driver a generic core call to make sure the HDM
region is ready to use, because checking the active and valid bits in
CXL_RANGE_{1,2}_SIZE_LO is useful for type-2 as well.

I think the problem with cxl_await_media_ready() is that it assumes the
Memory Device Status Register is always present, which is true for
type-3 but not always true for type-2. I think we need:

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index a663e7566c48..0ba1cedfc0ba 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -203,6 +203,9 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds)
                        return rc;
        }

+       if (!cxlds->regs.memdev)
+               return 0;
+
        md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
        if (!CXLMDEV_READY(md_status))
                return -EIO;

Then a type-2 device that doesn't implement a regloc id=3 block can
still call cxl_await_media_ready() to make sure the media is ready. For
type-2 and type-3 devices that do implement it, the status check
continues as before.

I think SFC can use this as well, because according to the spec 8.1.3.8
DVSEC CXL Range Registers:

"The DVSEC CXL Range 1 register set must be implemented if
Mem_Capable=1 in the DVSEC CXL Capability register. The DVSEC CXL Range
2 register set must be implemented if (Mem_Capable=1 and HDM_Count=10b
in the DVSEC CXL Capability register)."

So SFC should have this. With the change above maybe you don't need
set_media_ready stuff in the later patch. Just simply call
cxl_await_media_ready(), everything should be fine then.
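
To make the intended behaviour concrete, here is a small user-space
sketch of the idea (not kernel code). Everything named mock_* or MOCK_*
is a hypothetical stand-in invented for illustration: struct mock_dev
loosely mirrors the relevant bits of cxl_dev_state, range_active stands
for the outcome of the DVSEC range valid/active checks, and the error
codes are assumptions rather than what the core necessarily returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the "media ready" bit of the status register. */
#define MOCK_MDEV_READY 0x1ULL

/* Hypothetical, heavily simplified stand-in for struct cxl_dev_state. */
struct mock_dev {
	int range_active;              /* outcome of the DVSEC range checks */
	const uint64_t *memdev_status; /* NULL if no regloc id=3 block is mapped */
};

/* Returns 0 when the media can be used, a negative errno otherwise. */
int mock_await_media_ready(const struct mock_dev *dev)
{
	assert(dev != NULL);

	/* Step 1: the DVSEC range check applies to type-2 and type-3 alike. */
	if (!dev->range_active)
		return -ETIMEDOUT;

	/* Step 2: the status register is optional, skip it when not mapped. */
	if (!dev->memdev_status)
		return 0;

	/* Step 3: devices that do expose the register must report ready. */
	if (!(*dev->memdev_status & MOCK_MDEV_READY))
		return -EIO;

	return 0;
}
```

With this shape a driver for a device without the status register gets
0 as soon as the DVSEC ranges are active, while a device that exposes
the register is still held to it.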

Thanks,
Zhi.

> 
> > According to the spec paste above, the device that has regloc block
> > id=3 needs to have device status and mailbox.
> >
> > Curious, does the SFC device have to implement the mailbox in this
> > case for spec compliance?
> 
> 
> I think It should, but no status register either in our case.
> 
> 
> > Previously, I always think that "CXL Memory Device" == "CXL Type-3
> > device" in the CXL spec.
> >
> > Now I am little bit confused if a type-2 device that supports
> > cxl.mem == "CXL Memory Device" mentioned in the spec.
> >
> > If the answer == Y, then having regloc id ==3 and mailbox turn
> > mandatory for a type-2 device that support cxl.mem for the spec
> > compliance.
> >
> > If the answer == N, then a type-2 device can use approaches other
> > than Memory Device Status Register to determine the readiness of
> > the memory?
> 
> 
> Right again. Our device is not advertised as a Memory Device but as a 
> ethernet one, so we are not implementing those mandatory ones for a 
> memory device.
> 
> Regarding the readiness of the CXL memory, I have been told this is
> so once some initial negotiation is performed (I do not know the
> details). That is the reason for setting this manually by our driver
> and the accessor added.
> 
> 
> > ZW
> >
> >> Thanks.
> >>
> >>
> 


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-08-15 16:40       ` Jonathan Cameron
@ 2024-08-18  8:07         ` Zhi Wang
  2024-08-19 11:28           ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-18  8:07 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Alejandro Lucero Palau, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes, targupta, vsethi, zhiwang

On Thu, 15 Aug 2024 17:40:35 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Wed, 14 Aug 2024 08:56:35 +0100
> Alejandro Lucero Palau <alucerop@amd.com> wrote:
> 
> > On 8/4/24 18:15, Jonathan Cameron wrote:
> > > On Mon, 15 Jul 2024 18:28:22 +0100
> > > alejandro.lucero-palau@amd.com wrote:
> > >  
> > >> From: Alejandro Lucero <alucerop@amd.com>
> > >>
> > >> Create a new function for a type2 device initialising the opaque
> > >> cxl_dev_state struct regarding cxl regs setup and mapping.
> > >>
> > >> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> > >> ---
> > >>   drivers/cxl/pci.c                  | 28
> > >> ++++++++++++++++++++++++++++ drivers/net/ethernet/sfc/efx_cxl.c
> > >> |  3 +++ include/linux/cxl_accel_mem.h      |  1 +
> > >>   3 files changed, 32 insertions(+)
> > >>
> > >> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > >> index e53646e9f2fb..b34d6259faf4 100644
> > >> --- a/drivers/cxl/pci.c
> > >> +++ b/drivers/cxl/pci.c
> > >> @@ -11,6 +11,7 @@
> > >>   #include <linux/pci.h>
> > >>   #include <linux/aer.h>
> > >>   #include <linux/io.h>
> > >> +#include <linux/cxl_accel_mem.h>
> > >>   #include "cxlmem.h"
> > >>   #include "cxlpci.h"
> > >>   #include "cxl.h"
> > >> @@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct
> > >> pci_dev *pdev, enum cxl_regloc_type type, return
> > >> cxl_setup_regs(map); }
> > >>   
> > >> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct
> > >> cxl_dev_state *cxlds) +{
> > >> +	struct cxl_register_map map;
> > >> +	int rc;
> > >> +
> > >> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV,
> > >> &map);
> > >> +	if (rc)
> > >> +		return rc;
> > >> +
> > >> +	rc = cxl_map_device_regs(&map,
> > >> &cxlds->regs.device_regs);
> > >> +	if (rc)
> > >> +		return rc;
> > >> +
> > >> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> > >> +				&cxlds->reg_map);
> > >> +	if (rc)
> > >> +		dev_warn(&pdev->dev, "No component registers
> > >> (%d)\n", rc);  
> > > Not fatal?  If we think it will happen on real devices, then
> > > dev_warn is too strong.  
> > 
> > 
> > This is more complex than what it seems, and it is not properly
> > handled with the current code.
> > 
> > I will cover it in another patch in more detail, but the fact is
> > those calls to cxl_pci_setup_regs need to be handled better,
> > because Type2 has some of these registers as optional.
> 
> I'd argue you don't have to support all type 2 devices with your
> first code.  Things like optionality of registers can come in when
> a device shows up where they aren't present.
> 
> Jonathan
> 

I think it is more that we need to change those register probe
routines to probe and return the result without deciding whether the
result is fatal. Let the caller decide. E.g. type-3 assumes some
register groups must be present, so the type-3 caller can treat a
missing one as fatal, while type-2 just needs to remember whether the
register group is present. A missing register group might not be fatal
for a type-2.

E.g.

1) Move the checks out of cxl_probe_regs() and wrap them into a
function, e.g. cxl_check_device_regs():
        case CXL_REGLOC_RBI_MEMDEV:
                dev_map = &map->device_map;
                cxl_probe_device_regs(host, base, dev_map);

                /* Move these checks out of here. */
                if (!dev_map->status.valid ||
                    ((caps & CXL_DRIVER_CAP_MBOX) && !dev_map->mbox.valid) ||
                    !dev_map->memdev.valid) {
                        dev_err(host, "registers not found: %s%s%s\n",
                                !dev_map->status.valid ? "status " : "",
                                ((caps & CXL_DRIVER_CAP_MBOX) &&
                                 !dev_map->mbox.valid) ? "mbox " : "",
                                !dev_map->memdev.valid ? "memdev " : "");
                        return -ENXIO;
                }

2) At the top caller for type-3 cxl_pci_probe():

        rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
                                cxlds->capabilities);
        if (rc)
                return rc;

	/* call cxl_check_device_regs() here, if fail, throw fatal! */

3) At the top caller for type-2 cxl_pci_accel_setup_regs():

        rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
                                cxlds->capabilities);
        if (rc)
                return rc;

        /*
         * Call cxl_check_device_regs() here:
         * if it succeeds, map the registers;
         * if it fails, move on, no need to throw fatal.
         */
        rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
        if (rc)
                return rc;

With these changes, we can let the CXL core detect which registers the
device has. Maybe the driver doesn't even need to tell the CXL core
what caps the driver/device has, so we wouldn't need to introduce
cxlds->capabilities? After cxl_pci_accel_setup_regs(), the CXL core can
just check whether a register group's vaddr mapping is present to know
whether the device has that register group.
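
A compact way to picture the split between probing and policy is the
user-space sketch below. It is only an illustration under assumptions:
struct mock_device_reg_map is a hypothetical, flattened version of the
device register map, mock_check_device_regs() plays the role of the
cxl_check_device_regs() helper suggested above, and the two callers are
invented stand-ins for the type-3 and type-2 setup paths.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened stand-in for the device register map. */
struct mock_device_reg_map {
	bool status_valid;
	bool mbox_valid;
	bool memdev_valid;
};

/* Policy-free check: only reports whether the full register set exists. */
int mock_check_device_regs(const struct mock_device_reg_map *map)
{
	assert(map != NULL);

	if (!map->status_valid || !map->mbox_valid || !map->memdev_valid)
		return -ENXIO;

	return 0;
}

/* Type-3 style caller: a missing register group is fatal. */
int mock_type3_setup(const struct mock_device_reg_map *map)
{
	return mock_check_device_regs(map);
}

/* Type-2 style caller: remember the result and carry on regardless. */
int mock_type2_setup(const struct mock_device_reg_map *map,
		     bool *have_device_regs)
{
	assert(have_device_regs != NULL);

	*have_device_regs = (mock_check_device_regs(map) == 0);

	return 0;
}
```

The check itself never logs or aborts; each caller turns the same
result into its own policy.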

Thanks,
Zhi.

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-15 16:35       ` Jonathan Cameron
@ 2024-08-19 11:10         ` Alejandro Lucero Palau
  2024-08-27 15:06           ` Jonathan Cameron
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 11:10 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes


On 8/15/24 17:35, Jonathan Cameron wrote:
> On Mon, 12 Aug 2024 12:16:02 +0100
> Alejandro Lucero Palau <alucerop@amd.com> wrote:
>
>> On 8/4/24 18:10, Jonathan Cameron wrote:
>>> On Mon, 15 Jul 2024 18:28:21 +0100
>>> <alejandro.lucero-palau@amd.com> wrote:
>>>   
>>>> From: Alejandro Lucero <alucerop@amd.com>
>>>>
>>>> Differientiate Type3, aka memory expanders, from Type2, aka device
>>>> accelerators, with a new function for initializing cxl_dev_state.
>>>>
>>>> Create opaque struct to be used by accelerators relying on new access
>>>> functions in following patches.
>>>>
>>>> Add SFC ethernet network driver as the client.
>>>>
>>>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
>>>>
>>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>>>   
>>>> +
>>>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>>>> +{
>>>> +	cxlds->cxl_dvsec = dvsec;
>>> Nothing to do with accel. If these make sense promote to cxl
>>> core and a linux/cxl/ header.  Also we may want the type3 driver to
>>> switch to them long term. If nothing else, making that handle the
>>> cxl_dev_state as more opaque will show up what is still directly
>>> accessed and may need to be wrapped up for a future accelerator driver
>>> to use.
>>>   
>> I will change the function name then, but not sure I follow the comment
>> about more opaque ...
> If most code can't see the internals of cxl_dev_state because it
> doesn't include the header that defines it, then we will generally
> spot data that may not belong in that state structure in the first place
> or where it is appropriate to have an accessor function mediating that
> access.


I follow that, but I do not know whether you are suggesting here to make
it opaque, which would conflict with a previous comment stating it does
not need to be.


> Jonathan
>
>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-15 16:38         ` Jonathan Cameron
@ 2024-08-19 11:12           ` Alejandro Lucero Palau
  2024-08-20 10:44             ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 11:12 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes


On 8/15/24 17:38, Jonathan Cameron wrote:
> On Tue, 13 Aug 2024 09:30:08 +0100
> Alejandro Lucero Palau <alucerop@amd.com> wrote:
>
>> On 8/12/24 12:16, Alejandro Lucero Palau wrote:
>>> On 8/4/24 18:10, Jonathan Cameron wrote:
>>>> On Mon, 15 Jul 2024 18:28:21 +0100
>>>> <alejandro.lucero-palau@amd.com> wrote:
>>>>   
>>>>> From: Alejandro Lucero <alucerop@amd.com>
>>>>>
>>>>> Differientiate Type3, aka memory expanders, from Type2, aka device
>>>>> accelerators, with a new function for initializing cxl_dev_state.
>>>>>
>>>>> Create opaque struct to be used by accelerators relying on new access
>>>>> functions in following patches.
>>>>>
>>>>> Add SFC ethernet network driver as the client.
>>>>>
>>>>> Based on
>>>>> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
>>>>>
>>>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>>>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>>>>   
>>>   
>>>>> +
>>>>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>>>>> +{
>>>>> +    cxlds->cxl_dvsec = dvsec;
>>>> Nothing to do with accel. If these make sense promote to cxl
>>>> core and a linux/cxl/ header.  Also we may want the type3 driver to
>>>> switch to them long term. If nothing else, making that handle the
>>>> cxl_dev_state as more opaque will show up what is still directly
>>>> accessed and may need to be wrapped up for a future accelerator driver
>>>> to use.
>>>>   
>>> I will change the function name then, but not sure I follow the
>>> comment about more opaque ...
>>>
>>>
>>>   
>> I have second thoughts about this.
>>
>>
>> I consider this as an accessor  for, as you said in a previous exchange,
>> facilitating changes to the core structs without touching those accel
>> drivers using it.
>>
>> Type3 driver is part of the CXL core and easy to change for these kind
>> of updates since it will only be one driver supporting all Type3, and an
>> accessor is not required then.
>>
>> Let me know what you think.
> It's less critical, but longer term I'd expect any stuff that makes
> sense for accelerators and the type 3 driver to use the same
> approaches and code paths.  Makes it easier to see where they
> are related than opencoding the accesses in the type 3 driver will
> do.  In the very long term, I'd expect the type 3 driver to just be
> another CXL driver alongside many others.


It makes sense, so I will change the name.

A follow-up patchset, once this one has hopefully gone through, will
switch the CXL PCI driver to use the accessors.
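
For reference, the accessor approach discussed here boils down to the
usual opaque-struct pattern, sketched below in user-space C. All mock_*
names are hypothetical and only echo the dvsec/serial setters from the
patch; the getters are added purely for illustration. In a real split
the struct definition would live in a core-private file, whereas here
both halves share one file so the sketch stays self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* "Public header" half: clients see a forward declaration and accessors. */
struct mock_dev_state;

struct mock_dev_state *mock_state_create(void);
void mock_state_destroy(struct mock_dev_state *s);
void mock_set_dvsec(struct mock_dev_state *s, uint16_t dvsec);
void mock_set_serial(struct mock_dev_state *s, uint64_t serial);
uint16_t mock_get_dvsec(const struct mock_dev_state *s);
uint64_t mock_get_serial(const struct mock_dev_state *s);

/* "Core" half: the only place that knows the layout. */
struct mock_dev_state {
	uint16_t dvsec;
	uint64_t serial;
};

struct mock_dev_state *mock_state_create(void)
{
	/* Zero-initialised, so unset fields read back as 0. */
	return calloc(1, sizeof(struct mock_dev_state));
}

void mock_state_destroy(struct mock_dev_state *s)
{
	free(s);
}

void mock_set_dvsec(struct mock_dev_state *s, uint16_t dvsec)
{
	assert(s != NULL);
	s->dvsec = dvsec;
}

void mock_set_serial(struct mock_dev_state *s, uint64_t serial)
{
	assert(s != NULL);
	s->serial = serial;
}

uint16_t mock_get_dvsec(const struct mock_dev_state *s)
{
	assert(s != NULL);
	return s->dvsec;
}

uint64_t mock_get_serial(const struct mock_dev_state *s)
{
	assert(s != NULL);
	return s->serial;
}
```

Because clients only go through the accessors, the layout can change
without touching them, and any field that still needs direct access
shows up as a missing accessor.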

Thanks!


> Jonathan
>
>>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-17 20:32       ` Zhi Wang
@ 2024-08-19 11:13         ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 11:13 UTC (permalink / raw)
  To: Zhi Wang
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes, targupta, zhiwang


On 8/17/24 21:32, Zhi Wang wrote:
> On Mon, 12 Aug 2024 12:34:55 +0100
> Alejandro Lucero Palau <alucerop@amd.com> wrote:
>
>> On 8/9/24 09:34, Zhi Wang wrote:
>>> On Mon, 15 Jul 2024 18:28:21 +0100
>>> <alejandro.lucero-palau@amd.com> wrote:
>>>
>>>> From: Alejandro Lucero <alucerop@amd.com>
>>>>
>>>> Differientiate Type3, aka memory expanders, from Type2, aka device
>>>> accelerators, with a new function for initializing cxl_dev_state.
>>>>
>>>> Create opaque struct to be used by accelerators relying on new
>>>> access functions in following patches.
>>>>
>>>> Add SFC ethernet network driver as the client.
>>>>
>>>> Based on
>>>> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
>>>>
>>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>>>> ---
>>>>    drivers/cxl/core/memdev.c             | 52
>>>> ++++++++++++++++++++++++++ drivers/net/ethernet/sfc/Makefile     |
>>>>   2 +- drivers/net/ethernet/sfc/efx.c        |  4 ++
>>>>    drivers/net/ethernet/sfc/efx_cxl.c    | 53
>>>> +++++++++++++++++++++++++++ drivers/net/ethernet/sfc/efx_cxl.h    |
>>>> 29 +++++++++++++++ drivers/net/ethernet/sfc/net_driver.h |  4 ++
>>>>    include/linux/cxl_accel_mem.h         | 22 +++++++++++
>>>>    include/linux/cxl_accel_pci.h         | 23 ++++++++++++
>>>>    8 files changed, 188 insertions(+), 1 deletion(-)
>>>>    create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>>>>    create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>>>>    create mode 100644 include/linux/cxl_accel_mem.h
>>>>    create mode 100644 include/linux/cxl_accel_pci.h
>>>>
>>>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>>>> index 0277726afd04..61b5d35b49e7 100644
>>>> --- a/drivers/cxl/core/memdev.c
>>>> +++ b/drivers/cxl/core/memdev.c
>>>> @@ -8,6 +8,7 @@
>>>>    #include <linux/idr.h>
>>>>    #include <linux/pci.h>
>>>>    #include <cxlmem.h>
>>>> +#include <linux/cxl_accel_mem.h>
>>> Let's keep the header inclusion in an alphabetical order. The same
>>> in efx_cxl.c
>>
>> The headers seem to follow a reverse Christmas tree order here rather
>> than an alphabetical one.
>>
>> Should I rearrange them all?
>>
> Let's fix them.
>

Will do.

Thanks!



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 02/15] cxl: add function for type2 cxl regs setup
  2024-08-18  8:07         ` Zhi Wang
@ 2024-08-19 11:28           ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 11:28 UTC (permalink / raw)
  To: Zhi Wang, Jonathan Cameron
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes, targupta, vsethi, zhiwang


On 8/18/24 09:07, Zhi Wang wrote:
> On Thu, 15 Aug 2024 17:40:35 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
>
>> On Wed, 14 Aug 2024 08:56:35 +0100
>> Alejandro Lucero Palau <alucerop@amd.com> wrote:
>>
>>> On 8/4/24 18:15, Jonathan Cameron wrote:
>>>> On Mon, 15 Jul 2024 18:28:22 +0100
>>>> alejandro.lucero-palau@amd.com wrote:
>>>>   
>>>>> From: Alejandro Lucero <alucerop@amd.com>
>>>>>
>>>>> Create a new function for a type2 device initialising the opaque
>>>>> cxl_dev_state struct regarding cxl regs setup and mapping.
>>>>>
>>>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>>>> ---
>>>>>    drivers/cxl/pci.c                  | 28
>>>>> ++++++++++++++++++++++++++++ drivers/net/ethernet/sfc/efx_cxl.c
>>>>> |  3 +++ include/linux/cxl_accel_mem.h      |  1 +
>>>>>    3 files changed, 32 insertions(+)
>>>>>
>>>>> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>>>>> index e53646e9f2fb..b34d6259faf4 100644
>>>>> --- a/drivers/cxl/pci.c
>>>>> +++ b/drivers/cxl/pci.c
>>>>> @@ -11,6 +11,7 @@
>>>>>    #include <linux/pci.h>
>>>>>    #include <linux/aer.h>
>>>>>    #include <linux/io.h>
>>>>> +#include <linux/cxl_accel_mem.h>
>>>>>    #include "cxlmem.h"
>>>>>    #include "cxlpci.h"
>>>>>    #include "cxl.h"
>>>>> @@ -521,6 +522,33 @@ static int cxl_pci_setup_regs(struct
>>>>> pci_dev *pdev, enum cxl_regloc_type type, return
>>>>> cxl_setup_regs(map); }
>>>>>    
>>>>> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct
>>>>> cxl_dev_state *cxlds) +{
>>>>> +	struct cxl_register_map map;
>>>>> +	int rc;
>>>>> +
>>>>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV,
>>>>> &map);
>>>>> +	if (rc)
>>>>> +		return rc;
>>>>> +
>>>>> +	rc = cxl_map_device_regs(&map,
>>>>> &cxlds->regs.device_regs);
>>>>> +	if (rc)
>>>>> +		return rc;
>>>>> +
>>>>> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
>>>>> +				&cxlds->reg_map);
>>>>> +	if (rc)
>>>>> +		dev_warn(&pdev->dev, "No component registers
>>>>> (%d)\n", rc);
>>>> Not fatal?  If we think it will happen on real devices, then
>>>> dev_warn is too strong.
>>>
>>> This is more complex than what it seems, and it is not properly
>>> handled with the current code.
>>>
>>> I will cover it in another patch in more detail, but the fact is
>>> those calls to cxl_pci_setup_regs need to be handled better,
>>> because Type2 has some of these registers as optional.
>> I'd argue you don't have to support all type 2 devices with your
>> first code.  Things like optionality of registers can come in when
>> a device shows up where they aren't present.
>>
>> Jonathan
>>
> I think it is more like we need to change those register
> probe routines to probe and return the result, but not decide
> if the result is fatal or not. Let the caller decide it. E.g. type-3
> assumes some registers group must be present, then the caller of type-3
> can throw a fatal. While, type-2 just need to remember if the register
> group is present or not. A register group is missing might not be fatal
> to a type-2.


I agree.


> E.g.
>
> 1) moving the judges out of cxl_probe_regs() and wrap them into a
> function. e.g. cxl_check_check_device_regs():
>          case CXL_REGLOC_RBI_MEMDEV:
>                  dev_map = &map->device_map;
>                  cxl_probe_device_regs(host, base, dev_map);
>
> 		/* Moving the judeges out of here. */
>                  if (!dev_map->status.valid ||
>                      ((caps & CXL_DRIVER_CAP_MBOX) &&
>                  !dev_map->mbox.valid) || !dev_map->memdev.valid) {
>                          dev_err(host, "registers not found: %s%s%s\n",
>                                  !dev_map->status.valid ? "status " : "",
>                                  ((caps & CXL_DRIVER_CAP_MBOX) &&
>                  !dev_map->mbox.valid) ? "mbox " : "",
>                  !dev_map->memdev.valid ? "memdev " : ""); return -ENXIO;
>                  }
>
> 2) At the top caller for type-3 cxl_pci_probe():
>
>          rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
>                                  cxlds->capabilities);
>          if (rc)
>                  return rc;
>
> 	/* call cxl_check_device_regs() here, if fail, throw fatal! */
>
> 3) At the top caller for type-2 cxl_pci_accel_setup_regs():
>
> 	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
>                                  cxlds->capabilities);
>          if (rc)
>                  return rc;
>
> /* call cxl_check_device_regs() here,
>   * if succeed, map the registers
>   * if fail, move on, no need to throw fatal.
>   */
> 	rc = cxl_map_device_regs(&map, &cxlds->regs.device_regs);
>          if (rc)
>                  return rc;
>
> With the changes, we can let the CXL core detects what the registers the
> device has, maybe the driver even doesn't need to tell the CXL core,
> what caps the driver/device has, then we don't need to introduce the
> cxlds->capabilities? the CXL core just go to check if a register group's
> vaddr mapping is present, then it knows if the device has a
> register group or not, after the cxl_pci_accel_setup_regs().


I thought about building up the device capabilities based on what the
registers show instead of having the driver state them explicitly, which
I think is your point. But I think we need those capabilities one way or
another, not just for informational purposes but also for deciding
whether other initialization steps should fail, which was the original
goal behind this patch. The driver could also define the capabilities it
expects and, once the register initialization has identified what is
there, check that they match.


So yes, I think it could go this way, but I would prefer to do such a 
refactoring after this initial type2 support.
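
As a rough sketch of that last idea (the driver states what it expects,
the core reports what it found, and the two are compared), something
like the user-space code below could work. The MOCK_CAP_* bits and
mock_validate_caps() are hypothetical names made up for this example;
they are not the flags or helpers from the patchset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, one per register group. */
#define MOCK_CAP_STATUS (1u << 0)
#define MOCK_CAP_MBOX   (1u << 1)
#define MOCK_CAP_MEMDEV (1u << 2)
#define MOCK_CAP_HDM    (1u << 3)

/*
 * Compare what the driver expects with what register setup discovered.
 * Extra discovered capabilities are fine; only expected-but-absent ones
 * are reported, through *missing, together with -ENXIO.
 */
int mock_validate_caps(uint32_t expected, uint32_t found, uint32_t *missing)
{
	assert(missing != NULL);

	*missing = expected & ~found;

	return *missing ? -ENXIO : 0;
}
```

A driver that expects MOCK_CAP_HDM only would pass on a device that
also has a mailbox, while one that expects MOCK_CAP_MBOX would fail
early on a device without it and learn which bit was missing.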


> Thanks,
> Zhi.
>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state
  2024-08-18  6:55           ` Zhi Wang
@ 2024-08-19 13:14             ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 13:14 UTC (permalink / raw)
  To: Zhi Wang
  Cc: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes, targupta, Vikram Sethi, zhiwang


On 8/18/24 07:55, Zhi Wang wrote:
> On Thu, 15 Aug 2024 16:37:21 +0100
> Alejandro Lucero Palau <alucerop@amd.com> wrote:
>
>> On 8/9/24 11:25, Zhi Wang wrote:
>>> On Tue, 23 Jul 2024 14:43:24 +0100
>>> Alejandro Lucero Palau <alucerop@amd.com> wrote:
>>>
>>>> On 7/19/24 20:01, Dave Jiang wrote:
>>>>>>     
>>>>>> -static int cxl_probe_regs(struct cxl_register_map *map)
>>>>>> +static int cxl_probe_regs(struct cxl_register_map *map, uint8_t
>>>>>> caps) {
>>>>>>     	struct cxl_component_reg_map *comp_map;
>>>>>>     	struct cxl_device_reg_map *dev_map;
>>>>>> @@ -437,11 +437,12 @@ static int cxl_probe_regs(struct
>>>>>> cxl_register_map *map) case CXL_REGLOC_RBI_MEMDEV:
>>>>>>     		dev_map = &map->device_map;
>>>>>>     		cxl_probe_device_regs(host, base, dev_map);
>>>>>> -		if (!dev_map->status.valid ||
>>>>>> !dev_map->mbox.valid ||
>>>>>> +		if (!dev_map->status.valid ||
>>>>>> +		    ((caps & CXL_DRIVER_CAP_MBOX) &&
>>>>>> !dev_map->mbox.valid) || !dev_map->memdev.valid) {
>>>>>>     			dev_err(host, "registers not found:
>>>>>> %s%s%s\n", !dev_map->status.valid ? "status " : "",
>>>>>> -				!dev_map->mbox.valid ? "mbox " :
>>>>>> "",
>>>>>> +				((caps & CXL_DRIVER_CAP_MBOX) &&
>>>>>> !dev_map->mbox.valid) ? "mbox " : "",
>>>>> According to the r3.1 8.2.8.2.1, the device status registers and
>>>>> the primary mailbox registers are both mandatory if regloc id=3
>>>>> block is found. So if the type2 device does not implement a
>>>>> mailbox then it shouldn't be calling cxl_pci_setup_regs(pdev,
>>>>> CXL_REGLOC_RBI_MEMDEV, &map) to begin with from the driver init
>>>>> right? If the type2 device defines a regblock with id=3 but
>>>>> without a mailbox, then isn't that a spec violation?
>>>>>
>>>>> DJ
>>>> Right. The code needs to support the possibility of a Type2 having
>>>> a mailbox, and if it is not supported, the rest of the dvsec regs
>>>> initialization needs to be performed. This is not what the code
>>>> does now, so I'll fix this.
>>>>
>>>>
>>>> A wider explanation is, for the RFC I used a test driver based on
>>>> QEMU emulating a Type2 which had a CXL Device Register Interface
>>>> defined (03h) but not a CXL Device Capability with id 2 for the
>>>> primary mailbox register, breaking the spec as you spotted.
>>>>
>>>>
>>> Because SFC driver uses (the 8.2.8.5.1.1 Memory Device Status
>>> Register) to determine if the memory media is ready or not (in
>>> PATCH 6). That register should be in a regloc id=3 block.
>>
>> Right. Note patch 6 calls first cxl_await_media_ready and if it
>> returns error, what happens if the register is not found, it sets the
>> media ready field since it is required later on.
>>
>> Damn it! I realize the code is wrong because the manual setting is
>> based on no error. The testing has been a pain until recently with a
>> partial emulation, so I had to follow undesired development steps.
>> This is better now so v3 will fix some minor bugs like this one.
>>
>> I also realize in our case this first call is useless, so I plan to
>> remove it in next version.
>>
>> Thanks!
>>
> Hi Alejandro:
>
> No worries. Let's push forward. :)
>
> For a type-2, I think cxl_await_media_ready() still gives value on
> provide a type-2 vendor driver a generic core call to make sure the HDM
> region is ready to use. Because judging CXL_RANGE active & valid in
> CXL_RANGE_{1,2}_SIZE_LO can be useful to type-2.
>
> I think the problem of cxl_await_media_ready() is: it assumes the
> Memory Device Status Register is always present, which is true for
> type-3 but not always true for type-2. I think we need:
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a663e7566c48..0ba1cedfc0ba 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -203,6 +203,9 @@ int cxl_await_media_ready(struct cxl_dev_state
> *cxlds)
>                          return rc;
>          }
>
> +       if (!cxlds->regs.memdev)
> +               return 0;
> +
>          md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
>          if (!CXLMDEV_READY(md_status))
>                  return -EIO;
>
> Then for the type-2 device, if it doesn't implement regloc=3, it can
> still call cxl_await_media_ready() to make sure the media is ready. For
> type-2 and type-3 which implements regloc=3, the check can continue.


In this case I think the driver should know whether calling this function
makes sense, independently of the code checking that the proper register
exists.


>
> I think SFC can use this as well, because according to the spec 8.1.3.8
> DVSEC CXL Range Registers:
>
> "The DVSEC CXL Range 1 register set must be implemented if
> Mem_Capable=1 in the DVSEC CXL Capability register. The DVSEC CXL Range
> 2 register set must be implemented if (Mem_Capable=1 and HDM_Count=10b
> in the DVSEC CXL Capability register)."


I have discussed this internally, and our understanding is that what you
point to is only mandatory for memory devices, which we are not. I guess
this is an ambiguity in the spec, but the fact is that the current
hardware design, which will be part of the upcoming silicon, does not
implement such a register.

> So SFC should have this. With the change above maybe you don't need
> set_media_ready stuff in the later patch. Just simply call
> cxl_await_media_ready(), everything should be fine then.


The media_ready field inside cxl_dev_state needs to be set to true so
that later checks do not preclude further initialization.

I could drop this accessor, as we have decided not to make cxl_dev_state
opaque, but in anticipation of future refactoring of the core cxl structs
I think it is worth keeping the accessor.
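
To make the accessor argument concrete, here is a minimal userspace sketch
of the idea, not the kernel code. Everything prefixed demo_ or DEMO_ is
hypothetical, including the struct layout and the ready bit position. It
only shows the shape: the driver polls the status register when the device
has one, otherwise it relies on its own knowledge, and in both cases the
field is written through an accessor rather than directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cxl_dev_state; only what the sketch needs. */
struct demo_dev_state {
	/* NULL when the device has no Memory Device Status register. */
	const uint64_t *memdev_status;
	bool media_ready;
};

/* Illustrative bit position only, not the layout from the spec. */
#define DEMO_MEDIA_READY_BIT (UINT64_C(1) << 0)

/* Accessor: callers never touch the field, so the struct can change. */
void demo_set_media_ready(struct demo_dev_state *ds)
{
	ds->media_ready = true;
}

bool demo_media_ready(const struct demo_dev_state *ds)
{
	return ds->media_ready;
}

/*
 * Driver-side decision. Returns 0 on success and -1 when the register
 * exists but reports not ready, or when there is no register and the
 * driver has no other evidence that the media is usable.
 */
int demo_driver_init_media(struct demo_dev_state *ds, bool driver_knows_ready)
{
	if (ds->memdev_status) {
		if (!(*ds->memdev_status & DEMO_MEDIA_READY_BIT))
			return -1;
		demo_set_media_ready(ds);
		return 0;
	}

	if (!driver_knows_ready)
		return -1;

	demo_set_media_ready(ds);
	return 0;
}
```

With this split, a later change to how readiness is stored only touches
the two accessors.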

Thanks


>
> Thanks,
> Zhi.
>
>>> According to the spec paste above, the device that has regloc block
>>> id=3 needs to have device status and mailbox.
>>>
>>> Curious, does the SFC device have to implement the mailbox in this
>>> case for spec compliance?
>>
>> I think It should, but no status register either in our case.
>>
>>
>>> Previously, I always think that "CXL Memory Device" == "CXL Type-3
>>> device" in the CXL spec.
>>>
>>> Now I am little bit confused if a type-2 device that supports
>>> cxl.mem == "CXL Memory Device" mentioned in the spec.
>>>
>>> If the answer == Y, then having regloc id ==3 and mailbox turn
>>> mandatory for a type-2 device that support cxl.mem for the spec
>>> compliance.
>>>
>>> If the answer == N, then a type-2 device can use approaches other
>>> than Memory Device Status Register to determine the readiness of
>>> the memory?
>>
>> Right again. Our device is not advertised as a Memory Device but as a
>> ethernet one, so we are not implementing those mandatory ones for a
>> memory device.
>>
>> Regarding the readiness of the CXL memory, I have been told this is
>> so once some initial negotiation is performed (I do not know the
>> details). That is the reason for setting this manually by our driver
>> and the accessor added.
>>
>>
>>> ZW
>>>
>>>> Thanks.
>>>>
>>>>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 08/15] cxl: indicate probe deferral
  2024-08-04 17:41   ` Jonathan Cameron
@ 2024-08-19 13:54     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 13:54 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:41, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:28 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> The first stop for a CXL accelerator driver that wants to establish new
>> CXL.mem regions is to register a 'struct cxl_memdev. That kicks off
>> cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
>> topology up to the root.
>>
>> If the root driver has not attached yet the expectation is that the
>> driver waits until that link is established. The common cxl_pci_driver
>> has reason to keep the 'struct cxl_memdev' device attached to the bus
>> until the root driver attaches. An accelerator may want to instead defer
>> probing until CXL resources can be acquired.
>>
>> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
>> accelerator driver probing should be defferred vs failed. Provide that
>> indication via a new cxl_acquire_endpoint() API that can retrieve the
>> probe status of the memdev.
>>
>> The first consumer of this API is a test driver that excercises the CXL
> Spell check.
> exercises


I'll fix it, along with "step" instead of "stop" in the first line.


>> Type-2 flow.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>   drivers/cxl/core/memdev.c          | 41 ++++++++++++++++++++++++++++++
>>   drivers/cxl/core/port.c            |  2 +-
>>   drivers/cxl/mem.c                  |  7 +++--
>>   drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
>>   include/linux/cxl_accel_mem.h      |  3 +++
>>   5 files changed, 59 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index b902948b121f..d51c8bfb32e3 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>   }
>>   EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>>   
>> +/*
>> + * Try to get a locked reference on a memdev's CXL port topology
>> + * connection. Be careful to observe when cxl_mem_probe() has deposited
>> + * a probe deferral awaiting the arrival of the CXL root driver
> It might have deposited an error that isn't deferral I think.
> I would be careful to make that clear in this comment.


Yes. The situation this patch is dealing with is not easy to handle. I
realize the accel driver needs to be aware of it, which the sfc code does
not do.

I need to work on this, starting with emulating the situation and maybe
adding that work as a test, which requires some emulated Type2 device.
Dan was asking about earlier work, done before my initial RFC, that
targeted Type2 support in qemu; maybe that is something we can talk about
at LPC.
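
As a rough sketch of what "being aware of it" could mean for an accel
driver, the userspace model below classifies the pointer returned by
something like cxl_acquire_endpoint(). The demo_ helpers mimic the
kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention and are not the kernel
definitions; DEMO_EPROBE_DEFER mirrors the kernel-internal value 517; the
three-way enum is purely an assumption for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095
#define DEMO_EPROBE_DEFER 517 /* kernel-internal errno, not in <errno.h> */

/* Userspace imitations of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int demo_is_err(const void *p)
{
	/* Error values live in the last DEMO_MAX_ERRNO addresses. */
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

enum demo_probe_action {
	DEMO_PROCEED,     /* endpoint is usable, carry on with CXL setup */
	DEMO_RETRY_LATER, /* root driver not attached yet, defer the probe */
	DEMO_FAIL,        /* a real error, do not retry */
};

/* Decide what the accel driver probe should do with the result. */
enum demo_probe_action demo_classify_endpoint(const void *endpoint)
{
	if (!demo_is_err(endpoint))
		return DEMO_PROCEED;

	if (demo_ptr_err(endpoint) == -DEMO_EPROBE_DEFER)
		return DEMO_RETRY_LATER;

	return DEMO_FAIL;
}
```

The point is only that a deposited error is not always a deferral, so the
caller has to look at the value rather than treat every failure the same
way.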


>> +*/
>> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
>> +{
>> +	struct cxl_port *endpoint;
>> +	int rc = -ENXIO;
>> +
>> +	device_lock(&cxlmd->dev);
> I'd not really expect an 'acquire endpoint' to exit
> in the good path with the cxlmd->dev device lock held.
> Perhaps that needs a bit more shouting in the naming of
> the function?


Hmm, it is not clear to me at this point whether that is needed. This is
basically the original patch by Dan so, as said above, I need to work on
this a bit further.

I'll try to get this sorted out for v3.

Thanks


>> +	endpoint = cxlmd->endpoint;
>> +	if (!endpoint)
>> +		goto err;
>> +
>> +	if (IS_ERR(endpoint)) {
>> +		rc = PTR_ERR(endpoint);
>> +		goto err;
>> +	}
>> +
>> +	device_lock(&endpoint->dev);
>> +	if (!endpoint->dev.driver)
>> +		goto err_endpoint;
>> +
>> +	return endpoint;
>> +
>> +err_endpoint:
>> +	device_unlock(&endpoint->dev);
>> +err:
>> +	device_unlock(&cxlmd->dev);
>> +	return ERR_PTR(rc);
>> +}
>> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
>> +
>> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
>> +{
>> +	device_unlock(&endpoint->dev);
>> +	device_unlock(&cxlmd->dev);
>> +}
>> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
>> +
>>   static void sanitize_teardown_notifier(void *data)
>>   {
>>   	struct cxl_memdev_state *mds = data;
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index d66c6349ed2d..3c6b896c5f65 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>   		 */
>>   		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>   			dev_name(dport_dev));
>> -		return -ENXIO;
>> +		return -EPROBE_DEFER;
>>   	}
>>   
>>   	parent_port = find_cxl_port(dparent, &parent_dport);
>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>> index f76af75a87b7..383a6f4829d3 100644
>> --- a/drivers/cxl/mem.c
>> +++ b/drivers/cxl/mem.c
>> @@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
>>   		return rc;
>>   
>>   	rc = devm_cxl_enumerate_ports(cxlmd);
>> -	if (rc)
>> +	if (rc) {
>> +		cxlmd->endpoint = ERR_PTR(rc);
>>   		return rc;
>> +	}
>>   
>>   	parent_port = cxl_mem_find_port(cxlmd, &dport);
>>   	if (!parent_port) {
>>   		dev_err(dev, "CXL port topology not found\n");
> Hmm. This seems excessive error print for a deferred path.
>
>> -		return -ENXIO;
>> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
>> +		return -EPROBE_DEFER;
>>   	}
>>   
>>   	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-08-04 17:57   ` Jonathan Cameron
@ 2024-08-19 14:47     ` Alejandro Lucero Palau
  2024-08-27 15:18       ` Jonathan Cameron
  2024-08-28 10:18     ` Alejandro Lucero Palau
  2024-08-28 10:41     ` Alejandro Lucero Palau
  2 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 14:47 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:57, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:29 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> CXL region creation involves allocating capacity from device DPA
>> (device-physical-address space) and assigning it to decode a given HPA
>> (host-physical-address space). Before determining how much DPA to
>> allocate the amount of available HPA must be determined. Also, not all
>> HPA is create equal, some specifically targets RAM, some target PMEM,
>> some is prepared for device-memory flows like HDM-D and HDM-DB, and some
>> is host-only (HDM-H).
>>
>> Wrap all of those concerns into an API that retrieves a root decoder
>> (platform CXL window) that fits the specified constraints and the
>> capacity available for a new region.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m6fbe775541da3cd477d65fa95c8acdc347345b4f
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> Hi.
>
> This seems a lot more complex than an accelerator would need.
> If plan is to use this in the type3 driver as well, I'd like to
> see that done as a precursor to the main series.
> If it only matters to accelerator drivers (as in type 3 I think
> we make this a userspace problem), then limit the code to handle
> interleave ways == 1 only.  Maybe we will care about higher interleave
> in the long run, but do you have a multihead accelerator today?


I would say this is needed for Type3 as well, but current support relies
on user space requests. I think Type3 support uses the legacy
implementation for memory devices, where initially the requirements are
quite similar, but I think where CXL is going requires less manual
intervention, or manual intervention with more automatic assistance. I'll
wait until Dan can comment on this one before deciding whether to send it
as a precursor or as part of the type2 support.


Regarding the interleave, I know you are joking ... but who knows what
the future will bring. Or maybe I'm misunderstanding your comment,
because in my view multi-head devices and interleave are not directly
related. Are they? I think you can have a single head and support
interleaving, with multi-head implying different hosts and therefore
different HPAs.


> Jonathan
>
>> ---
>>   drivers/cxl/core/region.c          | 161 +++++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h                  |   3 +
>>   drivers/cxl/cxlmem.h               |   5 +
>>   drivers/net/ethernet/sfc/efx_cxl.c |  14 +++
>>   include/linux/cxl_accel_mem.h      |   9 ++
>>   5 files changed, 192 insertions(+)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 538ebd5a64fd..ca464bfef77b 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -702,6 +702,167 @@ static int free_hpa(struct cxl_region *cxlr)
>>   	return 0;
>>   }
>>   
>> +
>> +struct cxlrd_max_context {
>> +	struct device * const *host_bridges;
>> +	int interleave_ways;
>> +	unsigned long flags;
>> +	resource_size_t max_hpa;
>> +	struct cxl_root_decoder *cxlrd;
>> +};
>> +
>> +static int find_max_hpa(struct device *dev, void *data)
>> +{
>> +	struct cxlrd_max_context *ctx = data;
>> +	struct cxl_switch_decoder *cxlsd;
>> +	struct cxl_root_decoder *cxlrd;
>> +	struct resource *res, *prev;
>> +	struct cxl_decoder *cxld;
>> +	resource_size_t max;
>> +	int found;
>> +
>> +	if (!is_root_decoder(dev))
>> +		return 0;
>> +
>> +	cxlrd = to_cxl_root_decoder(dev);
>> +	cxld = &cxlrd->cxlsd.cxld;
>> +	if ((cxld->flags & ctx->flags) != ctx->flags) {
>> +		dev_dbg(dev, "find_max_hpa, flags not matching: %08lx vs %08lx\n",
>> +			      cxld->flags, ctx->flags);
>> +		return 0;
>> +	}
>> +
>> +	/* A Host bridge could have more interleave ways than an
>> +	 * endpoint, couldn´t it?
> EP interleave ways is about working out how the full HPA address (it's
> all sent over the wire) is modified to get to the DPA.  So it needs
> to know what the overall interleave is.  Host bridge can't interleave
> and then have the EP not know about it.  If there are switch HDM decoders
> in the path, the host bridge interleave may be less than that the EP needs
> to deal with.
>
> Does an accelerator actually cope with interleave? Is aim here to ensure
> that IW is never anything other than 1?  Or is this meant to have
> more general use? I guess it is meant to. In which case, I'd like to
> see this used in the type3 driver as well.
>
>> +	 *
>> +	 * What does interleave ways mean here in terms of the requestor?
>> +	 * Why the FFMWS has 0 interleave ways but root port has 1?
> FFMWS?
>
>> +	 */
>> +	if (cxld->interleave_ways != ctx->interleave_ways) {
>> +		dev_dbg(dev, "find_max_hpa, interleave_ways  not matching\n");
>> +		return 0;
>> +	}
>> +
>> +	cxlsd = &cxlrd->cxlsd;
>> +
>> +	guard(rwsem_read)(&cxl_region_rwsem);
>> +	found = 0;
>> +	for (int i = 0; i < ctx->interleave_ways; i++)
>> +		for (int j = 0; j < ctx->interleave_ways; j++)
>> +			if (ctx->host_bridges[i] ==
>> +					cxlsd->target[j]->dport_dev) {
>> +				found++;
>> +				break;
>> +			}
>> +
>> +	if (found != ctx->interleave_ways) {
>> +		dev_dbg(dev, "find_max_hpa, no interleave_ways found\n");
>> +		return 0;
>> +	}
>> +
>> +	/*
>> +	 * Walk the root decoder resource range relying on cxl_region_rwsem to
>> +	 * preclude sibling arrival/departure and find the largest free space
>> +	 * gap.
>> +	 */
>> +	lockdep_assert_held_read(&cxl_region_rwsem);
>> +	max = 0;
>> +	res = cxlrd->res->child;
>> +	if (!res)
>> +		max = resource_size(cxlrd->res);
>> +	else
>> +		max = 0;
>> +
>> +	for (prev = NULL; res; prev = res, res = res->sibling) {
>> +		struct resource *next = res->sibling;
>> +		resource_size_t free = 0;
>> +
>> +		if (!prev && res->start > cxlrd->res->start) {
>> +			free = res->start - cxlrd->res->start;
>> +			max = max(free, max);
>> +		}
>> +		if (prev && res->start > prev->end + 1) {
>> +			free = res->start - prev->end + 1;
>> +			max = max(free, max);
>> +		}
>> +		if (next && res->end + 1 < next->start) {
>> +			free = next->start - res->end + 1;
>> +			max = max(free, max);
>> +		}
>> +		if (!next && res->end + 1 < cxlrd->res->end + 1) {
>> +			free = cxlrd->res->end + 1 - res->end + 1;
>> +			max = max(free, max);
>> +		}
>> +	}
>> +
>> +	if (max > ctx->max_hpa) {
>> +		if (ctx->cxlrd)
>> +			put_device(CXLRD_DEV(ctx->cxlrd));
>> +		get_device(CXLRD_DEV(cxlrd));
>> +		ctx->cxlrd = cxlrd;
>> +		ctx->max_hpa = max;
>> +		dev_info(CXLRD_DEV(cxlrd), "found %pa bytes of free space\n", &max);
> dev_dbg()
>
>> +	}
>> +	return 0;
>> +}
>> +
>> +/**
>> + * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
>> + * @endpoint: an endpoint that is mapped by the returned decoder
>> + * @interleave_ways: number of entries in @host_bridges
>> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
>> + * @max: output parameter of bytes available in the returned decoder
> @available_size
> or something along those lines. I'd expect max to be the end address of the available
> region
>
>> + *
>> + * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
>> + * is a point in time snapshot. If by the time the caller goes to use this root
>> + * decoder's capacity the capacity is reduced then caller needs to loop and
>> + * retry.
>> + *
>> + * The returned root decoder has an elevated reference count that needs to be
>> + * put with put_device(cxlrd_dev(cxlrd)). Locking context is with
>> + * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
>> + * does not race.
>> + */
>> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>> +					       int interleave_ways,
>> +					       unsigned long flags,
>> +					       resource_size_t *max)
>> +{
>> +
>> +	struct cxlrd_max_context ctx = {
>> +		.host_bridges = &endpoint->host_bridge,
>> +		.interleave_ways = interleave_ways,
>> +		.flags = flags,
>> +	};
>> +	struct cxl_port *root_port;
>> +	struct cxl_root *root;
>> +
>> +	if (!is_cxl_endpoint(endpoint)) {
>> +		dev_dbg(&endpoint->dev, "hpa requestor is not an endpoint\n");
>> +		return ERR_PTR(-EINVAL);
>> +	}
>> +
>> +	root = find_cxl_root(endpoint);
>> +	if (!root) {
>> +		dev_dbg(&endpoint->dev, "endpoint can not be related to a root port\n");
>> +		return ERR_PTR(-ENXIO);
>> +	}
>> +
>> +	root_port = &root->port;
>> +	down_read(&cxl_region_rwsem);
>> +	device_for_each_child(&root_port->dev, &ctx, find_max_hpa);
>> +	up_read(&cxl_region_rwsem);
>> +	put_device(&root_port->dev);
>> +
>> +	if (!ctx.cxlrd)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	*max = ctx.max_hpa;
> Rename max_hpa to available_hpa.
>
>> +	return ctx.cxlrd;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_get_hpa_freespace, CXL);
>> +
>> +
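
For reference, the free-space walk quoted above can be modelled in
userspace as below. This is a sketch under stated assumptions, not the
patch: struct demo_range and demo_max_free_gap() are hypothetical names,
ranges use an inclusive end like struct resource, and the children are
assumed sorted, non-overlapping and contained in the window. Note that
the gap between two inclusive ranges is next.start - (prev.end + 1), so
the parentheses matter.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_range {
	uint64_t start;
	uint64_t end; /* inclusive, like struct resource */
};

/*
 * Largest free gap inside 'window', given 'n' child ranges sorted by
 * start address, non-overlapping and fully inside the window.
 * Assumes window.end < UINT64_MAX so that 'end + 1' cannot wrap.
 */
uint64_t demo_max_free_gap(struct demo_range window,
			   const struct demo_range *child, size_t n)
{
	uint64_t cursor = window.start; /* first byte not yet accounted for */
	uint64_t best = 0;

	for (size_t i = 0; i < n; i++) {
		/* Gap between the previous allocation and this one. */
		if (child[i].start > cursor && child[i].start - cursor > best)
			best = child[i].start - cursor;
		cursor = child[i].end + 1;
	}

	/* Tail gap; with no children this is the whole window. */
	if (window.end + 1 > cursor && window.end + 1 - cursor > best)
		best = window.end + 1 - cursor;

	return best;
}
```

Tracking a single cursor also avoids measuring the same gap twice, once
as "after prev" and once as "before next".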

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 10/15] cxl: define a driver interface for DPA allocation
  2024-08-04 18:07   ` Jonathan Cameron
@ 2024-08-19 15:52     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 15:52 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 19:07, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:30 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Region creation involves finding available DPA (device-physical-address)
>> capacity to map into HPA (host-physical-address) space. Given the HPA
>> capacity constraint, define an API, cxl_request_dpa(), that has the
>> flexibility to  map the minimum amount of memory the driver needs to
>> operate vs the total possible that can be mapped given HPA availability.
>>
>> Factor out the core of cxl_dpa_alloc, that does free space scanning,
>> into a cxl_dpa_freespace() helper, and use that to balance the capacity
>> available to map vs the @min and @max arguments to cxl_request_dpa.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m4271ee49a91615c8af54e3ab20679f8be3099393
>>
> Use the permalink link under these to get shorter links.
> https://lore.kernel.org/linux-cxl/168592158743.1948938.7622563891193802610.stgit@dwillia2-xfh.jf.intel.com/
> goes to the same patch.


Will do.


>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>
>> +
>> +int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>> +{
>> +	struct cxl_port *port = cxled_to_port(cxled);
>> +	struct device *dev = &cxled->cxld.dev;
>> +	resource_size_t start, avail, skip;
>> +	int rc;
>> +
>> +	down_write(&cxl_dpa_rwsem);
> Some cleanup.h magic would help here by allowing early returns.
> Needs the scoped lock though to ensure it's released before the
> devm_add_action_or_reset() as I'd guess we will deadlock otherwise
> if that fails.


Yes, I'll try to use it to make the code cleaner.
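
A minimal userspace sketch of the scoped-lock shape, assuming GCC or
Clang, since it relies on the cleanup variable attribute that the kernel's
cleanup.h helpers are built on. The toy lock, DEMO_GUARD and every demo_
function are hypothetical. What it illustrates is that early returns drop
the lock automatically and that the inner scope ends before the follow-up
step, which stands in for devm_add_action_or_reset().

```c
#include <errno.h>
#include <stdbool.h>

/* Toy lock: counts holders so the sketch can check it was released. */
struct demo_lock {
	int held;
};

static void demo_lock_acquire(struct demo_lock *l) { l->held++; }
static void demo_lock_release(struct demo_lock *l) { l->held--; }

/* Cleanup handler: receives a pointer to the annotated variable. */
static void demo_unlock_cleanup(struct demo_lock **lp)
{
	if (*lp)
		demo_lock_release(*lp);
}

/* Take 'lock' now and release it when 'name' goes out of scope. */
#define DEMO_GUARD(name, lock)                                             \
	struct demo_lock *name                                             \
		__attribute__((cleanup(demo_unlock_cleanup))) = (lock);    \
	demo_lock_acquire(name)

/* Stand-in for a step whose failure path takes the same lock. */
static int demo_register_release_action(struct demo_lock *l)
{
	return l->held ? -EDEADLK : 0;
}

int demo_dpa_alloc(struct demo_lock *l, bool busy,
		   unsigned long size, unsigned long avail)
{
	{
		DEMO_GUARD(guard, l);

		if (busy)
			return -EBUSY;  /* lock dropped automatically */
		if (size > avail)
			return -ENOSPC; /* same here */
	} /* scope ends: lock released before the next call */

	return demo_register_release_action(l);
}
```

The explicit inner block plays the role of a scoped guard: without it the
lock would still be held when the release action is registered.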


>> +	if (cxled->cxld.region) {
>> +		dev_dbg(dev, "EBUSY, decoder attached to %s\n",
>> +			     dev_name(&cxled->cxld.region->dev));
>> +		rc = -EBUSY;
>>   		goto out;
>>   	}
>>   
>> +	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
>> +		dev_dbg(dev, "EBUSY, decoder enabled\n");
>> +		rc = -EBUSY;
>> +		goto out;
>> +	}
>> +
>> +	avail = cxl_dpa_freespace(cxled, &start, &skip);
>> +
>>   	if (size > avail) {
>>   		dev_dbg(dev, "%pa exceeds available %s capacity: %pa\n", &size,
>> -			cxl_decoder_mode_name(cxled->mode), &avail);
>> +			     cxled->mode == CXL_DECODER_RAM ? "ram" : "pmem",
>> +			     &avail);
>>   		rc = -ENOSPC;
>>   		goto out;
>>   	}
>> @@ -550,6 +570,99 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>>   	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
>>   }
>>   
>> +static int find_free_decoder(struct device *dev, void *data)
>> +{
>> +	struct cxl_endpoint_decoder *cxled;
>> +	struct cxl_port *port;
>> +
>> +	if (!is_endpoint_decoder(dev))
>> +		return 0;
>> +
>> +	cxled = to_cxl_endpoint_decoder(dev);
>> +	port = cxled_to_port(cxled);
>> +
>> +	if (cxled->cxld.id != port->hdm_end + 1) {
>> +		return 0;
> No brackets


Sure.


>> +	}
>> +	return 1;
>> +}
>> +
>> +/**
>> + * cxl_request_dpa - search and reserve DPA given input constraints
>> + * @endpoint: an endpoint port with available decoders
>> + * @mode: DPA operation mode (ram vs pmem)
>> + * @min: the minimum amount of capacity the call needs
>> + * @max: extra capacity to allocate after min is satisfied
>> + *
>> + * Given that a region needs to allocate from limited HPA capacity it
>> + * may be the case that a device has more mappable DPA capacity than
>> + * available HPA. So, the expectation is that @min is a driver known
>> + * value for how much capacity is needed, and @max is based the limit of
>> + * how much HPA space is available for a new region.
> We are going to need a policy control on the max value.
> Otherwise, if you have two devices that support huge capacity and
> not enough space, who gets it will just be a race.
>
> Not a problem for now though!


I agree. If CXL ends up being what we hope, these races will need to be 
better handled.
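
To illustrate how @min and @max interact with the free capacity, here is
a small userspace sketch. It is an assumption about the sizing rule, not
the code in the patch: demo_pick_dpa_size() is a hypothetical helper,
@max is treated as an upper bound, and the error codes are chosen by
analogy with the quoted cxl_dpa_alloc().

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_SZ_256M (UINT64_C(256) << 20)

/*
 * Pick how much DPA to reserve given 'avail' free bytes.
 * On success stores the size in *alloc and returns 0.
 */
int demo_pick_dpa_size(uint64_t avail, uint64_t min, uint64_t max,
		       uint64_t *alloc)
{
	/* Same alignment rule as the IS_ALIGNED(min | max, SZ_256M) check. */
	if (((min | max) & (DEMO_SZ_256M - 1)) || min > max)
		return -EINVAL;

	/* The driver cannot operate with less than @min. */
	if (avail < min)
		return -ENOSPC;

	/* Take as much as is free, but never more than @max. */
	*alloc = avail < max ? avail : max;
	return 0;
}
```

With two devices asking for a large @max, whoever calls first takes the
space, which is exactly the race a policy control would have to settle.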

Thanks!


>> + *
>> + * Returns a pinned cxl_decoder with at least @min bytes of capacity
>> + * reserved, or an error pointer. The caller is also expected to own the
>> + * lifetime of the memdev registration associated with the endpoint to
>> + * pin the decoder registered as well.
>> + */
>
>

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 10/15] cxl: define a driver interface for DPA allocation
  2024-08-06 17:33   ` Fan Ni
@ 2024-08-19 15:57     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 15:57 UTC (permalink / raw)
  To: Fan Ni, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/6/24 18:33, Fan Ni wrote:
> On Mon, Jul 15, 2024 at 06:28:30PM +0100, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Region creation involves finding available DPA (device-physical-address)
>> capacity to map into HPA (host-physical-address) space. Given the HPA
>> capacity constraint, define an API, cxl_request_dpa(), that has the
>> flexibility to  map the minimum amount of memory the driver needs to
>> operate vs the total possible that can be mapped given HPA availability.
>>
>> Factor out the core of cxl_dpa_alloc, that does free space scanning,
>> into a cxl_dpa_freespace() helper, and use that to balance the capacity
>> available to map vs the @min and @max arguments to cxl_request_dpa.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m4271ee49a91615c8af54e3ab20679f8be3099393
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>   drivers/cxl/core/core.h            |   1 +
>>   drivers/cxl/core/hdm.c             | 153 +++++++++++++++++++++++++----
>>   drivers/net/ethernet/sfc/efx.c     |   2 +
>>   drivers/net/ethernet/sfc/efx_cxl.c |  18 +++-
>>   drivers/net/ethernet/sfc/efx_cxl.h |   1 +
>>   include/linux/cxl_accel_mem.h      |   7 ++
>>   6 files changed, 161 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
>> index 625394486459..a243ff12c0f4 100644
>> --- a/drivers/cxl/core/core.h
>> +++ b/drivers/cxl/core/core.h
>> @@ -76,6 +76,7 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
>>   		     enum cxl_decoder_mode mode);
>>   int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size);
>>   int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
>> +int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
> Function declared twice here.


I'll fix it.

Thanks!


> Fan
>>   resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
>>   resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled);
>>   
>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>> index 4af9225d4b59..3e53ae222d40 100644
>> --- a/drivers/cxl/core/hdm.c
>> +++ b/drivers/cxl/core/hdm.c
>> @@ -3,6 +3,7 @@
>>   #include <linux/seq_file.h>
>>   #include <linux/device.h>
>>   #include <linux/delay.h>
>> +#include <linux/cxl_accel_mem.h>
>>   
>>   #include "cxlmem.h"
>>   #include "core.h"
>> @@ -420,6 +421,7 @@ int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
>>   	up_write(&cxl_dpa_rwsem);
>>   	return rc;
>>   }
>> +EXPORT_SYMBOL_NS_GPL(cxl_dpa_free, CXL);
>>   
>>   int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
>>   		     enum cxl_decoder_mode mode)
>> @@ -467,30 +469,17 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
>>   	return rc;
>>   }
>>   
>> -int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>> +static resource_size_t cxl_dpa_freespace(struct cxl_endpoint_decoder *cxled,
>> +					 resource_size_t *start_out,
>> +					 resource_size_t *skip_out)
>>   {
>>   	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>>   	resource_size_t free_ram_start, free_pmem_start;
>> -	struct cxl_port *port = cxled_to_port(cxled);
>>   	struct cxl_dev_state *cxlds = cxlmd->cxlds;
>> -	struct device *dev = &cxled->cxld.dev;
>>   	resource_size_t start, avail, skip;
>>   	struct resource *p, *last;
>> -	int rc;
>> -
>> -	down_write(&cxl_dpa_rwsem);
>> -	if (cxled->cxld.region) {
>> -		dev_dbg(dev, "decoder attached to %s\n",
>> -			dev_name(&cxled->cxld.region->dev));
>> -		rc = -EBUSY;
>> -		goto out;
>> -	}
>>   
>> -	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
>> -		dev_dbg(dev, "decoder enabled\n");
>> -		rc = -EBUSY;
>> -		goto out;
>> -	}
>> +	lockdep_assert_held(&cxl_dpa_rwsem);
>>   
>>   	for (p = cxlds->ram_res.child, last = NULL; p; p = p->sibling)
>>   		last = p;
>> @@ -528,14 +517,45 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>>   			skip_end = start - 1;
>>   		skip = skip_end - skip_start + 1;
>>   	} else {
>> -		dev_dbg(dev, "mode not set\n");
>> -		rc = -EINVAL;
>> +		avail = 0;
>> +	}
>> +
>> +	if (!avail)
>> +		return 0;
>> +	if (start_out)
>> +		*start_out = start;
>> +	if (skip_out)
>> +		*skip_out = skip;
>> +	return avail;
>> +}
>> +
>> +int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>> +{
>> +	struct cxl_port *port = cxled_to_port(cxled);
>> +	struct device *dev = &cxled->cxld.dev;
>> +	resource_size_t start, avail, skip;
>> +	int rc;
>> +
>> +	down_write(&cxl_dpa_rwsem);
>> +	if (cxled->cxld.region) {
>> +		dev_dbg(dev, "EBUSY, decoder attached to %s\n",
>> +			     dev_name(&cxled->cxld.region->dev));
>> +		rc = -EBUSY;
>>   		goto out;
>>   	}
>>   
>> +	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
>> +		dev_dbg(dev, "EBUSY, decoder enabled\n");
>> +		rc = -EBUSY;
>> +		goto out;
>> +	}
>> +
>> +	avail = cxl_dpa_freespace(cxled, &start, &skip);
>> +
>>   	if (size > avail) {
>>   		dev_dbg(dev, "%pa exceeds available %s capacity: %pa\n", &size,
>> -			cxl_decoder_mode_name(cxled->mode), &avail);
>> +			     cxled->mode == CXL_DECODER_RAM ? "ram" : "pmem",
>> +			     &avail);
>>   		rc = -ENOSPC;
>>   		goto out;
>>   	}
>> @@ -550,6 +570,99 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>>   	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
>>   }
>>   
>> +static int find_free_decoder(struct device *dev, void *data)
>> +{
>> +	struct cxl_endpoint_decoder *cxled;
>> +	struct cxl_port *port;
>> +
>> +	if (!is_endpoint_decoder(dev))
>> +		return 0;
>> +
>> +	cxled = to_cxl_endpoint_decoder(dev);
>> +	port = cxled_to_port(cxled);
>> +
>> +	if (cxled->cxld.id != port->hdm_end + 1) {
>> +		return 0;
>> +	}
>> +	return 1;
>> +}
>> +
>> +/**
>> + * cxl_request_dpa - search and reserve DPA given input constraints
>> + * @endpoint: an endpoint port with available decoders
>> + * @mode: DPA operation mode (ram vs pmem)
>> + * @min: the minimum amount of capacity the call needs
>> + * @max: extra capacity to allocate after min is satisfied
>> + *
>> + * Given that a region needs to allocate from limited HPA capacity it
>> + * may be the case that a device has more mappable DPA capacity than
>> + * available HPA. So, the expectation is that @min is a driver known
>> + * value for how much capacity is needed, and @max is based the limit of
>> + * how much HPA space is available for a new region.
>> + *
>> + * Returns a pinned cxl_decoder with at least @min bytes of capacity
>> + * reserved, or an error pointer. The caller is also expected to own the
>> + * lifetime of the memdev registration associated with the endpoint to
>> + * pin the decoder registered as well.
>> + */
>> +struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
>> +					     bool is_ram,
>> +					     resource_size_t min,
>> +					     resource_size_t max)
>> +{
>> +	struct cxl_endpoint_decoder *cxled;
>> +	enum cxl_decoder_mode mode;
>> +	struct device *cxled_dev;
>> +	resource_size_t alloc;
>> +	int rc;
>> +
>> +	if (!IS_ALIGNED(min | max, SZ_256M))
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	down_read(&cxl_dpa_rwsem);
>> +
>> +	cxled_dev = device_find_child(&endpoint->dev, NULL, find_free_decoder);
>> +	if (!cxled_dev)
>> +		cxled = ERR_PTR(-ENXIO);
>> +	else
>> +		cxled = to_cxl_endpoint_decoder(cxled_dev);
>> +
>> +	up_read(&cxl_dpa_rwsem);
>> +
>> +	if (IS_ERR(cxled))
>> +		return cxled;
>> +
>> +	if (is_ram)
>> +		mode = CXL_DECODER_RAM;
>> +	else
>> +		mode = CXL_DECODER_PMEM;
>> +
>> +	rc = cxl_dpa_set_mode(cxled, mode);
>> +	if (rc)
>> +		goto err;
>> +
>> +	down_read(&cxl_dpa_rwsem);
>> +	alloc = cxl_dpa_freespace(cxled, NULL, NULL);
>> +	up_read(&cxl_dpa_rwsem);
>> +
>> +	if (max)
>> +		alloc = min(max, alloc);
>> +	if (alloc < min) {
>> +		rc = -ENOMEM;
>> +		goto err;
>> +	}
>> +
>> +	rc = cxl_dpa_alloc(cxled, alloc);
>> +	if (rc)
>> +		goto err;
>> +
>> +	return cxled;
>> +err:
>> +	put_device(cxled_dev);
>> +	return ERR_PTR(rc);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_request_dpa, CXL);
>> +
>>   static void cxld_set_interleave(struct cxl_decoder *cxld, u32 *ctrl)
>>   {
>>   	u16 eig;
>> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
>> index cb3f74d30852..9cfe29002d98 100644
>> --- a/drivers/net/ethernet/sfc/efx.c
>> +++ b/drivers/net/ethernet/sfc/efx.c
>> @@ -901,6 +901,8 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>>   
>>   	efx_fini_io(efx);
>>   
>> +	efx_cxl_exit(efx);
>> +
>>   	pci_dbg(efx->pci_dev, "shutdown successful\n");
>>   
>>   	efx_fini_devlink_and_unlock(efx);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 6d49571ccff7..b5626d724b52 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -84,12 +84,28 @@ void efx_cxl_init(struct efx_nic *efx)
>>   		goto out;
>>   	}
>>   
>> -	if (max < EFX_CTPIO_BUFFER_SIZE)
>> +	if (max < EFX_CTPIO_BUFFER_SIZE) {
>>   		pci_info(pci_dev, "CXL accel not enough free HPA space %llu < %u\n",
>>   				  max, EFX_CTPIO_BUFFER_SIZE);
>> +		goto out;
>> +	}
>> +
>> +	cxl->cxled = cxl_request_dpa(cxl->endpoint, true, EFX_CTPIO_BUFFER_SIZE,
>> +				     EFX_CTPIO_BUFFER_SIZE);
>> +	if (IS_ERR(cxl->cxled))
>> +		pci_info(pci_dev, "CXL accel request DPA failed");
>>   out:
>>   	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>>   }
>>   
>> +void efx_cxl_exit(struct efx_nic *efx)
>> +{
>> +	struct efx_cxl *cxl = efx->cxl;
>> +
>> +	if (cxl->cxled)
>> +		cxl_dpa_free(cxl->cxled);
>> +
>> + 	return;
>> + }
>>   
>>   MODULE_IMPORT_NS(CXL);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
>> index 76c6794c20d8..59d5217a684c 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.h
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
>> @@ -26,4 +26,5 @@ struct efx_cxl {
>>   };
>>   
>>   void efx_cxl_init(struct efx_nic *efx);
>> +void efx_cxl_exit(struct efx_nic *efx);
>>   #endif
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index f3e77688ffe0..d4ecb5bb4fc8 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -2,6 +2,7 @@
>>   /* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>>   
>>   #include <linux/cdev.h>
>> +#include <linux/pci.h>
>>   
>>   #ifndef __CXL_ACCEL_MEM_H
>>   #define __CXL_ACCEL_MEM_H
>> @@ -41,4 +42,10 @@ struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>>   					       int interleave_ways,
>>   					       unsigned long flags,
>>   					       resource_size_t *max);
>> +
>> +struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
>> +					     bool is_ram,
>> +					     resource_size_t min,
>> +					     resource_size_t max);
>> +int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
>>   #endif
>> -- 
>> 2.17.1
>>


* Re: [PATCH v2 12/15] cxl: allow region creation by type2 drivers
  2024-08-04 18:29   ` Jonathan Cameron
@ 2024-08-19 16:11     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 16:11 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 19:29, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:32 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Creating a CXL region requires userspace intervention through the cxl
>> sysfs files. Type2 support should allow accelerator drivers to create
>> such cxl region from kernel code.
>>
>> Adding that functionality and integrating it with current support for
>> memory expanders.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m84598b534cc5664f5bb31521ba6e41c7bc213758
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> Needs a co-developed or similar given Dan didn't email this patch
> (which this sign off list suggests he did).


Yes, I'll fix it.


>
> I'll take another look at the locking, but my main comment is
> that it is really confusing so I have no idea if it's right.
> Consider different ways of breaking up the code you need
> to try and keep the locking obvious.


I have to agree, and this means I need to work on it. I know it works for
my case, which was my main focus for the RFC, but I have not looked at it
with the right mindset.

I take your next comments as valuable inputs for the required work.

Thanks!
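
For reference, the scoped_guard() suggestion can be illustrated outside the
kernel. The kernel helpers in <linux/cleanup.h> are built on the compiler's
cleanup attribute; the sketch below uses that attribute directly with a mock
lock. Every demo_* name and the DEMO_GUARD_READ macro are hypothetical
stand-ins, not CXL core code, and the block needs GCC or Clang.

```c
#include <stddef.h>

/* Mock reader lock standing in for a kernel rwsem; illustration only. */
struct demo_rwsem {
	int readers;
};

static struct demo_rwsem *demo_down_read(struct demo_rwsem *sem)
{
	sem->readers++;
	return sem;
}

/* Cleanup handler: called automatically when the guard leaves scope. */
static void demo_up_read(struct demo_rwsem **guard)
{
	if (*guard)
		(*guard)->readers--;
}

/* Take the lock now and release it on every exit path of the scope. */
#define DEMO_GUARD_READ(name, sem) \
	struct demo_rwsem *name __attribute__((cleanup(demo_up_read))) = \
		demo_down_read(sem)

/* Sum sizes under the lock; the early return needs no explicit unlock. */
static long demo_sum_sizes(struct demo_rwsem *sem, const long *sizes, size_t n)
{
	long total = 0;

	DEMO_GUARD_READ(guard, sem);
	for (size_t i = 0; i < n; i++) {
		if (sizes[i] <= 0)
			return -1;	/* lock dropped by demo_up_read() */
		total += sizes[i];
	}
	return total;
}
```

With this shape there is no "goto out" label shared by paths that hold the
lock and paths that do not, which is the confusion raised in the review.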


> Jonathan
>
>> +
>> +static ssize_t interleave_ways_store(struct device *dev,
>> +				     struct device_attribute *attr,
>> +				     const char *buf, size_t len)
>> +{
>> +	struct cxl_region *cxlr = to_cxl_region(dev);
>> +	unsigned int val;
>> +	int rc;
>> +
>> +	rc = kstrtouint(buf, 0, &val);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = down_write_killable(&cxl_region_rwsem);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = set_interleave_ways(cxlr, val);
>>   	up_write(&cxl_region_rwsem);
>>   	if (rc)
>>   		return rc;
>>   	return len;
>>   }
>> +
> This was probably intentional. Common to group a macro like this
> with the function it is using by not having a blank line.
>>   static DEVICE_ATTR_RW(interleave_ways);
>>   
>>   static ssize_t interleave_granularity_show(struct device *dev,
>> @@ -547,21 +556,14 @@ static ssize_t interleave_granularity_show(struct device *dev,
>>   	return rc;
>>   }
>> +static ssize_t interleave_granularity_store(struct device *dev,
>> +					    struct device_attribute *attr,
>> +					    const char *buf, size_t len)
>> +{
>> +	struct cxl_region *cxlr = to_cxl_region(dev);
>> +	int rc, val;
>> +
>> +	rc = kstrtoint(buf, 0, &val);
>> +	if (rc)
>> +		return rc;
>> +
>>   	rc = down_write_killable(&cxl_region_rwsem);
>>   	if (rc)
>>   		return rc;
>> -	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>> -		rc = -EBUSY;
>> -		goto out;
>> -	}
>>   
>> -	p->interleave_granularity = val;
>> -out:
>> +	rc = set_interleave_granularity(cxlr, val);
>>   	up_write(&cxl_region_rwsem);
>>   	if (rc)
>>   		return rc;
>>   	return len;
>>   }
>> +
> grump.
>
>>   static DEVICE_ATTR_RW(interleave_granularity);
>> +/* Establish an empty region covering the given HPA range */
>> +static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>> +					   struct cxl_endpoint_decoder *cxled)
>> +{
>> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>> +	struct range *hpa = &cxled->cxld.hpa_range;
>> +	struct cxl_region_params *p;
>> +	struct cxl_region *cxlr;
>> +	struct resource *res;
>> +	int rc;
>> +
>> +	cxlr = construct_region_begin(cxlrd, cxled);
>> +	if (IS_ERR(cxlr))
>> +		return cxlr;
>>   
>>   	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>>   
>>   	res = kmalloc(sizeof(*res), GFP_KERNEL);
>>   	if (!res) {
>>   		rc = -ENOMEM;
>> -		goto err;
>> +		goto out;
>>   	}
>>   
>>   	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
>>   				    dev_name(&cxlr->dev));
>> +
>>   	rc = insert_resource(cxlrd->res, res);
>>   	if (rc) {
>>   		/*
>> @@ -3412,6 +3462,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>   			 __func__, dev_name(&cxlr->dev));
>>   	}
>>   
>> +	p = &cxlr->params;
>>   	p->res = res;
>>   	p->interleave_ways = cxled->cxld.interleave_ways;
>>   	p->interleave_granularity = cxled->cxld.interleave_granularity;
>> @@ -3419,24 +3470,124 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>   
>>   	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
>>   	if (rc)
>> -		goto err;
>> +		goto out;
>>   
>>   	dev_dbg(cxlmd->dev.parent, "%s:%s: %s %s res: %pr iw: %d ig: %d\n",
>> -		dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), __func__,
>> -		dev_name(&cxlr->dev), p->res, p->interleave_ways,
>> -		p->interleave_granularity);
>> +				   dev_name(&cxlmd->dev),
>> +				   dev_name(&cxled->cxld.dev), __func__,
>> +				   dev_name(&cxlr->dev), p->res,
>> +				   p->interleave_ways,
>> +				   p->interleave_granularity);
>>   
>>   	/* ...to match put_device() in cxl_add_to_region() */
>>   	get_device(&cxlr->dev);
>>   	up_write(&cxl_region_rwsem);
>> +out:
>> +	construct_region_end();
> two calls to up_write(&cxl_region_rwsem) next to each other?
>
>> +	if (rc) {
>> +		drop_region(cxlr);
>> +		return ERR_PTR(rc);
>> +	}
>> +	return cxlr;
>> +}
>> +
>> +static struct cxl_region *
>> +__construct_new_region(struct cxl_root_decoder *cxlrd,
>> +		       struct cxl_endpoint_decoder **cxled, int ways)
>> +{
>> +	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
>> +	struct cxl_region_params *p;
>> +	resource_size_t size = 0;
>> +	struct cxl_region *cxlr;
>> +	int rc, i;
>> +
>> +	/* If interleaving is not supported, why does ways need to be at least 1? */
> I think 1 means no interleave. It's simpler to do this than have 0 and 1 both
> mean no interleave because 1 works for programmable decoders.
>
>> +	if (ways < 1)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	cxlr = construct_region_begin(cxlrd, cxled[0]);
> Rethink how this is broken up.  Taking the cxl_dpa_rwsem
> inside this function is really hard to follow.  Ideally
> manage it with scoped_guard()
>
>
>> +	if (IS_ERR(cxlr))
>> +		return cxlr;
>> +
>> +	rc = set_interleave_ways(cxlr, ways);
>> +	if (rc)
>> +		goto out;
>> +
>> +	rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
>> +	if (rc)
> here I think cxl_dpa_rwsem is held.
>> +		goto out;
>> +
>> +	down_read(&cxl_dpa_rwsem);
>> +	for (i = 0; i < ways; i++) {
>> +		if (!cxled[i]->dpa_res)
>> +			break;
>> +		size += resource_size(cxled[i]->dpa_res);
>> +	}
>> +	up_read(&cxl_dpa_rwsem);
>> +
>> +	if (i < ways)
> but not here and they go to the same place.
>
>> +		goto out;
>> +
>> +	rc = alloc_hpa(cxlr, size);
>> +	if (rc)
>> +		goto out;
>> +
>> +	down_read(&cxl_dpa_rwsem);
>> +	for (i = 0; i < ways; i++) {
>> +		rc = cxl_region_attach(cxlr, cxled[i], i);
>> +		if (rc)
>> +			break;
>> +	}
>> +	up_read(&cxl_dpa_rwsem);
>> +
>> +	if (rc)
>> +		goto out;
>> +
>> +	rc = cxl_region_decode_commit(cxlr);
>> +	if (rc)
>> +		goto out;
>>   
>> +	p = &cxlr->params;
>> +	p->state = CXL_CONFIG_COMMIT;
>> +out:
>> +	construct_region_end();
>> +	if (rc) {
>> +		drop_region(cxlr);
>> +		return ERR_PTR(rc);
>> +	}
>>   	return cxlr;
>> +}
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index a0e0795ec064..377bb3cd2d47 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -881,5 +881,7 @@ struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>>   					       int interleave_ways,
>>   					       unsigned long flags,
>>   					       resource_size_t *max);
>> -
> Avoid whitespace noise.
>
>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>> +				     struct cxl_endpoint_decoder **cxled,
>> +				     int ways);
>>   #endif /* __CXL_MEM_H__ */


* Re: [PATCH v2 14/15] cxl: add function for obtaining params from a region
  2024-08-09 15:24   ` Zhi Wang
@ 2024-08-19 16:14     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 16:14 UTC (permalink / raw)
  To: Zhi Wang, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, targupta


On 8/9/24 16:24, Zhi Wang wrote:
> On Mon, 15 Jul 2024 18:28:34 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> A CXL region struct contains the physical address to work with.
>>
>> Add a function that, given an opaque cxl region struct, returns the params
>> to be used for mapping such memory range.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/region.c     | 16 ++++++++++++++++
>>   drivers/cxl/cxl.h             |  3 +++
>>   include/linux/cxl_accel_mem.h |  2 ++
>>   3 files changed, 21 insertions(+)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index c8fc14ac437e..9ff10923e9fc 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -3345,6 +3345,22 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
>>   	return rc;
>>   }
>>   
>> +int cxl_accel_get_region_params(struct cxl_region *region,
>> +				resource_size_t *start, resource_size_t *end)
>> +{
>> +	if (!region)
>> +		return -ENODEV;
>> +
>> +	if (!region->params.res) {
>> +		return -ENODEV;
>> +	}
> Remove the extra {}
>

Sure.

Thanks!


>> +	*start = region->params.res->start;
>> +	*end = region->params.res->end;
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_get_region_params, CXL);
>> +
>>   static int match_root_decoder_by_range(struct device *dev, void *data)
>>   {
>>   	struct range *r1, *r2 = data;
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 1bf3b74ff959..b4c4c4455ef1 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -906,6 +906,9 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>>   bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>>   int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
>> +
>> +int cxl_accel_get_region_params(struct cxl_region *region,
>> +				resource_size_t *start, resource_size_t *end);
>>   /*
>>    * Unit test builds overrides this to __weak, find the 'strong' version
>>    * of these symbols in tools/testing/cxl/.
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index a5f9ffc24509..5d715eea6e91 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -53,4 +53,6 @@ struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>>   				     int ways);
>>   
>>   int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
>> +int cxl_accel_get_region_params(struct cxl_region *region,
>> +				resource_size_t *start, resource_size_t *end);
>>   #endif


* Re: [PATCH v2 15/15] efx: support pio mapping based on cxl
  2024-08-04 18:13   ` Jonathan Cameron
@ 2024-08-19 16:28     ` Alejandro Lucero Palau
  2024-08-27 15:23       ` Jonathan Cameron
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-19 16:28 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 19:13, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:35 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> With a device supporting CXL and successfully initialised, use the cxl
>> region to map the memory range and use this mapping for PIO buffers.
> This explains why you weren't worried about any step of the CXL
> code failing and why that wasn't a 'bug' as such.
>
> I'd argue that you should still have the cxl initialization return
> an error code and clean up any state if it hits an error.


Ideally, but with devm* being used, this is not easy to do if the error 
is not fatal.


> Then the top level driver can of course elect to use an alternative
> path given that failure.  Logically it belongs there rather than relying
> on a buffer being mapped or not.
>

The same driver needs to support the same functionality, which relies on
those specific hardware buffers.

The functionality is expected to be there with or without CXL. If either
the system or the device lacks CXL support, the functionality will be
provided through legacy PCIe BAR regions. The green light for CXL use
comes from two sources: the firmware and the kernel. Both need to give
the thumbs up. If not, legacy PCIe BAR regions will be used.
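
The two-source gate described above can be sketched as a small userspace
model. This is only an illustration under stated assumptions: struct
demo_nic, DEMO_CAP_CXL_CONFIG_ENABLE and the demo_* functions are
hypothetical names, and the real driver reads the capability from the MCDI
FLAGS3 word quoted further down.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bit standing in for the firmware's CXL flag. */
#define DEMO_CAP_CXL_CONFIG_ENABLE (1u << 0)

struct demo_nic {
	uint32_t datapath_caps3;	/* capabilities reported by firmware */
	uint8_t *cxl_pio_map;		/* non-NULL only if kernel CXL setup worked */
	uint8_t *bar_pio_map;		/* legacy PCIe BAR mapping */
};

/* CXL is used only when both firmware and kernel agree. */
static bool demo_use_cxl(const struct demo_nic *nic)
{
	return (nic->datapath_caps3 & DEMO_CAP_CXL_CONFIG_ENABLE) &&
	       nic->cxl_pio_map != NULL;
}

/* Pick the PIO base; anything short of full agreement falls back to the BAR. */
static uint8_t *demo_pio_base(const struct demo_nic *nic, size_t offset)
{
	uint8_t *base = demo_use_cxl(nic) ? nic->cxl_pio_map : nic->bar_pio_map;

	return base + offset;
}
```

Either source withholding approval selects the BAR mapping, so callers of
demo_pio_base() never need to know which path is active.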


>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/net/ethernet/sfc/ef10.c      | 25 +++++++++++++++++++++----
>>   drivers/net/ethernet/sfc/efx_cxl.c   | 12 +++++++++++-
>>   drivers/net/ethernet/sfc/mcdi_pcol.h |  3 +++
>>   drivers/net/ethernet/sfc/nic.h       |  1 +
>>   4 files changed, 36 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
>> index 8fa6c0e9195b..3924076d2628 100644
>> --- a/drivers/net/ethernet/sfc/ef10.c
>> +++ b/drivers/net/ethernet/sfc/ef10.c
>> @@ -24,6 +24,7 @@
>>   #include <linux/wait.h>
>>   #include <linux/workqueue.h>
>>   #include <net/udp_tunnel.h>
>> +#include "efx_cxl.h"
>>   
>>   /* Hardware control for EF10 architecture including 'Huntington'. */
>>   
>> @@ -177,6 +178,12 @@ static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
>>   			  efx->num_mac_stats);
>>   	}
>>   
>> +	if (outlen < MC_CMD_GET_CAPABILITIES_V7_OUT_LEN)
>> +		nic_data->datapath_caps3 = 0;
>> +	else
>> +		nic_data->datapath_caps3 = MCDI_DWORD(outbuf,
>> +						      GET_CAPABILITIES_V7_OUT_FLAGS3);
>> +
>>   	return 0;
>>   }
>>   
>> @@ -1275,10 +1282,20 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
>>   			return -ENOMEM;
>>   		}
>>   		nic_data->pio_write_vi_base = pio_write_vi_base;
>> -		nic_data->pio_write_base =
>> -			nic_data->wc_membase +
>> -			(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
>> -			 uc_mem_map_size);
>> +
>> +		if ((nic_data->datapath_caps3 &
>> +		    (1 << MC_CMD_GET_CAPABILITIES_V10_OUT_CXL_CONFIG_ENABLE_LBN)) &&
>> +		    efx->cxl->ctpio_cxl)
> As per the comment at the top, I'd prefer to see some clean handling of an
> error passed up to the caller of the cxl init that then sets a flag that
> we can clearly see is all about whether we have CXL or not.
>
> Using this buffer mapping is a bit too much of a detail in my opinion.


Yes, maybe that is clearer than relying on the pointer from the CXL 
mapping.

I will do it.

Thanks!
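
A minimal sketch of that shape, assuming hypothetical names throughout
(demo_cxl_init(), demo_probe(), the cxl_pio_in_use flag and the fail_step
test hook are invented for illustration). It deliberately ignores the devm*
complication mentioned earlier and only shows the error return, the unwind,
and the flag set by the caller.

```c
#include <errno.h>
#include <stdbool.h>

struct demo_cxl {
	bool regs_ready;
	bool dpa_reserved;
};

struct demo_efx {
	struct demo_cxl cxl;
	bool cxl_pio_in_use;	/* the one flag the datapath code checks */
};

/*
 * fail_step is a test hook: 0 = succeed, 1 = register setup fails,
 * 2 = DPA request fails. On error, earlier steps are undone.
 */
static int demo_cxl_init(struct demo_cxl *cxl, int fail_step)
{
	int rc;

	if (fail_step == 1)
		return -ENODEV;
	cxl->regs_ready = true;

	if (fail_step == 2) {
		rc = -ENOSPC;
		goto err_regs;
	}
	cxl->dpa_reserved = true;
	return 0;

err_regs:
	cxl->regs_ready = false;
	return rc;
}

/* The caller decides what a CXL failure means: here, use the legacy path. */
static void demo_probe(struct demo_efx *efx, int fail_step)
{
	efx->cxl_pio_in_use = (demo_cxl_init(&efx->cxl, fail_step) == 0);
}
```

The datapath then tests cxl_pio_in_use rather than inferring CXL state from
whether a mapping pointer happens to be set.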


>> +		{
>> +			nic_data->pio_write_base =
>> +				efx->cxl->ctpio_cxl +
>> +				(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
>> +				 uc_mem_map_size);
>> +		} else {
>> +			nic_data->pio_write_base =nic_data->wc_membase +
>> +				(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
>> +				 uc_mem_map_size);
>> +		}
>


* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-19 11:12           ` Alejandro Lucero Palau
@ 2024-08-20 10:44             ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-20 10:44 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes


On 8/19/24 12:12, Alejandro Lucero Palau wrote:
>
> On 8/15/24 17:38, Jonathan Cameron wrote:
>> On Tue, 13 Aug 2024 09:30:08 +0100
>> Alejandro Lucero Palau <alucerop@amd.com> wrote:
>>
>>> On 8/12/24 12:16, Alejandro Lucero Palau wrote:
>>>> On 8/4/24 18:10, Jonathan Cameron wrote:
>>>>> On Mon, 15 Jul 2024 18:28:21 +0100
>>>>> <alejandro.lucero-palau@amd.com> wrote:
>>>>>> From: Alejandro Lucero <alucerop@amd.com>
>>>>>>
>>>>>> Differentiate Type3, aka memory expanders, from Type2, aka device
>>>>>> accelerators, with a new function for initializing cxl_dev_state.
>>>>>>
>>>>>> Create opaque struct to be used by accelerators relying on new 
>>>>>> access
>>>>>> functions in following patches.
>>>>>>
>>>>>> Add SFC ethernet network driver as the client.
>>>>>>
>>>>>> Based on
>>>>>> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e 
>>>>>>
>>>>>>
>>>>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>>>>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>>>>>> +
>>>>>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>>>>>> +{
>>>>>> +    cxlds->cxl_dvsec = dvsec;
>>>>> Nothing to do with accel. If these make sense promote to cxl
>>>>> core and a linux/cxl/ header.  Also we may want the type3 driver to
>>>>> switch to them long term. If nothing else, making that handle the
>>>>> cxl_dev_state as more opaque will show up what is still directly
>>>>> accessed and may need to be wrapped up for a future accelerator 
>>>>> driver
>>>>> to use.
>>>> I will change the function name then, but not sure I follow the
>>>> comment about more opaque ...
>>>>
>>>>
>>> I have second thoughts about this.
>>>
>>>
>>> I consider this as an accessor  for, as you said in a previous 
>>> exchange,
>>> facilitating changes to the core structs without touching those accel
>>> drivers using it.
>>>
>>> Type3 driver is part of the CXL core and easy to change for these kind
>>> of updates since it will only be one driver supporting all Type3, 
>>> and an
>>> accessor is not required then.
>>>
>>> Let me know what you think.
>> It's less critical, but longer term I'd expect any stuff that makes
>> sense for accelerators and the type 3 driver to use the same
>> approaches and code paths.  Makes it easier to see where they
>> are related than opencoding the accesses in the type 3 driver will
>> do.  In the very long term, I'd expect the type 3 driver to just be
>> another CXL driver alongside many others.
>
>
> It makes sense, so I will change the name.
>
> A following patchset when this is hopefully going through will be to 
> use the accessors in the CXL PCI driver.
>
> Thanks!
>

I realize you likely mean all the accessors and not just the dvsec one. 
Right?

Also, I think I could add the changes that make the PCI driver use them
within this patchset.
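
The accessor idea under discussion is the usual C opaque-struct pattern. The
sketch below is a userspace illustration only; struct demo_dev_state and the
demo_* functions are hypothetical and do not reflect the final names or
layout in the CXL core.

```c
#include <stdint.h>
#include <stdlib.h>

/* "Public header" part: client drivers see only an incomplete type. */
struct demo_dev_state;

struct demo_dev_state *demo_state_create(void);
void demo_state_destroy(struct demo_dev_state *st);
void demo_set_dvsec(struct demo_dev_state *st, uint16_t dvsec);
uint16_t demo_get_dvsec(const struct demo_dev_state *st);

/* "Core" part: only this code knows the layout and may change it freely. */
struct demo_dev_state {
	uint16_t dvsec;
	uint64_t serial;
};

struct demo_dev_state *demo_state_create(void)
{
	/* Zero-initialised so unset fields have a defined value. */
	return calloc(1, sizeof(struct demo_dev_state));
}

void demo_state_destroy(struct demo_dev_state *st)
{
	free(st);
}

void demo_set_dvsec(struct demo_dev_state *st, uint16_t dvsec)
{
	st->dvsec = dvsec;
}

uint16_t demo_get_dvsec(const struct demo_dev_state *st)
{
	return st->dvsec;
}
```

If both accelerator drivers and the Type3 driver go through such accessors,
a field can be renamed or moved in one place without touching any client.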


>
>> Jonathan
>>
>>>


* Re: [PATCH v2 03/15] cxl: add function for type2 resource request
  2024-07-15 17:28 ` [PATCH v2 03/15] cxl: add function for type2 resource request alejandro.lucero-palau
  2024-07-18 23:36   ` Dave Jiang
  2024-08-09  9:01   ` Zhi Wang
@ 2024-08-22 13:07   ` Zhi Wang
  2024-08-23  9:30     ` Alejandro Lucero Palau
  2 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-22 13:07 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero,
	targupta, zhiwang

On Mon, 15 Jul 2024 18:28:23 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Create a new function for a type2 device requesting a resource
> passing the opaque struct to work with.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/memdev.c          | 13 +++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.c |  7 ++++++-
>  include/linux/cxl_accel_mem.h      |  1 +
>  3 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 61b5d35b49e7..04c3a0f8bc2e 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -744,6 +744,19 @@ void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
>  
> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram)
> +{
> +	int rc;
> +
> +	if (is_ram)
> +		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
> +	else
> +		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
> +

Hi Alejandro:

Since we only have cxl_accel_request_resource() here, how is
the resource going to be released? e.g. in an error handling path. 

Thanks,
Zhi.
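
To make the question concrete: the kernel pairs request_resource() with
release_resource(), so one option would be a matching release helper. The
block below is a simplified userspace model of that pairing, not the kernel
implementation; struct demo_res, demo_request() and demo_release() are
hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for struct resource: a range in a parent/child tree. */
struct demo_res {
	unsigned long start, end;
	struct demo_res *parent, *child, *sibling;
};

/* Claim a range under root; fails if out of bounds or overlapping a child. */
static int demo_request(struct demo_res *root, struct demo_res *res)
{
	struct demo_res **p;

	if (res->start > res->end ||
	    res->start < root->start || res->end > root->end)
		return -EBUSY;

	for (p = &root->child; *p; p = &(*p)->sibling)
		if (res->start <= (*p)->end && (*p)->start <= res->end)
			return -EBUSY;

	res->parent = root;
	res->sibling = NULL;
	*p = res;	/* append after the last existing child */
	return 0;
}

/* Undo demo_request(); needed on error paths and at driver removal. */
static int demo_release(struct demo_res *res)
{
	struct demo_res **p;

	if (!res->parent)
		return -EINVAL;

	for (p = &res->parent->child; *p; p = &(*p)->sibling) {
		if (*p == res) {
			*p = res->sibling;
			res->parent = NULL;
			res->sibling = NULL;
			return 0;
		}
	}
	return -EINVAL;
}
```

Without the release step the range stays claimed after a failed probe, and a
later request for the same range returns -EBUSY.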

>  static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>  {
>  	struct cxl_memdev *cxlmd =
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 10c4fb915278..9cefcaf3caca 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -48,8 +48,13 @@ void efx_cxl_init(struct efx_nic *efx)
>  	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>  	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>  
> -	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds))
> +	if (cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds)) {
>  		pci_info(pci_dev, "CXL accel setup regs failed");
> +		return;
> +	}
> +
> +	if (cxl_accel_request_resource(cxl->cxlds, true))
> +		pci_info(pci_dev, "CXL accel resource request failed");
>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index ca7af4a9cefc..c7b254edc096 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -20,4 +20,5 @@ void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>  void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  			    enum accel_resource);
>  int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool is_ram);
>  #endif



* Re: [PATCH v2 12/15] cxl: allow region creation by type2 drivers
  2024-07-15 17:28 ` [PATCH v2 12/15] cxl: allow region creation by type2 drivers alejandro.lucero-palau
  2024-08-04 18:29   ` Jonathan Cameron
@ 2024-08-22 13:12   ` Zhi Wang
  2024-08-23  9:31     ` Alejandro Lucero Palau
  1 sibling, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-22 13:12 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero,
	targupta, zhiwang

On Mon, 15 Jul 2024 18:28:32 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Creating a CXL region requires userspace intervention through the cxl
> sysfs files. Type2 support should allow accelerator drivers to create
> such cxl region from kernel code.
> 
> Adding that functionality and integrating it with current support for
> memory expanders.
> 
> Based on
> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m84598b534cc5664f5bb31521ba6e41c7bc213758
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/region.c          | 265 ++++++++++++++++++++++-------
>  drivers/cxl/cxl.h                  |   1 +
>  drivers/cxl/cxlmem.h               |   4 +-
>  include/linux/cxl_accel_mem.h      |   5 +
>  5 files changed, 231 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 5cc71b8868bc..697c8df83a4b 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -479,22 +479,14 @@ static ssize_t interleave_ways_show(struct device *dev,
>  
>  static const struct attribute_group *get_cxl_region_target_group(void);
>  
> -static ssize_t interleave_ways_store(struct device *dev,
> -				     struct device_attribute *attr,
> -				     const char *buf, size_t len)
> +static int set_interleave_ways(struct cxl_region *cxlr, int val)
>  {
> -	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
>  	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> -	struct cxl_region *cxlr = to_cxl_region(dev);
>  	struct cxl_region_params *p = &cxlr->params;
> -	unsigned int val, save;
> -	int rc;
> +	int save, rc;
>  	u8 iw;
>  
> -	rc = kstrtouint(buf, 0, &val);
> -	if (rc)
> -		return rc;
> -
>  	rc = ways_to_eiw(val, &iw);
>  	if (rc)
>  		return rc;
> @@ -509,25 +501,42 @@ static ssize_t interleave_ways_store(struct device *dev,
>  		return -EINVAL;
>  	}
>  
> -	rc = down_write_killable(&cxl_region_rwsem);
> -	if (rc)
> -		return rc;
> -	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
> -		rc = -EBUSY;
> -		goto out;
> -	}
> +	lockdep_assert_held_write(&cxl_region_rwsem);
> +	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
> +		return -EBUSY;
>  
>  	save = p->interleave_ways;
>  	p->interleave_ways = val;
>  	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
>  	if (rc)
>  		p->interleave_ways = save;
> -out:
> +
> +	return rc;
> +}
> +
> +static ssize_t interleave_ways_store(struct device *dev,
> +				     struct device_attribute *attr,
> +				     const char *buf, size_t len)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	unsigned int val;
> +	int rc;
> +
> +	rc = kstrtouint(buf, 0, &val);
> +	if (rc)
> +		return rc;
> +
> +	rc = down_write_killable(&cxl_region_rwsem);
> +	if (rc)
> +		return rc;
> +
> +	rc = set_interleave_ways(cxlr, val);
>  	up_write(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
>  	return len;
>  }
> +
>  static DEVICE_ATTR_RW(interleave_ways);
>  
>  static ssize_t interleave_granularity_show(struct device *dev,
> @@ -547,21 +556,14 @@ static ssize_t interleave_granularity_show(struct device *dev,
>  	return rc;
>  }
>  
> -static ssize_t interleave_granularity_store(struct device *dev,
> -					    struct device_attribute *attr,
> -					    const char *buf, size_t len)
> +static int set_interleave_granularity(struct cxl_region *cxlr, int val)
>  {
> -	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
>  	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> -	struct cxl_region *cxlr = to_cxl_region(dev);
>  	struct cxl_region_params *p = &cxlr->params;
> -	int rc, val;
> +	int rc;
>  	u16 ig;
>  
> -	rc = kstrtoint(buf, 0, &val);
> -	if (rc)
> -		return rc;
> -
>  	rc = granularity_to_eig(val, &ig);
>  	if (rc)
>  		return rc;
> @@ -577,21 +579,36 @@ static ssize_t
> interleave_granularity_store(struct device *dev, if
> (cxld->interleave_ways > 1 && val != cxld->interleave_granularity)
> return -EINVAL; 
> +	lockdep_assert_held_write(&cxl_region_rwsem);
> +	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
> +		return -EBUSY;
> +
> +	p->interleave_granularity = val;
> +	return 0;
> +}
> +
> +static ssize_t interleave_granularity_store(struct device *dev,
> +					    struct device_attribute
> *attr,
> +					    const char *buf, size_t
> len) +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	int rc, val;
> +
> +	rc = kstrtoint(buf, 0, &val);
> +	if (rc)
> +		return rc;
> +
>  	rc = down_write_killable(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
> -	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
> -		rc = -EBUSY;
> -		goto out;
> -	}
>  
> -	p->interleave_granularity = val;
> -out:
> +	rc = set_interleave_granularity(cxlr, val);
>  	up_write(&cxl_region_rwsem);
>  	if (rc)
>  		return rc;
>  	return len;
>  }
> +
>  static DEVICE_ATTR_RW(interleave_granularity);
>  
>  static ssize_t resource_show(struct device *dev, struct
> device_attribute *attr, @@ -2193,7 +2210,7 @@ static int
> cxl_region_attach(struct cxl_region *cxlr, return 0;
>  }
>  
> -static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
> +int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>  {
>  	struct cxl_port *iter, *ep_port = cxled_to_port(cxled);
>  	struct cxl_region *cxlr = cxled->cxld.region;
> @@ -2252,6 +2269,7 @@ static int cxl_region_detach(struct
> cxl_endpoint_decoder *cxled) put_device(&cxlr->dev);
>  	return rc;
>  }
> +EXPORT_SYMBOL_NS_GPL(cxl_region_detach, CXL);
>  
>  void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
>  {
> @@ -2746,6 +2764,14 @@ cxl_find_region_by_name(struct
> cxl_root_decoder *cxlrd, const char *name) return
> to_cxl_region(region_dev); }
>  
> +static void drop_region(struct cxl_region *cxlr)
> +{
> +	struct cxl_root_decoder *cxlrd =
> to_cxl_root_decoder(cxlr->dev.parent);
> +	struct cxl_port *port = cxlrd_to_port(cxlrd);
> +
> +	devm_release_action(port->uport_dev, unregister_region,
> cxlr); +}
> +
>  static ssize_t delete_region_store(struct device *dev,
>  				   struct device_attribute *attr,
>  				   const char *buf, size_t len)
> @@ -3353,17 +3379,18 @@ static int match_region_by_range(struct
> device *dev, void *data) return rc;
>  }
>  
> -/* Establish an empty region covering the given HPA range */
> -static struct cxl_region *construct_region(struct cxl_root_decoder
> *cxlrd,
> -					   struct
> cxl_endpoint_decoder *cxled) +static void construct_region_end(void)
> +{
> +	up_write(&cxl_region_rwsem);
> +}
> +
> +static struct cxl_region *construct_region_begin(struct
> cxl_root_decoder *cxlrd,
> +						 struct
> cxl_endpoint_decoder *cxled) {
>  	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> -	struct cxl_port *port = cxlrd_to_port(cxlrd);
> -	struct range *hpa = &cxled->cxld.hpa_range;
>  	struct cxl_region_params *p;
>  	struct cxl_region *cxlr;
> -	struct resource *res;
> -	int rc;
> +	int err = 0;
>  
>  	do {
>  		cxlr = __create_region(cxlrd, cxled->mode,
> @@ -3372,8 +3399,7 @@ static struct cxl_region
> *construct_region(struct cxl_root_decoder *cxlrd, } while
> (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY); 
>  	if (IS_ERR(cxlr)) {
> -		dev_err(cxlmd->dev.parent,
> -			"%s:%s: %s failed assign region: %ld\n",
> +		dev_err(cxlmd->dev.parent,"%s:%s: %s failed assign
> region: %ld\n", dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>  			__func__, PTR_ERR(cxlr));
>  		return cxlr;
> @@ -3383,23 +3409,47 @@ static struct cxl_region
> *construct_region(struct cxl_root_decoder *cxlrd, p = &cxlr->params;
>  	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>  		dev_err(cxlmd->dev.parent,
> -			"%s:%s: %s autodiscovery interrupted\n",
> +			"%s:%s: %s region setup interrupted\n",
>  			dev_name(&cxlmd->dev),
> dev_name(&cxled->cxld.dev), __func__);
> -		rc = -EBUSY;
> -		goto err;
> +		err = -EBUSY;
> +	}
> +
> +	if (err) {
> +		construct_region_end();
> +		drop_region(cxlr);
> +		return ERR_PTR(err);
>  	}
> +	return cxlr;
> +}
> +
> +
> +/* Establish an empty region covering the given HPA range */
> +static struct cxl_region *construct_region(struct cxl_root_decoder
> *cxlrd,
> +					   struct
> cxl_endpoint_decoder *cxled) +{
> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +	struct range *hpa = &cxled->cxld.hpa_range;
> +	struct cxl_region_params *p;
> +	struct cxl_region *cxlr;
> +	struct resource *res;
> +	int rc;
> +
> +	cxlr = construct_region_begin(cxlrd, cxled);
> +	if (IS_ERR(cxlr))
> +		return cxlr;
>  
>  	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>  
>  	res = kmalloc(sizeof(*res), GFP_KERNEL);
>  	if (!res) {
>  		rc = -ENOMEM;
> -		goto err;
> +		goto out;
>  	}
>  
>  	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
>  				    dev_name(&cxlr->dev));
> +
>  	rc = insert_resource(cxlrd->res, res);
>  	if (rc) {
>  		/*
> @@ -3412,6 +3462,7 @@ static struct cxl_region
> *construct_region(struct cxl_root_decoder *cxlrd, __func__,
> dev_name(&cxlr->dev)); }
>  
> +	p = &cxlr->params;
>  	p->res = res;
>  	p->interleave_ways = cxled->cxld.interleave_ways;
>  	p->interleave_granularity =
> cxled->cxld.interleave_granularity; @@ -3419,24 +3470,124 @@ static
> struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd, 
>  	rc = sysfs_update_group(&cxlr->dev.kobj,
> get_cxl_region_target_group()); if (rc)
> -		goto err;
> +		goto out;
>  
>  	dev_dbg(cxlmd->dev.parent, "%s:%s: %s %s res: %pr iw: %d ig:
> %d\n",
> -		dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> __func__,
> -		dev_name(&cxlr->dev), p->res, p->interleave_ways,
> -		p->interleave_granularity);
> +				   dev_name(&cxlmd->dev),
> +				   dev_name(&cxled->cxld.dev),
> __func__,
> +				   dev_name(&cxlr->dev), p->res,
> +				   p->interleave_ways,
> +				   p->interleave_granularity);
>  
>  	/* ...to match put_device() in cxl_add_to_region() */
>  	get_device(&cxlr->dev);
>  	up_write(&cxl_region_rwsem);
> +out:
> +	construct_region_end();
> +	if (rc) {
> +		drop_region(cxlr);
> +		return ERR_PTR(rc);
> +	}
> +	return cxlr;
> +}
> +
> +static struct cxl_region *
> +__construct_new_region(struct cxl_root_decoder *cxlrd,
> +		       struct cxl_endpoint_decoder **cxled, int ways)
> +{
> +	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> +	struct cxl_region_params *p;
> +	resource_size_t size = 0;
> +	struct cxl_region *cxlr;
> +	int rc, i;
> +
> +	/* If interleaving is not supported, why does ways need to
> be at least 1? */
> +	if (ways < 1)
> +		return ERR_PTR(-EINVAL);
> +
> +	cxlr = construct_region_begin(cxlrd, cxled[0]);
> +	if (IS_ERR(cxlr))
> +		return cxlr;
> +
> +	rc = set_interleave_ways(cxlr, ways);
> +	if (rc)
> +		goto out;
> +
> +	rc = set_interleave_granularity(cxlr,
> cxld->interleave_granularity);
> +	if (rc)
> +		goto out;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	for (i = 0; i < ways; i++) {
> +		if (!cxled[i]->dpa_res)
> +			break;
> +		size += resource_size(cxled[i]->dpa_res);
> +	}
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (i < ways)
> +		goto out;
> +
> +	rc = alloc_hpa(cxlr, size);
> +	if (rc)
> +		goto out;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	for (i = 0; i < ways; i++) {
> +		rc = cxl_region_attach(cxlr, cxled[i], i);
> +		if (rc)
> +			break;
> +	}
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (rc)
> +		goto out;
> +
> +	rc = cxl_region_decode_commit(cxlr);
> +	if (rc)
> +		goto out;
>  
> +	p = &cxlr->params;
> +	p->state = CXL_CONFIG_COMMIT;
> +out:
> +	construct_region_end();
> +	if (rc) {
> +		drop_region(cxlr);
> +		return ERR_PTR(rc);
> +	}
>  	return cxlr;
> +}
>  
> -err:
> -	up_write(&cxl_region_rwsem);
> -	devm_release_action(port->uport_dev, unregister_region,
> cxlr);
> -	return ERR_PTR(rc);
> +/**
> + * cxl_create_region - Establish a region given an array of endpoint
> decoders
> + * @cxlrd: root decoder to allocate HPA
> + * @cxled: array of endpoint decoders with reserved DPA capacity
> + * @ways: size of @cxled array
> + *
> + * Returns a fully formed region in the commit state and attached to
> the
> + * cxl_region driver.
> + */
> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
> +				     struct cxl_endpoint_decoder
> **cxled,
> +				     int ways)
> +{
> +	struct cxl_region *cxlr;
> +
> +	mutex_lock(&cxlrd->range_lock);
> +	cxlr = __construct_new_region(cxlrd, cxled, ways);
> +	mutex_unlock(&cxlrd->range_lock);
> +
> +	if (IS_ERR(cxlr))
> +		return cxlr;
> +
> +	if (device_attach(&cxlr->dev) <= 0) {
> +		dev_err(&cxlr->dev, "failed to create region\n");
> +		drop_region(cxlr);
> +		return ERR_PTR(-ENODEV);
> +	}
> +	return cxlr;
>  }
> +EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
>  
>  int cxl_add_to_region(struct cxl_port *root, struct
> cxl_endpoint_decoder *cxled) {
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index d3fdd2c1e066..1bf3b74ff959 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -905,6 +905,7 @@ void cxl_coordinates_combine(struct
> access_coordinate *out, 
>  bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>  
> +int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong'
> version
>   * of these symbols in tools/testing/cxl/.
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index a0e0795ec064..377bb3cd2d47 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -881,5 +881,7 @@ struct cxl_root_decoder
> *cxl_get_hpa_freespace(struct cxl_port *endpoint, int interleave_ways,
>  					       unsigned long flags,
>  					       resource_size_t *max);
> -
> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
> +				     struct cxl_endpoint_decoder
> **cxled,
> +				     int ways);
>  #endif /* __CXL_MEM_H__ */
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c
> b/drivers/net/ethernet/sfc/efx_cxl.c index b5626d724b52..4012e3faa298
> 100644 --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -92,8 +92,18 @@ void efx_cxl_init(struct efx_nic *efx)
>  
>  	cxl->cxled = cxl_request_dpa(cxl->endpoint, true,
> EFX_CTPIO_BUFFER_SIZE, EFX_CTPIO_BUFFER_SIZE);
> -	if (IS_ERR(cxl->cxled))
> +	if (IS_ERR(cxl->cxled)) {
>  		pci_info(pci_dev, "CXL accel request DPA failed");
> +		return;
> +	}
> +
> +	cxl->efx_region = cxl_create_region(cxl->cxlrd, &cxl->cxled,
> 1);
> +	if (!cxl->efx_region) {

if (IS_ERR(cxl->efx_region))

> +		pci_info(pci_dev, "CXL accel create region failed");
> +		cxl_dpa_free(cxl->cxled);
> +		return;
> +	}
> +
>  out:
>  	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>  }
> @@ -102,6 +112,9 @@ void efx_cxl_exit(struct efx_nic *efx)
>  {
>  	struct efx_cxl *cxl = efx->cxl;
>  
> +	if (cxl->efx_region)
> +		cxl_region_detach(cxl->cxled);
> +
>  	if (cxl->cxled)
>  		cxl_dpa_free(cxl->cxled);
>   
> diff --git a/include/linux/cxl_accel_mem.h
> b/include/linux/cxl_accel_mem.h index d4ecb5bb4fc8..a5f9ffc24509
> 100644 --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -48,4 +48,9 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct
> cxl_port *endpoint, resource_size_t min,
>  					     resource_size_t max);
>  int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
> +				     struct cxl_endpoint_decoder
> **cxled,
> +				     int ways);
> +
> +int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
>  #endif



* Re: [PATCH v2 03/15] cxl: add function for type2 resource request
  2024-08-22 13:07   ` Zhi Wang
@ 2024-08-23  9:30     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-23  9:30 UTC (permalink / raw)
  To: Zhi Wang, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, targupta, zhiwang


On 8/22/24 14:07, Zhi Wang wrote:
> On Mon, 15 Jul 2024 18:28:23 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Create a new function for a type2 device requesting a resource
>> passing the opaque struct to work with.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/memdev.c          | 13 +++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.c |  7 ++++++-
>>   include/linux/cxl_accel_mem.h      |  1 +
>>   3 files changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 61b5d35b49e7..04c3a0f8bc2e 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -744,6 +744,19 @@ void cxl_accel_set_resource(struct cxl_dev_state
>> *cxlds, struct resource res, }
>>   EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
>>   
>> +int cxl_accel_request_resource(struct cxl_dev_state *cxlds, bool
>> is_ram) +{
>> +	int rc;
>> +
>> +	if (is_ram)
>> +		rc = request_resource(&cxlds->dpa_res,
>> &cxlds->ram_res);
>> +	else
>> +		rc = request_resource(&cxlds->dpa_res,
>> &cxlds->pmem_res); +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_request_resource, CXL);
>> +
> Hi Alejandro:
>
> Since we only have cxl_accel_request_resource() here, how is
> the resource going to be released? e.g. in an error handling path.
>
> Thanks,
> Zhi.
>

Right. I will use devm_request_resource() in v3, with cxlds->dev as the
owner.


Thanks
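
For readers following the release question above, here is a minimal userspace
sketch of the device-managed idea. It is not kernel code: mock_device,
mock_resource and every mock_* function are hypothetical stand-ins, and the
real devm_request_resource() takes a struct device, a root resource and the
new resource. The point is only that a release action is queued when the
resource is claimed, so no explicit release call is needed on error paths.

```c
#include <errno.h>
#include <stdlib.h>

/* One queued release callback; mock of a devres entry. */
struct action {
	void (*fn)(void *data);
	void *data;
	struct action *next;
};

/* Hypothetical stand-in for struct device: owns a list of release actions. */
struct mock_device {
	struct action *actions;	/* most recently added first */
};

/* Hypothetical stand-in for struct resource. */
struct mock_resource {
	int claimed;
};

static int mock_devm_add_action(struct mock_device *dev,
				void (*fn)(void *), void *data)
{
	struct action *a = malloc(sizeof(*a));

	if (!a)
		return -ENOMEM;
	a->fn = fn;
	a->data = data;
	a->next = dev->actions;
	dev->actions = a;
	return 0;
}

/* Runs on "driver detach": releases everything in reverse order. */
static void mock_device_release_all(struct mock_device *dev)
{
	while (dev->actions) {
		struct action *a = dev->actions;

		dev->actions = a->next;
		a->fn(a->data);
		free(a);
	}
}

static void mock_release_resource(void *data)
{
	((struct mock_resource *)data)->claimed = 0;
}

/* Claim the resource and tie its release to the owning device. */
static int mock_devm_request_resource(struct mock_device *dev,
				      struct mock_resource *res)
{
	int rc;

	if (res->claimed)
		return -EBUSY;
	res->claimed = 1;
	rc = mock_devm_add_action(dev, mock_release_resource, res);
	if (rc)
		res->claimed = 0;
	return rc;
}
```

With this shape a caller only checks the return value of the request; the
teardown of the owning device undoes the claim.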



* Re: [PATCH v2 12/15] cxl: allow region creation by type2 drivers
  2024-08-22 13:12   ` Zhi Wang
@ 2024-08-23  9:31     ` Alejandro Lucero Palau
  2024-08-27 15:20       ` Jonathan Cameron
  0 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-23  9:31 UTC (permalink / raw)
  To: Zhi Wang, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, targupta, zhiwang


On 8/22/24 14:12, Zhi Wang wrote:
> On Mon, 15 Jul 2024 18:28:32 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Creating a CXL region requires userspace intervention through the cxl
>> sysfs files. Type2 support should allow accelerator drivers to create
>> such a CXL region from kernel code.
>>
>> Add that functionality and integrate it with the current support for
>> memory expanders.
>>
>> Based on
>> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m84598b534cc5664f5bb31521ba6e41c7bc213758
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com> Signed-off-by: Dan
>> Williams <dan.j.williams@intel.com> ---
>>   drivers/cxl/core/region.c          | 265
>> ++++++++++++++++++++++------- drivers/cxl/cxl.h                  |
>> 1 + drivers/cxl/cxlmem.h               |   4 +-
>>   drivers/net/ethernet/sfc/efx_cxl.c |  15 +-
>>   include/linux/cxl_accel_mem.h      |   5 +
>>   5 files changed, 231 insertions(+), 59 deletions(-)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 5cc71b8868bc..697c8df83a4b 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -479,22 +479,14 @@ static ssize_t interleave_ways_show(struct
>> device *dev,
>>   static const struct attribute_group
>> *get_cxl_region_target_group(void);
>> -static ssize_t interleave_ways_store(struct device *dev,
>> -				     struct device_attribute *attr,
>> -				     const char *buf, size_t len)
>> +static int set_interleave_ways(struct cxl_region *cxlr, int val)
>>   {
>> -	struct cxl_root_decoder *cxlrd =
>> to_cxl_root_decoder(dev->parent);
>> +	struct cxl_root_decoder *cxlrd =
>> to_cxl_root_decoder(cxlr->dev.parent); struct cxl_decoder *cxld =
>> &cxlrd->cxlsd.cxld;
>> -	struct cxl_region *cxlr = to_cxl_region(dev);
>>   	struct cxl_region_params *p = &cxlr->params;
>> -	unsigned int val, save;
>> -	int rc;
>> +	int save, rc;
>>   	u8 iw;
>>   
>> -	rc = kstrtouint(buf, 0, &val);
>> -	if (rc)
>> -		return rc;
>> -
>>   	rc = ways_to_eiw(val, &iw);
>>   	if (rc)
>>   		return rc;
>> @@ -509,25 +501,42 @@ static ssize_t interleave_ways_store(struct
>> device *dev, return -EINVAL;
>>   	}
>>   
>> -	rc = down_write_killable(&cxl_region_rwsem);
>> -	if (rc)
>> -		return rc;
>> -	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>> -		rc = -EBUSY;
>> -		goto out;
>> -	}
>> +	lockdep_assert_held_write(&cxl_region_rwsem);
>> +	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
>> +		return -EBUSY;
>>   
>>   	save = p->interleave_ways;
>>   	p->interleave_ways = val;
>>   	rc = sysfs_update_group(&cxlr->dev.kobj,
>> get_cxl_region_target_group()); if (rc)
>>   		p->interleave_ways = save;
>> -out:
>> +
>> +	return rc;
>> +}
>> +
>> +static ssize_t interleave_ways_store(struct device *dev,
>> +				     struct device_attribute *attr,
>> +				     const char *buf, size_t len)
>> +{
>> +	struct cxl_region *cxlr = to_cxl_region(dev);
>> +	unsigned int val;
>> +	int rc;
>> +
>> +	rc = kstrtouint(buf, 0, &val);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = down_write_killable(&cxl_region_rwsem);
>> +	if (rc)
>> +		return rc;
>> +
>> +	rc = set_interleave_ways(cxlr, val);
>>   	up_write(&cxl_region_rwsem);
>>   	if (rc)
>>   		return rc;
>>   	return len;
>>   }
>> +
>>   static DEVICE_ATTR_RW(interleave_ways);
>>   
>>   static ssize_t interleave_granularity_show(struct device *dev,
>> @@ -547,21 +556,14 @@ static ssize_t
>> interleave_granularity_show(struct device *dev, return rc;
>>   }
>>   
>> -static ssize_t interleave_granularity_store(struct device *dev,
>> -					    struct device_attribute
>> *attr,
>> -					    const char *buf, size_t
>> len) +static int set_interleave_granularity(struct cxl_region *cxlr,
>> int val) {
>> -	struct cxl_root_decoder *cxlrd =
>> to_cxl_root_decoder(dev->parent);
>> +	struct cxl_root_decoder *cxlrd =
>> to_cxl_root_decoder(cxlr->dev.parent); struct cxl_decoder *cxld =
>> &cxlrd->cxlsd.cxld;
>> -	struct cxl_region *cxlr = to_cxl_region(dev);
>>   	struct cxl_region_params *p = &cxlr->params;
>> -	int rc, val;
>> +	int rc;
>>   	u16 ig;
>>   
>> -	rc = kstrtoint(buf, 0, &val);
>> -	if (rc)
>> -		return rc;
>> -
>>   	rc = granularity_to_eig(val, &ig);
>>   	if (rc)
>>   		return rc;
>> @@ -577,21 +579,36 @@ static ssize_t
>> interleave_granularity_store(struct device *dev, if
>> (cxld->interleave_ways > 1 && val != cxld->interleave_granularity)
>> return -EINVAL;
>> +	lockdep_assert_held_write(&cxl_region_rwsem);
>> +	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
>> +		return -EBUSY;
>> +
>> +	p->interleave_granularity = val;
>> +	return 0;
>> +}
>> +
>> +static ssize_t interleave_granularity_store(struct device *dev,
>> +					    struct device_attribute
>> *attr,
>> +					    const char *buf, size_t
>> len) +{
>> +	struct cxl_region *cxlr = to_cxl_region(dev);
>> +	int rc, val;
>> +
>> +	rc = kstrtoint(buf, 0, &val);
>> +	if (rc)
>> +		return rc;
>> +
>>   	rc = down_write_killable(&cxl_region_rwsem);
>>   	if (rc)
>>   		return rc;
>> -	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>> -		rc = -EBUSY;
>> -		goto out;
>> -	}
>>   
>> -	p->interleave_granularity = val;
>> -out:
>> +	rc = set_interleave_granularity(cxlr, val);
>>   	up_write(&cxl_region_rwsem);
>>   	if (rc)
>>   		return rc;
>>   	return len;
>>   }
>> +
>>   static DEVICE_ATTR_RW(interleave_granularity);
>>   
>>   static ssize_t resource_show(struct device *dev, struct
>> device_attribute *attr, @@ -2193,7 +2210,7 @@ static int
>> cxl_region_attach(struct cxl_region *cxlr, return 0;
>>   }
>>   
>> -static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>> +int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>>   {
>>   	struct cxl_port *iter, *ep_port = cxled_to_port(cxled);
>>   	struct cxl_region *cxlr = cxled->cxld.region;
>> @@ -2252,6 +2269,7 @@ static int cxl_region_detach(struct
>> cxl_endpoint_decoder *cxled) put_device(&cxlr->dev);
>>   	return rc;
>>   }
>> +EXPORT_SYMBOL_NS_GPL(cxl_region_detach, CXL);
>>   
>>   void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
>>   {
>> @@ -2746,6 +2764,14 @@ cxl_find_region_by_name(struct
>> cxl_root_decoder *cxlrd, const char *name) return
>> to_cxl_region(region_dev); }
>>   
>> +static void drop_region(struct cxl_region *cxlr)
>> +{
>> +	struct cxl_root_decoder *cxlrd =
>> to_cxl_root_decoder(cxlr->dev.parent);
>> +	struct cxl_port *port = cxlrd_to_port(cxlrd);
>> +
>> +	devm_release_action(port->uport_dev, unregister_region,
>> cxlr); +}
>> +
>>   static ssize_t delete_region_store(struct device *dev,
>>   				   struct device_attribute *attr,
>>   				   const char *buf, size_t len)
>> @@ -3353,17 +3379,18 @@ static int match_region_by_range(struct
>> device *dev, void *data) return rc;
>>   }
>>   
>> -/* Establish an empty region covering the given HPA range */
>> -static struct cxl_region *construct_region(struct cxl_root_decoder
>> *cxlrd,
>> -					   struct
>> cxl_endpoint_decoder *cxled) +static void construct_region_end(void)
>> +{
>> +	up_write(&cxl_region_rwsem);
>> +}
>> +
>> +static struct cxl_region *construct_region_begin(struct
>> cxl_root_decoder *cxlrd,
>> +						 struct
>> cxl_endpoint_decoder *cxled) {
>>   	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>> -	struct cxl_port *port = cxlrd_to_port(cxlrd);
>> -	struct range *hpa = &cxled->cxld.hpa_range;
>>   	struct cxl_region_params *p;
>>   	struct cxl_region *cxlr;
>> -	struct resource *res;
>> -	int rc;
>> +	int err = 0;
>>   
>>   	do {
>>   		cxlr = __create_region(cxlrd, cxled->mode,
>> @@ -3372,8 +3399,7 @@ static struct cxl_region
>> *construct_region(struct cxl_root_decoder *cxlrd, } while
>> (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
>>   	if (IS_ERR(cxlr)) {
>> -		dev_err(cxlmd->dev.parent,
>> -			"%s:%s: %s failed assign region: %ld\n",
>> +		dev_err(cxlmd->dev.parent,"%s:%s: %s failed assign
>> region: %ld\n", dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>>   			__func__, PTR_ERR(cxlr));
>>   		return cxlr;
>> @@ -3383,23 +3409,47 @@ static struct cxl_region
>> *construct_region(struct cxl_root_decoder *cxlrd, p = &cxlr->params;
>>   	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>>   		dev_err(cxlmd->dev.parent,
>> -			"%s:%s: %s autodiscovery interrupted\n",
>> +			"%s:%s: %s region setup interrupted\n",
>>   			dev_name(&cxlmd->dev),
>> dev_name(&cxled->cxld.dev), __func__);
>> -		rc = -EBUSY;
>> -		goto err;
>> +		err = -EBUSY;
>> +	}
>> +
>> +	if (err) {
>> +		construct_region_end();
>> +		drop_region(cxlr);
>> +		return ERR_PTR(err);
>>   	}
>> +	return cxlr;
>> +}
>> +
>> +
>> +/* Establish an empty region covering the given HPA range */
>> +static struct cxl_region *construct_region(struct cxl_root_decoder
>> *cxlrd,
>> +					   struct
>> cxl_endpoint_decoder *cxled) +{
>> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>> +	struct range *hpa = &cxled->cxld.hpa_range;
>> +	struct cxl_region_params *p;
>> +	struct cxl_region *cxlr;
>> +	struct resource *res;
>> +	int rc;
>> +
>> +	cxlr = construct_region_begin(cxlrd, cxled);
>> +	if (IS_ERR(cxlr))
>> +		return cxlr;
>>   
>>   	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>>   
>>   	res = kmalloc(sizeof(*res), GFP_KERNEL);
>>   	if (!res) {
>>   		rc = -ENOMEM;
>> -		goto err;
>> +		goto out;
>>   	}
>>   
>>   	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
>>   				    dev_name(&cxlr->dev));
>> +
>>   	rc = insert_resource(cxlrd->res, res);
>>   	if (rc) {
>>   		/*
>> @@ -3412,6 +3462,7 @@ static struct cxl_region
>> *construct_region(struct cxl_root_decoder *cxlrd, __func__,
>> dev_name(&cxlr->dev)); }
>>   
>> +	p = &cxlr->params;
>>   	p->res = res;
>>   	p->interleave_ways = cxled->cxld.interleave_ways;
>>   	p->interleave_granularity =
>> cxled->cxld.interleave_granularity; @@ -3419,24 +3470,124 @@ static
>> struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>   	rc = sysfs_update_group(&cxlr->dev.kobj,
>> get_cxl_region_target_group()); if (rc)
>> -		goto err;
>> +		goto out;
>>   
>>   	dev_dbg(cxlmd->dev.parent, "%s:%s: %s %s res: %pr iw: %d ig:
>> %d\n",
>> -		dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>> __func__,
>> -		dev_name(&cxlr->dev), p->res, p->interleave_ways,
>> -		p->interleave_granularity);
>> +				   dev_name(&cxlmd->dev),
>> +				   dev_name(&cxled->cxld.dev),
>> __func__,
>> +				   dev_name(&cxlr->dev), p->res,
>> +				   p->interleave_ways,
>> +				   p->interleave_granularity);
>>   
>>   	/* ...to match put_device() in cxl_add_to_region() */
>>   	get_device(&cxlr->dev);
>>   	up_write(&cxl_region_rwsem);
>> +out:
>> +	construct_region_end();
>> +	if (rc) {
>> +		drop_region(cxlr);
>> +		return ERR_PTR(rc);
>> +	}
>> +	return cxlr;
>> +}
>> +
>> +static struct cxl_region *
>> +__construct_new_region(struct cxl_root_decoder *cxlrd,
>> +		       struct cxl_endpoint_decoder **cxled, int ways)
>> +{
>> +	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
>> +	struct cxl_region_params *p;
>> +	resource_size_t size = 0;
>> +	struct cxl_region *cxlr;
>> +	int rc, i;
>> +
>> +	/* If interleaving is not supported, why does ways need to
>> be at least 1? */
>> +	if (ways < 1)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	cxlr = construct_region_begin(cxlrd, cxled[0]);
>> +	if (IS_ERR(cxlr))
>> +		return cxlr;
>> +
>> +	rc = set_interleave_ways(cxlr, ways);
>> +	if (rc)
>> +		goto out;
>> +
>> +	rc = set_interleave_granularity(cxlr,
>> cxld->interleave_granularity);
>> +	if (rc)
>> +		goto out;
>> +
>> +	down_read(&cxl_dpa_rwsem);
>> +	for (i = 0; i < ways; i++) {
>> +		if (!cxled[i]->dpa_res)
>> +			break;
>> +		size += resource_size(cxled[i]->dpa_res);
>> +	}
>> +	up_read(&cxl_dpa_rwsem);
>> +
>> +	if (i < ways)
>> +		goto out;
>> +
>> +	rc = alloc_hpa(cxlr, size);
>> +	if (rc)
>> +		goto out;
>> +
>> +	down_read(&cxl_dpa_rwsem);
>> +	for (i = 0; i < ways; i++) {
>> +		rc = cxl_region_attach(cxlr, cxled[i], i);
>> +		if (rc)
>> +			break;
>> +	}
>> +	up_read(&cxl_dpa_rwsem);
>> +
>> +	if (rc)
>> +		goto out;
>> +
>> +	rc = cxl_region_decode_commit(cxlr);
>> +	if (rc)
>> +		goto out;
>>   
>> +	p = &cxlr->params;
>> +	p->state = CXL_CONFIG_COMMIT;
>> +out:
>> +	construct_region_end();
>> +	if (rc) {
>> +		drop_region(cxlr);
>> +		return ERR_PTR(rc);
>> +	}
>>   	return cxlr;
>> +}
>>   
>> -err:
>> -	up_write(&cxl_region_rwsem);
>> -	devm_release_action(port->uport_dev, unregister_region,
>> cxlr);
>> -	return ERR_PTR(rc);
>> +/**
>> + * cxl_create_region - Establish a region given an array of endpoint
>> decoders
>> + * @cxlrd: root decoder to allocate HPA
>> + * @cxled: array of endpoint decoders with reserved DPA capacity
>> + * @ways: size of @cxled array
>> + *
>> + * Returns a fully formed region in the commit state and attached to
>> the
>> + * cxl_region driver.
>> + */
>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>> +				     struct cxl_endpoint_decoder
>> **cxled,
>> +				     int ways)
>> +{
>> +	struct cxl_region *cxlr;
>> +
>> +	mutex_lock(&cxlrd->range_lock);
>> +	cxlr = __construct_new_region(cxlrd, cxled, ways);
>> +	mutex_unlock(&cxlrd->range_lock);
>> +
>> +	if (IS_ERR(cxlr))
>> +		return cxlr;
>> +
>> +	if (device_attach(&cxlr->dev) <= 0) {
>> +		dev_err(&cxlr->dev, "failed to create region\n");
>> +		drop_region(cxlr);
>> +		return ERR_PTR(-ENODEV);
>> +	}
>> +	return cxlr;
>>   }
>> +EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
>>   
>>   int cxl_add_to_region(struct cxl_port *root, struct
>> cxl_endpoint_decoder *cxled) {
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index d3fdd2c1e066..1bf3b74ff959 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -905,6 +905,7 @@ void cxl_coordinates_combine(struct
>> access_coordinate *out,
>>   bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>>   
>> +int cxl_region_detach(struct cxl_endpoint_decoder *cxled);
>>   /*
>>    * Unit test builds overrides this to __weak, find the 'strong'
>> version
>>    * of these symbols in tools/testing/cxl/.
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index a0e0795ec064..377bb3cd2d47 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -881,5 +881,7 @@ struct cxl_root_decoder
>> *cxl_get_hpa_freespace(struct cxl_port *endpoint, int interleave_ways,
>>   					       unsigned long flags,
>>   					       resource_size_t *max);
>> -
>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>> +				     struct cxl_endpoint_decoder
>> **cxled,
>> +				     int ways);
>>   #endif /* __CXL_MEM_H__ */
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c
>> b/drivers/net/ethernet/sfc/efx_cxl.c index b5626d724b52..4012e3faa298
>> 100644 --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -92,8 +92,18 @@ void efx_cxl_init(struct efx_nic *efx)
>>   
>>   	cxl->cxled = cxl_request_dpa(cxl->endpoint, true,
>> EFX_CTPIO_BUFFER_SIZE, EFX_CTPIO_BUFFER_SIZE);
>> -	if (IS_ERR(cxl->cxled))
>> +	if (IS_ERR(cxl->cxled)) {
>>   		pci_info(pci_dev, "CXL accel request DPA failed");
>> +		return;
>> +	}
>> +
>> +	cxl->efx_region = cxl_create_region(cxl->cxlrd, &cxl->cxled,
>> 1);
>> +	if (!cxl->efx_region) {
> if (IS_ERR(cxl->efx_region))
>

I'll fix it.

Thanks
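
As background on the IS_ERR() comment above: a function that reports failure
via ERR_PTR() never returns NULL, so a "!ptr" test lets the error through. The
sketch below reimplements the helpers in userspace purely for illustration
(the real ones live in include/linux/err.h); struct region and create_region()
are hypothetical names, not the CXL API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* True when the pointer value falls in the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct region {
	int id;
};

/* Hypothetical creator: fails with an encoded errno, never with NULL. */
static struct region *create_region(int ways)
{
	static struct region r = { .id = 0 };

	if (ways < 1)
		return ERR_PTR(-EINVAL);
	return &r;
}

/* Returns 1 if a NULL check would have flagged the failure, else 0. */
static int null_check_catches_failure(void)
{
	struct region *r = create_region(0);

	return r == NULL;
}

/* Returns 1 if IS_ERR() flags the failure and recovers the errno. */
static int is_err_catches_failure(void)
{
	struct region *r = create_region(0);

	return IS_ERR(r) && PTR_ERR(r) == -EINVAL;
}
```

Here the NULL check misses the failure while IS_ERR() catches it, which is
the case the review comment points at.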




* Re: [PATCH v2 08/15] cxl: indicate probe deferral
  2024-07-15 17:28 ` [PATCH v2 08/15] cxl: indicate probe deferral alejandro.lucero-palau
                     ` (3 preceding siblings ...)
  2024-08-09 14:40   ` Zhi Wang
@ 2024-08-26 17:42   ` Zhi Wang
  2024-08-28 13:43     ` Alejandro Lucero Palau
  4 siblings, 1 reply; 114+ messages in thread
From: Zhi Wang @ 2024-08-26 17:42 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, Alejandro Lucero,
	targupta, vsethi

On Mon, 15 Jul 2024 18:28:28 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> The first stop for a CXL accelerator driver that wants to establish
> new CXL.mem regions is to register a 'struct cxl_memdev'. That kicks
> off cxl_mem_probe() to enumerate all 'struct cxl_port' instances in
> the topology up to the root.
> 
> If the root driver has not attached yet, the expectation is that the
> driver waits until that link is established. The common cxl_pci_driver
> has reason to keep the 'struct cxl_memdev' device attached to the bus
> until the root driver attaches. An accelerator may want to instead
> defer probing until CXL resources can be acquired.
> 
> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
> accelerator driver probing should be deferred vs failed. Provide that
> indication via a new cxl_acquire_endpoint() API that can retrieve the
> probe status of the memdev.
> 
> The first consumer of this API is a test driver that exercises the
> CXL Type-2 flow.
> 
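The deferred-vs-failed distinction described above can be sketched in
userspace as follows. This mirrors only the decision logic of the quoted
cxl_acquire_endpoint(), without the device locking. The mock_* types and
functions are hypothetical, and the assumption that cxl_mem_probe() stores
ERR_PTR(-EPROBE_DEFER) in the endpoint field is mine; the commit message only
says a probe deferral is deposited there.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace errno.h */

/* Userspace stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-ins for struct cxl_port and struct cxl_memdev. */
struct mock_port {
	int has_driver;
};

struct mock_memdev {
	struct mock_port *endpoint;	/* NULL, an ERR_PTR, or a live port */
};

/* Same three-way decision as the quoted function, minus the locks. */
static struct mock_port *mock_acquire_endpoint(struct mock_memdev *md)
{
	struct mock_port *ep = md->endpoint;

	if (!ep)
		return ERR_PTR(-ENXIO);		/* never enumerated: hard failure */
	if (IS_ERR(ep))
		return ep;			/* pass the stored error through */
	if (!ep->has_driver)
		return ERR_PTR(-ENXIO);		/* port exists but is not bound */
	return ep;
}

/* Hypothetical accelerator probe: forwards the errno to the driver core. */
static int mock_accel_probe(struct mock_memdev *md)
{
	struct mock_port *ep = mock_acquire_endpoint(md);

	if (IS_ERR(ep))
		return (int)PTR_ERR(ep);
	return 0;
}
```

A probe routine that returns the deferral code unchanged lets the driver core
retry later, while -ENXIO ends the attempt.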

Out of curiosity, when and where do we probe CXL_DVSEC_CACHE_CAPABLE and
enable the CXL_DVSEC_CACHE_ENABLE bit for a type-2 device?

Thanks,
Zhi.

> Based on
> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c          | 41 ++++++++++++++++++++++++++++++
>  drivers/cxl/core/port.c            |  2 +-
>  drivers/cxl/mem.c                  |  7 +++--
>  drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
>  include/linux/cxl_accel_mem.h      |  3 +++
>  5 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index b902948b121f..d51c8bfb32e3 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>  }
>  EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>  
> +/*
> + * Try to get a locked reference on a memdev's CXL port topology
> + * connection. Be careful to observe when cxl_mem_probe() has deposited
> + * a probe deferral awaiting the arrival of the CXL root driver
> +*/
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_port *endpoint;
> +	int rc = -ENXIO;
> +
> +	device_lock(&cxlmd->dev);
> +	endpoint = cxlmd->endpoint;
> +	if (!endpoint)
> +		goto err;
> +
> +	if (IS_ERR(endpoint)) {
> +		rc = PTR_ERR(endpoint);
> +		goto err;
> +	}
> +
> +	device_lock(&endpoint->dev);
> +	if (!endpoint->dev.driver)
> +		goto err_endpoint;
> +
> +	return endpoint;
> +
> +err_endpoint:
> +	device_unlock(&endpoint->dev);
> +err:
> +	device_unlock(&cxlmd->dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
> +
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
> +{
> +	device_unlock(&endpoint->dev);
> +	device_unlock(&cxlmd->dev);
> +}
> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
> +
>  static void sanitize_teardown_notifier(void *data)
>  {
>  	struct cxl_memdev_state *mds = data;
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index d66c6349ed2d..3c6b896c5f65 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>  		 */
>  		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>  			dev_name(dport_dev));
> -		return -ENXIO;
> +		return -EPROBE_DEFER;
>  	}
>  
>  	parent_port = find_cxl_port(dparent, &parent_dport);
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index f76af75a87b7..383a6f4829d3 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
>  		return rc;
>  
>  	rc = devm_cxl_enumerate_ports(cxlmd);
> -	if (rc)
> +	if (rc) {
> +		cxlmd->endpoint = ERR_PTR(rc);
>  		return rc;
> +	}
>  
>  	parent_port = cxl_mem_find_port(cxlmd, &dport);
>  	if (!parent_port) {
>  		dev_err(dev, "CXL port topology not found\n");
> -		return -ENXIO;
> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
> +		return -EPROBE_DEFER;
>  	}
>  
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 0abe66490ef5..2cf4837ddfc1 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -65,8 +65,16 @@ void efx_cxl_init(struct efx_nic *efx)
>  	}
>  
>  	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
> -	if (IS_ERR(cxl->cxlmd))
> +	if (IS_ERR(cxl->cxlmd)) {
>  		pci_info(pci_dev, "CXL accel memdev creation failed");
> +		return;
> +	}
> +
> +	cxl->endpoint = cxl_acquire_endpoint(cxl->cxlmd);
> +	if (IS_ERR(cxl->endpoint))
> +		pci_info(pci_dev, "CXL accel acquire endpoint failed");
> +
> +	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>  }
>  
>  
> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
> index 442ed9862292..701910021df8 100644
> --- a/include/linux/cxl_accel_mem.h
> +++ b/include/linux/cxl_accel_mem.h
> @@ -29,4 +29,7 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds);
>  
>  struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>  				       struct cxl_dev_state *cxlds);
> +
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
>  #endif
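
For reference, the acquire/release contract described in the quoted
commit message can be sketched in user space as below. This is only a
model under stated assumptions: the mock_* names and the lock_depth
counter are hypothetical stand-ins for cxl_acquire_endpoint(),
cxl_release_endpoint() and the two device_lock() calls, and ERR_PTR(),
PTR_ERR() and IS_ERR() are simplified copies of the kernel helpers. The
point it illustrates is that a failed acquire returns with no lock
held, so the caller should propagate the error (possibly
-EPROBE_DEFER) and skip the release. As quoted, efx_cxl_init() appears
to call cxl_release_endpoint() even when the acquire failed.

```c
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* User-space stand-ins for the kernel's ERR_PTR helpers (simplified). */
#define MAX_ERRNO 4095
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from <errno.h> */
#endif

static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mocks: lock_depth counts the device locks currently held. */
struct mock_port { int has_driver; };
struct mock_memdev { struct mock_port *endpoint; int lock_depth; };

static struct mock_port *mock_acquire_endpoint(struct mock_memdev *md)
{
	struct mock_port *ep;
	long rc = -ENXIO;

	md->lock_depth++;		/* models device_lock(&cxlmd->dev) */
	ep = md->endpoint;
	if (!ep)
		goto err;
	if (IS_ERR(ep)) {		/* probe left an error, e.g. a deferral */
		rc = PTR_ERR(ep);
		goto err;
	}
	md->lock_depth++;		/* models device_lock(&endpoint->dev) */
	if (!ep->has_driver) {
		md->lock_depth--;
		goto err;
	}
	return ep;			/* success: both locks are held */
err:
	md->lock_depth--;
	return ERR_PTR(rc);		/* failure: no lock is held */
}

static void mock_release_endpoint(struct mock_memdev *md, struct mock_port *ep)
{
	(void)ep;
	md->lock_depth -= 2;		/* drops both locks taken by acquire */
}

/* Returns 0, -EPROBE_DEFER (retry later) or another negative errno. */
static int mock_accel_init(struct mock_memdev *md)
{
	struct mock_port *ep = mock_acquire_endpoint(md);

	if (IS_ERR(ep))
		return (int)PTR_ERR(ep);	/* nothing to release here */

	/* ... endpoint is usable while both locks are held ... */
	mock_release_endpoint(md, ep);
	return 0;
}
```

With this shape the caller can tell "try again once the root driver
has attached" (-EPROBE_DEFER) apart from a hard failure (-ENXIO).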


^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 01/15] cxl: add type2 device basic support
  2024-08-19 11:10         ` Alejandro Lucero Palau
@ 2024-08-27 15:06           ` Jonathan Cameron
  0 siblings, 0 replies; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-27 15:06 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes

On Mon, 19 Aug 2024 12:10:34 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/15/24 17:35, Jonathan Cameron wrote:
> > On Mon, 12 Aug 2024 12:16:02 +0100
> > Alejandro Lucero Palau <alucerop@amd.com> wrote:
> >  
> >> On 8/4/24 18:10, Jonathan Cameron wrote:  
> >>> On Mon, 15 Jul 2024 18:28:21 +0100
> >>> <alejandro.lucero-palau@amd.com> wrote:
> >>>     
> >>>> From: Alejandro Lucero <alucerop@amd.com>
> >>>>
> >>>> Differentiate Type3, aka memory expanders, from Type2, aka device
> >>>> accelerators, with a new function for initializing cxl_dev_state.
> >>>>
> >>>> Create opaque struct to be used by accelerators relying on new access
> >>>> functions in following patches.
> >>>>
> >>>> Add SFC ethernet network driver as the client.
> >>>>
> >>>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
> >>>>
> >>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >>>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>  
> >>>     
> >>>> +
> >>>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
> >>>> +{
> >>>> +	cxlds->cxl_dvsec = dvsec;  
> >>> Nothing to do with accel. If these make sense promote to cxl
> >>> core and a linux/cxl/ header.  Also we may want the type3 driver to
> >>> switch to them long term. If nothing else, making that handle the
> >>> cxl_dev_state as more opaque will show up what is still directly
> >>> accessed and may need to be wrapped up for a future accelerator driver
> >>> to use.
> >>>     
> >> I will change the function name then, but not sure I follow the comment
> >> about more opaque ...  
> > If most code can't see the internals of cxl_dev_state because it
> > doesn't include the header that defines it, then we will generally
> > spot data that may not belong in that state structure in the first place
> > or where it is appropriate to have an accessor function mediating that
> > access.  
> 
> 
> I follow that but I do not know if you are suggesting here to make it 
> opaque which conflicts with a previous comment stating it does not need 
> to be.
> 
Different potential approaches.  I'm not totally sure we 'yet' care
about making it opaque as we don't have that many drivers so review for
misuse is enough. Longer term I think we want to get there - maybe now
is the convenient moment to do so.

Jonathan

> 
> > Jonathan
> >
> >  


^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 05/15] cxl: fix use of resource_contains
  2024-08-16 14:37     ` Alejandro Lucero Palau
@ 2024-08-27 15:12       ` Jonathan Cameron
  0 siblings, 0 replies; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-27 15:12 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes

On Fri, 16 Aug 2024 15:37:14 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/4/24 18:25, Jonathan Cameron wrote:
> > On Mon, 15 Jul 2024 18:28:25 +0100
> > <alejandro.lucero-palau@amd.com> wrote:
> >  
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> For a resource defined with size zero, resource contains will also
> >> return true.
> >>
> >> Add resource size check before using it.
> >>
> >> Signed-off-by: Alejandro Lucero <alucerop@amd.com>  
> > If this can happen in existing type 3 case the fixes tag
> > and send it separately from this series.  
> 
> 
> I have been looking at this possibility and although not with 100% 
> certainty, I would say it is not for Type3.
> 
> "Type3 regions" are (usually) created from user space, and:
> 
> 1) if it is RAM, dax code is invoked for creating the region
> 
> 2) if it is pmem, pmem region creation code is invoked.
> 
> None of these possibilities use the affected code in this patch.
> 
> There exist two options where that code could be used by Type3, which 
> are confusing:
> 
> 1) regions created during device initialization, but for that the 
> decoder needs to be committed and it is not expected for Type3 without 
> user space intervention.

It is more than possible that a BIOS already set them up.

> 
> 2) when emulating an hdm decoder, what I think it is not possible for 
> Type3 since it is mandatory.

HDM Decoders are not mandatory for older devices and not mandatory for
bios to have used them.  Papering over that gap is what the emulation code
is there for.

> 
> 
> Finally we have code when sysfs dpa_size file is written, which I'm not 
> familiar with.

That's an early part of the userspace bringing up the memory if it
wasn't set up by bios or from pmem lsa data.

Not sure any of that helps though ;)

Jonathan

> 
> 
> 
> > If there is no path due to some external code, then
> > drop the word fix from the title and call it
> >
> > cxl: harden resource_contains checks to handle zero size resources  
> 
> 
> After the explanation above, I will do as you say.
> 
> Thanks!
> 
> 
> > Avoids it getting backported into stable / distros picking it
> > up if there isn't a real issue before this series.
> >
> > Thanks,
> >
> > Jonathan
> >  
> >> ---
> >>   drivers/cxl/core/hdm.c | 7 +++++--
> >>   1 file changed, 5 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> >> index 3df10517a327..4af9225d4b59 100644
> >> --- a/drivers/cxl/core/hdm.c
> >> +++ b/drivers/cxl/core/hdm.c
> >> @@ -327,10 +327,13 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
> >>   	cxled->dpa_res = res;
> >>   	cxled->skip = skipped;
> >>   
> >> -	if (resource_contains(&cxlds->pmem_res, res))
> >> +	if ((resource_size(&cxlds->pmem_res)) && (resource_contains(&cxlds->pmem_res, res))) {
> >> +		printk("%s: resource_contains CXL_DECODER_PMEM\n", __func__);
> >>   		cxled->mode = CXL_DECODER_PMEM;
> >> -	else if (resource_contains(&cxlds->ram_res, res))
> >> +	} else if ((resource_size(&cxlds->ram_res)) && (resource_contains(&cxlds->ram_res, res))) {
> >> +		printk("%s: resource_contains CXL_DECODER_RAM\n", __func__);
> >>   		cxled->mode = CXL_DECODER_RAM;
> >> +	}
> >>   	else {
> >>   		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
> >>   			 port->id, cxled->cxld.id, cxled->dpa_res);  


^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-08-19 14:47     ` Alejandro Lucero Palau
@ 2024-08-27 15:18       ` Jonathan Cameron
  0 siblings, 0 replies; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-27 15:18 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes

On Mon, 19 Aug 2024 15:47:48 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/4/24 18:57, Jonathan Cameron wrote:
> > On Mon, 15 Jul 2024 18:28:29 +0100
> > alejandro.lucero-palau@amd.com wrote:
> >  
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> CXL region creation involves allocating capacity from device DPA
> >> (device-physical-address space) and assigning it to decode a given HPA
> >> (host-physical-address space). Before determining how much DPA to
> >> allocate the amount of available HPA must be determined. Also, not all
> >> HPA is created equal, some specifically target RAM, some target PMEM,
> >> some is prepared for device-memory flows like HDM-D and HDM-DB, and some
> >> is host-only (HDM-H).
> >>
> >> Wrap all of those concerns into an API that retrieves a root decoder
> >> (platform CXL window) that fits the specified constraints and the
> >> capacity available for a new region.
> >>
> >> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m6fbe775541da3cd477d65fa95c8acdc347345b4f
> >>
> >> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >> Co-developed-by: Dan Williams <dan.j.williams@intel.com>  
> > Hi.
> >
> > This seems a lot more complex than an accelerator would need.
> > If plan is to use this in the type3 driver as well, I'd like to
> > see that done as a precursor to the main series.
> > If it only matters to accelerator drivers (as in type 3 I think
> > we make this a userspace problem), then limit the code to handle
> > interleave ways == 1 only.  Maybe we will care about higher interleave
> > in the long run, but do you have a multihead accelerator today?  
> 
> 
> I would say this is needed for Type3 as well but current support relies 
> on user space requests. I think Type3 support uses the legacy 
> implementation for memory devices where initially the requirements are 
> quite similar, but I think where CXL is going requires less manual 
> intervention or more automatic assisted manual intervention. I'll wait 
> until Dan can comment on this one for sending it as a precursor or as 
> part of the type2 support.
> 
> 
> Regarding the interleave, I know you are joking ... but who knows what 
> the future will bring. Or maybe I'm misunderstanding your comment, 
> because in my view multi-head device and interleave are not directly 
> related. Are they? I think you can have a single head and support 
> interleaving, with multi-head implying different hosts and therefore 
> different HPAs.

Nothing says the heads are connected to different hosts.

For type 3 version the reason you'd do this is to spread load across
multiple root ports.  So it's just a bandwidth play and as far
as the host is concerned they might as well be separate devices.

For accelerators in theory you can do stuff like that but it gets
fiddly fast and in theory you might care that they are the same
device for reasons beyond RAS etc and might interleave access to
device memory across the two heads.

Don't think we care today though, so for now I'd just reject any
interleaving.

Jonathan

 


^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 12/15] cxl: allow region creation by type2 drivers
  2024-08-23  9:31     ` Alejandro Lucero Palau
@ 2024-08-27 15:20       ` Jonathan Cameron
  0 siblings, 0 replies; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-27 15:20 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: Zhi Wang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes, targupta, zhiwang

On Fri, 23 Aug 2024 10:31:20 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/22/24 14:12, Zhi Wang wrote:
> > On Mon, 15 Jul 2024 18:28:32 +0100
> > <alejandro.lucero-palau@amd.com> wrote:
> >  
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> Creating a CXL region requires userspace intervention through the cxl
> >> sysfs files. Type2 support should allow accelerator drivers to create
> >> such cxl region from kernel code.
> >>
> >> Adding that functionality and integrating it with current support for
> >> memory expanders.
> >>
> >> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> >> index b5626d724b52..4012e3faa298 100644
> >> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> >> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> >> @@ -92,8 +92,18 @@ void efx_cxl_init(struct efx_nic *efx)
> >>   
> >>   	cxl->cxled = cxl_request_dpa(cxl->endpoint, true,
> >> EFX_CTPIO_BUFFER_SIZE, EFX_CTPIO_BUFFER_SIZE);
> >> -	if (IS_ERR(cxl->cxled))
> >> +	if (IS_ERR(cxl->cxled)) {
> >>   		pci_info(pci_dev, "CXL accel request DPA failed");
> >> +		return;
> >> +	}
> >> +
> >> +	cxl->efx_region = cxl_create_region(cxl->cxlrd, &cxl->cxled, 1);
> >> +	if (!cxl->efx_region) {  
> > if (IS_ERR(cxl->efx_region))
> >  
> 
> I'll fix it.
> 
> Thanks
Please crop replies. It's really easy to miss the important stuff
otherwise!

Jonathan



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 15/15] efx: support pio mapping based on cxl
  2024-08-19 16:28     ` Alejandro Lucero Palau
@ 2024-08-27 15:23       ` Jonathan Cameron
  0 siblings, 0 replies; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-27 15:23 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes

On Mon, 19 Aug 2024 17:28:46 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/4/24 19:13, Jonathan Cameron wrote:
> > On Mon, 15 Jul 2024 18:28:35 +0100
> > alejandro.lucero-palau@amd.com wrote:
> >  
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> With a device supporting CXL and successfully initialised, use the cxl
> >> region to map the memory range and use this mapping for PIO buffers.  
> > This explains why you weren't worried about any step of the CXL
> > code failing and why that wasn't a 'bug' as such.
> >
> > I'd argue that you should still have the cxl initialization return
> > an error code and cleanup any state if it hits an error.  
> 
> 
> Ideally, but with devm* being used, this is not easy to do if the error 
> is not fatal.

That's usually a strong argument that you shouldn't use devm at that
level of abstraction.  

> 
> 
> > Then the top level driver can of course elect to use an alternative
> > path given that failure.  Logically it belongs there rather than relying
> > on a buffer being mapped or not.
> >  
> 
> Same driver needs to support same functionality which relies on those 
> specific hardware buffers.
> 
> The functionality is expected to be there with or without CXL. If the 
> hardware has no CXL, the system or the device, the functionality will be 
> there with legacy PCIe BAR regions. The green light for CXL use comes 
> from two sources: the firmware and the kernel. Both need to give the 
> thumbs up. If not, legacy PCIe BAR regions will be used.

Rather than going through full setup, see if you can figure out a minimal
(state free) check on whether it should work.

If a system is broken, then it's very different from a legacy system
with no support for CXL and we can maybe just handle the broken system
with errors (or quirks if it's a shipping system).

Jonathan
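
One possible reading of the suggestion above, sketched in user space.
Everything here is hypothetical: the mock_* names, the cxl_capable
flag standing in for a cheap state-free check, and the resource
counter are illustration only and do not exist in the sfc driver. The
init routine returns an error code and unwinds what it took, and the
top-level code decides between the CXL mapping and the legacy PCIe BAR
path.

```c
#include <errno.h>
#include <stdbool.h>

enum mock_pio_backend { MOCK_PIO_PCIE_BAR, MOCK_PIO_CXL };

/* Hypothetical driver state; none of these fields exist in sfc. */
struct mock_nic {
	bool cxl_capable;	/* result of a cheap, state-free check */
	bool cxl_setup_works;	/* whether the full CXL setup would succeed */
	int cxl_resources;	/* number of CXL resources currently held */
};

/* Full setup: returns 0 or a negative errno, holding nothing on failure. */
static int mock_cxl_init(struct mock_nic *nic)
{
	nic->cxl_resources++;		/* e.g. memdev created */
	if (!nic->cxl_setup_works) {
		nic->cxl_resources--;	/* explicit unwind on error */
		return -ENXIO;
	}
	nic->cxl_resources++;		/* e.g. region created */
	return 0;
}

/*
 * Top-level policy: a device that is not CXL capable is treated as a
 * legacy system and uses PCIe BARs without complaint; a capable device
 * whose setup fails is reported as an error, so a broken system is not
 * mistaken for a legacy one.
 */
static int mock_select_pio(struct mock_nic *nic, enum mock_pio_backend *out)
{
	int rc;

	*out = MOCK_PIO_PCIE_BAR;
	if (!nic->cxl_capable)
		return 0;

	rc = mock_cxl_init(nic);
	if (rc)
		return rc;

	*out = MOCK_PIO_CXL;
	return 0;
}
```

Whether a failed setup on a capable device should be fatal or should
still fall back to BARs is a policy choice for the driver; the sketch
only shows that the decision lives in the caller.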
  
> 

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-08-04 17:57   ` Jonathan Cameron
  2024-08-19 14:47     ` Alejandro Lucero Palau
@ 2024-08-28 10:18     ` Alejandro Lucero Palau
  2024-08-28 11:19       ` Jonathan Cameron
  2024-08-28 10:41     ` Alejandro Lucero Palau
  2 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-28 10:18 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:57, Jonathan Cameron wrote:
> On Mon, 15 Jul 2024 18:28:29 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> CXL region creation involves allocating capacity from device DPA
>> (device-physical-address space) and assigning it to decode a given HPA
>> (host-physical-address space). Before determining how much DPA to
>> allocate the amount of available HPA must be determined. Also, not all
>> HPA is created equal, some specifically target RAM, some target PMEM,
>> some is prepared for device-memory flows like HDM-D and HDM-DB, and some
>> is host-only (HDM-H).
>>
>> Wrap all of those concerns into an API that retrieves a root decoder
>> (platform CXL window) that fits the specified constraints and the
>> capacity available for a new region.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m6fbe775541da3cd477d65fa95c8acdc347345b4f
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> Hi.
>
> This seems a lot more complex than an accelerator would need.
> If plan is to use this in the type3 driver as well, I'd like to
> see that done as a precursor to the main series.
> If it only matters to accelerator drivers (as in type 3 I think
> we make this a userspace problem), then limit the code to handle
> interleave ways == 1 only.  Maybe we will care about higher interleave
> in the long run, but do you have a multihead accelerator today?
>
> Jonathan
>
>> ---
>>   drivers/cxl/core/region.c          | 161 +++++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h                  |   3 +
>>   drivers/cxl/cxlmem.h               |   5 +
>>   drivers/net/ethernet/sfc/efx_cxl.c |  14 +++
>>   include/linux/cxl_accel_mem.h      |   9 ++
>>   5 files changed, 192 insertions(+)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 538ebd5a64fd..ca464bfef77b 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -702,6 +702,167 @@ static int free_hpa(struct cxl_region *cxlr)
>>   	return 0;
>>   }
>>   
>> +
>> +struct cxlrd_max_context {
>> +	struct device * const *host_bridges;
>> +	int interleave_ways;
>> +	unsigned long flags;
>> +	resource_size_t max_hpa;
>> +	struct cxl_root_decoder *cxlrd;
>> +};
>> +
>> +static int find_max_hpa(struct device *dev, void *data)
>> +{
>> +	struct cxlrd_max_context *ctx = data;
>> +	struct cxl_switch_decoder *cxlsd;
>> +	struct cxl_root_decoder *cxlrd;
>> +	struct resource *res, *prev;
>> +	struct cxl_decoder *cxld;
>> +	resource_size_t max;
>> +	int found;
>> +
>> +	if (!is_root_decoder(dev))
>> +		return 0;
>> +
>> +	cxlrd = to_cxl_root_decoder(dev);
>> +	cxld = &cxlrd->cxlsd.cxld;
>> +	if ((cxld->flags & ctx->flags) != ctx->flags) {
>> +		dev_dbg(dev, "find_max_hpa, flags not matching: %08lx vs %08lx\n",
>> +			      cxld->flags, ctx->flags);
>> +		return 0;
>> +	}
>> +
>> +	/* A Host bridge could have more interleave ways than an
>> 	 * endpoint, couldn't it?
> EP interleave ways is about working out how the full HPA address (it's
> all sent over the wire) is modified to get to the DPA.  So it needs
> to know what the overall interleave is.  Host bridge can't interleave
> and then have the EP not know about it.  If there are switch HDM decoders
> in the path, the host bridge interleave may be less than that the EP needs
> to deal with.
>
> Does an accelerator actually cope with interleave? Is aim here to ensure
> that IW is never anything other than 1?  Or is this meant to have
> more general use? I guess it is meant to. In which case, I'd like to
> see this used in the type3 driver as well.


I guess an accelerator could cope with interleave ways > 1, but ours
cannot.

And an accelerator being an EP for an interleaved HPA does not make
sense to me, because the memory is not meaningful outside the
accelerator.

So if the CFMW and the Host Bridge have an interleave ways of 2,
implying accesses to the HPA through different wires, I assume an
accelerator should not be allowed.


>> +	 *
>> +	 * What does interleave ways mean here in terms of the requestor?
>> +	 * Why the FFMWS has 0 interleave ways but root port has 1?
> FFMWS?


I meant CFMW. I think I wrote this comment because I found that the
CFMW is parsed with interleave ways = 0 while the root port has 1,
which is confusing.


>
>> +	 */
>> +	if (cxld->interleave_ways != ctx->interleave_ways) {
>> +		dev_dbg(dev, "find_max_hpa, interleave_ways  not matching\n");
>> +		return 0;
>> +	}
>> +
>> +	cxlsd = &cxlrd->cxlsd;
>> +
>> +	guard(rwsem_read)(&cxl_region_rwsem);
>> +	found = 0;
>> +	for (int i = 0; i < ctx->interleave_ways; i++)
>> +		for (int j = 0; j < ctx->interleave_ways; j++)
>> +			if (ctx->host_bridges[i] ==
>> +					cxlsd->target[j]->dport_dev) {
>> +				found++;
>> +				break;
>> +			}
>> +
>> +	if (found != ctx->interleave_ways) {
>> +		dev_dbg(dev, "find_max_hpa, no interleave_ways found\n");
>> +		return 0;
>> +	}
>> +
>> +	/*
>> +	 * Walk the root decoder resource range relying on cxl_region_rwsem to
>> +	 * preclude sibling arrival/departure and find the largest free space
>> +	 * gap.
>> +	 */
>> +	lockdep_assert_held_read(&cxl_region_rwsem);
>> +	max = 0;
>> +	res = cxlrd->res->child;
>> +	if (!res)
>> +		max = resource_size(cxlrd->res);
>> +	else
>> +		max = 0;
>> +
>> +	for (prev = NULL; res; prev = res, res = res->sibling) {
>> +		struct resource *next = res->sibling;
>> +		resource_size_t free = 0;
>> +
>> +		if (!prev && res->start > cxlrd->res->start) {
>> +			free = res->start - cxlrd->res->start;
>> +			max = max(free, max);
>> +		}
>> +		if (prev && res->start > prev->end + 1) {
>> +			free = res->start - prev->end + 1;
>> +			max = max(free, max);
>> +		}
>> +		if (next && res->end + 1 < next->start) {
>> +			free = next->start - res->end + 1;
>> +			max = max(free, max);
>> +		}
>> +		if (!next && res->end + 1 < cxlrd->res->end + 1) {
>> +			free = cxlrd->res->end + 1 - res->end + 1;
>> +			max = max(free, max);
>> +		}
>> +	}
>> +
>> +	if (max > ctx->max_hpa) {
>> +		if (ctx->cxlrd)
>> +			put_device(CXLRD_DEV(ctx->cxlrd));
>> +		get_device(CXLRD_DEV(cxlrd));
>> +		ctx->cxlrd = cxlrd;
>> +		ctx->max_hpa = max;
>> +		dev_info(CXLRD_DEV(cxlrd), "found %pa bytes of free space\n", &max);
> dev_dbg()
>
>> +	}
>> +	return 0;
>> +}
>> +
>> +/**
>> + * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
>> + * @endpoint: an endpoint that is mapped by the returned decoder
>> + * @interleave_ways: number of entries in @host_bridges
>> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
>> + * @max: output parameter of bytes available in the returned decoder
> @available_size
> or something along those lines. I'd expect max to be the end address of the available
> region
>
>> + *
>> + * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
>> + * is a point in time snapshot. If by the time the caller goes to use this root
>> + * decoder's capacity the capacity is reduced then caller needs to loop and
>> + * retry.
>> + *
>> + * The returned root decoder has an elevated reference count that needs to be
>> + * put with put_device(cxlrd_dev(cxlrd)). Locking context is with
>> + * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
>> + * does not race.
>> + */
>> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>> +					       int interleave_ways,
>> +					       unsigned long flags,
>> +					       resource_size_t *max)
>> +{
>> +
>> +	struct cxlrd_max_context ctx = {
>> +		.host_bridges = &endpoint->host_bridge,
>> +		.interleave_ways = interleave_ways,
>> +		.flags = flags,
>> +	};
>> +	struct cxl_port *root_port;
>> +	struct cxl_root *root;
>> +
>> +	if (!is_cxl_endpoint(endpoint)) {
>> +		dev_dbg(&endpoint->dev, "hpa requestor is not an endpoint\n");
>> +		return ERR_PTR(-EINVAL);
>> +	}
>> +
>> +	root = find_cxl_root(endpoint);
>> +	if (!root) {
>> +		dev_dbg(&endpoint->dev, "endpoint can not be related to a root port\n");
>> +		return ERR_PTR(-ENXIO);
>> +	}
>> +
>> +	root_port = &root->port;
>> +	down_read(&cxl_region_rwsem);
>> +	device_for_each_child(&root_port->dev, &ctx, find_max_hpa);
>> +	up_read(&cxl_region_rwsem);
>> +	put_device(&root_port->dev);
>> +
>> +	if (!ctx.cxlrd)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	*max = ctx.max_hpa;
> Rename max_hpa to available_hpa.
>
>> +	return ctx.cxlrd;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_get_hpa_freespace, CXL);
>> +
>> +
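
For reference, the free-gap walk in find_max_hpa() above can be
sketched in user space as follows. The mock_* names are hypothetical,
and the sketch assumes the child ranges are sorted by start, do not
overlap, lie inside the window, and that the window does not reach the
top of the address space. Resource ends are inclusive, so the gap
between two neighbours is next.start - prev.end - 1. The quoted
expressions such as "res->start - prev->end + 1" appear to over-count
each gap by 2: for prev.end = 0x11ff and res.start = 0x1400 the gap is
0x200 bytes, while that expression yields 0x202.

```c
#include <stdint.h>
#include <stddef.h>

/* Address range with inclusive bounds, like struct resource. */
struct mock_range { uint64_t start, end; };

/*
 * Return the size in bytes of the largest free gap inside @window given
 * @n allocated child ranges (sorted, non-overlapping, inside @window).
 */
static uint64_t mock_largest_gap(const struct mock_range *window,
				 const struct mock_range *child, size_t n)
{
	uint64_t cursor = window->start;	/* first address not yet covered */
	uint64_t best = 0;

	for (size_t i = 0; i < n; i++) {
		if (child[i].start > cursor) {
			uint64_t gap = child[i].start - cursor;

			if (gap > best)
				best = gap;
		}
		cursor = child[i].end + 1;	/* skip past this allocation */
	}

	/* Tail gap between the last child and the end of the window. */
	if (cursor <= window->end) {
		uint64_t gap = window->end - cursor + 1;

		if (gap > best)
			best = gap;
	}
	return best;
}
```

Tracking a single cursor avoids the separate prev/next cases of the
quoted loop, which visit every interior gap twice.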

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-08-04 17:57   ` Jonathan Cameron
  2024-08-19 14:47     ` Alejandro Lucero Palau
  2024-08-28 10:18     ` Alejandro Lucero Palau
@ 2024-08-28 10:41     ` Alejandro Lucero Palau
  2024-08-28 11:26       ` Jonathan Cameron
  2 siblings, 1 reply; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-28 10:41 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes


On 8/4/24 18:57, Jonathan Cameron wrote:
> + }
>> +	return 0;
>> +}
>> +
>> +/**
>> + * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
>> + * @endpoint: an endpoint that is mapped by the returned decoder
>> + * @interleave_ways: number of entries in @host_bridges
>> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
>> + * @max: output parameter of bytes available in the returned decoder
> @available_size
> or something along those lines. I'd expect max to be the end address of the available
> region


Not really. The code looks for the biggest free hole in the HPA. 
Returning the available size does not help, except for informing about
the "internal fragmentation".


>> + *
>> + * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
>> + * is a point in time snapshot. If by the time the caller goes to use this root
>> + * decoder's capacity the capacity is reduced then caller needs to loop and
>> + * retry.
>> + *
>> + * The returned root decoder has an elevated reference count that needs to be
>> + * put with put_device(cxlrd_dev(cxlrd)). Locking context is with
>> + * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
>> + * does not race.
>> + */
>> +struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_port *endpoint,
>> +					       int interleave_ways,
>> +					       unsigned long flags,
>> +					       resource_size_t *max)
>> +{
>> +
>> +	struct cxlrd_max_context ctx = {
>> +		.host_bridges = &endpoint->host_bridge,
>> +		.interleave_ways = interleave_ways,
>> +		.flags = flags,
>> +	};
>> +	struct cxl_port *root_port;
>> +	struct cxl_root *root;
>> +
>> +	if (!is_cxl_endpoint(endpoint)) {
>> +		dev_dbg(&endpoint->dev, "hpa requestor is not an endpoint\n");
>> +		return ERR_PTR(-EINVAL);
>> +	}
>> +
>> +	root = find_cxl_root(endpoint);
>> +	if (!root) {
>> +		dev_dbg(&endpoint->dev, "endpoint can not be related to a root port\n");
>> +		return ERR_PTR(-ENXIO);
>> +	}
>> +
>> +	root_port = &root->port;
>> +	down_read(&cxl_region_rwsem);
>> +	device_for_each_child(&root_port->dev, &ctx, find_max_hpa);
>> +	up_read(&cxl_region_rwsem);
>> +	put_device(&root_port->dev);
>> +
>> +	if (!ctx.cxlrd)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	*max = ctx.max_hpa;
> Rename max_hpa to available_hpa.
>
>> +	return ctx.cxlrd;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_get_hpa_freespace, CXL);
>> +
>> +

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-08-28 10:18     ` Alejandro Lucero Palau
@ 2024-08-28 11:19       ` Jonathan Cameron
  0 siblings, 0 replies; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-28 11:19 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes

On Wed, 28 Aug 2024 11:18:12 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/4/24 18:57, Jonathan Cameron wrote:
> > On Mon, 15 Jul 2024 18:28:29 +0100
> > alejandro.lucero-palau@amd.com wrote:
> >  
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> CXL region creation involves allocating capacity from device DPA
> >> (device-physical-address space) and assigning it to decode a given HPA
> >> (host-physical-address space). Before determining how much DPA to
> >> allocate the amount of available HPA must be determined. Also, not all
> >> HPA is created equal, some specifically targets RAM, some targets PMEM,
> >> some is prepared for device-memory flows like HDM-D and HDM-DB, and some
> >> is host-only (HDM-H).
> >>
> >> Wrap all of those concerns into an API that retrieves a root decoder
> >> (platform CXL window) that fits the specified constraints and the
> >> capacity available for a new region.
> >>
> >> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m6fbe775541da3cd477d65fa95c8acdc347345b4f
> >>
> >> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >> Co-developed-by: Dan Williams <dan.j.williams@intel.com>  
> > Hi.
> >
> > This seems a lot more complex than an accelerator would need.
> > If plan is to use this in the type3 driver as well, I'd like to
> > see that done as a precursor to the main series.
> > If it only matters to accelerator drivers (as in type 3 I think
> > we make this a userspace problem), then limit the code to handle
> > interleave ways == 1 only.  Maybe we will care about higher interleave
> > in the long run, but do you have a multihead accelerator today?
> >
> > Jonathan
> >  
> >> ---
> >>   drivers/cxl/core/region.c          | 161 +++++++++++++++++++++++++++++
> >>   drivers/cxl/cxl.h                  |   3 +
> >>   drivers/cxl/cxlmem.h               |   5 +
> >>   drivers/net/ethernet/sfc/efx_cxl.c |  14 +++
> >>   include/linux/cxl_accel_mem.h      |   9 ++
> >>   5 files changed, 192 insertions(+)
> >>
> >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> >> index 538ebd5a64fd..ca464bfef77b 100644
> >> --- a/drivers/cxl/core/region.c
> >> +++ b/drivers/cxl/core/region.c
> >> @@ -702,6 +702,167 @@ static int free_hpa(struct cxl_region *cxlr)
> >>   	return 0;
> >>   }
> >>   
> >> +
> >> +struct cxlrd_max_context {
> >> +	struct device * const *host_bridges;
> >> +	int interleave_ways;
> >> +	unsigned long flags;
> >> +	resource_size_t max_hpa;
> >> +	struct cxl_root_decoder *cxlrd;
> >> +};
> >> +
> >> +static int find_max_hpa(struct device *dev, void *data)
> >> +{
> >> +	struct cxlrd_max_context *ctx = data;
> >> +	struct cxl_switch_decoder *cxlsd;
> >> +	struct cxl_root_decoder *cxlrd;
> >> +	struct resource *res, *prev;
> >> +	struct cxl_decoder *cxld;
> >> +	resource_size_t max;
> >> +	int found;
> >> +
> >> +	if (!is_root_decoder(dev))
> >> +		return 0;
> >> +
> >> +	cxlrd = to_cxl_root_decoder(dev);
> >> +	cxld = &cxlrd->cxlsd.cxld;
> >> +	if ((cxld->flags & ctx->flags) != ctx->flags) {
> >> +		dev_dbg(dev, "find_max_hpa, flags not matching: %08lx vs %08lx\n",
> >> +			      cxld->flags, ctx->flags);
> >> +		return 0;
> >> +	}
> >> +
> >> +	/* A Host bridge could have more interleave ways than an
> >> +	 * endpoint, couldn't it?
> > EP interleave ways is about working out how the full HPA address (it's
> > all sent over the wire) is modified to get to the DPA.  So it needs
> > to know what the overall interleave is.  Host bridge can't interleave
> > and then have the EP not know about it.  If there are switch HDM decoders
> > in the path, the host bridge interleave may be less than that the EP needs
> > to deal with.
> >
> > Does an accelerator actually cope with interleave? Is aim here to ensure
> > that IW is never anything other than 1?  Or is this meant to have
> > more general use? I guess it is meant to. In which case, I'd like to
> > see this used in the type3 driver as well.  
> 
> 
> I guess an accelerator could cope with interleave ways > 1, but ours cannot.
> 
> And it does not make sense to me for an accelerator to be an EP for an 
> interleaved HPA, because the memory does not make sense outside of the 
> accelerator.
> 
> So if the CFMW and the Host Bridge have an interleave ways of 2, implying 
> accesses to the HPA through different wires, I assume an accelerator 
> should not be allowed.
That's certainly fine for now. Maybe something will come along that can
make use of interleaving (I'm thinking of a processing-near-memory type of
setup, where minor work is offloaded closer to the memory but the device is
basically type 3 memory).
> 
> 
> >> +	 *
> >> +	 * What does interleave ways mean here in terms of the requestor?
> >> +	 * Why the FFMWS has 0 interleave ways but root port has 1?  
> > FFMWS?  
> 
> 
> I meant CFMW, and I think this comment is there because I found out the CFMW 
> is parsed with interleave ways = 0 while the root port has 1, which is 
> confusing.
> 
I'm a bit lost.  Maybe this is just encoded versus 'real' values?
1 way interleave is just not interleaving.

Jonathan
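
The "encoded versus real values" remark can be illustrated with a small
sketch. It assumes the HDM decoder interleave-ways encoding as I understand
it from the CXL specification (encoded 0 to 4 mean 1, 2, 4, 8 and 16 ways;
encoded 8 to 10 mean 3, 6 and 12 ways). The helper name sketch_eiw_to_ways()
is hypothetical and is not the kernel's API. Whether this is really what
explains the 0 versus 1 seen in the thread is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch only: translate an encoded interleave-ways field into the real
 * number of ways. An encoded value of 0 means 1 way, i.e. no interleaving.
 * Returns 0 on success and -1 for encodings this sketch does not know.
 */
static int sketch_eiw_to_ways(unsigned int eiw, unsigned int *ways)
{
	assert(ways != NULL);

	if (eiw <= 4) {
		/* power-of-two interleave: 1, 2, 4, 8, 16 */
		*ways = 1u << eiw;
		return 0;
	}
	if (eiw >= 8 && eiw <= 10) {
		/* 3-way based interleave: 3, 6, 12 */
		*ways = 3u << (eiw - 8);
		return 0;
	}
	return -1;
}
```

Under that reading, a window parsed as "0" and a port reporting "1" would
describe the same thing: a single, non-interleaved target.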




* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-08-28 10:41     ` Alejandro Lucero Palau
@ 2024-08-28 11:26       ` Jonathan Cameron
  2024-08-28 13:08         ` Alejandro Lucero Palau
  0 siblings, 1 reply; 114+ messages in thread
From: Jonathan Cameron @ 2024-08-28 11:26 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes

On Wed, 28 Aug 2024 11:41:11 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 8/4/24 18:57, Jonathan Cameron wrote:
> > + }  
> >> +	return 0;
> >> +}
> >> +
> >> +/**
> >> + * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
> >> + * @endpoint: an endpoint that is mapped by the returned decoder
> >> + * @interleave_ways: number of entries in @host_bridges
> >> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
> >> + * @max: output parameter of bytes available in the returned decoder  
> > @available_size
> > or something along those lines. I'd expect max to be the end address of the available
> > region  
> 
> 
> Not really. The code looks for the biggest free hole in the HPA. 
> Returning the available size does not help, except for informing about the 
> "internal fragmentation".

I worded that badly.  My point was that, to me, 'max' == maximum address, not
maximum available contiguous range.  max_hole or max_avail_contig maybe?

> 


* Re: [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration
  2024-08-28 11:26       ` Jonathan Cameron
@ 2024-08-28 13:08         ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-28 13:08 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet,
	richard.hughes


On 8/28/24 12:26, Jonathan Cameron wrote:
> On Wed, 28 Aug 2024 11:41:11 +0100
> Alejandro Lucero Palau <alucerop@amd.com> wrote:
>
>> On 8/4/24 18:57, Jonathan Cameron wrote:
>>> + }
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +/**
>>>> + * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
>>>> + * @endpoint: an endpoint that is mapped by the returned decoder
>>>> + * @interleave_ways: number of entries in @host_bridges
>>>> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
>>>> + * @max: output parameter of bytes available in the returned decoder
>>> @available_size
>>> or something along those lines. I'd expect max to be the end address of the available
>>> region
>>
>> Not really. The code looks for the biggest free hole in the HPA.
>> Returning the available size does not help, except for informing about the
>> "internal fragmentation".
> I worded that badly.  My point was that, to me, 'max' == maximum address, not
> maximum available contiguous range.  max_hole or max_avail_contig maybe?
>

Let's go with max_avail_contig.

Thanks!
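
For readers following the naming discussion, here is a minimal standalone
sketch of the difference between "total free space" and "largest contiguous
free hole" in a window. It is not the patch's find_max_hpa(); the struct and
function names are hypothetical, and it assumes the busy ranges are sorted,
non-overlapping and contained in the window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, in the spirit of a struct resource. */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the size in bytes of the largest contiguous free hole in @window,
 * given @n busy ranges sorted by start address.
 */
static uint64_t sketch_max_avail_contig(struct sketch_range window,
					const struct sketch_range *busy,
					size_t n)
{
	uint64_t cursor = window.start;	/* first address not yet examined */
	uint64_t best = 0;
	size_t i;

	assert(window.start <= window.end);

	for (i = 0; i < n; i++) {
		/* hole between the cursor and the next busy range */
		if (busy[i].start > cursor && busy[i].start - cursor > best)
			best = busy[i].start - cursor;
		/* busy range reaches the end of the window: nothing left */
		if (busy[i].end >= window.end)
			return best;
		cursor = busy[i].end + 1;
	}

	/* trailing hole after the last busy range */
	if (window.end - cursor + 1 > best)
		best = window.end - cursor + 1;

	return best;
}
```

With a 0x4000-byte window holding two busy ranges that leave holes of 0x1000
and 0x1c00 bytes, the function reports 0x1c00, while the total free space is
0x2c00. That gap is the "internal fragmentation" mentioned above.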




* Re: [PATCH v2 08/15] cxl: indicate probe deferral
  2024-08-26 17:42   ` Zhi Wang
@ 2024-08-28 13:43     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-28 13:43 UTC (permalink / raw)
  To: Zhi Wang, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, richard.hughes, targupta, vsethi


On 8/26/24 18:42, Zhi Wang wrote:
> On Mon, 15 Jul 2024 18:28:28 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> The first stop for a CXL accelerator driver that wants to establish
>> new CXL.mem regions is to register a 'struct cxl_memdev'. That kicks
>> off cxl_mem_probe() to enumerate all 'struct cxl_port' instances in
>> the topology up to the root.
>>
>> If the root driver has not attached yet the expectation is that the
>> driver waits until that link is established. The common cxl_pci_driver
>> has reason to keep the 'struct cxl_memdev' device attached to the bus
>> until the root driver attaches. An accelerator may want to instead
>> defer probing until CXL resources can be acquired.
>>
>> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
>> accelerator driver probing should be deferred vs failed. Provide that
>> indication via a new cxl_acquire_endpoint() API that can retrieve the
>> probe status of the memdev.
>>
>> The first consumer of this API is a test driver that exercises the
>> CXL Type-2 flow.
>>
> Out of curiosity, when and where do we probe CXL_DVSEC_CACHE_CAPABLE and
> enable the CXL_DVSEC_CACHE_ENABLE bit for a type-2 device?
>
> Thanks,
> Zhi.


As mentioned in the cover letter, this is a Type2 device, but the patchset 
does not deal with CXL.cache yet.

I hope we can discuss how to deal with CXL.cache at LPC next month. 
I'll be talking about it and the current state of this patchset.

Thank you


>> Based on
>> https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m18497367d2ae38f88e94c06369eaa83fa23e92b2
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>   drivers/cxl/core/memdev.c          | 41 ++++++++++++++++++++++++++++++
>>   drivers/cxl/core/port.c            |  2 +-
>>   drivers/cxl/mem.c                  |  7 +++--
>>   drivers/net/ethernet/sfc/efx_cxl.c | 10 +++++++-
>>   include/linux/cxl_accel_mem.h      |  3 +++
>>   5 files changed, 59 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index b902948b121f..d51c8bfb32e3 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -1137,6 +1137,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>   }
>>   EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>>   
>> +/*
>> + * Try to get a locked reference on a memdev's CXL port topology
>> + * connection. Be careful to observe when cxl_mem_probe() has deposited
>> + * a probe deferral awaiting the arrival of the CXL root driver
>> +*/
>> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
>> +{
>> +	struct cxl_port *endpoint;
>> +	int rc = -ENXIO;
>> +
>> +	device_lock(&cxlmd->dev);
>> +	endpoint = cxlmd->endpoint;
>> +	if (!endpoint)
>> +		goto err;
>> +
>> +	if (IS_ERR(endpoint)) {
>> +		rc = PTR_ERR(endpoint);
>> +		goto err;
>> +	}
>> +
>> +	device_lock(&endpoint->dev);
>> +	if (!endpoint->dev.driver)
>> +		goto err_endpoint;
>> +
>> +	return endpoint;
>> +
>> +err_endpoint:
>> +	device_unlock(&endpoint->dev);
>> +err:
>> +	device_unlock(&cxlmd->dev);
>> +	return ERR_PTR(rc);
>> +}
>> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
>> +
>> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
>> +{
>> +	device_unlock(&endpoint->dev);
>> +	device_unlock(&cxlmd->dev);
>> +}
>> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
>> +
>>   static void sanitize_teardown_notifier(void *data)
>>   {
>>   	struct cxl_memdev_state *mds = data;
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index d66c6349ed2d..3c6b896c5f65 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1553,7 +1553,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>>   		 */
>>   		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>>   			dev_name(dport_dev));
>> -		return -ENXIO;
>> +		return -EPROBE_DEFER;
>>   	}
>>   
>>   	parent_port = find_cxl_port(dparent, &parent_dport);
>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>> index f76af75a87b7..383a6f4829d3 100644
>> --- a/drivers/cxl/mem.c
>> +++ b/drivers/cxl/mem.c
>> @@ -145,13 +145,16 @@ static int cxl_mem_probe(struct device *dev)
>>   		return rc;
>>   
>>   	rc = devm_cxl_enumerate_ports(cxlmd);
>> -	if (rc)
>> +	if (rc) {
>> +		cxlmd->endpoint = ERR_PTR(rc);
>>   		return rc;
>> +	}
>>   
>>   	parent_port = cxl_mem_find_port(cxlmd, &dport);
>>   	if (!parent_port) {
>>   		dev_err(dev, "CXL port topology not found\n");
>> -		return -ENXIO;
>> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
>> +		return -EPROBE_DEFER;
>>   	}
>>   
>>   	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 0abe66490ef5..2cf4837ddfc1 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -65,8 +65,16 @@ void efx_cxl_init(struct efx_nic *efx)
>>   	}
>>   
>>   	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
>> -	if (IS_ERR(cxl->cxlmd))
>> +	if (IS_ERR(cxl->cxlmd)) {
>>   		pci_info(pci_dev, "CXL accel memdev creation failed");
>> +		return;
>> +	}
>> +
>> +	cxl->endpoint = cxl_acquire_endpoint(cxl->cxlmd);
>> +	if (IS_ERR(cxl->endpoint))
>> +		pci_info(pci_dev, "CXL accel acquire endpoint failed");
>> +
>> +	cxl_release_endpoint(cxl->cxlmd, cxl->endpoint);
>>   }
>>   
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> index 442ed9862292..701910021df8 100644
>> --- a/include/linux/cxl_accel_mem.h
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -29,4 +29,7 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds);
>>   struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>   				       struct cxl_dev_state *cxlds);
>> +
>> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
>> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
>>   #endif


* Re: [PATCH v2 11/15] cxl: make region type based on endpoint type
  2024-07-16  8:13     ` Alejandro Lucero Palau
@ 2024-08-28 16:06       ` Alejandro Lucero Palau
  0 siblings, 0 replies; 114+ messages in thread
From: Alejandro Lucero Palau @ 2024-08-28 16:06 UTC (permalink / raw)
  To: Li, Ming4, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet, richard.hughes


On 7/16/24 09:13, Alejandro Lucero Palau wrote:
>
> On 7/16/24 08:14, Li, Ming4 wrote:
>> On 7/16/2024 1:28 AM, alejandro.lucero-palau@amd.com wrote:
>>> From: Alejandro Lucero <alucerop@amd.com>
>>>
>>> Current code is expecting Type3 or CXL_DECODER_HOSTONLYMEM devices only.
>>> Support for Type2 implies region type needs to be based on the endpoint
>>> type instead.
>>>
>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>> ---
>>>   drivers/cxl/core/region.c | 14 +++++++++-----
>>>   1 file changed, 9 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>> index ca464bfef77b..5cc71b8868bc 100644
>>> --- a/drivers/cxl/core/region.c
>>> +++ b/drivers/cxl/core/region.c
>>> @@ -2645,7 +2645,8 @@ static ssize_t create_ram_region_show(struct device *dev,
>>>   }
>>>
>>>   static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
>>> -                      enum cxl_decoder_mode mode, int id)
>>> +                      enum cxl_decoder_mode mode, int id,
>>> +                      enum cxl_decoder_type target_type)
>>>   {
>>>       int rc;
>>>
>>> @@ -2667,7 +2668,7 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
>>>           return ERR_PTR(-EBUSY);
>>>       }
>>>
>>> -    return devm_cxl_add_region(cxlrd, id, mode, CXL_DECODER_HOSTONLYMEM);
>>> +    return devm_cxl_add_region(cxlrd, id, mode, target_type);
>>>   }
>>>
>>>   static ssize_t create_pmem_region_store(struct device *dev,
>>> @@ -2682,7 +2683,8 @@ static ssize_t create_pmem_region_store(struct device *dev,
>>>       if (rc != 1)
>>>           return -EINVAL;
>>>
>>> -    cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id);
>>> +    cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id,
>>> +                   CXL_DECODER_HOSTONLYMEM);
>>>       if (IS_ERR(cxlr))
>>>           return PTR_ERR(cxlr);
>>>
>>> @@ -2702,7 +2704,8 @@ static ssize_t create_ram_region_store(struct device *dev,
>>>       if (rc != 1)
>>>           return -EINVAL;
>>>
>>> -    cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id);
>>> +    cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id,
>>> +                   CXL_DECODER_HOSTONLYMEM);
>>>       if (IS_ERR(cxlr))
>>>           return PTR_ERR(cxlr);
>>>
>>> @@ -3364,7 +3367,8 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>
>>>       do {
>>>           cxlr = __create_region(cxlrd, cxled->mode,
>>> -                       atomic_read(&cxlrd->region_id));
>>> +                       atomic_read(&cxlrd->region_id),
>>> +                       cxled->cxld.target_type);
>>>       } while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
>>>
>>>       if (IS_ERR(cxlr)) {
>> I think that one more check between the type of root decoder and 
>> endpoint decoder is necessary in this case. Currently, root decoder 
>> type is hard coded to CXL_DECODER_HOSTONLYMEM, but it should be 
>> CXL_DECODER_DEVMEM or CXL_DECODER_HOSTONLYMEM based on 
>> cfmws->restrictions.
>>
>
> I think you are completely right.
>
> I will work on this looking also for other implications.
>
> Thanks
>
>
>>

I think the check could be performed inside cxl_attach_region, where the 
region type is already matched against the endpoint type. That is the 
check that triggers a failure for my Type2 support, and the reason behind 
this patch.

However, I think the way the decoder type is managed requires refactoring. 
From the CEDT CFMW restrictions I assume a decoder can support several 
types and is not restricted to just one, which is what the code does now 
by using an enumeration for the decoder type. With no restrictions, which 
is the current implementation in qemu, I would say a root decoder should 
be fine for a Type3 or a Type2. Adding a check that matches the root 
decoder type against the region type is therefore not possible without 
major changes. Since other potential restrictions, like supporting only 
RAM and not PMEM, are not currently managed either, I think this initial 
Type2 support should be fine without the check you propose, but a 
follow-up patch should address this problem, assuming of course that I'm 
not wrong about all this.
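
To make the enumeration-versus-mask point concrete, here is a rough
standalone sketch of what a mask-based check could look like. Everything in
it is hypothetical: the names, the bit layout, and in particular the
assumption that an empty mask means "no restrictions, any type is fine". It
does not reproduce the CEDT bit definitions or any existing kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical region types, standing in for host-only vs device memory. */
enum sketch_region_type {
	SKETCH_HOSTONLYMEM,
	SKETCH_DEVMEM,
};

/* Hypothetical "allowed types" bits kept on a root decoder. */
#define SKETCH_ALLOW_HOSTONLYMEM	(1u << 0)
#define SKETCH_ALLOW_DEVMEM		(1u << 1)

/*
 * Return true when a root decoder whose allowed types are @allowed_mask can
 * back a region of @type. Assumption: a mask of 0 means nothing was
 * restricted, so every type is accepted.
 */
static bool sketch_root_allows(uint32_t allowed_mask,
			       enum sketch_region_type type)
{
	uint32_t bit;

	assert(type == SKETCH_HOSTONLYMEM || type == SKETCH_DEVMEM);

	if (allowed_mask == 0)
		return true;

	bit = (type == SKETCH_DEVMEM) ? SKETCH_ALLOW_DEVMEM
				      : SKETCH_ALLOW_HOSTONLYMEM;
	return (allowed_mask & bit) != 0;
}
```

A single enum value cannot express "both types allowed", which is why a
check of this kind would need the decoder type to become a mask first.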




end of thread, other threads:[~2024-08-28 16:06 UTC | newest]

Thread overview: 114+ messages
2024-07-15 17:28 [PATCH v2 00/15] cxl: add Type2 device support alejandro.lucero-palau
2024-07-15 17:28 ` [PATCH v2 01/15] cxl: add type2 device basic support alejandro.lucero-palau
2024-07-15 18:48   ` Andrew Lunn
2024-07-16  8:50     ` Alejandro Lucero Palau
2024-07-16  1:57   ` kernel test robot
2024-07-18 23:12   ` Dave Jiang
2024-07-19  6:03     ` Alejandro Lucero Palau
2024-08-04 16:44       ` Jonathan Cameron
2024-08-09  7:26         ` Alejandro Lucero Palau
2024-08-04 17:10   ` Jonathan Cameron
2024-08-12 11:16     ` Alejandro Lucero Palau
2024-08-13  8:30       ` Alejandro Lucero Palau
2024-08-15 16:38         ` Jonathan Cameron
2024-08-19 11:12           ` Alejandro Lucero Palau
2024-08-20 10:44             ` Alejandro Lucero Palau
2024-08-15 16:35       ` Jonathan Cameron
2024-08-19 11:10         ` Alejandro Lucero Palau
2024-08-27 15:06           ` Jonathan Cameron
2024-08-09  8:34   ` Zhi Wang
2024-08-12 11:34     ` Alejandro Lucero Palau
2024-08-17 20:32       ` Zhi Wang
2024-08-19 11:13         ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 02/15] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
2024-07-16  6:26   ` Li, Ming4
2024-08-14  7:46     ` Alejandro Lucero Palau
2024-07-18 23:27   ` Dave Jiang
2024-08-14  7:49     ` Alejandro Lucero Palau
2024-08-04 17:15   ` Jonathan Cameron
2024-08-14  7:56     ` Alejandro Lucero Palau
2024-08-15 16:40       ` Jonathan Cameron
2024-08-18  8:07         ` Zhi Wang
2024-08-19 11:28           ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 03/15] cxl: add function for type2 resource request alejandro.lucero-palau
2024-07-18 23:36   ` Dave Jiang
2024-08-04 17:16     ` Jonathan Cameron
2024-08-14  8:08       ` Alejandro Lucero Palau
2024-08-14  8:00     ` Alejandro Lucero Palau
2024-08-09  9:01   ` Zhi Wang
2024-08-22 13:07   ` Zhi Wang
2024-08-23  9:30     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 04/15] cxl: add capabilities field to cxl_dev_state alejandro.lucero-palau
2024-07-19 19:01   ` Dave Jiang
2024-07-23 13:43     ` Alejandro Lucero Palau
2024-08-09 10:25       ` Zhi Wang
2024-08-15 15:37         ` Alejandro Lucero Palau
2024-08-18  6:55           ` Zhi Wang
2024-08-19 13:14             ` Alejandro Lucero Palau
2024-08-04 17:22   ` Jonathan Cameron
2024-08-15 15:43     ` Alejandro Lucero Palau
2024-08-09  9:10   ` Zhi Wang
2024-08-15 15:20     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 05/15] cxl: fix use of resource_contains alejandro.lucero-palau
2024-07-24 21:25   ` fan
2024-08-16 14:43     ` Alejandro Lucero Palau
2024-08-04 17:25   ` Jonathan Cameron
2024-08-16 14:37     ` Alejandro Lucero Palau
2024-08-27 15:12       ` Jonathan Cameron
2024-08-09  9:14   ` Zhi Wang
2024-08-16 14:42     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 06/15] cxl: add function for setting media ready by an accelerator alejandro.lucero-palau
2024-08-04 17:26   ` Jonathan Cameron
2024-08-16 14:54     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 07/15] cxl: support type2 memdev creation alejandro.lucero-palau
2024-07-24 21:32   ` fan
2024-08-16 14:57     ` Alejandro Lucero Palau
2024-08-04 17:31   ` Jonathan Cameron
2024-08-16 15:00     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 08/15] cxl: indicate probe deferral alejandro.lucero-palau
2024-07-16  5:52   ` Li, Ming4
2024-07-16  8:10     ` Alejandro Lucero Palau
2024-07-30 16:43   ` Fan Ni
2024-08-04 17:41   ` Jonathan Cameron
2024-08-19 13:54     ` Alejandro Lucero Palau
2024-08-09 14:40   ` Zhi Wang
2024-08-26 17:42   ` Zhi Wang
2024-08-28 13:43     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 09/15] cxl: define a driver interface for HPA free space enumaration alejandro.lucero-palau
2024-07-16  0:53   ` kernel test robot
2024-07-16  6:06   ` Li, Ming4
2024-07-24  8:24     ` Alejandro Lucero Palau
2024-07-25  5:51       ` Li, Ming4
2024-07-25 11:59         ` Alejandro Lucero Palau
2024-08-04 17:57   ` Jonathan Cameron
2024-08-19 14:47     ` Alejandro Lucero Palau
2024-08-27 15:18       ` Jonathan Cameron
2024-08-28 10:18     ` Alejandro Lucero Palau
2024-08-28 11:19       ` Jonathan Cameron
2024-08-28 10:41     ` Alejandro Lucero Palau
2024-08-28 11:26       ` Jonathan Cameron
2024-08-28 13:08         ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 10/15] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
2024-07-16  3:32   ` kernel test robot
2024-08-04 18:07   ` Jonathan Cameron
2024-08-19 15:52     ` Alejandro Lucero Palau
2024-08-06 17:33   ` Fan Ni
2024-08-19 15:57     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 11/15] cxl: make region type based on endpoint type alejandro.lucero-palau
2024-07-16  7:14   ` Li, Ming4
2024-07-16  8:13     ` Alejandro Lucero Palau
2024-08-28 16:06       ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 12/15] cxl: allow region creation by type2 drivers alejandro.lucero-palau
2024-08-04 18:29   ` Jonathan Cameron
2024-08-19 16:11     ` Alejandro Lucero Palau
2024-08-22 13:12   ` Zhi Wang
2024-08-23  9:31     ` Alejandro Lucero Palau
2024-08-27 15:20       ` Jonathan Cameron
2024-07-15 17:28 ` [PATCH v2 13/15] cxl: preclude device memory to be used for dax alejandro.lucero-palau
2024-07-15 17:28 ` [PATCH v2 14/15] cxl: add function for obtaining params from a region alejandro.lucero-palau
2024-08-09 15:24   ` Zhi Wang
2024-08-19 16:14     ` Alejandro Lucero Palau
2024-07-15 17:28 ` [PATCH v2 15/15] efx: support pio mapping based on cxl alejandro.lucero-palau
2024-08-04 18:13   ` Jonathan Cameron
2024-08-19 16:28     ` Alejandro Lucero Palau
2024-08-27 15:23       ` Jonathan Cameron
