[PATCH v4 00/26] cxl: add Type2 device support


* [PATCH v4 00/26] cxl: add Type2 device support
@ 2024-10-17 16:51 alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 01/26] cxl: add type2 device basic support alejandro.lucero-palau
                   ` (26 more replies)
  0 siblings, 27 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:51 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alejandro.lucero-palau@amd.com>

v4 changes:

 - Use bitmap for capabilities new field (Jonathan Cameron)

 - Use cxl_mem attributes for sysfs based on device type (Dave Jiang)

 - Add conditional cxl sfc compilation relying on kernel CXL config (kernel test robot)

 - Add sfc changes in different patches for facilitating backport (Jonathan Cameron)

 - Remove the patch dealing with cxl module dependencies and use sfc Kconfig
   plus MODULE_SOFTDEP instead.

v3 changes:

 - cxl_dev_state not defined as opaque but only manipulated by accel drivers
   through accessors.

 - accessors names not identified as only for accel drivers.

 - move pci code from pci driver (drivers/cxl/pci.c) to generic pci code
   (drivers/cxl/core/pci.c).

 - capabilities field from u8 to u32 and initialised by CXL regs discovering
   code.

 - add capabilities check and remove the current check done by the CXL regs
   discovering code.

 - Do not fail if CXL Device Registers are not found. They are not mandatory
   for Type2.

 - add timeout in acquire_endpoint for solving a race with the endpoint port
   creation.

 - handle EPROBE_DEFER by sfc driver.

 - Limiting interleave ways to 1 for accel driver HPA/DPA requests.

 - factoring out interleave ways and granularity helpers from type2 region
   creation patch.

 - restricting region_creation for type2 to one endpoint decoder.

 - add accessor for release_resource.

 - handle errors and error messages properly.

v2 changes:

I have removed the introduction about the concerns with BIOS/UEFI, since the
discussion confirmed the need for the functionality implemented, at least in
some scenarios.

There are two main changes from the RFC:

1) Following concerns about drivers using CXL core without restrictions, the CXL
structs to work with are opaque to those drivers, so functions are implemented
for modifying or reading those structs indirectly.

2) The driver for using the added functionality is not a test driver but a real
one: the SFC ethernet network driver. It uses the CXL region mapped for PIO
buffers instead of regions inside PCIe BARs.

RFC:

Current CXL kernel code is focused on supporting Type3 CXL devices, aka memory
expanders. Type2 CXL devices, aka device accelerators, share some functionalities
but require some special handling.

First of all, Type2 devices are by definition tied to a driver that does
something beyond memory expansion, so that driver is expected to work with the
CXL specifics. This implies the CXL setup needs to be done by such a driver
instead of by a generic CXL PCI driver, as is the case for memory expanders.
Most of this setup needs to use current CXL core code, which therefore needs to
be accessible to those vendor drivers. This is accomplished by exporting opaque
CXL structs and by adding and exporting functions for working with those structs
indirectly.
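
The opaque-struct approach described above can be illustrated with a minimal
userspace sketch. Everything in it is an assumption made for illustration: the
names accel_state, accel_state_create, accel_state_destroy, accel_set_serial
and accel_get_serial are hypothetical and are not the accessors added by this
series. The point is only that a caller holding the forward declaration can
pass the pointer around but cannot touch the fields directly.

```c
/*
 * Userspace sketch of the opaque-struct-plus-accessors pattern.
 * Every name here (accel_state, accel_set_serial, ...) is hypothetical;
 * this is not the kernel CXL API.
 */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Public-header view: the type is only declared, never defined. */
struct accel_state;

struct accel_state *accel_state_create(void);
void accel_state_destroy(struct accel_state *st);
void accel_set_serial(struct accel_state *st, uint64_t serial);
uint64_t accel_get_serial(const struct accel_state *st);

/* Core-only view: the layout is visible to the core and nowhere else. */
struct accel_state {
	uint64_t serial;
};

struct accel_state *accel_state_create(void)
{
	/* calloc() zero-fills the allocation, much like kzalloc() does. */
	return calloc(1, sizeof(struct accel_state));
}

void accel_state_destroy(struct accel_state *st)
{
	free(st);
}

void accel_set_serial(struct accel_state *st, uint64_t serial)
{
	assert(st != NULL);
	st->serial = serial;
}

uint64_t accel_get_serial(const struct accel_state *st)
{
	assert(st != NULL);
	return st->serial;
}
```

In a real split the struct definition would live in a core-private file, so a
driver that only includes the public declarations could not dereference the
pointer and would have to go through the set/get functions.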

Some of the patches are based on a patchset sent by Dan Williams [1] which was
only partially integrated, mostly the parts making things ready for Type2 but
none of those adding specific Type2 support. The patches based on Dan's work
list Dan as co-developer and include a link to the original patch.

A final note about CXL.cache is needed. This patchset does not cover it at all,
although the emulated Type2 device advertises it. From the kernel point of view,
supporting CXL.cache will imply making sure the CXL path supports what the Type2
device needs. A device accelerator will likely be connected to a Root Switch,
but other configurations cannot be ruled out. Therefore the kernel will need to
check not just HPA, DPA, interleave and granularity, but also the available
CXL.cache support and resources in each switch in the CXL path to the Type2
device. I expect to contribute to this support in the following months, and
it would be good to discuss it when possible.

[1] https://lore.kernel.org/linux-cxl/98b1f61a-e6c2-71d4-c368-50d958501b0c@intel.com/T/

Alejandro Lucero (26):
  cxl: add type2 device basic support
  sfc: add cxl support using new CXL API
  cxl: add capabilities field to cxl_dev_state and cxl_port
  cxl/pci: add check for validating capabilities
  cxl: move pci generic code
  cxl: add function for type2 cxl regs setup
  sfc: use cxl api for regs setup and checking
  cxl: add functions for resource request/release by a driver
  sfc: request cxl ram resource
  cxl: harden resource_contains checks to handle zero size resources
  cxl: add function for setting media ready by a driver
  sfc: set cxl media ready
  cxl: prepare memdev creation for type2
  sfc: create type2 cxl memdev
  cxl: define a driver interface for HPA free space enumeration
  sfc: obtain root decoder with enough HPA free space
  cxl: define a driver interface for DPA allocation
  sfc: get endpoint decoder
  cxl: make region type based on endpoint type
  cxl/region: factor out interleave ways setup
  cxl/region: factor out interleave granularity setup
  cxl: allow region creation by type2 drivers
  sfc: create cxl region
  cxl: preclude device memory to be used for dax
  cxl: add function for obtaining params from a region
  sfc: support pio mapping based on cxl

 drivers/cxl/core/hdm.c                | 160 ++++++++--
 drivers/cxl/core/memdev.c             | 124 +++++++-
 drivers/cxl/core/pci.c                | 124 ++++++++
 drivers/cxl/core/port.c               |  11 +-
 drivers/cxl/core/region.c             | 409 ++++++++++++++++++++++----
 drivers/cxl/core/regs.c               |  30 +-
 drivers/cxl/cxl.h                     |  14 +-
 drivers/cxl/cxlmem.h                  |   4 +
 drivers/cxl/cxlpci.h                  |  19 +-
 drivers/cxl/mem.c                     |  25 +-
 drivers/cxl/pci.c                     |  95 ++----
 drivers/net/ethernet/sfc/Kconfig      |   1 +
 drivers/net/ethernet/sfc/Makefile     |   2 +-
 drivers/net/ethernet/sfc/ef10.c       |  34 ++-
 drivers/net/ethernet/sfc/efx.c        |  16 +
 drivers/net/ethernet/sfc/efx_cxl.c    | 186 ++++++++++++
 drivers/net/ethernet/sfc/efx_cxl.h    |  29 ++
 drivers/net/ethernet/sfc/mcdi_pcol.h  |  12 +
 drivers/net/ethernet/sfc/net_driver.h |   8 +
 drivers/net/ethernet/sfc/nic.h        |   2 +
 include/linux/cxl/cxl.h               |  81 +++++
 include/linux/cxl/pci.h               |  23 ++
 22 files changed, 1210 insertions(+), 199 deletions(-)
 create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
 create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
 create mode 100644 include/linux/cxl/cxl.h
 create mode 100644 include/linux/cxl/pci.h

-- 
2.17.1


* [PATCH v4 01/26] cxl: add type2 device basic support
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-25 13:50   ` Jonathan Cameron
  2024-10-28 18:05   ` Dave Jiang
  2024-10-17 16:52 ` [PATCH v4 02/26] sfc: add cxl support using new CXL API alejandro.lucero-palau
                   ` (25 subsequent siblings)
  26 siblings, 2 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Differentiate Type3, aka memory expanders, from Type2, aka device
accelerators, with a new function for initializing cxl_dev_state.

Create accessors to cxl_dev_state to be used by accel drivers.

Based on previous work by Dan Williams [1]

Link: [1] https://lore.kernel.org/linux-cxl/168592160379.1948938.12863272903570476312.stgit@dwillia2-xfh.jf.intel.com/
Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c | 52 +++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/pci.c    |  1 +
 drivers/cxl/cxlpci.h      | 16 ------------
 drivers/cxl/pci.c         | 13 +++++++---
 include/linux/cxl/cxl.h   | 21 ++++++++++++++++
 include/linux/cxl/pci.h   | 23 +++++++++++++++++
 6 files changed, 106 insertions(+), 20 deletions(-)
 create mode 100644 include/linux/cxl/cxl.h
 create mode 100644 include/linux/cxl/pci.h

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 0277726afd04..94b8a7b53c92 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2020 Intel Corporation. */
 
+#include <linux/cxl/cxl.h>
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/firmware.h>
 #include <linux/device.h>
@@ -615,6 +616,25 @@ static void detach_memdev(struct work_struct *work)
 
 static struct lock_class_key cxl_memdev_key;
 
+struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
+{
+	struct cxl_dev_state *cxlds;
+
+	cxlds = kzalloc(sizeof(*cxlds), GFP_KERNEL);
+	if (!cxlds)
+		return ERR_PTR(-ENOMEM);
+
+	cxlds->dev = dev;
+	cxlds->type = CXL_DEVTYPE_DEVMEM;
+
+	cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
+	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
+	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
+
+	return cxlds;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
+
 static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
 					   const struct file_operations *fops)
 {
@@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
+void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
+{
+	cxlds->cxl_dvsec = dvsec;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_set_dvsec, CXL);
+
+void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial)
+{
+	cxlds->serial = serial;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_set_serial, CXL);
+
+int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
+		     enum cxl_resource type)
+{
+	switch (type) {
+	case CXL_RES_DPA:
+		cxlds->dpa_res = res;
+		return 0;
+	case CXL_RES_RAM:
+		cxlds->ram_res = res;
+		return 0;
+	case CXL_RES_PMEM:
+		cxlds->pmem_res = res;
+		return 0;
+	}
+
+	dev_err(cxlds->dev, "unknown resource type (%u)\n", type);
+	return -EINVAL;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_set_resource, CXL);
+
 static int cxl_memdev_release_file(struct inode *inode, struct file *file)
 {
 	struct cxl_memdev *cxlmd =
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 51132a575b27..3d6564dbda57 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -7,6 +7,7 @@
 #include <linux/pci.h>
 #include <linux/pci-doe.h>
 #include <linux/aer.h>
+#include <linux/cxl/pci.h>
 #include <cxlpci.h>
 #include <cxlmem.h>
 #include <cxl.h>
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 4da07727ab9c..eb59019fe5f3 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -14,22 +14,6 @@
  */
 #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
 
-/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
-#define CXL_DVSEC_PCIE_DEVICE					0
-#define   CXL_DVSEC_CAP_OFFSET		0xA
-#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
-#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
-#define   CXL_DVSEC_CTRL_OFFSET		0xC
-#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
-#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
-#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
-#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
-#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
-#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
-#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
-#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
-#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
-
 #define CXL_DVSEC_RANGE_MAX		2
 
 /* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 4be35dc22202..246930932ea6 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -3,6 +3,8 @@
 #include <asm-generic/unaligned.h>
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/moduleparam.h>
+#include <linux/cxl/cxl.h>
+#include <linux/cxl/pci.h>
 #include <linux/module.h>
 #include <linux/delay.h>
 #include <linux/sizes.h>
@@ -795,6 +797,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	struct cxl_memdev *cxlmd;
 	int i, rc, pmu_count;
 	bool irq_avail;
+	u16 dvsec;
 
 	/*
 	 * Double check the anonymous union trickery in struct cxl_regs
@@ -815,13 +818,15 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	pci_set_drvdata(pdev, cxlds);
 
 	cxlds->rcd = is_cxl_restricted(pdev);
-	cxlds->serial = pci_get_dsn(pdev);
-	cxlds->cxl_dvsec = pci_find_dvsec_capability(
-		pdev, PCI_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE);
-	if (!cxlds->cxl_dvsec)
+	cxl_set_serial(cxlds, pci_get_dsn(pdev));
+	dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
+					  CXL_DVSEC_PCIE_DEVICE);
+	if (!dvsec)
 		dev_warn(&pdev->dev,
 			 "Device DVSEC not present, skip CXL.mem init\n");
 
+	cxl_set_dvsec(cxlds, dvsec);
+
 	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
 	if (rc)
 		return rc;
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
new file mode 100644
index 000000000000..c06ca750168f
--- /dev/null
+++ b/include/linux/cxl/cxl.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
+
+#ifndef __CXL_H
+#define __CXL_H
+
+#include <linux/device.h>
+
+enum cxl_resource {
+	CXL_RES_DPA,
+	CXL_RES_RAM,
+	CXL_RES_PMEM,
+};
+
+struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
+
+void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
+void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial);
+int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
+		     enum cxl_resource);
+#endif
diff --git a/include/linux/cxl/pci.h b/include/linux/cxl/pci.h
new file mode 100644
index 000000000000..ad63560caa2c
--- /dev/null
+++ b/include/linux/cxl/pci.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
+
+#ifndef __CXL_ACCEL_PCI_H
+#define __CXL_ACCEL_PCI_H
+
+/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
+#define CXL_DVSEC_PCIE_DEVICE					0
+#define   CXL_DVSEC_CAP_OFFSET		0xA
+#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
+#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
+#define   CXL_DVSEC_CTRL_OFFSET		0xC
+#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
+#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + ((i) * 0x10))
+#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + ((i) * 0x10))
+#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
+#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
+#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
+#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + ((i) * 0x10))
+#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + ((i) * 0x10))
+#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
+
+#endif
-- 
2.17.1



* Re: [PATCH v4 01/26] cxl: add type2 device basic support
  2024-10-17 16:52 ` [PATCH v4 01/26] cxl: add type2 device basic support alejandro.lucero-palau
@ 2024-10-25 13:50   ` Jonathan Cameron
  2024-10-28  9:37     ` Alejandro Lucero Palau
  2024-10-28 18:05   ` Dave Jiang
  1 sibling, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2024-10-25 13:50 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, Alejandro Lucero

On Thu, 17 Oct 2024 17:52:00 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Differentiate Type3, aka memory expanders, from Type2, aka device
> accelerators, with a new function for initializing cxl_dev_state.
> 
> Create accessors to cxl_dev_state to be used by accel drivers.
> 
> Based on previous work by Dan Williams [1]
> 
> Link: [1] https://lore.kernel.org/linux-cxl/168592160379.1948938.12863272903570476312.stgit@dwillia2-xfh.jf.intel.com/
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Hi Alejandro,

A couple of trivial comments inline on things that would be good to tidy up.

> ---
>  drivers/cxl/core/memdev.c | 52 +++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/pci.c    |  1 +
>  drivers/cxl/cxlpci.h      | 16 ------------
>  drivers/cxl/pci.c         | 13 +++++++---
>  include/linux/cxl/cxl.h   | 21 ++++++++++++++++
>  include/linux/cxl/pci.h   | 23 +++++++++++++++++
>  6 files changed, 106 insertions(+), 20 deletions(-)
>  create mode 100644 include/linux/cxl/cxl.h
>  create mode 100644 include/linux/cxl/pci.h
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 0277726afd04..94b8a7b53c92 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c

> +int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> +		     enum cxl_resource type)
> +{
> +	switch (type) {
> +	case CXL_RES_DPA:
> +		cxlds->dpa_res = res;
> +		return 0;
> +	case CXL_RES_RAM:
> +		cxlds->ram_res = res;
> +		return 0;
> +	case CXL_RES_PMEM:
> +		cxlds->pmem_res = res;
> +		return 0;
> +	}
> +
> +	dev_err(cxlds->dev, "unknown resource type (%u)\n", type);

Given it's an enum and only enum values are ever passed to it, we should never
get here as they are all handled above.

So maybe drop?  Then if another type is added we will get a build
warning.
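
A minimal userspace sketch of the point above, assuming GCC or Clang with
-Wall (which enables -Wswitch). The names res_kind, res_state and res_set are
hypothetical stand-ins rather than the patch's own, and the sketch keeps a
final return only so that the non-void function never falls off its end.

```c
/*
 * Sketch of a switch over an enum with no default label.
 * res_kind, res_state and res_set are hypothetical stand-ins.
 */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum res_kind {
	RES_DPA,
	RES_RAM,
	RES_PMEM,
};

struct res_state {
	unsigned long dpa_size;
	unsigned long ram_size;
	unsigned long pmem_size;
};

int res_set(struct res_state *st, enum res_kind kind, unsigned long size)
{
	assert(st != NULL);

	/*
	 * No default label: with -Wswitch active, adding an enumerator to
	 * res_kind without a matching case here is flagged at build time.
	 */
	switch (kind) {
	case RES_DPA:
		st->dpa_size = size;
		return 0;
	case RES_RAM:
		st->ram_size = size;
		return 0;
	case RES_PMEM:
		st->pmem_size = size;
		return 0;
	}

	/* Only reached if a value outside the enum is passed in. */
	return -EINVAL;
}
```

A default label would silence that warning, which is why the sketch handles
every enumerator explicitly instead.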

> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_set_resource, CXL);
> +
>  static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>  {
>  	struct cxl_memdev *cxlmd =

> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> new file mode 100644
> index 000000000000..c06ca750168f
> --- /dev/null
> +++ b/include/linux/cxl/cxl.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> +
> +#ifndef __CXL_H
> +#define __CXL_H
> +
> +#include <linux/device.h>

I'd avoid this if possible and use a forwards definition for
struct device;
Also needed for 
struct cxl_dev_state;
And an include needed for linux/ioport.h for the struct
resource.
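
As a rough userspace illustration of the suggestion above: types that are only
passed by pointer can be forward declared, while a type passed by value needs
its full definition in view. All names below (dev_state, res_standin,
state_create, state_set_ram, state_ram_size) are hypothetical, and res_standin
stands in for what the kernel header would provide.

```c
/*
 * Sketch: forward declarations are enough for pointer parameters,
 * but a by-value parameter needs the complete type.
 * All names are hypothetical stand-ins.
 */
#include <assert.h>
#include <stdlib.h>

/* ---- header view ---- */
struct device;		/* forward declaration, pointer use only */
struct dev_state;	/* forward declaration, pointer use only */

/* Passed by value below, so the full definition must be visible here. */
struct res_standin {
	unsigned long start;
	unsigned long end;
};

struct dev_state *state_create(struct device *dev);
int state_set_ram(struct dev_state *st, struct res_standin res);
unsigned long state_ram_size(const struct dev_state *st);

/* ---- implementation view ---- */
struct device {
	int id;
};

struct dev_state {
	struct device *dev;
	struct res_standin ram;
};

struct dev_state *state_create(struct device *dev)
{
	struct dev_state *st = calloc(1, sizeof(*st));

	if (st)
		st->dev = dev;
	return st;
}

int state_set_ram(struct dev_state *st, struct res_standin res)
{
	assert(st != NULL);
	st->ram = res;
	return 0;
}

unsigned long state_ram_size(const struct dev_state *st)
{
	assert(st != NULL);
	/* The range is inclusive, so the size is end - start + 1. */
	return st->ram.end - st->ram.start + 1;
}
```

The prototypes in the header view compile without the definitions of struct
device or struct dev_state, which is what lets a header drop the heavier
include.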


> +
> +enum cxl_resource {
> +	CXL_RES_DPA,
> +	CXL_RES_RAM,
> +	CXL_RES_PMEM,
> +};
> +
> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
> +
> +void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
> +void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial);
> +int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> +		     enum cxl_resource);
> +#endif



* Re: [PATCH v4 01/26] cxl: add type2 device basic support
  2024-10-25 13:50   ` Jonathan Cameron
@ 2024-10-28  9:37     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-28  9:37 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/25/24 14:50, Jonathan Cameron wrote:
> On Thu, 17 Oct 2024 17:52:00 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Differentiate Type3, aka memory expanders, from Type2, aka device
>> accelerators, with a new function for initializing cxl_dev_state.
>>
>> Create accessors to cxl_dev_state to be used by accel drivers.
>>
>> Based on previous work by Dan Williams [1]
>>
>> Link: [1] https://lore.kernel.org/linux-cxl/168592160379.1948938.12863272903570476312.stgit@dwillia2-xfh.jf.intel.com/
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> Hi Alejandro,
>
> A couple of trivial comments inline on things that would be good to tidy up.
>
>> ---
>>   drivers/cxl/core/memdev.c | 52 +++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/core/pci.c    |  1 +
>>   drivers/cxl/cxlpci.h      | 16 ------------
>>   drivers/cxl/pci.c         | 13 +++++++---
>>   include/linux/cxl/cxl.h   | 21 ++++++++++++++++
>>   include/linux/cxl/pci.h   | 23 +++++++++++++++++
>>   6 files changed, 106 insertions(+), 20 deletions(-)
>>   create mode 100644 include/linux/cxl/cxl.h
>>   create mode 100644 include/linux/cxl/pci.h
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 0277726afd04..94b8a7b53c92 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> +int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>> +		     enum cxl_resource type)
>> +{
>> +	switch (type) {
>> +	case CXL_RES_DPA:
>> +		cxlds->dpa_res = res;
>> +		return 0;
>> +	case CXL_RES_RAM:
>> +		cxlds->ram_res = res;
>> +		return 0;
>> +	case CXL_RES_PMEM:
>> +		cxlds->pmem_res = res;
>> +		return 0;
>> +	}
>> +
>> +	dev_err(cxlds->dev, "unknown resource type (%u)\n", type);
> Given it's an enum and only enum values are ever passed to it, we should never
> get here as they are all handled above.
>
> So maybe drop?  Then if another type is added we will get a build
> warning.


OK. I'll do that.


>> +	return -EINVAL;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_set_resource, CXL);
>> +
>>   static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>>   {
>>   	struct cxl_memdev *cxlmd =
>> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
>> new file mode 100644
>> index 000000000000..c06ca750168f
>> --- /dev/null
>> +++ b/include/linux/cxl/cxl.h
>> @@ -0,0 +1,21 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>> +
>> +#ifndef __CXL_H
>> +#define __CXL_H
>> +
>> +#include <linux/device.h>
> I'd avoid this if possible and use a forwards definition for
> struct device;
> Also needed for
> struct cxl_dev_state;
> And an include needed for linux/ioport.h for the struct
> resource.


Right. And I'm adding another include later on in this patchset that 
makes this one unnecessary.

I guess the problematic thing is to refer to that core device header 
directly which makes things suspicious.

I'll change it.

Thanks!


>
>> +
>> +enum cxl_resource {
>> +	CXL_RES_DPA,
>> +	CXL_RES_RAM,
>> +	CXL_RES_PMEM,
>> +};
>> +
>> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
>> +
>> +void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
>> +void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial);
>> +int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>> +		     enum cxl_resource);
>> +#endif


* Re: [PATCH v4 01/26] cxl: add type2 device basic support
  2024-10-17 16:52 ` [PATCH v4 01/26] cxl: add type2 device basic support alejandro.lucero-palau
  2024-10-25 13:50   ` Jonathan Cameron
@ 2024-10-28 18:05   ` Dave Jiang
  2024-10-30 16:26     ` Alejandro Lucero Palau
  1 sibling, 1 reply; 64+ messages in thread
From: Dave Jiang @ 2024-10-28 18:05 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero



On 10/17/24 9:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Differentiate Type3, aka memory expanders, from Type2, aka device
> accelerators, with a new function for initializing cxl_dev_state.
> 
> Create accessors to cxl_dev_state to be used by accel drivers.
> 
> Based on previous work by Dan Williams [1]
> 
> Link: [1] https://lore.kernel.org/linux-cxl/168592160379.1948938.12863272903570476312.stgit@dwillia2-xfh.jf.intel.com/
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c | 52 +++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/pci.c    |  1 +
>  drivers/cxl/cxlpci.h      | 16 ------------
>  drivers/cxl/pci.c         | 13 +++++++---
>  include/linux/cxl/cxl.h   | 21 ++++++++++++++++
>  include/linux/cxl/pci.h   | 23 +++++++++++++++++
>  6 files changed, 106 insertions(+), 20 deletions(-)
>  create mode 100644 include/linux/cxl/cxl.h
>  create mode 100644 include/linux/cxl/pci.h
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 0277726afd04..94b8a7b53c92 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright(c) 2020 Intel Corporation. */
>  
> +#include <linux/cxl/cxl.h>
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/firmware.h>
>  #include <linux/device.h>
> @@ -615,6 +616,25 @@ static void detach_memdev(struct work_struct *work)
>  
>  static struct lock_class_key cxl_memdev_key;
>  
> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
> +{
> +	struct cxl_dev_state *cxlds;
> +
> +	cxlds = kzalloc(sizeof(*cxlds), GFP_KERNEL);
> +	if (!cxlds)
> +		return ERR_PTR(-ENOMEM);
> +
> +	cxlds->dev = dev;
> +	cxlds->type = CXL_DEVTYPE_DEVMEM;
> +
> +	cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
> +	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
> +	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
> +
> +	return cxlds;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
> +
>  static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>  					   const struct file_operations *fops)
>  {
> @@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
>  	return 0;
>  }
>  
> +void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
> +{
> +	cxlds->cxl_dvsec = dvsec;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_set_dvsec, CXL);
> +
> +void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial)
> +{
> +	cxlds->serial = serial;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_set_serial, CXL);
> +
> +int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> +		     enum cxl_resource type)
> +{
> +	switch (type) {
> +	case CXL_RES_DPA:
> +		cxlds->dpa_res = res;
> +		return 0;
> +	case CXL_RES_RAM:
> +		cxlds->ram_res = res;
> +		return 0;
> +	case CXL_RES_PMEM:
> +		cxlds->pmem_res = res;
> +		return 0;
> +	}
> +
> +	dev_err(cxlds->dev, "unknown resource type (%u)\n", type);
> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_set_resource, CXL);
> +
>  static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>  {
>  	struct cxl_memdev *cxlmd =
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 51132a575b27..3d6564dbda57 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -7,6 +7,7 @@
>  #include <linux/pci.h>
>  #include <linux/pci-doe.h>
>  #include <linux/aer.h>
> +#include <linux/cxl/pci.h>
>  #include <cxlpci.h>
>  #include <cxlmem.h>
>  #include <cxl.h>
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 4da07727ab9c..eb59019fe5f3 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -14,22 +14,6 @@
>   */
>  #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
>  
> -/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
> -#define CXL_DVSEC_PCIE_DEVICE					0
> -#define   CXL_DVSEC_CAP_OFFSET		0xA
> -#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
> -#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
> -#define   CXL_DVSEC_CTRL_OFFSET		0xC
> -#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
> -#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
> -#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
> -#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
> -#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
> -#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
> -#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
> -#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
> -#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
> -
>  #define CXL_DVSEC_RANGE_MAX		2
>  
>  /* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 4be35dc22202..246930932ea6 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -3,6 +3,8 @@
>  #include <asm-generic/unaligned.h>
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/moduleparam.h>
> +#include <linux/cxl/cxl.h>
> +#include <linux/cxl/pci.h>
>  #include <linux/module.h>
>  #include <linux/delay.h>
>  #include <linux/sizes.h>
> @@ -795,6 +797,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	struct cxl_memdev *cxlmd;
>  	int i, rc, pmu_count;
>  	bool irq_avail;
> +	u16 dvsec;
>  
>  	/*
>  	 * Double check the anonymous union trickery in struct cxl_regs
> @@ -815,13 +818,15 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	pci_set_drvdata(pdev, cxlds);
>  
>  	cxlds->rcd = is_cxl_restricted(pdev);
> -	cxlds->serial = pci_get_dsn(pdev);
> -	cxlds->cxl_dvsec = pci_find_dvsec_capability(
> -		pdev, PCI_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE);
> -	if (!cxlds->cxl_dvsec)
> +	cxl_set_serial(cxlds, pci_get_dsn(pdev));
> +	dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
> +					  CXL_DVSEC_PCIE_DEVICE);
> +	if (!dvsec)
>  		dev_warn(&pdev->dev,
>  			 "Device DVSEC not present, skip CXL.mem init\n");
>  
> +	cxl_set_dvsec(cxlds, dvsec);
> +
>  	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
>  	if (rc)
>  		return rc;
> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> new file mode 100644
> index 000000000000..c06ca750168f
> --- /dev/null
> +++ b/include/linux/cxl/cxl.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
> +
> +#ifndef __CXL_H
> +#define __CXL_H
> +
> +#include <linux/device.h>
> +
> +enum cxl_resource {
> +	CXL_RES_DPA,
> +	CXL_RES_RAM,
> +	CXL_RES_PMEM,
> +};
> +
> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
> +
> +void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
> +void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial);
> +int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> +		     enum cxl_resource);
> +#endif
> diff --git a/include/linux/cxl/pci.h b/include/linux/cxl/pci.h
> new file mode 100644
> index 000000000000..ad63560caa2c
> --- /dev/null
> +++ b/include/linux/cxl/pci.h

Just a reminder that this should go in as include/cxl/pci.h now.

DJ

> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> +
> +#ifndef __CXL_ACCEL_PCI_H
> +#define __CXL_ACCEL_PCI_H
> +
> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
> +#define CXL_DVSEC_PCIE_DEVICE					0
> +#define   CXL_DVSEC_CAP_OFFSET		0xA
> +#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
> +#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
> +#define   CXL_DVSEC_CTRL_OFFSET		0xC
> +#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
> +#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + ((i) * 0x10))
> +#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + ((i) * 0x10))
> +#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
> +#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
> +#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
> +#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + ((i) * 0x10))
> +#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + ((i) * 0x10))
> +#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
> +
> +#endif



* Re: [PATCH v4 01/26] cxl: add type2 device basic support
  2024-10-28 18:05   ` Dave Jiang
@ 2024-10-30 16:26     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-30 16:26 UTC (permalink / raw)
  To: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet


On 10/28/24 18:05, Dave Jiang wrote:


<snip>


>
>> diff --git a/include/linux/cxl/pci.h b/include/linux/cxl/pci.h
>> new file mode 100644
>> index 000000000000..ad63560caa2c
>> --- /dev/null
>> +++ b/include/linux/cxl/pci.h
> Just a reminder that this should go in as include/cxl/pci.h now.
>
> DJ


Yes, I'm aware of it. The kernel I'm working against just doesn't have 
that change yet, but I'll do it for v5.

Thanks!


>> @@ -0,0 +1,23 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
>> +
>> +#ifndef __CXL_ACCEL_PCI_H
>> +#define __CXL_ACCEL_PCI_H
>> +
>> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
>> +#define CXL_DVSEC_PCIE_DEVICE					0
>> +#define   CXL_DVSEC_CAP_OFFSET		0xA
>> +#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
>> +#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
>> +#define   CXL_DVSEC_CTRL_OFFSET		0xC
>> +#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
>> +#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + ((i) * 0x10))
>> +#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + ((i) * 0x10))
>> +#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
>> +#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
>> +#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
>> +#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + ((i) * 0x10))
>> +#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + ((i) * 0x10))
>> +#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
>> +
>> +#endif


* [PATCH v4 02/26] sfc: add cxl support using new CXL API
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 01/26] cxl: add type2 device basic support alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 21:48   ` Ben Cheatham
  2024-10-25 14:03   ` Jonathan Cameron
  2024-10-17 16:52 ` [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port alejandro.lucero-palau
                   ` (24 subsequent siblings)
  26 siblings, 2 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Add CXL initialization based on the new CXL API for accel drivers and make
it dependent on the kernel CXL configuration.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/net/ethernet/sfc/Kconfig      |  1 +
 drivers/net/ethernet/sfc/Makefile     |  2 +-
 drivers/net/ethernet/sfc/efx.c        | 16 +++++
 drivers/net/ethernet/sfc/efx_cxl.c    | 92 +++++++++++++++++++++++++++
 drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++
 drivers/net/ethernet/sfc/net_driver.h |  6 ++
 6 files changed, 145 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
 create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h

diff --git a/drivers/net/ethernet/sfc/Kconfig b/drivers/net/ethernet/sfc/Kconfig
index 3eb55dcfa8a6..b308a6f674b2 100644
--- a/drivers/net/ethernet/sfc/Kconfig
+++ b/drivers/net/ethernet/sfc/Kconfig
@@ -20,6 +20,7 @@ config SFC
 	tristate "Solarflare SFC9100/EF100-family support"
 	depends on PCI
 	depends on PTP_1588_CLOCK_OPTIONAL
+	depends on CXL_BUS && CXL_BUS=m && m
 	select MDIO
 	select CRC32
 	select NET_DEVLINK
diff --git a/drivers/net/ethernet/sfc/Makefile b/drivers/net/ethernet/sfc/Makefile
index 8f446b9bd5ee..e80c713c3b0c 100644
--- a/drivers/net/ethernet/sfc/Makefile
+++ b/drivers/net/ethernet/sfc/Makefile
@@ -7,7 +7,7 @@ sfc-y			+= efx.o efx_common.o efx_channels.o nic.o \
 			   mcdi_functions.o mcdi_filters.o mcdi_mon.o \
 			   ef100.o ef100_nic.o ef100_netdev.o \
 			   ef100_ethtool.o ef100_rx.o ef100_tx.o \
-			   efx_devlink.o
+			   efx_devlink.o efx_cxl.o
 sfc-$(CONFIG_SFC_MTD)	+= mtd.o
 sfc-$(CONFIG_SFC_SRIOV)	+= sriov.o ef10_sriov.o ef100_sriov.o ef100_rep.o \
                            mae.o tc.o tc_bindings.o tc_counters.o \
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 6f1a01ded7d4..cc7cdaccc5ed 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -33,6 +33,7 @@
 #include "selftest.h"
 #include "sriov.h"
 #include "efx_devlink.h"
+#include "efx_cxl.h"
 
 #include "mcdi_port_common.h"
 #include "mcdi_pcol.h"
@@ -899,6 +900,9 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
 	efx_pci_remove_main(efx);
 
 	efx_fini_io(efx);
+
+	efx_cxl_exit(efx);
+
 	pci_dbg(efx->pci_dev, "shutdown successful\n");
 
 	efx_fini_devlink_and_unlock(efx);
@@ -1109,6 +1113,15 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
 	if (rc)
 		goto fail2;
 
+	/* A successful cxl initialization implies a CXL region created to be
+	 * used for PIO buffers. If there is no CXL support, or initialization
+	 * fails, efx_cxl_pio_initialised will be false and legacy PIO buffers
+	 * defined at specific PCI BAR regions will be used.
+	 */
+	rc = efx_cxl_init(efx);
+	if (rc)
+		pci_err(pci_dev, "CXL initialization failed with error %d\n", rc);
+
 	rc = efx_pci_probe_post_io(efx);
 	if (rc) {
 		/* On failure, retry once immediately.
@@ -1380,3 +1393,6 @@ MODULE_AUTHOR("Solarflare Communications and "
 MODULE_DESCRIPTION("Solarflare network driver");
 MODULE_LICENSE("GPL");
 MODULE_DEVICE_TABLE(pci, efx_pci_table);
+#if IS_ENABLED(CONFIG_CXL_BUS)
+MODULE_SOFTDEP("pre: cxl_core cxl_port cxl_acpi cxl-mem");
+#endif
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
new file mode 100644
index 000000000000..fb3eef339b34
--- /dev/null
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -0,0 +1,92 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/****************************************************************************
+ *
+ * Driver for AMD network controllers and boards
+ * Copyright (C) 2024, Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation, incorporated herein by reference.
+ */
+
+#include <linux/cxl/cxl.h>
+#include <linux/cxl/pci.h>
+#include <linux/pci.h>
+
+#include "net_driver.h"
+#include "efx_cxl.h"
+
+#define EFX_CTPIO_BUFFER_SIZE	SZ_256M
+
+int efx_cxl_init(struct efx_nic *efx)
+{
+#if IS_ENABLED(CONFIG_CXL_BUS)
+	struct pci_dev *pci_dev = efx->pci_dev;
+	struct efx_cxl *cxl;
+	struct resource res;
+	u16 dvsec;
+	int rc;
+
+	efx->efx_cxl_pio_initialised = false;
+
+	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
+					  CXL_DVSEC_PCIE_DEVICE);
+	if (!dvsec)
+		return 0;
+
+	pci_dbg(pci_dev, "CXL_DVSEC_PCIE_DEVICE capability found\n");
+
+	cxl = kzalloc(sizeof(*cxl), GFP_KERNEL);
+	if (!cxl)
+		return -ENOMEM;
+
+	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
+	if (IS_ERR(cxl->cxlds)) {
+		pci_err(pci_dev, "CXL accel device state failed");
+		rc = -ENOMEM;
+		goto err1;
+	}
+
+	cxl_set_dvsec(cxl->cxlds, dvsec);
+	cxl_set_serial(cxl->cxlds, pci_dev->dev.id);
+
+	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
+	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_DPA)) {
+		pci_err(pci_dev, "cxl_set_resource DPA failed\n");
+		rc = -EINVAL;
+		goto err2;
+	}
+
+	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
+	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_RAM)) {
+		pci_err(pci_dev, "cxl_set_resource RAM failed\n");
+		rc = -EINVAL;
+		goto err2;
+	}
+
+	efx->cxl = cxl;
+#endif
+
+	return 0;
+
+#if IS_ENABLED(CONFIG_CXL_BUS)
+err2:
+	kfree(cxl->cxlds);
+err1:
+	kfree(cxl);
+	return rc;
+
+#endif
+}
+
+void efx_cxl_exit(struct efx_nic *efx)
+{
+#if IS_ENABLED(CONFIG_CXL_BUS)
+	if (efx->cxl) {
+		kfree(efx->cxl->cxlds);
+		kfree(efx->cxl);
+	}
+#endif
+}
+
+MODULE_IMPORT_NS(CXL);
diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
new file mode 100644
index 000000000000..f57fb2afd124
--- /dev/null
+++ b/drivers/net/ethernet/sfc/efx_cxl.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/****************************************************************************
+ * Driver for AMD network controllers and boards
+ * Copyright (C) 2024, Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation, incorporated herein by reference.
+ */
+
+#ifndef EFX_CXL_H
+#define EFX_CXL_H
+
+struct efx_nic;
+struct cxl_dev_state;
+
+struct efx_cxl {
+	struct cxl_dev_state *cxlds;
+	struct cxl_memdev *cxlmd;
+	struct cxl_root_decoder *cxlrd;
+	struct cxl_port *endpoint;
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_region *efx_region;
+	void __iomem *ctpio_cxl;
+};
+
+int efx_cxl_init(struct efx_nic *efx);
+void efx_cxl_exit(struct efx_nic *efx);
+#endif
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index b85c51cbe7f9..77261de65e63 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -817,6 +817,8 @@ enum efx_xdp_tx_queues_mode {
 
 struct efx_mae;
 
+struct efx_cxl;
+
 /**
  * struct efx_nic - an Efx NIC
  * @name: Device name (net device name or bus id before net device registered)
@@ -963,6 +965,8 @@ struct efx_mae;
  * @tc: state for TC offload (EF100).
  * @devlink: reference to devlink structure owned by this device
  * @dl_port: devlink port associated with the PF
+ * @cxl: details of related cxl objects
+ * @efx_cxl_pio_initialised: cxl initialization outcome.
  * @mem_bar: The BAR that is mapped into membase.
  * @reg_base: Offset from the start of the bar to the function control window.
  * @monitor_work: Hardware monitor workitem
@@ -1148,6 +1152,8 @@ struct efx_nic {
 
 	struct devlink *devlink;
 	struct devlink_port *dl_port;
+	struct efx_cxl *cxl;
+	bool efx_cxl_pio_initialised;
 	unsigned int mem_bar;
 	u32 reg_base;
 
-- 
2.17.1



* Re: [PATCH v4 02/26] sfc: add cxl support using new CXL API
  2024-10-17 16:52 ` [PATCH v4 02/26] sfc: add cxl support using new CXL API alejandro.lucero-palau
@ 2024-10-17 21:48   ` Ben Cheatham
  2024-10-18 13:38     ` Alejandro Lucero Palau
  2024-10-25 14:03   ` Jonathan Cameron
  1 sibling, 1 reply; 64+ messages in thread
From: Ben Cheatham @ 2024-10-17 21:48 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet

Hi Alejandro,

Thanks for sending this out, comments inline (for this patch and more).

On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Add CXL initialization based on the new CXL API for accel drivers and make
> it dependent on the kernel CXL configuration.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/net/ethernet/sfc/Kconfig      |  1 +
>  drivers/net/ethernet/sfc/Makefile     |  2 +-
>  drivers/net/ethernet/sfc/efx.c        | 16 +++++
>  drivers/net/ethernet/sfc/efx_cxl.c    | 92 +++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++
>  drivers/net/ethernet/sfc/net_driver.h |  6 ++
>  6 files changed, 145 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
> 
> diff --git a/drivers/net/ethernet/sfc/Kconfig b/drivers/net/ethernet/sfc/Kconfig
> index 3eb55dcfa8a6..b308a6f674b2 100644
> --- a/drivers/net/ethernet/sfc/Kconfig
> +++ b/drivers/net/ethernet/sfc/Kconfig
> @@ -20,6 +20,7 @@ config SFC
>  	tristate "Solarflare SFC9100/EF100-family support"
>  	depends on PCI
>  	depends on PTP_1588_CLOCK_OPTIONAL
> +	depends on CXL_BUS && CXL_BUS=m && m

It seems weird to me that this would be marked as a tristate Kconfig option, but is
required to be set to 'm'. Also, I'm assuming that SFC cards exist without CXL support,
so this would add an unnecessary dependency for those cards. So, I'm going to suggest
using a secondary Kconfig symbol like this:

config SFC_CXL
	tristate "Solarflare SFC9100/EF100-family CXL support"
	depends on SFC && m
	depends on CXL_BUS=m
	help
	  CXL support for SFC driver...

And then only compiling efx_cxl.c when that symbol is selected. This would also
require having a stub for efx_cxl_init()/exit() in efx_cxl.h. That *should* have
the same behavior as what you want above, but without requiring CXL to enable the
base SFC driver. I'm no Kconfig wizard, so it would pay to double check the above,
but I don't see a reason why something like it shouldn't be possible.
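
For what it's worth, here is a rough userspace model of how I read the
original "depends on CXL_BUS && CXL_BUS=m && m" line. It assumes the
documented Kconfig semantics: '&&' takes the minimum of its operands with
n < m < y, and 'CXL_BUS=m' yields y on a match and n otherwise. The enum and
helper names are made up for illustration and are not kernel code:

```c
#include <stdio.h>

/* Kconfig tristate values, ordered so that n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig '&&' evaluates to the minimum of its operands. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* Kconfig '<sym>=<val>' evaluates to y on a match and to n otherwise. */
static enum tristate tri_eq(enum tristate a, enum tristate b)
{
	return a == b ? TRI_Y : TRI_N;
}

/* Upper bound put on SFC by "depends on CXL_BUS && CXL_BUS=m && m". */
enum tristate sfc_limit(enum tristate cxl_bus)
{
	return tri_and(tri_and(cxl_bus, tri_eq(cxl_bus, TRI_M)), TRI_M);
}

/* Hypothetical helper: print the bound for each possible CXL_BUS value. */
void print_sfc_limits(void)
{
	static const char names[] = { 'n', 'm', 'y' };
	int v;

	for (v = TRI_N; v <= TRI_Y; v++)
		printf("CXL_BUS=%c -> SFC at most %c\n", names[v],
		       names[sfc_limit((enum tristate)v)]);
}
```

Under those assumptions SFC can only be enabled, and only as a module, when
CXL_BUS=m. Both CXL_BUS=n and CXL_BUS=y evaluate to n, which is the part that
would hurt cards without CXL support.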

>  	select MDIO
>  	select CRC32
>  	select NET_DEVLINK
> diff --git a/drivers/net/ethernet/sfc/Makefile b/drivers/net/ethernet/sfc/Makefile
> index 8f446b9bd5ee..e80c713c3b0c 100644
> --- a/drivers/net/ethernet/sfc/Makefile
> +++ b/drivers/net/ethernet/sfc/Makefile
> @@ -7,7 +7,7 @@ sfc-y			+= efx.o efx_common.o efx_channels.o nic.o \
>  			   mcdi_functions.o mcdi_filters.o mcdi_mon.o \
>  			   ef100.o ef100_nic.o ef100_netdev.o \
>  			   ef100_ethtool.o ef100_rx.o ef100_tx.o \
> -			   efx_devlink.o
> +			   efx_devlink.o efx_cxl.o

With above suggestion this becomes:

+ sfc-$(CONFIG_SFC_CXL)		+= efx_cxl.o

>  sfc-$(CONFIG_SFC_MTD)	+= mtd.o
>  sfc-$(CONFIG_SFC_SRIOV)	+= sriov.o ef10_sriov.o ef100_sriov.o ef100_rep.o \
>                             mae.o tc.o tc_bindings.o tc_counters.o \
> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
> index 6f1a01ded7d4..cc7cdaccc5ed 100644
> --- a/drivers/net/ethernet/sfc/efx.c
> +++ b/drivers/net/ethernet/sfc/efx.c
> @@ -33,6 +33,7 @@
>  #include "selftest.h"
>  #include "sriov.h"
>  #include "efx_devlink.h"
> +#include "efx_cxl.h"
>  
>  #include "mcdi_port_common.h"
>  #include "mcdi_pcol.h"
> @@ -899,6 +900,9 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>  	efx_pci_remove_main(efx);
>  
>  	efx_fini_io(efx);
> +
> +	efx_cxl_exit(efx);
> +
>  	pci_dbg(efx->pci_dev, "shutdown successful\n");
>  
>  	efx_fini_devlink_and_unlock(efx);
> @@ -1109,6 +1113,15 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
>  	if (rc)
>  		goto fail2;
>  
> +	/* A successful cxl initialization implies a CXL region created to be
> +	 * used for PIO buffers. If there is no CXL support, or initialization
> +	 * fails, efx_cxl_pio_initialised will be false and legacy PIO buffers
> +	 * defined at specific PCI BAR regions will be used.
> +	 */
> +	rc = efx_cxl_init(efx);
> +	if (rc)
> +		pci_err(pci_dev, "CXL initialization failed with error %d\n", rc);
> +
>  	rc = efx_pci_probe_post_io(efx);
>  	if (rc) {
>  		/* On failure, retry once immediately.
> @@ -1380,3 +1393,6 @@ MODULE_AUTHOR("Solarflare Communications and "
>  MODULE_DESCRIPTION("Solarflare network driver");
>  MODULE_LICENSE("GPL");
>  MODULE_DEVICE_TABLE(pci, efx_pci_table);
> +#if IS_ENABLED(CONFIG_CXL_BUS)
> +MODULE_SOFTDEP("pre: cxl_core cxl_port cxl_acpi cxl-mem");
> +#endif
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> new file mode 100644
> index 000000000000..fb3eef339b34
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -0,0 +1,92 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/****************************************************************************
> + *
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +#include <linux/cxl/cxl.h>
> +#include <linux/cxl/pci.h>
> +#include <linux/pci.h>
> +
> +#include "net_driver.h"
> +#include "efx_cxl.h"
> +
> +#define EFX_CTPIO_BUFFER_SIZE	SZ_256M
> +
> +int efx_cxl_init(struct efx_nic *efx)
> +{
> +#if IS_ENABLED(CONFIG_CXL_BUS)

With suggestion above you can drop this #if, since the file won't be
compiled when this is false.

> +	struct pci_dev *pci_dev = efx->pci_dev;
> +	struct efx_cxl *cxl;
> +	struct resource res;
> +	u16 dvsec;
> +	int rc;
> +
> +	efx->efx_cxl_pio_initialised = false;
> +
> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
> +					  CXL_DVSEC_PCIE_DEVICE);
> +	if (!dvsec)
> +		return 0;
> +
> +	pci_dbg(pci_dev, "CXL_DVSEC_PCIE_DEVICE capability found\n");
> +
> +	cxl = kzalloc(sizeof(*cxl), GFP_KERNEL);
> +	if (!cxl)
> +		return -ENOMEM;
> +
> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
> +	if (IS_ERR(cxl->cxlds)) {
> +		pci_err(pci_dev, "CXL accel device state failed");
> +		rc = -ENOMEM;
> +		goto err1;
> +	}
> +
> +	cxl_set_dvsec(cxl->cxlds, dvsec);
> +	cxl_set_serial(cxl->cxlds, pci_dev->dev.id);
> +
> +	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
> +	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_DPA)) {
> +		pci_err(pci_dev, "cxl_set_resource DPA failed\n");
> +		rc = -EINVAL;
> +		goto err2;
> +	}
> +
> +	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
> +	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_RAM)) {
> +		pci_err(pci_dev, "cxl_set_resource RAM failed\n");
> +		rc = -EINVAL;
> +		goto err2;
> +	}
> +
> +	efx->cxl = cxl;
> +#endif
> +
> +	return 0;
> +
> +#if IS_ENABLED(CONFIG_CXL_BUS)

Same here...

> +err2:
> +	kfree(cxl->cxlds);
> +err1:
> +	kfree(cxl);
> +	return rc;
> +
> +#endif
> +}
> +
> +void efx_cxl_exit(struct efx_nic *efx)
> +{
> +#if IS_ENABLED(CONFIG_CXL_BUS)

and here.

> +	if (efx->cxl) {
> +		kfree(efx->cxl->cxlds);
> +		kfree(efx->cxl);
> +	}
> +#endif
> +}
> +
> +MODULE_IMPORT_NS(CXL);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
> new file mode 100644
> index 000000000000..f57fb2afd124
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/****************************************************************************
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +#ifndef EFX_CXL_H
> +#define EFX_CXL_H
> +
> +struct efx_nic;
> +struct cxl_dev_state;
> +
> +struct efx_cxl {
> +	struct cxl_dev_state *cxlds;
> +	struct cxl_memdev *cxlmd;
> +	struct cxl_root_decoder *cxlrd;
> +	struct cxl_port *endpoint;
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_region *efx_region;
> +	void __iomem *ctpio_cxl;
> +};
> +
> +int efx_cxl_init(struct efx_nic *efx);
> +void efx_cxl_exit(struct efx_nic *efx);

As mentioned above, you would need a #ifdef block here with stubs for when CONFIG_SFC_CXL isn't enabled.
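
A minimal sketch of what such a block could look like, written so it also
compiles in userspace. The struct efx_nic mock is only there to make the
snippet self-contained, and CONFIG_SFC_CXL is the hypothetical symbol from my
suggestion above. In the driver itself this would probably use
IS_ENABLED(CONFIG_SFC_CXL) instead of a plain #ifdef, since the symbol may be
tristate:

```c
/* Mock of the driver state, just enough for this sketch to compile. */
struct efx_cxl;
struct efx_nic {
	struct efx_cxl *cxl;
	int efx_cxl_pio_initialised;
};

#ifdef CONFIG_SFC_CXL
/* Real implementations live in efx_cxl.c, built only with SFC_CXL set. */
int efx_cxl_init(struct efx_nic *efx);
void efx_cxl_exit(struct efx_nic *efx);
#else
/* Stubs: report success so probe carries on with the legacy PIO buffers. */
static inline int efx_cxl_init(struct efx_nic *efx)
{
	(void)efx;
	return 0;
}

static inline void efx_cxl_exit(struct efx_nic *efx)
{
	(void)efx;
}
#endif
```

With stubs along these lines, efx.c can keep calling efx_cxl_init() and
efx_cxl_exit() unconditionally, and the #if blocks inside efx_cxl.c can go.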

> +#endif
> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
> index b85c51cbe7f9..77261de65e63 100644
> --- a/drivers/net/ethernet/sfc/net_driver.h
> +++ b/drivers/net/ethernet/sfc/net_driver.h
> @@ -817,6 +817,8 @@ enum efx_xdp_tx_queues_mode {
>  
>  struct efx_mae;
>  
> +struct efx_cxl;
> +
>  /**
>   * struct efx_nic - an Efx NIC
>   * @name: Device name (net device name or bus id before net device registered)
> @@ -963,6 +965,8 @@ struct efx_mae;
>   * @tc: state for TC offload (EF100).
>   * @devlink: reference to devlink structure owned by this device
>   * @dl_port: devlink port associated with the PF
> + * @cxl: details of related cxl objects
> + * @efx_cxl_pio_initialised: cxl initialization outcome.
>   * @mem_bar: The BAR that is mapped into membase.
>   * @reg_base: Offset from the start of the bar to the function control window.
>   * @monitor_work: Hardware monitor workitem
> @@ -1148,6 +1152,8 @@ struct efx_nic {
>  
>  	struct devlink *devlink;
>  	struct devlink_port *dl_port;
> +	struct efx_cxl *cxl;
> +	bool efx_cxl_pio_initialised;
>  	unsigned int mem_bar;
>  	u32 reg_base;
>  



* Re: [PATCH v4 02/26] sfc: add cxl support using new CXL API
  2024-10-17 21:48   ` Ben Cheatham
@ 2024-10-18 13:38     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-18 13:38 UTC (permalink / raw)
  To: benjamin.cheatham, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/17/24 22:48, Ben Cheatham wrote:
> Hi Alejandro,
>
> Thanks for sending this out, comments inline (for this patch and more).
>
> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Add CXL initialization based on the new CXL API for accel drivers and make
>> it dependent on the kernel CXL configuration.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/net/ethernet/sfc/Kconfig      |  1 +
>>   drivers/net/ethernet/sfc/Makefile     |  2 +-
>>   drivers/net/ethernet/sfc/efx.c        | 16 +++++
>>   drivers/net/ethernet/sfc/efx_cxl.c    | 92 +++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++
>>   drivers/net/ethernet/sfc/net_driver.h |  6 ++
>>   6 files changed, 145 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>>
>> diff --git a/drivers/net/ethernet/sfc/Kconfig b/drivers/net/ethernet/sfc/Kconfig
>> index 3eb55dcfa8a6..b308a6f674b2 100644
>> --- a/drivers/net/ethernet/sfc/Kconfig
>> +++ b/drivers/net/ethernet/sfc/Kconfig
>> @@ -20,6 +20,7 @@ config SFC
>>   	tristate "Solarflare SFC9100/EF100-family support"
>>   	depends on PCI
>>   	depends on PTP_1588_CLOCK_OPTIONAL
>> +	depends on CXL_BUS && CXL_BUS=m && m
> It seems weird to me that this would be marked as a tristate Kconfig option, but is
> required to be set to 'm'. Also, I'm assuming that SFC cards exist without CXL support,
> so this would add an unnecessary dependency for those cards. So, I'm going to suggest
> using a secondary Kconfig symbol like this:


Yes, you are right.


My idea was to force sfc to be built as a module if cxl_bus was a module. I 
tested that case, the case of cxl_bus within the kernel image and sfc as a 
module, and the case of both cxl_bus and sfc being part of the kernel image. 
But I forgot the case of no cxl_bus!


So I've already followed your suggestion, not exactly, but I think the 
idea remains. Now there's a cxl option which only appears, and is therefore 
only configurable, if cxl_bus is a module. Inside sfc we already have 
another option, MTD, which is a similar case and requires kernel mtd as 
a module, so I think it is good enough.


I'll do a bit more testing and make the changes for v5.


Thanks!


> config SFC_CXL
> 	tristate "Solarflare SFC9100/EF100-family CXL support"
> 	depends on SFC && m
> 	depends on CXL_BUS=m
> 	help
> 	  CXL support for SFC driver...
>
> And then only compiling efx_cxl.c when that symbol is selected. This would also
> require having a stub for efx_cxl_init()/exit() in efx_cxl.h. That *should* have
> the same behavior as what you want above, but without requiring CXL to enable the
> base SFC driver. I'm no Kconfig wizard, so it would pay to double check the above,
> but I don't see a reason why something like it shouldn't be possible.
>
>>   	select MDIO
>>   	select CRC32
>>   	select NET_DEVLINK
>> diff --git a/drivers/net/ethernet/sfc/Makefile b/drivers/net/ethernet/sfc/Makefile
>> index 8f446b9bd5ee..e80c713c3b0c 100644
>> --- a/drivers/net/ethernet/sfc/Makefile
>> +++ b/drivers/net/ethernet/sfc/Makefile
>> @@ -7,7 +7,7 @@ sfc-y			+= efx.o efx_common.o efx_channels.o nic.o \
>>   			   mcdi_functions.o mcdi_filters.o mcdi_mon.o \
>>   			   ef100.o ef100_nic.o ef100_netdev.o \
>>   			   ef100_ethtool.o ef100_rx.o ef100_tx.o \
>> -			   efx_devlink.o
>> +			   efx_devlink.o efx_cxl.o
> With above suggestion this becomes:
>
> + sfc-$(CONFIG_SFC_CXL)		+= efx_cxl.o
>
>>   sfc-$(CONFIG_SFC_MTD)	+= mtd.o
>>   sfc-$(CONFIG_SFC_SRIOV)	+= sriov.o ef10_sriov.o ef100_sriov.o ef100_rep.o \
>>                              mae.o tc.o tc_bindings.o tc_counters.o \
>> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
>> index 6f1a01ded7d4..cc7cdaccc5ed 100644
>> --- a/drivers/net/ethernet/sfc/efx.c
>> +++ b/drivers/net/ethernet/sfc/efx.c
>> @@ -33,6 +33,7 @@
>>   #include "selftest.h"
>>   #include "sriov.h"
>>   #include "efx_devlink.h"
>> +#include "efx_cxl.h"
>>   
>>   #include "mcdi_port_common.h"
>>   #include "mcdi_pcol.h"
>> @@ -899,6 +900,9 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>>   	efx_pci_remove_main(efx);
>>   
>>   	efx_fini_io(efx);
>> +
>> +	efx_cxl_exit(efx);
>> +
>>   	pci_dbg(efx->pci_dev, "shutdown successful\n");
>>   
>>   	efx_fini_devlink_and_unlock(efx);
>> @@ -1109,6 +1113,15 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
>>   	if (rc)
>>   		goto fail2;
>>   
>> +	/* A successful cxl initialization implies a CXL region created to be
>> +	 * used for PIO buffers. If there is no CXL support, or initialization
>> +	 * fails, efx_cxl_pio_initialised will be false and legacy PIO buffers
>> +	 * defined at specific PCI BAR regions will be used.
>> +	 */
>> +	rc = efx_cxl_init(efx);
>> +	if (rc)
>> +		pci_err(pci_dev, "CXL initialization failed with error %d\n", rc);
>> +
>>   	rc = efx_pci_probe_post_io(efx);
>>   	if (rc) {
>>   		/* On failure, retry once immediately.
>> @@ -1380,3 +1393,6 @@ MODULE_AUTHOR("Solarflare Communications and "
>>   MODULE_DESCRIPTION("Solarflare network driver");
>>   MODULE_LICENSE("GPL");
>>   MODULE_DEVICE_TABLE(pci, efx_pci_table);
>> +#if IS_ENABLED(CONFIG_CXL_BUS)
>> +MODULE_SOFTDEP("pre: cxl_core cxl_port cxl_acpi cxl-mem");
>> +#endif
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> new file mode 100644
>> index 000000000000..fb3eef339b34
>> --- /dev/null
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -0,0 +1,92 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/****************************************************************************
>> + *
>> + * Driver for AMD network controllers and boards
>> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation, incorporated herein by reference.
>> + */
>> +
>> +#include <linux/cxl/cxl.h>
>> +#include <linux/cxl/pci.h>
>> +#include <linux/pci.h>
>> +
>> +#include "net_driver.h"
>> +#include "efx_cxl.h"
>> +
>> +#define EFX_CTPIO_BUFFER_SIZE	SZ_256M
>> +
>> +int efx_cxl_init(struct efx_nic *efx)
>> +{
>> +#if IS_ENABLED(CONFIG_CXL_BUS)
> With suggestion above you can drop this #if, since the file won't be
> compiled when this is false.
>
>> +	struct pci_dev *pci_dev = efx->pci_dev;
>> +	struct efx_cxl *cxl;
>> +	struct resource res;
>> +	u16 dvsec;
>> +	int rc;
>> +
>> +	efx->efx_cxl_pio_initialised = false;
>> +
>> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
>> +					  CXL_DVSEC_PCIE_DEVICE);
>> +	if (!dvsec)
>> +		return 0;
>> +
>> +	pci_dbg(pci_dev, "CXL_DVSEC_PCIE_DEVICE capability found\n");
>> +
>> +	cxl = kzalloc(sizeof(*cxl), GFP_KERNEL);
>> +	if (!cxl)
>> +		return -ENOMEM;
>> +
>> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
>> +	if (IS_ERR(cxl->cxlds)) {
>> +		pci_err(pci_dev, "CXL accel device state failed");
>> +		rc = -ENOMEM;
>> +		goto err1;
>> +	}
>> +
>> +	cxl_set_dvsec(cxl->cxlds, dvsec);
>> +	cxl_set_serial(cxl->cxlds, pci_dev->dev.id);
>> +
>> +	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
>> +	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_DPA)) {
>> +		pci_err(pci_dev, "cxl_set_resource DPA failed\n");
>> +		rc = -EINVAL;
>> +		goto err2;
>> +	}
>> +
>> +	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>> +	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_RAM)) {
>> +		pci_err(pci_dev, "cxl_set_resource RAM failed\n");
>> +		rc = -EINVAL;
>> +		goto err2;
>> +	}
>> +
>> +	efx->cxl = cxl;
>> +#endif
>> +
>> +	return 0;
>> +
>> +#if IS_ENABLED(CONFIG_CXL_BUS)
> Same here...
>
>> +err2:
>> +	kfree(cxl->cxlds);
>> +err1:
>> +	kfree(cxl);
>> +	return rc;
>> +
>> +#endif
>> +}
>> +
>> +void efx_cxl_exit(struct efx_nic *efx)
>> +{
>> +#if IS_ENABLED(CONFIG_CXL_BUS)
> and here.
>
>> +	if (efx->cxl) {
>> +		kfree(efx->cxl->cxlds);
>> +		kfree(efx->cxl);
>> +	}
>> +#endif
>> +}
>> +
>> +MODULE_IMPORT_NS(CXL);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
>> new file mode 100644
>> index 000000000000..f57fb2afd124
>> --- /dev/null
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
>> @@ -0,0 +1,29 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/****************************************************************************
>> + * Driver for AMD network controllers and boards
>> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation, incorporated herein by reference.
>> + */
>> +
>> +#ifndef EFX_CXL_H
>> +#define EFX_CXL_H
>> +
>> +struct efx_nic;
>> +struct cxl_dev_state;
>> +
>> +struct efx_cxl {
>> +	struct cxl_dev_state *cxlds;
>> +	struct cxl_memdev *cxlmd;
>> +	struct cxl_root_decoder *cxlrd;
>> +	struct cxl_port *endpoint;
>> +	struct cxl_endpoint_decoder *cxled;
>> +	struct cxl_region *efx_region;
>> +	void __iomem *ctpio_cxl;
>> +};
>> +
>> +int efx_cxl_init(struct efx_nic *efx);
>> +void efx_cxl_exit(struct efx_nic *efx);
> As mentioned above, you would need a #ifdef block here with stubs for when CONFIG_SFC_CXL isn't enabled.
>
>> +#endif
>> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
>> index b85c51cbe7f9..77261de65e63 100644
>> --- a/drivers/net/ethernet/sfc/net_driver.h
>> +++ b/drivers/net/ethernet/sfc/net_driver.h
>> @@ -817,6 +817,8 @@ enum efx_xdp_tx_queues_mode {
>>   
>>   struct efx_mae;
>>   
>> +struct efx_cxl;
>> +
>>   /**
>>    * struct efx_nic - an Efx NIC
>>    * @name: Device name (net device name or bus id before net device registered)
>> @@ -963,6 +965,8 @@ struct efx_mae;
>>    * @tc: state for TC offload (EF100).
>>    * @devlink: reference to devlink structure owned by this device
>>    * @dl_port: devlink port associated with the PF
>> + * @cxl: details of related cxl objects
>> + * @efx_cxl_pio_initialised: cxl initialization outcome.
>>    * @mem_bar: The BAR that is mapped into membase.
>>    * @reg_base: Offset from the start of the bar to the function control window.
>>    * @monitor_work: Hardware monitor workitem
>> @@ -1148,6 +1152,8 @@ struct efx_nic {
>>   
>>   	struct devlink *devlink;
>>   	struct devlink_port *dl_port;
>> +	struct efx_cxl *cxl;
>> +	bool efx_cxl_pio_initialised;
>>   	unsigned int mem_bar;
>>   	u32 reg_base;
>>   


* Re: [PATCH v4 02/26] sfc: add cxl support using new CXL API
  2024-10-17 16:52 ` [PATCH v4 02/26] sfc: add cxl support using new CXL API alejandro.lucero-palau
  2024-10-17 21:48   ` Ben Cheatham
@ 2024-10-25 14:03   ` Jonathan Cameron
  2024-10-28 11:59     ` Alejandro Lucero Palau
  1 sibling, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2024-10-25 14:03 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, Alejandro Lucero

On Thu, 17 Oct 2024 17:52:01 +0100
<alejandro.lucero-palau@amd.com> wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Add CXL initialization based on the new CXL API for accel drivers and make
> it dependent on the kernel CXL configuration.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/net/ethernet/sfc/Kconfig      |  1 +
>  drivers/net/ethernet/sfc/Makefile     |  2 +-
>  drivers/net/ethernet/sfc/efx.c        | 16 +++++
>  drivers/net/ethernet/sfc/efx_cxl.c    | 92 +++++++++++++++++++++++++++
>  drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++
>  drivers/net/ethernet/sfc/net_driver.h |  6 ++
>  6 files changed, 145 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>  create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
> 
> diff --git a/drivers/net/ethernet/sfc/Kconfig b/drivers/net/ethernet/sfc/Kconfig
> index 3eb55dcfa8a6..b308a6f674b2 100644
> --- a/drivers/net/ethernet/sfc/Kconfig
> +++ b/drivers/net/ethernet/sfc/Kconfig
> @@ -20,6 +20,7 @@ config SFC
>  	tristate "Solarflare SFC9100/EF100-family support"
>  	depends on PCI
>  	depends on PTP_1588_CLOCK_OPTIONAL
> +	depends on CXL_BUS && CXL_BUS=m && m

Do some Makefile magic and add stubs so that efx_cxl.c is only
built at all if the necessary conditions are met.
It doesn't necessarily need a visible control.

config SFC9100_CXL
	bool

then here have
select SFC9100_CXL if CXL_BUS


>  	select MDIO
>  	select CRC32
>  	select NET_DEVLINK

> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> new file mode 100644
> index 000000000000..fb3eef339b34
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -0,0 +1,92 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/****************************************************************************
> + *
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +#include <linux/cxl/cxl.h>
> +#include <linux/cxl/pci.h>
> +#include <linux/pci.h>
> +
> +#include "net_driver.h"
> +#include "efx_cxl.h"
> +
> +#define EFX_CTPIO_BUFFER_SIZE	SZ_256M
> +
> +int efx_cxl_init(struct efx_nic *efx)
> +{
> +#if IS_ENABLED(CONFIG_CXL_BUS)

If it can't do anything useful, make the build depend on this
and provide stubs for when it isn't enabled.

I'd not expect to see any ifdef stuff for basic CXL in this file,
as it should not be built unless they are all configured.
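
Something along these lines is what I mean by stubs. This is only a
standalone sketch, not the actual header: the CONFIG_SFC9100_CXL name
follows the hypothetical Kconfig symbol suggested above, struct efx_nic
is reduced to a stand-in, and probe_sketch() is an invented caller.

```c
/*
 * Sketch of an efx_cxl.h-style header with stubs.
 * CONFIG_SFC9100_CXL is a hypothetical symbol; struct efx_nic is a
 * cut-down stand-in so the example compiles outside the kernel.
 */
struct efx_nic {
	void *cxl;	/* stand-in for struct efx_cxl * */
};

#ifdef CONFIG_SFC9100_CXL
/* Real implementations live in efx_cxl.c, built only in this case. */
int efx_cxl_init(struct efx_nic *efx);
void efx_cxl_exit(struct efx_nic *efx);
#else
/* CXL support compiled out: callers still build, nothing happens. */
static inline int efx_cxl_init(struct efx_nic *efx)
{
	(void)efx;
	return 0;
}

static inline void efx_cxl_exit(struct efx_nic *efx)
{
	(void)efx;
}
#endif

/* Hypothetical caller: no #ifdef is needed at the call sites. */
static int probe_sketch(struct efx_nic *efx)
{
	int rc = efx_cxl_init(efx);

	if (rc)
		return rc;
	efx_cxl_exit(efx);
	return 0;
}
```

With that in the header, efx_cxl.c itself needs no conditional
compilation, because the Makefile simply does not build it when the
symbol is off.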

> +	struct pci_dev *pci_dev = efx->pci_dev;
> +	struct efx_cxl *cxl;
> +	struct resource res;
> +	u16 dvsec;
> +	int rc;
> +
> +	efx->efx_cxl_pio_initialised = false;
> +
> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
> +					  CXL_DVSEC_PCIE_DEVICE);
> +	if (!dvsec)
> +		return 0;
> +
> +	pci_dbg(pci_dev, "CXL_DVSEC_PCIE_DEVICE capability found\n");
> +
> +	cxl = kzalloc(sizeof(*cxl), GFP_KERNEL);

__free magic here.
Assuming later changes don't make that a bad idea - I've not
read the whole set for a while.


> +	if (!cxl)
> +		return -ENOMEM;
> +
> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);

Stash this in a local cxl_dev_state for now and use __free() for that.

> +	if (IS_ERR(cxl->cxlds)) {
> +		pci_err(pci_dev, "CXL accel device state failed");
> +		rc = -ENOMEM;
> +		goto err1;
> +	}
> +
> +	cxl_set_dvsec(cxl->cxlds, dvsec);
> +	cxl_set_serial(cxl->cxlds, pci_dev->dev.id);
> +
> +	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
> +	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_DPA)) {
> +		pci_err(pci_dev, "cxl_set_resource DPA failed\n");
> +		rc = -EINVAL;
> +		goto err2;
> +	}
> +
> +	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
> +	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_RAM)) {
> +		pci_err(pci_dev, "cxl_set_resource RAM failed\n");
> +		rc = -EINVAL;
> +		goto err2;
> +	}
> +
	cxl->cxlds = no_free_ptr(cxlds);
	efx->cxl = no_free_ptr(cxl);

> +	efx->cxl = cxl;
> +#endif
> +
> +	return 0;
> +
> +#if IS_ENABLED(CONFIG_CXL_BUS)
With the __free changes suggested above, there is no need to do anything
here and you can return directly from the error checks above.
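
A user-space sketch of the pattern, in case it helps. It assumes the
GCC/Clang cleanup attribute and statement expressions; auto_free,
free_ptr() and cxl_init_sketch() are made-up stand-ins for the kernel's
__free(kfree), its cleanup handler and efx_cxl_init(), and the structs
are reduced to the fields needed here.

```c
#include <stdlib.h>
#include <string.h>

/* Cleanup handler: receives the address of the annotated variable. */
static void free_ptr(void *p)
{
	void *mem;

	memcpy(&mem, p, sizeof(mem));
	free(mem);	/* free(NULL) is a no-op */
}

/* Stand-ins for the kernel's __free(kfree) and no_free_ptr(). */
#define auto_free	__attribute__((cleanup(free_ptr)))
#define no_free_ptr(p)	({ __typeof__(p) tmp_ = (p); (p) = NULL; tmp_; })

struct dev_state { int dvsec; };
struct efx_cxl { struct dev_state *cxlds; };
struct efx_nic { struct efx_cxl *cxl; };

/* fail_resource simulates a later setup step (e.g. set_resource) failing. */
static int cxl_init_sketch(struct efx_nic *efx, int fail_resource)
{
	struct efx_cxl *cxl auto_free = calloc(1, sizeof(*cxl));
	if (!cxl)
		return -1;

	struct dev_state *cxlds auto_free = calloc(1, sizeof(*cxlds));
	if (!cxlds)
		return -1;	/* cxl is freed on the way out */

	if (fail_resource)
		return -2;	/* both allocations are freed on the way out */

	/* Success: transfer ownership so the cleanup handlers see NULL. */
	cxl->cxlds = no_free_ptr(cxlds);
	efx->cxl = no_free_ptr(cxl);
	return 0;
}
```

The point is that every error check can return directly, with no err1/err2
labels to keep in sync with the allocation order.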

> +err2:
> +	kfree(cxl->cxlds);
> +err1:
> +	kfree(cxl);
> +	return rc;
> +
> +#endif
> +}
> +
> +void efx_cxl_exit(struct efx_nic *efx)
> +{
> +#if IS_ENABLED(CONFIG_CXL_BUS)

IS_REACHABLE() maybe, so if this driver is built in but CXL_BUS
is not then this will go away.
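
To illustrate the difference, here is a standalone sketch. The helper
macros are written from my recollection of include/linux/kconfig.h, so
treat them as an approximation of the real header; the two wrapper
functions are invented for the example. It simulates CXL_BUS=m with this
driver built in (MODULE not defined).

```c
/* Simulated configuration: CXL_BUS=m, and this file is built in. */
#define CONFIG_CXL_BUS_MODULE 1

/* Preprocessor helpers modelled on include/linux/kconfig.h. */
#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val

#define __and(x, y)			___and(x, y)
#define ___and(x, y)			____and(__ARG_PLACEHOLDER_##x, y)
#define ____and(arg1_or_junk, y)	__take_second_arg(arg1_or_junk y, 0)

#define __or(x, y)			___or(x, y)
#define ___or(x, y)			____or(__ARG_PLACEHOLDER_##x, y)
#define ____or(arg1_or_junk, y)		__take_second_arg(arg1_or_junk 1, y)

#define __is_defined(x)			___is_defined(x)
#define ___is_defined(val)		____is_defined(__ARG_PLACEHOLDER_##val)
#define ____is_defined(arg1_or_junk)	__take_second_arg(arg1_or_junk 1, 0)

#define IS_BUILTIN(option)	__is_defined(option)
#define IS_MODULE(option)	__is_defined(option##_MODULE)
/* True if the option is y, or m while the caller is itself a module. */
#define IS_REACHABLE(option)	__or(IS_BUILTIN(option), \
				     __and(IS_MODULE(option), __is_defined(MODULE)))
/* True if the option is y or m, regardless of who is asking. */
#define IS_ENABLED(option)	__or(IS_BUILTIN(option), IS_MODULE(option))

/* Hypothetical wrappers so the results can be inspected at run time. */
static int cxl_bus_enabled(void)
{
	return IS_ENABLED(CONFIG_CXL_BUS);
}

static int cxl_bus_reachable(void)
{
	return IS_REACHABLE(CONFIG_CXL_BUS);
}
```

In this configuration IS_ENABLED() still evaluates to true, so built-in
code would reference symbols that only exist in a module, whereas
IS_REACHABLE() evaluates to false and the block drops out.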

> +	if (efx->cxl) {
> +		kfree(efx->cxl->cxlds);
> +		kfree(efx->cxl);
> +	}
> +#endif
> +}
> +
> +MODULE_IMPORT_NS(CXL);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
> new file mode 100644
> index 000000000000..f57fb2afd124
> --- /dev/null
> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/****************************************************************************
> + * Driver for AMD network controllers and boards
> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +#ifndef EFX_CXL_H
> +#define EFX_CXL_H
> +
> +struct efx_nic;
> +struct cxl_dev_state;

struct cxl_memdev;
struct cxl_root_decoder;
struct cxl_port;
...

> +
> +struct efx_cxl {
> +	struct cxl_dev_state *cxlds;
> +	struct cxl_memdev *cxlmd;
> +	struct cxl_root_decoder *cxlrd;
> +	struct cxl_port *endpoint;
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_region *efx_region;
> +	void __iomem *ctpio_cxl;
> +};
> +
> +int efx_cxl_init(struct efx_nic *efx);
> +void efx_cxl_exit(struct efx_nic *efx);
> +#endif
> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
> index b85c51cbe7f9..77261de65e63 100644
> --- a/drivers/net/ethernet/sfc/net_driver.h
> +++ b/drivers/net/ethernet/sfc/net_driver.h
> @@ -817,6 +817,8 @@ enum efx_xdp_tx_queues_mode {
>  
>  struct efx_mae;
>  
> +struct efx_cxl;
> +
>  /**
>   * struct efx_nic - an Efx NIC
>   * @name: Device name (net device name or bus id before net device registered)
> @@ -963,6 +965,8 @@ struct efx_mae;
>   * @tc: state for TC offload (EF100).
>   * @devlink: reference to devlink structure owned by this device
>   * @dl_port: devlink port associated with the PF
> + * @cxl: details of related cxl objects
> + * @efx_cxl_pio_initialised: clx initialization outcome.

cxl

Also, it's in a struct called efx_nic, so is the efx_ prefix
useful?

>   * @mem_bar: The BAR that is mapped into membase.
>   * @reg_base: Offset from the start of the bar to the function control window.
>   * @monitor_work: Hardware monitor workitem
> @@ -1148,6 +1152,8 @@ struct efx_nic {
>  
>  	struct devlink *devlink;
>  	struct devlink_port *dl_port;
> +	struct efx_cxl *cxl;
> +	bool efx_cxl_pio_initialised;
>  	unsigned int mem_bar;
>  	u32 reg_base;
>  



* Re: [PATCH v4 02/26] sfc: add cxl support using new CXL API
  2024-10-25 14:03   ` Jonathan Cameron
@ 2024-10-28 11:59     ` Alejandro Lucero Palau
  2024-10-29 15:14       ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-28 11:59 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/25/24 15:03, Jonathan Cameron wrote:
> On Thu, 17 Oct 2024 17:52:01 +0100
> <alejandro.lucero-palau@amd.com> wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Add CXL initialization based on the new CXL API for accel drivers and make
>> it dependent on the kernel CXL configuration.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/net/ethernet/sfc/Kconfig      |  1 +
>>   drivers/net/ethernet/sfc/Makefile     |  2 +-
>>   drivers/net/ethernet/sfc/efx.c        | 16 +++++
>>   drivers/net/ethernet/sfc/efx_cxl.c    | 92 +++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++
>>   drivers/net/ethernet/sfc/net_driver.h |  6 ++
>>   6 files changed, 145 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>>
>> diff --git a/drivers/net/ethernet/sfc/Kconfig b/drivers/net/ethernet/sfc/Kconfig
>> index 3eb55dcfa8a6..b308a6f674b2 100644
>> --- a/drivers/net/ethernet/sfc/Kconfig
>> +++ b/drivers/net/ethernet/sfc/Kconfig
>> @@ -20,6 +20,7 @@ config SFC
>>   	tristate "Solarflare SFC9100/EF100-family support"
>>   	depends on PCI
>>   	depends on PTP_1588_CLOCK_OPTIONAL
>> +	depends on CXL_BUS && CXL_BUS=m && m
> Do some Makefile magic and add stubs so that efx_cxl.c is only
> built at all if the necessary conditions are met.
> It doesn't necessarily need a visible control.


Yes, I have already changed this after Ben Cheatham's comments.


> config SFC9100_CXL
> 	bool
>
> then here have
> select SFC9100_CXL if CXL_BUS
>
>
>>   	select MDIO
>>   	select CRC32
>>   	select NET_DEVLINK
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> new file mode 100644
>> index 000000000000..fb3eef339b34
>> --- /dev/null
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -0,0 +1,92 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/****************************************************************************
>> + *
>> + * Driver for AMD network controllers and boards
>> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation, incorporated herein by reference.
>> + */
>> +
>> +#include <linux/cxl/cxl.h>
>> +#include <linux/cxl/pci.h>
>> +#include <linux/pci.h>
>> +
>> +#include "net_driver.h"
>> +#include "efx_cxl.h"
>> +
>> +#define EFX_CTPIO_BUFFER_SIZE	SZ_256M
>> +
>> +int efx_cxl_init(struct efx_nic *efx)
>> +{
>> +#if IS_ENABLED(CONFIG_CXL_BUS)
> If it can't do anything useful, make the build depend on this
> and provide stubs for when it isn't enabled.
>
> I'd not expect to see any ifdef stuff for basic CXL in this file,
> as it should not be built unless they are all configured.


Right. I got rid of them after the changes commented above.


>> +	struct pci_dev *pci_dev = efx->pci_dev;
>> +	struct efx_cxl *cxl;
>> +	struct resource res;
>> +	u16 dvsec;
>> +	int rc;
>> +
>> +	efx->efx_cxl_pio_initialised = false;
>> +
>> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
>> +					  CXL_DVSEC_PCIE_DEVICE);
>> +	if (!dvsec)
>> +		return 0;
>> +
>> +	pci_dbg(pci_dev, "CXL_DVSEC_PCIE_DEVICE capability found\n");
>> +
>> +	cxl = kzalloc(sizeof(*cxl), GFP_KERNEL);
> __free magic here.
> Assuming later changes don't make that a bad idea - I've not
> read the whole set for a while.


Remember we are in netdev territory and those free magic things are not 
liked ...


>
>> +	if (!cxl)
>> +		return -ENOMEM;
>> +
>> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
> Stash this in a local cxl_dev_state for now and use __free() for that.
>
>> +	if (IS_ERR(cxl->cxlds)) {
>> +		pci_err(pci_dev, "CXL accel device state failed");
>> +		rc = -ENOMEM;
>> +		goto err1;
>> +	}
>> +
>> +	cxl_set_dvsec(cxl->cxlds, dvsec);
>> +	cxl_set_serial(cxl->cxlds, pci_dev->dev.id);
>> +
>> +	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
>> +	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_DPA)) {
>> +		pci_err(pci_dev, "cxl_set_resource DPA failed\n");
>> +		rc = -EINVAL;
>> +		goto err2;
>> +	}
>> +
>> +	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>> +	if (cxl_set_resource(cxl->cxlds, res, CXL_RES_RAM)) {
>> +		pci_err(pci_dev, "cxl_set_resource RAM failed\n");
>> +		rc = -EINVAL;
>> +		goto err2;
>> +	}
>> +
> 	cxl->cxlds = no_free_ptr(cxlds);
> 	efx->cxl = no_free_ptr(cxl);
>
>> +	efx->cxl = cxl;
>> +#endif
>> +
>> +	return 0;
>> +
>> +#if IS_ENABLED(CONFIG_CXL_BUS)
> With the __free changes suggested above, there is no need to do anything
> here and you can return directly from the error checks above.
>
>> +err2:
>> +	kfree(cxl->cxlds);
>> +err1:
>> +	kfree(cxl);
>> +	return rc;
>> +
>> +#endif
>> +}
>> +
>> +void efx_cxl_exit(struct efx_nic *efx)
>> +{
>> +#if IS_ENABLED(CONFIG_CXL_BUS)
> IS_REACHABLE() maybe, so if this driver is built in but CXL_BUS
> is not then this will go away.
>
>> +	if (efx->cxl) {
>> +		kfree(efx->cxl->cxlds);
>> +		kfree(efx->cxl);
>> +	}
>> +#endif
>> +}
>> +
>> +MODULE_IMPORT_NS(CXL);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
>> new file mode 100644
>> index 000000000000..f57fb2afd124
>> --- /dev/null
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
>> @@ -0,0 +1,29 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/****************************************************************************
>> + * Driver for AMD network controllers and boards
>> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation, incorporated herein by reference.
>> + */
>> +
>> +#ifndef EFX_CXL_H
>> +#define EFX_CXL_H
>> +
>> +struct efx_nic;
>> +struct cxl_dev_state;
> struct cxl_memdev;
> struct cxl_root_decoder;
> struct cxl_port;
> ...
>

Yes. I'll do it.


>> +
>> +struct efx_cxl {
>> +	struct cxl_dev_state *cxlds;
>> +	struct cxl_memdev *cxlmd;
>> +	struct cxl_root_decoder *cxlrd;
>> +	struct cxl_port *endpoint;
>> +	struct cxl_endpoint_decoder *cxled;
>> +	struct cxl_region *efx_region;
>> +	void __iomem *ctpio_cxl;
>> +};
>> +
>> +int efx_cxl_init(struct efx_nic *efx);
>> +void efx_cxl_exit(struct efx_nic *efx);
>> +#endif
>> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
>> index b85c51cbe7f9..77261de65e63 100644
>> --- a/drivers/net/ethernet/sfc/net_driver.h
>> +++ b/drivers/net/ethernet/sfc/net_driver.h
>> @@ -817,6 +817,8 @@ enum efx_xdp_tx_queues_mode {
>>   
>>   struct efx_mae;
>>   
>> +struct efx_cxl;
>> +
>>   /**
>>    * struct efx_nic - an Efx NIC
>>    * @name: Device name (net device name or bus id before net device registered)
>> @@ -963,6 +965,8 @@ struct efx_mae;
>>    * @tc: state for TC offload (EF100).
>>    * @devlink: reference to devlink structure owned by this device
>>    * @dl_port: devlink port associated with the PF
>> + * @cxl: details of related cxl objects
>> + * @efx_cxl_pio_initialised: clx initialization outcome.
> cxl


Well spotted. I'll fix it.


> Also, it's in a struct called efx_nic, so is the efx_ prefix
> useful?


I do not like to have the name as the struct ...

Anyways, thanks for the review.


>>    * @mem_bar: The BAR that is mapped into membase.
>>    * @reg_base: Offset from the start of the bar to the function control window.
>>    * @monitor_work: Hardware monitor workitem
>> @@ -1148,6 +1152,8 @@ struct efx_nic {
>>   
>>   	struct devlink *devlink;
>>   	struct devlink_port *dl_port;
>> +	struct efx_cxl *cxl;
>> +	bool efx_cxl_pio_initialised;
>>   	unsigned int mem_bar;
>>   	u32 reg_base;
>>   


* Re: [PATCH v4 02/26] sfc: add cxl support using new CXL API
  2024-10-28 11:59     ` Alejandro Lucero Palau
@ 2024-10-29 15:14       ` Jonathan Cameron
  2024-10-30 16:31         ` Alejandro Lucero Palau
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2024-10-29 15:14 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet

> >> +
> >> +	cxl = kzalloc(sizeof(*cxl), GFP_KERNEL);  
> > __free magic here.
> > Assuming later changes don't make that a bad idea - I've not
> > read the whole set for a while.  
> 
> 
> Remember we are in netdev territory and those free magic things are not 
> liked ...

I'll keep forgetting that. Feel free to ignore me when I do!



> >> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
> >> index b85c51cbe7f9..77261de65e63 100644
> >> --- a/drivers/net/ethernet/sfc/net_driver.h
> >> +++ b/drivers/net/ethernet/sfc/net_driver.h
> >> @@ -817,6 +817,8 @@ enum efx_xdp_tx_queues_mode {
> >>   
> >>   struct efx_mae;
> >>   
> >> +struct efx_cxl;
> >> +
> >>   /**
> >>    * struct efx_nic - an Efx NIC
> >>    * @name: Device name (net device name or bus id before net device registered)
> >> @@ -963,6 +965,8 @@ struct efx_mae;
> >>    * @tc: state for TC offload (EF100).
> >>    * @devlink: reference to devlink structure owned by this device
> >>    * @dl_port: devlink port associated with the PF
> >> + * @cxl: details of related cxl objects
> >> + * @efx_cxl_pio_initialised: clx initialization outcome.  
> > cxl  
> 
> 
> Well spotted. I'll fix it.
> 
> 
> > Also, it's in a struct called efx_nic, so is the efx_ prefix
> > useful?  
> 
> 
> I do not like to have the name as the struct ...
You've lost me.  efx_nic->cxl_pio_initialised was what I was suggesting,
and I'm not seeing how this comment applies.

J


* Re: [PATCH v4 02/26] sfc: add cxl support using new CXL API
  2024-10-29 15:14       ` Jonathan Cameron
@ 2024-10-30 16:31         ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-30 16:31 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet


On 10/29/24 15:14, Jonathan Cameron wrote:
>>>> +
>>>> +	cxl = kzalloc(sizeof(*cxl), GFP_KERNEL);
>>> __free magic here.
>>> Assuming later changes don't make that a bad idea - I've not
>>> read the whole set for a while.
>>
>> Remember we are in netdev territory and those free magic things are not
>> liked ...
> I'll keep forgetting that. Feel free to ignore me when I do!
>
>
>
>>>> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
>>>> index b85c51cbe7f9..77261de65e63 100644
>>>> --- a/drivers/net/ethernet/sfc/net_driver.h
>>>> +++ b/drivers/net/ethernet/sfc/net_driver.h
>>>> @@ -817,6 +817,8 @@ enum efx_xdp_tx_queues_mode {
>>>>    
>>>>    struct efx_mae;
>>>>    
>>>> +struct efx_cxl;
>>>> +
>>>>    /**
>>>>     * struct efx_nic - an Efx NIC
>>>>     * @name: Device name (net device name or bus id before net device registered)
>>>> @@ -963,6 +965,8 @@ struct efx_mae;
>>>>     * @tc: state for TC offload (EF100).
>>>>     * @devlink: reference to devlink structure owned by this device
>>>>     * @dl_port: devlink port associated with the PF
>>>> + * @cxl: details of related cxl objects
>>>> + * @efx_cxl_pio_initialised: clx initialization outcome.
>>> cxl
>>
>> Well spotted. I'll fix it.
>>
>>
>>> Also, it's in a struct called efx_nic, so is the efx_ prefix
>>> useful?
>>
>> I do not like to have the name as the struct ...
> You've lost me.  efx_nic->cxl_pio_initialised was what I was suggesting,
> and I'm not seeing how this comment applies.


I could try some excuses ... but it was my silly side or maybe not 
enough coffee.

You are of course right, and the prefix is not needed.

I'll fix it for v5.

Thanks!


> J


* [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 01/26] cxl: add type2 device basic support alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 02/26] sfc: add cxl support using new CXL API alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-25 14:14   ` Jonathan Cameron
  2024-10-28 18:19   ` Dave Jiang
  2024-10-17 16:52 ` [PATCH v4 04/26] cxl/pci: add check for validating capabilities alejandro.lucero-palau
                   ` (23 subsequent siblings)
  26 siblings, 2 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Type2 devices implement some Type3 functionality only optionally, such as
an mbox or an HDM decoder, and the CXL core needs a way to know what a CXL
accelerator implements.

Add a new field to cxl_dev_state for keeping the device capabilities as
discovered during initialization. Add the same field to cxl_port, as register
discovery is also used during port initialization.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/port.c | 11 +++++++----
 drivers/cxl/core/regs.c | 21 ++++++++++++++-------
 drivers/cxl/cxl.h       |  9 ++++++---
 drivers/cxl/cxlmem.h    |  2 ++
 drivers/cxl/pci.c       | 10 ++++++----
 include/linux/cxl/cxl.h | 31 +++++++++++++++++++++++++++++++
 6 files changed, 66 insertions(+), 18 deletions(-)

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 1d5007e3795a..7b859b79d59d 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -749,7 +749,7 @@ static struct cxl_port *cxl_port_alloc(struct device *uport_dev,
 }
 
 static int cxl_setup_comp_regs(struct device *host, struct cxl_register_map *map,
-			       resource_size_t component_reg_phys)
+			       resource_size_t component_reg_phys, unsigned long *caps)
 {
 	*map = (struct cxl_register_map) {
 		.host = host,
@@ -763,7 +763,7 @@ static int cxl_setup_comp_regs(struct device *host, struct cxl_register_map *map
 	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
 	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
 
-	return cxl_setup_regs(map);
+	return cxl_setup_regs(map, caps);
 }
 
 static int cxl_port_setup_regs(struct cxl_port *port,
@@ -772,7 +772,7 @@ static int cxl_port_setup_regs(struct cxl_port *port,
 	if (dev_is_platform(port->uport_dev))
 		return 0;
 	return cxl_setup_comp_regs(&port->dev, &port->reg_map,
-				   component_reg_phys);
+				   component_reg_phys, port->capabilities);
 }
 
 static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport,
@@ -789,7 +789,8 @@ static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport,
 	 * NULL.
 	 */
 	rc = cxl_setup_comp_regs(dport->dport_dev, &dport->reg_map,
-				 component_reg_phys);
+				 component_reg_phys,
+				 dport->port->capabilities);
 	dport->reg_map.host = host;
 	return rc;
 }
@@ -858,6 +859,8 @@ static struct cxl_port *__devm_cxl_add_port(struct device *host,
 		port->reg_map = cxlds->reg_map;
 		port->reg_map.host = &port->dev;
 		cxlmd->endpoint = port;
+		bitmap_copy(port->capabilities, cxlds->capabilities,
+			    CXL_MAX_CAPS);
 	} else if (parent_dport) {
 		rc = dev_set_name(dev, "port%d", port->id);
 		if (rc)
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index e1082e749c69..9d63a2adfd42 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2020 Intel Corporation. */
 #include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/cxl/cxl.h>
 #include <linux/device.h>
 #include <linux/slab.h>
 #include <linux/pci.h>
@@ -36,7 +37,8 @@
  * Probe for component register information and return it in map object.
  */
 void cxl_probe_component_regs(struct device *dev, void __iomem *base,
-			      struct cxl_component_reg_map *map)
+			      struct cxl_component_reg_map *map,
+			      unsigned long *caps)
 {
 	int cap, cap_count;
 	u32 cap_array;
@@ -84,6 +86,7 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base,
 			decoder_cnt = cxl_hdm_decoder_count(hdr);
 			length = 0x20 * decoder_cnt + 0x10;
 			rmap = &map->hdm_decoder;
+			*caps |= BIT(CXL_DEV_CAP_HDM);
 			break;
 		}
 		case CXL_CM_CAP_CAP_ID_RAS:
@@ -91,6 +94,7 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base,
 				offset);
 			length = CXL_RAS_CAPABILITY_LENGTH;
 			rmap = &map->ras;
+			*caps |= BIT(CXL_DEV_CAP_RAS);
 			break;
 		default:
 			dev_dbg(dev, "Unknown CM cap ID: %d (0x%x)\n", cap_id,
@@ -117,7 +121,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_probe_component_regs, CXL);
  * Probe for device register information and return it in map object.
  */
 void cxl_probe_device_regs(struct device *dev, void __iomem *base,
-			   struct cxl_device_reg_map *map)
+			   struct cxl_device_reg_map *map, unsigned long *caps)
 {
 	int cap, cap_count;
 	u64 cap_array;
@@ -146,10 +150,12 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base,
 		case CXLDEV_CAP_CAP_ID_DEVICE_STATUS:
 			dev_dbg(dev, "found Status capability (0x%x)\n", offset);
 			rmap = &map->status;
+			*caps |= BIT(CXL_DEV_CAP_DEV_STATUS);
 			break;
 		case CXLDEV_CAP_CAP_ID_PRIMARY_MAILBOX:
 			dev_dbg(dev, "found Mailbox capability (0x%x)\n", offset);
 			rmap = &map->mbox;
+			*caps |= BIT(CXL_DEV_CAP_MAILBOX_PRIMARY);
 			break;
 		case CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX:
 			dev_dbg(dev, "found Secondary Mailbox capability (0x%x)\n", offset);
@@ -157,6 +163,7 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base,
 		case CXLDEV_CAP_CAP_ID_MEMDEV:
 			dev_dbg(dev, "found Memory Device capability (0x%x)\n", offset);
 			rmap = &map->memdev;
+			*caps |= BIT(CXL_DEV_CAP_MEMDEV);
 			break;
 		default:
 			if (cap_id >= 0x8000)
@@ -421,7 +428,7 @@ static void cxl_unmap_regblock(struct cxl_register_map *map)
 	map->base = NULL;
 }
 
-static int cxl_probe_regs(struct cxl_register_map *map)
+static int cxl_probe_regs(struct cxl_register_map *map, unsigned long *caps)
 {
 	struct cxl_component_reg_map *comp_map;
 	struct cxl_device_reg_map *dev_map;
@@ -431,12 +438,12 @@ static int cxl_probe_regs(struct cxl_register_map *map)
 	switch (map->reg_type) {
 	case CXL_REGLOC_RBI_COMPONENT:
 		comp_map = &map->component_map;
-		cxl_probe_component_regs(host, base, comp_map);
+		cxl_probe_component_regs(host, base, comp_map, caps);
 		dev_dbg(host, "Set up component registers\n");
 		break;
 	case CXL_REGLOC_RBI_MEMDEV:
 		dev_map = &map->device_map;
-		cxl_probe_device_regs(host, base, dev_map);
+		cxl_probe_device_regs(host, base, dev_map, caps);
 		if (!dev_map->status.valid || !dev_map->mbox.valid ||
 		    !dev_map->memdev.valid) {
 			dev_err(host, "registers not found: %s%s%s\n",
@@ -455,7 +462,7 @@ static int cxl_probe_regs(struct cxl_register_map *map)
 	return 0;
 }
 
-int cxl_setup_regs(struct cxl_register_map *map)
+int cxl_setup_regs(struct cxl_register_map *map, unsigned long *caps)
 {
 	int rc;
 
@@ -463,7 +470,7 @@ int cxl_setup_regs(struct cxl_register_map *map)
 	if (rc)
 		return rc;
 
-	rc = cxl_probe_regs(map);
+	rc = cxl_probe_regs(map, caps);
 	cxl_unmap_regblock(map);
 
 	return rc;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 9afb407d438f..a7c242a19b62 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -8,6 +8,7 @@
 #include <linux/bitfield.h>
 #include <linux/notifier.h>
 #include <linux/bitops.h>
+#include <linux/cxl/cxl.h>
 #include <linux/log2.h>
 #include <linux/node.h>
 #include <linux/io.h>
@@ -284,9 +285,9 @@ struct cxl_register_map {
 };
 
 void cxl_probe_component_regs(struct device *dev, void __iomem *base,
-			      struct cxl_component_reg_map *map);
+			      struct cxl_component_reg_map *map, unsigned long *caps);
 void cxl_probe_device_regs(struct device *dev, void __iomem *base,
-			   struct cxl_device_reg_map *map);
+			   struct cxl_device_reg_map *map, unsigned long *caps);
 int cxl_map_component_regs(const struct cxl_register_map *map,
 			   struct cxl_component_regs *regs,
 			   unsigned long map_mask);
@@ -300,7 +301,7 @@ int cxl_find_regblock_instance(struct pci_dev *pdev, enum cxl_regloc_type type,
 			       struct cxl_register_map *map, int index);
 int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
 		      struct cxl_register_map *map);
-int cxl_setup_regs(struct cxl_register_map *map);
+int cxl_setup_regs(struct cxl_register_map *map, unsigned long *caps);
 struct cxl_dport;
 resource_size_t cxl_rcd_component_reg_phys(struct device *dev,
 					   struct cxl_dport *dport);
@@ -600,6 +601,7 @@ struct cxl_dax_region {
  * @cdat: Cached CDAT data
  * @cdat_available: Should a CDAT attribute be available in sysfs
  * @pci_latency: Upstream latency in picoseconds
+ * @capabilities: those capabilities as defined in device mapped registers
  */
 struct cxl_port {
 	struct device dev;
@@ -623,6 +625,7 @@ struct cxl_port {
 	} cdat;
 	bool cdat_available;
 	long pci_latency;
+	DECLARE_BITMAP(capabilities, CXL_MAX_CAPS);
 };
 
 /**
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index afb53d058d62..68d28eab3696 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -424,6 +424,7 @@ struct cxl_dpa_perf {
  * @ram_res: Active Volatile memory capacity configuration
  * @serial: PCIe Device Serial Number
  * @type: Generic Memory Class device or Vendor Specific Memory device
+ * @capabilities: those capabilities as defined in device mapped registers
  */
 struct cxl_dev_state {
 	struct device *dev;
@@ -438,6 +439,7 @@ struct cxl_dev_state {
 	struct resource ram_res;
 	u64 serial;
 	enum cxl_devtype type;
+	DECLARE_BITMAP(capabilities, CXL_MAX_CAPS);
 };
 
 /**
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 246930932ea6..6cd7ab117f80 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -503,7 +503,8 @@ static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
 }
 
 static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
-			      struct cxl_register_map *map)
+			      struct cxl_register_map *map,
+			      unsigned long *caps)
 {
 	int rc;
 
@@ -520,7 +521,7 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 	if (rc)
 		return rc;
 
-	return cxl_setup_regs(map);
+	return cxl_setup_regs(map, caps);
 }
 
 static int cxl_pci_ras_unmask(struct pci_dev *pdev)
@@ -827,7 +828,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	cxl_set_dvsec(cxlds, dvsec);
 
-	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
+	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
+				cxlds->capabilities);
 	if (rc)
 		return rc;
 
@@ -840,7 +842,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	 * still be useful for management functions so don't return an error.
 	 */
 	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
-				&cxlds->reg_map);
+				&cxlds->reg_map, cxlds->capabilities);
 	if (rc)
 		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
 	else if (!cxlds->reg_map.component_map.ras.valid)
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index c06ca750168f..4a4f75a86018 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -12,6 +12,37 @@ enum cxl_resource {
 	CXL_RES_PMEM,
 };
 
+/* Capabilities as defined for:
+ *
+ *	Component Registers (Table 8-22 CXL 3.0 specification)
+ *	Device Registers (8.2.8.2.1 CXL 3.0 specification)
+ */
+
+enum cxl_dev_cap {
+	/* capabilities from Component Registers */
+	CXL_DEV_CAP_RAS,
+	CXL_DEV_CAP_SEC,
+	CXL_DEV_CAP_LINK,
+	CXL_DEV_CAP_HDM,
+	CXL_DEV_CAP_SEC_EXT,
+	CXL_DEV_CAP_IDE,
+	CXL_DEV_CAP_SNOOP_FILTER,
+	CXL_DEV_CAP_TIMEOUT_AND_ISOLATION,
+	CXL_DEV_CAP_CACHEMEM_EXT,
+	CXL_DEV_CAP_BI_ROUTE_TABLE,
+	CXL_DEV_CAP_BI_DECODER,
+	CXL_DEV_CAP_CACHEID_ROUTE_TABLE,
+	CXL_DEV_CAP_CACHEID_DECODER,
+	CXL_DEV_CAP_HDM_EXT,
+	CXL_DEV_CAP_METADATA_EXT,
+	/* capabilities from Device Registers */
+	CXL_DEV_CAP_DEV_STATUS,
+	CXL_DEV_CAP_MAILBOX_PRIMARY,
+	CXL_DEV_CAP_MAILBOX_SECONDARY,
+	CXL_DEV_CAP_MEMDEV,
+	CXL_MAX_CAPS,
+};
+
 struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
 
 void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
-- 
2.17.1



* Re: [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port
  2024-10-17 16:52 ` [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port alejandro.lucero-palau
@ 2024-10-25 14:14   ` Jonathan Cameron
  2024-10-28 12:00     ` Alejandro Lucero Palau
  2024-10-28 18:19   ` Dave Jiang
  1 sibling, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2024-10-25 14:14 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet, Alejandro Lucero

On Thu, 17 Oct 2024 17:52:02 +0100
alejandro.lucero-palau@amd.com wrote:

> From: Alejandro Lucero <alucerop@amd.com>
> 
> Type2 devices have some Type3 functionalities as optional like an mbox
> or an hdm decoder, and CXL core needs a way to know what a CXL accelerator
> implements.
> 
> Add a new field to cxl_dev_state for keeping device capabilities as
> discovered during initialization. Add same field to cxl_port as registers
> discovery is also used during port initialization.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Just a trivial wrong spec reference.

> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> index c06ca750168f..4a4f75a86018 100644
> --- a/include/linux/cxl/cxl.h
> +++ b/include/linux/cxl/cxl.h
> @@ -12,6 +12,37 @@ enum cxl_resource {
>  	CXL_RES_PMEM,
>  };
>  
> +/* Capabilities as defined for:
> + *
> + *	Component Registers (Table 8-22 CXL 3.0 specification)
> + *	Device Registers (8.2.8.2.1 CXL 3.0 specification)
> + */
> +
> +enum cxl_dev_cap {
> +	/* capabilities from Component Registers */
> +	CXL_DEV_CAP_RAS,
> +	CXL_DEV_CAP_SEC,
> +	CXL_DEV_CAP_LINK,
> +	CXL_DEV_CAP_HDM,
> +	CXL_DEV_CAP_SEC_EXT,
> +	CXL_DEV_CAP_IDE,
> +	CXL_DEV_CAP_SNOOP_FILTER,
> +	CXL_DEV_CAP_TIMEOUT_AND_ISOLATION,
> +	CXL_DEV_CAP_CACHEMEM_EXT,
> +	CXL_DEV_CAP_BI_ROUTE_TABLE,
> +	CXL_DEV_CAP_BI_DECODER,
> +	CXL_DEV_CAP_CACHEID_ROUTE_TABLE,
> +	CXL_DEV_CAP_CACHEID_DECODER,
> +	CXL_DEV_CAP_HDM_EXT,
> +	CXL_DEV_CAP_METADATA_EXT,
This is the 3.1 version of the table as metadata cap wasn't
added until then.  I'd just update the reference.

> +	/* capabilities from Device Registers */
> +	CXL_DEV_CAP_DEV_STATUS,
> +	CXL_DEV_CAP_MAILBOX_PRIMARY,
> +	CXL_DEV_CAP_MAILBOX_SECONDARY,
> +	CXL_DEV_CAP_MEMDEV,
> +	CXL_MAX_CAPS,
I'd drop that trailing comma. Don't want anything to be accidentally added after this.
> +};
> +
>  struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
>  
>  void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
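
The remark about CXL_MAX_CAPS concerns the usual sentinel-enum idiom: the
last enumerator is not a capability but a count, which the patch uses to
size the capabilities bitmap through DECLARE_BITMAP(). Below is a minimal
userspace sketch of that idiom, not the kernel code itself. Every demo_*
and DEMO_* name is hypothetical and only stands in for enum cxl_dev_cap
and the kernel bitmap helpers.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for enum cxl_dev_cap. */
enum demo_cap {
	DEMO_CAP_A,
	DEMO_CAP_B,
	DEMO_CAP_C,
	DEMO_MAX_CAPS	/* terminator: deliberately no trailing comma */
};

/* Userspace approximations of the kernel's bitmap sizing macros. */
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n)	\
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* The sentinel sizes the storage, so a new cap grows the bitmap for free. */
struct demo_state {
	unsigned long capabilities[DEMO_BITS_TO_LONGS(DEMO_MAX_CAPS)];
};

/* Set one capability bit, indexing the right word of the bitmap. */
static void demo_set_cap(unsigned long *caps, unsigned int cap)
{
	caps[cap / DEMO_BITS_PER_LONG] |= 1UL << (cap % DEMO_BITS_PER_LONG);
}

/* Return 1 if the capability bit is set, 0 otherwise. */
static int demo_test_cap(const unsigned long *caps, unsigned int cap)
{
	return (int)((caps[cap / DEMO_BITS_PER_LONG] >>
		      (cap % DEMO_BITS_PER_LONG)) & 1UL);
}
```

Any enumerator added after the sentinel would get a bit index that the
bitmap was never sized for, which is why leaving off the trailing comma
is a cheap visual guard.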



* Re: [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port
  2024-10-25 14:14   ` Jonathan Cameron
@ 2024-10-28 12:00     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-28 12:00 UTC (permalink / raw)
  To: Jonathan Cameron, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/25/24 15:14, Jonathan Cameron wrote:
> On Thu, 17 Oct 2024 17:52:02 +0100
> alejandro.lucero-palau@amd.com wrote:
>
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Type2 devices have some Type3 functionalities as optional like an mbox
>> or an hdm decoder, and CXL core needs a way to know what a CXL accelerator
>> implements.
>>
>> Add a new field to cxl_dev_state for keeping device capabilities as
>> discovered during initialization. Add same field to cxl_port as registers
>> discovery is also used during port initialization.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Just a trivial wrong spec reference.
>
>> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
>> index c06ca750168f..4a4f75a86018 100644
>> --- a/include/linux/cxl/cxl.h
>> +++ b/include/linux/cxl/cxl.h
>> @@ -12,6 +12,37 @@ enum cxl_resource {
>>   	CXL_RES_PMEM,
>>   };
>>   
>> +/* Capabilities as defined for:
>> + *
>> + *	Component Registers (Table 8-22 CXL 3.0 specification)
>> + *	Device Registers (8.2.8.2.1 CXL 3.0 specification)
>> + */
>> +
>> +enum cxl_dev_cap {
>> +	/* capabilities from Component Registers */
>> +	CXL_DEV_CAP_RAS,
>> +	CXL_DEV_CAP_SEC,
>> +	CXL_DEV_CAP_LINK,
>> +	CXL_DEV_CAP_HDM,
>> +	CXL_DEV_CAP_SEC_EXT,
>> +	CXL_DEV_CAP_IDE,
>> +	CXL_DEV_CAP_SNOOP_FILTER,
>> +	CXL_DEV_CAP_TIMEOUT_AND_ISOLATION,
>> +	CXL_DEV_CAP_CACHEMEM_EXT,
>> +	CXL_DEV_CAP_BI_ROUTE_TABLE,
>> +	CXL_DEV_CAP_BI_DECODER,
>> +	CXL_DEV_CAP_CACHEID_ROUTE_TABLE,
>> +	CXL_DEV_CAP_CACHEID_DECODER,
>> +	CXL_DEV_CAP_HDM_EXT,
>> +	CXL_DEV_CAP_METADATA_EXT,
> This is the 3.1 version of the table as metadata cap wasn't
> added until then.  I'd just update the reference.
>

Right. I'll do it.


>> +	/* capabilities from Device Registers */
>> +	CXL_DEV_CAP_DEV_STATUS,
>> +	CXL_DEV_CAP_MAILBOX_PRIMARY,
>> +	CXL_DEV_CAP_MAILBOX_SECONDARY,
>> +	CXL_DEV_CAP_MEMDEV,
>> +	CXL_MAX_CAPS,
> I'd drop that trailing comma. Don't want anything to be accidentally added after this.


Sure.

Thanks!


>> +};
>> +
>>   struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
>>   
>>   void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);


* Re: [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port
  2024-10-17 16:52 ` [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port alejandro.lucero-palau
  2024-10-25 14:14   ` Jonathan Cameron
@ 2024-10-28 18:19   ` Dave Jiang
  2024-10-30 16:28     ` Alejandro Lucero Palau
  1 sibling, 1 reply; 64+ messages in thread
From: Dave Jiang @ 2024-10-28 18:19 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero



On 10/17/24 9:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Type2 devices have some Type3 functionalities as optional like an mbox
> or an hdm decoder, and CXL core needs a way to know what a CXL accelerator
> implements.
> 
> Add a new field to cxl_dev_state for keeping device capabilities as
> discovered during initialization. Add same field to cxl_port as registers
> discovery is also used during port initialization.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/port.c | 11 +++++++----
>  drivers/cxl/core/regs.c | 21 ++++++++++++++-------
>  drivers/cxl/cxl.h       |  9 ++++++---
>  drivers/cxl/cxlmem.h    |  2 ++
>  drivers/cxl/pci.c       | 10 ++++++----
>  include/linux/cxl/cxl.h | 31 +++++++++++++++++++++++++++++++
>  6 files changed, 66 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 1d5007e3795a..7b859b79d59d 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -749,7 +749,7 @@ static struct cxl_port *cxl_port_alloc(struct device *uport_dev,
>  }
>  
>  static int cxl_setup_comp_regs(struct device *host, struct cxl_register_map *map,
> -			       resource_size_t component_reg_phys)
> +			       resource_size_t component_reg_phys, unsigned long *caps)
>  {
>  	*map = (struct cxl_register_map) {
>  		.host = host,
> @@ -763,7 +763,7 @@ static int cxl_setup_comp_regs(struct device *host, struct cxl_register_map *map
>  	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
>  	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
>  
> -	return cxl_setup_regs(map);
> +	return cxl_setup_regs(map, caps);
>  }
>  
>  static int cxl_port_setup_regs(struct cxl_port *port,
> @@ -772,7 +772,7 @@ static int cxl_port_setup_regs(struct cxl_port *port,
>  	if (dev_is_platform(port->uport_dev))
>  		return 0;
>  	return cxl_setup_comp_regs(&port->dev, &port->reg_map,
> -				   component_reg_phys);
> +				   component_reg_phys, port->capabilities);
>  }
>  
>  static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport,
> @@ -789,7 +789,8 @@ static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport,
>  	 * NULL.
>  	 */
>  	rc = cxl_setup_comp_regs(dport->dport_dev, &dport->reg_map,
> -				 component_reg_phys);
> +				 component_reg_phys,
> +				 dport->port->capabilities);
>  	dport->reg_map.host = host;
>  	return rc;
>  }
> @@ -858,6 +859,8 @@ static struct cxl_port *__devm_cxl_add_port(struct device *host,
>  		port->reg_map = cxlds->reg_map;
>  		port->reg_map.host = &port->dev;
>  		cxlmd->endpoint = port;
> +		bitmap_copy(port->capabilities, cxlds->capabilities,
> +			    CXL_MAX_CAPS);
>  	} else if (parent_dport) {
>  		rc = dev_set_name(dev, "port%d", port->id);
>  		if (rc)
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index e1082e749c69..9d63a2adfd42 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright(c) 2020 Intel Corporation. */
>  #include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/cxl/cxl.h>
>  #include <linux/device.h>
>  #include <linux/slab.h>
>  #include <linux/pci.h>
> @@ -36,7 +37,8 @@
>   * Probe for component register information and return it in map object.
>   */
>  void cxl_probe_component_regs(struct device *dev, void __iomem *base,
> -			      struct cxl_component_reg_map *map)
> +			      struct cxl_component_reg_map *map,
> +			      unsigned long *caps)
>  {
>  	int cap, cap_count;
>  	u32 cap_array;
> @@ -84,6 +86,7 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base,
>  			decoder_cnt = cxl_hdm_decoder_count(hdr);
>  			length = 0x20 * decoder_cnt + 0x10;
>  			rmap = &map->hdm_decoder;
> +			*caps |= BIT(CXL_DEV_CAP_HDM);
>  			break;
>  		}
>  		case CXL_CM_CAP_CAP_ID_RAS:
> @@ -91,6 +94,7 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base,
>  				offset);
>  			length = CXL_RAS_CAPABILITY_LENGTH;
>  			rmap = &map->ras;
> +			*caps |= BIT(CXL_DEV_CAP_RAS);
>  			break;
>  		default:
>  			dev_dbg(dev, "Unknown CM cap ID: %d (0x%x)\n", cap_id,
> @@ -117,7 +121,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_probe_component_regs, CXL);
>   * Probe for device register information and return it in map object.
>   */
>  void cxl_probe_device_regs(struct device *dev, void __iomem *base,
> -			   struct cxl_device_reg_map *map)
> +			   struct cxl_device_reg_map *map, unsigned long *caps)
>  {
>  	int cap, cap_count;
>  	u64 cap_array;
> @@ -146,10 +150,12 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base,
>  		case CXLDEV_CAP_CAP_ID_DEVICE_STATUS:
>  			dev_dbg(dev, "found Status capability (0x%x)\n", offset);
>  			rmap = &map->status;
> +			*caps |= BIT(CXL_DEV_CAP_DEV_STATUS);
>  			break;
>  		case CXLDEV_CAP_CAP_ID_PRIMARY_MAILBOX:
>  			dev_dbg(dev, "found Mailbox capability (0x%x)\n", offset);
>  			rmap = &map->mbox;
> +			*caps |= BIT(CXL_DEV_CAP_MAILBOX_PRIMARY);
>  			break;
>  		case CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX:
>  			dev_dbg(dev, "found Secondary Mailbox capability (0x%x)\n", offset);
> @@ -157,6 +163,7 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base,
>  		case CXLDEV_CAP_CAP_ID_MEMDEV:
>  			dev_dbg(dev, "found Memory Device capability (0x%x)\n", offset);
>  			rmap = &map->memdev;
> +			*caps |= BIT(CXL_DEV_CAP_MEMDEV);
>  			break;
>  		default:
>  			if (cap_id >= 0x8000)
> @@ -421,7 +428,7 @@ static void cxl_unmap_regblock(struct cxl_register_map *map)
>  	map->base = NULL;
>  }
>  
> -static int cxl_probe_regs(struct cxl_register_map *map)
> +static int cxl_probe_regs(struct cxl_register_map *map, unsigned long *caps)
>  {
>  	struct cxl_component_reg_map *comp_map;
>  	struct cxl_device_reg_map *dev_map;
> @@ -431,12 +438,12 @@ static int cxl_probe_regs(struct cxl_register_map *map)
>  	switch (map->reg_type) {
>  	case CXL_REGLOC_RBI_COMPONENT:
>  		comp_map = &map->component_map;
> -		cxl_probe_component_regs(host, base, comp_map);
> +		cxl_probe_component_regs(host, base, comp_map, caps);
>  		dev_dbg(host, "Set up component registers\n");
>  		break;
>  	case CXL_REGLOC_RBI_MEMDEV:
>  		dev_map = &map->device_map;
> -		cxl_probe_device_regs(host, base, dev_map);
> +		cxl_probe_device_regs(host, base, dev_map, caps);
>  		if (!dev_map->status.valid || !dev_map->mbox.valid ||
>  		    !dev_map->memdev.valid) {
>  			dev_err(host, "registers not found: %s%s%s\n",
> @@ -455,7 +462,7 @@ static int cxl_probe_regs(struct cxl_register_map *map)
>  	return 0;
>  }
>  
> -int cxl_setup_regs(struct cxl_register_map *map)
> +int cxl_setup_regs(struct cxl_register_map *map, unsigned long *caps)
>  {
>  	int rc;
>  
> @@ -463,7 +470,7 @@ int cxl_setup_regs(struct cxl_register_map *map)
>  	if (rc)
>  		return rc;
>  
> -	rc = cxl_probe_regs(map);
> +	rc = cxl_probe_regs(map, caps);
>  	cxl_unmap_regblock(map);
>  
>  	return rc;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 9afb407d438f..a7c242a19b62 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -8,6 +8,7 @@
>  #include <linux/bitfield.h>
>  #include <linux/notifier.h>
>  #include <linux/bitops.h>
> +#include <linux/cxl/cxl.h>
>  #include <linux/log2.h>
>  #include <linux/node.h>
>  #include <linux/io.h>
> @@ -284,9 +285,9 @@ struct cxl_register_map {
>  };
>  
>  void cxl_probe_component_regs(struct device *dev, void __iomem *base,
> -			      struct cxl_component_reg_map *map);
> +			      struct cxl_component_reg_map *map, unsigned long *caps);
>  void cxl_probe_device_regs(struct device *dev, void __iomem *base,
> -			   struct cxl_device_reg_map *map);
> +			   struct cxl_device_reg_map *map, unsigned long *caps);
>  int cxl_map_component_regs(const struct cxl_register_map *map,
>  			   struct cxl_component_regs *regs,
>  			   unsigned long map_mask);
> @@ -300,7 +301,7 @@ int cxl_find_regblock_instance(struct pci_dev *pdev, enum cxl_regloc_type type,
>  			       struct cxl_register_map *map, int index);
>  int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>  		      struct cxl_register_map *map);
> -int cxl_setup_regs(struct cxl_register_map *map);
> +int cxl_setup_regs(struct cxl_register_map *map, unsigned long *caps);
>  struct cxl_dport;
>  resource_size_t cxl_rcd_component_reg_phys(struct device *dev,
>  					   struct cxl_dport *dport);
> @@ -600,6 +601,7 @@ struct cxl_dax_region {
>   * @cdat: Cached CDAT data
>   * @cdat_available: Should a CDAT attribute be available in sysfs
>   * @pci_latency: Upstream latency in picoseconds
> + * @capabilities: those capabilities as defined in device mapped registers
>   */
>  struct cxl_port {
>  	struct device dev;
> @@ -623,6 +625,7 @@ struct cxl_port {
>  	} cdat;
>  	bool cdat_available;
>  	long pci_latency;
> +	DECLARE_BITMAP(capabilities, CXL_MAX_CAPS);
>  };
>  
>  /**
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index afb53d058d62..68d28eab3696 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -424,6 +424,7 @@ struct cxl_dpa_perf {
>   * @ram_res: Active Volatile memory capacity configuration
>   * @serial: PCIe Device Serial Number
>   * @type: Generic Memory Class device or Vendor Specific Memory device
> + * @capabilities: those capabilities as defined in device mapped registers
>   */
>  struct cxl_dev_state {
>  	struct device *dev;
> @@ -438,6 +439,7 @@ struct cxl_dev_state {
>  	struct resource ram_res;
>  	u64 serial;
>  	enum cxl_devtype type;
> +	DECLARE_BITMAP(capabilities, CXL_MAX_CAPS);
>  };
>  
>  /**
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 246930932ea6..6cd7ab117f80 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -503,7 +503,8 @@ static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
>  }
>  
>  static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> -			      struct cxl_register_map *map)
> +			      struct cxl_register_map *map,
> +			      unsigned long *caps)
>  {
>  	int rc;
>  
> @@ -520,7 +521,7 @@ static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  	if (rc)
>  		return rc;
>  
> -	return cxl_setup_regs(map);
> +	return cxl_setup_regs(map, caps);
>  }
>  
>  static int cxl_pci_ras_unmask(struct pci_dev *pdev)
> @@ -827,7 +828,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  
>  	cxl_set_dvsec(cxlds, dvsec);
>  
> -	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
> +				cxlds->capabilities);
>  	if (rc)
>  		return rc;
>  
> @@ -840,7 +842,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	 * still be useful for management functions so don't return an error.
>  	 */
>  	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> -				&cxlds->reg_map);
> +				&cxlds->reg_map, cxlds->capabilities);
>  	if (rc)
>  		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
>  	else if (!cxlds->reg_map.component_map.ras.valid)
> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> index c06ca750168f..4a4f75a86018 100644
> --- a/include/linux/cxl/cxl.h
> +++ b/include/linux/cxl/cxl.h
> @@ -12,6 +12,37 @@ enum cxl_resource {
>  	CXL_RES_PMEM,
>  };
>  
> +/* Capabilities as defined for:
> + *
> + *	Component Registers (Table 8-22 CXL 3.0 specification)
> + *	Device Registers (8.2.8.2.1 CXL 3.0 specification)
> + */
> +
> +enum cxl_dev_cap {
> +	/* capabilities from Component Registers */
> +	CXL_DEV_CAP_RAS,
> +	CXL_DEV_CAP_SEC,
> +	CXL_DEV_CAP_LINK,
> +	CXL_DEV_CAP_HDM,
> +	CXL_DEV_CAP_SEC_EXT,
> +	CXL_DEV_CAP_IDE,
> +	CXL_DEV_CAP_SNOOP_FILTER,
> +	CXL_DEV_CAP_TIMEOUT_AND_ISOLATION,
> +	CXL_DEV_CAP_CACHEMEM_EXT,
> +	CXL_DEV_CAP_BI_ROUTE_TABLE,
> +	CXL_DEV_CAP_BI_DECODER,
> +	CXL_DEV_CAP_CACHEID_ROUTE_TABLE,
> +	CXL_DEV_CAP_CACHEID_DECODER,
> +	CXL_DEV_CAP_HDM_EXT,
> +	CXL_DEV_CAP_METADATA_EXT,
> +	/* capabilities from Device Registers */
> +	CXL_DEV_CAP_DEV_STATUS,
> +	CXL_DEV_CAP_MAILBOX_PRIMARY,
> +	CXL_DEV_CAP_MAILBOX_SECONDARY,

I think there was a previous comment about dropping this cap since OS would never access this cap?

DJ

> +	CXL_DEV_CAP_MEMDEV,
> +	CXL_MAX_CAPS,
> +};
> +
>  struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
>  
>  void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);



* Re: [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port
  2024-10-28 18:19   ` Dave Jiang
@ 2024-10-30 16:28     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-30 16:28 UTC (permalink / raw)
  To: Dave Jiang, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, pabeni,
	edumazet


On 10/28/24 18:19, Dave Jiang wrote:


<snip>


>
>
> +
> +enum cxl_dev_cap {
> +	/* capabilities from Component Registers */
> +	CXL_DEV_CAP_RAS,
> +	CXL_DEV_CAP_SEC,
> +	CXL_DEV_CAP_LINK,
> +	CXL_DEV_CAP_HDM,
> +	CXL_DEV_CAP_SEC_EXT,
> +	CXL_DEV_CAP_IDE,
> +	CXL_DEV_CAP_SNOOP_FILTER,
> +	CXL_DEV_CAP_TIMEOUT_AND_ISOLATION,
> +	CXL_DEV_CAP_CACHEMEM_EXT,
> +	CXL_DEV_CAP_BI_ROUTE_TABLE,
> +	CXL_DEV_CAP_BI_DECODER,
> +	CXL_DEV_CAP_CACHEID_ROUTE_TABLE,
> +	CXL_DEV_CAP_CACHEID_DECODER,
> +	CXL_DEV_CAP_HDM_EXT,
> +	CXL_DEV_CAP_METADATA_EXT,
> +	/* capabilities from Device Registers */
> +	CXL_DEV_CAP_DEV_STATUS,
> +	CXL_DEV_CAP_MAILBOX_PRIMARY,
> +	CXL_DEV_CAP_MAILBOX_SECONDARY,
> I think there was a previous comment about dropping this cap since OS would never access this cap?
>
> DJ


Oh, yes, Jonathan raised this and I forgot.

It'll be fixed in v5.

Thanks


>
>> +	CXL_DEV_CAP_MEMDEV,
>> +	CXL_MAX_CAPS,
>> +};
>> +
>>   struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
>>   
>>   void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);


* [PATCH v4 04/26] cxl/pci: add check for validating capabilities
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (2 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-25 10:16   ` Alejandro Lucero Palau
  2024-10-17 16:52 ` [PATCH v4 05/26] cxl: move pci generic code alejandro.lucero-palau
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

During CXL device initialization, the capabilities supported by the
device are discovered. Type3 and Type2 devices have different mandatory
capabilities, and a Type2 expects a specific set, including optional
capabilities.

Add a function for checking expected capabilities against those found
during initialization.

Rely on this function for validating capabilities instead of validating
them when the CXL regs are probed.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/pci.c  | 14 ++++++++++++++
 drivers/cxl/core/regs.c |  9 ---------
 drivers/cxl/pci.c       | 17 +++++++++++++++++
 include/linux/cxl/cxl.h |  3 +++
 4 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 3d6564dbda57..fa2a5e216dc3 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -8,6 +8,7 @@
 #include <linux/pci-doe.h>
 #include <linux/aer.h>
 #include <linux/cxl/pci.h>
+#include <linux/cxl/cxl.h>
 #include <cxlpci.h>
 #include <cxlmem.h>
 #include <cxl.h>
@@ -1077,3 +1078,16 @@ bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port)
 				     __cxl_endpoint_decoder_reset_detected);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL);
+
+bool cxl_pci_check_caps(struct cxl_dev_state *cxlds, unsigned long *expected_caps,
+			unsigned long *current_caps)
+{
+	if (current_caps)
+		bitmap_copy(current_caps, cxlds->capabilities, CXL_MAX_CAPS);
+
+	dev_dbg(cxlds->dev, "Checking cxlds caps 0x%08lx vs expected caps 0x%08lx\n",
+		*cxlds->capabilities, *expected_caps);
+
+	return bitmap_equal(cxlds->capabilities, expected_caps, CXL_MAX_CAPS);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_pci_check_caps, CXL);
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index 9d63a2adfd42..6fbc5c57149e 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -444,15 +444,6 @@ static int cxl_probe_regs(struct cxl_register_map *map, unsigned long *caps)
 	case CXL_REGLOC_RBI_MEMDEV:
 		dev_map = &map->device_map;
 		cxl_probe_device_regs(host, base, dev_map, caps);
-		if (!dev_map->status.valid || !dev_map->mbox.valid ||
-		    !dev_map->memdev.valid) {
-			dev_err(host, "registers not found: %s%s%s\n",
-				!dev_map->status.valid ? "status " : "",
-				!dev_map->mbox.valid ? "mbox " : "",
-				!dev_map->memdev.valid ? "memdev " : "");
-			return -ENXIO;
-		}
-
 		dev_dbg(host, "Probing device registers...\n");
 		break;
 	default:
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 6cd7ab117f80..89c8ac1a61fd 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -792,6 +792,8 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
+	DECLARE_BITMAP(expected, CXL_MAX_CAPS);
+	DECLARE_BITMAP(found, CXL_MAX_CAPS);
 	struct cxl_memdev_state *mds;
 	struct cxl_dev_state *cxlds;
 	struct cxl_register_map map;
@@ -853,6 +855,21 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
 
+	bitmap_clear(expected, 0, BITS_PER_TYPE(unsigned long));
+
+	/* These are the mandatory capabilities for a Type3 device */
+	bitmap_set(expected, CXL_DEV_CAP_HDM, 1);
+	bitmap_set(expected, CXL_DEV_CAP_DEV_STATUS, 1);
+	bitmap_set(expected, CXL_DEV_CAP_MAILBOX_PRIMARY, 1);
+	bitmap_set(expected, CXL_DEV_CAP_DEV_STATUS, 1);
+
+	if (!cxl_pci_check_caps(cxlds, expected, found)) {
+		dev_err(&pdev->dev,
+			"Expected capabilities not matching with found capabilities: (%08lx - %08lx)\n",
+			*expected, *found);
+		return -ENXIO;
+	}
+
 	rc = cxl_await_media_ready(cxlds);
 	if (rc == 0)
 		cxlds->media_ready = true;
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index 4a4f75a86018..78653fa4daa0 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -49,4 +49,7 @@ void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
 void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial);
 int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
 		     enum cxl_resource);
+bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
+			unsigned long *expected_caps,
+			unsigned long *current_caps);
 #endif
-- 
2.17.1



* Re: [PATCH v4 04/26] cxl/pci: add check for validating capabilities
  2024-10-17 16:52 ` [PATCH v4 04/26] cxl/pci: add check for validating capabilities alejandro.lucero-palau
@ 2024-10-25 10:16   ` Alejandro Lucero Palau
  2024-10-25 14:16     ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-25 10:16 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet


On 10/17/24 17:52, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
>
> During CXL device initialization supported capabilities by the device
> are discovered. Type3 and Type2 devices have different mandatory
> capabilities and a Type2 expects a specific set including optional
> capabilities.
>
> Add a function for checking expected capabilities against those found
> during initialization.
>
> Rely on this function for validating capabilities instead of when CXL
> regs are probed.
>
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>   drivers/cxl/core/pci.c  | 14 ++++++++++++++
>   drivers/cxl/core/regs.c |  9 ---------
>   drivers/cxl/pci.c       | 17 +++++++++++++++++
>   include/linux/cxl/cxl.h |  3 +++
>   4 files changed, 34 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 3d6564dbda57..fa2a5e216dc3 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -8,6 +8,7 @@
>   #include <linux/pci-doe.h>
>   #include <linux/aer.h>
>   #include <linux/cxl/pci.h>
> +#include <linux/cxl/cxl.h>
>   #include <cxlpci.h>
>   #include <cxlmem.h>
>   #include <cxl.h>
> @@ -1077,3 +1078,16 @@ bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port)
>   				     __cxl_endpoint_decoder_reset_detected);
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL);
> +
> +bool cxl_pci_check_caps(struct cxl_dev_state *cxlds, unsigned long *expected_caps,
> +			unsigned long *current_caps)
> +{
> +	if (current_caps)
> +		bitmap_copy(current_caps, cxlds->capabilities, CXL_MAX_CAPS);
> +
> +	dev_dbg(cxlds->dev, "Checking cxlds caps 0x%08lx vs expected caps 0x%08lx\n",
> +		*cxlds->capabilities, *expected_caps);
> +
> +	return bitmap_equal(cxlds->capabilities, expected_caps, CXL_MAX_CAPS);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_check_caps, CXL);
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index 9d63a2adfd42..6fbc5c57149e 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -444,15 +444,6 @@ static int cxl_probe_regs(struct cxl_register_map *map, unsigned long *caps)
>   	case CXL_REGLOC_RBI_MEMDEV:
>   		dev_map = &map->device_map;
>   		cxl_probe_device_regs(host, base, dev_map, caps);
> -		if (!dev_map->status.valid || !dev_map->mbox.valid ||
> -		    !dev_map->memdev.valid) {
> -			dev_err(host, "registers not found: %s%s%s\n",
> -				!dev_map->status.valid ? "status " : "",
> -				!dev_map->mbox.valid ? "mbox " : "",
> -				!dev_map->memdev.valid ? "memdev " : "");
> -			return -ENXIO;
> -		}
> -
>   		dev_dbg(host, "Probing device registers...\n");
>   		break;
>   	default:
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 6cd7ab117f80..89c8ac1a61fd 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -792,6 +792,8 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
>   static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   {
>   	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> +	DECLARE_BITMAP(expected, CXL_MAX_CAPS);
> +	DECLARE_BITMAP(found, CXL_MAX_CAPS);
>   	struct cxl_memdev_state *mds;
>   	struct cxl_dev_state *cxlds;
>   	struct cxl_register_map map;
> @@ -853,6 +855,21 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   	if (rc)
>   		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
>   
> +	bitmap_clear(expected, 0, BITS_PER_TYPE(unsigned long));
> +
> +	/* These are the mandatory capabilities for a Type3 device */
> +	bitmap_set(expected, CXL_DEV_CAP_HDM, 1);
> +	bitmap_set(expected, CXL_DEV_CAP_DEV_STATUS, 1);
> +	bitmap_set(expected, CXL_DEV_CAP_MAILBOX_PRIMARY, 1);
> +	bitmap_set(expected, CXL_DEV_CAP_DEV_STATUS, 1);
> +
> +	if (!cxl_pci_check_caps(cxlds, expected, found)) {
> +		dev_err(&pdev->dev,
> +			"Expected capabilities not matching with found capabilities: (%08lx - %08lx)\n",
> +			*expected, *found);
> +		return -ENXIO;
> +	}
> +


This is wrong, since a Type3 device could have more caps than the
mandatory ones. I will change the check so that it only requires the
mandatory ones to be present, and does not fail when additional ones
are found.

I guess a dev_dbg that always shows the found versus the expected caps
would not hurt, so I am adding that as well in v5.
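
To make the difference between the two checks concrete, here is a small
userspace sketch of an "at least the mandatory caps" test next to the
exact-match behaviour of bitmap_equal() in the patch. It is only an
illustration under stated assumptions: the capchk_* and CAPCHK_* names
are hypothetical, a single unsigned long stands in for the kernel
bitmap, and the mandatory set mirrors the status/mbox/memdev check this
patch removes from cxl_probe_regs(), plus HDM. In the kernel,
bitmap_subset() from <linux/bitmap.h> looks like the natural helper,
though this thread does not say what v5 will use.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a few enum cxl_dev_cap values. */
enum capchk_cap {
	CAPCHK_RAS,
	CAPCHK_HDM,
	CAPCHK_DEV_STATUS,
	CAPCHK_MAILBOX_PRIMARY,
	CAPCHK_MEMDEV,
	CAPCHK_MAX_CAPS
};

#define CAPCHK_BIT(n)	(1UL << (n))

/* Assumed mandatory set for a Type3 device in this sketch. */
#define CAPCHK_TYPE3_MANDATORY			\
	(CAPCHK_BIT(CAPCHK_HDM) |		\
	 CAPCHK_BIT(CAPCHK_DEV_STATUS) |	\
	 CAPCHK_BIT(CAPCHK_MAILBOX_PRIMARY) |	\
	 CAPCHK_BIT(CAPCHK_MEMDEV))

/* Exact match: fails as soon as the device has any extra capability. */
static bool capchk_equal(unsigned long found, unsigned long expected)
{
	return found == expected;
}

/* Subset match: passes when every expected bit is present in found. */
static bool capchk_contains(unsigned long found, unsigned long expected)
{
	return (found & expected) == expected;
}

/* Expected bits that were not discovered; useful for an error message. */
static unsigned long capchk_missing(unsigned long found, unsigned long expected)
{
	return expected & ~found;
}
```

With a device that reports the mandatory set plus RAS, capchk_equal()
rejects it while capchk_contains() accepts it, and capchk_missing()
gives the bits worth printing when the probe does fail.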


>   	rc = cxl_await_media_ready(cxlds);
>   	if (rc == 0)
>   		cxlds->media_ready = true;
> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> index 4a4f75a86018..78653fa4daa0 100644
> --- a/include/linux/cxl/cxl.h
> +++ b/include/linux/cxl/cxl.h
> @@ -49,4 +49,7 @@ void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
>   void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial);
>   int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>   		     enum cxl_resource);
> +bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
> +			unsigned long *expected_caps,
> +			unsigned long *current_caps);
>   #endif


* Re: [PATCH v4 04/26] cxl/pci: add check for validating capabilities
  2024-10-25 10:16   ` Alejandro Lucero Palau
@ 2024-10-25 14:16     ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2024-10-25 14:16 UTC (permalink / raw)
  To: Alejandro Lucero Palau
  Cc: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, pabeni, edumazet

On Fri, 25 Oct 2024 11:16:55 +0100
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 10/17/24 17:52, alejandro.lucero-palau@amd.com wrote:
> > From: Alejandro Lucero <alucerop@amd.com>
> >
> > During CXL device initialization supported capabilities by the device
> > are discovered. Type3 and Type2 devices have different mandatory
> > capabilities and a Type2 expects a specific set including optional
> > capabilities.
> >
> > Add a function for checking expected capabilities against those found
> > during initialization.
> >
> > Rely on this function for validating capabilities instead of when CXL
> > regs are probed.
> >
> > Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> > ---
> >   drivers/cxl/core/pci.c  | 14 ++++++++++++++
> >   drivers/cxl/core/regs.c |  9 ---------
> >   drivers/cxl/pci.c       | 17 +++++++++++++++++
> >   include/linux/cxl/cxl.h |  3 +++
> >   4 files changed, 34 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index 3d6564dbda57..fa2a5e216dc3 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -8,6 +8,7 @@
> >   #include <linux/pci-doe.h>
> >   #include <linux/aer.h>
> >   #include <linux/cxl/pci.h>
> > +#include <linux/cxl/cxl.h>
> >   #include <cxlpci.h>
> >   #include <cxlmem.h>
> >   #include <cxl.h>
> > @@ -1077,3 +1078,16 @@ bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port)
> >   				     __cxl_endpoint_decoder_reset_detected);
> >   }
> >   EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL);
> > +
> > +bool cxl_pci_check_caps(struct cxl_dev_state *cxlds, unsigned long *expected_caps,
> > +			unsigned long *current_caps)
> > +{
> > +	if (current_caps)
> > +		bitmap_copy(current_caps, cxlds->capabilities, CXL_MAX_CAPS);
> > +
> > +	dev_dbg(cxlds->dev, "Checking cxlds caps 0x%08lx vs expected caps 0x%08lx\n",
> > +		*cxlds->capabilities, *expected_caps);
> > +
> > +	return bitmap_equal(cxlds->capabilities, expected_caps, CXL_MAX_CAPS);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_pci_check_caps, CXL);
> > diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> > index 9d63a2adfd42..6fbc5c57149e 100644
> > --- a/drivers/cxl/core/regs.c
> > +++ b/drivers/cxl/core/regs.c
> > @@ -444,15 +444,6 @@ static int cxl_probe_regs(struct cxl_register_map *map, unsigned long *caps)
> >   	case CXL_REGLOC_RBI_MEMDEV:
> >   		dev_map = &map->device_map;
> >   		cxl_probe_device_regs(host, base, dev_map, caps);
> > -		if (!dev_map->status.valid || !dev_map->mbox.valid ||
> > -		    !dev_map->memdev.valid) {
> > -			dev_err(host, "registers not found: %s%s%s\n",
> > -				!dev_map->status.valid ? "status " : "",
> > -				!dev_map->mbox.valid ? "mbox " : "",
> > -				!dev_map->memdev.valid ? "memdev " : "");
> > -			return -ENXIO;
> > -		}
> > -
> >   		dev_dbg(host, "Probing device registers...\n");
> >   		break;
> >   	default:
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 6cd7ab117f80..89c8ac1a61fd 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -792,6 +792,8 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
> >   static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >   {
> >   	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> > +	DECLARE_BITMAP(expected, CXL_MAX_CAPS);
> > +	DECLARE_BITMAP(found, CXL_MAX_CAPS);
> >   	struct cxl_memdev_state *mds;
> >   	struct cxl_dev_state *cxlds;
> >   	struct cxl_register_map map;
> > @@ -853,6 +855,21 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >   	if (rc)
> >   		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
> >   
> > +	bitmap_clear(expected, 0, BITS_PER_TYPE(unsigned long));
> > +
> > +	/* These are the mandatory capabilities for a Type3 device */
> > +	bitmap_set(expected, CXL_DEV_CAP_HDM, 1);
> > +	bitmap_set(expected, CXL_DEV_CAP_DEV_STATUS, 1);
> > +	bitmap_set(expected, CXL_DEV_CAP_MAILBOX_PRIMARY, 1);
> > +	bitmap_set(expected, CXL_DEV_CAP_DEV_STATUS, 1);
> > +
> > +	if (!cxl_pci_check_caps(cxlds, expected, found)) {
> > +		dev_err(&pdev->dev,
> > +			"Expected capabilities not matching with found capabilities: (%08lx - %08lx)\n",
> > +			*expected, *found);
> > +		return -ENXIO;
> > +	}
> > +  
> 
> 
> This is wrong since a Type3 could have more caps than the mandatory
> ones. I will change the check to require at least the mandatory ones,
> so it does not fail when they are all there.
> 
> I guess a dev_dbg always showing the found versus the expected ones
> would not harm, so I'm adding that as well in v5.

I'd also make it clear that we only check for caps that are actually
used by Linux drivers. I don't fancy having to fix emulation up to support
something random that no software cares about.  Will get there
eventually but it's not a priority.

Jonathan

> 
> 
> >   	rc = cxl_await_media_ready(cxlds);
> >   	if (rc == 0)
> >   		cxlds->media_ready = true;
> > diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> > index 4a4f75a86018..78653fa4daa0 100644
> > --- a/include/linux/cxl/cxl.h
> > +++ b/include/linux/cxl/cxl.h
> > @@ -49,4 +49,7 @@ void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
> >   void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial);
> >   int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
> >   		     enum cxl_resource);
> > +bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
> > +			unsigned long *expected_caps,
> > +			unsigned long *current_caps);
> >   #endif  
> 



* [PATCH v4 05/26] cxl: move pci generic code
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (3 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 04/26] cxl/pci: add check for validating capabilities alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 21:49   ` Ben Cheatham
  2024-10-17 16:52 ` [PATCH v4 06/26] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Inside cxl/core/pci.c there are helpers for CXL PCIe initialization
meanwhile cxl/pci.c implements the functionality for a Type3 device
initialization.

Move helper functions from cxl/pci.c to cxl/pci/pci.c in order to be
exported and shared with CXL Type2 device initialization.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/pci.c | 62 ++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlpci.h   |  3 ++
 drivers/cxl/pci.c      | 61 -----------------------------------------
 3 files changed, 65 insertions(+), 61 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index fa2a5e216dc3..99acc258722d 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -1079,6 +1079,68 @@ bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL);
 
+/*
+ * Assume that any RCIEP that emits the CXL memory expander class code
+ * is an RCD
+ */
+bool is_cxl_restricted(struct pci_dev *pdev)
+{
+	return pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END;
+}
+EXPORT_SYMBOL_NS_GPL(is_cxl_restricted, CXL);
+
+static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
+				  struct cxl_register_map *map)
+{
+	struct cxl_port *port;
+	struct cxl_dport *dport;
+	resource_size_t component_reg_phys;
+
+	*map = (struct cxl_register_map) {
+		.host = &pdev->dev,
+		.resource = CXL_RESOURCE_NONE,
+	};
+
+	port = cxl_pci_find_port(pdev, &dport);
+	if (!port)
+		return -EPROBE_DEFER;
+
+	component_reg_phys = cxl_rcd_component_reg_phys(&pdev->dev, dport);
+
+	put_device(&port->dev);
+
+	if (component_reg_phys == CXL_RESOURCE_NONE)
+		return -ENXIO;
+
+	map->resource = component_reg_phys;
+	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
+	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
+
+	return 0;
+}
+
+int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
+		       struct cxl_register_map *map, unsigned long *caps)
+{
+	int rc;
+
+	rc = cxl_find_regblock(pdev, type, map);
+
+	/*
+	 * If the Register Locator DVSEC does not exist, check if it
+	 * is an RCH and try to extract the Component Registers from
+	 * an RCRB.
+	 */
+	if (rc && type == CXL_REGLOC_RBI_COMPONENT && is_cxl_restricted(pdev))
+		rc = cxl_rcrb_get_comp_regs(pdev, map);
+
+	if (rc)
+		return rc;
+
+	return cxl_setup_regs(map, caps);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_pci_setup_regs, CXL);
+
 bool cxl_pci_check_caps(struct cxl_dev_state *cxlds, unsigned long *expected_caps,
 			unsigned long *current_caps)
 {
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index eb59019fe5f3..985cca3c3350 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -113,4 +113,7 @@ void read_cdat_data(struct cxl_port *port);
 void cxl_cor_error_detected(struct pci_dev *pdev);
 pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
 				    pci_channel_state_t state);
+bool is_cxl_restricted(struct pci_dev *pdev);
+int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
+		       struct cxl_register_map *map, unsigned long *caps);
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 89c8ac1a61fd..e9333211e18f 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -463,67 +463,6 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds, bool irq_avail)
 	return 0;
 }
 
-/*
- * Assume that any RCIEP that emits the CXL memory expander class code
- * is an RCD
- */
-static bool is_cxl_restricted(struct pci_dev *pdev)
-{
-	return pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END;
-}
-
-static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
-				  struct cxl_register_map *map)
-{
-	struct cxl_port *port;
-	struct cxl_dport *dport;
-	resource_size_t component_reg_phys;
-
-	*map = (struct cxl_register_map) {
-		.host = &pdev->dev,
-		.resource = CXL_RESOURCE_NONE,
-	};
-
-	port = cxl_pci_find_port(pdev, &dport);
-	if (!port)
-		return -EPROBE_DEFER;
-
-	component_reg_phys = cxl_rcd_component_reg_phys(&pdev->dev, dport);
-
-	put_device(&port->dev);
-
-	if (component_reg_phys == CXL_RESOURCE_NONE)
-		return -ENXIO;
-
-	map->resource = component_reg_phys;
-	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
-	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
-
-	return 0;
-}
-
-static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
-			      struct cxl_register_map *map,
-			      unsigned long *caps)
-{
-	int rc;
-
-	rc = cxl_find_regblock(pdev, type, map);
-
-	/*
-	 * If the Register Locator DVSEC does not exist, check if it
-	 * is an RCH and try to extract the Component Registers from
-	 * an RCRB.
-	 */
-	if (rc && type == CXL_REGLOC_RBI_COMPONENT && is_cxl_restricted(pdev))
-		rc = cxl_rcrb_get_comp_regs(pdev, map);
-
-	if (rc)
-		return rc;
-
-	return cxl_setup_regs(map, caps);
-}
-
 static int cxl_pci_ras_unmask(struct pci_dev *pdev)
 {
 	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
-- 
2.17.1



* Re: [PATCH v4 05/26] cxl: move pci generic code
  2024-10-17 16:52 ` [PATCH v4 05/26] cxl: move pci generic code alejandro.lucero-palau
@ 2024-10-17 21:49   ` Ben Cheatham
  2024-10-18  9:35     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Cheatham @ 2024-10-17 21:49 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet

On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Inside cxl/core/pci.c there are helpers for CXL PCIe initialization
> meanwhile cxl/pci.c implements the functionality for a Type3 device
> initialization.
> 
> Move helper functions from cxl/pci.c to cxl/pci/pci.c in order to be

Wrong path, cxl/pci/pci.c should be cxl/core/pci.c.

> exported and shared with CXL Type2 device initialization.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/pci.c | 62 ++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlpci.h   |  3 ++
>  drivers/cxl/pci.c      | 61 -----------------------------------------
>  3 files changed, 65 insertions(+), 61 deletions(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index fa2a5e216dc3..99acc258722d 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -1079,6 +1079,68 @@ bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL);
>  
> +/*
> + * Assume that any RCIEP that emits the CXL memory expander class code
> + * is an RCD
> + */
> +bool is_cxl_restricted(struct pci_dev *pdev)
> +{
> +	return pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END;
> +}
> +EXPORT_SYMBOL_NS_GPL(is_cxl_restricted, CXL);
> +
> +static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
> +				  struct cxl_register_map *map)
> +{
> +	struct cxl_port *port;
> +	struct cxl_dport *dport;
> +	resource_size_t component_reg_phys;
> +
> +	*map = (struct cxl_register_map) {
> +		.host = &pdev->dev,
> +		.resource = CXL_RESOURCE_NONE,
> +	};
> +
> +	port = cxl_pci_find_port(pdev, &dport);
> +	if (!port)
> +		return -EPROBE_DEFER;
> +
> +	component_reg_phys = cxl_rcd_component_reg_phys(&pdev->dev, dport);
> +
> +	put_device(&port->dev);
> +
> +	if (component_reg_phys == CXL_RESOURCE_NONE)
> +		return -ENXIO;
> +
> +	map->resource = component_reg_phys;
> +	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
> +	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
> +
> +	return 0;
> +}
> +
> +int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> +		       struct cxl_register_map *map, unsigned long *caps)
> +{
> +	int rc;
> +
> +	rc = cxl_find_regblock(pdev, type, map);
> +
> +	/*
> +	 * If the Register Locator DVSEC does not exist, check if it
> +	 * is an RCH and try to extract the Component Registers from
> +	 * an RCRB.
> +	 */
> +	if (rc && type == CXL_REGLOC_RBI_COMPONENT && is_cxl_restricted(pdev))
> +		rc = cxl_rcrb_get_comp_regs(pdev, map);
> +
> +	if (rc)
> +		return rc;
> +
> +	return cxl_setup_regs(map, caps);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_setup_regs, CXL);
> +
>  bool cxl_pci_check_caps(struct cxl_dev_state *cxlds, unsigned long *expected_caps,
>  			unsigned long *current_caps)
>  {
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index eb59019fe5f3..985cca3c3350 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -113,4 +113,7 @@ void read_cdat_data(struct cxl_port *port);
>  void cxl_cor_error_detected(struct pci_dev *pdev);
>  pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
>  				    pci_channel_state_t state);
> +bool is_cxl_restricted(struct pci_dev *pdev);
> +int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> +		       struct cxl_register_map *map, unsigned long *caps);
>  #endif /* __CXL_PCI_H__ */
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 89c8ac1a61fd..e9333211e18f 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -463,67 +463,6 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds, bool irq_avail)
>  	return 0;
>  }
>  
> -/*
> - * Assume that any RCIEP that emits the CXL memory expander class code
> - * is an RCD
> - */
> -static bool is_cxl_restricted(struct pci_dev *pdev)
> -{
> -	return pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END;
> -}
> -
> -static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
> -				  struct cxl_register_map *map)
> -{
> -	struct cxl_port *port;
> -	struct cxl_dport *dport;
> -	resource_size_t component_reg_phys;
> -
> -	*map = (struct cxl_register_map) {
> -		.host = &pdev->dev,
> -		.resource = CXL_RESOURCE_NONE,
> -	};
> -
> -	port = cxl_pci_find_port(pdev, &dport);
> -	if (!port)
> -		return -EPROBE_DEFER;
> -
> -	component_reg_phys = cxl_rcd_component_reg_phys(&pdev->dev, dport);
> -
> -	put_device(&port->dev);
> -
> -	if (component_reg_phys == CXL_RESOURCE_NONE)
> -		return -ENXIO;
> -
> -	map->resource = component_reg_phys;
> -	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
> -	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
> -
> -	return 0;
> -}
> -
> -static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> -			      struct cxl_register_map *map,
> -			      unsigned long *caps)
> -{
> -	int rc;
> -
> -	rc = cxl_find_regblock(pdev, type, map);
> -
> -	/*
> -	 * If the Register Locator DVSEC does not exist, check if it
> -	 * is an RCH and try to extract the Component Registers from
> -	 * an RCRB.
> -	 */
> -	if (rc && type == CXL_REGLOC_RBI_COMPONENT && is_cxl_restricted(pdev))
> -		rc = cxl_rcrb_get_comp_regs(pdev, map);
> -
> -	if (rc)
> -		return rc;
> -
> -	return cxl_setup_regs(map, caps);
> -}
> -
>  static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>  {
>  	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);



* Re: [PATCH v4 05/26] cxl: move pci generic code
  2024-10-17 21:49   ` Ben Cheatham
@ 2024-10-18  9:35     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-18  9:35 UTC (permalink / raw)
  To: Ben Cheatham, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/17/24 22:49, Ben Cheatham wrote:
> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Inside cxl/core/pci.c there are helpers for CXL PCIe initialization
>> meanwhile cxl/pci.c implements the functionality for a Type3 device
>> initialization.
>>
>> Move helper functions from cxl/pci.c to cxl/pci/pci.c in order to be
> Wrong path, cxl/pci/pci.c should be cxl/core/pci.c.


Oh, good catch.

I'll fix it.

Thanks


>> exported and shared with CXL Type2 device initialization.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/pci.c | 62 ++++++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxlpci.h   |  3 ++
>>   drivers/cxl/pci.c      | 61 -----------------------------------------
>>   3 files changed, 65 insertions(+), 61 deletions(-)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index fa2a5e216dc3..99acc258722d 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -1079,6 +1079,68 @@ bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL);
>>   
>> +/*
>> + * Assume that any RCIEP that emits the CXL memory expander class code
>> + * is an RCD
>> + */
>> +bool is_cxl_restricted(struct pci_dev *pdev)
>> +{
>> +	return pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(is_cxl_restricted, CXL);
>> +
>> +static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
>> +				  struct cxl_register_map *map)
>> +{
>> +	struct cxl_port *port;
>> +	struct cxl_dport *dport;
>> +	resource_size_t component_reg_phys;
>> +
>> +	*map = (struct cxl_register_map) {
>> +		.host = &pdev->dev,
>> +		.resource = CXL_RESOURCE_NONE,
>> +	};
>> +
>> +	port = cxl_pci_find_port(pdev, &dport);
>> +	if (!port)
>> +		return -EPROBE_DEFER;
>> +
>> +	component_reg_phys = cxl_rcd_component_reg_phys(&pdev->dev, dport);
>> +
>> +	put_device(&port->dev);
>> +
>> +	if (component_reg_phys == CXL_RESOURCE_NONE)
>> +		return -ENXIO;
>> +
>> +	map->resource = component_reg_phys;
>> +	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
>> +	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
>> +
>> +	return 0;
>> +}
>> +
>> +int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>> +		       struct cxl_register_map *map, unsigned long *caps)
>> +{
>> +	int rc;
>> +
>> +	rc = cxl_find_regblock(pdev, type, map);
>> +
>> +	/*
>> +	 * If the Register Locator DVSEC does not exist, check if it
>> +	 * is an RCH and try to extract the Component Registers from
>> +	 * an RCRB.
>> +	 */
>> +	if (rc && type == CXL_REGLOC_RBI_COMPONENT && is_cxl_restricted(pdev))
>> +		rc = cxl_rcrb_get_comp_regs(pdev, map);
>> +
>> +	if (rc)
>> +		return rc;
>> +
>> +	return cxl_setup_regs(map, caps);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_pci_setup_regs, CXL);
>> +
>>   bool cxl_pci_check_caps(struct cxl_dev_state *cxlds, unsigned long *expected_caps,
>>   			unsigned long *current_caps)
>>   {
>> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
>> index eb59019fe5f3..985cca3c3350 100644
>> --- a/drivers/cxl/cxlpci.h
>> +++ b/drivers/cxl/cxlpci.h
>> @@ -113,4 +113,7 @@ void read_cdat_data(struct cxl_port *port);
>>   void cxl_cor_error_detected(struct pci_dev *pdev);
>>   pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
>>   				    pci_channel_state_t state);
>> +bool is_cxl_restricted(struct pci_dev *pdev);
>> +int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>> +		       struct cxl_register_map *map, unsigned long *caps);
>>   #endif /* __CXL_PCI_H__ */
>> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>> index 89c8ac1a61fd..e9333211e18f 100644
>> --- a/drivers/cxl/pci.c
>> +++ b/drivers/cxl/pci.c
>> @@ -463,67 +463,6 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds, bool irq_avail)
>>   	return 0;
>>   }
>>   
>> -/*
>> - * Assume that any RCIEP that emits the CXL memory expander class code
>> - * is an RCD
>> - */
>> -static bool is_cxl_restricted(struct pci_dev *pdev)
>> -{
>> -	return pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END;
>> -}
>> -
>> -static int cxl_rcrb_get_comp_regs(struct pci_dev *pdev,
>> -				  struct cxl_register_map *map)
>> -{
>> -	struct cxl_port *port;
>> -	struct cxl_dport *dport;
>> -	resource_size_t component_reg_phys;
>> -
>> -	*map = (struct cxl_register_map) {
>> -		.host = &pdev->dev,
>> -		.resource = CXL_RESOURCE_NONE,
>> -	};
>> -
>> -	port = cxl_pci_find_port(pdev, &dport);
>> -	if (!port)
>> -		return -EPROBE_DEFER;
>> -
>> -	component_reg_phys = cxl_rcd_component_reg_phys(&pdev->dev, dport);
>> -
>> -	put_device(&port->dev);
>> -
>> -	if (component_reg_phys == CXL_RESOURCE_NONE)
>> -		return -ENXIO;
>> -
>> -	map->resource = component_reg_phys;
>> -	map->reg_type = CXL_REGLOC_RBI_COMPONENT;
>> -	map->max_size = CXL_COMPONENT_REG_BLOCK_SIZE;
>> -
>> -	return 0;
>> -}
>> -
>> -static int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>> -			      struct cxl_register_map *map,
>> -			      unsigned long *caps)
>> -{
>> -	int rc;
>> -
>> -	rc = cxl_find_regblock(pdev, type, map);
>> -
>> -	/*
>> -	 * If the Register Locator DVSEC does not exist, check if it
>> -	 * is an RCH and try to extract the Component Registers from
>> -	 * an RCRB.
>> -	 */
>> -	if (rc && type == CXL_REGLOC_RBI_COMPONENT && is_cxl_restricted(pdev))
>> -		rc = cxl_rcrb_get_comp_regs(pdev, map);
>> -
>> -	if (rc)
>> -		return rc;
>> -
>> -	return cxl_setup_regs(map, caps);
>> -}
>> -
>>   static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>>   {
>>   	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);


* [PATCH v4 06/26] cxl: add function for type2 cxl regs setup
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (4 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 05/26] cxl: move pci generic code alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 21:49   ` Ben Cheatham
  2024-10-17 16:52 ` [PATCH v4 07/26] sfc: use cxl api for regs setup and checking alejandro.lucero-palau
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Create a new function for a type2 device initialising
cxl_dev_state struct regarding cxl regs setup and mapping.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/pci.c  | 47 +++++++++++++++++++++++++++++++++++++++++
 include/linux/cxl/cxl.h |  2 ++
 2 files changed, 49 insertions(+)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 99acc258722d..f0f7e8bd4499 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -1141,6 +1141,53 @@ int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_pci_setup_regs, CXL);
 
+static int cxl_pci_setup_memdev_regs(struct pci_dev *pdev,
+				     struct cxl_dev_state *cxlds)
+{
+	struct cxl_register_map map;
+	int rc;
+
+	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
+				cxlds->capabilities);
+	/*
+	 * This call returning a non-zero value is not considered an error since
+	 * these regs are not mandatory for Type2. If they do exist then mapping
+	 * them should not fail.
+	 */
+	if (rc)
+		return 0;
+
+	return cxl_map_device_regs(&map, &cxlds->regs.device_regs);
+}
+
+int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
+{
+	int rc;
+
+	rc = cxl_pci_setup_memdev_regs(pdev, cxlds);
+	if (rc)
+		return rc;
+
+	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
+				&cxlds->reg_map, cxlds->capabilities);
+	if (rc) {
+		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
+		return rc;
+	}
+
+	if (!test_bit(CXL_CM_CAP_CAP_ID_RAS, cxlds->capabilities))
+		return rc;
+
+	rc = cxl_map_component_regs(&cxlds->reg_map,
+				    &cxlds->regs.component,
+				    BIT(CXL_CM_CAP_CAP_ID_RAS));
+	if (rc)
+		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_pci_accel_setup_regs, CXL);
+
 bool cxl_pci_check_caps(struct cxl_dev_state *cxlds, unsigned long *expected_caps,
 			unsigned long *current_caps)
 {
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index 78653fa4daa0..2f48ee591259 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -5,6 +5,7 @@
 #define __CXL_H
 
 #include <linux/device.h>
+#include <linux/pci.h>
 
 enum cxl_resource {
 	CXL_RES_DPA,
@@ -52,4 +53,5 @@ int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
 bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
 			unsigned long *expected_caps,
 			unsigned long *current_caps);
+int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
 #endif
-- 
2.17.1



* Re: [PATCH v4 06/26] cxl: add function for type2 cxl regs setup
  2024-10-17 16:52 ` [PATCH v4 06/26] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
@ 2024-10-17 21:49   ` Ben Cheatham
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Cheatham @ 2024-10-17 21:49 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet

On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Create a new function for a type2 device initialising
> cxl_dev_state struct regarding cxl regs setup and mapping.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/pci.c  | 47 +++++++++++++++++++++++++++++++++++++++++
>  include/linux/cxl/cxl.h |  2 ++
>  2 files changed, 49 insertions(+)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 99acc258722d..f0f7e8bd4499 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -1141,6 +1141,53 @@ int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_pci_setup_regs, CXL);
>  
> +static int cxl_pci_setup_memdev_regs(struct pci_dev *pdev,
> +				     struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_register_map map;
> +	int rc;
> +
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map,
> +				cxlds->capabilities);
> +	/*
> +	 * This call returning a non-zero value is not considered an error since
> +	 * these regs are not mandatory for Type2. If they do exist then mapping
> +	 * them should not fail.
> +	 */
> +	if (rc)
> +		return 0;
> +
> +	return cxl_map_device_regs(&map, &cxlds->regs.device_regs);
> +}

I think you can use this function for type 3 device set up in cxl_pci_probe() as well with
a minor change. Instead of

	if (rc)
		return 0;

above, you could do
	
	if (rc) {
		if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
			return rc;
		return 0;
	}

and use it to replace the memdev cxl_pci_setup_regs() call in cxl_pci_probe().

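
The policy suggested above can be tried out in userspace with a small
sketch. Everything here is a hypothetical stand-in: the enum, the struct and
the probe_rc/map_rc parameters replace the real cxl_dev_state and the real
register discovery and mapping calls, so this only shows the error handling,
not the kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device types, standing in for the kernel's cxl device type. */
enum dev_type { DEV_TYPE2_ACCEL, DEV_TYPE3_CLASSMEM };

struct dev_state {
	enum dev_type type;
	bool device_regs_mapped;
};

/*
 * probe_rc stands in for the result of discovering the memdev
 * registers, map_rc for the result of mapping them.
 */
static int setup_memdev_regs(struct dev_state *ds, int probe_rc, int map_rc)
{
	assert(ds);

	if (probe_rc) {
		/* Mandatory for a Type3 memory device: propagate the error. */
		if (ds->type == DEV_TYPE3_CLASSMEM)
			return probe_rc;
		/* Optional for a Type2 accelerator: carry on without them. */
		return 0;
	}

	/* If the registers do exist, failing to map them is an error. */
	if (map_rc)
		return map_rc;

	ds->device_regs_mapped = true;
	return 0;
}
```

A Type2 device with no memdev registers returns 0 with nothing mapped, while
the same discovery failure on a Type3 device returns the error, which is the
behaviour cxl_pci_probe() currently relies on.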
> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds)
> +{
> +	int rc;
> +
> +	rc = cxl_pci_setup_memdev_regs(pdev, cxlds);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT,
> +				&cxlds->reg_map, cxlds->capabilities);
> +	if (rc) {
> +		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
> +		return rc;
> +	}
> +
> +	if (!test_bit(CXL_CM_CAP_CAP_ID_RAS, cxlds->capabilities))
> +		return rc;

If you make the modification above, I think this is just a drop-in replacement for
the component register set up code in cxl_pci_probe(). I may be wrong (it's EOD here
and my brain is a little tired), but it could be a nice cleanup if so.

> +
> +	rc = cxl_map_component_regs(&cxlds->reg_map,
> +				    &cxlds->regs.component,
> +				    BIT(CXL_CM_CAP_CAP_ID_RAS));
> +	if (rc)
> +		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_accel_setup_regs, CXL);
> +
>  bool cxl_pci_check_caps(struct cxl_dev_state *cxlds, unsigned long *expected_caps,
>  			unsigned long *current_caps)
>  {
> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> index 78653fa4daa0..2f48ee591259 100644
> --- a/include/linux/cxl/cxl.h
> +++ b/include/linux/cxl/cxl.h
> @@ -5,6 +5,7 @@
>  #define __CXL_H
>  
>  #include <linux/device.h>
> +#include <linux/pci.h>
>  
>  enum cxl_resource {
>  	CXL_RES_DPA,
> @@ -52,4 +53,5 @@ int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
>  			unsigned long *expected_caps,
>  			unsigned long *current_caps);
> +int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>  #endif



* [PATCH v4 07/26] sfc: use cxl api for regs setup and checking
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (5 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 06/26] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 21:49   ` Ben Cheatham
  2024-10-17 16:52 ` [PATCH v4 08/26] cxl: add functions for resource request/release by a driver alejandro.lucero-palau
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Use cxl code for registers discovery and mapping.

Validate capabilities found based on those registers against expected
capabilities.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/net/ethernet/sfc/efx_cxl.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index fb3eef339b34..749aa97683fd 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -22,6 +22,8 @@ int efx_cxl_init(struct efx_nic *efx)
 {
 #if IS_ENABLED(CONFIG_CXL_BUS)
 	struct pci_dev *pci_dev = efx->pci_dev;
+	DECLARE_BITMAP(expected, CXL_MAX_CAPS);
+	DECLARE_BITMAP(found, CXL_MAX_CAPS);
 	struct efx_cxl *cxl;
 	struct resource res;
 	u16 dvsec;
@@ -64,6 +66,23 @@ int efx_cxl_init(struct efx_nic *efx)
 		goto err2;
 	}
 
+	rc = cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds);
+	if (rc) {
+		pci_err(pci_dev, "CXL accel setup regs failed");
+		goto err2;
+	}
+
+	bitmap_clear(expected, 0, BITS_PER_TYPE(unsigned long));
+	bitmap_set(expected, CXL_DEV_CAP_HDM, 1);
+	bitmap_set(expected, CXL_DEV_CAP_RAS, 1);
+
+	if (!cxl_pci_check_caps(cxl->cxlds, expected, found)) {
+		pci_err(pci_dev,
+			"CXL device capabilities found(%08lx) not as expected(%08lx)",
+			*found, *expected);
+		goto err2;
+	}
+
 	efx->cxl = cxl;
 #endif
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 07/26] sfc: use cxl api for regs setup and checking
  2024-10-17 16:52 ` [PATCH v4 07/26] sfc: use cxl api for regs setup and checking alejandro.lucero-palau
@ 2024-10-17 21:49   ` Ben Cheatham
  2024-10-18 15:07     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Cheatham @ 2024-10-17 21:49 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet

On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Use cxl code for registers discovery and mapping.
> 
> Validate capabilities found based on those registers against expected
> capabilities.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/net/ethernet/sfc/efx_cxl.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index fb3eef339b34..749aa97683fd 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -22,6 +22,8 @@ int efx_cxl_init(struct efx_nic *efx)
>  {
>  #if IS_ENABLED(CONFIG_CXL_BUS)
>  	struct pci_dev *pci_dev = efx->pci_dev;
> +	DECLARE_BITMAP(expected, CXL_MAX_CAPS);
> +	DECLARE_BITMAP(found, CXL_MAX_CAPS);
>  	struct efx_cxl *cxl;
>  	struct resource res;
>  	u16 dvsec;
> @@ -64,6 +66,23 @@ int efx_cxl_init(struct efx_nic *efx)
>  		goto err2;
>  	}
>  
> +	rc = cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds);
> +	if (rc) {
> +		pci_err(pci_dev, "CXL accel setup regs failed");
> +		goto err2;
> +	}
> +
> +	bitmap_clear(expected, 0, BITS_PER_TYPE(unsigned long));

In some places you use BITS_PER_TYPE(unsigned long) for the size of the capabilities bitmap,
while in others you use CXL_MAX_CAPS. Right now it isn't an issue since CXL_MAX_CAPS is way
smaller than the size of an unsigned long, but I seem to remember Jonathan suggesting this
for future proofing. So, I would suggest setting CXL_MAX_CAPS = BITS_PER_TYPE(unsigned long)
and using CXL_MAX_CAPS everywhere (or just using CXL_MAX_CAPS as-is). Then, when/if there
are more capabilities we can just increase what CXL_MAX_CAPS is set to.

> +	bitmap_set(expected, CXL_DEV_CAP_HDM, 1);
> +	bitmap_set(expected, CXL_DEV_CAP_RAS, 1);
> +
> +	if (!cxl_pci_check_caps(cxl->cxlds, expected, found)) {
> +		pci_err(pci_dev,
> +			"CXL device capabilities found(%08lx) not as expected(%08lx)",
> +			*found, *expected);
> +		goto err2;
> +	}
> +
>  	efx->cxl = cxl;
>  #endif
>  


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 07/26] sfc: use cxl api for regs setup and checking
  2024-10-17 21:49   ` Ben Cheatham
@ 2024-10-18 15:07     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-18 15:07 UTC (permalink / raw)
  To: Ben Cheatham, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/17/24 22:49, Ben Cheatham wrote:
> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Use cxl code for registers discovery and mapping.
>>
>> Validate capabilities found based on those registers against expected
>> capabilities.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/net/ethernet/sfc/efx_cxl.c | 19 +++++++++++++++++++
>>   1 file changed, 19 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index fb3eef339b34..749aa97683fd 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -22,6 +22,8 @@ int efx_cxl_init(struct efx_nic *efx)
>>   {
>>   #if IS_ENABLED(CONFIG_CXL_BUS)
>>   	struct pci_dev *pci_dev = efx->pci_dev;
>> +	DECLARE_BITMAP(expected, CXL_MAX_CAPS);
>> +	DECLARE_BITMAP(found, CXL_MAX_CAPS);
>>   	struct efx_cxl *cxl;
>>   	struct resource res;
>>   	u16 dvsec;
>> @@ -64,6 +66,23 @@ int efx_cxl_init(struct efx_nic *efx)
>>   		goto err2;
>>   	}
>>   
>> +	rc = cxl_pci_accel_setup_regs(pci_dev, cxl->cxlds);
>> +	if (rc) {
>> +		pci_err(pci_dev, "CXL accel setup regs failed");
>> +		goto err2;
>> +	}
>> +
>> +	bitmap_clear(expected, 0, BITS_PER_TYPE(unsigned long));
> In some places you use BITS_PER_TYPE(unsigned long) for the size of the capabilities bitmap,
> while in others you use CXL_MAX_CAPS. Right now it isn't an issue since CXL_MAX_CAPS is way
> smaller than the size of an unsigned long, but I seem to remember Jonathan suggesting this
> for future proofing. So, I would suggest setting CXL_MAX_CAPS = BITS_PER_TYPE(unsigned long)
> and using CXL_MAX_CAPS everywhere (or just using CXL_MAX_CAPS as-is). Then, when/if there
> are more capabilities we can just increase what CXL_MAX_CAPS is set to.


BITS_PER_TYPE is used here because bitmap_clear with CXL_MAX_CAPS, as
it is defined now, would not clear the bits beyond the current value.
Defining CXL_MAX_CAPS as 32 in the enum would solve this problem. I
think that is cleaner than doing any masking based on CXL_MAX_CAPS, so
I will do that in v5.
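
To illustrate the stale-bit problem, here is a minimal userspace sketch.
It is a model only: sketch_bitmap_clear is a hypothetical one-word
stand-in for the kernel's bitmap_clear, and the all-ones initial value
is an assumed stand-in for uninitialised stack contents.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Clear 'nbits' bits starting at 'start' in a one-word bitmap; other bits stay as they were. */
static void sketch_bitmap_clear(unsigned long *map, unsigned int start,
				unsigned int nbits)
{
	assert(start + nbits <= SKETCH_BITS_PER_LONG);
	for (unsigned int i = start; i < start + nbits; i++)
		*map &= ~(1UL << i);
}

/* Simulate an uninitialised on-stack bitmap and clear only its first 'nbits' bits. */
static unsigned long sketch_clear_from_garbage(unsigned int nbits)
{
	unsigned long map = ~0UL;	/* assumed stand-in for stack garbage */

	sketch_bitmap_clear(&map, 0, nbits);
	return map;
}
```

With a small count such as 3, every bit from 3 upwards survives, so
printing the whole word with %08lx would show garbage. Clearing
SKETCH_BITS_PER_LONG bits (or a count that covers every bit later
printed or compared) leaves the word at zero.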


Thanks


>> +	bitmap_set(expected, CXL_DEV_CAP_HDM, 1);
>> +	bitmap_set(expected, CXL_DEV_CAP_RAS, 1);
>> +
>> +	if (!cxl_pci_check_caps(cxl->cxlds, expected, found)) {
>> +		pci_err(pci_dev,
>> +			"CXL device capabilities found(%08lx) not as expected(%08lx)",
>> +			*found, *expected);
>> +		goto err2;
>> +	}
>> +
>>   	efx->cxl = cxl;
>>   #endif
>>   

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH v4 08/26] cxl: add functions for resource request/release by a driver
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (6 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 07/26] sfc: use cxl api for regs setup and checking alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 21:49   ` Ben Cheatham
  2024-10-17 16:52 ` [PATCH v4 09/26] sfc: request cxl ram resource alejandro.lucero-palau
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Create accessors for an accel driver requesting and releasing a resource.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/memdev.c | 51 +++++++++++++++++++++++++++++++++++++++
 include/linux/cxl/cxl.h   |  2 ++
 2 files changed, 53 insertions(+)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 94b8a7b53c92..4b2641f20128 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -744,6 +744,57 @@ int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_set_resource, CXL);
 
+int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type)
+{
+	int rc;
+
+	switch (type) {
+	case CXL_RES_RAM:
+		if (!resource_size(&cxlds->ram_res)) {
+			dev_err(cxlds->dev,
+				"resource request for ram with size 0\n");
+			return -EINVAL;
+		}
+
+		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
+		break;
+	case CXL_RES_PMEM:
+		if (!resource_size(&cxlds->pmem_res)) {
+			dev_err(cxlds->dev,
+				"resource request for pmem with size 0\n");
+			return -EINVAL;
+		}
+		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
+		break;
+	default:
+		dev_err(cxlds->dev, "unsupported resource type (%u)\n", type);
+		return -EINVAL;
+	}
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_request_resource, CXL);
+
+int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type)
+{
+	int rc;
+
+	switch (type) {
+	case CXL_RES_RAM:
+		rc = release_resource(&cxlds->ram_res);
+		break;
+	case CXL_RES_PMEM:
+		rc = release_resource(&cxlds->pmem_res);
+		break;
+	default:
+		dev_err(cxlds->dev, "unknown resource type (%u)\n", type);
+		return -EINVAL;
+	}
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_release_resource, CXL);
+
 static int cxl_memdev_release_file(struct inode *inode, struct file *file)
 {
 	struct cxl_memdev *cxlmd =
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index 2f48ee591259..6c6d27721067 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -54,4 +54,6 @@ bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
 			unsigned long *expected_caps,
 			unsigned long *current_caps);
 int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
+int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
+int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 08/26] cxl: add functions for resource request/release by a driver
  2024-10-17 16:52 ` [PATCH v4 08/26] cxl: add functions for resource request/release by a driver alejandro.lucero-palau
@ 2024-10-17 21:49   ` Ben Cheatham
  2024-10-18 14:58     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Cheatham @ 2024-10-17 21:49 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet

On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Create accessors for an accel driver requesting and releasing a resource.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/memdev.c | 51 +++++++++++++++++++++++++++++++++++++++
>  include/linux/cxl/cxl.h   |  2 ++
>  2 files changed, 53 insertions(+)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 94b8a7b53c92..4b2641f20128 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -744,6 +744,57 @@ int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_set_resource, CXL);
>  
> +int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type)
> +{
> +	int rc;
> +
> +	switch (type) {
> +	case CXL_RES_RAM:
> +		if (!resource_size(&cxlds->ram_res)) {
> +			dev_err(cxlds->dev,
> +				"resource request for ram with size 0\n");
> +			return -EINVAL;
> +		}
> +
> +		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
> +		break;
> +	case CXL_RES_PMEM:
> +		if (!resource_size(&cxlds->pmem_res)) {
> +			dev_err(cxlds->dev,
> +				"resource request for pmem with size 0\n");
> +			return -EINVAL;
> +		}
> +		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
> +		break;
> +	default:
> +		dev_err(cxlds->dev, "unsupported resource type (%u)\n", type);
> +		return -EINVAL;
> +	}
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_request_resource, CXL);

It looks like add_dpa_res() in cxl/core/mbox.c already does what you are doing here, minus the enum.
Is there a way that could be reused, or a good reason not to? Even if you don't export the function
outside of the cxl tree, you could reuse that function for the internals of this one.

> +
> +int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type)
> +{
> +	int rc;
> +
> +	switch (type) {
> +	case CXL_RES_RAM:
> +		rc = release_resource(&cxlds->ram_res);
> +		break;
> +	case CXL_RES_PMEM:
> +		rc = release_resource(&cxlds->pmem_res);
> +		break;
> +	default:
> +		dev_err(cxlds->dev, "unknown resource type (%u)\n", type);
> +		return -EINVAL;
> +	}
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_release_resource, CXL);
> +

Same thing here, but with cxl_dpa_release() instead of add_dpa_res().

Looking at it some more, it looks like there is also some stuff to do with locking for CXL DPA resources in
that function that you would be skipping with your functions above. Will that be a problem later? I have no
clue, but thought I should ask just in case.

>  static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>  {
>  	struct cxl_memdev *cxlmd =
> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> index 2f48ee591259..6c6d27721067 100644
> --- a/include/linux/cxl/cxl.h
> +++ b/include/linux/cxl/cxl.h
> @@ -54,4 +54,6 @@ bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
>  			unsigned long *expected_caps,
>  			unsigned long *current_caps);
>  int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
> +int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
> +int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>  #endif


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 08/26] cxl: add functions for resource request/release by a driver
  2024-10-17 21:49   ` Ben Cheatham
@ 2024-10-18 14:58     ` Alejandro Lucero Palau
  2024-10-18 16:40       ` Ben Cheatham
  0 siblings, 1 reply; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-18 14:58 UTC (permalink / raw)
  To: Ben Cheatham, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/17/24 22:49, Ben Cheatham wrote:
> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Create accessors for an accel driver requesting and releasing a resource.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/memdev.c | 51 +++++++++++++++++++++++++++++++++++++++
>>   include/linux/cxl/cxl.h   |  2 ++
>>   2 files changed, 53 insertions(+)
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 94b8a7b53c92..4b2641f20128 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -744,6 +744,57 @@ int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_set_resource, CXL);
>>   
>> +int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type)
>> +{
>> +	int rc;
>> +
>> +	switch (type) {
>> +	case CXL_RES_RAM:
>> +		if (!resource_size(&cxlds->ram_res)) {
>> +			dev_err(cxlds->dev,
>> +				"resource request for ram with size 0\n");
>> +			return -EINVAL;
>> +		}
>> +
>> +		rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
>> +		break;
>> +	case CXL_RES_PMEM:
>> +		if (!resource_size(&cxlds->pmem_res)) {
>> +			dev_err(cxlds->dev,
>> +				"resource request for pmem with size 0\n");
>> +			return -EINVAL;
>> +		}
>> +		rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
>> +		break;
>> +	default:
>> +		dev_err(cxlds->dev, "unsupported resource type (%u)\n", type);
>> +		return -EINVAL;
>> +	}
>> +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_request_resource, CXL);
> It looks like add_dpa_res() in cxl/core/mbox.c already does what you are doing here, minus the enum.
> Is there a way that could be reused, or a good reason not too? Even if you don't export the function
> outside of the cxl tree, you could reuse that function for the internals of this one.


Although they are obviously similar, I think it makes sense to keep 
both. The CXL accel API exists so that accel drivers manipulate cxl 
structs only through API calls. Using add_dpa_res directly would break 
that, and calling it from the new cxl_request_resource would need 
changes, because add_dpa_res initializes the resource, which this 
implementation has already done. IMO, those changes would make the 
code uglier.


Moreover, I think your comment below about cxl_dpa_release is wrong, 
since that function also does other region-related work. BTW, I cannot 
find any release_resource calls in the current code other than those 
added by this patch.


So I'm not keen to change this now, but it may be good follow-up work.


>> +
>> +int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type)
>> +{
>> +	int rc;
>> +
>> +	switch (type) {
>> +	case CXL_RES_RAM:
>> +		rc = release_resource(&cxlds->ram_res);
>> +		break;
>> +	case CXL_RES_PMEM:
>> +		rc = release_resource(&cxlds->pmem_res);
>> +		break;
>> +	default:
>> +		dev_err(cxlds->dev, "unknown resource type (%u)\n", type);
>> +		return -EINVAL;
>> +	}
>> +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_release_resource, CXL);
>> +
> Same thing here, but with cxl_dpa_release() instead of add_dpa_res().
>
> Looking at it some more, it looks like there is also some stuff to do with locking for CXL DPA resources in
> that function that you would be skipping with your functions above. Will that be a problem later? I have no
> clue, but thought I should ask just in case.
>
>>   static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>>   {
>>   	struct cxl_memdev *cxlmd =
>> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
>> index 2f48ee591259..6c6d27721067 100644
>> --- a/include/linux/cxl/cxl.h
>> +++ b/include/linux/cxl/cxl.h
>> @@ -54,4 +54,6 @@ bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
>>   			unsigned long *expected_caps,
>>   			unsigned long *current_caps);
>>   int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>> +int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>> +int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>>   #endif

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 08/26] cxl: add functions for resource request/release by a driver
  2024-10-18 14:58     ` Alejandro Lucero Palau
@ 2024-10-18 16:40       ` Ben Cheatham
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Cheatham @ 2024-10-18 16:40 UTC (permalink / raw)
  To: Alejandro Lucero Palau, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet



On 10/18/24 9:58 AM, Alejandro Lucero Palau wrote:
> 
> On 10/17/24 22:49, Ben Cheatham wrote:
>> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>>> From: Alejandro Lucero <alucerop@amd.com>
>>>
>>> Create accessors for an accel driver requesting and releasing a resource.
>>>
>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>> ---
>>>   drivers/cxl/core/memdev.c | 51 +++++++++++++++++++++++++++++++++++++++
>>>   include/linux/cxl/cxl.h   |  2 ++
>>>   2 files changed, 53 insertions(+)
>>>
>>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>>> index 94b8a7b53c92..4b2641f20128 100644
>>> --- a/drivers/cxl/core/memdev.c
>>> +++ b/drivers/cxl/core/memdev.c
>>> @@ -744,6 +744,57 @@ int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>>   }
>>>   EXPORT_SYMBOL_NS_GPL(cxl_set_resource, CXL);
>>>   +int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type)
>>> +{
>>> +    int rc;
>>> +
>>> +    switch (type) {
>>> +    case CXL_RES_RAM:
>>> +        if (!resource_size(&cxlds->ram_res)) {
>>> +            dev_err(cxlds->dev,
>>> +                "resource request for ram with size 0\n");
>>> +            return -EINVAL;
>>> +        }
>>> +
>>> +        rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
>>> +        break;
>>> +    case CXL_RES_PMEM:
>>> +        if (!resource_size(&cxlds->pmem_res)) {
>>> +            dev_err(cxlds->dev,
>>> +                "resource request for pmem with size 0\n");
>>> +            return -EINVAL;
>>> +        }
>>> +        rc = request_resource(&cxlds->dpa_res, &cxlds->pmem_res);
>>> +        break;
>>> +    default:
>>> +        dev_err(cxlds->dev, "unsupported resource type (%u)\n", type);
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    return rc;
>>> +}
>>> +EXPORT_SYMBOL_NS_GPL(cxl_request_resource, CXL);
>> It looks like add_dpa_res() in cxl/core/mbox.c already does what you are doing here, minus the enum.
>> Is there a way that could be reused, or a good reason not too? Even if you don't export the function
>> outside of the cxl tree, you could reuse that function for the internals of this one.
> 
> 
> Although they are obviously similar, I think it makes sense to keep both. The CXL accel API is being implemented for avoiding accel drivers to manipulate cxl structs but through the API calls. With add_dpa_res we would break that, and calling it from the new cxl_request_resource would need changes as inside add_dpa_res the resource is initialized what has already been done in this implementation. IMO, those changes would make the code uglier.
> 

That sounds good to me. I just wanted to make sure there was a good reason to have this set of functions as well!

> 
> Moreover, your comment below about cxl_dpa_release is, I think, wrong, since inside that function other things are being done related to regions. BTW, I can not see other release_resource calls from the current code than those added by this patch.
> 
> 
> So, , I'm not keen to change this now, but maybe a good follow-up work.
> 

From what I've seen, cxl_dpa_release() is only used as part of device cleanup so that's probably why you aren't seeing much usage.

I agree with you regarding the extra stuff in cxl_dpa_release(), given how the patch is right now. I think if DAX region
support ends up being added, the extra region management done in cxl_dpa_release() may be required. My reasoning here is that
at that point we can expect more users than just the driver accessing the CXL resources, so a more managed remove will probably be
necessary.

If I'm wrong about this, then this patch is fine as-is.

> 
>>> +
>>> +int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type)
>>> +{
>>> +    int rc;
>>> +
>>> +    switch (type) {
>>> +    case CXL_RES_RAM:
>>> +        rc = release_resource(&cxlds->ram_res);
>>> +        break;
>>> +    case CXL_RES_PMEM:
>>> +        rc = release_resource(&cxlds->pmem_res);
>>> +        break;
>>> +    default:
>>> +        dev_err(cxlds->dev, "unknown resource type (%u)\n", type);
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    return rc;
>>> +}
>>> +EXPORT_SYMBOL_NS_GPL(cxl_release_resource, CXL);
>>> +
>> Same thing here, but with cxl_dpa_release() instead of add_dpa_res().
>>
>> Looking at it some more, it looks like there is also some stuff to do with locking for CXL DPA resources in
>> that function that you would be skipping with your functions above. Will that be a problem later? I have no
>> clue, but thought I should ask just in case.
>>
>>>   static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>>>   {
>>>       struct cxl_memdev *cxlmd =
>>> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
>>> index 2f48ee591259..6c6d27721067 100644
>>> --- a/include/linux/cxl/cxl.h
>>> +++ b/include/linux/cxl/cxl.h
>>> @@ -54,4 +54,6 @@ bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
>>>               unsigned long *expected_caps,
>>>               unsigned long *current_caps);
>>>   int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>>> +int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>>> +int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>>>   #endif

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH v4 09/26] sfc: request cxl ram resource
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (7 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 08/26] cxl: add functions for resource request/release by a driver alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 10/26] cxl: harden resource_contains checks to handle zero size resources alejandro.lucero-palau
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Use cxl accessor for obtaining the ram resource the device advertises.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/net/ethernet/sfc/efx_cxl.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 749aa97683fd..d47f8e91eaef 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -83,6 +83,12 @@ int efx_cxl_init(struct efx_nic *efx)
 		goto err2;
 	}
 
+	rc = cxl_request_resource(cxl->cxlds, CXL_RES_RAM);
+	if (rc) {
+		pci_err(pci_dev, "CXL request resource failed");
+		goto err2;
+	}
+
 	efx->cxl = cxl;
 #endif
 
@@ -102,6 +108,7 @@ void efx_cxl_exit(struct efx_nic *efx)
 {
 #if IS_ENABLED(CONFIG_CXL_BUS)
 	if (efx->cxl) {
+		cxl_release_resource(efx->cxl->cxlds, CXL_RES_RAM);
 		kfree(efx->cxl->cxlds);
 		kfree(efx->cxl);
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v4 10/26] cxl: harden resource_contains checks to handle zero size resources
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (8 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 09/26] sfc: request cxl ram resource alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 11/26] cxl: add function for setting media ready by a driver alejandro.lucero-palau
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

For a resource defined with size zero, resource_contains always
returns true.

Add a resource size check before calling it.
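
Why an empty resource can "contain" anything is easier to see with a
small userspace model. This is a sketch under stated assumptions: the
sketch_* types are hypothetical, the flag checks done by the real
resource_contains are omitted, and the empty resource is assumed to be
encoded as start = 0 with end = start + size - 1, which wraps to
all-ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct resource: inclusive [start, end], flags omitted. */
struct sketch_resource {
	uint64_t start;
	uint64_t end;
};

/* Mirror a DEFINE_RES-style initialiser: end = start + size - 1 (wraps when size is 0). */
static struct sketch_resource sketch_define_res(uint64_t start, uint64_t size)
{
	struct sketch_resource r;

	assert(size == 0 || start <= UINT64_MAX - (size - 1));
	r.start = start;
	r.end = start + size - 1;
	return r;
}

static uint64_t sketch_resource_size(const struct sketch_resource *r)
{
	return r->end - r->start + 1;
}

/* Bounds-only containment test: does r1 fully cover r2? */
static bool sketch_resource_contains(const struct sketch_resource *r1,
				     const struct sketch_resource *r2)
{
	return r1->start <= r2->start && r1->end >= r2->end;
}

/* Hardened form: an empty resource contains nothing. */
static bool sketch_contains_nonempty(const struct sketch_resource *r1,
				     const struct sketch_resource *r2)
{
	return sketch_resource_size(r1) != 0 &&
	       sketch_resource_contains(r1, r2);
}
```

In this model a zero-size resource at base 0 spans the whole address
range, so the plain bounds test succeeds for any candidate; checking
the size first restores the expected answer.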

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/hdm.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 3df10517a327..c729541bb7e1 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -327,10 +327,13 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 	cxled->dpa_res = res;
 	cxled->skip = skipped;
 
-	if (resource_contains(&cxlds->pmem_res, res))
+	if (resource_size(&cxlds->pmem_res) &&
+	    resource_contains(&cxlds->pmem_res, res)) {
 		cxled->mode = CXL_DECODER_PMEM;
-	else if (resource_contains(&cxlds->ram_res, res))
+	} else if (resource_size(&cxlds->ram_res) &&
+		   resource_contains(&cxlds->ram_res, res)) {
 		cxled->mode = CXL_DECODER_RAM;
+	}
 	else {
 		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
 			 port->id, cxled->cxld.id, cxled->dpa_res);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v4 11/26] cxl: add function for setting media ready by a driver
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (9 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 10/26] cxl: harden resource_contains checks to handle zero size resources alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 12/26] sfc: set cxl media ready alejandro.lucero-palau
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

A Type-2 driver may need to set the memory availability explicitly.

Add a function to the exported CXL API for accelerator drivers.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/memdev.c | 6 ++++++
 include/linux/cxl/cxl.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 4b2641f20128..56fddb0d6a85 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -795,6 +795,12 @@ int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_release_resource, CXL);
 
+void cxl_set_media_ready(struct cxl_dev_state *cxlds)
+{
+	cxlds->media_ready = true;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_set_media_ready, CXL);
+
 static int cxl_memdev_release_file(struct inode *inode, struct file *file)
 {
 	struct cxl_memdev *cxlmd =
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index 6c6d27721067..b8ee42b38f68 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -56,4 +56,5 @@ bool cxl_pci_check_caps(struct cxl_dev_state *cxlds,
 int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
 int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
 int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
+void cxl_set_media_ready(struct cxl_dev_state *cxlds);
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v4 12/26] sfc: set cxl media ready
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (10 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 11/26] cxl: add function for setting media ready by a driver alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 13/26] cxl: prepare memdev creation for type2 alejandro.lucero-palau
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Use the cxl api accessor to explicitly set media ready, since the
hardware design implies it is ready and there is no device register
stating so.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/net/ethernet/sfc/efx_cxl.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index d47f8e91eaef..419cf9fb6bd0 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -89,6 +89,11 @@ int efx_cxl_init(struct efx_nic *efx)
 		goto err2;
 	}
 
+	/* We do not have the register about media status. Hardware design
+	 * implies it is ready.
+	 */
+	cxl_set_media_ready(cxl->cxlds);
+
 	efx->cxl = cxl;
 #endif
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v4 13/26] cxl: prepare memdev creation for type2
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (11 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 12/26] sfc: set cxl media ready alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 21:49   ` Ben Cheatham
  2024-10-17 16:52 ` [PATCH v4 14/26] sfc: create type2 cxl memdev alejandro.lucero-palau
                   ` (13 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

The current cxl core relies on a CXL_DEVTYPE_CLASSMEM type device when
creating a memdev, leading to problems when obtaining cxl_memdev_state
references from a CXL_DEVTYPE_DEVMEM type. This last device type is
managed by a specific vendor driver and does not need the same sysfs files
since no userspace intervention is expected.

Create a new cxl_mem device type with no attributes for Type2.

Avoid debugfs files relying on existence of cxl_memdev_state.

Make devm_cxl_add_memdev accessible from an accel driver.
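
The device-type split described above can be sketched in userspace C. This is only an illustration under stated assumptions: the structs, the `memdev_groups` contents and the helper names `memdev_set_type()` and `is_memdev()` are hypothetical stand-ins, not the kernel definitions touched by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types. */
enum cxl_devtype { CXL_DEVTYPE_DEVMEM, CXL_DEVTYPE_CLASSMEM };

struct device_type {
	const char *name;
	const char *const *groups;	/* NULL: no extra sysfs attributes */
};

struct device {
	const struct device_type *type;
};

/* Placeholder attribute group names, for illustration only. */
static const char *const memdev_groups[] = { "ram", "pmem", NULL };

/* Type3 (class memory) devices keep their attribute groups. */
static const struct device_type memdev_type = {
	.name = "cxl_memdev",
	.groups = memdev_groups,
};

/* Type2 (accelerator) devices share the name but carry no groups. */
static const struct device_type accel_memdev_type = {
	.name = "cxl_memdev",
};

/* Pick the device type from the CXL device type, as the patch does. */
static void memdev_set_type(struct device *dev, enum cxl_devtype type)
{
	assert(dev);
	dev->type = (type == CXL_DEVTYPE_DEVMEM) ? &accel_memdev_type
						 : &memdev_type;
}

/* Both device types must be recognised as memdevs. */
static bool is_memdev(const struct device *dev)
{
	return dev->type == &memdev_type || dev->type == &accel_memdev_type;
}
```

Comparing against both type objects keeps existing callers of the memdev check working for accelerators without each caller having to know about the new type.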

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/memdev.c | 15 +++++++++++++--
 drivers/cxl/core/region.c |  3 ++-
 drivers/cxl/mem.c         | 25 +++++++++++++++++++------
 include/linux/cxl/cxl.h   |  2 ++
 4 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 56fddb0d6a85..f168cd42f8a5 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -546,9 +546,17 @@ static const struct device_type cxl_memdev_type = {
 	.groups = cxl_memdev_attribute_groups,
 };
 
+static const struct device_type cxl_accel_memdev_type = {
+	.name = "cxl_memdev",
+	.release = cxl_memdev_release,
+	.devnode = cxl_memdev_devnode,
+};
+
 bool is_cxl_memdev(const struct device *dev)
 {
-	return dev->type == &cxl_memdev_type;
+	return (dev->type == &cxl_memdev_type ||
+		dev->type == &cxl_accel_memdev_type);
+
 }
 EXPORT_SYMBOL_NS_GPL(is_cxl_memdev, CXL);
 
@@ -659,7 +667,10 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
 	dev->parent = cxlds->dev;
 	dev->bus = &cxl_bus_type;
 	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
-	dev->type = &cxl_memdev_type;
+	if (cxlds->type == CXL_DEVTYPE_DEVMEM)
+		dev->type = &cxl_accel_memdev_type;
+	else
+		dev->type = &cxl_memdev_type;
 	device_set_pm_not_required(dev);
 	INIT_WORK(&cxlmd->detach_work, detach_memdev);
 
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 21ad5f242875..7e7761ff9fc4 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1941,7 +1941,8 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 		return -EINVAL;
 	}
 
-	cxl_region_perf_data_calculate(cxlr, cxled);
+	if (cxlr->type == CXL_DECODER_HOSTONLYMEM)
+		cxl_region_perf_data_calculate(cxlr, cxled);
 
 	if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
 		int i;
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 7de232eaeb17..3a250ddeef35 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -131,12 +131,18 @@ static int cxl_mem_probe(struct device *dev)
 	dentry = cxl_debugfs_create_dir(dev_name(dev));
 	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
 
-	if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
-		debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
-				    &cxl_poison_inject_fops);
-	if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
-		debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
-				    &cxl_poison_clear_fops);
+	/*
+	 * Avoid poison debugfs files for Type2 devices as they rely on
+	 * cxl_memdev_state.
+	 */
+	if (mds) {
+		if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
+			debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
+					    &cxl_poison_inject_fops);
+		if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
+			debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
+					    &cxl_poison_clear_fops);
+	}
 
 	rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
 	if (rc)
@@ -222,6 +228,13 @@ static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 
+	/*
+	 * Avoid poison sysfs files for Type2 devices as they rely on
+	 * cxl_memdev_state.
+	 */
+	if (!mds)
+		return 0;
+
 	if (a == &dev_attr_trigger_poison_list.attr)
 		if (!test_bit(CXL_POISON_ENABLED_LIST,
 			      mds->poison.enabled_cmds))
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index b8ee42b38f68..bbbcf6574246 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -57,4 +57,6 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
 int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
 int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
 void cxl_set_media_ready(struct cxl_dev_state *cxlds);
+struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
+				       struct cxl_dev_state *cxlds);
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 13/26] cxl: prepare memdev creation for type2
  2024-10-17 16:52 ` [PATCH v4 13/26] cxl: prepare memdev creation for type2 alejandro.lucero-palau
@ 2024-10-17 21:49   ` Ben Cheatham
  2024-10-18 10:49     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Cheatham @ 2024-10-17 21:49 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet

On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> The current cxl core relies on a CXL_DEVTYPE_CLASSMEM type device when
> creating a memdev, leading to problems when obtaining cxl_memdev_state
> references from a CXL_DEVTYPE_DEVMEM type. This last device type is
> managed by a specific vendor driver and does not need the same sysfs files
> since no userspace intervention is expected.
> 
> Create a new cxl_mem device type with no attributes for Type2.
> 

I agree with the sentiment that type 2 devices shouldn't have the same sysfs files,
but I think they should have *some* sysfs files. I would like to be able to see
these devices show up in something like "cxl list", which this patch would prevent.
I really think that it would be fine to only have the bare minimum though, such as
ram resource size/location, NUMA node, serial, etc.

> Avoid debugfs files relying on existence of cxl_memdev_state.
> 
> Make devm_cxl_add_memdev accessible from an accel driver.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/memdev.c | 15 +++++++++++++--
>  drivers/cxl/core/region.c |  3 ++-
>  drivers/cxl/mem.c         | 25 +++++++++++++++++++------
>  include/linux/cxl/cxl.h   |  2 ++
>  4 files changed, 36 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 56fddb0d6a85..f168cd42f8a5 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -546,9 +546,17 @@ static const struct device_type cxl_memdev_type = {
>  	.groups = cxl_memdev_attribute_groups,
>  };
>  
> +static const struct device_type cxl_accel_memdev_type = {
> +	.name = "cxl_memdev",
> +	.release = cxl_memdev_release,
> +	.devnode = cxl_memdev_devnode,
> +};
> +
>  bool is_cxl_memdev(const struct device *dev)
>  {
> -	return dev->type == &cxl_memdev_type;
> +	return (dev->type == &cxl_memdev_type ||
> +		dev->type == &cxl_accel_memdev_type);
> +
>  }
>  EXPORT_SYMBOL_NS_GPL(is_cxl_memdev, CXL);
>  
> @@ -659,7 +667,10 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>  	dev->parent = cxlds->dev;
>  	dev->bus = &cxl_bus_type;
>  	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
> -	dev->type = &cxl_memdev_type;
> +	if (cxlds->type == CXL_DEVTYPE_DEVMEM)
> +		dev->type = &cxl_accel_memdev_type;
> +	else
> +		dev->type = &cxl_memdev_type;
>  	device_set_pm_not_required(dev);
>  	INIT_WORK(&cxlmd->detach_work, detach_memdev);
>  
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 21ad5f242875..7e7761ff9fc4 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1941,7 +1941,8 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  		return -EINVAL;
>  	}
>  
> -	cxl_region_perf_data_calculate(cxlr, cxled);
> +	if (cxlr->type == CXL_DECODER_HOSTONLYMEM)
> +		cxl_region_perf_data_calculate(cxlr, cxled);
>  
>  	if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
>  		int i;
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 7de232eaeb17..3a250ddeef35 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -131,12 +131,18 @@ static int cxl_mem_probe(struct device *dev)
>  	dentry = cxl_debugfs_create_dir(dev_name(dev));
>  	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
>  
> -	if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
> -		debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
> -				    &cxl_poison_inject_fops);
> -	if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
> -		debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
> -				    &cxl_poison_clear_fops);
> +	/*
> +	 * Avoid poison debugfs files for Type2 devices as they rely on
> +	 * cxl_memdev_state.
> +	 */
> +	if (mds) {
> +		if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
> +			debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
> +					    &cxl_poison_inject_fops);
> +		if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
> +			debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
> +					    &cxl_poison_clear_fops);
> +	}
>  
>  	rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
>  	if (rc)
> @@ -222,6 +228,13 @@ static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>  	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>  
> +	/*
> +	 * Avoid poison sysfs files for Type2 devices as they rely on
> +	 * cxl_memdev_state.
> +	 */
> +	if (!mds)
> +		return 0;
> +
>  	if (a == &dev_attr_trigger_poison_list.attr)
>  		if (!test_bit(CXL_POISON_ENABLED_LIST,
>  			      mds->poison.enabled_cmds))
> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> index b8ee42b38f68..bbbcf6574246 100644
> --- a/include/linux/cxl/cxl.h
> +++ b/include/linux/cxl/cxl.h
> @@ -57,4 +57,6 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>  int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>  int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>  void cxl_set_media_ready(struct cxl_dev_state *cxlds);
> +struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
> +				       struct cxl_dev_state *cxlds);
>  #endif


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 13/26] cxl: prepare memdev creation for type2
  2024-10-17 21:49   ` Ben Cheatham
@ 2024-10-18 10:49     ` Alejandro Lucero Palau
  2024-10-18 16:40       ` Ben Cheatham
  0 siblings, 1 reply; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-18 10:49 UTC (permalink / raw)
  To: Ben Cheatham, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/17/24 22:49, Ben Cheatham wrote:
> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> The current cxl core relies on a CXL_DEVTYPE_CLASSMEM type device when
>> creating a memdev, leading to problems when obtaining cxl_memdev_state
>> references from a CXL_DEVTYPE_DEVMEM type. This last device type is
>> managed by a specific vendor driver and does not need the same sysfs files
>> since no userspace intervention is expected.
>>
>> Create a new cxl_mem device type with no attributes for Type2.
>>
> I agree with the sentiment that type 2 devices shouldn't have the same sysfs files,
> but I think they should have *some* sysfs files. I would like to be able to see
> these devices show up in something like "cxl list", which this patch would prevent.
> I really think that it would be fine to only have the bare minimum though, such as
> ram resource size/location, NUMA node, serial, etc.


But this patch does not remove all sysfs files, just those 
depending on Type3-specific fields.

I can see the endpoint directory related to the accelerator cxl device, 
and information about the region, size, start, type, ...

Not sure if the ndctl cxl command should be modified for this kind of 
change, but I can see "cxl list -E" working.
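
As a rough illustration of the point above, here is a userspace sketch of a visibility callback that hides attributes only when the Type3-specific state is absent. The `struct memdev`, `struct memdev_state`, `enum attr_id` and `attr_visible()` names are hypothetical and only mimic the shape of the `cxl_mem_visible()` change in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Type3-only state. */
struct memdev_state {
	unsigned long poison_enabled;	/* bitmask of enabled poison commands */
};

struct memdev {
	struct memdev_state *mds;	/* NULL for a Type2 (accelerator) device */
};

enum { POISON_LIST = 0 };
enum attr_id { ATTR_TRIGGER_POISON_LIST, ATTR_OTHER };

/*
 * Return the mode to expose for an attribute in this group, or 0 to hide it.
 * Without Type3 state every attribute of the group is hidden; files that do
 * not belong to this group are not affected by this callback.
 */
static unsigned int attr_visible(const struct memdev *md, enum attr_id attr,
				 unsigned int mode)
{
	assert(md);

	if (!md->mds)
		return 0;

	if (attr == ATTR_TRIGGER_POISON_LIST &&
	    !(md->mds->poison_enabled & (1UL << POISON_LIST)))
		return 0;

	return mode;
}
```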


>> Avoid debugfs files relying on existence of cxl_memdev_state.
>>
>> Make devm_cxl_add_memdev accessible from an accel driver.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/memdev.c | 15 +++++++++++++--
>>   drivers/cxl/core/region.c |  3 ++-
>>   drivers/cxl/mem.c         | 25 +++++++++++++++++++------
>>   include/linux/cxl/cxl.h   |  2 ++
>>   4 files changed, 36 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 56fddb0d6a85..f168cd42f8a5 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -546,9 +546,17 @@ static const struct device_type cxl_memdev_type = {
>>   	.groups = cxl_memdev_attribute_groups,
>>   };
>>   
>> +static const struct device_type cxl_accel_memdev_type = {
>> +	.name = "cxl_memdev",
>> +	.release = cxl_memdev_release,
>> +	.devnode = cxl_memdev_devnode,
>> +};
>> +
>>   bool is_cxl_memdev(const struct device *dev)
>>   {
>> -	return dev->type == &cxl_memdev_type;
>> +	return (dev->type == &cxl_memdev_type ||
>> +		dev->type == &cxl_accel_memdev_type);
>> +
>>   }
>>   EXPORT_SYMBOL_NS_GPL(is_cxl_memdev, CXL);
>>   
>> @@ -659,7 +667,10 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>>   	dev->parent = cxlds->dev;
>>   	dev->bus = &cxl_bus_type;
>>   	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>> -	dev->type = &cxl_memdev_type;
>> +	if (cxlds->type == CXL_DEVTYPE_DEVMEM)
>> +		dev->type = &cxl_accel_memdev_type;
>> +	else
>> +		dev->type = &cxl_memdev_type;
>>   	device_set_pm_not_required(dev);
>>   	INIT_WORK(&cxlmd->detach_work, detach_memdev);
>>   
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 21ad5f242875..7e7761ff9fc4 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -1941,7 +1941,8 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>>   		return -EINVAL;
>>   	}
>>   
>> -	cxl_region_perf_data_calculate(cxlr, cxled);
>> +	if (cxlr->type == CXL_DECODER_HOSTONLYMEM)
>> +		cxl_region_perf_data_calculate(cxlr, cxled);
>>   
>>   	if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
>>   		int i;
>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>> index 7de232eaeb17..3a250ddeef35 100644
>> --- a/drivers/cxl/mem.c
>> +++ b/drivers/cxl/mem.c
>> @@ -131,12 +131,18 @@ static int cxl_mem_probe(struct device *dev)
>>   	dentry = cxl_debugfs_create_dir(dev_name(dev));
>>   	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
>>   
>> -	if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
>> -		debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
>> -				    &cxl_poison_inject_fops);
>> -	if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
>> -		debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
>> -				    &cxl_poison_clear_fops);
>> +	/*
>> +	 * Avoid poison debugfs files for Type2 devices as they rely on
>> +	 * cxl_memdev_state.
>> +	 */
>> +	if (mds) {
>> +		if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
>> +			debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
>> +					    &cxl_poison_inject_fops);
>> +		if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
>> +			debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
>> +					    &cxl_poison_clear_fops);
>> +	}
>>   
>>   	rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
>>   	if (rc)
>> @@ -222,6 +228,13 @@ static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
>>   	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>>   	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>>   
>> +	/*
>> +	 * Avoid poison sysfs files for Type2 devices as they rely on
>> +	 * cxl_memdev_state.
>> +	 */
>> +	if (!mds)
>> +		return 0;
>> +
>>   	if (a == &dev_attr_trigger_poison_list.attr)
>>   		if (!test_bit(CXL_POISON_ENABLED_LIST,
>>   			      mds->poison.enabled_cmds))
>> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
>> index b8ee42b38f68..bbbcf6574246 100644
>> --- a/include/linux/cxl/cxl.h
>> +++ b/include/linux/cxl/cxl.h
>> @@ -57,4 +57,6 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>>   int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>>   int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>>   void cxl_set_media_ready(struct cxl_dev_state *cxlds);
>> +struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>> +				       struct cxl_dev_state *cxlds);
>>   #endif

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 13/26] cxl: prepare memdev creation for type2
  2024-10-18 10:49     ` Alejandro Lucero Palau
@ 2024-10-18 16:40       ` Ben Cheatham
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Cheatham @ 2024-10-18 16:40 UTC (permalink / raw)
  To: Alejandro Lucero Palau, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet



On 10/18/24 5:49 AM, Alejandro Lucero Palau wrote:
> 
> On 10/17/24 22:49, Ben Cheatham wrote:
>> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>>> From: Alejandro Lucero <alucerop@amd.com>
>>>
>>> The current cxl core relies on a CXL_DEVTYPE_CLASSMEM type device when
>>> creating a memdev, leading to problems when obtaining cxl_memdev_state
>>> references from a CXL_DEVTYPE_DEVMEM type. This last device type is
>>> managed by a specific vendor driver and does not need the same sysfs files
>>> since no userspace intervention is expected.
>>>
>>> Create a new cxl_mem device type with no attributes for Type2.
>>>
>> I agree with the sentiment that type 2 devices shouldn't have the same sysfs files,
>> but I think they should have *some* sysfs files. I would like to be able to see
>> these devices show up in something like "cxl list", which this patch would prevent.
>> I really think that it would be fine to only have the bare minimum though, such as
>> ram resource size/location, NUMA node, serial, etc.
> 
> 
> But this patch does not remove all sysfs files, just those depending on Type3-specific fields.
> 
> I can see the endpoint directory related to the accelerator cxl device, and information about the region, size, start, type, ...
> 
> Not sure if the ndctl cxl command should be modified for this kind of change, but I can see "cxl list -E" working.
> 

Sorry, I guess that's what I get for just looking at it without testing! That should be fine
then.

> 
>>> Avoid debugfs files relying on existence of cxl_memdev_state.
>>>
>>> Make devm_cxl_add_memdev accessible from an accel driver.
>>>
>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>> ---
>>>   drivers/cxl/core/memdev.c | 15 +++++++++++++--
>>>   drivers/cxl/core/region.c |  3 ++-
>>>   drivers/cxl/mem.c         | 25 +++++++++++++++++++------
>>>   include/linux/cxl/cxl.h   |  2 ++
>>>   4 files changed, 36 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>>> index 56fddb0d6a85..f168cd42f8a5 100644
>>> --- a/drivers/cxl/core/memdev.c
>>> +++ b/drivers/cxl/core/memdev.c
>>> @@ -546,9 +546,17 @@ static const struct device_type cxl_memdev_type = {
>>>       .groups = cxl_memdev_attribute_groups,
>>>   };
>>>   +static const struct device_type cxl_accel_memdev_type = {
>>> +    .name = "cxl_memdev",
>>> +    .release = cxl_memdev_release,
>>> +    .devnode = cxl_memdev_devnode,
>>> +};
>>> +
>>>   bool is_cxl_memdev(const struct device *dev)
>>>   {
>>> -    return dev->type == &cxl_memdev_type;
>>> +    return (dev->type == &cxl_memdev_type ||
>>> +        dev->type == &cxl_accel_memdev_type);
>>> +
>>>   }
>>>   EXPORT_SYMBOL_NS_GPL(is_cxl_memdev, CXL);
>>>   @@ -659,7 +667,10 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>>>       dev->parent = cxlds->dev;
>>>       dev->bus = &cxl_bus_type;
>>>       dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>>> -    dev->type = &cxl_memdev_type;
>>> +    if (cxlds->type == CXL_DEVTYPE_DEVMEM)
>>> +        dev->type = &cxl_accel_memdev_type;
>>> +    else
>>> +        dev->type = &cxl_memdev_type;
>>>       device_set_pm_not_required(dev);
>>>       INIT_WORK(&cxlmd->detach_work, detach_memdev);
>>>   diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>> index 21ad5f242875..7e7761ff9fc4 100644
>>> --- a/drivers/cxl/core/region.c
>>> +++ b/drivers/cxl/core/region.c
>>> @@ -1941,7 +1941,8 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>>>           return -EINVAL;
>>>       }
>>>   -    cxl_region_perf_data_calculate(cxlr, cxled);
>>> +    if (cxlr->type == CXL_DECODER_HOSTONLYMEM)
>>> +        cxl_region_perf_data_calculate(cxlr, cxled);
>>>         if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
>>>           int i;
>>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>>> index 7de232eaeb17..3a250ddeef35 100644
>>> --- a/drivers/cxl/mem.c
>>> +++ b/drivers/cxl/mem.c
>>> @@ -131,12 +131,18 @@ static int cxl_mem_probe(struct device *dev)
>>>       dentry = cxl_debugfs_create_dir(dev_name(dev));
>>>       debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
>>>   -    if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
>>> -        debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
>>> -                    &cxl_poison_inject_fops);
>>> -    if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
>>> -        debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
>>> -                    &cxl_poison_clear_fops);
>>> +    /*
>>> +     * Avoid poison debugfs files for Type2 devices as they rely on
>>> +     * cxl_memdev_state.
>>> +     */
>>> +    if (mds) {
>>> +        if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
>>> +            debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
>>> +                        &cxl_poison_inject_fops);
>>> +        if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
>>> +            debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
>>> +                        &cxl_poison_clear_fops);
>>> +    }
>>>         rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
>>>       if (rc)
>>> @@ -222,6 +228,13 @@ static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
>>>       struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>>>       struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>>>   +    /*
>>> +     * Avoid poison sysfs files for Type2 devices as they rely on
>>> +     * cxl_memdev_state.
>>> +     */
>>> +    if (!mds)
>>> +        return 0;
>>> +
>>>       if (a == &dev_attr_trigger_poison_list.attr)
>>>           if (!test_bit(CXL_POISON_ENABLED_LIST,
>>>                     mds->poison.enabled_cmds))
>>> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
>>> index b8ee42b38f68..bbbcf6574246 100644
>>> --- a/include/linux/cxl/cxl.h
>>> +++ b/include/linux/cxl/cxl.h
>>> @@ -57,4 +57,6 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_dev_state *cxlds);
>>>   int cxl_request_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>>>   int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
>>>   void cxl_set_media_ready(struct cxl_dev_state *cxlds);
>>> +struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>> +                       struct cxl_dev_state *cxlds);
>>>   #endif

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH v4 14/26] sfc: create type2 cxl memdev
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (12 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 13/26] cxl: prepare memdev creation for type2 alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 15/26] cxl: define a driver interface for HPA free space enumeration alejandro.lucero-palau
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Use cxl API for creating a cxl memory device using the type2
cxl_dev_state struct.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/net/ethernet/sfc/efx_cxl.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 419cf9fb6bd0..452421d71fbf 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -94,12 +94,21 @@ int efx_cxl_init(struct efx_nic *efx)
 	 */
 	cxl_set_media_ready(cxl->cxlds);
 
+	cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
+	if (IS_ERR(cxl->cxlmd)) {
+		pci_err(pci_dev, "CXL accel memdev creation failed\n");
+		rc = PTR_ERR(cxl->cxlmd);
+		goto err3;
+	}
+
 	efx->cxl = cxl;
 #endif
 
 	return 0;
 
 #if IS_ENABLED(CONFIG_CXL_BUS)
+err3:
+	cxl_release_resource(cxl->cxlds, CXL_RES_RAM);
 err2:
 	kfree(cxl->cxlds);
 err1:
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v4 15/26] cxl: define a driver interface for HPA free space enumeration
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (13 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 14/26] sfc: create type2 cxl memdev alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 16/26] sfc: obtain root decoder with enough HPA free space alejandro.lucero-palau
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

CXL region creation involves allocating capacity from device DPA
(device-physical-address space) and assigning it to decode a given HPA
(host-physical-address space). Before determining how much DPA to
allocate, the amount of available HPA must be determined. Also, not all
HPA is created equal: some specifically targets RAM, some targets PMEM,
some is prepared for device-memory flows like HDM-D and HDM-DB, and some
is host-only (HDM-H).

Wrap all of those concerns into an API that retrieves a root decoder
(platform CXL window) that fits the specified constraints and the
capacity available for a new region.
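
The free-space search at the core of this API can be sketched in userspace C as follows. This is a simplified illustration, not the kernel code: `struct range` and `largest_free_gap()` are hypothetical names, range ends are inclusive (as for a kernel `struct resource`), and the busy ranges are assumed to be sorted, non-overlapping, fully inside the window and to end below UINT64_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address range with an inclusive end. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the size in bytes of the largest contiguous free gap in @window,
 * given @n already allocated ranges in @busy.
 */
static uint64_t largest_free_gap(struct range window,
				 const struct range *busy, size_t n)
{
	uint64_t cursor = window.start;	/* first address not known to be busy */
	uint64_t best = 0;
	size_t i;

	assert(window.end >= window.start);

	for (i = 0; i < n; i++) {
		/* Gap between the previous allocation (or window start) and this one. */
		if (busy[i].start > cursor && busy[i].start - cursor > best)
			best = busy[i].start - cursor;
		cursor = busy[i].end + 1;
	}

	/* Trailing gap between the last allocation and the window end. */
	if (cursor <= window.end && window.end - cursor + 1 > best)
		best = window.end - cursor + 1;

	return best;
}
```

With inclusive ends, the gap between two neighbouring allocations is the next start minus the previous end minus one, which is what the cursor arithmetic above encodes.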

Based on https://lore.kernel.org/linux-cxl/168592159290.1948938.13522227102445462976.stgit@dwillia2-xfh.jf.intel.com/

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c | 141 ++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h         |   3 +
 include/linux/cxl/cxl.h   |   8 +++
 3 files changed, 152 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 7e7761ff9fc4..3d5f40507df9 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -703,6 +703,147 @@ static int free_hpa(struct cxl_region *cxlr)
 	return 0;
 }
 
+struct cxlrd_max_context {
+	struct device *host_bridge;
+	unsigned long flags;
+	resource_size_t max_hpa;
+	struct cxl_root_decoder *cxlrd;
+};
+
+static int find_max_hpa(struct device *dev, void *data)
+{
+	struct cxlrd_max_context *ctx = data;
+	struct cxl_switch_decoder *cxlsd;
+	struct cxl_root_decoder *cxlrd;
+	struct resource *res, *prev;
+	struct cxl_decoder *cxld;
+	resource_size_t max;
+
+	if (!is_root_decoder(dev))
+		return 0;
+
+	cxlrd = to_cxl_root_decoder(dev);
+	cxlsd = &cxlrd->cxlsd;
+	cxld = &cxlsd->cxld;
+	if ((cxld->flags & ctx->flags) != ctx->flags) {
+		dev_dbg(dev, "%s, flags not matching: %08lx vs %08lx\n",
+			__func__, cxld->flags, ctx->flags);
+		return 0;
+	}
+
+	/* An accelerator can not be part of an interleaved HPA range. */
+	if (cxld->interleave_ways != 1) {
+		dev_dbg(dev, "%s, interleave_ways not matching\n", __func__);
+		return 0;
+	}
+
+	guard(rwsem_read)(&cxl_region_rwsem);
+	if (ctx->host_bridge != cxlsd->target[0]->dport_dev) {
+		dev_dbg(dev, "%s, host bridge does not match\n", __func__);
+		return 0;
+	}
+
+	/*
+	 * Walk the root decoder resource range relying on cxl_region_rwsem to
+	 * preclude sibling arrival/departure and find the largest free space
+	 * gap.
+	 */
+	lockdep_assert_held_read(&cxl_region_rwsem);
+	max = 0;
+	res = cxlrd->res->child;
+	if (!res)
+		max = resource_size(cxlrd->res);
+	else
+		max = 0;
+
+	for (prev = NULL; res; prev = res, res = res->sibling) {
+		struct resource *next = res->sibling;
+		resource_size_t free = 0;
+
+		if (!prev && res->start > cxlrd->res->start) {
+			free = res->start - cxlrd->res->start;
+			max = max(free, max);
+		}
+		if (prev && res->start > prev->end + 1) {
+			free = res->start - prev->end - 1;
+			max = max(free, max);
+		}
+		if (next && res->end + 1 < next->start) {
+			free = next->start - res->end - 1;
+			max = max(free, max);
+		}
+		if (!next && res->end + 1 < cxlrd->res->end + 1) {
+			free = cxlrd->res->end - res->end;
+			max = max(free, max);
+		}
+	}
+
+	dev_dbg(CXLRD_DEV(cxlrd), "%s, found %pa bytes of free space\n",
+		__func__, &max);
+	if (max > ctx->max_hpa) {
+		if (ctx->cxlrd)
+			put_device(CXLRD_DEV(ctx->cxlrd));
+		get_device(CXLRD_DEV(cxlrd));
+		ctx->cxlrd = cxlrd;
+		ctx->max_hpa = max;
+		dev_dbg(CXLRD_DEV(cxlrd), "%s, found %pa bytes of free space\n",
+			__func__, &max);
+	}
+	return 0;
+}
+
+/**
+ * cxl_get_hpa_freespace - find a root decoder with free capacity per constraints
+ * @cxlmd: the memdev whose endpoint is mapped by the returned decoder
+ * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
+ * @max_avail_contig: output parameter of max contiguous bytes available in the
+ *		      returned decoder
+ *
+ * The return tuple of a 'struct cxl_root_decoder' and 'bytes available
+ * (@max_avail_contig)' is a point in time snapshot. If by the time the caller
+ * goes to use this root decoder's capacity the capacity is reduced then the
+ * caller needs to loop and retry.
+ *
+ * The returned root decoder has an elevated reference count that needs to be
+ * put with put_device(CXLRD_DEV(cxlrd)). Locking context is with
+ * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
+ * does not race.
+ */
+struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_memdev *cxlmd,
+					       unsigned long flags,
+					       resource_size_t *max_avail_contig)
+{
+	struct cxl_port *endpoint = cxlmd->endpoint;
+	struct cxlrd_max_context ctx = {
+		.host_bridge = endpoint->host_bridge,
+		.flags = flags,
+	};
+	struct cxl_port *root_port;
+	struct cxl_root *root __free(put_cxl_root) = find_cxl_root(endpoint);
+
+	if (!is_cxl_endpoint(endpoint)) {
+		dev_dbg(&endpoint->dev, "hpa requestor is not an endpoint\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (!root) {
+		dev_dbg(&endpoint->dev, "endpoint can not be related to a root port\n");
+		return ERR_PTR(-ENXIO);
+	}
+
+	root_port = &root->port;
+	down_read(&cxl_region_rwsem);
+	device_for_each_child(&root_port->dev, &ctx, find_max_hpa);
+	up_read(&cxl_region_rwsem);
+
+	if (!ctx.cxlrd)
+		return ERR_PTR(-ENOMEM);
+
+	*max_avail_contig = ctx.max_hpa;
+	return ctx.cxlrd;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_get_hpa_freespace, CXL);
+
 static ssize_t size_store(struct device *dev, struct device_attribute *attr,
 			  const char *buf, size_t len)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index a7c242a19b62..2ea180f05acd 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -773,6 +773,9 @@ static inline void cxl_setup_parent_dport(struct device *host,
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
 struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
+
+#define CXLRD_DEV(cxlrd) (&(cxlrd)->cxlsd.cxld.dev)
+
 struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
 bool is_switch_decoder(struct device *dev);
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index bbbcf6574246..46381bbda5f4 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -7,6 +7,10 @@
 #include <linux/device.h>
 #include <linux/pci.h>
 
+#define CXL_DECODER_F_RAM   BIT(0)
+#define CXL_DECODER_F_PMEM  BIT(1)
+#define CXL_DECODER_F_TYPE2 BIT(2)
+
 enum cxl_resource {
 	CXL_RES_DPA,
 	CXL_RES_RAM,
@@ -59,4 +63,8 @@ int cxl_release_resource(struct cxl_dev_state *cxlds, enum cxl_resource type);
 void cxl_set_media_ready(struct cxl_dev_state *cxlds);
 struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
 				       struct cxl_dev_state *cxlds);
+struct cxl_port;
+struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_memdev *cxlmd,
+					       unsigned long flags,
+					       resource_size_t *max);
 #endif
-- 
2.17.1


* [PATCH v4 16/26] sfc: obtain root decoder with enough HPA free space
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (14 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 15/26] cxl: define a driver interface for HPA free space enumeration alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 17/26] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Asking for available HPA space is the step that precedes obtaining an
HPA range suitable for the accel driver's purposes.

Add this call to efx cxl initialization.
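
As a rough illustration of the selection the core performs on behalf of
this call, here is a userspace model. It is only a sketch: the struct,
the SKETCH_F_* flags and sketch_find_max_hpa() are hypothetical names,
and the "all requested flags must be set" filter is an assumption about
how candidates are matched. Only the "keep the decoder with the largest
free range" step mirrors the tail of find_max_hpa() in patch 15.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags, modelled on CXL_DECODER_F_{RAM,PMEM,TYPE2}. */
#define SKETCH_F_RAM	(1UL << 0)
#define SKETCH_F_PMEM	(1UL << 1)
#define SKETCH_F_TYPE2	(1UL << 2)

/* Hypothetical stand-in for a root decoder and its largest free HPA gap. */
struct sketch_root_decoder {
	unsigned long flags;
	uint64_t max_free;	/* largest contiguous free range, in bytes */
};

/*
 * Return the candidate with the largest free range among those that
 * carry every requested flag, or NULL if none qualifies. *max_avail is
 * written only on success.
 */
static const struct sketch_root_decoder *
sketch_find_max_hpa(const struct sketch_root_decoder *rd, size_t n,
		    unsigned long flags, uint64_t *max_avail)
{
	const struct sketch_root_decoder *best = NULL;
	uint64_t max = 0;
	size_t i;

	assert(max_avail);

	for (i = 0; i < n; i++) {
		/* Assumed filter: skip decoders missing a requested flag. */
		if ((rd[i].flags & flags) != flags)
			continue;
		/* Same comparison as "if (max > ctx->max_hpa)". */
		if (rd[i].max_free > max) {
			max = rd[i].max_free;
			best = &rd[i];
		}
	}

	if (best)
		*max_avail = max;
	return best;
}
```

The caller then compares the reported size against what it needs, as
efx_cxl_init() does with EFX_CTPIO_BUFFER_SIZE, and treats the result as
a snapshot that may shrink before it is used.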

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/net/ethernet/sfc/efx_cxl.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 452421d71fbf..399bd60f2e40 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -26,6 +26,7 @@ int efx_cxl_init(struct efx_nic *efx)
 	DECLARE_BITMAP(found, CXL_MAX_CAPS);
 	struct efx_cxl *cxl;
 	struct resource res;
+	resource_size_t max;
 	u16 dvsec;
 	int rc;
 
@@ -101,6 +102,23 @@ int efx_cxl_init(struct efx_nic *efx)
 		goto err3;
 	}
 
+	cxl->cxlrd = cxl_get_hpa_freespace(cxl->cxlmd,
+					   CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
+					   &max);
+
+	if (IS_ERR(cxl->cxlrd)) {
+		pci_err(pci_dev, "cxl_get_hpa_freespace failed\n");
+		rc = PTR_ERR(cxl->cxlrd);
+		goto err3;
+	}
+
+	if (max < EFX_CTPIO_BUFFER_SIZE) {
+		pci_err(pci_dev, "%s: not enough free HPA space %pa < %u\n",
+			__func__, &max, EFX_CTPIO_BUFFER_SIZE);
+		rc = -ENOSPC;
+		goto err3;
+	}
+
 	efx->cxl = cxl;
 #endif
 
-- 
2.17.1


* [PATCH v4 17/26] cxl: define a driver interface for DPA allocation
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (15 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 16/26] sfc: obtain root decoder with enough HPA free space alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 18/26] sfc: get endpoint decoder alejandro.lucero-palau
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Region creation involves finding available DPA (device-physical-address)
capacity to map into HPA (host-physical-address) space. Given the HPA
capacity constraint, define an API, cxl_request_dpa(), that has the
flexibility to map the minimum amount of memory the driver needs to
operate vs the total possible that can be mapped given HPA availability.

Factor out the core of cxl_dpa_alloc, that does free space scanning,
into a cxl_dpa_freespace() helper, and use that to balance the capacity
available to map vs the @min and @max arguments to cxl_request_dpa.
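
The balancing between free capacity and the @min and @max arguments can
be modelled in userspace as follows. This is a sketch of the sizing
policy only: sketch_pick_dpa_size() and SKETCH_SZ_256M are hypothetical
names, and 'avail' stands in for the value cxl_dpa_freespace() would
report for the chosen decoder.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_SZ_256M	(256ULL << 20)

/*
 * Decide how much DPA to allocate. Returns 0 and stores the size in
 * *alloc, or a negative errno. Models only the arithmetic in
 * cxl_request_dpa(); decoder lookup, mode setup and locking are omitted.
 */
static int sketch_pick_dpa_size(uint64_t avail, uint64_t min_sz,
				uint64_t max_sz, uint64_t *alloc)
{
	assert(alloc);

	/* Both bounds must be 256MB aligned. */
	if ((min_sz | max_sz) & (SKETCH_SZ_256M - 1))
		return -EINVAL;

	/* A non-zero max caps the request; zero means "all free DPA". */
	if (max_sz && max_sz < avail)
		avail = max_sz;

	/* Less than the driver's minimum is a failure. */
	if (avail < min_sz)
		return -ENOMEM;

	*alloc = avail;
	return 0;
}
```

With 1GB free, min = 256MB and max = 512MB the model picks 512MB; with
only 256MB free and min = 512MB it fails with -ENOMEM.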

Based on https://lore.kernel.org/linux-cxl/168592158743.1948938.7622563891193802610.stgit@dwillia2-xfh.jf.intel.com/

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/hdm.c  | 153 ++++++++++++++++++++++++++++++++++------
 include/linux/cxl/cxl.h |   5 ++
 2 files changed, 138 insertions(+), 20 deletions(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index c729541bb7e1..d2afc9a1d8f6 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -3,6 +3,7 @@
 #include <linux/seq_file.h>
 #include <linux/device.h>
 #include <linux/delay.h>
+#include <linux/cxl/cxl.h>
 
 #include "cxlmem.h"
 #include "core.h"
@@ -420,6 +421,7 @@ int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
 	up_write(&cxl_dpa_rwsem);
 	return rc;
 }
+EXPORT_SYMBOL_NS_GPL(cxl_dpa_free, CXL);
 
 int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 		     enum cxl_decoder_mode mode)
@@ -467,31 +469,18 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 	return rc;
 }
 
-int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
+static resource_size_t cxl_dpa_freespace(struct cxl_endpoint_decoder *cxled,
+					 resource_size_t *start_out,
+					 resource_size_t *skip_out)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	resource_size_t free_ram_start, free_pmem_start;
-	struct cxl_port *port = cxled_to_port(cxled);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
-	struct device *dev = &cxled->cxld.dev;
 	resource_size_t start, avail, skip;
 	struct resource *p, *last;
-	int rc;
-
-	down_write(&cxl_dpa_rwsem);
-	if (cxled->cxld.region) {
-		dev_dbg(dev, "decoder attached to %s\n",
-			dev_name(&cxled->cxld.region->dev));
-		rc = -EBUSY;
-		goto out;
-	}
 
-	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
-		dev_dbg(dev, "decoder enabled\n");
-		rc = -EBUSY;
-		goto out;
-	}
 
+	lockdep_assert_held(&cxl_dpa_rwsem);
 	for (p = cxlds->ram_res.child, last = NULL; p; p = p->sibling)
 		last = p;
 	if (last)
@@ -528,14 +517,45 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
 			skip_end = start - 1;
 		skip = skip_end - skip_start + 1;
 	} else {
-		dev_dbg(dev, "mode not set\n");
-		rc = -EINVAL;
+		avail = 0;
+	}
+
+	if (!avail)
+		return 0;
+	if (start_out)
+		*start_out = start;
+	if (skip_out)
+		*skip_out = skip;
+	return avail;
+}
+
+int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
+{
+	struct cxl_port *port = cxled_to_port(cxled);
+	struct device *dev = &cxled->cxld.dev;
+	resource_size_t start, avail, skip;
+	int rc;
+
+	down_write(&cxl_dpa_rwsem);
+	if (cxled->cxld.region) {
+		dev_dbg(dev, "EBUSY, decoder attached to %s\n",
+			dev_name(&cxled->cxld.region->dev));
+		rc = -EBUSY;
 		goto out;
 	}
 
+	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
+		dev_dbg(dev, "EBUSY, decoder enabled\n");
+		rc = -EBUSY;
+		goto out;
+	}
+
+	avail = cxl_dpa_freespace(cxled, &start, &skip);
+
 	if (size > avail) {
 		dev_dbg(dev, "%pa exceeds available %s capacity: %pa\n", &size,
-			cxl_decoder_mode_name(cxled->mode), &avail);
+			     cxled->mode == CXL_DECODER_RAM ? "ram" : "pmem",
+			     &avail);
 		rc = -ENOSPC;
 		goto out;
 	}
@@ -550,6 +570,99 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
 	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
 }
 
+static int find_free_decoder(struct device *dev, void *data)
+{
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_port *port;
+
+	if (!is_endpoint_decoder(dev))
+		return 0;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	port = cxled_to_port(cxled);
+
+	if (cxled->cxld.id != port->hdm_end + 1)
+		return 0;
+
+	return 1;
+}
+
+/**
+ * cxl_request_dpa - search and reserve DPA given input constraints
+ * @cxlmd: memdev whose endpoint port has available decoders
+ * @is_ram: DPA operation mode (ram vs pmem)
+ * @min: the minimum amount of capacity the call needs
+ * @max: extra capacity to allocate after min is satisfied
+ *
+ * Given that a region needs to allocate from limited HPA capacity it
+ * may be the case that a device has more mappable DPA capacity than
+ * available HPA. So, the expectation is that @min is a driver known
+ * value for how much capacity is needed, and @max is based on the limit of
+ * how much HPA space is available for a new region.
+ *
+ * Returns a pinned cxl_decoder with at least @min bytes of capacity
+ * reserved, or an error pointer. The caller is also expected to own the
+ * lifetime of the memdev registration associated with the endpoint to
+ * pin the decoder registered as well.
+ */
+struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_memdev *cxlmd,
+					     bool is_ram,
+					     resource_size_t min,
+					     resource_size_t max)
+{
+	struct cxl_port *endpoint = cxlmd->endpoint;
+	struct cxl_endpoint_decoder *cxled;
+	enum cxl_decoder_mode mode;
+	struct device *cxled_dev;
+	resource_size_t alloc;
+	int rc;
+
+	if (!IS_ALIGNED(min | max, SZ_256M))
+		return ERR_PTR(-EINVAL);
+
+	down_read(&cxl_dpa_rwsem);
+	cxled_dev = device_find_child(&endpoint->dev, NULL, find_free_decoder);
+	up_read(&cxl_dpa_rwsem);
+
+	if (!cxled_dev)
+		cxled = ERR_PTR(-ENXIO);
+	else
+		cxled = to_cxl_endpoint_decoder(cxled_dev);
+
+	if (!cxled || IS_ERR(cxled))
+		return cxled;
+
+	if (is_ram)
+		mode = CXL_DECODER_RAM;
+	else
+		mode = CXL_DECODER_PMEM;
+
+	rc = cxl_dpa_set_mode(cxled, mode);
+	if (rc)
+		goto err;
+
+	down_read(&cxl_dpa_rwsem);
+	alloc = cxl_dpa_freespace(cxled, NULL, NULL);
+	up_read(&cxl_dpa_rwsem);
+
+	if (max)
+		alloc = min(max, alloc);
+	if (alloc < min) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	rc = cxl_dpa_alloc(cxled, alloc);
+	if (rc)
+		goto err;
+
+	return cxled;
+err:
+	put_device(cxled_dev);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_request_dpa, CXL);
+
 static void cxld_set_interleave(struct cxl_decoder *cxld, u32 *ctrl)
 {
 	u16 eig;
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index 46381bbda5f4..45b6badb8048 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -67,4 +67,9 @@ struct cxl_port;
 struct cxl_root_decoder *cxl_get_hpa_freespace(struct cxl_memdev *cxlmd,
 					       unsigned long flags,
 					       resource_size_t *max);
+struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_memdev *cxlmd,
+					     bool is_ram,
+					     resource_size_t min,
+					     resource_size_t max);
+int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
 #endif
-- 
2.17.1


* [PATCH v4 18/26] sfc: get endpoint decoder
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (16 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 17/26] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 19/26] cxl: make region type based on endpoint type alejandro.lucero-palau
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Use the CXL API to obtain DPA (Device Physical Address) capacity for use
through an endpoint decoder.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/net/ethernet/sfc/efx_cxl.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 399bd60f2e40..c0da75b2d8e1 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -119,6 +119,14 @@ int efx_cxl_init(struct efx_nic *efx)
 		goto err3;
 	}
 
+	cxl->cxled = cxl_request_dpa(cxl->cxlmd, true, EFX_CTPIO_BUFFER_SIZE,
+				     EFX_CTPIO_BUFFER_SIZE);
+	if (!cxl->cxled || IS_ERR(cxl->cxled)) {
+		pci_err(pci_dev, "CXL accel request DPA failed\n");
+		rc = cxl->cxled ? PTR_ERR(cxl->cxled) : -ENXIO;
+		goto err3;
+	}
+
 	efx->cxl = cxl;
 #endif
 
@@ -140,6 +148,7 @@ void efx_cxl_exit(struct efx_nic *efx)
 {
 #if IS_ENABLED(CONFIG_CXL_BUS)
 	if (efx->cxl) {
+		cxl_dpa_free(efx->cxl->cxled);
 		cxl_release_resource(efx->cxl->cxlds, CXL_RES_RAM);
 		kfree(efx->cxl->cxlds);
 		kfree(efx->cxl);
-- 
2.17.1


* [PATCH v4 19/26] cxl: make region type based on endpoint type
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (17 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 18/26] sfc: get endpoint decoder alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 20/26] cxl/region: factor out interleave ways setup alejandro.lucero-palau
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

The current code expects only Type3, i.e. CXL_DECODER_HOSTONLYMEM,
devices. Supporting Type2 means the region type needs to be based on the
endpoint type instead.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/region.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 3d5f40507df9..5c0a40fa1b10 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2665,7 +2665,8 @@ static ssize_t create_ram_region_show(struct device *dev,
 }
 
 static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
-					  enum cxl_decoder_mode mode, int id)
+					  enum cxl_decoder_mode mode, int id,
+					  enum cxl_decoder_type target_type)
 {
 	int rc;
 
@@ -2687,7 +2688,7 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
 		return ERR_PTR(-EBUSY);
 	}
 
-	return devm_cxl_add_region(cxlrd, id, mode, CXL_DECODER_HOSTONLYMEM);
+	return devm_cxl_add_region(cxlrd, id, mode, target_type);
 }
 
 static ssize_t create_pmem_region_store(struct device *dev,
@@ -2702,7 +2703,8 @@ static ssize_t create_pmem_region_store(struct device *dev,
 	if (rc != 1)
 		return -EINVAL;
 
-	cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id);
+	cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id,
+			       CXL_DECODER_HOSTONLYMEM);
 	if (IS_ERR(cxlr))
 		return PTR_ERR(cxlr);
 
@@ -2722,7 +2724,8 @@ static ssize_t create_ram_region_store(struct device *dev,
 	if (rc != 1)
 		return -EINVAL;
 
-	cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id);
+	cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id,
+			       CXL_DECODER_HOSTONLYMEM);
 	if (IS_ERR(cxlr))
 		return PTR_ERR(cxlr);
 
@@ -3382,7 +3385,8 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 
 	do {
 		cxlr = __create_region(cxlrd, cxled->mode,
-				       atomic_read(&cxlrd->region_id));
+				       atomic_read(&cxlrd->region_id),
+				       cxled->cxld.target_type);
 	} while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
 
 	if (IS_ERR(cxlr)) {
-- 
2.17.1


* [PATCH v4 20/26] cxl/region: factor out interleave ways setup
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (18 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 19/26] cxl: make region type based on endpoint type alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 21/26] cxl/region: factor out interleave granularity setup alejandro.lucero-palau
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

In preparation for kernel driven region creation, factor out a common
helper from the user-sysfs region setup for interleave ways.
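
The shape of the split can be modelled in userspace as below. This is a
sketch under stated assumptions: all sketch_* names are hypothetical,
the rwsem is left out (the real helper asserts that cxl_region_rwsem is
held for write), and the accepted way counts are the set I understand
ways_to_eiw() to accept.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

struct sketch_region {
	int interleave_ways;
	int active;	/* models p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE */
};

/* Assumed set of encodable interleave ways. */
static int sketch_ways_valid(int ways)
{
	switch (ways) {
	case 1: case 2: case 3: case 4:
	case 6: case 8: case 12: case 16:
		return 1;
	default:
		return 0;
	}
}

/* Core helper: usable by both the sysfs path and in-kernel callers. */
static int sketch_set_ways(struct sketch_region *r, int val)
{
	assert(r);

	if (!sketch_ways_valid(val))
		return -EINVAL;
	if (r->active)
		return -EBUSY;

	r->interleave_ways = val;
	return 0;
}

/* sysfs-style wrapper: only parses the string, then defers to the helper. */
static int sketch_ways_store(struct sketch_region *r, const char *buf)
{
	unsigned long val;
	char *end;

	errno = 0;
	val = strtoul(buf, &end, 0);
	if (errno || end == buf || (*end != '\0' && *end != '\n') ||
	    val > INT_MAX)
		return -EINVAL;

	return sketch_set_ways(r, (int)val);
}
```

A kernel-driven caller skips the parsing step and calls the helper
directly, e.g. sketch_set_ways(&region, 1) for a single-device region.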

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c | 46 +++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5c0a40fa1b10..ad5818fbdeb6 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -480,22 +480,14 @@ static ssize_t interleave_ways_show(struct device *dev,
 
 static const struct attribute_group *get_cxl_region_target_group(void);
 
-static ssize_t interleave_ways_store(struct device *dev,
-				     struct device_attribute *attr,
-				     const char *buf, size_t len)
+static int set_interleave_ways(struct cxl_region *cxlr, int val)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
 	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
-	struct cxl_region *cxlr = to_cxl_region(dev);
 	struct cxl_region_params *p = &cxlr->params;
-	unsigned int val, save;
-	int rc;
+	int save, rc;
 	u8 iw;
 
-	rc = kstrtouint(buf, 0, &val);
-	if (rc)
-		return rc;
-
 	rc = ways_to_eiw(val, &iw);
 	if (rc)
 		return rc;
@@ -510,20 +502,36 @@ static ssize_t interleave_ways_store(struct device *dev,
 		return -EINVAL;
 	}
 
-	rc = down_write_killable(&cxl_region_rwsem);
-	if (rc)
-		return rc;
-	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
-		rc = -EBUSY;
-		goto out;
-	}
+	lockdep_assert_held_write(&cxl_region_rwsem);
+	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
+		return -EBUSY;
 
 	save = p->interleave_ways;
 	p->interleave_ways = val;
 	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
 	if (rc)
 		p->interleave_ways = save;
-out:
+
+	return rc;
+}
+
+static ssize_t interleave_ways_store(struct device *dev,
+				     struct device_attribute *attr,
+				     const char *buf, size_t len)
+{
+	struct cxl_region *cxlr = to_cxl_region(dev);
+	unsigned int val;
+	int rc;
+
+	rc = kstrtouint(buf, 0, &val);
+	if (rc)
+		return rc;
+
+	rc = down_write_killable(&cxl_region_rwsem);
+	if (rc)
+		return rc;
+
+	rc = set_interleave_ways(cxlr, val);
 	up_write(&cxl_region_rwsem);
 	if (rc)
 		return rc;
-- 
2.17.1


* [PATCH v4 21/26] cxl/region: factor out interleave granularity setup
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (19 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 20/26] cxl/region: factor out interleave ways setup alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 22/26] cxl: allow region creation by type2 drivers alejandro.lucero-palau
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

In preparation for kernel driven region creation, factor out a common
helper from the user-sysfs region setup for interleave granularity.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c | 39 +++++++++++++++++++++++----------------
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index ad5818fbdeb6..d08a2a848ac9 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -556,21 +556,14 @@ static ssize_t interleave_granularity_show(struct device *dev,
 	return rc;
 }
 
-static ssize_t interleave_granularity_store(struct device *dev,
-					    struct device_attribute *attr,
-					    const char *buf, size_t len)
+static int set_interleave_granularity(struct cxl_region *cxlr, int val)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
 	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
-	struct cxl_region *cxlr = to_cxl_region(dev);
 	struct cxl_region_params *p = &cxlr->params;
-	int rc, val;
+	int rc;
 	u16 ig;
 
-	rc = kstrtoint(buf, 0, &val);
-	if (rc)
-		return rc;
-
 	rc = granularity_to_eig(val, &ig);
 	if (rc)
 		return rc;
@@ -586,16 +579,30 @@ static ssize_t interleave_granularity_store(struct device *dev,
 	if (cxld->interleave_ways > 1 && val != cxld->interleave_granularity)
 		return -EINVAL;
 
+	lockdep_assert_held_write(&cxl_region_rwsem);
+	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
+		return -EBUSY;
+
+	p->interleave_granularity = val;
+	return 0;
+}
+
+static ssize_t interleave_granularity_store(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t len)
+{
+	struct cxl_region *cxlr = to_cxl_region(dev);
+	int rc, val;
+
+	rc = kstrtoint(buf, 0, &val);
+	if (rc)
+		return rc;
+
 	rc = down_write_killable(&cxl_region_rwsem);
 	if (rc)
 		return rc;
-	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
-		rc = -EBUSY;
-		goto out;
-	}
 
-	p->interleave_granularity = val;
-out:
+	rc = set_interleave_granularity(cxlr, val);
 	up_write(&cxl_region_rwsem);
 	if (rc)
 		return rc;
-- 
2.17.1


* [PATCH v4 22/26] cxl: allow region creation by type2 drivers
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (20 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 21/26] cxl/region: factor out interleave granularity setup alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 21:49   ` Ben Cheatham
  2024-10-17 16:52 ` [PATCH v4 23/26] sfc: create cxl region alejandro.lucero-palau
                   ` (4 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Creating a CXL region currently requires userspace intervention through
the cxl sysfs files. Type2 support should allow accelerator drivers to
create such a CXL region from kernel code.

Add that functionality and integrate it with the current support for
memory expanders.

Based on https://lore.kernel.org/linux-cxl/168592159835.1948938.1647215579839222774.stgit@dwillia2-xfh.jf.intel.com/

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c | 147 ++++++++++++++++++++++++++++++++++----
 drivers/cxl/cxlmem.h      |   2 +
 include/linux/cxl/cxl.h   |   4 ++
 3 files changed, 138 insertions(+), 15 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index d08a2a848ac9..04c270a29e96 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2253,6 +2253,18 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
 	return rc;
 }
 
+int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled)
+{
+	int rc;
+
+	down_write(&cxl_region_rwsem);
+	cxled->mode = CXL_DECODER_DEAD;
+	rc = cxl_region_detach(cxled);
+	up_write(&cxl_region_rwsem);
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_region_detach, CXL);
+
 void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
 {
 	down_write(&cxl_region_rwsem);
@@ -2781,6 +2793,14 @@ cxl_find_region_by_name(struct cxl_root_decoder *cxlrd, const char *name)
 	return to_cxl_region(region_dev);
 }
 
+static void drop_region(struct cxl_region *cxlr)
+{
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_port *port = cxlrd_to_port(cxlrd);
+
+	devm_release_action(port->uport_dev, unregister_region, cxlr);
+}
+
 static ssize_t delete_region_store(struct device *dev,
 				   struct device_attribute *attr,
 				   const char *buf, size_t len)
@@ -3386,17 +3406,18 @@ static int match_region_by_range(struct device *dev, void *data)
 	return rc;
 }
 
-/* Establish an empty region covering the given HPA range */
-static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
-					   struct cxl_endpoint_decoder *cxled)
+static void construct_region_end(void)
+{
+	up_write(&cxl_region_rwsem);
+}
+
+static struct cxl_region *construct_region_begin(struct cxl_root_decoder *cxlrd,
+						 struct cxl_endpoint_decoder *cxled)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
-	struct cxl_port *port = cxlrd_to_port(cxlrd);
-	struct range *hpa = &cxled->cxld.hpa_range;
 	struct cxl_region_params *p;
 	struct cxl_region *cxlr;
-	struct resource *res;
-	int rc;
+	int err;
 
 	do {
 		cxlr = __create_region(cxlrd, cxled->mode,
@@ -3405,8 +3426,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 	} while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
 
 	if (IS_ERR(cxlr)) {
-		dev_err(cxlmd->dev.parent,
-			"%s:%s: %s failed assign region: %ld\n",
+		dev_err(cxlmd->dev.parent, "%s:%s: %s failed assign region: %ld\n",
 			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
 			__func__, PTR_ERR(cxlr));
 		return cxlr;
@@ -3416,13 +3436,33 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 	p = &cxlr->params;
 	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
 		dev_err(cxlmd->dev.parent,
-			"%s:%s: %s autodiscovery interrupted\n",
+			"%s:%s: %s region setup interrupted\n",
 			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
 			__func__);
-		rc = -EBUSY;
-		goto err;
+		err = -EBUSY;
+		construct_region_end();
+		drop_region(cxlr);
+		return ERR_PTR(err);
 	}
 
+	return cxlr;
+}
+
+/* Establish an empty region covering the given HPA range */
+static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
+					   struct cxl_endpoint_decoder *cxled)
+{
+	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+	struct range *hpa = &cxled->cxld.hpa_range;
+	struct cxl_region_params *p;
+	struct cxl_region *cxlr;
+	struct resource *res;
+	int rc;
+
+	cxlr = construct_region_begin(cxlrd, cxled);
+	if (IS_ERR(cxlr))
+		return cxlr;
+
 	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
 
 	res = kmalloc(sizeof(*res), GFP_KERNEL);
@@ -3445,6 +3485,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 			 __func__, dev_name(&cxlr->dev));
 	}
 
+	p = &cxlr->params;
 	p->res = res;
 	p->interleave_ways = cxled->cxld.interleave_ways;
 	p->interleave_granularity = cxled->cxld.interleave_granularity;
@@ -3462,15 +3503,91 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 	/* ...to match put_device() in cxl_add_to_region() */
 	get_device(&cxlr->dev);
 	up_write(&cxl_region_rwsem);
-
+	construct_region_end();
 	return cxlr;
 
 err:
-	up_write(&cxl_region_rwsem);
-	devm_release_action(port->uport_dev, unregister_region, cxlr);
+	construct_region_end();
+	drop_region(cxlr);
+	return ERR_PTR(rc);
+}
+
+static struct cxl_region *
+__construct_new_region(struct cxl_root_decoder *cxlrd,
+		       struct cxl_endpoint_decoder *cxled)
+{
+	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
+	struct cxl_region_params *p;
+	struct cxl_region *cxlr;
+	int rc;
+
+	cxlr = construct_region_begin(cxlrd, cxled);
+	if (IS_ERR(cxlr))
+		return cxlr;
+
+	rc = set_interleave_ways(cxlr, 1);
+	if (rc)
+		goto err;
+
+	rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
+	if (rc)
+		goto err;
+
+	rc = alloc_hpa(cxlr, resource_size(cxled->dpa_res));
+	if (rc)
+		goto err;
+
+	down_read(&cxl_dpa_rwsem);
+	rc = cxl_region_attach(cxlr, cxled, 0);
+	up_read(&cxl_dpa_rwsem);
+
+	if (rc)
+		goto err;
+
+	rc = cxl_region_decode_commit(cxlr);
+	if (rc)
+		goto err;
+
+	p = &cxlr->params;
+	p->state = CXL_CONFIG_COMMIT;
+
+	construct_region_end();
+	return cxlr;
+err:
+	construct_region_end();
+	drop_region(cxlr);
 	return ERR_PTR(rc);
 }
 
+/**
+ * cxl_create_region - Establish a region given an endpoint decoder
+ * @cxlrd: root decoder to allocate HPA
+ * @cxled: endpoint decoder with reserved DPA capacity
+ *
+ * Returns a fully formed region in the commit state and attached to the
+ * cxl_region driver.
+ */
+struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
+				     struct cxl_endpoint_decoder *cxled)
+{
+	struct cxl_region *cxlr;
+
+	mutex_lock(&cxlrd->range_lock);
+	cxlr = __construct_new_region(cxlrd, cxled);
+	mutex_unlock(&cxlrd->range_lock);
+
+	if (IS_ERR(cxlr))
+		return cxlr;
+
+	if (device_attach(&cxlr->dev) <= 0) {
+		dev_err(&cxlr->dev, "failed to create region\n");
+		drop_region(cxlr);
+		return ERR_PTR(-ENODEV);
+	}
+	return cxlr;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
+
 int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 68d28eab3696..0f5c71909fd1 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -875,4 +875,6 @@ struct cxl_hdm {
 struct seq_file;
 struct dentry *cxl_debugfs_create_dir(const char *dir);
 void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
+struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
+				     struct cxl_endpoint_decoder *cxled);
 #endif /* __CXL_MEM_H__ */
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index 45b6badb8048..c544339c2baf 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -72,4 +72,8 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_memdev *cxlmd,
 					     resource_size_t min,
 					     resource_size_t max);
 int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
+struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
+				     struct cxl_endpoint_decoder *cxled);
+
+int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled);
 #endif
-- 
2.17.1


* Re: [PATCH v4 22/26] cxl: allow region creation by type2 drivers
  2024-10-17 16:52 ` [PATCH v4 22/26] cxl: allow region creation by type2 drivers alejandro.lucero-palau
@ 2024-10-17 21:49   ` Ben Cheatham
  2024-10-18  8:51     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Cheatham @ 2024-10-17 21:49 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet

On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Creating a CXL region requires userspace intervention through the cxl
> sysfs files. Type2 support should allow accelerator drivers to create
> such cxl region from kernel code.
> 
> Adding that functionality and integrating it with current support for
> memory expanders.
> 
> Based on https://lore.kernel.org/linux-cxl/168592159835.1948938.1647215579839222774.stgit@dwillia2-xfh.jf.intel.com/
> 

So I ran into an issue at this point when using v3 as a base for my own testing. The problem is that
you are doing manual region management while not explicitly preventing auto region discovery when
devm_cxl_add_memdev() is called (patch 14/26 in this series). This caused some resource allocation
conflicts which then caused both the auto region and the manual region set up to fail. To make it more
concrete, here's the flow I encountered (I tried something new here, let me know if the ascii
is all mangled):

devm_cxl_add_memdev() is called                                                                         
│                                                                                                       
├───► cxl_mem probes new memdev                                                                         
│     │                                                                                                 
│     ├─► cxl_mem probe adds new endpoint port                                                          
│     │                                                                                                 
│     └─► cxl_mem probe finishes                                                                        
├───────────────────────────────────────────────► Manual region set up starts (finding free space, etc.)
├───► cxl_port probes the new endpoint port            │                                                
│     │                                                │                                                
│     ├─► cxl_port probe sets up new endpoint          ├─► create_new_region() is called                
│     │                                                │                                                
│     ├─► cxl_port calls discover_region()             │                                                
│     │                                                │                                                
│     ├─► discover_region() creates new auto           ├─► create_new_region() creates
│     │   discovered region                            │   new manual region
│◄────◄────────────────────────────────────────────────┘                                                
│                                                                                                       
└─► Region creation fails due to resource contention/race (DPA resource, RAM resource, etc.)

The timeline is a little off here I think, but it should be close enough to illustrate the point.
The easy solution here is to not allow auto region discovery for CXL type 2 devices, like so:

diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 22a9ba89cf5a..07b991e2c05b 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -34,6 +34,7 @@ static void schedule_detach(void *cxlmd)
 static int discover_region(struct device *dev, void *root)
 {
        struct cxl_endpoint_decoder *cxled;
+       struct cxl_memdev *cxlmd;
        int rc;

        dev_err(dev, "%s:%d: Enter\n", __func__, __LINE__);
@@ -45,7 +46,9 @@ static int discover_region(struct device *dev, void *root)
        if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
                return 0;

-       if (cxled->state != CXL_DECODER_STATE_AUTO)
+       cxlmd = cxled_to_memdev(cxled);
+       if (cxled->state != CXL_DECODER_STATE_AUTO ||
+           cxlmd->cxlds->type == CXL_DEVTYPE_DEVMEM)
                return 0;

I think there's a better way to go about this, more to say about it in patch 24/26. I've
dropped this here just in case you don't like my ideas there ;).
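
To make the effect of the extra condition concrete, here is a minimal user-space sketch of the skip decision that discover_region() would make with the change above. It is only a model, not kernel code: every model_* name, the struct layout and the flag value are hypothetical stand-ins for the real kernel definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel definitions; the values are illustrative. */
#define MODEL_DECODER_F_ENABLE (1UL << 0)

enum model_decoder_state { MODEL_DECODER_STATE_MANUAL, MODEL_DECODER_STATE_AUTO };
enum model_devtype { MODEL_DEVTYPE_DEVMEM, MODEL_DEVTYPE_CLASSMEM };

struct model_endpoint_decoder {
	unsigned long flags;
	enum model_decoder_state state;
	enum model_devtype type;	/* stands in for cxlmd->cxlds->type */
};

/* Returns true when auto discovery would go on to build a region. */
static bool model_wants_auto_region(const struct model_endpoint_decoder *d)
{
	if ((d->flags & MODEL_DECODER_F_ENABLE) == 0)
		return false;	/* decoder not enabled: nothing to discover */

	if (d->state != MODEL_DECODER_STATE_AUTO ||
	    d->type == MODEL_DEVTYPE_DEVMEM)
		return false;	/* manual decoder or type 2 device: skip */

	return true;
}
```

With this check an enabled, auto-state decoder on a type 3 device is still picked up, while the same decoder on a type 2 device is left for the accelerator driver to manage.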
                                                                    
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/region.c | 147 ++++++++++++++++++++++++++++++++++----
>  drivers/cxl/cxlmem.h      |   2 +
>  include/linux/cxl/cxl.h   |   4 ++
>  3 files changed, 138 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index d08a2a848ac9..04c270a29e96 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2253,6 +2253,18 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>  	return rc;
>  }
>  
> +int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled)
> +{
> +	int rc;
> +
> +	down_write(&cxl_region_rwsem);
> +	cxled->mode = CXL_DECODER_DEAD;
> +	rc = cxl_region_detach(cxled);
> +	up_write(&cxl_region_rwsem);
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_region_detach, CXL);
> +
>  void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
>  {
>  	down_write(&cxl_region_rwsem);
> @@ -2781,6 +2793,14 @@ cxl_find_region_by_name(struct cxl_root_decoder *cxlrd, const char *name)
>  	return to_cxl_region(region_dev);
>  }
>  
> +static void drop_region(struct cxl_region *cxlr)
> +{
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> +	struct cxl_port *port = cxlrd_to_port(cxlrd);
> +
> +	devm_release_action(port->uport_dev, unregister_region, cxlr);
> +}
> +
>  static ssize_t delete_region_store(struct device *dev,
>  				   struct device_attribute *attr,
>  				   const char *buf, size_t len)
> @@ -3386,17 +3406,18 @@ static int match_region_by_range(struct device *dev, void *data)
>  	return rc;
>  }
>  
> -/* Establish an empty region covering the given HPA range */
> -static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> -					   struct cxl_endpoint_decoder *cxled)
> +static void construct_region_end(void)
> +{
> +	up_write(&cxl_region_rwsem);
> +}
> +
> +static struct cxl_region *construct_region_begin(struct cxl_root_decoder *cxlrd,
> +						 struct cxl_endpoint_decoder *cxled)
>  {
>  	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> -	struct cxl_port *port = cxlrd_to_port(cxlrd);
> -	struct range *hpa = &cxled->cxld.hpa_range;
>  	struct cxl_region_params *p;
>  	struct cxl_region *cxlr;
> -	struct resource *res;
> -	int rc;
> +	int err;
>  
>  	do {
>  		cxlr = __create_region(cxlrd, cxled->mode,
> @@ -3405,8 +3426,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  	} while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
>  
>  	if (IS_ERR(cxlr)) {
> -		dev_err(cxlmd->dev.parent,
> -			"%s:%s: %s failed assign region: %ld\n",
> +		dev_err(cxlmd->dev.parent, "%s:%s: %s failed assign region: %ld\n",
>  			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>  			__func__, PTR_ERR(cxlr));
>  		return cxlr;
> @@ -3416,13 +3436,33 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  	p = &cxlr->params;
>  	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>  		dev_err(cxlmd->dev.parent,
> -			"%s:%s: %s autodiscovery interrupted\n",
> +			"%s:%s: %s region setup interrupted\n",
>  			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>  			__func__);
> -		rc = -EBUSY;
> -		goto err;
> +		err = -EBUSY;
> +		construct_region_end();
> +		drop_region(cxlr);
> +		return ERR_PTR(err);
>  	}
>  
> +	return cxlr;
> +}
> +
> +/* Establish an empty region covering the given HPA range */
> +static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> +					   struct cxl_endpoint_decoder *cxled)
> +{
> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +	struct range *hpa = &cxled->cxld.hpa_range;
> +	struct cxl_region_params *p;
> +	struct cxl_region *cxlr;
> +	struct resource *res;
> +	int rc;
> +
> +	cxlr = construct_region_begin(cxlrd, cxled);
> +	if (IS_ERR(cxlr))
> +		return cxlr;
> +
>  	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>  
>  	res = kmalloc(sizeof(*res), GFP_KERNEL);
> @@ -3445,6 +3485,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  			 __func__, dev_name(&cxlr->dev));
>  	}
>  
> +	p = &cxlr->params;
>  	p->res = res;
>  	p->interleave_ways = cxled->cxld.interleave_ways;
>  	p->interleave_granularity = cxled->cxld.interleave_granularity;
> @@ -3462,15 +3503,91 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  	/* ...to match put_device() in cxl_add_to_region() */
>  	get_device(&cxlr->dev);
>  	up_write(&cxl_region_rwsem);
> -
> +	construct_region_end();
>  	return cxlr;
>  
>  err:
> -	up_write(&cxl_region_rwsem);
> -	devm_release_action(port->uport_dev, unregister_region, cxlr);
> +	construct_region_end();
> +	drop_region(cxlr);
> +	return ERR_PTR(rc);
> +}
> +
> +static struct cxl_region *
> +__construct_new_region(struct cxl_root_decoder *cxlrd,
> +		       struct cxl_endpoint_decoder *cxled)
> +{
> +	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> +	struct cxl_region_params *p;
> +	struct cxl_region *cxlr;
> +	int rc;
> +
> +	cxlr = construct_region_begin(cxlrd, cxled);
> +	if (IS_ERR(cxlr))
> +		return cxlr;
> +
> +	rc = set_interleave_ways(cxlr, 1);
> +	if (rc)
> +		goto err;
> +
> +	rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
> +	if (rc)
> +		goto err;
> +
> +	rc = alloc_hpa(cxlr, resource_size(cxled->dpa_res));
> +	if (rc)
> +		goto err;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	rc = cxl_region_attach(cxlr, cxled, 0);
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (rc)
> +		goto err;
> +
> +	rc = cxl_region_decode_commit(cxlr);
> +	if (rc)
> +		goto err;
> +
> +	p = &cxlr->params;
> +	p->state = CXL_CONFIG_COMMIT;
> +
> +	construct_region_end();
> +	return cxlr;
> +err:
> +	construct_region_end();
> +	drop_region(cxlr);
>  	return ERR_PTR(rc);
>  }
>  
> +/**
> + * cxl_create_region - Establish a region given an endpoint decoder
> + * @cxlrd: root decoder to allocate HPA
> + * @cxled: endpoint decoder with reserved DPA capacity
> + *
> + * Returns a fully formed region in the commit state and attached to the
> + * cxl_region driver.
> + */
> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
> +				     struct cxl_endpoint_decoder *cxled)
> +{
> +	struct cxl_region *cxlr;
> +
> +	mutex_lock(&cxlrd->range_lock);
> +	cxlr = __construct_new_region(cxlrd, cxled);
> +	mutex_unlock(&cxlrd->range_lock);
> +
> +	if (IS_ERR(cxlr))
> +		return cxlr;
> +
> +	if (device_attach(&cxlr->dev) <= 0) {
> +		dev_err(&cxlr->dev, "failed to create region\n");
> +		drop_region(cxlr);
> +		return ERR_PTR(-ENODEV);
> +	}
> +	return cxlr;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
> +
>  int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>  {
>  	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 68d28eab3696..0f5c71909fd1 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -875,4 +875,6 @@ struct cxl_hdm {
>  struct seq_file;
>  struct dentry *cxl_debugfs_create_dir(const char *dir);
>  void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
> +				     struct cxl_endpoint_decoder *cxled);
>  #endif /* __CXL_MEM_H__ */
> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
> index 45b6badb8048..c544339c2baf 100644
> --- a/include/linux/cxl/cxl.h
> +++ b/include/linux/cxl/cxl.h
> @@ -72,4 +72,8 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_memdev *cxlmd,
>  					     resource_size_t min,
>  					     resource_size_t max);
>  int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
> +				     struct cxl_endpoint_decoder *cxled);
> +
> +int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled);
>  #endif



* Re: [PATCH v4 22/26] cxl: allow region creation by type2 drivers
  2024-10-17 21:49   ` Ben Cheatham
@ 2024-10-18  8:51     ` Alejandro Lucero Palau
  2024-10-18 16:40       ` Ben Cheatham
  0 siblings, 1 reply; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-18  8:51 UTC (permalink / raw)
  To: Ben Cheatham, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/17/24 22:49, Ben Cheatham wrote:
> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Creating a CXL region requires userspace intervention through the cxl
>> sysfs files. Type2 support should allow accelerator drivers to create
>> such cxl region from kernel code.
>>
>> Adding that functionality and integrating it with current support for
>> memory expanders.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592159835.1948938.1647215579839222774.stgit@dwillia2-xfh.jf.intel.com/
>>
> So I ran into an issue at this point when using v3 as a base for my own testing. The problem is that
> you are doing manual region management while not explicitly preventing auto region discovery when
> devm_cxl_add_memdev() is called (patch 14/26 in this series). This caused some resource allocation
> conflicts which then caused both the auto region and the manual region set up to fail. To make it more
> concrete, here's the flow I encountered (I tried something new here, let me know if the ascii
> is all mangled):
>
> devm_cxl_add_memdev() is called
> │
> ├───► cxl_mem probes new memdev
> │     │
> │     ├─► cxl_mem probe adds new endpoint port
> │     │
> │     └─► cxl_mem probe finishes
> ├───────────────────────────────────────────────► Manual region set up starts (finding free space, etc.)
> ├───► cxl_port probes the new endpoint port            │
> │     │                                                │
> │     ├─► cxl_port probe sets up new endpoint          ├─► create_new_region() is called
> │     │                                                │
> │     ├─► cxl_port calls discover_region()             │
> │     │                                                │
> │     ├─► discover_region() creates new auto           ├─► create_new_region() creates
> │     │   discovered region                            │   new manual region
> │◄────◄────────────────────────────────────────────────┘
> │
> └─► Region creation fails due to resource contention/race (DPA resource, RAM resource, etc.)
>
> The timeline is a little off here I think, but it should be close enough to illustrate the point.


Interesting.


I'm aware of that code path when the endpoint port is probed, but it is
not a problem in my testing because the decoder is not enabled at the
time discover_region() runs.


I've tested this with two different emulated devices: a dumb qemu
type2 device with a driver doing nothing but cxl initialization, and
our network device with CXL support using RTL emulation. In both cases
the decoder is not enabled at that point, which makes sense since,
AFAIK, the decoder is committed/enabled at region creation/attachment.
So my obvious question is: how are you testing this functionality? It
seems as if you could have been creating more than one region somehow,
or maybe there is something I'm just missing here.


> The easy solution here is to not allow auto region discovery for CXL type 2 devices, like so:
>
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index 22a9ba89cf5a..07b991e2c05b 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -34,6 +34,7 @@ static void schedule_detach(void *cxlmd)
>   static int discover_region(struct device *dev, void *root)
>   {
>          struct cxl_endpoint_decoder *cxled;
> +       struct cxl_memdev *cxlmd;
>          int rc;
>
>          dev_err(dev, "%s:%d: Enter\n", __func__, __LINE__);
> @@ -45,7 +46,9 @@ static int discover_region(struct device *dev, void *root)
>          if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
>                  return 0;
>
> -       if (cxled->state != CXL_DECODER_STATE_AUTO)
> +       cxlmd = cxled_to_memdev(cxled);
> +       if (cxled->state != CXL_DECODER_STATE_AUTO ||
> +           cxlmd->cxlds->type == CXL_DEVTYPE_DEVMEM)
>                  return 0;
>
> I think there's a better way to go about this, more to say about it in patch 24/26. I've
> dropped this here just in case you don't like my ideas there ;).
>                                                                      
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>   drivers/cxl/core/region.c | 147 ++++++++++++++++++++++++++++++++++----
>>   drivers/cxl/cxlmem.h      |   2 +
>>   include/linux/cxl/cxl.h   |   4 ++
>>   3 files changed, 138 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index d08a2a848ac9..04c270a29e96 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -2253,6 +2253,18 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>>   	return rc;
>>   }
>>   
>> +int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled)
>> +{
>> +	int rc;
>> +
>> +	down_write(&cxl_region_rwsem);
>> +	cxled->mode = CXL_DECODER_DEAD;
>> +	rc = cxl_region_detach(cxled);
>> +	up_write(&cxl_region_rwsem);
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_region_detach, CXL);
>> +
>>   void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
>>   {
>>   	down_write(&cxl_region_rwsem);
>> @@ -2781,6 +2793,14 @@ cxl_find_region_by_name(struct cxl_root_decoder *cxlrd, const char *name)
>>   	return to_cxl_region(region_dev);
>>   }
>>   
>> +static void drop_region(struct cxl_region *cxlr)
>> +{
>> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
>> +	struct cxl_port *port = cxlrd_to_port(cxlrd);
>> +
>> +	devm_release_action(port->uport_dev, unregister_region, cxlr);
>> +}
>> +
>>   static ssize_t delete_region_store(struct device *dev,
>>   				   struct device_attribute *attr,
>>   				   const char *buf, size_t len)
>> @@ -3386,17 +3406,18 @@ static int match_region_by_range(struct device *dev, void *data)
>>   	return rc;
>>   }
>>   
>> -/* Establish an empty region covering the given HPA range */
>> -static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>> -					   struct cxl_endpoint_decoder *cxled)
>> +static void construct_region_end(void)
>> +{
>> +	up_write(&cxl_region_rwsem);
>> +}
>> +
>> +static struct cxl_region *construct_region_begin(struct cxl_root_decoder *cxlrd,
>> +						 struct cxl_endpoint_decoder *cxled)
>>   {
>>   	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>> -	struct cxl_port *port = cxlrd_to_port(cxlrd);
>> -	struct range *hpa = &cxled->cxld.hpa_range;
>>   	struct cxl_region_params *p;
>>   	struct cxl_region *cxlr;
>> -	struct resource *res;
>> -	int rc;
>> +	int err;
>>   
>>   	do {
>>   		cxlr = __create_region(cxlrd, cxled->mode,
>> @@ -3405,8 +3426,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>   	} while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
>>   
>>   	if (IS_ERR(cxlr)) {
>> -		dev_err(cxlmd->dev.parent,
>> -			"%s:%s: %s failed assign region: %ld\n",
>> +		dev_err(cxlmd->dev.parent, "%s:%s: %s failed assign region: %ld\n",
>>   			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>>   			__func__, PTR_ERR(cxlr));
>>   		return cxlr;
>> @@ -3416,13 +3436,33 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>   	p = &cxlr->params;
>>   	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>>   		dev_err(cxlmd->dev.parent,
>> -			"%s:%s: %s autodiscovery interrupted\n",
>> +			"%s:%s: %s region setup interrupted\n",
>>   			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>>   			__func__);
>> -		rc = -EBUSY;
>> -		goto err;
>> +		err = -EBUSY;
>> +		construct_region_end();
>> +		drop_region(cxlr);
>> +		return ERR_PTR(err);
>>   	}
>>   
>> +	return cxlr;
>> +}
>> +
>> +/* Establish an empty region covering the given HPA range */
>> +static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>> +					   struct cxl_endpoint_decoder *cxled)
>> +{
>> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>> +	struct range *hpa = &cxled->cxld.hpa_range;
>> +	struct cxl_region_params *p;
>> +	struct cxl_region *cxlr;
>> +	struct resource *res;
>> +	int rc;
>> +
>> +	cxlr = construct_region_begin(cxlrd, cxled);
>> +	if (IS_ERR(cxlr))
>> +		return cxlr;
>> +
>>   	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>>   
>>   	res = kmalloc(sizeof(*res), GFP_KERNEL);
>> @@ -3445,6 +3485,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>   			 __func__, dev_name(&cxlr->dev));
>>   	}
>>   
>> +	p = &cxlr->params;
>>   	p->res = res;
>>   	p->interleave_ways = cxled->cxld.interleave_ways;
>>   	p->interleave_granularity = cxled->cxld.interleave_granularity;
>> @@ -3462,15 +3503,91 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>   	/* ...to match put_device() in cxl_add_to_region() */
>>   	get_device(&cxlr->dev);
>>   	up_write(&cxl_region_rwsem);
>> -
>> +	construct_region_end();
>>   	return cxlr;
>>   
>>   err:
>> -	up_write(&cxl_region_rwsem);
>> -	devm_release_action(port->uport_dev, unregister_region, cxlr);
>> +	construct_region_end();
>> +	drop_region(cxlr);
>> +	return ERR_PTR(rc);
>> +}
>> +
>> +static struct cxl_region *
>> +__construct_new_region(struct cxl_root_decoder *cxlrd,
>> +		       struct cxl_endpoint_decoder *cxled)
>> +{
>> +	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
>> +	struct cxl_region_params *p;
>> +	struct cxl_region *cxlr;
>> +	int rc;
>> +
>> +	cxlr = construct_region_begin(cxlrd, cxled);
>> +	if (IS_ERR(cxlr))
>> +		return cxlr;
>> +
>> +	rc = set_interleave_ways(cxlr, 1);
>> +	if (rc)
>> +		goto err;
>> +
>> +	rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
>> +	if (rc)
>> +		goto err;
>> +
>> +	rc = alloc_hpa(cxlr, resource_size(cxled->dpa_res));
>> +	if (rc)
>> +		goto err;
>> +
>> +	down_read(&cxl_dpa_rwsem);
>> +	rc = cxl_region_attach(cxlr, cxled, 0);
>> +	up_read(&cxl_dpa_rwsem);
>> +
>> +	if (rc)
>> +		goto err;
>> +
>> +	rc = cxl_region_decode_commit(cxlr);
>> +	if (rc)
>> +		goto err;
>> +
>> +	p = &cxlr->params;
>> +	p->state = CXL_CONFIG_COMMIT;
>> +
>> +	construct_region_end();
>> +	return cxlr;
>> +err:
>> +	construct_region_end();
>> +	drop_region(cxlr);
>>   	return ERR_PTR(rc);
>>   }
>>   
>> +/**
>> + * cxl_create_region - Establish a region given an endpoint decoder
>> + * @cxlrd: root decoder to allocate HPA
>> + * @cxled: endpoint decoder with reserved DPA capacity
>> + *
>> + * Returns a fully formed region in the commit state and attached to the
>> + * cxl_region driver.
>> + */
>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>> +				     struct cxl_endpoint_decoder *cxled)
>> +{
>> +	struct cxl_region *cxlr;
>> +
>> +	mutex_lock(&cxlrd->range_lock);
>> +	cxlr = __construct_new_region(cxlrd, cxled);
>> +	mutex_unlock(&cxlrd->range_lock);
>> +
>> +	if (IS_ERR(cxlr))
>> +		return cxlr;
>> +
>> +	if (device_attach(&cxlr->dev) <= 0) {
>> +		dev_err(&cxlr->dev, "failed to create region\n");
>> +		drop_region(cxlr);
>> +		return ERR_PTR(-ENODEV);
>> +	}
>> +	return cxlr;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
>> +
>>   int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>>   {
>>   	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index 68d28eab3696..0f5c71909fd1 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -875,4 +875,6 @@ struct cxl_hdm {
>>   struct seq_file;
>>   struct dentry *cxl_debugfs_create_dir(const char *dir);
>>   void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>> +				     struct cxl_endpoint_decoder *cxled);
>>   #endif /* __CXL_MEM_H__ */
>> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
>> index 45b6badb8048..c544339c2baf 100644
>> --- a/include/linux/cxl/cxl.h
>> +++ b/include/linux/cxl/cxl.h
>> @@ -72,4 +72,8 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_memdev *cxlmd,
>>   					     resource_size_t min,
>>   					     resource_size_t max);
>>   int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>> +				     struct cxl_endpoint_decoder *cxled);
>> +
>> +int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled);
>>   #endif


* Re: [PATCH v4 22/26] cxl: allow region creation by type2 drivers
  2024-10-18  8:51     ` Alejandro Lucero Palau
@ 2024-10-18 16:40       ` Ben Cheatham
  2024-10-21  9:54         ` Alejandro Lucero Palau
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Cheatham @ 2024-10-18 16:40 UTC (permalink / raw)
  To: Alejandro Lucero Palau, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet



On 10/18/24 3:51 AM, Alejandro Lucero Palau wrote:
> 
> On 10/17/24 22:49, Ben Cheatham wrote:
>> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>>> From: Alejandro Lucero <alucerop@amd.com>
>>>
>>> Creating a CXL region requires userspace intervention through the cxl
>>> sysfs files. Type2 support should allow accelerator drivers to create
>>> such cxl region from kernel code.
>>>
>>> Adding that functionality and integrating it with current support for
>>> memory expanders.
>>>
>>> Based on https://lore.kernel.org/linux-cxl/168592159835.1948938.1647215579839222774.stgit@dwillia2-xfh.jf.intel.com/
>>>
>> So I ran into an issue at this point when using v3 as a base for my own testing. The problem is that
>> you are doing manual region management while not explicitly preventing auto region discovery when
>> devm_cxl_add_memdev() is called (patch 14/26 in this series). This caused some resource allocation
>> conflicts which then caused both the auto region and the manual region set up to fail. To make it more
>> concrete, here's the flow I encountered (I tried something new here, let me know if the ascii
>> is all mangled):
>>
>> devm_cxl_add_memdev() is called
>> │
>> ├───► cxl_mem probes new memdev
>> │     │
>> │     ├─► cxl_mem probe adds new endpoint port
>> │     │
>> │     └─► cxl_mem probe finishes
>> ├───────────────────────────────────────────────► Manual region set up starts (finding free space, etc.)
>> ├───► cxl_port probes the new endpoint port            │
>> │     │                                                │
>> │     ├─► cxl_port probe sets up new endpoint          ├─► create_new_region() is called
>> │     │                                                │
>> │     ├─► cxl_port calls discover_region()             │
>> │     │                                                │
>> │     ├─► discover_region() creates new auto           ├─► create_new_region() creates
>> │     │   discovered region                            │   new manual region
>> │◄────◄────────────────────────────────────────────────┘
>> │
>> └─► Region creation fails due to resource contention/race (DPA resource, RAM resource, etc.)
>>
>> The timeline is a little off here I think, but it should be close enough to illustrate the point.
> 
> 
> Interesting.
> 
> 
> I'm aware of that code path when endpoint port is probed, but it is not a problem with my testing because the decoder is not enabled at the time of discover_region.
> 
> 
> I've tested this with two different emulated devices, one a dumb qemu type2 device with a driver doing nothing but cxl initialization, and another being our network device with CXL support and using RTL emulation, and in both cases the decoder is not enabled at that point, which makes sense since, AFAIK, it is at region creation/attachment when the decoder is committed/enabled. So my obvious question is how are you testing this functionality? It seems as if you could have been creating more than one region somehow, or maybe something I'm just missing about this.
> 

I think the reason you aren't seeing this is that QEMU doesn't have regions programmed by firmware. In my setup
the decoders are coming up pre-programmed and enabled by firmware, so it is hitting the path during endpoint probe.
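
The difference between the two setups can be sketched with a small user-space model. This is not kernel code: the model_* names and the single "claimed" flag are hypothetical simplifications of the DPA/HPA resource tracking, and the two paths are serialized here, so only the later claimant fails (in the failure reported above both paths ended up failing).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one endpoint decoder and its DPA range. */
struct model_endpoint {
	bool fw_committed;	/* decoder enabled by firmware before probe */
	bool dpa_claimed;	/* DPA range already owned by a region */
};

/* Claim the DPA range for a region; only one owner is allowed. */
static int model_claim_dpa(struct model_endpoint *ep)
{
	if (ep->dpa_claimed)
		return -EBUSY;
	ep->dpa_claimed = true;
	return 0;
}

/* Endpoint port probe: auto discovery only acts on enabled decoders. */
static int model_port_probe(struct model_endpoint *ep)
{
	if (!ep->fw_committed)
		return 0;	/* emulated case: nothing to discover */
	return model_claim_dpa(ep);
}

/* Accelerator driver path: always tries to build its own region. */
static int model_manual_create(struct model_endpoint *ep)
{
	return model_claim_dpa(ep);
}
```

With fw_committed set to false the manual path is the only claimant and succeeds; with it set to true the probe path claims the range first and the manual path gets -EBUSY.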

Thanks,
Ben

> 
>> The easy solution here is to not allow auto region discovery for CXL type 2 devices, like so:
>>
>> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
>> index 22a9ba89cf5a..07b991e2c05b 100644
>> --- a/drivers/cxl/port.c
>> +++ b/drivers/cxl/port.c
>> @@ -34,6 +34,7 @@ static void schedule_detach(void *cxlmd)
>>   static int discover_region(struct device *dev, void *root)
>>   {
>>          struct cxl_endpoint_decoder *cxled;
>> +       struct cxl_memdev *cxlmd;
>>          int rc;
>>
>>          dev_err(dev, "%s:%d: Enter\n", __func__, __LINE__);
>> @@ -45,7 +46,9 @@ static int discover_region(struct device *dev, void *root)
>>          if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
>>                  return 0;
>>
>> -       if (cxled->state != CXL_DECODER_STATE_AUTO)
>> +       cxlmd = cxled_to_memdev(cxled);
>> +       if (cxled->state != CXL_DECODER_STATE_AUTO ||
>> +           cxlmd->cxlds->type == CXL_DEVTYPE_DEVMEM)
>>                  return 0;
>>
>> I think there's a better way to go about this, more to say about it in patch 24/26. I've
>> dropped this here just in case you don't like my ideas there ;).
>>                                                                     
>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>> ---
>>>   drivers/cxl/core/region.c | 147 ++++++++++++++++++++++++++++++++++----
>>>   drivers/cxl/cxlmem.h      |   2 +
>>>   include/linux/cxl/cxl.h   |   4 ++
>>>   3 files changed, 138 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>> index d08a2a848ac9..04c270a29e96 100644
>>> --- a/drivers/cxl/core/region.c
>>> +++ b/drivers/cxl/core/region.c
>>> @@ -2253,6 +2253,18 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>>>       return rc;
>>>   }
>>>   
>>> +int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled)
>>> +{
>>> +    int rc;
>>> +
>>> +    down_write(&cxl_region_rwsem);
>>> +    cxled->mode = CXL_DECODER_DEAD;
>>> +    rc = cxl_region_detach(cxled);
>>> +    up_write(&cxl_region_rwsem);
>>> +    return rc;
>>> +}
>>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_region_detach, CXL);
>>> +
>>>   void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
>>>   {
>>>       down_write(&cxl_region_rwsem);
>>> @@ -2781,6 +2793,14 @@ cxl_find_region_by_name(struct cxl_root_decoder *cxlrd, const char *name)
>>>       return to_cxl_region(region_dev);
>>>   }
>>>   
>>> +static void drop_region(struct cxl_region *cxlr)
>>> +{
>>> +    struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
>>> +    struct cxl_port *port = cxlrd_to_port(cxlrd);
>>> +
>>> +    devm_release_action(port->uport_dev, unregister_region, cxlr);
>>> +}
>>> +
>>>   static ssize_t delete_region_store(struct device *dev,
>>>                      struct device_attribute *attr,
>>>                      const char *buf, size_t len)
>>> @@ -3386,17 +3406,18 @@ static int match_region_by_range(struct device *dev, void *data)
>>>       return rc;
>>>   }
>>>   -/* Establish an empty region covering the given HPA range */
>>> -static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>> -                       struct cxl_endpoint_decoder *cxled)
>>> +static void construct_region_end(void)
>>> +{
>>> +    up_write(&cxl_region_rwsem);
>>> +}
>>> +
>>> +static struct cxl_region *construct_region_begin(struct cxl_root_decoder *cxlrd,
>>> +                         struct cxl_endpoint_decoder *cxled)
>>>   {
>>>       struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>>> -    struct cxl_port *port = cxlrd_to_port(cxlrd);
>>> -    struct range *hpa = &cxled->cxld.hpa_range;
>>>       struct cxl_region_params *p;
>>>       struct cxl_region *cxlr;
>>> -    struct resource *res;
>>> -    int rc;
>>> +    int err;
>>>         do {
>>>           cxlr = __create_region(cxlrd, cxled->mode,
>>> @@ -3405,8 +3426,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>       } while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
>>>         if (IS_ERR(cxlr)) {
>>> -        dev_err(cxlmd->dev.parent,
>>> -            "%s:%s: %s failed assign region: %ld\n",
>>> +        dev_err(cxlmd->dev.parent, "%s:%s: %s failed assign region: %ld\n",
>>>               dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>>>               __func__, PTR_ERR(cxlr));
>>>           return cxlr;
>>> @@ -3416,13 +3436,33 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>       p = &cxlr->params;
>>>       if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>>>           dev_err(cxlmd->dev.parent,
>>> -            "%s:%s: %s autodiscovery interrupted\n",
>>> +            "%s:%s: %s region setup interrupted\n",
>>>               dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>>>               __func__);
>>> -        rc = -EBUSY;
>>> -        goto err;
>>> +        err = -EBUSY;
>>> +        construct_region_end();
>>> +        drop_region(cxlr);
>>> +        return ERR_PTR(err);
>>>       }
>>>   +    return cxlr;
>>> +}
>>> +
>>> +/* Establish an empty region covering the given HPA range */
>>> +static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>> +                       struct cxl_endpoint_decoder *cxled)
>>> +{
>>> +    struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>>> +    struct range *hpa = &cxled->cxld.hpa_range;
>>> +    struct cxl_region_params *p;
>>> +    struct cxl_region *cxlr;
>>> +    struct resource *res;
>>> +    int rc;
>>> +
>>> +    cxlr = construct_region_begin(cxlrd, cxled);
>>> +    if (IS_ERR(cxlr))
>>> +        return cxlr;
>>> +
>>>       set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>>>         res = kmalloc(sizeof(*res), GFP_KERNEL);
>>> @@ -3445,6 +3485,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>                __func__, dev_name(&cxlr->dev));
>>>       }
>>>   +    p = &cxlr->params;
>>>       p->res = res;
>>>       p->interleave_ways = cxled->cxld.interleave_ways;
>>>       p->interleave_granularity = cxled->cxld.interleave_granularity;
>>> @@ -3462,15 +3503,91 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>       /* ...to match put_device() in cxl_add_to_region() */
>>>       get_device(&cxlr->dev);
>>>       up_write(&cxl_region_rwsem);
>>> -
>>> +    construct_region_end();
>>>       return cxlr;
>>>     err:
>>> -    up_write(&cxl_region_rwsem);
>>> -    devm_release_action(port->uport_dev, unregister_region, cxlr);
>>> +    construct_region_end();
>>> +    drop_region(cxlr);
>>> +    return ERR_PTR(rc);
>>> +}
>>> +
>>> +static struct cxl_region *
>>> +__construct_new_region(struct cxl_root_decoder *cxlrd,
>>> +               struct cxl_endpoint_decoder *cxled)
>>> +{
>>> +    struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
>>> +    struct cxl_region_params *p;
>>> +    struct cxl_region *cxlr;
>>> +    int rc;
>>> +
>>> +    cxlr = construct_region_begin(cxlrd, cxled);
>>> +    if (IS_ERR(cxlr))
>>> +        return cxlr;
>>> +
>>> +    rc = set_interleave_ways(cxlr, 1);
>>> +    if (rc)
>>> +        goto err;
>>> +
>>> +    rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
>>> +    if (rc)
>>> +        goto err;
>>> +
>>> +    rc = alloc_hpa(cxlr, resource_size(cxled->dpa_res));
>>> +    if (rc)
>>> +        goto err;
>>> +
>>> +    down_read(&cxl_dpa_rwsem);
>>> +    rc = cxl_region_attach(cxlr, cxled, 0);
>>> +    up_read(&cxl_dpa_rwsem);
>>> +
>>> +    if (rc)
>>> +        goto err;
>>> +
>>> +    rc = cxl_region_decode_commit(cxlr);
>>> +    if (rc)
>>> +        goto err;
>>> +
>>> +    p = &cxlr->params;
>>> +    p->state = CXL_CONFIG_COMMIT;
>>> +
>>> +    construct_region_end();
>>> +    return cxlr;
>>> +err:
>>> +    construct_region_end();
>>> +    drop_region(cxlr);
>>>       return ERR_PTR(rc);
>>>   }
>>>   +/**
>>> + * cxl_create_region - Establish a region given an endpoint decoder
>>> + * @cxlrd: root decoder to allocate HPA
>>> + * @cxled: endpoint decoder with reserved DPA capacity
>>> + *
>>> + * Returns a fully formed region in the commit state and attached to the
>>> + * cxl_region driver.
>>> + */
>>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>>> +                     struct cxl_endpoint_decoder *cxled)
>>> +{
>>> +    struct cxl_region *cxlr;
>>> +
>>> +    mutex_lock(&cxlrd->range_lock);
>>> +    cxlr = __construct_new_region(cxlrd, cxled);
>>> +    mutex_unlock(&cxlrd->range_lock);
>>> +
>>> +    if (IS_ERR(cxlr))
>>> +        return cxlr;
>>> +
>>> +    if (device_attach(&cxlr->dev) <= 0) {
>>> +        dev_err(&cxlr->dev, "failed to create region\n");
>>> +        drop_region(cxlr);
>>> +        return ERR_PTR(-ENODEV);
>>> +    }
>>> +    return cxlr;
>>> +}
>>> +EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
>>> +
>>>   int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>>>   {
>>>       struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>>> index 68d28eab3696..0f5c71909fd1 100644
>>> --- a/drivers/cxl/cxlmem.h
>>> +++ b/drivers/cxl/cxlmem.h
>>> @@ -875,4 +875,6 @@ struct cxl_hdm {
>>>   struct seq_file;
>>>   struct dentry *cxl_debugfs_create_dir(const char *dir);
>>>   void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
>>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>>> +                     struct cxl_endpoint_decoder *cxled);
>>>   #endif /* __CXL_MEM_H__ */
>>> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
>>> index 45b6badb8048..c544339c2baf 100644
>>> --- a/include/linux/cxl/cxl.h
>>> +++ b/include/linux/cxl/cxl.h
>>> @@ -72,4 +72,8 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_memdev *cxlmd,
>>>                            resource_size_t min,
>>>                            resource_size_t max);
>>>   int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
>>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>>> +                     struct cxl_endpoint_decoder *cxled);
>>> +
>>> +int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled);
>>>   #endif

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 22/26] cxl: allow region creation by type2 drivers
  2024-10-18 16:40       ` Ben Cheatham
@ 2024-10-21  9:54         ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-21  9:54 UTC (permalink / raw)
  To: Ben Cheatham, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/18/24 17:40, Ben Cheatham wrote:
>
> On 10/18/24 3:51 AM, Alejandro Lucero Palau wrote:
>> On 10/17/24 22:49, Ben Cheatham wrote:
>>> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>>>> From: Alejandro Lucero <alucerop@amd.com>
>>>>
>>>> Creating a CXL region requires userspace intervention through the cxl
>>>> sysfs files. Type2 support should allow accelerator drivers to create
>>>> such cxl region from kernel code.
>>>>
>>>> Adding that functionality and integrating it with current support for
>>>> memory expanders.
>>>>
>>>> Based on https://lore.kernel.org/linux-cxl/168592159835.1948938.1647215579839222774.stgit@dwillia2-xfh.jf.intel.com/
>>>>
>>> So I ran into an issue at this point when using v3 as a base for my own testing. The problem is that
>>> you are doing manual region management while not explicitly preventing auto region discovery when
>>> devm_cxl_add_memdev() is called (patch 14/26 in this series). This caused some resource allocation
>>> conflicts which then caused both the auto region and the manual region set up to fail. To make it more
>>> concrete, here's the flow I encountered (I tried something new here, let me know if the ascii
>>> is all mangled):
>>>
>>> devm_cxl_add_memdev() is called
>>> │
>>> ├───► cxl_mem probes new memdev
>>> │     │
>>> │     ├─► cxl_mem probe adds new endpoint port
>>> │     │
>>> │     └─► cxl_mem probe finishes
>>> ├───────────────────────────────────────────────► Manual region set up starts (finding free space, etc.)
>>> ├───► cxl_port probes the new endpoint port            │
>>> │     │                                                │
>>> │     ├─► cxl_port probe sets up new endpoint          ├─► create_new_region() is called
>>> │     │                                                │
>>> │     ├─► cxl_port calls discover_region()             │
>>> │     │                                                │
>>> │     ├─► discover_region() creates new auto           ├─► create_new_region() creates
>>> │     │   discovered region                            │   new manual region
>>> │◄────◄────────────────────────────────────────────────┘
>>> │
>>> └─► Region creation fails due to resource contention/race (DPA resource, RAM resource, etc.)
>>>
>>> The timeline is a little off here I think, but it should be close enough to illustrate the point.
>>
>> Interesting.
>>
>>
>> I'm aware of that code path when endpoint port is probed, but it is not a problem with my testing because the decoder is not enabled at the time of discover_region.
>>
>>
>> I've tested this with two different emulated devices, one a dumb qemu type2 device with a driver doing nothing but cxl initialization, and another being our network device with CXL support and using RTL emulation, and in both cases the decoder is not enabled at that point, which makes sense since, AFAIK, it is at region creation/attachment when the decoder is committed/enabled. So my obvious question is how are you testing this functionality? It seems as if you could have been creating more than one region somehow, or maybe something I'm just missing about this.
>>
> I think the reason you aren't seeing this is that QEMU doesn't have regions programmed by firmware. In my setup
> the decoders are coming up pre-programmed and enabled by firmware, so it is hitting the path during endpoint probe.


That explains it, and it also means you do not have the EFI_RESERVED 
flag in use, which is what we expect for our device.

And I think the solution you give below should fix it. I'll add it to v5.
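
To make the difference between the two setups concrete, here is a minimal
userspace sketch of the check under discussion. It is only a model: the
model_* types and names are hypothetical stand-ins for CXL_DECODER_F_ENABLE,
cxled->state and cxlds->type, not the kernel definitions, and the real
discover_region() does more than this.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; names are illustrative only. */
enum model_devtype { MODEL_DEVTYPE_DEVMEM, MODEL_DEVTYPE_CLASSMEM };
enum model_decoder_state { MODEL_DECODER_STATE_MANUAL, MODEL_DECODER_STATE_AUTO };

struct model_decoder {
	bool enabled;                   /* models CXL_DECODER_F_ENABLE */
	enum model_decoder_state state; /* models cxled->state */
	enum model_devtype type;        /* models cxlmd->cxlds->type */
};

/* Returns true when auto discovery would go on to assemble a region. */
static bool model_wants_auto_region(const struct model_decoder *d)
{
	/* Decoder not committed by firmware: nothing to discover. */
	if (!d->enabled)
		return false;

	/* Only decoders in the auto state are candidates. */
	if (d->state != MODEL_DECODER_STATE_AUTO)
		return false;

	/* Proposed extra check: leave Type2 (device memory) to its driver. */
	if (d->type == MODEL_DEVTYPE_DEVMEM)
		return false;

	return true;
}
```

In this model my QEMU/RTL setup stops at the first check because the decoder
is not enabled, while a firmware-programmed Type2 decoder is only stopped by
the last one.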

Thanks!


> Thanks,
> Ben
>
>>> The easy solution here is to not allow auto region discovery for CXL type 2 devices, like so:
>>>
>>> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
>>> index 22a9ba89cf5a..07b991e2c05b 100644
>>> --- a/drivers/cxl/port.c
>>> +++ b/drivers/cxl/port.c
>>> @@ -34,6 +34,7 @@ static void schedule_detach(void *cxlmd)
>>>    static int discover_region(struct device *dev, void *root)
>>>    {
>>>           struct cxl_endpoint_decoder *cxled;
>>> +       struct cxl_memdev *cxlmd;
>>>           int rc;
>>>
>>>           dev_err(dev, "%s:%d: Enter\n", __func__, __LINE__);
>>> @@ -45,7 +46,9 @@ static int discover_region(struct device *dev, void *root)
>>>           if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
>>>                   return 0;
>>>
>>> -       if (cxled->state != CXL_DECODER_STATE_AUTO)
>>> +       cxlmd = cxled_to_memdev(cxled);
>>> +       if (cxled->state != CXL_DECODER_STATE_AUTO ||
>>> +           cxlmd->cxlds->type == CXL_DEVTYPE_DEVMEM)
>>>                   return 0;
>>>
>>> I think there's a better way to go about this, more to say about it in patch 24/26. I've
>>> dropped this here just in case you don't like my ideas there ;).
>>>                                                                      
>>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>>> ---
>>>>    drivers/cxl/core/region.c | 147 ++++++++++++++++++++++++++++++++++----
>>>>    drivers/cxl/cxlmem.h      |   2 +
>>>>    include/linux/cxl/cxl.h   |   4 ++
>>>>    3 files changed, 138 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>>> index d08a2a848ac9..04c270a29e96 100644
>>>> --- a/drivers/cxl/core/region.c
>>>> +++ b/drivers/cxl/core/region.c
>>>> @@ -2253,6 +2253,18 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>>>>        return rc;
>>>>    }
>>>>    +int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled)
>>>> +{
>>>> +    int rc;
>>>> +
>>>> +    down_write(&cxl_region_rwsem);
>>>> +    cxled->mode = CXL_DECODER_DEAD;
>>>> +    rc = cxl_region_detach(cxled);
>>>> +    up_write(&cxl_region_rwsem);
>>>> +    return rc;
>>>> +}
>>>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_region_detach, CXL);
>>>> +
>>>>    void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
>>>>    {
>>>>        down_write(&cxl_region_rwsem);
>>>> @@ -2781,6 +2793,14 @@ cxl_find_region_by_name(struct cxl_root_decoder *cxlrd, const char *name)
>>>>        return to_cxl_region(region_dev);
>>>>    }
>>>>    +static void drop_region(struct cxl_region *cxlr)
>>>> +{
>>>> +    struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
>>>> +    struct cxl_port *port = cxlrd_to_port(cxlrd);
>>>> +
>>>> +    devm_release_action(port->uport_dev, unregister_region, cxlr);
>>>> +}
>>>> +
>>>>    static ssize_t delete_region_store(struct device *dev,
>>>>                       struct device_attribute *attr,
>>>>                       const char *buf, size_t len)
>>>> @@ -3386,17 +3406,18 @@ static int match_region_by_range(struct device *dev, void *data)
>>>>        return rc;
>>>>    }
>>>>    -/* Establish an empty region covering the given HPA range */
>>>> -static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>> -                       struct cxl_endpoint_decoder *cxled)
>>>> +static void construct_region_end(void)
>>>> +{
>>>> +    up_write(&cxl_region_rwsem);
>>>> +}
>>>> +
>>>> +static struct cxl_region *construct_region_begin(struct cxl_root_decoder *cxlrd,
>>>> +                         struct cxl_endpoint_decoder *cxled)
>>>>    {
>>>>        struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>>>> -    struct cxl_port *port = cxlrd_to_port(cxlrd);
>>>> -    struct range *hpa = &cxled->cxld.hpa_range;
>>>>        struct cxl_region_params *p;
>>>>        struct cxl_region *cxlr;
>>>> -    struct resource *res;
>>>> -    int rc;
>>>> +    int err;
>>>>          do {
>>>>            cxlr = __create_region(cxlrd, cxled->mode,
>>>> @@ -3405,8 +3426,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>>        } while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
>>>>          if (IS_ERR(cxlr)) {
>>>> -        dev_err(cxlmd->dev.parent,
>>>> -            "%s:%s: %s failed assign region: %ld\n",
>>>> +        dev_err(cxlmd->dev.parent, "%s:%s: %s failed assign region: %ld\n",
>>>>                dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>>>>                __func__, PTR_ERR(cxlr));
>>>>            return cxlr;
>>>> @@ -3416,13 +3436,33 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>>        p = &cxlr->params;
>>>>        if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>>>>            dev_err(cxlmd->dev.parent,
>>>> -            "%s:%s: %s autodiscovery interrupted\n",
>>>> +            "%s:%s: %s region setup interrupted\n",
>>>>                dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>>>>                __func__);
>>>> -        rc = -EBUSY;
>>>> -        goto err;
>>>> +        err = -EBUSY;
>>>> +        construct_region_end();
>>>> +        drop_region(cxlr);
>>>> +        return ERR_PTR(err);
>>>>        }
>>>>    +    return cxlr;
>>>> +}
>>>> +
>>>> +/* Establish an empty region covering the given HPA range */
>>>> +static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>> +                       struct cxl_endpoint_decoder *cxled)
>>>> +{
>>>> +    struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>>>> +    struct range *hpa = &cxled->cxld.hpa_range;
>>>> +    struct cxl_region_params *p;
>>>> +    struct cxl_region *cxlr;
>>>> +    struct resource *res;
>>>> +    int rc;
>>>> +
>>>> +    cxlr = construct_region_begin(cxlrd, cxled);
>>>> +    if (IS_ERR(cxlr))
>>>> +        return cxlr;
>>>> +
>>>>        set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>>>>          res = kmalloc(sizeof(*res), GFP_KERNEL);
>>>> @@ -3445,6 +3485,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>>                 __func__, dev_name(&cxlr->dev));
>>>>        }
>>>>    +    p = &cxlr->params;
>>>>        p->res = res;
>>>>        p->interleave_ways = cxled->cxld.interleave_ways;
>>>>        p->interleave_granularity = cxled->cxld.interleave_granularity;
>>>> @@ -3462,15 +3503,91 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>>>>        /* ...to match put_device() in cxl_add_to_region() */
>>>>        get_device(&cxlr->dev);
>>>>        up_write(&cxl_region_rwsem);
>>>> -
>>>> +    construct_region_end();
>>>>        return cxlr;
>>>>      err:
>>>> -    up_write(&cxl_region_rwsem);
>>>> -    devm_release_action(port->uport_dev, unregister_region, cxlr);
>>>> +    construct_region_end();
>>>> +    drop_region(cxlr);
>>>> +    return ERR_PTR(rc);
>>>> +}
>>>> +
>>>> +static struct cxl_region *
>>>> +__construct_new_region(struct cxl_root_decoder *cxlrd,
>>>> +               struct cxl_endpoint_decoder *cxled)
>>>> +{
>>>> +    struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
>>>> +    struct cxl_region_params *p;
>>>> +    struct cxl_region *cxlr;
>>>> +    int rc;
>>>> +
>>>> +    cxlr = construct_region_begin(cxlrd, cxled);
>>>> +    if (IS_ERR(cxlr))
>>>> +        return cxlr;
>>>> +
>>>> +    rc = set_interleave_ways(cxlr, 1);
>>>> +    if (rc)
>>>> +        goto err;
>>>> +
>>>> +    rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
>>>> +    if (rc)
>>>> +        goto err;
>>>> +
>>>> +    rc = alloc_hpa(cxlr, resource_size(cxled->dpa_res));
>>>> +    if (rc)
>>>> +        goto err;
>>>> +
>>>> +    down_read(&cxl_dpa_rwsem);
>>>> +    rc = cxl_region_attach(cxlr, cxled, 0);
>>>> +    up_read(&cxl_dpa_rwsem);
>>>> +
>>>> +    if (rc)
>>>> +        goto err;
>>>> +
>>>> +    rc = cxl_region_decode_commit(cxlr);
>>>> +    if (rc)
>>>> +        goto err;
>>>> +
>>>> +    p = &cxlr->params;
>>>> +    p->state = CXL_CONFIG_COMMIT;
>>>> +
>>>> +    construct_region_end();
>>>> +    return cxlr;
>>>> +err:
>>>> +    construct_region_end();
>>>> +    drop_region(cxlr);
>>>>        return ERR_PTR(rc);
>>>>    }
>>>>    +/**
>>>> + * cxl_create_region - Establish a region given an endpoint decoder
>>>> + * @cxlrd: root decoder to allocate HPA
>>>> + * @cxled: endpoint decoder with reserved DPA capacity
>>>> + *
>>>> + * Returns a fully formed region in the commit state and attached to the
>>>> + * cxl_region driver.
>>>> + */
>>>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>>>> +                     struct cxl_endpoint_decoder *cxled)
>>>> +{
>>>> +    struct cxl_region *cxlr;
>>>> +
>>>> +    mutex_lock(&cxlrd->range_lock);
>>>> +    cxlr = __construct_new_region(cxlrd, cxled);
>>>> +    mutex_unlock(&cxlrd->range_lock);
>>>> +
>>>> +    if (IS_ERR(cxlr))
>>>> +        return cxlr;
>>>> +
>>>> +    if (device_attach(&cxlr->dev) <= 0) {
>>>> +        dev_err(&cxlr->dev, "failed to create region\n");
>>>> +        drop_region(cxlr);
>>>> +        return ERR_PTR(-ENODEV);
>>>> +    }
>>>> +    return cxlr;
>>>> +}
>>>> +EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
>>>> +
>>>>    int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>>>>    {
>>>>        struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>>>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>>>> index 68d28eab3696..0f5c71909fd1 100644
>>>> --- a/drivers/cxl/cxlmem.h
>>>> +++ b/drivers/cxl/cxlmem.h
>>>> @@ -875,4 +875,6 @@ struct cxl_hdm {
>>>>    struct seq_file;
>>>>    struct dentry *cxl_debugfs_create_dir(const char *dir);
>>>>    void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
>>>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>>>> +                     struct cxl_endpoint_decoder *cxled);
>>>>    #endif /* __CXL_MEM_H__ */
>>>> diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
>>>> index 45b6badb8048..c544339c2baf 100644
>>>> --- a/include/linux/cxl/cxl.h
>>>> +++ b/include/linux/cxl/cxl.h
>>>> @@ -72,4 +72,8 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_memdev *cxlmd,
>>>>                             resource_size_t min,
>>>>                             resource_size_t max);
>>>>    int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
>>>> +struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
>>>> +                     struct cxl_endpoint_decoder *cxled);
>>>> +
>>>> +int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled);
>>>>    #endif

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH v4 23/26] sfc: create cxl region
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (21 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 22/26] cxl: allow region creation by type2 drivers alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 24/26] cxl: preclude device memory to be used for dax alejandro.lucero-palau
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

Use the cxl api to create a region from the endpoint decoder associated
with a DPA range.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
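Note for reviewers: cxl_create_region() reports failure through ERR_PTR()
rather than NULL, so a caller needs IS_ERR()/PTR_ERR() to detect and decode
it. The following is a minimal userspace sketch of that convention, not
kernel code: the model_* helpers and model_create_region() are hypothetical
stand-ins that mimic ERR_PTR(), PTR_ERR() and IS_ERR().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error codes live in the top MODEL_MAX_ERRNO values of the address space. */
#define MODEL_MAX_ERRNO 4095

/* Hypothetical stand-ins for ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct model_efx_region {
	int id;
};

/* Hypothetical constructor: an encoded errno on failure, never NULL. */
static struct model_efx_region *model_create_region(bool fail)
{
	static struct model_efx_region region = { .id = 0 };

	if (fail)
		return model_err_ptr(-ENODEV);
	return &region;
}
```

An error pointer is not NULL, so a plain NULL test lets a failure through;
model_is_err() catches it and model_ptr_err() recovers the errno.
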
 drivers/net/ethernet/sfc/efx_cxl.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index c0da75b2d8e1..869129635a84 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -127,12 +127,21 @@ int efx_cxl_init(struct efx_nic *efx)
 		goto err3;
 	}
 
+	cxl->efx_region = cxl_create_region(cxl->cxlrd, cxl->cxled);
+	if (IS_ERR(cxl->efx_region)) {
+		pci_err(pci_dev, "CXL accel create region failed\n");
+		rc = PTR_ERR(cxl->efx_region);
+		goto err_region;
+	}
+
 	efx->cxl = cxl;
 #endif
 
 	return 0;
 
 #if IS_ENABLED(CONFIG_CXL_BUS)
+err_region:
+	cxl_dpa_free(cxl->cxled);
 err3:
 	cxl_release_resource(cxl->cxlds, CXL_RES_RAM);
 err2:
@@ -148,6 +157,7 @@ void efx_cxl_exit(struct efx_nic *efx)
 {
 #if IS_ENABLED(CONFIG_CXL_BUS)
 	if (efx->cxl) {
+		cxl_accel_region_detach(efx->cxl->cxled);
 		cxl_dpa_free(efx->cxl->cxled);
 		cxl_release_resource(efx->cxl->cxlds, CXL_RES_RAM);
 		kfree(efx->cxl->cxlds);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v4 24/26] cxl: preclude device memory to be used for dax
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (22 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 23/26] sfc: create cxl region alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 21:50   ` Ben Cheatham
  2024-10-17 16:52 ` [PATCH v4 25/26] cxl: add function for obtaining params from a region alejandro.lucero-palau
                   ` (2 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

By definition a type2 cxl device will use the host managed memory for
specific functionality, therefore it should not be available to other
uses.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/region.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 04c270a29e96..7c84d8f89af6 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3703,6 +3703,9 @@ static int cxl_region_probe(struct device *dev)
 	case CXL_DECODER_PMEM:
 		return devm_cxl_add_pmem_region(cxlr);
 	case CXL_DECODER_RAM:
+		if (cxlr->type != CXL_DECODER_HOSTONLYMEM)
+			return 0;
+
 		/*
 		 * The region can not be manged by CXL if any portion of
 		 * it is already online as 'System RAM'
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 24/26] cxl: preclude device memory to be used for dax
  2024-10-17 16:52 ` [PATCH v4 24/26] cxl: preclude device memory to be used for dax alejandro.lucero-palau
@ 2024-10-17 21:50   ` Ben Cheatham
  2024-10-18  8:10     ` Alejandro Lucero Palau
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Cheatham @ 2024-10-17 21:50 UTC (permalink / raw)
  To: alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet

On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> By definition a type2 cxl device will use the host managed memory for
> specific functionality, therefore it should not be available to other
> uses.
> 

I disagree that this is a valid assumption. I don't think that the device memory
should be added as system ram, but I do think there is value in having the
option to have the memory be available as a device-dax region. My reasoning here is:

1) I can think of a possible use case where the memory benefits from being user space
accessible (CXL memory GPU buffers).
2) I think it's really early to say this is the only way we expect these devices to
be used. The flip side of this is that it is early, so we can always change it later
when we start seeing real devices, but I would vote to keep a more flexible structure
early and if no one uses it oh well.

My idea here is that whoever writes the driver indicates whether they want to make
the device memory device-dax mappable, or do it all manually like you are now. I've
been working on a RFC based on v3 of this series that has this (as well as the
"better" solution mentioned in patch 22/26) that I was planning
on sending out in the next week or two, but if the consensus here is that this is
not the direction we want to go I'll probably drop that portion.

Thanks,
Ben

> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/region.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 04c270a29e96..7c84d8f89af6 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3703,6 +3703,9 @@ static int cxl_region_probe(struct device *dev)
>  	case CXL_DECODER_PMEM:
>  		return devm_cxl_add_pmem_region(cxlr);
>  	case CXL_DECODER_RAM:
> +		if (cxlr->type != CXL_DECODER_HOSTONLYMEM)
> +			return 0;
> +
>  		/*
>  		 * The region can not be manged by CXL if any portion of
>  		 * it is already online as 'System RAM'

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v4 24/26] cxl: preclude device memory to be used for dax
  2024-10-17 21:50   ` Ben Cheatham
@ 2024-10-18  8:10     ` Alejandro Lucero Palau
  0 siblings, 0 replies; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-18  8:10 UTC (permalink / raw)
  To: Ben Cheatham, alejandro.lucero-palau
  Cc: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet


On 10/17/24 22:50, Ben Cheatham wrote:
> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> By definition a type2 cxl device will use the host managed memory for
>> specific functionality, therefore it should not be available to other
>> uses.
>>
> I disagree that this is a valid assumption. I don't think that the device memory
> should be added as system ram, but I do think there is value in having the
> option to have the memory be available as a device-dax region. My reasoning here is:
>
> 1) I can think of a possible use case where the memory benefits from being user space
> accessible (CXL memory GPU buffers).
> 2) I think it's really early to say this is the only way we expect these devices to
> be used. The flip side of this is that it is early, so we can always change it later
> when we start seeing real devices, but I would vote to keep a more flexible structure
> early and if no one uses it oh well.
>
> My idea here is that whoever writes the driver indicates whether they want to make
> the device memory device-dax mappable, or do it all manually like you are now. I've
> been working on a RFC based on v3 of this series that has this (as well as the
> "better" solution mentioned in patch 22/26) that I was planning
> on sending out in the next week or two, but if the consensus here is that this is
> not the direction we want to go I'll probably drop that portion.


I understand your point and I agree dax creation could be required or 
interesting for some accelerators.


My experience when testing without this patch is that the system uses 
that DAX region even without any specific user space app, so the system 
was crashing because the CXL memory backend was not doing the expected 
thing. That is exactly the case for our device, where the memory should 
not be used except with the right format when writing. So the trivial 
patch was to preclude this dax creation for an accel/Type2.


I'm not against adding that flexibility now with a flag set by the 
driver at region creation time, so I'll add it for v5 if no one is against it.
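
As a rough idea of what such a flag could look like, here is a userspace
sketch of the probe decision. Everything in it is an assumption for
discussion: the model_* names and the want_dax field are hypothetical, and
the real cxl_region_probe() also checks for overlap with System RAM, which
is left out here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the region mode and the probe outcome. */
enum model_mode { MODEL_MODE_PMEM, MODEL_MODE_RAM };
enum model_action {
	MODEL_ACTION_NONE,        /* probe succeeds, nothing else is created */
	MODEL_ACTION_PMEM_REGION, /* models devm_cxl_add_pmem_region() */
	MODEL_ACTION_DAX_REGION,  /* models the dax region creation path */
};

struct model_region {
	enum model_mode mode;
	bool is_accel; /* region created by a Type2 driver */
	bool want_dax; /* hypothetical flag set at region creation time */
};

/* Decide what the region probe would set up for a given region. */
static enum model_action model_region_probe(const struct model_region *r)
{
	switch (r->mode) {
	case MODEL_MODE_PMEM:
		return MODEL_ACTION_PMEM_REGION;
	case MODEL_MODE_RAM:
		/* Accel regions get no dax device unless the driver asks. */
		if (r->is_accel && !r->want_dax)
			return MODEL_ACTION_NONE;
		return MODEL_ACTION_DAX_REGION;
	}
	return MODEL_ACTION_NONE;
}
```

With the flag defaulting to false, the behaviour of this patch is kept for
drivers like ours, and a driver wanting device-dax only has to set it.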


Thanks!


> Thanks,
> Ben
>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/region.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 04c270a29e96..7c84d8f89af6 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -3703,6 +3703,9 @@ static int cxl_region_probe(struct device *dev)
>>   	case CXL_DECODER_PMEM:
>>   		return devm_cxl_add_pmem_region(cxlr);
>>   	case CXL_DECODER_RAM:
>> +		if (cxlr->type != CXL_DECODER_HOSTONLYMEM)
>> +			return 0;
>> +
>>   		/*
>>   		 * The region can not be manged by CXL if any portion of
>>   		 * it is already online as 'System RAM'

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH v4 25/26] cxl: add function for obtaining params from a region
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (23 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 24/26] cxl: preclude device memory to be used for dax alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-17 16:52 ` [PATCH v4 26/26] sfc: support pio mapping based on cxl alejandro.lucero-palau
  2024-10-23  8:46 ` [PATCH v4 00/26] cxl: add Type2 device support Paolo Abeni
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

A CXL region struct contains the physical address to work with.

Add a function that, given an opaque cxl region struct, returns the params
to be used for mapping such a memory range.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/region.c | 16 ++++++++++++++++
 drivers/cxl/cxl.h         |  2 ++
 include/linux/cxl/cxl.h   |  2 ++
 3 files changed, 20 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 7c84d8f89af6..60c3aa6ee404 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2674,6 +2674,22 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
 	return ERR_PTR(rc);
 }
 
+int cxl_get_region_params(struct cxl_region *region, resource_size_t *start,
+			  resource_size_t *end)
+{
+	if (!region)
+		return -ENODEV;
+
+	if (!region->params.res)
+		return -ENOSPC;
+
+	*start = region->params.res->start;
+	*end = region->params.res->end;
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_get_region_params, CXL);
+
 static ssize_t __create_region_show(struct cxl_root_decoder *cxlrd, char *buf)
 {
 	return sysfs_emit(buf, "region%u\n", atomic_read(&cxlrd->region_id));
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 2ea180f05acd..79fc3f6f29b2 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -904,6 +904,8 @@ void cxl_coordinates_combine(struct access_coordinate *out,
 
 bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
 
+int cxl_get_region_params(struct cxl_region *region, resource_size_t *start,
+			  resource_size_t *end);
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.
diff --git a/include/linux/cxl/cxl.h b/include/linux/cxl/cxl.h
index c544339c2baf..d76f4ae60fbf 100644
--- a/include/linux/cxl/cxl.h
+++ b/include/linux/cxl/cxl.h
@@ -76,4 +76,6 @@ struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
 				     struct cxl_endpoint_decoder *cxled);
 
 int cxl_accel_region_detach(struct cxl_endpoint_decoder *cxled);
+int cxl_get_region_params(struct cxl_region *region, resource_size_t *start,
+			  resource_size_t *end);
 #endif
-- 
2.17.1
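
A caller of the new helper needs to remember that a kernel `struct resource` stores an inclusive `end`, so the length of the range is `end - start + 1` (this is what the kernel's `resource_size()` computes). The following is a minimal user-space sketch of that contract; `resource_model`, `cxl_region_model`, `get_region_params()` and `range_len()` are illustrative stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t resource_size_t;	/* stand-in for the kernel type */

struct resource_model {
	resource_size_t start;
	resource_size_t end;		/* inclusive, as in struct resource */
};

struct cxl_region_model {
	const struct resource_model *res;	/* NULL until HPA is assigned */
};

/* Mirrors the error handling of the helper added by this patch. */
static int get_region_params(const struct cxl_region_model *region,
			     resource_size_t *start, resource_size_t *end)
{
	if (!region)
		return -ENODEV;
	if (!region->res)
		return -ENOSPC;

	*start = region->res->start;
	*end = region->res->end;
	return 0;
}

/* Length of an inclusive [start, end] range. */
static resource_size_t range_len(resource_size_t start, resource_size_t end)
{
	assert(end >= start);
	return end - start + 1;
}
```

A mapping call would then take `start` and `range_len(start, end)` as its address and size arguments.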



* [PATCH v4 26/26] sfc: support pio mapping based on cxl
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (24 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 25/26] cxl: add function for obtaining params from a region alejandro.lucero-palau
@ 2024-10-17 16:52 ` alejandro.lucero-palau
  2024-10-23  8:46 ` [PATCH v4 00/26] cxl: add Type2 device support Paolo Abeni
  26 siblings, 0 replies; 64+ messages in thread
From: alejandro.lucero-palau @ 2024-10-17 16:52 UTC (permalink / raw)
  To: linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, pabeni, edumazet
  Cc: Alejandro Lucero

From: Alejandro Lucero <alucerop@amd.com>

With a device supporting CXL and successfully initialised, use the cxl
region to map the memory range and use this mapping for PIO buffers.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/net/ethernet/sfc/ef10.c       | 34 +++++++++++++++++++++------
 drivers/net/ethernet/sfc/efx_cxl.c    | 19 ++++++++++++++-
 drivers/net/ethernet/sfc/mcdi_pcol.h  | 12 ++++++++++
 drivers/net/ethernet/sfc/net_driver.h |  2 ++
 drivers/net/ethernet/sfc/nic.h        |  2 ++
 5 files changed, 61 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 7d69302ffa0a..794574151b2f 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -24,6 +24,7 @@
 #include <linux/wait.h>
 #include <linux/workqueue.h>
 #include <net/udp_tunnel.h>
+#include "efx_cxl.h"
 
 /* Hardware control for EF10 architecture including 'Huntington'. */
 
@@ -177,6 +178,12 @@ static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
 			  efx->num_mac_stats);
 	}
 
+	if (outlen < MC_CMD_GET_CAPABILITIES_V7_OUT_LEN)
+		nic_data->datapath_caps3 = 0;
+	else
+		nic_data->datapath_caps3 = MCDI_DWORD(outbuf,
+						      GET_CAPABILITIES_V7_OUT_FLAGS3);
+
 	return 0;
 }
 
@@ -949,7 +956,7 @@ static void efx_ef10_remove(struct efx_nic *efx)
 
 	efx_mcdi_rx_free_indir_table(efx);
 
-	if (nic_data->wc_membase)
+	if (nic_data->wc_membase && !efx->efx_cxl_pio_in_use)
 		iounmap(nic_data->wc_membase);
 
 	rc = efx_mcdi_free_vis(efx);
@@ -1263,8 +1270,21 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
 	iounmap(efx->membase);
 	efx->membase = membase;
 
-	/* Set up the WC mapping if needed */
-	if (wc_mem_map_size) {
+	if (!wc_mem_map_size)
+		return 0;
+
+	/* Set up the WC mapping */
+
+	if ((nic_data->datapath_caps3 &
+	    (1 << MC_CMD_GET_CAPABILITIES_V7_OUT_CXL_CONFIG_ENABLE_LBN)) &&
+	    efx->efx_cxl_pio_initialised) {
+		/* Using PIO through CXL mapping? */
+		nic_data->pio_write_base = efx->cxl->ctpio_cxl +
+					   (pio_write_vi_base * efx->vi_stride +
+					    ER_DZ_TX_PIOBUF - uc_mem_map_size);
+		efx->efx_cxl_pio_in_use = true;
+	} else {
+		/* Using legacy PIO BAR mapping */
 		nic_data->wc_membase = ioremap_wc(efx->membase_phys +
 						  uc_mem_map_size,
 						  wc_mem_map_size);
@@ -1279,12 +1299,12 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
 			nic_data->wc_membase +
 			(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
 			 uc_mem_map_size);
-
-		rc = efx_ef10_link_piobufs(efx);
-		if (rc)
-			efx_ef10_free_piobufs(efx);
 	}
 
+	rc = efx_ef10_link_piobufs(efx);
+	if (rc)
+		efx_ef10_free_piobufs(efx);
+
 	netif_dbg(efx, probe, efx->net_dev,
 		  "memory BAR at %pa (virtual %p+%x UC, %p+%x WC)\n",
 		  &efx->membase_phys, efx->membase, uc_mem_map_size,
diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
index 869129635a84..1629ffe3dccb 100644
--- a/drivers/net/ethernet/sfc/efx_cxl.c
+++ b/drivers/net/ethernet/sfc/efx_cxl.c
@@ -24,9 +24,9 @@ int efx_cxl_init(struct efx_nic *efx)
 	struct pci_dev *pci_dev = efx->pci_dev;
 	DECLARE_BITMAP(expected, CXL_MAX_CAPS);
 	DECLARE_BITMAP(found, CXL_MAX_CAPS);
+	resource_size_t max, start, end;
 	struct efx_cxl *cxl;
 	struct resource res;
-	resource_size_t max;
 	u16 dvsec;
 	int rc;
 
@@ -134,12 +134,28 @@ int efx_cxl_init(struct efx_nic *efx)
 		goto err_region;
 	}
 
+	rc = cxl_get_region_params(cxl->efx_region, &start, &end);
+	if (rc) {
+		pci_err(pci_dev, "CXL getting region params failed\n");
+		goto err_region_params;
+	}
+
+	cxl->ctpio_cxl = ioremap(start, end - start + 1);
+	if (!cxl->ctpio_cxl) {
+		pci_err(pci_dev, "CXL ioremap region failed\n");
+		goto err_region_params;
+	}
+
+	efx->efx_cxl_pio_initialised = true;
+
 	efx->cxl = cxl;
 #endif
 
 	return 0;
 
 #if IS_ENABLED(CONFIG_CXL_BUS)
+err_region_params:
+	cxl_accel_region_detach(cxl->cxled);
 err_region:
 	cxl_dpa_free(cxl->cxled);
 err3:
@@ -157,6 +173,7 @@ void efx_cxl_exit(struct efx_nic *efx)
 {
 #if IS_ENABLED(CONFIG_CXL_BUS)
 	if (efx->cxl) {
+		iounmap(efx->cxl->ctpio_cxl);
 		cxl_accel_region_detach(efx->cxl->cxled);
 		cxl_dpa_free(efx->cxl->cxled);
 		cxl_release_resource(efx->cxl->cxlds, CXL_RES_RAM);
diff --git a/drivers/net/ethernet/sfc/mcdi_pcol.h b/drivers/net/ethernet/sfc/mcdi_pcol.h
index cd297e19cddc..c158a1e8d01b 100644
--- a/drivers/net/ethernet/sfc/mcdi_pcol.h
+++ b/drivers/net/ethernet/sfc/mcdi_pcol.h
@@ -16799,6 +16799,9 @@
 #define        MC_CMD_GET_CAPABILITIES_V7_OUT_DYNAMIC_MPORT_JOURNAL_OFST 148
 #define        MC_CMD_GET_CAPABILITIES_V7_OUT_DYNAMIC_MPORT_JOURNAL_LBN 14
 #define        MC_CMD_GET_CAPABILITIES_V7_OUT_DYNAMIC_MPORT_JOURNAL_WIDTH 1
+#define        MC_CMD_GET_CAPABILITIES_V7_OUT_CXL_CONFIG_ENABLE_OFST 148
+#define        MC_CMD_GET_CAPABILITIES_V7_OUT_CXL_CONFIG_ENABLE_LBN 17
+#define        MC_CMD_GET_CAPABILITIES_V7_OUT_CXL_CONFIG_ENABLE_WIDTH 1
 
 /* MC_CMD_GET_CAPABILITIES_V8_OUT msgresponse */
 #define    MC_CMD_GET_CAPABILITIES_V8_OUT_LEN 160
@@ -17303,6 +17306,9 @@
 #define        MC_CMD_GET_CAPABILITIES_V8_OUT_DYNAMIC_MPORT_JOURNAL_OFST 148
 #define        MC_CMD_GET_CAPABILITIES_V8_OUT_DYNAMIC_MPORT_JOURNAL_LBN 14
 #define        MC_CMD_GET_CAPABILITIES_V8_OUT_DYNAMIC_MPORT_JOURNAL_WIDTH 1
+#define        MC_CMD_GET_CAPABILITIES_V8_OUT_CXL_CONFIG_ENABLE_OFST 148
+#define        MC_CMD_GET_CAPABILITIES_V8_OUT_CXL_CONFIG_ENABLE_LBN 17
+#define        MC_CMD_GET_CAPABILITIES_V8_OUT_CXL_CONFIG_ENABLE_WIDTH 1
 /* These bits are reserved for communicating test-specific capabilities to
  * host-side test software. All production drivers should treat this field as
  * opaque.
@@ -17821,6 +17827,9 @@
 #define        MC_CMD_GET_CAPABILITIES_V9_OUT_DYNAMIC_MPORT_JOURNAL_OFST 148
 #define        MC_CMD_GET_CAPABILITIES_V9_OUT_DYNAMIC_MPORT_JOURNAL_LBN 14
 #define        MC_CMD_GET_CAPABILITIES_V9_OUT_DYNAMIC_MPORT_JOURNAL_WIDTH 1
+#define        MC_CMD_GET_CAPABILITIES_V9_OUT_CXL_CONFIG_ENABLE_OFST 148
+#define        MC_CMD_GET_CAPABILITIES_V9_OUT_CXL_CONFIG_ENABLE_LBN 17
+#define        MC_CMD_GET_CAPABILITIES_V9_OUT_CXL_CONFIG_ENABLE_WIDTH 1
 /* These bits are reserved for communicating test-specific capabilities to
  * host-side test software. All production drivers should treat this field as
  * opaque.
@@ -18374,6 +18383,9 @@
 #define        MC_CMD_GET_CAPABILITIES_V10_OUT_DYNAMIC_MPORT_JOURNAL_OFST 148
 #define        MC_CMD_GET_CAPABILITIES_V10_OUT_DYNAMIC_MPORT_JOURNAL_LBN 14
 #define        MC_CMD_GET_CAPABILITIES_V10_OUT_DYNAMIC_MPORT_JOURNAL_WIDTH 1
+#define        MC_CMD_GET_CAPABILITIES_V10_OUT_CXL_CONFIG_ENABLE_OFST 148
+#define        MC_CMD_GET_CAPABILITIES_V10_OUT_CXL_CONFIG_ENABLE_LBN 17
+#define        MC_CMD_GET_CAPABILITIES_V10_OUT_CXL_CONFIG_ENABLE_WIDTH 1
 /* These bits are reserved for communicating test-specific capabilities to
  * host-side test software. All production drivers should treat this field as
  * opaque.
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 77261de65e63..893e7841ffb4 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -967,6 +967,7 @@ struct efx_cxl;
  * @dl_port: devlink port associated with the PF
  * @cxl: details of related cxl objects
  * @efx_cxl_pio_initialised: clx initialization outcome.
+ * @efx_cxl_pio_in_use: PIO using CXL mapping
  * @mem_bar: The BAR that is mapped into membase.
  * @reg_base: Offset from the start of the bar to the function control window.
  * @monitor_work: Hardware monitor workitem
@@ -1154,6 +1155,7 @@ struct efx_nic {
 	struct devlink_port *dl_port;
 	struct efx_cxl *cxl;
 	bool efx_cxl_pio_initialised;
+	bool efx_cxl_pio_in_use;
 	unsigned int mem_bar;
 	u32 reg_base;
 
diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 1db64fc6e909..b7148810acdb 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -151,6 +151,7 @@ enum {
  * @datapath_caps: Capabilities of datapath firmware (FLAGS1 field of
  *	%MC_CMD_GET_CAPABILITIES response)
  * @datapath_caps2: Further Capabilities of datapath firmware (FLAGS2 field of
  * %MC_CMD_GET_CAPABILITIES response)
+ * @datapath_caps3: Further Capabilities of datapath firmware (FLAGS3 field)
  * @rx_dpcpu_fw_id: Firmware ID of the RxDPCPU
  * @tx_dpcpu_fw_id: Firmware ID of the TxDPCPU
@@ -186,6 +187,7 @@ struct efx_ef10_nic_data {
 	bool must_check_datapath_caps;
 	u32 datapath_caps;
 	u32 datapath_caps2;
+	u32 datapath_caps3;
 	unsigned int rx_dpcpu_fw_id;
 	unsigned int tx_dpcpu_fw_id;
 	bool must_probe_vswitching;
-- 
2.17.1
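
The mapping choice made in efx_ef10_dimension_resources() can be modelled in user space as below. This is only a sketch of the logic in the ef10.c hunk: the bit position 17 comes from the mcdi_pcol.h hunk, while `enum pio_mapping`, `choose_pio_mapping()` and `pio_write_offset()` are hypothetical names, and the numeric values used to exercise them are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit position of CXL_CONFIG_ENABLE within FLAGS3 (from mcdi_pcol.h). */
#define CXL_CONFIG_ENABLE_LBN 17

enum pio_mapping { PIO_NONE, PIO_CXL, PIO_BAR_WC };

static bool cxl_config_enabled(uint32_t datapath_caps3)
{
	return (datapath_caps3 & (UINT32_C(1) << CXL_CONFIG_ENABLE_LBN)) != 0;
}

/*
 * No WC area requested: nothing to map. Otherwise use the CXL mapping only
 * when the firmware advertises it and the driver's CXL setup succeeded;
 * fall back to the legacy write-combining BAR mapping in every other case.
 */
static enum pio_mapping choose_pio_mapping(uint32_t datapath_caps3,
					   bool cxl_pio_initialised,
					   size_t wc_mem_map_size)
{
	if (!wc_mem_map_size)
		return PIO_NONE;
	if (cxl_config_enabled(datapath_caps3) && cxl_pio_initialised)
		return PIO_CXL;
	return PIO_BAR_WC;
}

/* Offset of the PIO write window relative to the start of the mapping. */
static size_t pio_write_offset(size_t vi_base, size_t vi_stride,
			       size_t piobuf_reg, size_t uc_map_size)
{
	size_t abs_off = vi_base * vi_stride + piobuf_reg;

	assert(abs_off >= uc_map_size);
	return abs_off - uc_map_size;
}
```

In the patch the same offset expression is added to whichever base was selected, the CXL mapping or the WC BAR mapping.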



* Re: [PATCH v4 00/26] cxl: add Type2 device support
  2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
                   ` (25 preceding siblings ...)
  2024-10-17 16:52 ` [PATCH v4 26/26] sfc: support pio mapping based on cxl alejandro.lucero-palau
@ 2024-10-23  8:46 ` Paolo Abeni
  2024-10-23  9:38   ` Alejandro Lucero Palau
  26 siblings, 1 reply; 64+ messages in thread
From: Paolo Abeni @ 2024-10-23  8:46 UTC (permalink / raw)
  To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
	martin.habets, edward.cree, davem, kuba, edumazet

Hi,

10/17/24 18:51, alejandro.lucero-palau@amd.com wrote:
> 2) The driver for using the added functionality is not a test driver but a real
> one: the SFC ethernet network driver. It uses the CXL region mapped for PIO
> buffers instead of regions inside PCIe BARs.

I'm sorry for the late feedback, but what is the merge plan here?

The series spans 2 different subsystems and could cause conflicts.

Could the network device changes be separated and sent (to netdev) after
the cxl ones land in Linus' tree?

Thanks,

Paolo



* Re: [PATCH v4 00/26] cxl: add Type2 device support
  2024-10-23  8:46 ` [PATCH v4 00/26] cxl: add Type2 device support Paolo Abeni
@ 2024-10-23  9:38   ` Alejandro Lucero Palau
  2024-11-20 16:50     ` Should the CXL Type2 support patchset be split up? Alejandro Lucero Palau
  0 siblings, 1 reply; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-10-23  9:38 UTC (permalink / raw)
  To: Paolo Abeni, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, edumazet


On 10/23/24 09:46, Paolo Abeni wrote:
> Hi,
>
> 10/17/24 18:51, alejandro.lucero-palau@amd.com wrote:
>> 2) The driver for using the added functionality is not a test driver but a real
>> one: the SFC ethernet network driver. It uses the CXL region mapped for PIO
>> buffers instead of regions inside PCIe BARs.
> I'm sorry for the late feedback, but what is the merge plan here?
>
> The series spans 2 different subsystems and could cause conflicts.
>
> Could the network device changes be separated and sent (to netdev) after
> the cxl ones land in Linus' tree?


Hi Paolo,


With v4, all sfc changes are in separate patches from those modifying the
CXL core, so I guess this fits what you suggest.


I'm not sure about the implications of merging only some patches into the
CXL tree.


Thanks,

Alejandro


> Thanks,
>
> Paolo
>


* Should the CXL Type2 support patchset be split up?
  2024-10-23  9:38   ` Alejandro Lucero Palau
@ 2024-11-20 16:50     ` Alejandro Lucero Palau
  2024-11-20 17:13       ` Dave Jiang
  0 siblings, 1 reply; 64+ messages in thread
From: Alejandro Lucero Palau @ 2024-11-20 16:50 UTC (permalink / raw)
  To: Paolo Abeni, alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, martin.habets, edward.cree, davem, kuba, edumazet,
	Jonathan Cameron, Dave Jiang

Hi all,

Raising Paolo's question again, trying to involve the CXL and (more) netdev
maintainers.

The next version (v6) could be split into two patchsets, one for cxl and
one for netdev. The current patchset already cleanly isolates the sfc
netdev patches, so the split is trivial.

The main question is whether the CXL maintainers will be happy with this
change, as sfc is the client justifying the CXL core changes. Also, the
split could be delayed until all the patches get a Reviewed-by tag, which
only ~75% of them have now (the sfc-related patches have no public
approval yet, although it has been obtained internally).

Thanks,

Alejandro

On 10/23/24 10:38, Alejandro Lucero Palau wrote:
>
> On 10/23/24 09:46, Paolo Abeni wrote:
>> I'm sorry for the late feedback, but what is the merge plan here?
>>
>> The series spans 2 different subsystems and could cause
>> conflicts.
>>
>> Could the network device changes be separated and sent (to netdev) after
>> the cxl ones land in Linus' tree?
>
>
> Hi Paolo,
>
>
> With v4, all sfc changes are in separate patches from those modifying the
> CXL core, so I guess this fits what you suggest.
>
>
> I'm not sure about the implications of merging only some patches into the
> CXL tree.
>
>
> Thanks,
>
> Alejandro
>
>
>> Thanks,
>>
>> Paolo
>>


* Re: Should the CXL Type2 support patchset be split up?
  2024-11-20 16:50     ` Should the CXL Type2 support patchset be split up? Alejandro Lucero Palau
@ 2024-11-20 17:13       ` Dave Jiang
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2024-11-20 17:13 UTC (permalink / raw)
  To: Alejandro Lucero Palau, Paolo Abeni, alejandro.lucero-palau,
	linux-cxl, netdev, dan.j.williams, martin.habets, edward.cree,
	davem, kuba, edumazet, Jonathan Cameron



On 11/20/24 9:50 AM, Alejandro Lucero Palau wrote:
> Hi all,
> 
> 
> Raising Paolo's question again, trying to involve the CXL and (more) netdev maintainers.
> 
> 
> The next version (v6) could be split into two patchsets, one for cxl and one for netdev. The current patchset already cleanly isolates the sfc netdev patches, so the split is trivial.
> 
> The main question is whether the CXL maintainers will be happy with this change, as sfc is the client justifying the CXL core changes. Also, the split could be delayed until all the patches get a Reviewed-by tag, which only ~75% of them have now (the sfc-related patches have no public approval yet, although it has been obtained internally).

Given that the series is predominantly CXL patches, my suggestion would be to get the acks from the netdev side, and then CXL can take the whole series without any splitting. That's typically how it has been done with cross-subsystem changes, e.g. ACPI+CXL.

DJ 

> 
> Thanks,
> 
> Alejandro
> 
> 
> On 10/23/24 10:38, Alejandro Lucero Palau wrote:
>>
>> On 10/23/24 09:46, Paolo Abeni wrote:
>>> I'm sorry for the late feedback, but what is the merge plan here?
>>>
>>> The series spans 2 different subsystems and could cause conflicts.
>>>
>>> Could the network device changes be separated and sent (to netdev) after
>>> the cxl ones land in Linus' tree?
>>
>>
>> Hi Paolo,
>>
>>
>> With v4, all sfc changes are in separate patches from those modifying the CXL core, so I guess this fits what you suggest.
>>
>>
>> I'm not sure about the implications of merging only some patches into the CXL tree.
>>
>>
>> Thanks,
>>
>> Alejandro
>>
>>
>>> Thanks,
>>>
>>> Paolo
>>>


Thread overview: 64+ messages
2024-10-17 16:51 [PATCH v4 00/26] cxl: add Type2 device support alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 01/26] cxl: add type2 device basic support alejandro.lucero-palau
2024-10-25 13:50   ` Jonathan Cameron
2024-10-28  9:37     ` Alejandro Lucero Palau
2024-10-28 18:05   ` Dave Jiang
2024-10-30 16:26     ` Alejandro Lucero Palau
2024-10-17 16:52 ` [PATCH v4 02/26] sfc: add cxl support using new CXL API alejandro.lucero-palau
2024-10-17 21:48   ` Ben Cheatham
2024-10-18 13:38     ` Alejandro Lucero Palau
2024-10-25 14:03   ` Jonathan Cameron
2024-10-28 11:59     ` Alejandro Lucero Palau
2024-10-29 15:14       ` Jonathan Cameron
2024-10-30 16:31         ` Alejandro Lucero Palau
2024-10-17 16:52 ` [PATCH v4 03/26] cxl: add capabilities field to cxl_dev_state and cxl_port alejandro.lucero-palau
2024-10-25 14:14   ` Jonathan Cameron
2024-10-28 12:00     ` Alejandro Lucero Palau
2024-10-28 18:19   ` Dave Jiang
2024-10-30 16:28     ` Alejandro Lucero Palau
2024-10-17 16:52 ` [PATCH v4 04/26] cxl/pci: add check for validating capabilities alejandro.lucero-palau
2024-10-25 10:16   ` Alejandro Lucero Palau
2024-10-25 14:16     ` Jonathan Cameron
2024-10-17 16:52 ` [PATCH v4 05/26] cxl: move pci generic code alejandro.lucero-palau
2024-10-17 21:49   ` Ben Cheatham
2024-10-18  9:35     ` Alejandro Lucero Palau
2024-10-17 16:52 ` [PATCH v4 06/26] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
2024-10-17 21:49   ` Ben Cheatham
2024-10-17 16:52 ` [PATCH v4 07/26] sfc: use cxl api for regs setup and checking alejandro.lucero-palau
2024-10-17 21:49   ` Ben Cheatham
2024-10-18 15:07     ` Alejandro Lucero Palau
2024-10-17 16:52 ` [PATCH v4 08/26] cxl: add functions for resource request/release by a driver alejandro.lucero-palau
2024-10-17 21:49   ` Ben Cheatham
2024-10-18 14:58     ` Alejandro Lucero Palau
2024-10-18 16:40       ` Ben Cheatham
2024-10-17 16:52 ` [PATCH v4 09/26] sfc: request cxl ram resource alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 10/26] cxl: harden resource_contains checks to handle zero size resources alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 11/26] cxl: add function for setting media ready by a driver alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 12/26] sfc: set cxl media ready alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 13/26] cxl: prepare memdev creation for type2 alejandro.lucero-palau
2024-10-17 21:49   ` Ben Cheatham
2024-10-18 10:49     ` Alejandro Lucero Palau
2024-10-18 16:40       ` Ben Cheatham
2024-10-17 16:52 ` [PATCH v4 14/26] sfc: create type2 cxl memdev alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 15/26] cxl: define a driver interface for HPA free space enumeration alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 16/26] sfc: obtain root decoder with enough HPA free space alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 17/26] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 18/26] sfc: get endpoint decoder alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 19/26] cxl: make region type based on endpoint type alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 20/26] cxl/region: factor out interleave ways setup alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 21/26] cxl/region: factor out interleave granularity setup alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 22/26] cxl: allow region creation by type2 drivers alejandro.lucero-palau
2024-10-17 21:49   ` Ben Cheatham
2024-10-18  8:51     ` Alejandro Lucero Palau
2024-10-18 16:40       ` Ben Cheatham
2024-10-21  9:54         ` Alejandro Lucero Palau
2024-10-17 16:52 ` [PATCH v4 23/26] sfc: create cxl region alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 24/26] cxl: preclude device memory to be used for dax alejandro.lucero-palau
2024-10-17 21:50   ` Ben Cheatham
2024-10-18  8:10     ` Alejandro Lucero Palau
2024-10-17 16:52 ` [PATCH v4 25/26] cxl: add function for obtaining params from a region alejandro.lucero-palau
2024-10-17 16:52 ` [PATCH v4 26/26] sfc: support pio mapping based on cxl alejandro.lucero-palau
2024-10-23  8:46 ` [PATCH v4 00/26] cxl: add Type2 device support Paolo Abeni
2024-10-23  9:38   ` Alejandro Lucero Palau
2024-11-20 16:50     ` Should the CXL Type2 support patchset be split up? Alejandro Lucero Palau
2024-11-20 17:13       ` Dave Jiang
