Linux CXL
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Add CXL Reset Support for CXL Devices
@ 2025-02-21  4:39 Srirangan Madhavan
  2025-02-21  4:39 ` [PATCH v2 1/2] cxl: de-duplicate cxl DVSEC reg defines Srirangan Madhavan
  2025-02-21  4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
  0 siblings, 2 replies; 10+ messages in thread
From: Srirangan Madhavan @ 2025-02-21  4:39 UTC (permalink / raw)
  To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
  Cc: Zhi Wang, Vishal Aslot, Shanker Donthineni, linux-cxl,
	Bjorn Helgaas, linux-pci

This patch series introduces support for the CXL Reset method
for CXL devices, implementing the reset procedure outlined in
CXL Spec [1] v3.2, Sections 9.6 and 9.7.

v2 changes:
- de-duplicate CXL DVSEC register defines under include/cxl/pci.h
- fix style related issues.

v1 changes:
- Added cover letter and dropped the RFC.

The RFC patche can be found here [2]

Motivation:
-----------
This change is broadly useful for reasons including but not limited to the
following:
- As support for Type 2 devices [3] is being introduced, more devices will
  require finer-grained reset mechanisms beyond bus-wide reset methods.
- FLR does not affect CXL.cache or CXL.mem protocols, making CXL Reset
  the preferred method in some cases.
- The CXL spec (Sections 7.2.3 Binding and Unbinding, 9.5 FLR) highlights use
  cases like function rebinding and error recovery, where CXL Reset is
  explicitly mentioned.

Change Description:
-------------------
The patch implements the required steps for CXL reset and introduces a new reset
method in the pci_reset_fn_methods. It also defines the necessary CXL DVSEC
register macros. The actual steps for rest are broadly: disable cache lines and
asserts WB+I, wait for cache and memory clear signals, initial reset, wait for
complete and return status.

Command line to test the CXL reset on a capable device:
    echo cxl_reset > /sys/bus/pci/devices/<pci_device>/reset_method
    echo 1 > /sys/bus/pci/devices/<pci_device>/reset

[1] https://computeexpresslink.org/cxl-specification/
[2] https://lore.kernel.org/all/20241213074143.374-1-smadhavan@nvidia.com/
[3] https://lore.kernel.org/linux-cxl/20241216161042.42108-1-alejandro.lucero-palau@amd.com/

Srirangan Madhavan (2):
  cxl: de-duplicate cxl DVSEC reg defines
  cxl: add support for cxl reset

 drivers/cxl/core/pci.c        |   1 +
 drivers/cxl/core/regs.c       |   1 +
 drivers/cxl/cxlpci.h          |  53 -----------
 drivers/cxl/pci.c             |   1 +
 drivers/pci/pci.c             | 163 ++++++++++++++++++++++++++++++++--
 include/cxl/pci.h             |  76 ++++++++++++++++
 include/linux/pci.h           |   2 +-
 include/uapi/linux/pci_regs.h |   5 --
 8 files changed, 235 insertions(+), 67 deletions(-)
 create mode 100644 include/cxl/pci.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/2] cxl: de-duplicate cxl DVSEC reg defines
  2025-02-21  4:39 [PATCH v2 0/2] Add CXL Reset Support for CXL Devices Srirangan Madhavan
@ 2025-02-21  4:39 ` Srirangan Madhavan
  2025-02-21  4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
  1 sibling, 0 replies; 10+ messages in thread
From: Srirangan Madhavan @ 2025-02-21  4:39 UTC (permalink / raw)
  To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
  Cc: Zhi Wang, Vishal Aslot, Shanker Donthineni, linux-cxl,
	Bjorn Helgaas, linux-pci

There are two set of CXL DVSEC register defines in
uapi/linux/pci_regs.h as PCI_DVSEC_CXL* and in cxl/cxlpci.h as
CXL_DVSEC_*.

Consolidate the defines under include/cxl/ accessible by both
pci core and cxl. Also fix any references to the deprecated copy
of these defines.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
 drivers/cxl/core/pci.c        |  1 +
 drivers/cxl/core/regs.c       |  1 +
 drivers/cxl/cxlpci.h          | 53 ------------------------------
 drivers/cxl/pci.c             |  1 +
 drivers/pci/pci.c             | 17 +++++-----
 include/cxl/pci.h             | 62 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/pci_regs.h |  5 ---
 7 files changed, 74 insertions(+), 66 deletions(-)
 create mode 100644 include/cxl/pci.h

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index a5c65f79db18..afa2c85fc6fd 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -7,6 +7,7 @@
 #include <linux/pci.h>
 #include <linux/pci-doe.h>
 #include <linux/aer.h>
+#include <cxl/pci.h>
 #include <cxlpci.h>
 #include <cxlmem.h>
 #include <cxl.h>
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index 117c2e94c761..58a942a4946c 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -4,6 +4,7 @@
 #include <linux/device.h>
 #include <linux/slab.h>
 #include <linux/pci.h>
+#include <cxl/pci.h>
 #include <cxlmem.h>
 #include <cxlpci.h>
 #include <pmu.h>
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 54e219b0049e..24c06eca8a68 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -7,59 +7,6 @@
 
 #define CXL_MEMORY_PROGIF	0x10
 
-/*
- * See section 8.1 Configuration Space Registers in the CXL 2.0
- * Specification. Names are taken straight from the specification with "CXL" and
- * "DVSEC" redundancies removed. When obvious, abbreviations may be used.
- */
-#define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
-
-/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
-#define CXL_DVSEC_PCIE_DEVICE					0
-#define   CXL_DVSEC_CAP_OFFSET		0xA
-#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
-#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
-#define   CXL_DVSEC_CTRL_OFFSET		0xC
-#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
-#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
-#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
-#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
-#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
-#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
-#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
-#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
-#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
-
-#define CXL_DVSEC_RANGE_MAX		2
-
-/* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
-#define CXL_DVSEC_FUNCTION_MAP					2
-
-/* CXL 2.0 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
-#define CXL_DVSEC_PORT_EXTENSIONS				3
-
-/* CXL 2.0 8.1.6: GPF DVSEC for CXL Port */
-#define CXL_DVSEC_PORT_GPF					4
-#define   CXL_DVSEC_PORT_GPF_PHASE_1_CONTROL_OFFSET		0x0C
-#define     CXL_DVSEC_PORT_GPF_PHASE_1_TMO_BASE_MASK		GENMASK(3, 0)
-#define     CXL_DVSEC_PORT_GPF_PHASE_1_TMO_SCALE_MASK		GENMASK(11, 8)
-#define   CXL_DVSEC_PORT_GPF_PHASE_2_CONTROL_OFFSET		0xE
-#define     CXL_DVSEC_PORT_GPF_PHASE_2_TMO_BASE_MASK		GENMASK(3, 0)
-#define     CXL_DVSEC_PORT_GPF_PHASE_2_TMO_SCALE_MASK		GENMASK(11, 8)
-
-/* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
-#define CXL_DVSEC_DEVICE_GPF					5
-
-/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
-#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
-
-/* CXL 2.0 8.1.9: Register Locator DVSEC */
-#define CXL_DVSEC_REG_LOCATOR					8
-#define   CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET			0xC
-#define     CXL_DVSEC_REG_LOCATOR_BIR_MASK			GENMASK(2, 0)
-#define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
-#define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
-
 /*
  * NOTE: Currently all the functions which are enabled for CXL require their
  * vectors to be in the first 16.  Use this as the default max.
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index b2c943a4de0a..1f802b7cd65d 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -12,6 +12,7 @@
 #include <linux/aer.h>
 #include <linux/io.h>
 #include <cxl/mailbox.h>
+#include <cxl/pci.h>
 #include "cxlmem.h"
 #include "cxlpci.h"
 #include "cxl.h"
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 869d204a70a3..3ab1871ecf8a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -30,6 +30,7 @@
 #include <asm/dma.h>
 #include <linux/aer.h>
 #include <linux/bitfield.h>
+#include <cxl/pci.h>
 #include "pci.h"
 
 DEFINE_MUTEX(pci_slot_mutex);
@@ -5029,7 +5030,7 @@ static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
 static u16 cxl_port_dvsec(struct pci_dev *dev)
 {
 	return pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
-					 PCI_DVSEC_CXL_PORT);
+					 CXL_DVSEC_PORT_EXTENSIONS);
 }
 
 static bool cxl_sbr_masked(struct pci_dev *dev)
@@ -5041,7 +5042,7 @@ static bool cxl_sbr_masked(struct pci_dev *dev)
 	if (!dvsec)
 		return false;
 
-	rc = pci_read_config_word(dev, dvsec + PCI_DVSEC_CXL_PORT_CTL, &reg);
+	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_PORT_CTL, &reg);
 	if (rc || PCI_POSSIBLE_ERROR(reg))
 		return false;
 
@@ -5050,7 +5051,7 @@ static bool cxl_sbr_masked(struct pci_dev *dev)
 	 * bit in Bridge Control has no effect.  When 1, the Port generates
 	 * hot reset when the SBR bit is set to 1.
 	 */
-	if (reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR)
+	if (reg & CXL_DVSEC_UNMASK_SBR)
 		return false;
 
 	return true;
@@ -5095,22 +5096,22 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
 	if (probe)
 		return 0;
 
-	rc = pci_read_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, &reg);
+	rc = pci_read_config_word(bridge, dvsec + CXL_DVSEC_PORT_CTL, &reg);
 	if (rc)
 		return -ENOTTY;
 
-	if (reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR) {
+	if (reg & CXL_DVSEC_UNMASK_SBR) {
 		val = reg;
 	} else {
-		val = reg | PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR;
-		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
+		val = reg | CXL_DVSEC_UNMASK_SBR;
+		pci_write_config_word(bridge, dvsec + CXL_DVSEC_PORT_CTL,
 				      val);
 	}
 
 	rc = pci_reset_bus_function(dev, probe);
 
 	if (reg != val)
-		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
+		pci_write_config_word(bridge, dvsec + CXL_DVSEC_PORT_CTL,
 				      reg);
 
 	return rc;
diff --git a/include/cxl/pci.h b/include/cxl/pci.h
new file mode 100644
index 000000000000..3977425ec477
--- /dev/null
+++ b/include/cxl/pci.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
+
+#ifndef __CXL_ACCEL_PCI_H
+#define __CXL_ACCEL_PCI_H
+
+/*
+ * See section 8.1 Configuration Space Registers in the CXL 2.0
+ * Specification. Names are taken straight from the specification with "CXL" and
+ * "DVSEC" redundancies removed. When obvious, abbreviations may be used.
+ */
+#define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
+
+/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
+#define CXL_DVSEC_PCIE_DEVICE					0
+#define   CXL_DVSEC_CAP_OFFSET		0xA
+#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
+#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
+#define   CXL_DVSEC_CTRL_OFFSET		0xC
+#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
+#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
+#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
+#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
+#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
+#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
+#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
+#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
+#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
+
+#define CXL_DVSEC_RANGE_MAX		2
+
+/* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
+#define CXL_DVSEC_FUNCTION_MAP					2
+
+/* CXL 2.0 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
+#define CXL_DVSEC_PORT_EXTENSIONS				3
+#define   CXL_DVSEC_PORT_CTL			0xC
+#define     CXL_DVSEC_UNMASK_SBR		BIT(0)
+
+/* CXL 2.0 8.1.6: GPF DVSEC for CXL Port */
+#define CXL_DVSEC_PORT_GPF					4
+#define   CXL_DVSEC_PORT_GPF_PHASE_1_CONTROL_OFFSET		0x0C
+#define     CXL_DVSEC_PORT_GPF_PHASE_1_TMO_BASE_MASK		GENMASK(3, 0)
+#define     CXL_DVSEC_PORT_GPF_PHASE_1_TMO_SCALE_MASK		GENMASK(11, 8)
+#define   CXL_DVSEC_PORT_GPF_PHASE_2_CONTROL_OFFSET		0xE
+#define     CXL_DVSEC_PORT_GPF_PHASE_2_TMO_BASE_MASK		GENMASK(3, 0)
+#define     CXL_DVSEC_PORT_GPF_PHASE_2_TMO_SCALE_MASK		GENMASK(11, 8)
+
+/* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
+#define CXL_DVSEC_DEVICE_GPF					5
+
+/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
+#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
+
+/* CXL 2.0 8.1.9: Register Locator DVSEC */
+#define CXL_DVSEC_REG_LOCATOR					8
+#define   CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET			0xC
+#define     CXL_DVSEC_REG_LOCATOR_BIR_MASK			GENMASK(2, 0)
+#define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
+#define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
+
+#endif
\ No newline at end of file
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 3445c4970e4d..02844c20c527 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -1208,9 +1208,4 @@
 #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_PROTOCOL		0x00ff0000
 #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_NEXT_INDEX	0xff000000
 
-/* Compute Express Link (CXL r3.1, sec 8.1.5) */
-#define PCI_DVSEC_CXL_PORT				3
-#define PCI_DVSEC_CXL_PORT_CTL				0x0c
-#define PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR		0x00000001
-
 #endif /* LINUX_PCI_REGS_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] cxl: add support for cxl reset
  2025-02-21  4:39 [PATCH v2 0/2] Add CXL Reset Support for CXL Devices Srirangan Madhavan
  2025-02-21  4:39 ` [PATCH v2 1/2] cxl: de-duplicate cxl DVSEC reg defines Srirangan Madhavan
@ 2025-02-21  4:39 ` Srirangan Madhavan
  2025-02-21 10:45   ` Lukas Wunner
                     ` (3 more replies)
  1 sibling, 4 replies; 10+ messages in thread
From: Srirangan Madhavan @ 2025-02-21  4:39 UTC (permalink / raw)
  To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
  Cc: Zhi Wang, Vishal Aslot, Shanker Donthineni, linux-cxl,
	Bjorn Helgaas, linux-pci

Type 2 devices are being introduced and will require finer-grained
reset mechanisms beyond bus-wide reset methods.

Add support for CXL reset per CXL v3.2 Section 9.6/9.7

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
 drivers/pci/pci.c   | 146 ++++++++++++++++++++++++++++++++++++++++++++
 include/cxl/pci.h   |  40 ++++++++----
 include/linux/pci.h |   2 +-
 3 files changed, 174 insertions(+), 14 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 3ab1871ecf8a..9108daae252b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5117,6 +5117,151 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
 	return rc;
 }
 
+static int cxl_reset_prepare(struct pci_dev *dev, u16 dvsec)
+{
+	u32 timeout_us = 100, timeout_tot_us = 10000;
+	u16 reg, cap;
+	int rc;
+
+	if (!pci_wait_for_pending_transaction(dev))
+		pci_err(dev, "timed out waiting for pending transaction; performing cxl reset anyway\n");
+
+	/* Check if the device is cache capable. */
+	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &cap);
+	if (rc)
+		return rc;
+
+	if (!(cap & CXL_DVSEC_CACHE_CAPABLE))
+		return 0;
+
+	/* Disable cache. WB and invalidate cache if capability is advertised */
+	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, &reg);
+	if (rc)
+		return rc;
+	reg |= CXL_DVSEC_DISABLE_CACHING;
+	/*
+	 * DEVCTL2 bits are written only once. So check WB+I capability while
+	 * keeping disable caching set.
+	 */
+	if (cap & CXL_DVSEC_CACHE_WBI_CAPABLE)
+		reg |= CXL_DVSEC_INIT_CACHE_WBI;
+	pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
+
+	/*
+	 * From Section 9.6: "Software may leverage the cache size reported in
+	 * the DVSEC CXL Capability2 register to compute a suitable timeout
+	 * value".
+	 * Given there is no conversion factor for cache size -> timeout,
+	 * setting timer for default 10ms.
+	 */
+	do {
+		if (timeout_tot_us == 0)
+			return -ETIMEDOUT;
+		usleep_range(timeout_us, timeout_us + 1);
+		timeout_tot_us -= timeout_us;
+		rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET,
+					  &reg);
+		if (rc)
+			return rc;
+	} while (!(reg & CXL_DVSEC_CACHE_INVALID));
+
+	return 0;
+}
+
+static int cxl_reset_init(struct pci_dev *dev, u16 dvsec)
+{
+	/*
+	 * Timeout values ref CXL Spec v3.2 Ch 8 Control and Status Registers,
+	 * under section 8.1.3.1 DVSEC CXL Capability.
+	 */
+	u32 reset_timeouts_ms[] = { 10, 100, 1000, 10000, 100000 };
+	u16 reg;
+	u32 timeout_ms;
+	int rc, ind;
+
+	/* Check if CXL Reset MEM CLR is supported. */
+	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &reg);
+	if (rc)
+		return rc;
+
+	if (reg & CXL_DVSEC_CXL_RST_MEM_CLR_CAPABLE) {
+		rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET,
+					  &reg);
+		if (rc)
+			return rc;
+
+		reg |= CXL_DVSEC_CXL_RST_MEM_CLR_ENABLE;
+		pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
+	}
+
+	/* Read timeout value. */
+	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &reg);
+	if (rc)
+		return rc;
+	ind = FIELD_GET(CXL_DVSEC_CXL_RST_TIMEOUT_MASK, reg);
+	timeout_ms = reset_timeouts_ms[ind];
+
+	/* Write reset config. */
+	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, &reg);
+	if (rc)
+		return rc;
+
+	reg |= CXL_DVSEC_INIT_CXL_RESET;
+	pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
+
+	/* Wait till timeout and then check reset status is complete. */
+	msleep(timeout_ms);
+	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_STATUS2_OFFSET, &reg);
+	if (rc)
+		return rc;
+	if (reg & CXL_DVSEC_CXL_RESET_ERR ||
+	    ~reg & CXL_DVSEC_CXL_RST_COMPLETE)
+		return -ETIMEDOUT;
+
+	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, &reg);
+	if (rc)
+		return rc;
+	reg &= (~CXL_DVSEC_DISABLE_CACHING);
+	pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
+
+	return 0;
+}
+
+/**
+ * cxl_reset - initiate a cxl reset
+ * @dev: device to reset
+ * @probe: if true, return 0 if device can be reset this way
+ *
+ * Initiate a cxl reset on @dev.
+ */
+static int cxl_reset(struct pci_dev *dev, bool probe)
+{
+	u16 dvsec, reg;
+	int rc;
+
+	dvsec = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
+					  CXL_DVSEC_PCIE_DEVICE);
+	if (!dvsec)
+		return -ENOTTY;
+
+	/* Check if CXL Reset is supported. */
+	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &reg);
+	if (rc)
+		return -ENOTTY;
+
+	if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
+		return -ENOTTY;
+
+	if (probe)
+		return 0;
+
+	rc = cxl_reset_prepare(dev, dvsec);
+	if (rc)
+		return rc;
+
+	return cxl_reset_init(dev, dvsec);
+}
+
 void pci_dev_lock(struct pci_dev *dev)
 {
 	/* block PM suspend, driver probe, etc. */
@@ -5203,6 +5348,7 @@ const struct pci_reset_fn_method pci_reset_fn_methods[] = {
 	{ pci_dev_acpi_reset, .name = "acpi" },
 	{ pcie_reset_flr, .name = "flr" },
 	{ pci_af_flr, .name = "af_flr" },
+	{ cxl_reset, .name = "cxl_reset" },
 	{ pci_pm_reset, .name = "pm" },
 	{ pci_reset_bus_function, .name = "bus" },
 	{ cxl_reset_bus_function, .name = "cxl_bus" },
diff --git a/include/cxl/pci.h b/include/cxl/pci.h
index 3977425ec477..05d4b1a63cfe 100644
--- a/include/cxl/pci.h
+++ b/include/cxl/pci.h
@@ -13,19 +13,33 @@
 
 /* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
 #define CXL_DVSEC_PCIE_DEVICE					0
-#define   CXL_DVSEC_CAP_OFFSET		0xA
-#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
-#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
-#define   CXL_DVSEC_CTRL_OFFSET		0xC
-#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
-#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
-#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
-#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
-#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
-#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
-#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
-#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
-#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
+#define   CXL_DVSEC_CAP_OFFSET			0xA
+#define     CXL_DVSEC_CACHE_CAPABLE		BIT(0)
+#define     CXL_DVSEC_MEM_CAPABLE		BIT(2)
+#define     CXL_DVSEC_HDM_COUNT_MASK		GENMASK(5, 4)
+#define     CXL_DVSEC_CACHE_WBI_CAPABLE		BIT(6)
+#define     CXL_DVSEC_CXL_RST_CAPABLE		BIT(7)
+#define     CXL_DVSEC_CXL_RST_TIMEOUT_MASK	GENMASK(10, 8)
+#define     CXL_DVSEC_CXL_RST_MEM_CLR_CAPABLE	BIT(11)
+#define   CXL_DVSEC_CTRL_OFFSET			0xC
+#define     CXL_DVSEC_MEM_ENABLE		BIT(2)
+#define   CXL_DVSEC_CTRL2_OFFSET		0x10
+#define     CXL_DVSEC_DISABLE_CACHING		BIT(0)
+#define     CXL_DVSEC_INIT_CACHE_WBI		BIT(1)
+#define     CXL_DVSEC_INIT_CXL_RESET		BIT(2)
+#define     CXL_DVSEC_CXL_RST_MEM_CLR_ENABLE	BIT(3)
+#define   CXL_DVSEC_STATUS2_OFFSET		0x12
+#define     CXL_DVSEC_CACHE_INVALID		BIT(0)
+#define     CXL_DVSEC_CXL_RST_COMPLETE		BIT(1)
+#define     CXL_DVSEC_CXL_RESET_ERR		BIT(2)
+#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)		(0x18 + ((i) * 0x10))
+#define   CXL_DVSEC_RANGE_SIZE_LOW(i)		(0x1C + ((i) * 0x10))
+#define     CXL_DVSEC_MEM_INFO_VALID		BIT(0)
+#define     CXL_DVSEC_MEM_ACTIVE		BIT(1)
+#define     CXL_DVSEC_MEM_SIZE_LOW_MASK		GENMASK(31, 28)
+#define   CXL_DVSEC_RANGE_BASE_HIGH(i)		(0x20 + ((i) * 0x10))
+#define   CXL_DVSEC_RANGE_BASE_LOW(i)		(0x24 + ((i) * 0x10))
+#define     CXL_DVSEC_MEM_BASE_LOW_MASK		GENMASK(31, 28)
 
 #define CXL_DVSEC_RANGE_MAX		2
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 47b31ad724fa..efcb06598f26 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -51,7 +51,7 @@
 			       PCI_STATUS_PARITY)
 
 /* Number of reset methods used in pci_reset_fn_methods array in pci.c */
-#define PCI_NUM_RESET_METHODS 8
+#define PCI_NUM_RESET_METHODS 9
 
 #define PCI_RESET_PROBE		true
 #define PCI_RESET_DO_RESET	false
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] cxl: add support for cxl reset
  2025-02-21  4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
@ 2025-02-21 10:45   ` Lukas Wunner
  2025-02-22  0:13     ` Bjorn Helgaas
  2025-02-21 12:43   ` Alejandro Lucero Palau
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Lukas Wunner @ 2025-02-21 10:45 UTC (permalink / raw)
  To: Srirangan Madhavan
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Zhi Wang, Vishal Aslot,
	Shanker Donthineni, linux-cxl, Bjorn Helgaas, linux-pci

On Thu, Feb 20, 2025 at 08:39:06PM -0800, Srirangan Madhavan wrote:
> Type 2 devices are being introduced and will require finer-grained
> reset mechanisms beyond bus-wide reset methods.
> 
> Add support for CXL reset per CXL v3.2 Section 9.6/9.7
> 
> Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
> ---
>  drivers/pci/pci.c   | 146 ++++++++++++++++++++++++++++++++++++++++++++

drivers/pci/pci.c is basically a catch-all for anything that doesn't fit
in one of the other .c files in drivers/pci.  I'm slightly worried that
this (otherwise legitimate) patch increases the clutter in pci.c further,
rendering it unmaintainable in the long term.

At the very least, I'm wondering if this can be #ifdef'ed to
CONFIG_CXL_PCI?

One idea would be to move this newly added reset method, as well as the
existing cxl_reset_bus_function(), to a new drivers/pci/cxl.c file.

I guess moving it to drivers/cxl/ isn't an option because cxl can be
modular.

Another idea would be to move all the reset handling (which makes up
a significant portion of pci.c) to a separate drivers/pci/reset.c.
This might be beyond the scope of your patch, but in the interim,
maybe at least an #ifdef can be added because the PCI core is also
used e.g. on memory-constrained wifi routers which don't care about
CXL at all.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] cxl: add support for cxl reset
  2025-02-21  4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
  2025-02-21 10:45   ` Lukas Wunner
@ 2025-02-21 12:43   ` Alejandro Lucero Palau
  2025-02-22  5:45   ` kernel test robot
  2025-02-22  7:08   ` kernel test robot
  3 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero Palau @ 2025-02-21 12:43 UTC (permalink / raw)
  To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
  Cc: Zhi Wang, Vishal Aslot, Shanker Donthineni, linux-cxl,
	Bjorn Helgaas, linux-pci


On 2/21/25 04:39, Srirangan Madhavan wrote:
> Type 2 devices are being introduced and will require finer-grained
> reset mechanisms beyond bus-wide reset methods.
>
> Add support for CXL reset per CXL v3.2 Section 9.6/9.7


Hi,


This seems too early as there is no plans for supporting CXL.cache yet. 
It is the second part of the current ongoing type2 support though.


I guess starting the discussion about how to proceed is not a problem, 
so my comments below, but my first comment is if the decisions about 
what to do should be generic. I think having some helpers for accel 
drivers will be good, but the invocation should be in the hands of the 
accel driver.


> Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
> ---
>   drivers/pci/pci.c   | 146 ++++++++++++++++++++++++++++++++++++++++++++
>   include/cxl/pci.h   |  40 ++++++++----
>   include/linux/pci.h |   2 +-
>   3 files changed, 174 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 3ab1871ecf8a..9108daae252b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5117,6 +5117,151 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
>   	return rc;
>   }
>   
> +static int cxl_reset_prepare(struct pci_dev *dev, u16 dvsec)
> +{
> +	u32 timeout_us = 100, timeout_tot_us = 10000;
> +	u16 reg, cap;
> +	int rc;
> +
> +	if (!pci_wait_for_pending_transaction(dev))
> +		pci_err(dev, "timed out waiting for pending transaction; performing cxl reset anyway\n");
> +
> +	/* Check if the device is cache capable. */
> +	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &cap);
> +	if (rc)
> +		return rc;
> +
> +	if (!(cap & CXL_DVSEC_CACHE_CAPABLE))
> +		return 0;
> +
> +	/* Disable cache. WB and invalidate cache if capability is advertised */


I do not know how safe is this. IMO, this needs to be synchronized by 
the accel driver which could imply to tell user space first. In our case 
it would imply to stop rx queues in the netdev for CXL.cache, and tx 
queues for CXL.mem. Doing it unconditionally could make current CXL 
transactions to stall ... although it could be argued the reset event 
implies something is broken, but let's try to do it properly if there is 
a chance of the system not unreliable.


> +	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, &reg);
> +	if (rc)
> +		return rc;
> +	reg |= CXL_DVSEC_DISABLE_CACHING;
> +	/*
> +	 * DEVCTL2 bits are written only once. So check WB+I capability while
> +	 * keeping disable caching set.
> +	 */
> +	if (cap & CXL_DVSEC_CACHE_WBI_CAPABLE)
> +		reg |= CXL_DVSEC_INIT_CACHE_WBI;
> +	pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
> +
> +	/*
> +	 * From Section 9.6: "Software may leverage the cache size reported in
> +	 * the DVSEC CXL Capability2 register to compute a suitable timeout
> +	 * value".
> +	 * Given there is no conversion factor for cache size -> timeout,
> +	 * setting timer for default 10ms.
> +	 */
> +	do {
> +		if (timeout_tot_us == 0)
> +			return -ETIMEDOUT;
> +		usleep_range(timeout_us, timeout_us + 1);
> +		timeout_tot_us -= timeout_us;
> +		rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET,
> +					  &reg);
> +		if (rc)
> +			return rc;
> +	} while (!(reg & CXL_DVSEC_CACHE_INVALID));
> +
> +	return 0;
> +}
> +
> +static int cxl_reset_init(struct pci_dev *dev, u16 dvsec)


I think cxl_reset_start after the *_prepare call makes more sense here.


> +{
> +	/*
> +	 * Timeout values ref CXL Spec v3.2 Ch 8 Control and Status Registers,
> +	 * under section 8.1.3.1 DVSEC CXL Capability.
> +	 */
> +	u32 reset_timeouts_ms[] = { 10, 100, 1000, 10000, 100000 };
> +	u16 reg;
> +	u32 timeout_ms;
> +	int rc, ind;
> +
> +	/* Check if CXL Reset MEM CLR is supported. */
> +	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &reg);
> +	if (rc)
> +		return rc;
> +
> +	if (reg & CXL_DVSEC_CXL_RST_MEM_CLR_CAPABLE) {
> +		rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET,
> +					  &reg);
> +		if (rc)
> +			return rc;
> +
> +		reg |= CXL_DVSEC_CXL_RST_MEM_CLR_ENABLE;
> +		pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
> +	}
> +
> +	/* Read timeout value. */
> +	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &reg);
> +	if (rc)
> +		return rc;
> +	ind = FIELD_GET(CXL_DVSEC_CXL_RST_TIMEOUT_MASK, reg);
> +	timeout_ms = reset_timeouts_ms[ind];
> +
> +	/* Write reset config. */
> +	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, &reg);
> +	if (rc)
> +		return rc;
> +
> +	reg |= CXL_DVSEC_INIT_CXL_RESET;
> +	pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
> +
> +	/* Wait till timeout and then check reset status is complete. */
> +	msleep(timeout_ms);
> +	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_STATUS2_OFFSET, &reg);
> +	if (rc)
> +		return rc;
> +	if (reg & CXL_DVSEC_CXL_RESET_ERR ||
> +	    ~reg & CXL_DVSEC_CXL_RST_COMPLETE)
> +		return -ETIMEDOUT;
> +
> +	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, &reg);
> +	if (rc)
> +		return rc;
> +	reg &= (~CXL_DVSEC_DISABLE_CACHING);
> +	pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
> +
> +	return 0;
> +}
> +
> +/**
> + * cxl_reset - initiate a cxl reset
> + * @dev: device to reset
> + * @probe: if true, return 0 if device can be reset this way
> + *
> + * Initiate a cxl reset on @dev.
> + */
> +static int cxl_reset(struct pci_dev *dev, bool probe)
> +{
> +	u16 dvsec, reg;
> +	int rc;
> +
> +	dvsec = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
> +					  CXL_DVSEC_PCIE_DEVICE);
> +	if (!dvsec)
> +		return -ENOTTY;
> +
> +	/* Check if CXL Reset is supported. */
> +	rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &reg);
> +	if (rc)
> +		return -ENOTTY;
> +
> +	if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
> +		return -ENOTTY;
> +
> +	if (probe)
> +		return 0;
> +
> +	rc = cxl_reset_prepare(dev, dvsec);
> +	if (rc)
> +		return rc;
> +
> +	return cxl_reset_init(dev, dvsec);


One thing this does not cover, and I do not know if it should, is the 
fact that the device CXL regs will be reset, so the question is if the 
old values should be restored or the device/driver should go through the 
same initialization, if a hotplug device, or do it specifically if 
already present at boot time and the BIOS doing that first 
initialization. In one case the restoration needs to happen, in the 
other the old values/objects need to be removed. I think the second case 
is more problematic because this is likely involving CXL root complex 
configuration performed by the BIOS ... Not trivial at all IMO.


This is the main concern I expressed some time ago when looking at how 
type2 should support resets.


Anyway, thank you for sending this series which will foster further 
discussions about all this.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] cxl: add support for cxl reset
  2025-02-21 10:45   ` Lukas Wunner
@ 2025-02-22  0:13     ` Bjorn Helgaas
  2025-04-07 23:45       ` Srirangan Madhavan
  0 siblings, 1 reply; 10+ messages in thread
From: Bjorn Helgaas @ 2025-02-22  0:13 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams, Zhi Wang,
	Vishal Aslot, Shanker Donthineni, linux-cxl, Bjorn Helgaas,
	linux-pci

On Fri, Feb 21, 2025 at 11:45:56AM +0100, Lukas Wunner wrote:
> On Thu, Feb 20, 2025 at 08:39:06PM -0800, Srirangan Madhavan wrote:
> > Type 2 devices are being introduced and will require finer-grained
> > reset mechanisms beyond bus-wide reset methods.
> > 
> > Add support for CXL reset per CXL v3.2 Section 9.6/9.7
> > 
> > Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
> > ---
> >  drivers/pci/pci.c   | 146 ++++++++++++++++++++++++++++++++++++++++++++
> 
> drivers/pci/pci.c is basically a catch-all for anything that doesn't fit
> in one of the other .c files in drivers/pci.  I'm slightly worried that
> this (otherwise legitimate) patch increases the clutter in pci.c further,
> rendering it unmaintainable in the long term.

+1  The reset-related content in drivers/pci/pci.c has been growing
recently.  Maybe we should consider moving it all to a reset.c file.

> At the very least, I'm wondering if this can be #ifdef'ed to
> CONFIG_CXL_PCI?
> 
> One idea would be to move this newly added reset method, as well as the
> existing cxl_reset_bus_function(), to a new drivers/pci/cxl.c file.
> 
> I guess moving it to drivers/cxl/ isn't an option because cxl can be
> modular.
> 
> Another idea would be to move all the reset handling (which makes up
> a significant portion of pci.c) to a separate drivers/pci/reset.c.
> This might be beyond the scope of your patch, but in the interim,
> maybe at least an #ifdef can be added because the PCI core is also
> used e.g. on memory-constrained wifi routers which don't care about
> CXL at all.

Agree, we'll need some way to make this optional.

Bjorn

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] cxl: add support for cxl reset
  2025-02-21  4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
  2025-02-21 10:45   ` Lukas Wunner
  2025-02-21 12:43   ` Alejandro Lucero Palau
@ 2025-02-22  5:45   ` kernel test robot
  2025-02-22  7:08   ` kernel test robot
  3 siblings, 0 replies; 10+ messages in thread
From: kernel test robot @ 2025-02-22  5:45 UTC (permalink / raw)
  To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
  Cc: oe-kbuild-all, Zhi Wang, Vishal Aslot, Shanker Donthineni,
	linux-cxl, Bjorn Helgaas, linux-pci

Hi Srirangan,

kernel test robot noticed the following build warnings:

[auto build test WARNING on next-20250220]
[cannot apply to pci/next pci/for-linus linus/master v6.14-rc3 v6.14-rc2 v6.14-rc1 v6.14-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Srirangan-Madhavan/cxl-de-duplicate-cxl-DVSEC-reg-defines/20250221-124043
base:   next-20250220
patch link:    https://lore.kernel.org/r/20250221043906.1593189-3-smadhavan%40nvidia.com
patch subject: [PATCH v2 2/2] cxl: add support for cxl reset
config: arc-randconfig-001-20250222 (https://download.01.org/0day-ci/archive/20250222/202502221354.Zmyi5Mgf-lkp@intel.com/config)
compiler: arc-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250222/202502221354.Zmyi5Mgf-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502221354.Zmyi5Mgf-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/pci/pci.c: In function 'cxl_reset':
>> drivers/pci/pci.c:5258:17: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
    5258 |         if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
         |                 ^


vim +5258 drivers/pci/pci.c

  5235	
  5236	/**
  5237	 * cxl_reset - initiate a cxl reset
  5238	 * @dev: device to reset
  5239	 * @probe: if true, return 0 if device can be reset this way
  5240	 *
  5241	 * Initiate a cxl reset on @dev.
  5242	 */
  5243	static int cxl_reset(struct pci_dev *dev, bool probe)
  5244	{
  5245		u16 dvsec, reg;
  5246		int rc;
  5247	
  5248		dvsec = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
  5249						  CXL_DVSEC_PCIE_DEVICE);
  5250		if (!dvsec)
  5251			return -ENOTTY;
  5252	
  5253		/* Check if CXL Reset is supported. */
  5254		rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &reg);
  5255		if (rc)
  5256			return -ENOTTY;
  5257	
> 5258		if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
  5259			return -ENOTTY;
  5260	
  5261		if (probe)
  5262			return 0;
  5263	
  5264		rc = cxl_reset_prepare(dev, dvsec);
  5265		if (rc)
  5266			return rc;
  5267	
  5268		return cxl_reset_init(dev, dvsec);
  5269	}
  5270	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] cxl: add support for cxl reset
  2025-02-21  4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
                     ` (2 preceding siblings ...)
  2025-02-22  5:45   ` kernel test robot
@ 2025-02-22  7:08   ` kernel test robot
  3 siblings, 0 replies; 10+ messages in thread
From: kernel test robot @ 2025-02-22  7:08 UTC (permalink / raw)
  To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
  Cc: llvm, oe-kbuild-all, Zhi Wang, Vishal Aslot, Shanker Donthineni,
	linux-cxl, Bjorn Helgaas, linux-pci

Hi Srirangan,

kernel test robot noticed the following build warnings:

[auto build test WARNING on next-20250220]
[cannot apply to pci/next pci/for-linus linus/master v6.14-rc3 v6.14-rc2 v6.14-rc1 v6.14-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Srirangan-Madhavan/cxl-de-duplicate-cxl-DVSEC-reg-defines/20250221-124043
base:   next-20250220
patch link:    https://lore.kernel.org/r/20250221043906.1593189-3-smadhavan%40nvidia.com
patch subject: [PATCH v2 2/2] cxl: add support for cxl reset
config: arm64-randconfig-003-20250222 (https://download.01.org/0day-ci/archive/20250222/202502221438.j0UgOryU-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250222/202502221438.j0UgOryU-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502221438.j0UgOryU-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/pci/pci.c:5258:10: warning: & has lower precedence than ==; == will be evaluated first [-Wparentheses]
    5258 |         if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
         |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/pci/pci.c:5258:10: note: place parentheses around the '==' expression to silence this warning
    5258 |         if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
         |                 ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/pci/pci.c:5258:10: note: place parentheses around the & expression to evaluate it first
    5258 |         if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
         |             ~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
   1 warning generated.


vim +5258 drivers/pci/pci.c

  5235	
  5236	/**
  5237	 * cxl_reset - initiate a cxl reset
  5238	 * @dev: device to reset
  5239	 * @probe: if true, return 0 if device can be reset this way
  5240	 *
  5241	 * Initiate a cxl reset on @dev.
  5242	 */
  5243	static int cxl_reset(struct pci_dev *dev, bool probe)
  5244	{
  5245		u16 dvsec, reg;
  5246		int rc;
  5247	
  5248		dvsec = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
  5249						  CXL_DVSEC_PCIE_DEVICE);
  5250		if (!dvsec)
  5251			return -ENOTTY;
  5252	
  5253		/* Check if CXL Reset is supported. */
  5254		rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &reg);
  5255		if (rc)
  5256			return -ENOTTY;
  5257	
> 5258		if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
  5259			return -ENOTTY;
  5260	
  5261		if (probe)
  5262			return 0;
  5263	
  5264		rc = cxl_reset_prepare(dev, dvsec);
  5265		if (rc)
  5266			return rc;
  5267	
  5268		return cxl_reset_init(dev, dvsec);
  5269	}
  5270	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] cxl: add support for cxl reset
  2025-02-22  0:13     ` Bjorn Helgaas
@ 2025-04-07 23:45       ` Srirangan Madhavan
  2025-08-22 15:33         ` Alejandro Lucero Palau
  0 siblings, 1 reply; 10+ messages in thread
From: Srirangan Madhavan @ 2025-04-07 23:45 UTC (permalink / raw)
  To: Bjorn Helgaas, Lukas Wunner, Alejandro Lucero
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Zhi Wang, Vishal Aslot,
	Shanker Donthineni, linux-cxl@vger.kernel.org, Bjorn Helgaas,
	linux-pci@vger.kernel.org

Thank you so much for the comments Alejandro, Lukas and Bjorn.
My apologies for the late response. I was out of office for a few weeks.
I am picking up this patch again and adding my responses.

> I think cxl_reset_start after the *_prepare call makes more sense here.

@Alejandro, Just for clarity, 
do you mean I should rename the cxl_reset_init -> cxl_reset_start?

> I do not know how safe is this. IMO, this needs to be synchronized by
> the accel driver which could imply to tell user space first. In our case
> it would imply to stop rx queues in the netdev for CXL.cache, and tx
> queues for CXL.mem. Doing it unconditionally could make current CXL
> transactions to stall ... although it could be argued the reset event
> implies something is broken, but let's try to do it properly if there is
> a chance of the system not unreliable.

Regarding this, we feel that the reset framework already
has a *reset_prepare and is called by the Linux kernel, before initiating the steps
for acutal CXL reset. During this reset prepare is when the accel driver should 
quiesce its device. In this case, that would imply stopping the rx & tx and
draining the queues. Is there any particular reason this wouldn't work/be sufficient?

> One thing this does not cover, and I do not know if it should, is the
> fact that the device CXL regs will be reset, so the question is if the
> old values should be restored or the device/driver should go through the
> same initialization, if a hotplug device, or do it specifically if
> already present at boot time and the BIOS doing that first
> initialization. In one case the restoration needs to happen, in the
> other the old values/objects need to be removed. I think the second case
> is more problematic because this is likely involving CXL root complex
> configuration performed by the BIOS ... Not trivial at all IMO.

Here again, we think that the accel driver is the one that should be doing this.
In our case, when the reset done call is made, the driver before making the device
available to be reopened and used, needs to restore the config space/DVSEC etc. to
their prior state. During the reset_prepare stage these need to be saved. Since this
is how SBR is handled currently even in the PCIe world, (for ex. AER/EEH handling framework)
where driver is responsible for save and restore of the regs, and it is also not possible
for the kernel to know exactly what to save and restore for each specific type 2 devices,
it seems logical to do the same here.

> +1  The reset-related content in drivers/pci/pci.c has been growing
> recently.  Maybe we should consider moving it all to a reset.c file.

This makes sense. I'll prepare a patch to move the reset code and
compile out CXL specific parts while making any changes for the next version.

Thank you.

Regards,
Srirangan.
________________________________________
From: Bjorn Helgaas <helgaas@kernel.org>
Sent: Friday, February 21, 2025 4:13 PM
To: Lukas Wunner
Cc: Srirangan Madhavan; Davidlohr Bueso; Jonathan Cameron; Dave Jiang; Alison Schofield; Vishal Verma; Ira Weiny; Dan Williams; Zhi Wang; Vishal Aslot; Shanker Donthineni; linux-cxl@vger.kernel.org; Bjorn Helgaas; linux-pci@vger.kernel.org
Subject: Re: [PATCH v2 2/2] cxl: add support for cxl reset

External email: Use caution opening links or attachments


On Fri, Feb 21, 2025 at 11:45:56AM +0100, Lukas Wunner wrote:
> On Thu, Feb 20, 2025 at 08:39:06PM -0800, Srirangan Madhavan wrote:
> > Type 2 devices are being introduced and will require finer-grained
> > reset mechanisms beyond bus-wide reset methods.
> >
> > Add support for CXL reset per CXL v3.2 Section 9.6/9.7
> >
> > Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
> > ---
> >  drivers/pci/pci.c   | 146 ++++++++++++++++++++++++++++++++++++++++++++
>
> drivers/pci/pci.c is basically a catch-all for anything that doesn't fit
> in one of the other .c files in drivers/pci.  I'm slightly worried that
> this (otherwise legitimate) patch increases the clutter in pci.c further,
> rendering it unmaintainable in the long term.

+1  The reset-related content in drivers/pci/pci.c has been growing
recently.  Maybe we should consider moving it all to a reset.c file.

> At the very least, I'm wondering if this can be #ifdef'ed to
> CONFIG_CXL_PCI?
>
> One idea would be to move this newly added reset method, as well as the
> existing cxl_reset_bus_function(), to a new drivers/pci/cxl.c file.
>
> I guess moving it to drivers/cxl/ isn't an option because cxl can be
> modular.
>
> Another idea would be to move all the reset handling (which makes up
> a significant portion of pci.c) to a separate drivers/pci/reset.c.
> This might be beyond the scope of your patch, but in the interim,
> maybe at least an #ifdef can be added because the PCI core is also
> used e.g. on memory-constrained wifi routers which don't care about
> CXL at all.

Agree, we'll need some way to make this optional.

Bjorn

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] cxl: add support for cxl reset
  2025-04-07 23:45       ` Srirangan Madhavan
@ 2025-08-22 15:33         ` Alejandro Lucero Palau
  0 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero Palau @ 2025-08-22 15:33 UTC (permalink / raw)
  To: Srirangan Madhavan, Bjorn Helgaas, Lukas Wunner
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Zhi Wang, Vishal Aslot,
	Shanker Donthineni, linux-cxl@vger.kernel.org, Bjorn Helgaas,
	linux-pci@vger.kernel.org

Hi,


I'm sorry I missed your reply. Some comments below.


On 4/8/25 00:45, Srirangan Madhavan wrote:
> Thank you so much for the comments Alejandro, Lukas and Bjorn.
> My apologies for the late response. I was out of office for a few weeks.
> I am picking up this patch again and adding my responses.
>
>> I think cxl_reset_start after the *_prepare call makes more sense here.
> @Alejandro, Just for clarity,
> do you mean I should rename the cxl_reset_init -> cxl_reset_start?


Yes, that's my view. init and prepare are similar, so prepare or init, 
then start.


>> I do not know how safe is this. IMO, this needs to be synchronized by
>> the accel driver which could imply to tell user space first. In our case
>> it would imply to stop rx queues in the netdev for CXL.cache, and tx
>> queues for CXL.mem. Doing it unconditionally could make current CXL
>> transactions to stall ... although it could be argued the reset event
>> implies something is broken, but let's try to do it properly if there is
>> a chance of the system not unreliable.
> Regarding this, we feel that the reset framework already
> has a *reset_prepare and is called by the Linux kernel, before initiating the steps
> for acutal CXL reset. During this reset prepare is when the accel driver should
> quiesce its device. In this case, that would imply stopping the rx & tx and
> draining the queues. Is there any particular reason this wouldn't work/be sufficient?


I did not see the connection to the driver/device in the patch so I 
assumed it was not there, but after your comment, I studied the code and 
it is just a matter of implementing such reset prepare function in our 
driver, and I think that solves the problem I saw.


>> One thing this does not cover, and I do not know if it should, is the
>> fact that the device CXL regs will be reset, so the question is if the
>> old values should be restored or the device/driver should go through the
>> same initialization, if a hotplug device, or do it specifically if
>> already present at boot time and the BIOS doing that first
>> initialization. In one case the restoration needs to happen, in the
>> other the old values/objects need to be removed. I think the second case
>> is more problematic because this is likely involving CXL root complex
>> configuration performed by the BIOS ... Not trivial at all IMO.
> Here again, we think that the accel driver is the one that should be doing this.
> In our case, when the reset done call is made, the driver before making the device
> available to be reopened and used, needs to restore the config space/DVSEC etc. to
> their prior state. During the reset_prepare stage these need to be saved. Since this
> is how SBR is handled currently even in the PCIe world, (for ex. AER/EEH handling framework)
> where driver is responsible for save and restore of the regs, and it is also not possible
> for the kernel to know exactly what to save and restore for each specific type 2 devices,
> it seems logical to do the same here.


At the time I wrote that comment, it was not clear in my mind how this 
should work, and I had doubts about the BIOS.

But I can say now all this is fine and doable.


 From you comment (not in the thread but through discord) about being 
the sfc driver the client for this functionality, I think it could be. 
I'm not sure if as part of the current Type2 patchset effort or a 
follow-up work. I'll do some work the next week for seeing the implications.


Thank you


>> +1  The reset-related content in drivers/pci/pci.c has been growing
>> recently.  Maybe we should consider moving it all to a reset.c file.
> This makes sense. I'll prepare a patch to move the reset code and
> compile out CXL specific parts while making any changes for the next version.
>
> Thank you.
>
> Regards,
> Srirangan.
> ________________________________________
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: Friday, February 21, 2025 4:13 PM
> To: Lukas Wunner
> Cc: Srirangan Madhavan; Davidlohr Bueso; Jonathan Cameron; Dave Jiang; Alison Schofield; Vishal Verma; Ira Weiny; Dan Williams; Zhi Wang; Vishal Aslot; Shanker Donthineni; linux-cxl@vger.kernel.org; Bjorn Helgaas; linux-pci@vger.kernel.org
> Subject: Re: [PATCH v2 2/2] cxl: add support for cxl reset
>
> External email: Use caution opening links or attachments
>
>
> On Fri, Feb 21, 2025 at 11:45:56AM +0100, Lukas Wunner wrote:
>> On Thu, Feb 20, 2025 at 08:39:06PM -0800, Srirangan Madhavan wrote:
>>> Type 2 devices are being introduced and will require finer-grained
>>> reset mechanisms beyond bus-wide reset methods.
>>>
>>> Add support for CXL reset per CXL v3.2 Section 9.6/9.7
>>>
>>> Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
>>> ---
>>>   drivers/pci/pci.c   | 146 ++++++++++++++++++++++++++++++++++++++++++++
>> drivers/pci/pci.c is basically a catch-all for anything that doesn't fit
>> in one of the other .c files in drivers/pci.  I'm slightly worried that
>> this (otherwise legitimate) patch increases the clutter in pci.c further,
>> rendering it unmaintainable in the long term.
> +1  The reset-related content in drivers/pci/pci.c has been growing
> recently.  Maybe we should consider moving it all to a reset.c file.
>
>> At the very least, I'm wondering if this can be #ifdef'ed to
>> CONFIG_CXL_PCI?
>>
>> One idea would be to move this newly added reset method, as well as the
>> existing cxl_reset_bus_function(), to a new drivers/pci/cxl.c file.
>>
>> I guess moving it to drivers/cxl/ isn't an option because cxl can be
>> modular.
>>
>> Another idea would be to move all the reset handling (which makes up
>> a significant portion of pci.c) to a separate drivers/pci/reset.c.
>> This might be beyond the scope of your patch, but in the interim,
>> maybe at least an #ifdef can be added because the PCI core is also
>> used e.g. on memory-constrained wifi routers which don't care about
>> CXL at all.
> Agree, we'll need some way to make this optional.
>
> Bjorn

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-08-22 15:33 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-21  4:39 [PATCH v2 0/2] Add CXL Reset Support for CXL Devices Srirangan Madhavan
2025-02-21  4:39 ` [PATCH v2 1/2] cxl: de-duplicate cxl DVSEC reg defines Srirangan Madhavan
2025-02-21  4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
2025-02-21 10:45   ` Lukas Wunner
2025-02-22  0:13     ` Bjorn Helgaas
2025-04-07 23:45       ` Srirangan Madhavan
2025-08-22 15:33         ` Alejandro Lucero Palau
2025-02-21 12:43   ` Alejandro Lucero Palau
2025-02-22  5:45   ` kernel test robot
2025-02-22  7:08   ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox