* [PATCH v2 0/2] Add CXL Reset Support for CXL Devices
@ 2025-02-21 4:39 Srirangan Madhavan
2025-02-21 4:39 ` [PATCH v2 1/2] cxl: de-duplicate cxl DVSEC reg defines Srirangan Madhavan
2025-02-21 4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
0 siblings, 2 replies; 10+ messages in thread
From: Srirangan Madhavan @ 2025-02-21 4:39 UTC (permalink / raw)
To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
Cc: Zhi Wang, Vishal Aslot, Shanker Donthineni, linux-cxl,
Bjorn Helgaas, linux-pci
This patch series introduces support for the CXL Reset method
for CXL devices, implementing the reset procedure outlined in
CXL Spec [1] v3.2, Sections 9.6 and 9.7.
v2 changes:
- de-duplicate CXL DVSEC register defines under include/cxl/pci.h
- fix style related issues.
v1 changes:
- Added cover letter and dropped the RFC.
The RFC patche can be found here [2]
Motivation:
-----------
This change is broadly useful for reasons including but not limited to the
following:
- As support for Type 2 devices [3] is being introduced, more devices will
require finer-grained reset mechanisms beyond bus-wide reset methods.
- FLR does not affect CXL.cache or CXL.mem protocols, making CXL Reset
the preferred method in some cases.
- The CXL spec (Sections 7.2.3 Binding and Unbinding, 9.5 FLR) highlights use
cases like function rebinding and error recovery, where CXL Reset is
explicitly mentioned.
Change Description:
-------------------
The patch implements the required steps for CXL reset and introduces a new reset
method in the pci_reset_fn_methods. It also defines the necessary CXL DVSEC
register macros. The actual steps for rest are broadly: disable cache lines and
asserts WB+I, wait for cache and memory clear signals, initial reset, wait for
complete and return status.
Command line to test the CXL reset on a capable device:
echo cxl_reset > /sys/bus/pci/devices/<pci_device>/reset_method
echo 1 > /sys/bus/pci/devices/<pci_device>/reset
[1] https://computeexpresslink.org/cxl-specification/
[2] https://lore.kernel.org/all/20241213074143.374-1-smadhavan@nvidia.com/
[3] https://lore.kernel.org/linux-cxl/20241216161042.42108-1-alejandro.lucero-palau@amd.com/
Srirangan Madhavan (2):
cxl: de-duplicate cxl DVSEC reg defines
cxl: add support for cxl reset
drivers/cxl/core/pci.c | 1 +
drivers/cxl/core/regs.c | 1 +
drivers/cxl/cxlpci.h | 53 -----------
drivers/cxl/pci.c | 1 +
drivers/pci/pci.c | 163 ++++++++++++++++++++++++++++++++--
include/cxl/pci.h | 76 ++++++++++++++++
include/linux/pci.h | 2 +-
include/uapi/linux/pci_regs.h | 5 --
8 files changed, 235 insertions(+), 67 deletions(-)
create mode 100644 include/cxl/pci.h
--
2.25.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 1/2] cxl: de-duplicate cxl DVSEC reg defines
2025-02-21 4:39 [PATCH v2 0/2] Add CXL Reset Support for CXL Devices Srirangan Madhavan
@ 2025-02-21 4:39 ` Srirangan Madhavan
2025-02-21 4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
1 sibling, 0 replies; 10+ messages in thread
From: Srirangan Madhavan @ 2025-02-21 4:39 UTC (permalink / raw)
To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
Cc: Zhi Wang, Vishal Aslot, Shanker Donthineni, linux-cxl,
Bjorn Helgaas, linux-pci
There are two set of CXL DVSEC register defines in
uapi/linux/pci_regs.h as PCI_DVSEC_CXL* and in cxl/cxlpci.h as
CXL_DVSEC_*.
Consolidate the defines under include/cxl/ accessible by both
pci core and cxl. Also fix any references to the deprecated copy
of these defines.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
drivers/cxl/core/pci.c | 1 +
drivers/cxl/core/regs.c | 1 +
drivers/cxl/cxlpci.h | 53 ------------------------------
drivers/cxl/pci.c | 1 +
drivers/pci/pci.c | 17 +++++-----
include/cxl/pci.h | 62 +++++++++++++++++++++++++++++++++++
include/uapi/linux/pci_regs.h | 5 ---
7 files changed, 74 insertions(+), 66 deletions(-)
create mode 100644 include/cxl/pci.h
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index a5c65f79db18..afa2c85fc6fd 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -7,6 +7,7 @@
#include <linux/pci.h>
#include <linux/pci-doe.h>
#include <linux/aer.h>
+#include <cxl/pci.h>
#include <cxlpci.h>
#include <cxlmem.h>
#include <cxl.h>
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index 117c2e94c761..58a942a4946c 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -4,6 +4,7 @@
#include <linux/device.h>
#include <linux/slab.h>
#include <linux/pci.h>
+#include <cxl/pci.h>
#include <cxlmem.h>
#include <cxlpci.h>
#include <pmu.h>
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 54e219b0049e..24c06eca8a68 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -7,59 +7,6 @@
#define CXL_MEMORY_PROGIF 0x10
-/*
- * See section 8.1 Configuration Space Registers in the CXL 2.0
- * Specification. Names are taken straight from the specification with "CXL" and
- * "DVSEC" redundancies removed. When obvious, abbreviations may be used.
- */
-#define PCI_DVSEC_HEADER1_LENGTH_MASK GENMASK(31, 20)
-
-/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
-#define CXL_DVSEC_PCIE_DEVICE 0
-#define CXL_DVSEC_CAP_OFFSET 0xA
-#define CXL_DVSEC_MEM_CAPABLE BIT(2)
-#define CXL_DVSEC_HDM_COUNT_MASK GENMASK(5, 4)
-#define CXL_DVSEC_CTRL_OFFSET 0xC
-#define CXL_DVSEC_MEM_ENABLE BIT(2)
-#define CXL_DVSEC_RANGE_SIZE_HIGH(i) (0x18 + (i * 0x10))
-#define CXL_DVSEC_RANGE_SIZE_LOW(i) (0x1C + (i * 0x10))
-#define CXL_DVSEC_MEM_INFO_VALID BIT(0)
-#define CXL_DVSEC_MEM_ACTIVE BIT(1)
-#define CXL_DVSEC_MEM_SIZE_LOW_MASK GENMASK(31, 28)
-#define CXL_DVSEC_RANGE_BASE_HIGH(i) (0x20 + (i * 0x10))
-#define CXL_DVSEC_RANGE_BASE_LOW(i) (0x24 + (i * 0x10))
-#define CXL_DVSEC_MEM_BASE_LOW_MASK GENMASK(31, 28)
-
-#define CXL_DVSEC_RANGE_MAX 2
-
-/* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
-#define CXL_DVSEC_FUNCTION_MAP 2
-
-/* CXL 2.0 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
-#define CXL_DVSEC_PORT_EXTENSIONS 3
-
-/* CXL 2.0 8.1.6: GPF DVSEC for CXL Port */
-#define CXL_DVSEC_PORT_GPF 4
-#define CXL_DVSEC_PORT_GPF_PHASE_1_CONTROL_OFFSET 0x0C
-#define CXL_DVSEC_PORT_GPF_PHASE_1_TMO_BASE_MASK GENMASK(3, 0)
-#define CXL_DVSEC_PORT_GPF_PHASE_1_TMO_SCALE_MASK GENMASK(11, 8)
-#define CXL_DVSEC_PORT_GPF_PHASE_2_CONTROL_OFFSET 0xE
-#define CXL_DVSEC_PORT_GPF_PHASE_2_TMO_BASE_MASK GENMASK(3, 0)
-#define CXL_DVSEC_PORT_GPF_PHASE_2_TMO_SCALE_MASK GENMASK(11, 8)
-
-/* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
-#define CXL_DVSEC_DEVICE_GPF 5
-
-/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
-#define CXL_DVSEC_PCIE_FLEXBUS_PORT 7
-
-/* CXL 2.0 8.1.9: Register Locator DVSEC */
-#define CXL_DVSEC_REG_LOCATOR 8
-#define CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET 0xC
-#define CXL_DVSEC_REG_LOCATOR_BIR_MASK GENMASK(2, 0)
-#define CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK GENMASK(15, 8)
-#define CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK GENMASK(31, 16)
-
/*
* NOTE: Currently all the functions which are enabled for CXL require their
* vectors to be in the first 16. Use this as the default max.
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index b2c943a4de0a..1f802b7cd65d 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -12,6 +12,7 @@
#include <linux/aer.h>
#include <linux/io.h>
#include <cxl/mailbox.h>
+#include <cxl/pci.h>
#include "cxlmem.h"
#include "cxlpci.h"
#include "cxl.h"
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 869d204a70a3..3ab1871ecf8a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -30,6 +30,7 @@
#include <asm/dma.h>
#include <linux/aer.h>
#include <linux/bitfield.h>
+#include <cxl/pci.h>
#include "pci.h"
DEFINE_MUTEX(pci_slot_mutex);
@@ -5029,7 +5030,7 @@ static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
static u16 cxl_port_dvsec(struct pci_dev *dev)
{
return pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
- PCI_DVSEC_CXL_PORT);
+ CXL_DVSEC_PORT_EXTENSIONS);
}
static bool cxl_sbr_masked(struct pci_dev *dev)
@@ -5041,7 +5042,7 @@ static bool cxl_sbr_masked(struct pci_dev *dev)
if (!dvsec)
return false;
- rc = pci_read_config_word(dev, dvsec + PCI_DVSEC_CXL_PORT_CTL, ®);
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_PORT_CTL, ®);
if (rc || PCI_POSSIBLE_ERROR(reg))
return false;
@@ -5050,7 +5051,7 @@ static bool cxl_sbr_masked(struct pci_dev *dev)
* bit in Bridge Control has no effect. When 1, the Port generates
* hot reset when the SBR bit is set to 1.
*/
- if (reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR)
+ if (reg & CXL_DVSEC_UNMASK_SBR)
return false;
return true;
@@ -5095,22 +5096,22 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
if (probe)
return 0;
- rc = pci_read_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, ®);
+ rc = pci_read_config_word(bridge, dvsec + CXL_DVSEC_PORT_CTL, ®);
if (rc)
return -ENOTTY;
- if (reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR) {
+ if (reg & CXL_DVSEC_UNMASK_SBR) {
val = reg;
} else {
- val = reg | PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR;
- pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
+ val = reg | CXL_DVSEC_UNMASK_SBR;
+ pci_write_config_word(bridge, dvsec + CXL_DVSEC_PORT_CTL,
val);
}
rc = pci_reset_bus_function(dev, probe);
if (reg != val)
- pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
+ pci_write_config_word(bridge, dvsec + CXL_DVSEC_PORT_CTL,
reg);
return rc;
diff --git a/include/cxl/pci.h b/include/cxl/pci.h
new file mode 100644
index 000000000000..3977425ec477
--- /dev/null
+++ b/include/cxl/pci.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
+
+#ifndef __CXL_ACCEL_PCI_H
+#define __CXL_ACCEL_PCI_H
+
+/*
+ * See section 8.1 Configuration Space Registers in the CXL 2.0
+ * Specification. Names are taken straight from the specification with "CXL" and
+ * "DVSEC" redundancies removed. When obvious, abbreviations may be used.
+ */
+#define PCI_DVSEC_HEADER1_LENGTH_MASK GENMASK(31, 20)
+
+/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
+#define CXL_DVSEC_PCIE_DEVICE 0
+#define CXL_DVSEC_CAP_OFFSET 0xA
+#define CXL_DVSEC_MEM_CAPABLE BIT(2)
+#define CXL_DVSEC_HDM_COUNT_MASK GENMASK(5, 4)
+#define CXL_DVSEC_CTRL_OFFSET 0xC
+#define CXL_DVSEC_MEM_ENABLE BIT(2)
+#define CXL_DVSEC_RANGE_SIZE_HIGH(i) (0x18 + (i * 0x10))
+#define CXL_DVSEC_RANGE_SIZE_LOW(i) (0x1C + (i * 0x10))
+#define CXL_DVSEC_MEM_INFO_VALID BIT(0)
+#define CXL_DVSEC_MEM_ACTIVE BIT(1)
+#define CXL_DVSEC_MEM_SIZE_LOW_MASK GENMASK(31, 28)
+#define CXL_DVSEC_RANGE_BASE_HIGH(i) (0x20 + (i * 0x10))
+#define CXL_DVSEC_RANGE_BASE_LOW(i) (0x24 + (i * 0x10))
+#define CXL_DVSEC_MEM_BASE_LOW_MASK GENMASK(31, 28)
+
+#define CXL_DVSEC_RANGE_MAX 2
+
+/* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
+#define CXL_DVSEC_FUNCTION_MAP 2
+
+/* CXL 2.0 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
+#define CXL_DVSEC_PORT_EXTENSIONS 3
+#define CXL_DVSEC_PORT_CTL 0xC
+#define CXL_DVSEC_UNMASK_SBR BIT(0)
+
+/* CXL 2.0 8.1.6: GPF DVSEC for CXL Port */
+#define CXL_DVSEC_PORT_GPF 4
+#define CXL_DVSEC_PORT_GPF_PHASE_1_CONTROL_OFFSET 0x0C
+#define CXL_DVSEC_PORT_GPF_PHASE_1_TMO_BASE_MASK GENMASK(3, 0)
+#define CXL_DVSEC_PORT_GPF_PHASE_1_TMO_SCALE_MASK GENMASK(11, 8)
+#define CXL_DVSEC_PORT_GPF_PHASE_2_CONTROL_OFFSET 0xE
+#define CXL_DVSEC_PORT_GPF_PHASE_2_TMO_BASE_MASK GENMASK(3, 0)
+#define CXL_DVSEC_PORT_GPF_PHASE_2_TMO_SCALE_MASK GENMASK(11, 8)
+
+/* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
+#define CXL_DVSEC_DEVICE_GPF 5
+
+/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
+#define CXL_DVSEC_PCIE_FLEXBUS_PORT 7
+
+/* CXL 2.0 8.1.9: Register Locator DVSEC */
+#define CXL_DVSEC_REG_LOCATOR 8
+#define CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET 0xC
+#define CXL_DVSEC_REG_LOCATOR_BIR_MASK GENMASK(2, 0)
+#define CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK GENMASK(15, 8)
+#define CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK GENMASK(31, 16)
+
+#endif
\ No newline at end of file
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 3445c4970e4d..02844c20c527 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -1208,9 +1208,4 @@
#define PCI_DOE_DATA_OBJECT_DISC_RSP_3_PROTOCOL 0x00ff0000
#define PCI_DOE_DATA_OBJECT_DISC_RSP_3_NEXT_INDEX 0xff000000
-/* Compute Express Link (CXL r3.1, sec 8.1.5) */
-#define PCI_DVSEC_CXL_PORT 3
-#define PCI_DVSEC_CXL_PORT_CTL 0x0c
-#define PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR 0x00000001
-
#endif /* LINUX_PCI_REGS_H */
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v2 2/2] cxl: add support for cxl reset
2025-02-21 4:39 [PATCH v2 0/2] Add CXL Reset Support for CXL Devices Srirangan Madhavan
2025-02-21 4:39 ` [PATCH v2 1/2] cxl: de-duplicate cxl DVSEC reg defines Srirangan Madhavan
@ 2025-02-21 4:39 ` Srirangan Madhavan
2025-02-21 10:45 ` Lukas Wunner
` (3 more replies)
1 sibling, 4 replies; 10+ messages in thread
From: Srirangan Madhavan @ 2025-02-21 4:39 UTC (permalink / raw)
To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
Cc: Zhi Wang, Vishal Aslot, Shanker Donthineni, linux-cxl,
Bjorn Helgaas, linux-pci
Type 2 devices are being introduced and will require finer-grained
reset mechanisms beyond bus-wide reset methods.
Add support for CXL reset per CXL v3.2 Section 9.6/9.7
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
drivers/pci/pci.c | 146 ++++++++++++++++++++++++++++++++++++++++++++
include/cxl/pci.h | 40 ++++++++----
include/linux/pci.h | 2 +-
3 files changed, 174 insertions(+), 14 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 3ab1871ecf8a..9108daae252b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5117,6 +5117,151 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
return rc;
}
+static int cxl_reset_prepare(struct pci_dev *dev, u16 dvsec)
+{
+ u32 timeout_us = 100, timeout_tot_us = 10000;
+ u16 reg, cap;
+ int rc;
+
+ if (!pci_wait_for_pending_transaction(dev))
+ pci_err(dev, "timed out waiting for pending transaction; performing cxl reset anyway\n");
+
+ /* Check if the device is cache capable. */
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &cap);
+ if (rc)
+ return rc;
+
+ if (!(cap & CXL_DVSEC_CACHE_CAPABLE))
+ return 0;
+
+ /* Disable cache. WB and invalidate cache if capability is advertised */
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, ®);
+ if (rc)
+ return rc;
+ reg |= CXL_DVSEC_DISABLE_CACHING;
+ /*
+ * DEVCTL2 bits are written only once. So check WB+I capability while
+ * keeping disable caching set.
+ */
+ if (cap & CXL_DVSEC_CACHE_WBI_CAPABLE)
+ reg |= CXL_DVSEC_INIT_CACHE_WBI;
+ pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
+
+ /*
+ * From Section 9.6: "Software may leverage the cache size reported in
+ * the DVSEC CXL Capability2 register to compute a suitable timeout
+ * value".
+ * Given there is no conversion factor for cache size -> timeout,
+ * setting timer for default 10ms.
+ */
+ do {
+ if (timeout_tot_us == 0)
+ return -ETIMEDOUT;
+ usleep_range(timeout_us, timeout_us + 1);
+ timeout_tot_us -= timeout_us;
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET,
+ ®);
+ if (rc)
+ return rc;
+ } while (!(reg & CXL_DVSEC_CACHE_INVALID));
+
+ return 0;
+}
+
+static int cxl_reset_init(struct pci_dev *dev, u16 dvsec)
+{
+ /*
+ * Timeout values ref CXL Spec v3.2 Ch 8 Control and Status Registers,
+ * under section 8.1.3.1 DVSEC CXL Capability.
+ */
+ u32 reset_timeouts_ms[] = { 10, 100, 1000, 10000, 100000 };
+ u16 reg;
+ u32 timeout_ms;
+ int rc, ind;
+
+ /* Check if CXL Reset MEM CLR is supported. */
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, ®);
+ if (rc)
+ return rc;
+
+ if (reg & CXL_DVSEC_CXL_RST_MEM_CLR_CAPABLE) {
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET,
+ ®);
+ if (rc)
+ return rc;
+
+ reg |= CXL_DVSEC_CXL_RST_MEM_CLR_ENABLE;
+ pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
+ }
+
+ /* Read timeout value. */
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, ®);
+ if (rc)
+ return rc;
+ ind = FIELD_GET(CXL_DVSEC_CXL_RST_TIMEOUT_MASK, reg);
+ timeout_ms = reset_timeouts_ms[ind];
+
+ /* Write reset config. */
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, ®);
+ if (rc)
+ return rc;
+
+ reg |= CXL_DVSEC_INIT_CXL_RESET;
+ pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
+
+ /* Wait till timeout and then check reset status is complete. */
+ msleep(timeout_ms);
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_STATUS2_OFFSET, ®);
+ if (rc)
+ return rc;
+ if (reg & CXL_DVSEC_CXL_RESET_ERR ||
+ ~reg & CXL_DVSEC_CXL_RST_COMPLETE)
+ return -ETIMEDOUT;
+
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, ®);
+ if (rc)
+ return rc;
+ reg &= (~CXL_DVSEC_DISABLE_CACHING);
+ pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
+
+ return 0;
+}
+
+/**
+ * cxl_reset - initiate a cxl reset
+ * @dev: device to reset
+ * @probe: if true, return 0 if device can be reset this way
+ *
+ * Initiate a cxl reset on @dev.
+ */
+static int cxl_reset(struct pci_dev *dev, bool probe)
+{
+ u16 dvsec, reg;
+ int rc;
+
+ dvsec = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
+ CXL_DVSEC_PCIE_DEVICE);
+ if (!dvsec)
+ return -ENOTTY;
+
+ /* Check if CXL Reset is supported. */
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, ®);
+ if (rc)
+ return -ENOTTY;
+
+ if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
+ return -ENOTTY;
+
+ if (probe)
+ return 0;
+
+ rc = cxl_reset_prepare(dev, dvsec);
+ if (rc)
+ return rc;
+
+ return cxl_reset_init(dev, dvsec);
+}
+
void pci_dev_lock(struct pci_dev *dev)
{
/* block PM suspend, driver probe, etc. */
@@ -5203,6 +5348,7 @@ const struct pci_reset_fn_method pci_reset_fn_methods[] = {
{ pci_dev_acpi_reset, .name = "acpi" },
{ pcie_reset_flr, .name = "flr" },
{ pci_af_flr, .name = "af_flr" },
+ { cxl_reset, .name = "cxl_reset" },
{ pci_pm_reset, .name = "pm" },
{ pci_reset_bus_function, .name = "bus" },
{ cxl_reset_bus_function, .name = "cxl_bus" },
diff --git a/include/cxl/pci.h b/include/cxl/pci.h
index 3977425ec477..05d4b1a63cfe 100644
--- a/include/cxl/pci.h
+++ b/include/cxl/pci.h
@@ -13,19 +13,33 @@
/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
#define CXL_DVSEC_PCIE_DEVICE 0
-#define CXL_DVSEC_CAP_OFFSET 0xA
-#define CXL_DVSEC_MEM_CAPABLE BIT(2)
-#define CXL_DVSEC_HDM_COUNT_MASK GENMASK(5, 4)
-#define CXL_DVSEC_CTRL_OFFSET 0xC
-#define CXL_DVSEC_MEM_ENABLE BIT(2)
-#define CXL_DVSEC_RANGE_SIZE_HIGH(i) (0x18 + (i * 0x10))
-#define CXL_DVSEC_RANGE_SIZE_LOW(i) (0x1C + (i * 0x10))
-#define CXL_DVSEC_MEM_INFO_VALID BIT(0)
-#define CXL_DVSEC_MEM_ACTIVE BIT(1)
-#define CXL_DVSEC_MEM_SIZE_LOW_MASK GENMASK(31, 28)
-#define CXL_DVSEC_RANGE_BASE_HIGH(i) (0x20 + (i * 0x10))
-#define CXL_DVSEC_RANGE_BASE_LOW(i) (0x24 + (i * 0x10))
-#define CXL_DVSEC_MEM_BASE_LOW_MASK GENMASK(31, 28)
+#define CXL_DVSEC_CAP_OFFSET 0xA
+#define CXL_DVSEC_CACHE_CAPABLE BIT(0)
+#define CXL_DVSEC_MEM_CAPABLE BIT(2)
+#define CXL_DVSEC_HDM_COUNT_MASK GENMASK(5, 4)
+#define CXL_DVSEC_CACHE_WBI_CAPABLE BIT(6)
+#define CXL_DVSEC_CXL_RST_CAPABLE BIT(7)
+#define CXL_DVSEC_CXL_RST_TIMEOUT_MASK GENMASK(10, 8)
+#define CXL_DVSEC_CXL_RST_MEM_CLR_CAPABLE BIT(11)
+#define CXL_DVSEC_CTRL_OFFSET 0xC
+#define CXL_DVSEC_MEM_ENABLE BIT(2)
+#define CXL_DVSEC_CTRL2_OFFSET 0x10
+#define CXL_DVSEC_DISABLE_CACHING BIT(0)
+#define CXL_DVSEC_INIT_CACHE_WBI BIT(1)
+#define CXL_DVSEC_INIT_CXL_RESET BIT(2)
+#define CXL_DVSEC_CXL_RST_MEM_CLR_ENABLE BIT(3)
+#define CXL_DVSEC_STATUS2_OFFSET 0x12
+#define CXL_DVSEC_CACHE_INVALID BIT(0)
+#define CXL_DVSEC_CXL_RST_COMPLETE BIT(1)
+#define CXL_DVSEC_CXL_RESET_ERR BIT(2)
+#define CXL_DVSEC_RANGE_SIZE_HIGH(i) (0x18 + ((i) * 0x10))
+#define CXL_DVSEC_RANGE_SIZE_LOW(i) (0x1C + ((i) * 0x10))
+#define CXL_DVSEC_MEM_INFO_VALID BIT(0)
+#define CXL_DVSEC_MEM_ACTIVE BIT(1)
+#define CXL_DVSEC_MEM_SIZE_LOW_MASK GENMASK(31, 28)
+#define CXL_DVSEC_RANGE_BASE_HIGH(i) (0x20 + ((i) * 0x10))
+#define CXL_DVSEC_RANGE_BASE_LOW(i) (0x24 + ((i) * 0x10))
+#define CXL_DVSEC_MEM_BASE_LOW_MASK GENMASK(31, 28)
#define CXL_DVSEC_RANGE_MAX 2
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 47b31ad724fa..efcb06598f26 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -51,7 +51,7 @@
PCI_STATUS_PARITY)
/* Number of reset methods used in pci_reset_fn_methods array in pci.c */
-#define PCI_NUM_RESET_METHODS 8
+#define PCI_NUM_RESET_METHODS 9
#define PCI_RESET_PROBE true
#define PCI_RESET_DO_RESET false
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/2] cxl: add support for cxl reset
2025-02-21 4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
@ 2025-02-21 10:45 ` Lukas Wunner
2025-02-22 0:13 ` Bjorn Helgaas
2025-02-21 12:43 ` Alejandro Lucero Palau
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Lukas Wunner @ 2025-02-21 10:45 UTC (permalink / raw)
To: Srirangan Madhavan
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Zhi Wang, Vishal Aslot,
Shanker Donthineni, linux-cxl, Bjorn Helgaas, linux-pci
On Thu, Feb 20, 2025 at 08:39:06PM -0800, Srirangan Madhavan wrote:
> Type 2 devices are being introduced and will require finer-grained
> reset mechanisms beyond bus-wide reset methods.
>
> Add support for CXL reset per CXL v3.2 Section 9.6/9.7
>
> Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
> ---
> drivers/pci/pci.c | 146 ++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci.c is basically a catch-all for anything that doesn't fit
in one of the other .c files in drivers/pci. I'm slightly worried that
this (otherwise legitimate) patch increases the clutter in pci.c further,
rendering it unmaintainable in the long term.
At the very least, I'm wondering if this can be #ifdef'ed to
CONFIG_CXL_PCI?
One idea would be to move this newly added reset method, as well as the
existing cxl_reset_bus_function(), to a new drivers/pci/cxl.c file.
I guess moving it to drivers/cxl/ isn't an option because cxl can be
modular.
Another idea would be to move all the reset handling (which makes up
a significant portion of pci.c) to a separate drivers/pci/reset.c.
This might be beyond the scope of your patch, but in the interim,
maybe at least an #ifdef can be added because the PCI core is also
used e.g. on memory-constrained wifi routers which don't care about
CXL at all.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/2] cxl: add support for cxl reset
2025-02-21 4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
2025-02-21 10:45 ` Lukas Wunner
@ 2025-02-21 12:43 ` Alejandro Lucero Palau
2025-02-22 5:45 ` kernel test robot
2025-02-22 7:08 ` kernel test robot
3 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero Palau @ 2025-02-21 12:43 UTC (permalink / raw)
To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
Cc: Zhi Wang, Vishal Aslot, Shanker Donthineni, linux-cxl,
Bjorn Helgaas, linux-pci
On 2/21/25 04:39, Srirangan Madhavan wrote:
> Type 2 devices are being introduced and will require finer-grained
> reset mechanisms beyond bus-wide reset methods.
>
> Add support for CXL reset per CXL v3.2 Section 9.6/9.7
Hi,
This seems too early as there is no plans for supporting CXL.cache yet.
It is the second part of the current ongoing type2 support though.
I guess starting the discussion about how to proceed is not a problem,
so my comments below, but my first comment is if the decisions about
what to do should be generic. I think having some helpers for accel
drivers will be good, but the invocation should be in the hands of the
accel driver.
> Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
> ---
> drivers/pci/pci.c | 146 ++++++++++++++++++++++++++++++++++++++++++++
> include/cxl/pci.h | 40 ++++++++----
> include/linux/pci.h | 2 +-
> 3 files changed, 174 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 3ab1871ecf8a..9108daae252b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5117,6 +5117,151 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
> return rc;
> }
>
> +static int cxl_reset_prepare(struct pci_dev *dev, u16 dvsec)
> +{
> + u32 timeout_us = 100, timeout_tot_us = 10000;
> + u16 reg, cap;
> + int rc;
> +
> + if (!pci_wait_for_pending_transaction(dev))
> + pci_err(dev, "timed out waiting for pending transaction; performing cxl reset anyway\n");
> +
> + /* Check if the device is cache capable. */
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, &cap);
> + if (rc)
> + return rc;
> +
> + if (!(cap & CXL_DVSEC_CACHE_CAPABLE))
> + return 0;
> +
> + /* Disable cache. WB and invalidate cache if capability is advertised */
I do not know how safe is this. IMO, this needs to be synchronized by
the accel driver which could imply to tell user space first. In our case
it would imply to stop rx queues in the netdev for CXL.cache, and tx
queues for CXL.mem. Doing it unconditionally could make current CXL
transactions to stall ... although it could be argued the reset event
implies something is broken, but let's try to do it properly if there is
a chance of the system not unreliable.
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, ®);
> + if (rc)
> + return rc;
> + reg |= CXL_DVSEC_DISABLE_CACHING;
> + /*
> + * DEVCTL2 bits are written only once. So check WB+I capability while
> + * keeping disable caching set.
> + */
> + if (cap & CXL_DVSEC_CACHE_WBI_CAPABLE)
> + reg |= CXL_DVSEC_INIT_CACHE_WBI;
> + pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
> +
> + /*
> + * From Section 9.6: "Software may leverage the cache size reported in
> + * the DVSEC CXL Capability2 register to compute a suitable timeout
> + * value".
> + * Given there is no conversion factor for cache size -> timeout,
> + * setting timer for default 10ms.
> + */
> + do {
> + if (timeout_tot_us == 0)
> + return -ETIMEDOUT;
> + usleep_range(timeout_us, timeout_us + 1);
> + timeout_tot_us -= timeout_us;
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET,
> + ®);
> + if (rc)
> + return rc;
> + } while (!(reg & CXL_DVSEC_CACHE_INVALID));
> +
> + return 0;
> +}
> +
> +static int cxl_reset_init(struct pci_dev *dev, u16 dvsec)
I think cxl_reset_start after the *_prepare call makes more sense here.
> +{
> + /*
> + * Timeout values ref CXL Spec v3.2 Ch 8 Control and Status Registers,
> + * under section 8.1.3.1 DVSEC CXL Capability.
> + */
> + u32 reset_timeouts_ms[] = { 10, 100, 1000, 10000, 100000 };
> + u16 reg;
> + u32 timeout_ms;
> + int rc, ind;
> +
> + /* Check if CXL Reset MEM CLR is supported. */
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, ®);
> + if (rc)
> + return rc;
> +
> + if (reg & CXL_DVSEC_CXL_RST_MEM_CLR_CAPABLE) {
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET,
> + ®);
> + if (rc)
> + return rc;
> +
> + reg |= CXL_DVSEC_CXL_RST_MEM_CLR_ENABLE;
> + pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
> + }
> +
> + /* Read timeout value. */
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, ®);
> + if (rc)
> + return rc;
> + ind = FIELD_GET(CXL_DVSEC_CXL_RST_TIMEOUT_MASK, reg);
> + timeout_ms = reset_timeouts_ms[ind];
> +
> + /* Write reset config. */
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, ®);
> + if (rc)
> + return rc;
> +
> + reg |= CXL_DVSEC_INIT_CXL_RESET;
> + pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
> +
> + /* Wait till timeout and then check reset status is complete. */
> + msleep(timeout_ms);
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_STATUS2_OFFSET, ®);
> + if (rc)
> + return rc;
> + if (reg & CXL_DVSEC_CXL_RESET_ERR ||
> + ~reg & CXL_DVSEC_CXL_RST_COMPLETE)
> + return -ETIMEDOUT;
> +
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, ®);
> + if (rc)
> + return rc;
> + reg &= (~CXL_DVSEC_DISABLE_CACHING);
> + pci_write_config_word(dev, dvsec + CXL_DVSEC_CTRL2_OFFSET, reg);
> +
> + return 0;
> +}
> +
> +/**
> + * cxl_reset - initiate a cxl reset
> + * @dev: device to reset
> + * @probe: if true, return 0 if device can be reset this way
> + *
> + * Initiate a cxl reset on @dev.
> + */
> +static int cxl_reset(struct pci_dev *dev, bool probe)
> +{
> + u16 dvsec, reg;
> + int rc;
> +
> + dvsec = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
> + CXL_DVSEC_PCIE_DEVICE);
> + if (!dvsec)
> + return -ENOTTY;
> +
> + /* Check if CXL Reset is supported. */
> + rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, ®);
> + if (rc)
> + return -ENOTTY;
> +
> + if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
> + return -ENOTTY;
> +
> + if (probe)
> + return 0;
> +
> + rc = cxl_reset_prepare(dev, dvsec);
> + if (rc)
> + return rc;
> +
> + return cxl_reset_init(dev, dvsec);
One thing this does not cover, and I do not know if it should, is the
fact that the device CXL regs will be reset, so the question is if the
old values should be restored or the device/driver should go through the
same initialization, if a hotplug device, or do it specifically if
already present at boot time and the BIOS doing that first
initialization. In one case the restoration needs to happen, in the
other the old values/objects need to be removed. I think the second case
is more problematic because this is likely involving CXL root complex
configuration performed by the BIOS ... Not trivial at all IMO.
This is the main concern I expressed some time ago when looking at how
type2 should support resets.
Anyway, thank you for sending this series which will foster further
discussions about all this.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/2] cxl: add support for cxl reset
2025-02-21 10:45 ` Lukas Wunner
@ 2025-02-22 0:13 ` Bjorn Helgaas
2025-04-07 23:45 ` Srirangan Madhavan
0 siblings, 1 reply; 10+ messages in thread
From: Bjorn Helgaas @ 2025-02-22 0:13 UTC (permalink / raw)
To: Lukas Wunner
Cc: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams, Zhi Wang,
Vishal Aslot, Shanker Donthineni, linux-cxl, Bjorn Helgaas,
linux-pci
On Fri, Feb 21, 2025 at 11:45:56AM +0100, Lukas Wunner wrote:
> On Thu, Feb 20, 2025 at 08:39:06PM -0800, Srirangan Madhavan wrote:
> > Type 2 devices are being introduced and will require finer-grained
> > reset mechanisms beyond bus-wide reset methods.
> >
> > Add support for CXL reset per CXL v3.2 Section 9.6/9.7
> >
> > Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
> > ---
> > drivers/pci/pci.c | 146 ++++++++++++++++++++++++++++++++++++++++++++
>
> drivers/pci/pci.c is basically a catch-all for anything that doesn't fit
> in one of the other .c files in drivers/pci. I'm slightly worried that
> this (otherwise legitimate) patch increases the clutter in pci.c further,
> rendering it unmaintainable in the long term.
+1 The reset-related content in drivers/pci/pci.c has been growing
recently. Maybe we should consider moving it all to a reset.c file.
> At the very least, I'm wondering if this can be #ifdef'ed to
> CONFIG_CXL_PCI?
>
> One idea would be to move this newly added reset method, as well as the
> existing cxl_reset_bus_function(), to a new drivers/pci/cxl.c file.
>
> I guess moving it to drivers/cxl/ isn't an option because cxl can be
> modular.
>
> Another idea would be to move all the reset handling (which makes up
> a significant portion of pci.c) to a separate drivers/pci/reset.c.
> This might be beyond the scope of your patch, but in the interim,
> maybe at least an #ifdef can be added because the PCI core is also
> used e.g. on memory-constrained wifi routers which don't care about
> CXL at all.
Agree, we'll need some way to make this optional.
Bjorn
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/2] cxl: add support for cxl reset
2025-02-21 4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
2025-02-21 10:45 ` Lukas Wunner
2025-02-21 12:43 ` Alejandro Lucero Palau
@ 2025-02-22 5:45 ` kernel test robot
2025-02-22 7:08 ` kernel test robot
3 siblings, 0 replies; 10+ messages in thread
From: kernel test robot @ 2025-02-22 5:45 UTC (permalink / raw)
To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
Cc: oe-kbuild-all, Zhi Wang, Vishal Aslot, Shanker Donthineni,
linux-cxl, Bjorn Helgaas, linux-pci
Hi Srirangan,
kernel test robot noticed the following build warnings:
[auto build test WARNING on next-20250220]
[cannot apply to pci/next pci/for-linus linus/master v6.14-rc3 v6.14-rc2 v6.14-rc1 v6.14-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Srirangan-Madhavan/cxl-de-duplicate-cxl-DVSEC-reg-defines/20250221-124043
base: next-20250220
patch link: https://lore.kernel.org/r/20250221043906.1593189-3-smadhavan%40nvidia.com
patch subject: [PATCH v2 2/2] cxl: add support for cxl reset
config: arc-randconfig-001-20250222 (https://download.01.org/0day-ci/archive/20250222/202502221354.Zmyi5Mgf-lkp@intel.com/config)
compiler: arc-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250222/202502221354.Zmyi5Mgf-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502221354.Zmyi5Mgf-lkp@intel.com/
All warnings (new ones prefixed by >>):
drivers/pci/pci.c: In function 'cxl_reset':
>> drivers/pci/pci.c:5258:17: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
5258 | if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
| ^
vim +5258 drivers/pci/pci.c
5235
5236 /**
5237 * cxl_reset - initiate a cxl reset
5238 * @dev: device to reset
5239 * @probe: if true, return 0 if device can be reset this way
5240 *
5241 * Initiate a cxl reset on @dev.
5242 */
5243 static int cxl_reset(struct pci_dev *dev, bool probe)
5244 {
5245 u16 dvsec, reg;
5246 int rc;
5247
5248 dvsec = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
5249 CXL_DVSEC_PCIE_DEVICE);
5250 if (!dvsec)
5251 return -ENOTTY;
5252
5253 /* Check if CXL Reset is supported. */
5254 rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, ®);
5255 if (rc)
5256 return -ENOTTY;
5257
> 5258 if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
5259 return -ENOTTY;
5260
5261 if (probe)
5262 return 0;
5263
5264 rc = cxl_reset_prepare(dev, dvsec);
5265 if (rc)
5266 return rc;
5267
5268 return cxl_reset_init(dev, dvsec);
5269 }
5270
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/2] cxl: add support for cxl reset
2025-02-21 4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
` (2 preceding siblings ...)
2025-02-22 5:45 ` kernel test robot
@ 2025-02-22 7:08 ` kernel test robot
3 siblings, 0 replies; 10+ messages in thread
From: kernel test robot @ 2025-02-22 7:08 UTC (permalink / raw)
To: Srirangan Madhavan, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
Cc: llvm, oe-kbuild-all, Zhi Wang, Vishal Aslot, Shanker Donthineni,
linux-cxl, Bjorn Helgaas, linux-pci
Hi Srirangan,
kernel test robot noticed the following build warnings:
[auto build test WARNING on next-20250220]
[cannot apply to pci/next pci/for-linus linus/master v6.14-rc3 v6.14-rc2 v6.14-rc1 v6.14-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Srirangan-Madhavan/cxl-de-duplicate-cxl-DVSEC-reg-defines/20250221-124043
base: next-20250220
patch link: https://lore.kernel.org/r/20250221043906.1593189-3-smadhavan%40nvidia.com
patch subject: [PATCH v2 2/2] cxl: add support for cxl reset
config: arm64-randconfig-003-20250222 (https://download.01.org/0day-ci/archive/20250222/202502221438.j0UgOryU-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250222/202502221438.j0UgOryU-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502221438.j0UgOryU-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/pci/pci.c:5258:10: warning: & has lower precedence than ==; == will be evaluated first [-Wparentheses]
5258 | if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/pci/pci.c:5258:10: note: place parentheses around the '==' expression to silence this warning
5258 | if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/pci/pci.c:5258:10: note: place parentheses around the & expression to evaluate it first
5258 | if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
| ~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
vim +5258 drivers/pci/pci.c
5235
5236 /**
5237 * cxl_reset - initiate a cxl reset
5238 * @dev: device to reset
5239 * @probe: if true, return 0 if device can be reset this way
5240 *
5241 * Initiate a cxl reset on @dev.
5242 */
5243 static int cxl_reset(struct pci_dev *dev, bool probe)
5244 {
5245 u16 dvsec, reg;
5246 int rc;
5247
5248 dvsec = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
5249 CXL_DVSEC_PCIE_DEVICE);
5250 if (!dvsec)
5251 return -ENOTTY;
5252
5253 /* Check if CXL Reset is supported. */
5254 rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_CAP_OFFSET, ®);
5255 if (rc)
5256 return -ENOTTY;
5257
> 5258 if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
5259 return -ENOTTY;
5260
5261 if (probe)
5262 return 0;
5263
5264 rc = cxl_reset_prepare(dev, dvsec);
5265 if (rc)
5266 return rc;
5267
5268 return cxl_reset_init(dev, dvsec);
5269 }
5270
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/2] cxl: add support for cxl reset
2025-02-22 0:13 ` Bjorn Helgaas
@ 2025-04-07 23:45 ` Srirangan Madhavan
2025-08-22 15:33 ` Alejandro Lucero Palau
0 siblings, 1 reply; 10+ messages in thread
From: Srirangan Madhavan @ 2025-04-07 23:45 UTC (permalink / raw)
To: Bjorn Helgaas, Lukas Wunner, Alejandro Lucero
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Zhi Wang, Vishal Aslot,
Shanker Donthineni, linux-cxl@vger.kernel.org, Bjorn Helgaas,
linux-pci@vger.kernel.org
Thank you so much for the comments Alejandro, Lukas and Bjorn.
My apologies for the late response. I was out of office for a few weeks.
I am picking up this patch again and adding my responses.
> I think cxl_reset_start after the *_prepare call makes more sense here.
@Alejandro, Just for clarity,
do you mean I should rename the cxl_reset_init -> cxl_reset_start?
> I do not know how safe is this. IMO, this needs to be synchronized by
> the accel driver which could imply to tell user space first. In our case
> it would imply to stop rx queues in the netdev for CXL.cache, and tx
> queues for CXL.mem. Doing it unconditionally could make current CXL
> transactions to stall ... although it could be argued the reset event
> implies something is broken, but let's try to do it properly if there is
> a chance of the system not unreliable.
Regarding this, we feel that the reset framework already
has a *reset_prepare and is called by the Linux kernel, before initiating the steps
for acutal CXL reset. During this reset prepare is when the accel driver should
quiesce its device. In this case, that would imply stopping the rx & tx and
draining the queues. Is there any particular reason this wouldn't work/be sufficient?
> One thing this does not cover, and I do not know if it should, is the
> fact that the device CXL regs will be reset, so the question is if the
> old values should be restored or the device/driver should go through the
> same initialization, if a hotplug device, or do it specifically if
> already present at boot time and the BIOS doing that first
> initialization. In one case the restoration needs to happen, in the
> other the old values/objects need to be removed. I think the second case
> is more problematic because this is likely involving CXL root complex
> configuration performed by the BIOS ... Not trivial at all IMO.
Here again, we think that the accel driver is the one that should be doing this.
In our case, when the reset done call is made, the driver before making the device
available to be reopened and used, needs to restore the config space/DVSEC etc. to
their prior state. During the reset_prepare stage these need to be saved. Since this
is how SBR is handled currently even in the PCIe world, (for ex. AER/EEH handling framework)
where driver is responsible for save and restore of the regs, and it is also not possible
for the kernel to know exactly what to save and restore for each specific type 2 devices,
it seems logical to do the same here.
> +1 The reset-related content in drivers/pci/pci.c has been growing
> recently. Maybe we should consider moving it all to a reset.c file.
This makes sense. I'll prepare a patch to move the reset code and
compile out CXL specific parts while making any changes for the next version.
Thank you.
Regards,
Srirangan.
________________________________________
From: Bjorn Helgaas <helgaas@kernel.org>
Sent: Friday, February 21, 2025 4:13 PM
To: Lukas Wunner
Cc: Srirangan Madhavan; Davidlohr Bueso; Jonathan Cameron; Dave Jiang; Alison Schofield; Vishal Verma; Ira Weiny; Dan Williams; Zhi Wang; Vishal Aslot; Shanker Donthineni; linux-cxl@vger.kernel.org; Bjorn Helgaas; linux-pci@vger.kernel.org
Subject: Re: [PATCH v2 2/2] cxl: add support for cxl reset
External email: Use caution opening links or attachments
On Fri, Feb 21, 2025 at 11:45:56AM +0100, Lukas Wunner wrote:
> On Thu, Feb 20, 2025 at 08:39:06PM -0800, Srirangan Madhavan wrote:
> > Type 2 devices are being introduced and will require finer-grained
> > reset mechanisms beyond bus-wide reset methods.
> >
> > Add support for CXL reset per CXL v3.2 Section 9.6/9.7
> >
> > Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
> > ---
> > drivers/pci/pci.c | 146 ++++++++++++++++++++++++++++++++++++++++++++
>
> drivers/pci/pci.c is basically a catch-all for anything that doesn't fit
> in one of the other .c files in drivers/pci. I'm slightly worried that
> this (otherwise legitimate) patch increases the clutter in pci.c further,
> rendering it unmaintainable in the long term.
+1 The reset-related content in drivers/pci/pci.c has been growing
recently. Maybe we should consider moving it all to a reset.c file.
> At the very least, I'm wondering if this can be #ifdef'ed to
> CONFIG_CXL_PCI?
>
> One idea would be to move this newly added reset method, as well as the
> existing cxl_reset_bus_function(), to a new drivers/pci/cxl.c file.
>
> I guess moving it to drivers/cxl/ isn't an option because cxl can be
> modular.
>
> Another idea would be to move all the reset handling (which makes up
> a significant portion of pci.c) to a separate drivers/pci/reset.c.
> This might be beyond the scope of your patch, but in the interim,
> maybe at least an #ifdef can be added because the PCI core is also
> used e.g. on memory-constrained wifi routers which don't care about
> CXL at all.
Agree, we'll need some way to make this optional.
Bjorn
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/2] cxl: add support for cxl reset
2025-04-07 23:45 ` Srirangan Madhavan
@ 2025-08-22 15:33 ` Alejandro Lucero Palau
0 siblings, 0 replies; 10+ messages in thread
From: Alejandro Lucero Palau @ 2025-08-22 15:33 UTC (permalink / raw)
To: Srirangan Madhavan, Bjorn Helgaas, Lukas Wunner
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Zhi Wang, Vishal Aslot,
Shanker Donthineni, linux-cxl@vger.kernel.org, Bjorn Helgaas,
linux-pci@vger.kernel.org
Hi,
I'm sorry I missed your reply. Some comments below.
On 4/8/25 00:45, Srirangan Madhavan wrote:
> Thank you so much for the comments Alejandro, Lukas and Bjorn.
> My apologies for the late response. I was out of office for a few weeks.
> I am picking up this patch again and adding my responses.
>
>> I think cxl_reset_start after the *_prepare call makes more sense here.
> @Alejandro, Just for clarity,
> do you mean I should rename the cxl_reset_init -> cxl_reset_start?
Yes, that's my view. init and prepare are similar, so prepare or init,
then start.
>> I do not know how safe is this. IMO, this needs to be synchronized by
>> the accel driver which could imply to tell user space first. In our case
>> it would imply to stop rx queues in the netdev for CXL.cache, and tx
>> queues for CXL.mem. Doing it unconditionally could make current CXL
>> transactions to stall ... although it could be argued the reset event
>> implies something is broken, but let's try to do it properly if there is
>> a chance of the system not unreliable.
> Regarding this, we feel that the reset framework already
> has a *reset_prepare and is called by the Linux kernel, before initiating the steps
> for acutal CXL reset. During this reset prepare is when the accel driver should
> quiesce its device. In this case, that would imply stopping the rx & tx and
> draining the queues. Is there any particular reason this wouldn't work/be sufficient?
I did not see the connection to the driver/device in the patch so I
assumed it was not there, but after your comment, I studied the code and
it is just a matter of implementing such reset prepare function in our
driver, and I think that solves the problem I saw.
>> One thing this does not cover, and I do not know if it should, is the
>> fact that the device CXL regs will be reset, so the question is if the
>> old values should be restored or the device/driver should go through the
>> same initialization, if a hotplug device, or do it specifically if
>> already present at boot time and the BIOS doing that first
>> initialization. In one case the restoration needs to happen, in the
>> other the old values/objects need to be removed. I think the second case
>> is more problematic because this is likely involving CXL root complex
>> configuration performed by the BIOS ... Not trivial at all IMO.
> Here again, we think that the accel driver is the one that should be doing this.
> In our case, when the reset done call is made, the driver before making the device
> available to be reopened and used, needs to restore the config space/DVSEC etc. to
> their prior state. During the reset_prepare stage these need to be saved. Since this
> is how SBR is handled currently even in the PCIe world, (for ex. AER/EEH handling framework)
> where driver is responsible for save and restore of the regs, and it is also not possible
> for the kernel to know exactly what to save and restore for each specific type 2 devices,
> it seems logical to do the same here.
At the time I wrote that comment, it was not clear in my mind how this
should work, and I had doubts about the BIOS.
But I can say now all this is fine and doable.
From you comment (not in the thread but through discord) about being
the sfc driver the client for this functionality, I think it could be.
I'm not sure if as part of the current Type2 patchset effort or a
follow-up work. I'll do some work the next week for seeing the implications.
Thank you
>> +1 The reset-related content in drivers/pci/pci.c has been growing
>> recently. Maybe we should consider moving it all to a reset.c file.
> This makes sense. I'll prepare a patch to move the reset code and
> compile out CXL specific parts while making any changes for the next version.
>
> Thank you.
>
> Regards,
> Srirangan.
> ________________________________________
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: Friday, February 21, 2025 4:13 PM
> To: Lukas Wunner
> Cc: Srirangan Madhavan; Davidlohr Bueso; Jonathan Cameron; Dave Jiang; Alison Schofield; Vishal Verma; Ira Weiny; Dan Williams; Zhi Wang; Vishal Aslot; Shanker Donthineni; linux-cxl@vger.kernel.org; Bjorn Helgaas; linux-pci@vger.kernel.org
> Subject: Re: [PATCH v2 2/2] cxl: add support for cxl reset
>
> External email: Use caution opening links or attachments
>
>
> On Fri, Feb 21, 2025 at 11:45:56AM +0100, Lukas Wunner wrote:
>> On Thu, Feb 20, 2025 at 08:39:06PM -0800, Srirangan Madhavan wrote:
>>> Type 2 devices are being introduced and will require finer-grained
>>> reset mechanisms beyond bus-wide reset methods.
>>>
>>> Add support for CXL reset per CXL v3.2 Section 9.6/9.7
>>>
>>> Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
>>> ---
>>> drivers/pci/pci.c | 146 ++++++++++++++++++++++++++++++++++++++++++++
>> drivers/pci/pci.c is basically a catch-all for anything that doesn't fit
>> in one of the other .c files in drivers/pci. I'm slightly worried that
>> this (otherwise legitimate) patch increases the clutter in pci.c further,
>> rendering it unmaintainable in the long term.
> +1 The reset-related content in drivers/pci/pci.c has been growing
> recently. Maybe we should consider moving it all to a reset.c file.
>
>> At the very least, I'm wondering if this can be #ifdef'ed to
>> CONFIG_CXL_PCI?
>>
>> One idea would be to move this newly added reset method, as well as the
>> existing cxl_reset_bus_function(), to a new drivers/pci/cxl.c file.
>>
>> I guess moving it to drivers/cxl/ isn't an option because cxl can be
>> modular.
>>
>> Another idea would be to move all the reset handling (which makes up
>> a significant portion of pci.c) to a separate drivers/pci/reset.c.
>> This might be beyond the scope of your patch, but in the interim,
>> maybe at least an #ifdef can be added because the PCI core is also
>> used e.g. on memory-constrained wifi routers which don't care about
>> CXL at all.
> Agree, we'll need some way to make this optional.
>
> Bjorn
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-08-22 15:33 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-21 4:39 [PATCH v2 0/2] Add CXL Reset Support for CXL Devices Srirangan Madhavan
2025-02-21 4:39 ` [PATCH v2 1/2] cxl: de-duplicate cxl DVSEC reg defines Srirangan Madhavan
2025-02-21 4:39 ` [PATCH v2 2/2] cxl: add support for cxl reset Srirangan Madhavan
2025-02-21 10:45 ` Lukas Wunner
2025-02-22 0:13 ` Bjorn Helgaas
2025-04-07 23:45 ` Srirangan Madhavan
2025-08-22 15:33 ` Alejandro Lucero Palau
2025-02-21 12:43 ` Alejandro Lucero Palau
2025-02-22 5:45 ` kernel test robot
2025-02-22 7:08 ` kernel test robot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox