* [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL
@ 2024-03-11 20:39 Dave Jiang
2024-03-11 20:39 ` [PATCH 1/3] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Dave Jiang @ 2024-03-11 20:39 UTC (permalink / raw)
To: linux-cxl, linux-pci
Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
Jonathan.Cameron, dave, bhelgaas, lukas
Hi Bjorn,
Please consider this series for kernel 6.10. The series attempt to
add secondary bus reset (SBR) support to CXL. By default, SBR for CXL is
masked. Per CXL specification r3.1 8.1.5.2, the Port Control Extensions
register bit 0 (Unmask SBR) in the host bridge controls the masking of CXL SBR.
"When 0, SBR bit in Bridge Control register of this Port has no effect. When 1,
the Port shall generate hot reset when SBR in Bridge Control gets set to 1.
Default value of this bit is 0. When the Port is operating in PCIe mode or RCD
mode, this field has no effect on SBR functionality and Port shall follow PCIe
Base Specification."
Patch 1:
Add check to PCI bus_reset path for CXL device and return with error if "Unmask
SBR" bit is set to 0. This allows user to realize that SBR is masked for this
CXL device. However, if the user sets the "Unmask SBR" bit via a tool such as
setpci, then the bus_reset will proceed.
Patch2:
Add a new PCI reset method "cxl_bus_force" in order to allow the user an
intetional way to perform SBR. The code will set the "Unmask SBR" bit to
1 and proceed with bus_reset. The original value of the bit will be restored
after the reset operation.
Patch3:
CXL driver change that provides a ->reset_done() callback. It compares the
hardware decoder settings with the existing software configuration and emit
warning if they differ. The difference indicates that decoders were programmed
before the reset and are now cleared after reset. There may be onlined system
memory backed by CXL memory device that are now violently ripped away from
kernel mapping.
Patch series stemmed from this [1] patch. With comments [2] from Bjorn.
[1]: https://lore.kernel.org/linux-cxl/20240215232307.2793530-1-dave.jiang@intel.com/
[2]: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/3] PCI: Add check for CXL Secondary Bus Reset
2024-03-11 20:39 [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
@ 2024-03-11 20:39 ` Dave Jiang
2024-03-12 7:30 ` Lukas Wunner
2024-03-11 20:39 ` [PATCH 2/3] PCI: Create new reset method to force SBR for CXL Dave Jiang
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Dave Jiang @ 2024-03-11 20:39 UTC (permalink / raw)
To: linux-cxl, linux-pci
Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
Jonathan.Cameron, dave, bhelgaas, lukas
Per CXL spec r3.1 8.1.5.2, secondary bus reset is masked unless the
"Unmask SBR" bit is set. Add a check to the PCI secondary bus reset
path to fail the CXL SBR request if the "Unmask SBR" bit is clear in
the CXL Port Control Extensions regiser by returning -EPERM.
The expectation is that if a user overrides the "Unmask SBR" via a
user tool such as setpci, they can trigger a bus reset after that.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
drivers/cxl/cxlpci.h | 2 --
drivers/pci/pci.c | 45 +++++++++++++++++++++++++++++++++++
include/uapi/linux/pci_regs.h | 7 ++++++
3 files changed, 52 insertions(+), 2 deletions(-)
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 711b05d9a370..8d7952d7ca59 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -13,10 +13,8 @@
* "DVSEC" redundancies removed. When obvious, abbreviations may be used.
*/
#define PCI_DVSEC_HEADER1_LENGTH_MASK GENMASK(31, 20)
-#define PCI_DVSEC_VENDOR_ID_CXL 0x1E98
/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
-#define CXL_DVSEC_PCIE_DEVICE 0
#define CXL_DVSEC_CAP_OFFSET 0xA
#define CXL_DVSEC_MEM_CAPABLE BIT(2)
#define CXL_DVSEC_HDM_COUNT_MASK GENMASK(5, 4)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c3585229c12a..5e5550f54053 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5347,10 +5347,55 @@ static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
}
+static bool is_cxl_device(struct pci_dev *dev)
+{
+ return pci_find_dvsec_capability(dev, PCI_DVSEC_VENDOR_ID_CXL,
+ CXL_DVSEC_PCIE_DEVICE);
+}
+
+static bool is_cxl_port_sbr_masked(struct pci_dev *dev)
+{
+ int dvsec;
+ int rc;
+ u16 reg;
+
+ /*
+ * No DVSEC found, must not be CXL port.
+ */
+ dvsec = pci_find_dvsec_capability(dev, PCI_DVSEC_VENDOR_ID_CXL,
+ CXL_DVSEC_PCIE_PORT);
+ if (!dvsec)
+ return false;
+
+ rc = pci_read_config_word(dev, dvsec + CXL_DVSEC_PORT_CONTROL, ®);
+ if (rc)
+ return true;
+
+ /*
+ * CXL spec r3.1 8.1.5.2
+ * When 0, SBR bit in Bridge Control register of this Port has no effect.
+ * When 1, the Port shall generate hot reset when SBR bit in Bridge
+ * Control gets set to 1.
+ */
+ if (reg & CXL_DVSEC_PORT_CONTROL_UNMASK_SBR)
+ return false;
+
+ return true;
+}
+
static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
{
int rc;
+ /* If it's a CXL port and the SBR control is masked, fail the SBR */
+ if (is_cxl_device(dev) && dev->bus->self &&
+ is_cxl_port_sbr_masked(dev->bus->self)) {
+ if (probe)
+ return 0;
+
+ return -EPERM;
+ }
+
rc = pci_dev_reset_slot_function(dev, probe);
if (rc != -ENOTTY)
return rc;
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index a39193213ff2..5f2c66987299 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -1148,4 +1148,11 @@
#define PCI_DOE_DATA_OBJECT_DISC_RSP_3_PROTOCOL 0x00ff0000
#define PCI_DOE_DATA_OBJECT_DISC_RSP_3_NEXT_INDEX 0xff000000
+/* Compute Express Link (CXL) */
+#define PCI_DVSEC_VENDOR_ID_CXL 0x1e98
+#define CXL_DVSEC_PCIE_DEVICE 0
+#define CXL_DVSEC_PCIE_PORT 3
+#define CXL_DVSEC_PORT_CONTROL 0x0c
+#define CXL_DVSEC_PORT_CONTROL_UNMASK_SBR 0x00000001
+
#endif /* LINUX_PCI_REGS_H */
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] PCI: Create new reset method to force SBR for CXL
2024-03-11 20:39 [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
2024-03-11 20:39 ` [PATCH 1/3] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
@ 2024-03-11 20:39 ` Dave Jiang
2024-03-12 7:50 ` Lukas Wunner
2024-03-11 20:39 ` [PATCH 3/3] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR) Dave Jiang
2024-03-12 7:46 ` [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL Lukas Wunner
3 siblings, 1 reply; 9+ messages in thread
From: Dave Jiang @ 2024-03-11 20:39 UTC (permalink / raw)
To: linux-cxl, linux-pci
Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
Jonathan.Cameron, dave, bhelgaas, lukas
CXL spec r3.1 8.1.5.2
By default Secondary Bus Reset (SBR) is masked for CXL ports. Introduce a
new PCI reset method "cxl_bus_force" to force SBR on CXL ports by setting
the unmask SBR bit in the CXL DVSEC port control register before performing
the bus reset and restore the original value of the bit post reset.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
drivers/pci/pci.c | 52 +++++++++++++++++++++++++++++++++++++++++++--
include/linux/pci.h | 2 +-
2 files changed, 51 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 5e5550f54053..bbda7781f0f8 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5353,6 +5353,12 @@ static bool is_cxl_device(struct pci_dev *dev)
CXL_DVSEC_PCIE_DEVICE);
}
+static int cxl_port_dvsec(struct pci_dev *dev)
+{
+ return pci_find_dvsec_capability(dev, PCI_DVSEC_VENDOR_ID_CXL,
+ CXL_DVSEC_PCIE_PORT);
+}
+
static bool is_cxl_port_sbr_masked(struct pci_dev *dev)
{
int dvsec;
@@ -5362,8 +5368,7 @@ static bool is_cxl_port_sbr_masked(struct pci_dev *dev)
/*
* No DVSEC found, must not be CXL port.
*/
- dvsec = pci_find_dvsec_capability(dev, PCI_DVSEC_VENDOR_ID_CXL,
- CXL_DVSEC_PCIE_PORT);
+ dvsec = cxl_port_dvsec(dev);
if (!dvsec)
return false;
@@ -5402,6 +5407,48 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
return pci_parent_bus_reset(dev, probe);
}
+static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
+{
+ struct pci_dev *port_pdev;
+ int dvsec;
+ int rc;
+ u16 reg, val;
+
+ if (!is_cxl_device(dev))
+ return -ENODEV;
+
+ port_pdev = dev->bus->self;
+ if (!port_pdev)
+ return -ENODEV;
+
+ dvsec = cxl_port_dvsec(port_pdev);
+ if (!dvsec)
+ return -ENODEV;
+
+ if (probe)
+ return 0;
+
+ rc = pci_read_config_word(port_pdev, dvsec + CXL_DVSEC_PORT_CONTROL,
+ ®);
+ if (rc)
+ return -ENXIO;
+
+ if (!(reg & CXL_DVSEC_PORT_CONTROL_UNMASK_SBR)) {
+ val = reg | CXL_DVSEC_PORT_CONTROL_UNMASK_SBR;
+ pci_write_config_word(port_pdev,
+ dvsec + CXL_DVSEC_PORT_CONTROL, val);
+ } else {
+ val = reg;
+ }
+
+ rc = pci_reset_bus_function(dev, probe);
+
+ if (reg != val)
+ pci_write_config_word(port_pdev, dvsec + CXL_DVSEC_PORT_CONTROL, reg);
+
+ return rc;
+}
+
void pci_dev_lock(struct pci_dev *dev)
{
/* block PM suspend, driver probe, etc. */
@@ -5486,6 +5533,7 @@ static const struct pci_reset_fn_method pci_reset_fn_methods[] = {
{ pci_af_flr, .name = "af_flr" },
{ pci_pm_reset, .name = "pm" },
{ pci_reset_bus_function, .name = "bus" },
+ { cxl_reset_bus_function, .name = "cxl_bus_force" },
};
static ssize_t reset_method_show(struct device *dev,
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7ab0d13672da..20c862c6b20f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -51,7 +51,7 @@
PCI_STATUS_PARITY)
/* Number of reset methods used in pci_reset_fn_methods array in pci.c */
-#define PCI_NUM_RESET_METHODS 7
+#define PCI_NUM_RESET_METHODS 8
#define PCI_RESET_PROBE true
#define PCI_RESET_DO_RESET false
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/3] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR)
2024-03-11 20:39 [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
2024-03-11 20:39 ` [PATCH 1/3] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
2024-03-11 20:39 ` [PATCH 2/3] PCI: Create new reset method to force SBR for CXL Dave Jiang
@ 2024-03-11 20:39 ` Dave Jiang
2024-03-12 7:46 ` [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL Lukas Wunner
3 siblings, 0 replies; 9+ messages in thread
From: Dave Jiang @ 2024-03-11 20:39 UTC (permalink / raw)
To: linux-cxl, linux-pci
Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
Jonathan.Cameron, dave, bhelgaas, lukas
SBR is equivalent to a device been hot removed and inserted again. Doing a
SBR on a CXL type 3 device is problematic if the exported device memory is
part of system memory that cannot be offlined. The event is equivalent to
violently ripping out that range of memory from the kernel. While the
hardware requires the "Unmask SBR" bit set in the Port Control Extensions
register and the kernel currently does not unmask it, user can unmask
this bit via setpci or similar tool.
The driver does not have a way to detect whether a reset coming from the
PCI subsystem is a Function Level Reset (FLR) or SBR. The only way to
detect is to note if a decoder is marked as enabled in software but the
decoder control register indicates it's not committed.
A helper function is added to find discrepency between the decoder
software state versus the hardware register state.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v2:
- Check decoder flags vs decoder registers for discrepency. (Dan)
---
drivers/cxl/core/port.c | 30 ++++++++++++++++++++++++++++++
drivers/cxl/cxl.h | 2 ++
drivers/cxl/pci.c | 19 +++++++++++++++++++
3 files changed, 51 insertions(+)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index e59d9d37aa65..4de0013f9cc7 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -2170,6 +2170,36 @@ int cxl_endpoint_get_perf_coordinates(struct cxl_port *port,
}
EXPORT_SYMBOL_NS_GPL(cxl_endpoint_get_perf_coordinates, CXL);
+static int decoder_hw_mismatch(struct device *dev, void *data)
+{
+ struct cxl_endpoint_decoder *cxled;
+ struct cxl_port *port = data;
+ struct cxl_decoder *cxld;
+ struct cxl_hdm *cxlhdm;
+ void __iomem *hdm;
+ u32 ctrl;
+
+ if (!is_endpoint_decoder(dev))
+ return 0;
+
+ cxled = to_cxl_endpoint_decoder(dev);
+ if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
+ return 0;
+
+ cxld = &cxled->cxld;
+ cxlhdm = dev_get_drvdata(&port->dev);
+ hdm = cxlhdm->regs.hdm_decoder;
+ ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
+
+ return !FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl);
+}
+
+bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port)
+{
+ return device_for_each_child(&port->dev, port, decoder_hw_mismatch);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL);
+
/* for user tooling to ensure port disable work has completed */
static ssize_t flush_store(const struct bus_type *bus, const char *buf, size_t count)
{
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 003feebab79b..e42b1f3f9288 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -882,6 +882,8 @@ int cxl_endpoint_get_perf_coordinates(struct cxl_port *port,
void cxl_memdev_update_perf(struct cxl_memdev *cxlmd);
+bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
+
/*
* Unit test builds overrides this to __weak, find the 'strong' version
* of these symbols in tools/testing/cxl/.
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 13450e75f5eb..78457542aeec 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -957,11 +957,30 @@ static void cxl_error_resume(struct pci_dev *pdev)
dev->driver ? "successful" : "failed");
}
+static void cxl_reset_done(struct pci_dev *pdev)
+{
+ struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
+ struct cxl_memdev *cxlmd = cxlds->cxlmd;
+ struct device *dev = &pdev->dev;
+
+ /*
+ * FLR does not expect to touch the HDM decoders and related registers.
+ * SBR however will wipe all device configurations.
+ * Issue warning if there was active decoder before reset that no
+ * longer exists.
+ */
+ if (cxl_endpoint_decoder_reset_detected(cxlmd->endpoint)) {
+ dev_warn(dev, "SBR happened without memory regions removal.\n");
+ dev_warn(dev, "System may be unstable if regions hosted system memory.\n");
+ }
+}
+
static const struct pci_error_handlers cxl_error_handlers = {
.error_detected = cxl_error_detected,
.slot_reset = cxl_slot_reset,
.resume = cxl_error_resume,
.cor_error_detected = cxl_cor_error_detected,
+ .reset_done = cxl_reset_done,
};
static struct pci_driver cxl_pci_driver = {
--
2.44.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] PCI: Add check for CXL Secondary Bus Reset
2024-03-11 20:39 ` [PATCH 1/3] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
@ 2024-03-12 7:30 ` Lukas Wunner
2024-03-12 21:35 ` Dave Jiang
0 siblings, 1 reply; 9+ messages in thread
From: Lukas Wunner @ 2024-03-12 7:30 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
alison.schofield, Jonathan.Cameron, dave, bhelgaas
On Mon, Mar 11, 2024 at 01:39:53PM -0700, Dave Jiang wrote:
> +static bool is_cxl_device(struct pci_dev *dev)
> +{
> + return pci_find_dvsec_capability(dev, PCI_DVSEC_VENDOR_ID_CXL,
> + CXL_DVSEC_PCIE_DEVICE);
> +}
If this was my bikeshed, I'd call it pci_is_cxl() to match pci_is_pcie().
> +static bool is_cxl_port_sbr_masked(struct pci_dev *dev)
> +{
> + int dvsec;
> + int rc;
> + u16 reg;
Nit: Inverse Christmas tree?
> static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
> {
> int rc;
>
> + /* If it's a CXL port and the SBR control is masked, fail the SBR */
> + if (is_cxl_device(dev) && dev->bus->self &&
> + is_cxl_port_sbr_masked(dev->bus->self)) {
> + if (probe)
> + return 0;
> +
> + return -EPERM;
> + }
> +
Is this also necessary if CONFIG_CXL_PCI=n?
Return code on non-availability of a reset method is generally -ENOTTY.
Or is the choice deliberate to expose this reset method despite the bit
being set and thus allow user space to unmask it in the first place?
Also, we mostly use pci_upstream_bridge(dev) in lieu of dev->bus->self.
Is the choice to use the latter deliberate because maybe is_virtfn is
never set and the device can never be on the root bus? (What about
RCiEP CXL devices?)
Thanks,
Lukas
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL
2024-03-11 20:39 [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
` (2 preceding siblings ...)
2024-03-11 20:39 ` [PATCH 3/3] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR) Dave Jiang
@ 2024-03-12 7:46 ` Lukas Wunner
2024-03-12 21:31 ` Dave Jiang
3 siblings, 1 reply; 9+ messages in thread
From: Lukas Wunner @ 2024-03-12 7:46 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
alison.schofield, Jonathan.Cameron, dave, bhelgaas
On Mon, Mar 11, 2024 at 01:39:52PM -0700, Dave Jiang wrote:
> Patch 1:
> Add check to PCI bus_reset path for CXL device and return with error if "Unmask
> SBR" bit is set to 0. This allows user to realize that SBR is masked for this
> CXL device. However, if the user sets the "Unmask SBR" bit via a tool such as
> setpci, then the bus_reset will proceed.
So is the point of patch 1 only to inform the user that the SBR has
no effect? Or does the SBR have any negative side effects that you
want to avoid (e.g. due to the config space save/restore)?
If you only want to inform the user, then this functionality could
live in a ->reset_prepare() callback exposed by the cxl subsystem
and the pci subsystem wouldn't have to be touched at all.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] PCI: Create new reset method to force SBR for CXL
2024-03-11 20:39 ` [PATCH 2/3] PCI: Create new reset method to force SBR for CXL Dave Jiang
@ 2024-03-12 7:50 ` Lukas Wunner
0 siblings, 0 replies; 9+ messages in thread
From: Lukas Wunner @ 2024-03-12 7:50 UTC (permalink / raw)
To: Dave Jiang
Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
alison.schofield, Jonathan.Cameron, dave, bhelgaas
On Mon, Mar 11, 2024 at 01:39:54PM -0700, Dave Jiang wrote:
> CXL spec r3.1 8.1.5.2
> By default Secondary Bus Reset (SBR) is masked for CXL ports. Introduce a
> new PCI reset method "cxl_bus_force" to force SBR on CXL ports by setting
> the unmask SBR bit in the CXL DVSEC port control register before performing
> the bus reset and restore the original value of the bit post reset.
Hm, why not have a sysfs attribute (or sysctl variable) to unmask SBR?
That would avoid the need to touch pci code as the sysfs attribute could
be kept local to the cxl subsystem.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL
2024-03-12 7:46 ` [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL Lukas Wunner
@ 2024-03-12 21:31 ` Dave Jiang
0 siblings, 0 replies; 9+ messages in thread
From: Dave Jiang @ 2024-03-12 21:31 UTC (permalink / raw)
To: Lukas Wunner
Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
alison.schofield, Jonathan.Cameron, dave, bhelgaas
On 3/12/24 12:46 AM, Lukas Wunner wrote:
> On Mon, Mar 11, 2024 at 01:39:52PM -0700, Dave Jiang wrote:
>> Patch 1:
>> Add check to PCI bus_reset path for CXL device and return with error if "Unmask
>> SBR" bit is set to 0. This allows user to realize that SBR is masked for this
>> CXL device. However, if the user sets the "Unmask SBR" bit via a tool such as
>> setpci, then the bus_reset will proceed.
>
> So is the point of patch 1 only to inform the user that the SBR has
> no effect? Or does the SBR have any negative side effects that you
> want to avoid (e.g. due to the config space save/restore)?
>
> If you only want to inform the user, then this functionality could
> live in a ->reset_prepare() callback exposed by the cxl subsystem
> and the pci subsystem wouldn't have to be touched at all.
Patch 1 is to inform the user that SBR has no effect. The user needs to be informed via direct errno feedback that reset is masked and isn't going to happen. In this [1] response, I believe Bjorn wanted pci_reset_function() to fail. ->reset_prepare() is only helpful if the cxl_pci driver is loaded. CXL mem devices can be programmed by the BIOS and be active without CXL driver being loaded. Also, cxl_pci only picks up PCI_CLASS_MEMORY_CXL. So other CXL devices would not be managed by this driver.
[1]: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/
>
> Thanks,
>
> Lukas
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] PCI: Add check for CXL Secondary Bus Reset
2024-03-12 7:30 ` Lukas Wunner
@ 2024-03-12 21:35 ` Dave Jiang
0 siblings, 0 replies; 9+ messages in thread
From: Dave Jiang @ 2024-03-12 21:35 UTC (permalink / raw)
To: Lukas Wunner
Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
alison.schofield, Jonathan.Cameron, dave, bhelgaas
On 3/12/24 12:30 AM, Lukas Wunner wrote:
> On Mon, Mar 11, 2024 at 01:39:53PM -0700, Dave Jiang wrote:
>> +static bool is_cxl_device(struct pci_dev *dev)
>> +{
>> + return pci_find_dvsec_capability(dev, PCI_DVSEC_VENDOR_ID_CXL,
>> + CXL_DVSEC_PCIE_DEVICE);
>> +}
>
> If this was my bikeshed, I'd call it pci_is_cxl() to match pci_is_pcie().
Ok will change.
>
>
>> +static bool is_cxl_port_sbr_masked(struct pci_dev *dev)
>> +{
>> + int dvsec;
>> + int rc;
>> + u16 reg;
>
> Nit: Inverse Christmas tree?
Will change.
>
>
>> static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
>> {
>> int rc;
>>
>> + /* If it's a CXL port and the SBR control is masked, fail the SBR */
>> + if (is_cxl_device(dev) && dev->bus->self &&
>> + is_cxl_port_sbr_masked(dev->bus->self)) {
>> + if (probe)
>> + return 0;
>> +
>> + return -EPERM;
>> + }
>> +
>
> Is this also necessary if CONFIG_CXL_PCI=n?
Yes. As the kernel only loads type3 mem class CXL device driver. This is attempt to cover all CXL devices and not dependent on a loaded driver.
>
> Return code on non-availability of a reset method is generally -ENOTTY.
Ok I can change it to that.
> Or is the choice deliberate to expose this reset method despite the bit
> being set and thus allow user space to unmask it in the first place?
Yes the idea is if user intentionally unmasks the bit via a user tool then reset can go through.
>
> Also, we mostly use pci_upstream_bridge(dev) in lieu of dev->bus->self.
> Is the choice to use the latter deliberate because maybe is_virtfn is
> never set and the device can never be on the root bus? (What about
> RCiEP CXL devices?)
I'll change to pci_upstream_bridge() call. I didn't realize that existed.
>
> Thanks,
>
> Lukas
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-03-12 21:35 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-11 20:39 [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
2024-03-11 20:39 ` [PATCH 1/3] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
2024-03-12 7:30 ` Lukas Wunner
2024-03-12 21:35 ` Dave Jiang
2024-03-11 20:39 ` [PATCH 2/3] PCI: Create new reset method to force SBR for CXL Dave Jiang
2024-03-12 7:50 ` Lukas Wunner
2024-03-11 20:39 ` [PATCH 3/3] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR) Dave Jiang
2024-03-12 7:46 ` [PATCH 0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL Lukas Wunner
2024-03-12 21:31 ` Dave Jiang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox