* [PATCH 0/2] cxl: Dirty shutdown followups
From: Davidlohr Bueso
Date: 2025-02-05 4:08
To: dave.jiang, dan.j.williams
Cc: jonathan.cameron, alison.schofield, ira.weiny, vishal.l.verma, seven.yi.lee, a.manzanares, fan.ni, anisa.su, dave, linux-cxl

Hi,

Two followup patches to the GPF work.

The first adds a $platform/dirty_shutdown sysfs attribute - my only doubt here is where this should be documented, and for now I just left it in the regular ABI doc for the cxl bus. The second patch adds support for emulating the Set Shutdown State command in the mock device.

Patch 1 has been tested with qemu, and patch 2 is only compile tested. Applies against the -next branch.

Thanks!

Davidlohr Bueso (2):
  cxl/pmem: Export dirty shutdown count via sysfs
  tools/testing/cxl: Set Shutdown State support

 Documentation/ABI/testing/sysfs-bus-cxl       | 12 ++++
 Documentation/driver-api/cxl/maturity-map.rst |  2 +-
 drivers/cxl/core/mbox.c                       | 21 ++++++
 drivers/cxl/core/pci.c                        | 23 +++++++
 drivers/cxl/cxl.h                             |  1 +
 drivers/cxl/cxlmem.h                          | 15 ++++
 drivers/cxl/pmem.c                            | 69 ++++++++++++++++---
 tools/testing/cxl/test/mem.c                  | 23 +++++++
 8 files changed, 157 insertions(+), 9 deletions(-)

-- 
2.39.5
* [PATCH 1/2] cxl/pmem: Export dirty shutdown count via sysfs 2025-02-05 4:08 [PATCH 0/2] cxl: Dirty shutdown followups Davidlohr Bueso @ 2025-02-05 4:08 ` Davidlohr Bueso 2025-02-07 10:46 ` Yee Li 2025-02-11 2:09 ` Dave Jiang 2025-02-05 4:08 ` [PATCH 2/2] tools/testing/cxl: Set Shutdown State support Davidlohr Bueso 1 sibling, 2 replies; 9+ messages in thread From: Davidlohr Bueso @ 2025-02-05 4:08 UTC (permalink / raw) To: dave.jiang, dan.j.williams Cc: jonathan.cameron, alison.schofield, ira.weiny, vishal.l.verma, seven.yi.lee, a.manzanares, fan.ni, anisa.su, dave, linux-cxl Similar to how the acpi_nfit driver exports Optane dirty shutdown count, introduce: /sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown Under the conditions that 1) dirty shutdown can be set, 2) Device GPF DVSEC exists, and 3) the count itself can be retrieved. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> --- Documentation/ABI/testing/sysfs-bus-cxl | 12 ++++ Documentation/driver-api/cxl/maturity-map.rst | 2 +- drivers/cxl/core/mbox.c | 21 ++++++ drivers/cxl/core/pci.c | 23 +++++++ drivers/cxl/cxl.h | 1 + drivers/cxl/cxlmem.h | 15 ++++ drivers/cxl/pmem.c | 69 ++++++++++++++++--- 7 files changed, 134 insertions(+), 9 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 3f5627a1210a..a7491d214098 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -586,3 +586,15 @@ Description: See Documentation/ABI/stable/sysfs-devices-node. access0 provides the number to the closest initiator and access1 provides the number to the closest CPU. 
+ + +What: /sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown +Date: Feb, 2025 +KernelVersion: v6.15 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) The device dirty shutdown count value, which is the number + of times the device could have incurred in potential data loss. + The count is persistent across power loss and wraps back to 0 + upon overflow. If this file is not present, the device does not + have the necessary support for dirty tracking. diff --git a/Documentation/driver-api/cxl/maturity-map.rst b/Documentation/driver-api/cxl/maturity-map.rst index 99dd2c841e69..a2288f9df658 100644 --- a/Documentation/driver-api/cxl/maturity-map.rst +++ b/Documentation/driver-api/cxl/maturity-map.rst @@ -130,7 +130,7 @@ Mailbox commands * [0] Switch CCI * [3] Timestamp * [1] PMEM labels -* [1] PMEM GPF / Dirty Shutdown +* [3] PMEM GPF / Dirty Shutdown * [0] Scan Media PMU diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index 4d22bb731177..d03fc7ed76a8 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -1290,6 +1290,27 @@ int cxl_mem_dpa_fetch(struct cxl_memdev_state *mds, struct cxl_dpa_info *info) } EXPORT_SYMBOL_NS_GPL(cxl_mem_dpa_fetch, "CXL"); +int cxl_get_dirty_count(struct cxl_memdev_state *mds, u32 *count) +{ + int rc; + struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; + struct cxl_mbox_cmd mbox_cmd; + struct cxl_mbox_get_health_info_out hi; + + mbox_cmd = (struct cxl_mbox_cmd) { + .opcode = CXL_MBOX_OP_GET_HEALTH_INFO, + .size_out = sizeof(hi), + .payload_out = &hi, + }; + + rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd); + if (!rc) + *count = le32_to_cpu(hi.dirty_shutdown_cnt); + + return rc; +} +EXPORT_SYMBOL_NS_GPL(cxl_get_dirty_count, "CXL"); + int cxl_dirty_shutdown_state(struct cxl_memdev_state *mds) { struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c index a5c65f79db18..bcf05d010a77 100644 --- a/drivers/cxl/core/pci.c +++ 
b/drivers/cxl/core/pci.c @@ -1141,3 +1141,26 @@ int cxl_gpf_port_setup(struct device *dport_dev, struct cxl_port *port) return 0; } + +int cxl_gpf_device(struct cxl_dev_state *cxlds) +{ + int dvsec; + struct device *dev = cxlds->dev; + struct pci_dev *pdev; + + if (!dev_is_pci(dev)) + return 0; + + pdev = to_pci_dev(dev); + if (!pdev) + return -EINVAL; + + dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL, + CXL_DVSEC_DEVICE_GPF); + if (!dvsec) { + pci_warn(pdev, "Device GPF DVSEC not present\n"); + return -EINVAL; + } + + return 0; +} diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 6baec4ba9141..40cc44a18df8 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -542,6 +542,7 @@ struct cxl_nvdimm { struct device dev; struct cxl_memdev *cxlmd; u8 dev_id[CXL_DEV_ID_LEN]; /* for nvdimm, string of 'serial' */ + long dirty_shutdown; }; struct cxl_pmem_region_mapping { diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 536cbe521d16..ee0c93fde50c 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -725,6 +725,18 @@ struct cxl_mbox_set_partition_info { #define CXL_SET_PARTITION_IMMEDIATE_FLAG BIT(0) +/* Get Health Info Output Payload CXL 3.2 Spec 8.2.10.9.3.1 Table 8-148 */ +struct cxl_mbox_get_health_info_out { + u8 health_status; + u8 media_status; + u8 additional_status; + u8 life_used; + __le16 device_temperature; + __le32 dirty_shutdown_cnt; + __le32 corrected_volatile_error_cnt; + __le32 corrected_persistent_error_cnt; +} __packed; + /* Set Shutdown State Input Payload CXL 3.2 Spec 8.2.10.9.3.5 Table 8-152 */ struct cxl_mbox_set_shutdown_state_in { u8 state; @@ -866,6 +878,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd, enum cxl_event_log_type type, enum cxl_event_type event_type, const uuid_t *uuid, union cxl_event *evt); +int cxl_get_dirty_count(struct cxl_memdev_state *mds, u32 *count); int cxl_dirty_shutdown_state(struct cxl_memdev_state *mds); int cxl_set_timestamp(struct cxl_memdev_state *mds); int 
cxl_poison_state_init(struct cxl_memdev_state *mds); @@ -910,4 +923,6 @@ struct cxl_hdm { struct seq_file; struct dentry *cxl_debugfs_create_dir(const char *dir); void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds); + +int cxl_gpf_device(struct cxl_dev_state *cxlds); #endif /* __CXL_MEM_H__ */ diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c index a39e2c52d7ab..d83ecd568a9c 100644 --- a/drivers/cxl/pmem.c +++ b/drivers/cxl/pmem.c @@ -42,15 +42,41 @@ static ssize_t id_show(struct device *dev, struct device_attribute *attr, char * } static DEVICE_ATTR_RO(id); +static ssize_t dirty_shutdown_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct nvdimm *nvdimm = to_nvdimm(dev); + struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm); + + return sysfs_emit(buf, "%ld\n", cxl_nvd->dirty_shutdown); +} +static DEVICE_ATTR_RO(dirty_shutdown); + static struct attribute *cxl_dimm_attributes[] = { &dev_attr_id.attr, &dev_attr_provider.attr, + &dev_attr_dirty_shutdown.attr, NULL }; +static umode_t cxl_dimm_visible(struct kobject *kobj, struct attribute *a, int n) +{ + if (a == &dev_attr_dirty_shutdown.attr) { + struct device *dev = kobj_to_dev(kobj); + struct nvdimm *nvdimm = to_nvdimm(dev); + struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm); + + if (cxl_nvd->dirty_shutdown == -1) + return 0; + } + + return a->mode; +} + static const struct attribute_group cxl_dimm_attribute_group = { .name = "cxl", .attrs = cxl_dimm_attributes, + .is_visible = cxl_dimm_visible }; static const struct attribute_group *cxl_dimm_attribute_groups[] = { @@ -58,6 +84,33 @@ static const struct attribute_group *cxl_dimm_attribute_groups[] = { NULL }; +static void cxl_nvdimm_setup_dirty_tracking(struct cxl_nvdimm *cxl_nvd) +{ + struct cxl_memdev *cxlmd = cxl_nvd->cxlmd; + struct cxl_dev_state *cxlds = cxlmd->cxlds; + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds); + struct device *dev = &cxl_nvd->dev; + + /* + * Dirty tracking is 
enabled and exposed to the user, only when: + * - dirty shutdown on the device can be set, and, + * - the device has a Device GPF DVSEC (albeit unused), and, + * - the Get Health Info cmd can retrieve the device's dirty count. + */ + cxl_nvd->dirty_shutdown = -1; + + if (cxl_dirty_shutdown_state(mds)) { + dev_warn(dev, "GPF: could not dirty shutdown state\n"); + } else if (!cxl_gpf_device(cxlds)) { + u32 count; + + if (!cxl_get_dirty_count(mds, &count)) + cxl_nvd->dirty_shutdown = count; + else + dev_warn(dev, "GPF: could not retrieve dirty count\n"); + } +} + static int cxl_nvdimm_probe(struct device *dev) { struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev); @@ -78,20 +131,20 @@ static int cxl_nvdimm_probe(struct device *dev) set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask); set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask); set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask); - nvdimm = __nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, - cxl_dimm_attribute_groups, flags, - cmd_mask, 0, NULL, cxl_nvd->dev_id, - cxl_security_ops, NULL); - if (!nvdimm) - return -ENOMEM; /* * Set dirty shutdown now, with the expectation that the device * clear it upon a successful GPF flow. The exception to this * is upon Viral detection, per CXL 3.2 section 12.4.2. */ - if (cxl_dirty_shutdown_state(mds)) - dev_warn(dev, "GPF: could not dirty shutdown state\n"); + cxl_nvdimm_setup_dirty_tracking(cxl_nvd); + + nvdimm = __nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, + cxl_dimm_attribute_groups, flags, + cmd_mask, 0, NULL, cxl_nvd->dev_id, + cxl_security_ops, NULL); + if (!nvdimm) + return -ENOMEM; dev_set_drvdata(dev, nvdimm); return devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm); -- 2.39.5 ^ permalink raw reply related [flat|nested] 9+ messages in thread
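The new `struct cxl_mbox_get_health_info_out` in the patch above relies on `__packed` to match the spec table byte-for-byte, since `dirty_shutdown_cnt` must land at the offset the device writes it to. A minimal userspace sketch (hypothetical struct name, plain fixed-width types standing in for the kernel's `__le` types) can sanity-check the layout against CXL 3.2 Table 8-148:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the Get Health Info output payload
 * (CXL 3.2 spec 8.2.10.9.3.1, Table 8-148). Packed so each field
 * sits at its wire offset, with no compiler-inserted padding. */
struct get_health_info_out {
	uint8_t  health_status;                  /* byte 0 */
	uint8_t  media_status;                   /* byte 1 */
	uint8_t  additional_status;              /* byte 2 */
	uint8_t  life_used;                      /* byte 3 */
	uint16_t device_temperature;             /* bytes 4-5, little-endian */
	uint32_t dirty_shutdown_cnt;             /* bytes 6-9, little-endian */
	uint32_t corrected_volatile_error_cnt;   /* bytes 10-13 */
	uint32_t corrected_persistent_error_cnt; /* bytes 14-17 */
} __attribute__((packed));
```

Without the packed attribute, the compiler would typically pad `dirty_shutdown_cnt` out to offset 8 and the payload to 20 bytes, silently misreading the count.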
* Re: [PATCH 1/2] cxl/pmem: Export dirty shutdown count via sysfs 2025-02-05 4:08 ` [PATCH 1/2] cxl/pmem: Export dirty shutdown count via sysfs Davidlohr Bueso @ 2025-02-07 10:46 ` Yee Li 2025-02-11 2:09 ` Dave Jiang 1 sibling, 0 replies; 9+ messages in thread From: Yee Li @ 2025-02-07 10:46 UTC (permalink / raw) To: Davidlohr Bueso Cc: dave.jiang, dan.j.williams, jonathan.cameron, alison.schofield, ira.weiny, vishal.l.verma, a.manzanares, fan.ni, anisa.su, linux-cxl Hi Davidlohr, I tried the patch in linux-cxl next branch, ERROR: modpost: "cxl_gpf_device" [drivers/cxl/cxl_pmem.ko] undefined A few comments inline. Thanks, Yee > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index a5c65f79db18..bcf05d010a77 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -1141,3 +1141,26 @@ int cxl_gpf_port_setup(struct device *dport_dev, struct cxl_port *port) > > return 0; > } > + > +int cxl_gpf_device(struct cxl_dev_state *cxlds) > +{ > + int dvsec; > + struct device *dev = cxlds->dev; > + struct pci_dev *pdev; > + > + if (!dev_is_pci(dev)) > + return 0; > + > + pdev = to_pci_dev(dev); > + if (!pdev) > + return -EINVAL; > + > + dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL, > + CXL_DVSEC_DEVICE_GPF); > + if (!dvsec) { > + pci_warn(pdev, "Device GPF DVSEC not present\n"); > + return -EINVAL; > + } > + > + return 0; > +} Maybe cxl_gpf_device needs EXPORT_SYMBOL. 
cxl_gpf_device defined in drivers/cxl/cxlmem.h as a prototype defined in drivers/cxl/core/pci.c as a function -- #include <cxlmem.h> referenced in drivers/cxl/pmem.c -- #include "cxlmem.h" Refers to other functions (external call) in the drivers/cxl/core/pci.c, EXPORT_SYMBOL_NS_GPL(cxl_gpf_device, "CXL"); > struct cxl_pmem_region_mapping { > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h > index 536cbe521d16..ee0c93fde50c 100644 > --- a/drivers/cxl/cxlmem.h > +++ b/drivers/cxl/cxlmem.h > +int cxl_gpf_device(struct cxl_dev_state *cxlds); > #endif /* __CXL_MEM_H__ */ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] cxl/pmem: Export dirty shutdown count via sysfs 2025-02-05 4:08 ` [PATCH 1/2] cxl/pmem: Export dirty shutdown count via sysfs Davidlohr Bueso 2025-02-07 10:46 ` Yee Li @ 2025-02-11 2:09 ` Dave Jiang 2025-02-11 4:50 ` Davidlohr Bueso 1 sibling, 1 reply; 9+ messages in thread From: Dave Jiang @ 2025-02-11 2:09 UTC (permalink / raw) To: Davidlohr Bueso, dan.j.williams Cc: jonathan.cameron, alison.schofield, ira.weiny, vishal.l.verma, seven.yi.lee, a.manzanares, fan.ni, anisa.su, linux-cxl On 2/4/25 9:08 PM, Davidlohr Bueso wrote: > Similar to how the acpi_nfit driver exports Optane dirty shutdown count, > introduce: > > /sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown > > Under the conditions that 1) dirty shutdown can be set, 2) Device GPF > DVSEC exists, and 3) the count itself can be retrieved. > > Suggested-by: Dan Williams <dan.j.williams@intel.com> > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> > --- > Documentation/ABI/testing/sysfs-bus-cxl | 12 ++++ > Documentation/driver-api/cxl/maturity-map.rst | 2 +- > drivers/cxl/core/mbox.c | 21 ++++++ > drivers/cxl/core/pci.c | 23 +++++++ > drivers/cxl/cxl.h | 1 + > drivers/cxl/cxlmem.h | 15 ++++ > drivers/cxl/pmem.c | 69 ++++++++++++++++--- > 7 files changed, 134 insertions(+), 9 deletions(-) > > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl > index 3f5627a1210a..a7491d214098 100644 > --- a/Documentation/ABI/testing/sysfs-bus-cxl > +++ b/Documentation/ABI/testing/sysfs-bus-cxl > @@ -586,3 +586,15 @@ Description: > See Documentation/ABI/stable/sysfs-devices-node. access0 provides > the number to the closest initiator and access1 provides the > number to the closest CPU. 
> + > + > +What: /sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown > +Date: Feb, 2025 > +KernelVersion: v6.15 > +Contact: linux-cxl@vger.kernel.org > +Description: > + (RO) The device dirty shutdown count value, which is the number > + of times the device could have incurred in potential data loss. > + The count is persistent across power loss and wraps back to 0 > + upon overflow. If this file is not present, the device does not > + have the necessary support for dirty tracking. > diff --git a/Documentation/driver-api/cxl/maturity-map.rst b/Documentation/driver-api/cxl/maturity-map.rst > index 99dd2c841e69..a2288f9df658 100644 > --- a/Documentation/driver-api/cxl/maturity-map.rst > +++ b/Documentation/driver-api/cxl/maturity-map.rst > @@ -130,7 +130,7 @@ Mailbox commands > * [0] Switch CCI > * [3] Timestamp > * [1] PMEM labels > -* [1] PMEM GPF / Dirty Shutdown > +* [3] PMEM GPF / Dirty Shutdown > * [0] Scan Media > > PMU > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c > index 4d22bb731177..d03fc7ed76a8 100644 > --- a/drivers/cxl/core/mbox.c > +++ b/drivers/cxl/core/mbox.c > @@ -1290,6 +1290,27 @@ int cxl_mem_dpa_fetch(struct cxl_memdev_state *mds, struct cxl_dpa_info *info) > } > EXPORT_SYMBOL_NS_GPL(cxl_mem_dpa_fetch, "CXL"); > > +int cxl_get_dirty_count(struct cxl_memdev_state *mds, u32 *count) > +{ > + int rc; > + struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; > + struct cxl_mbox_cmd mbox_cmd; > + struct cxl_mbox_get_health_info_out hi; > + > + mbox_cmd = (struct cxl_mbox_cmd) { > + .opcode = CXL_MBOX_OP_GET_HEALTH_INFO, > + .size_out = sizeof(hi), > + .payload_out = &hi, > + }; > + > + rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd); > + if (!rc) > + *count = le32_to_cpu(hi.dirty_shutdown_cnt); > + > + return rc; > +} > +EXPORT_SYMBOL_NS_GPL(cxl_get_dirty_count, "CXL"); > + > int cxl_dirty_shutdown_state(struct cxl_memdev_state *mds) > { > struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox; > diff --git 
a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index a5c65f79db18..bcf05d010a77 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -1141,3 +1141,26 @@ int cxl_gpf_port_setup(struct device *dport_dev, struct cxl_port *port) > > return 0; > } > + > +int cxl_gpf_device(struct cxl_dev_state *cxlds) Maybe turn this into a helper for retrieving the dvsec and use it in cxl_gpf_port_setup() as well? > +{ > + int dvsec; > + struct device *dev = cxlds->dev; > + struct pci_dev *pdev; > + > + if (!dev_is_pci(dev)) > + return 0; > + > + pdev = to_pci_dev(dev); > + if (!pdev) > + return -EINVAL; > + > + dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL, > + CXL_DVSEC_DEVICE_GPF); > + if (!dvsec) { > + pci_warn(pdev, "Device GPF DVSEC not present\n"); > + return -EINVAL; > + } > + > + return 0; > +} I think it needs to export the symbol to use outside of cxl core. > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h > index 6baec4ba9141..40cc44a18df8 100644 > --- a/drivers/cxl/cxl.h > +++ b/drivers/cxl/cxl.h > @@ -542,6 +542,7 @@ struct cxl_nvdimm { > struct device dev; > struct cxl_memdev *cxlmd; > u8 dev_id[CXL_DEV_ID_LEN]; /* for nvdimm, string of 'serial' */ > + long dirty_shutdown; Maybe consider using u64 and call it 'dirty_shutdowns'? 
> }; > > struct cxl_pmem_region_mapping { > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h > index 536cbe521d16..ee0c93fde50c 100644 > --- a/drivers/cxl/cxlmem.h > +++ b/drivers/cxl/cxlmem.h > @@ -725,6 +725,18 @@ struct cxl_mbox_set_partition_info { > > #define CXL_SET_PARTITION_IMMEDIATE_FLAG BIT(0) > > +/* Get Health Info Output Payload CXL 3.2 Spec 8.2.10.9.3.1 Table 8-148 */ > +struct cxl_mbox_get_health_info_out { > + u8 health_status; > + u8 media_status; > + u8 additional_status; > + u8 life_used; > + __le16 device_temperature; > + __le32 dirty_shutdown_cnt; > + __le32 corrected_volatile_error_cnt; > + __le32 corrected_persistent_error_cnt; > +} __packed; > + > /* Set Shutdown State Input Payload CXL 3.2 Spec 8.2.10.9.3.5 Table 8-152 */ > struct cxl_mbox_set_shutdown_state_in { > u8 state; > @@ -866,6 +878,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd, > enum cxl_event_log_type type, > enum cxl_event_type event_type, > const uuid_t *uuid, union cxl_event *evt); > +int cxl_get_dirty_count(struct cxl_memdev_state *mds, u32 *count); > int cxl_dirty_shutdown_state(struct cxl_memdev_state *mds); > int cxl_set_timestamp(struct cxl_memdev_state *mds); > int cxl_poison_state_init(struct cxl_memdev_state *mds); > @@ -910,4 +923,6 @@ struct cxl_hdm { > struct seq_file; > struct dentry *cxl_debugfs_create_dir(const char *dir); > void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds); > + > +int cxl_gpf_device(struct cxl_dev_state *cxlds); > #endif /* __CXL_MEM_H__ */ > diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c > index a39e2c52d7ab..d83ecd568a9c 100644 > --- a/drivers/cxl/pmem.c > +++ b/drivers/cxl/pmem.c > @@ -42,15 +42,41 @@ static ssize_t id_show(struct device *dev, struct device_attribute *attr, char * > } > static DEVICE_ATTR_RO(id); > > +static ssize_t dirty_shutdown_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + struct nvdimm *nvdimm = to_nvdimm(dev); > + struct cxl_nvdimm 
*cxl_nvd = nvdimm_provider_data(nvdimm); > + > + return sysfs_emit(buf, "%ld\n", cxl_nvd->dirty_shutdown); > +} > +static DEVICE_ATTR_RO(dirty_shutdown); > + > static struct attribute *cxl_dimm_attributes[] = { > &dev_attr_id.attr, > &dev_attr_provider.attr, > + &dev_attr_dirty_shutdown.attr, > NULL > }; > > +static umode_t cxl_dimm_visible(struct kobject *kobj, struct attribute *a, int n) > +{ > + if (a == &dev_attr_dirty_shutdown.attr) { > + struct device *dev = kobj_to_dev(kobj); > + struct nvdimm *nvdimm = to_nvdimm(dev); > + struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm); > + > + if (cxl_nvd->dirty_shutdown == -1) > + return 0; > + } > + > + return a->mode; > +} > + > static const struct attribute_group cxl_dimm_attribute_group = { > .name = "cxl", > .attrs = cxl_dimm_attributes, > + .is_visible = cxl_dimm_visible > }; > > static const struct attribute_group *cxl_dimm_attribute_groups[] = { > @@ -58,6 +84,33 @@ static const struct attribute_group *cxl_dimm_attribute_groups[] = { > NULL > }; > > +static void cxl_nvdimm_setup_dirty_tracking(struct cxl_nvdimm *cxl_nvd) Please consider cxl_nvdimm_arm_dirty_shutdown_tracking() > +{ > + struct cxl_memdev *cxlmd = cxl_nvd->cxlmd; > + struct cxl_dev_state *cxlds = cxlmd->cxlds; > + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds); > + struct device *dev = &cxl_nvd->dev; > + > + /* > + * Dirty tracking is enabled and exposed to the user, only when: > + * - dirty shutdown on the device can be set, and, > + * - the device has a Device GPF DVSEC (albeit unused), and, > + * - the Get Health Info cmd can retrieve the device's dirty count. > + */ > + cxl_nvd->dirty_shutdown = -1; #define CXL_INVALID_DIRTY_SHUTDOWN_COUNT -1 perhaps? > + > + if (cxl_dirty_shutdown_state(mds)) { > + dev_warn(dev, "GPF: could not dirty shutdown state\n"); "could not set dirty shutdown state" Speaking of which, should we rename cxl_dirty_shutdown_state() to cxl_arm_dirty_shutdown()? 
Also, maybe just return here instead of falling through > + } else if (!cxl_gpf_device(cxlds)) { Just return the fail case here > + u32 count; > + > + if (!cxl_get_dirty_count(mds, &count)) > + cxl_nvd->dirty_shutdown = count; > + else > + dev_warn(dev, "GPF: could not retrieve dirty count\n"); Warn and call return directly here for the failed case first DJ > + } > +} > + > static int cxl_nvdimm_probe(struct device *dev) > { > struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev); > @@ -78,20 +131,20 @@ static int cxl_nvdimm_probe(struct device *dev) > set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask); > set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask); > set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask); > - nvdimm = __nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, > - cxl_dimm_attribute_groups, flags, > - cmd_mask, 0, NULL, cxl_nvd->dev_id, > - cxl_security_ops, NULL); > - if (!nvdimm) > - return -ENOMEM; > > /* > * Set dirty shutdown now, with the expectation that the device > * clear it upon a successful GPF flow. The exception to this > * is upon Viral detection, per CXL 3.2 section 12.4.2. > */ > - if (cxl_dirty_shutdown_state(mds)) > - dev_warn(dev, "GPF: could not dirty shutdown state\n"); > + cxl_nvdimm_setup_dirty_tracking(cxl_nvd); > + > + nvdimm = __nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, > + cxl_dimm_attribute_groups, flags, > + cmd_mask, 0, NULL, cxl_nvd->dev_id, > + cxl_security_ops, NULL); > + if (!nvdimm) > + return -ENOMEM; > > dev_set_drvdata(dev, nvdimm); > return devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm); ^ permalink raw reply [flat|nested] 9+ messages in thread
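The `cxl_dimm_visible()` callback under review gates the attribute's existence rather than its contents: returning 0 from `is_visible` means sysfs never creates the file, which is how "no dirty tracking support" is surfaced to userspace. The gate logic can be sketched in plain C (simplified stand-in types, hypothetical names; the real callback resolves the `cxl_nvdimm` from the kobject):

```c
#include <assert.h>

/* Sentinel meaning "no valid count was latched at probe time". */
#define DIRTY_SHUTDOWN_UNSET (-1L)

struct fake_nvd {
	long dirty_shutdown;
};

/* Sketch of the is_visible gate: mode 0 hides the attribute
 * entirely; otherwise the attribute's default mode is kept. */
unsigned short dirty_shutdown_visible(const struct fake_nvd *nvd,
				      unsigned short mode)
{
	if (nvd->dirty_shutdown == DIRTY_SHUTDOWN_UNSET)
		return 0;	/* file not created in sysfs */
	return mode;		/* e.g. 0444 for a read-only attribute */
}
```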
* Re: [PATCH 1/2] cxl/pmem: Export dirty shutdown count via sysfs 2025-02-11 2:09 ` Dave Jiang @ 2025-02-11 4:50 ` Davidlohr Bueso 2025-02-18 18:58 ` Dave Jiang 0 siblings, 1 reply; 9+ messages in thread From: Davidlohr Bueso @ 2025-02-11 4:50 UTC (permalink / raw) To: Dave Jiang Cc: dan.j.williams, jonathan.cameron, alison.schofield, ira.weiny, vishal.l.verma, seven.yi.lee, a.manzanares, fan.ni, anisa.su, linux-cxl On Mon, 10 Feb 2025, Dave Jiang wrote: >On 2/4/25 9:08 PM, Davidlohr Bueso wrote: >> @@ -542,6 +542,7 @@ struct cxl_nvdimm { >> struct device dev; >> struct cxl_memdev *cxlmd; >> u8 dev_id[CXL_DEV_ID_LEN]; /* for nvdimm, string of 'serial' */ >> + long dirty_shutdown; > >Maybe consider using u64 and call it 'dirty_shutdowns'? Imo signed type having -1 is more natural - are there any benefits in it being u64? Thanks, Davidlohr ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] cxl/pmem: Export dirty shutdown count via sysfs 2025-02-11 4:50 ` Davidlohr Bueso @ 2025-02-18 18:58 ` Dave Jiang 0 siblings, 0 replies; 9+ messages in thread From: Dave Jiang @ 2025-02-18 18:58 UTC (permalink / raw) To: Davidlohr Bueso Cc: dan.j.williams, jonathan.cameron, alison.schofield, ira.weiny, vishal.l.verma, seven.yi.lee, a.manzanares, fan.ni, anisa.su, linux-cxl On 2/10/25 9:50 PM, Davidlohr Bueso wrote: > On Mon, 10 Feb 2025, Dave Jiang wrote: > >> On 2/4/25 9:08 PM, Davidlohr Bueso wrote: >>> @@ -542,6 +542,7 @@ struct cxl_nvdimm { >>> struct device dev; >>> struct cxl_memdev *cxlmd; >>> u8 dev_id[CXL_DEV_ID_LEN]; /* for nvdimm, string of 'serial' */ >>> + long dirty_shutdown; >> >> Maybe consider using u64 and call it 'dirty_shutdowns'? > > Imo signed type having -1 is more natural - are there any benefits in > it being u64? If it's a signed value and when it rolls over, it would show up as negative wouldn't it? Seems inconsistent with what the sysfs documentation says. Although that's a lot of dirty shutdowns for a system.... I suppose you can always force it to be unsigned when emit via sysfs. > > Thanks, > Davidlohr ^ permalink raw reply [flat|nested] 9+ messages in thread
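The signed-vs-unsigned question above hinges on widths: the hardware count is a u32, so on LP64 targets (64-bit `long`, as on the usual kernel builds) even a fully wrapped count remains positive in a `long`, leaving -1 free as a sentinel. A small sketch of that tradeoff (this demo assumes 64-bit `long`):

```c
#include <assert.h>
#include <stdint.h>

/* Latch a u32 hardware count into a signed slot. When long is
 * 64-bit, every possible u32 value fits as a non-negative long,
 * so only the explicit -1 sentinel is ever negative. */
long latch_dirty_count(uint32_t hw_count)
{
	return (long)hw_count;
}
```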
* [PATCH 2/2] tools/testing/cxl: Set Shutdown State support 2025-02-05 4:08 [PATCH 0/2] cxl: Dirty shutdown followups Davidlohr Bueso 2025-02-05 4:08 ` [PATCH 1/2] cxl/pmem: Export dirty shutdown count via sysfs Davidlohr Bueso @ 2025-02-05 4:08 ` Davidlohr Bueso 2025-02-11 2:16 ` Dave Jiang 1 sibling, 1 reply; 9+ messages in thread From: Davidlohr Bueso @ 2025-02-05 4:08 UTC (permalink / raw) To: dave.jiang, dan.j.williams Cc: jonathan.cameron, alison.schofield, ira.weiny, vishal.l.verma, seven.yi.lee, a.manzanares, fan.ni, anisa.su, dave, linux-cxl Add support to emulate the CXL Set Shutdown State operation. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> --- tools/testing/cxl/test/mem.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c index 495199238335..832680a87c73 100644 --- a/tools/testing/cxl/test/mem.c +++ b/tools/testing/cxl/test/mem.c @@ -65,6 +65,10 @@ static struct cxl_cel_entry mock_cel[] = { .opcode = cpu_to_le16(CXL_MBOX_OP_GET_HEALTH_INFO), .effect = CXL_CMD_EFFECT_NONE, }, + { + .opcode = cpu_to_le16(CXL_MBOX_OP_SET_SHUTDOWN_STATE), + .effect = POLICY_CHANGE_IMMEDIATE, + }, { .opcode = cpu_to_le16(CXL_MBOX_OP_GET_POISON), .effect = CXL_CMD_EFFECT_NONE, @@ -161,6 +165,7 @@ struct cxl_mockmem_data { u8 event_buf[SZ_4K]; u64 timestamp; unsigned long sanitize_timeout; + int shutdown_state; }; static struct mock_event_log *event_find_log(struct device *dev, int log_type) @@ -1088,6 +1093,21 @@ static int mock_health_info(struct cxl_mbox_cmd *cmd) return 0; } +static int mock_set_shutdown_state(struct cxl_mockmem_data *mdata, + struct cxl_mbox_cmd *cmd) +{ + struct cxl_mbox_set_shutdown_state_in *ss = cmd->payload_in; + + if (cmd->size_in != sizeof(*ss)) + return -EINVAL; + + if (cmd->size_out != 0) + return -EINVAL; + + mdata->shutdown_state = ss->state; + return 0; +} + static struct mock_poison { struct cxl_dev_state *cxlds; u64 dpa; @@ -1421,6 +1441,9 @@ static 
int cxl_mock_mbox_send(struct cxl_mailbox *cxl_mbox, case CXL_MBOX_OP_PASSPHRASE_SECURE_ERASE: rc = mock_passphrase_secure_erase(mdata, cmd); break; + case CXL_MBOX_OP_SET_SHUTDOWN_STATE: + rc = mock_set_shutdown_state(mdata, cmd); + break; case CXL_MBOX_OP_GET_POISON: rc = mock_get_poison(cxlds, cmd); break; -- 2.39.5 ^ permalink raw reply related [flat|nested] 9+ messages in thread
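The mock handler above validates both payload sizes before latching the state: Set Shutdown State takes a one-byte input and no output. That shape can be exercised in a userspace sketch (simplified stand-ins for the kernel's `cxl_mbox_cmd` and mockmem structures; -22 stands in for -EINVAL):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct set_shutdown_state_in {
	uint8_t state;
} __attribute__((packed));

struct mbox_cmd {
	size_t size_in;
	size_t size_out;
	void *payload_in;
};

/* Sketch of the mock: reject malformed commands, otherwise record
 * the requested shutdown state. */
int mock_set_shutdown_state(int *shutdown_state, struct mbox_cmd *cmd)
{
	struct set_shutdown_state_in *ss = cmd->payload_in;

	if (cmd->size_in != sizeof(*ss))	/* must be exactly 1 byte in */
		return -22;
	if (cmd->size_out != 0)			/* command has no output payload */
		return -22;

	*shutdown_state = ss->state;
	return 0;
}
```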
* Re: [PATCH 2/2] tools/testing/cxl: Set Shutdown State support 2025-02-05 4:08 ` [PATCH 2/2] tools/testing/cxl: Set Shutdown State support Davidlohr Bueso @ 2025-02-11 2:16 ` Dave Jiang 2025-02-11 4:13 ` Davidlohr Bueso 0 siblings, 1 reply; 9+ messages in thread From: Dave Jiang @ 2025-02-11 2:16 UTC (permalink / raw) To: Davidlohr Bueso, dan.j.williams Cc: jonathan.cameron, alison.schofield, ira.weiny, vishal.l.verma, seven.yi.lee, a.manzanares, fan.ni, anisa.su, linux-cxl On 2/4/25 9:08 PM, Davidlohr Bueso wrote: > Add support to emulate the CXL Set Shutdown State operation. Should the CXL_MBOX_OP_GET_SHUTDOWN_STATE support be added as well so the cxl_test can excercise the sysfs attrib and other parts of patch 1? > > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> otherwise Reviewed-by: Dave Jiang <dave.jiang@intel.com> > --- > tools/testing/cxl/test/mem.c | 23 +++++++++++++++++++++++ > 1 file changed, 23 insertions(+) > > diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c > index 495199238335..832680a87c73 100644 > --- a/tools/testing/cxl/test/mem.c > +++ b/tools/testing/cxl/test/mem.c > @@ -65,6 +65,10 @@ static struct cxl_cel_entry mock_cel[] = { > .opcode = cpu_to_le16(CXL_MBOX_OP_GET_HEALTH_INFO), > .effect = CXL_CMD_EFFECT_NONE, > }, > + { > + .opcode = cpu_to_le16(CXL_MBOX_OP_SET_SHUTDOWN_STATE), > + .effect = POLICY_CHANGE_IMMEDIATE, > + }, > { > .opcode = cpu_to_le16(CXL_MBOX_OP_GET_POISON), > .effect = CXL_CMD_EFFECT_NONE, > @@ -161,6 +165,7 @@ struct cxl_mockmem_data { > u8 event_buf[SZ_4K]; > u64 timestamp; > unsigned long sanitize_timeout; > + int shutdown_state; > }; > > static struct mock_event_log *event_find_log(struct device *dev, int log_type) > @@ -1088,6 +1093,21 @@ static int mock_health_info(struct cxl_mbox_cmd *cmd) > return 0; > } > > +static int mock_set_shutdown_state(struct cxl_mockmem_data *mdata, > + struct cxl_mbox_cmd *cmd) > +{ > + struct cxl_mbox_set_shutdown_state_in *ss = cmd->payload_in; > + > + if 
(cmd->size_in != sizeof(*ss)) > + return -EINVAL; > + > + if (cmd->size_out != 0) > + return -EINVAL; > + > + mdata->shutdown_state = ss->state; > + return 0; > +} > + > static struct mock_poison { > struct cxl_dev_state *cxlds; > u64 dpa; > @@ -1421,6 +1441,9 @@ static int cxl_mock_mbox_send(struct cxl_mailbox *cxl_mbox, > case CXL_MBOX_OP_PASSPHRASE_SECURE_ERASE: > rc = mock_passphrase_secure_erase(mdata, cmd); > break; > + case CXL_MBOX_OP_SET_SHUTDOWN_STATE: > + rc = mock_set_shutdown_state(mdata, cmd); > + break; > case CXL_MBOX_OP_GET_POISON: > rc = mock_get_poison(cxlds, cmd); > break; ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] tools/testing/cxl: Set Shutdown State support 2025-02-11 2:16 ` Dave Jiang @ 2025-02-11 4:13 ` Davidlohr Bueso 0 siblings, 0 replies; 9+ messages in thread From: Davidlohr Bueso @ 2025-02-11 4:13 UTC (permalink / raw) To: Dave Jiang Cc: dan.j.williams, jonathan.cameron, alison.schofield, ira.weiny, vishal.l.verma, seven.yi.lee, a.manzanares, fan.ni, anisa.su, linux-cxl On Mon, 10 Feb 2025, Dave Jiang wrote: > > >On 2/4/25 9:08 PM, Davidlohr Bueso wrote: >> Add support to emulate the CXL Set Shutdown State operation. > >Should the CXL_MBOX_OP_GET_SHUTDOWN_STATE support be added as well so the cxl_test can excercise the sysfs attrib and other parts of patch 1? CXL_MBOX_OP_GET_SHUTDOWN_STATE is unused altogether - you might be thinking of CXL_MBOX_OP_GET_HEALTH_INFO? (which is already supported by the mock device) Thanks, Davidlohr ^ permalink raw reply [flat|nested] 9+ messages in thread