* [PATCH] cxl/cdat: Free correct buffer on checksum error
@ 2023-11-17 0:03 Ira Weiny
2023-11-17 15:50 ` Dave Jiang
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Ira Weiny @ 2023-11-17 0:03 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Dan Williams
Cc: linux-cxl, linux-kernel, Fan Ni, Ira Weiny
The new 6.7-rc1 kernel now checks the checksum on CDAT data. While
using a branch of Fan's DCD qemu work (and specifying DCD devices), the
following splat was observed.
WARNING: CPU: 1 PID: 1384 at drivers/base/devres.c:1064 devm_kfree+0x4f/0x60
...
RIP: 0010:devm_kfree+0x4f/0x60
...
? devm_kfree+0x4f/0x60
read_cdat_data+0x1a0/0x2a0 [cxl_core]
cxl_port_probe+0xdf/0x200 [cxl_port]
...
The issue in qemu is still unknown but the spat is a straight forward
bug in the CDAT checksum processing code. Use a CDAT buffer variable to
ensure the devm_free() works correctly on error.
Cc: jonathan.cameron@huawei.com
Cc: Fan Ni <nifan.cxl@gmail.com>
Fixes: 670e4e88f3b1 ("cxl: Add checksum verification to CDAT from CXL")
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
drivers/cxl/core/pci.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index eff20e83d0a6..5aaa0b36c42a 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -620,7 +620,7 @@ void read_cdat_data(struct cxl_port *port)
struct pci_dev *pdev = NULL;
struct cxl_memdev *cxlmd;
size_t cdat_length;
- void *cdat_table;
+ void *cdat_table, *cdat_buf;
int rc;
if (is_cxl_memdev(uport)) {
@@ -651,16 +651,16 @@ void read_cdat_data(struct cxl_port *port)
return;
}
- cdat_table = devm_kzalloc(dev, cdat_length + sizeof(__le32),
+ cdat_buf = devm_kzalloc(dev, cdat_length + sizeof(__le32),
GFP_KERNEL);
- if (!cdat_table)
+ if (!cdat_buf)
return;
- rc = cxl_cdat_read_table(dev, cdat_doe, cdat_table, &cdat_length);
+ rc = cxl_cdat_read_table(dev, cdat_doe, cdat_buf, &cdat_length);
if (rc)
goto err;
- cdat_table = cdat_table + sizeof(__le32);
+ cdat_table = cdat_buf + sizeof(__le32);
if (cdat_checksum(cdat_table, cdat_length))
goto err;
@@ -670,7 +670,7 @@ void read_cdat_data(struct cxl_port *port)
err:
/* Don't leave table data allocated on error */
- devm_kfree(dev, cdat_table);
+ devm_kfree(dev, cdat_buf);
dev_err(dev, "Failed to read/validate CDAT.\n");
}
EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
---
base-commit: 7475e51b87969e01a6812eac713a1c8310372e8a
change-id: 20231116-fix-cdat-devm-free-b47d32b4b833
Best regards,
--
Ira Weiny <ira.weiny@intel.com>
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH] cxl/cdat: Free correct buffer on checksum error 2023-11-17 0:03 [PATCH] cxl/cdat: Free correct buffer on checksum error Ira Weiny @ 2023-11-17 15:50 ` Dave Jiang 2023-11-17 17:14 ` fan 2023-11-17 20:09 ` Robert Richter 2 siblings, 0 replies; 13+ messages in thread From: Dave Jiang @ 2023-11-17 15:50 UTC (permalink / raw) To: Ira Weiny, Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma, Dan Williams Cc: linux-cxl, linux-kernel, Fan Ni On 11/16/23 17:03, Ira Weiny wrote: > The new 6.7-rc1 kernel now checks the checksum on CDAT data. While > using a branch of Fan's DCD qemu work (and specifying DCD devices), the > following splat was observed. > > WARNING: CPU: 1 PID: 1384 at drivers/base/devres.c:1064 devm_kfree+0x4f/0x60 > ... > RIP: 0010:devm_kfree+0x4f/0x60 > ... > ? devm_kfree+0x4f/0x60 > read_cdat_data+0x1a0/0x2a0 [cxl_core] > cxl_port_probe+0xdf/0x200 [cxl_port] > ... > > The issue in qemu is still unknown but the spat is a straight forward > bug in the CDAT checksum processing code. Use a CDAT buffer variable to > ensure the devm_free() works correctly on error. > > Cc: jonathan.cameron@huawei.com > Cc: Fan Ni <nifan.cxl@gmail.com> > Fixes: 670e4e88f3b1 ("cxl: Add checksum verification to CDAT from CXL") > Cc: Dave Jiang <dave.jiang@intel.com> > Signed-off-by: Ira Weiny <ira.weiny@intel.com> Thanks for the fix Reviewed-by: Dave Jiang <dave.jiang@intel.com> > --- > drivers/cxl/core/pci.c | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index eff20e83d0a6..5aaa0b36c42a 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -620,7 +620,7 @@ void read_cdat_data(struct cxl_port *port) > struct pci_dev *pdev = NULL; > struct cxl_memdev *cxlmd; > size_t cdat_length; > - void *cdat_table; > + void *cdat_table, *cdat_buf; > int rc; > > if (is_cxl_memdev(uport)) { > @@ -651,16 +651,16 @@ void read_cdat_data(struct cxl_port *port) > return; > } > > - cdat_table = devm_kzalloc(dev, cdat_length + sizeof(__le32), > + cdat_buf = devm_kzalloc(dev, cdat_length + sizeof(__le32), > GFP_KERNEL); > - if (!cdat_table) > + if (!cdat_buf) > return; > > - rc = cxl_cdat_read_table(dev, cdat_doe, cdat_table, &cdat_length); > + rc = cxl_cdat_read_table(dev, cdat_doe, cdat_buf, &cdat_length); > if (rc) > goto err; > > - cdat_table = cdat_table + sizeof(__le32); > + cdat_table = cdat_buf + sizeof(__le32); > if (cdat_checksum(cdat_table, cdat_length)) > goto err; > > @@ -670,7 +670,7 @@ void read_cdat_data(struct cxl_port *port) > > err: > /* Don't leave table data allocated on error */ > - devm_kfree(dev, cdat_table); > + devm_kfree(dev, cdat_buf); > dev_err(dev, "Failed to read/validate CDAT.\n"); > } > EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL); > > --- > base-commit: 7475e51b87969e01a6812eac713a1c8310372e8a > change-id: 20231116-fix-cdat-devm-free-b47d32b4b833 > > Best regards, ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/cdat: Free correct buffer on checksum error 2023-11-17 0:03 [PATCH] cxl/cdat: Free correct buffer on checksum error Ira Weiny 2023-11-17 15:50 ` Dave Jiang @ 2023-11-17 17:14 ` fan 2023-11-17 20:09 ` Robert Richter 2 siblings, 0 replies; 13+ messages in thread From: fan @ 2023-11-17 17:14 UTC (permalink / raw) To: Ira Weiny Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, Dan Williams, linux-cxl, linux-kernel, Fan Ni On Thu, Nov 16, 2023 at 04:03:29PM -0800, Ira Weiny wrote: > The new 6.7-rc1 kernel now checks the checksum on CDAT data. While > using a branch of Fan's DCD qemu work (and specifying DCD devices), the > following splat was observed. > > WARNING: CPU: 1 PID: 1384 at drivers/base/devres.c:1064 devm_kfree+0x4f/0x60 > ... > RIP: 0010:devm_kfree+0x4f/0x60 > ... > ? devm_kfree+0x4f/0x60 > read_cdat_data+0x1a0/0x2a0 [cxl_core] > cxl_port_probe+0xdf/0x200 [cxl_port] > ... > > The issue in qemu is still unknown but the spat is a straight forward > bug in the CDAT checksum processing code. Use a CDAT buffer variable to > ensure the devm_free() works correctly on error. > > Cc: jonathan.cameron@huawei.com > Cc: Fan Ni <nifan.cxl@gmail.com> > Fixes: 670e4e88f3b1 ("cxl: Add checksum verification to CDAT from CXL") > Cc: Dave Jiang <dave.jiang@intel.com> > Signed-off-by: Ira Weiny <ira.weiny@intel.com> > --- Good catch. Reviewed-by: Fan Ni <fan.ni@samsung.com> > drivers/cxl/core/pci.c | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index eff20e83d0a6..5aaa0b36c42a 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -620,7 +620,7 @@ void read_cdat_data(struct cxl_port *port) > struct pci_dev *pdev = NULL; > struct cxl_memdev *cxlmd; > size_t cdat_length; > - void *cdat_table; > + void *cdat_table, *cdat_buf; > int rc; > > if (is_cxl_memdev(uport)) { > @@ -651,16 +651,16 @@ void read_cdat_data(struct cxl_port *port) > return; > } > > - cdat_table = devm_kzalloc(dev, cdat_length + sizeof(__le32), > + cdat_buf = devm_kzalloc(dev, cdat_length + sizeof(__le32), > GFP_KERNEL); > - if (!cdat_table) > + if (!cdat_buf) > return; > > - rc = cxl_cdat_read_table(dev, cdat_doe, cdat_table, &cdat_length); > + rc = cxl_cdat_read_table(dev, cdat_doe, cdat_buf, &cdat_length); > if (rc) > goto err; > > - cdat_table = cdat_table + sizeof(__le32); > + cdat_table = cdat_buf + sizeof(__le32); > if (cdat_checksum(cdat_table, cdat_length)) > goto err; > > @@ -670,7 +670,7 @@ void read_cdat_data(struct cxl_port *port) > > err: > /* Don't leave table data allocated on error */ > - devm_kfree(dev, cdat_table); > + devm_kfree(dev, cdat_buf); > dev_err(dev, "Failed to read/validate CDAT.\n"); > } > EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL); > > --- > base-commit: 7475e51b87969e01a6812eac713a1c8310372e8a > change-id: 20231116-fix-cdat-devm-free-b47d32b4b833 > > Best regards, > -- > Ira Weiny <ira.weiny@intel.com> > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/cdat: Free correct buffer on checksum error 2023-11-17 0:03 [PATCH] cxl/cdat: Free correct buffer on checksum error Ira Weiny 2023-11-17 15:50 ` Dave Jiang 2023-11-17 17:14 ` fan @ 2023-11-17 20:09 ` Robert Richter 2023-11-17 20:15 ` [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table Robert Richter 2 siblings, 1 reply; 13+ messages in thread From: Robert Richter @ 2023-11-17 20:09 UTC (permalink / raw) To: Ira Weiny Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, Dan Williams, linux-cxl, linux-kernel, Fan Ni On 16.11.23 16:03:29, Ira Weiny wrote: > The new 6.7-rc1 kernel now checks the checksum on CDAT data. While > using a branch of Fan's DCD qemu work (and specifying DCD devices), the > following splat was observed. > > WARNING: CPU: 1 PID: 1384 at drivers/base/devres.c:1064 devm_kfree+0x4f/0x60 > ... > RIP: 0010:devm_kfree+0x4f/0x60 > ... > ? devm_kfree+0x4f/0x60 > read_cdat_data+0x1a0/0x2a0 [cxl_core] > cxl_port_probe+0xdf/0x200 [cxl_port] > ... > > The issue in qemu is still unknown but the spat is a straight forward > bug in the CDAT checksum processing code. Use a CDAT buffer variable to > ensure the devm_free() works correctly on error. > > Cc: jonathan.cameron@huawei.com > Cc: Fan Ni <nifan.cxl@gmail.com> > Fixes: 670e4e88f3b1 ("cxl: Add checksum verification to CDAT from CXL") > Cc: Dave Jiang <dave.jiang@intel.com> > Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> I will send an on-top patch for 6.8 that reworks that code area to remove the pointer arithmetic. -Robert ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table 2023-11-17 20:09 ` Robert Richter @ 2023-11-17 20:15 ` Robert Richter 2023-11-17 20:25 ` Dave Jiang ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Robert Richter @ 2023-11-17 20:15 UTC (permalink / raw) To: Dan Williams Cc: Ira Weiny, Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, linux-cxl, linux-kernel, Fan Ni, Lukas Wunner On 17.11.23 21:09:18, Robert Richter wrote: > I will send an on-top patch for 6.8 that reworks that code area to > remove the pointer arithmetic. Here it is: From 13787f72c20b8c54754ae86015d982307eae0397 Mon Sep 17 00:00:00 2001 From: Robert Richter <rrichter@amd.com> Subject: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table Reading the CDAT table using DOE requires a Table Access Response Header in addition to the CDAT entry. In current implementation this has caused offsets with sizeof(__le32) to the actual buffers. This led to hardly readable code and even bugs (see fix of devm_kfree() in read_cdat_data()). Rework code to avoid calculations with sizeof(__le32). Introduce struct cdat_doe for this which contains the Table Access Response Header and a variable payload size for various data structures afterwards to access the CDAT table and its CDAT Data Structures without recalculating buffer offsets. Cc: Lukas Wunner <lukas@wunner.de> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Fan Ni <nifan.cxl@gmail.com> Signed-off-by: Robert Richter <rrichter@amd.com> --- drivers/cxl/core/pci.c | 80 ++++++++++++++++++++---------------------- drivers/cxl/cxlpci.h | 19 ++++++++++ 2 files changed, 57 insertions(+), 42 deletions(-) diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c index 5aaa0b36c42a..f900740c6dea 100644 --- a/drivers/cxl/core/pci.c +++ b/drivers/cxl/core/pci.c @@ -517,14 +517,14 @@ EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, CXL); FIELD_PREP(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, (entry_handle))) static int cxl_cdat_get_length(struct device *dev, - struct pci_doe_mb *cdat_doe, + struct pci_doe_mb *doe_mb, size_t *length) { __le32 request = CDAT_DOE_REQ(0); __le32 response[2]; int rc; - rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL, + rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL, CXL_DOE_PROTOCOL_TABLE_ACCESS, &request, sizeof(request), &response, sizeof(response)); @@ -542,56 +542,54 @@ static int cxl_cdat_get_length(struct device *dev, } static int cxl_cdat_read_table(struct device *dev, - struct pci_doe_mb *cdat_doe, - void *cdat_table, size_t *cdat_length) + struct pci_doe_mb *doe_mb, + struct cdat_doe *doe, size_t *length) { - size_t length = *cdat_length + sizeof(__le32); - __le32 *data = cdat_table; + size_t received, remaining = *length; int entry_handle = 0; __le32 saved_dw = 0; do { __le32 request = CDAT_DOE_REQ(entry_handle); - struct cdat_entry_header *entry; - size_t entry_dw; int rc; - rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL, + rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL, CXL_DOE_PROTOCOL_TABLE_ACCESS, &request, sizeof(request), - data, length); + doe, sizeof(*doe) + remaining); if (rc < 0) { dev_err(dev, "DOE failed: %d", rc); return rc; } - /* 1 DW Table Access Response Header + CDAT entry */ - entry = (struct cdat_entry_header *)(data + 1); + if (rc < sizeof(*doe)) + return -EIO; + + received = rc - sizeof(*doe); + if ((entry_handle == 0 && - rc != sizeof(__le32) + sizeof(struct cdat_header)) || + received != sizeof(doe->header[0])) || (entry_handle > 0 && - (rc < sizeof(__le32) + sizeof(*entry) || - rc != sizeof(__le32) + le16_to_cpu(entry->length)))) + (received < sizeof(doe->entry[0]) || + received != le16_to_cpu(doe->entry->length)))) return -EIO; /* Get the CXL table access header entry handle */ entry_handle = FIELD_GET(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, - le32_to_cpu(data[0])); - entry_dw = rc / sizeof(__le32); - /* Skip Header */ - entry_dw -= 1; + le32_to_cpu(doe->doe_header)); + /* * Table Access Response Header overwrote the last DW of * previous entry, so restore that DW */ - *data = saved_dw; - length -= entry_dw * sizeof(__le32); - data += entry_dw; - saved_dw = *data; + doe->doe_header = saved_dw; + remaining -= received; + doe = (void *)doe + received; + saved_dw = doe->doe_header; } while (entry_handle != CXL_DOE_TABLE_ACCESS_LAST_ENTRY); /* Length in CDAT header may exceed concatenation of CDAT entries */ - *cdat_length -= length - sizeof(__le32); + *length -= remaining; return 0; } @@ -616,11 +614,11 @@ void read_cdat_data(struct cxl_port *port) { struct device *uport = port->uport_dev; struct device *dev = &port->dev; - struct pci_doe_mb *cdat_doe; + struct pci_doe_mb *doe_mb; struct pci_dev *pdev = NULL; struct cxl_memdev *cxlmd; - size_t cdat_length; - void *cdat_table, *cdat_buf; + struct cdat_doe *doe; + size_t length; int rc; if (is_cxl_memdev(uport)) { @@ -637,40 +635,38 @@ void read_cdat_data(struct cxl_port *port) if (!pdev) return; - cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, - CXL_DOE_PROTOCOL_TABLE_ACCESS); - if (!cdat_doe) { + doe_mb = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, + CXL_DOE_PROTOCOL_TABLE_ACCESS); + if (!doe_mb) { dev_dbg(dev, "No CDAT mailbox\n"); return; } port->cdat_available = true; - if (cxl_cdat_get_length(dev, cdat_doe, &cdat_length)) { + if (cxl_cdat_get_length(dev, doe_mb, &length)) { dev_dbg(dev, "No CDAT length\n"); return; } - cdat_buf = devm_kzalloc(dev, cdat_length + sizeof(__le32), - GFP_KERNEL); - if (!cdat_buf) - return; + doe = devm_kzalloc(dev, sizeof(*doe) + length, GFP_KERNEL); + if (!doe) + goto err; - rc = cxl_cdat_read_table(dev, cdat_doe, cdat_buf, &cdat_length); + rc = cxl_cdat_read_table(dev, doe_mb, doe, &length); if (rc) goto err; - cdat_table = cdat_buf + sizeof(__le32); - if (cdat_checksum(cdat_table, cdat_length)) + if (cdat_checksum(doe->table, length)) goto err; - port->cdat.table = cdat_table; - port->cdat.length = cdat_length; - return; + port->cdat.table = doe->table; + port->cdat.length = length; + return; err: /* Don't leave table data allocated on error */ - devm_kfree(dev, cdat_buf); + devm_kfree(dev, doe); dev_err(dev, "Failed to read/validate CDAT.\n"); } EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL); diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h index 0fa4799ea316..d12ed9d8dec1 100644 --- a/drivers/cxl/cxlpci.h +++ b/drivers/cxl/cxlpci.h @@ -85,6 +85,25 @@ struct cdat_entry_header { __le16 length; } __packed; +/* + * Response contains the CDAT only response header of the DOE. The + * response payload is a CDAT structure (either CDAT header or entry), + * it may also mark the beginning of the CDAT table. + * + * Spec refs: + * + * CXL 3.1 Table 8-14: Read Entry Response + * CDAT Specification 1.03: 2 CDAT Data Structures + */ +struct cdat_doe { + __le32 doe_header; + union { + u8 table[0]; + struct cdat_header header[0]; + struct cdat_entry_header entry[0]; + }; +} __packed; + int devm_cxl_port_enumerate_dports(struct cxl_port *port); struct cxl_dev_state; int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, -- 2.39.2 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table 2023-11-17 20:15 ` [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table Robert Richter @ 2023-11-17 20:25 ` Dave Jiang 2023-11-28 20:06 ` Ira Weiny 2023-12-15 4:34 ` Dan Williams 2 siblings, 0 replies; 13+ messages in thread From: Dave Jiang @ 2023-11-17 20:25 UTC (permalink / raw) To: Robert Richter, Dan Williams Cc: Ira Weiny, Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma, linux-cxl, linux-kernel, Fan Ni, Lukas Wunner On 11/17/23 13:15, Robert Richter wrote: > On 17.11.23 21:09:18, Robert Richter wrote: >> I will send an on-top patch for 6.8 that reworks that code area to >> remove the pointer arithmetic. > > Here it is: > > From 13787f72c20b8c54754ae86015d982307eae0397 Mon Sep 17 00:00:00 2001 > From: Robert Richter <rrichter@amd.com> > Subject: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table > > Reading the CDAT table using DOE requires a Table Access Response > Header in addition to the CDAT entry. In current implementation this > has caused offsets with sizeof(__le32) to the actual buffers. This led > to hardly readable code and even bugs (see fix of devm_kfree() in > read_cdat_data()). > > Rework code to avoid calculations with sizeof(__le32). Introduce > struct cdat_doe for this which contains the Table Access Response > Header and a variable payload size for various data structures > afterwards to access the CDAT table and its CDAT Data Structures > without recalculating buffer offsets. > > Cc: Lukas Wunner <lukas@wunner.de> > Cc: Dave Jiang <dave.jiang@intel.com> > Cc: Fan Ni <nifan.cxl@gmail.com> > Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> > --- > drivers/cxl/core/pci.c | 80 ++++++++++++++++++++---------------------- > drivers/cxl/cxlpci.h | 19 ++++++++++ > 2 files changed, 57 insertions(+), 42 deletions(-) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index 5aaa0b36c42a..f900740c6dea 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -517,14 +517,14 @@ EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, CXL); > FIELD_PREP(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, (entry_handle))) > > static int cxl_cdat_get_length(struct device *dev, > - struct pci_doe_mb *cdat_doe, > + struct pci_doe_mb *doe_mb, > size_t *length) > { > __le32 request = CDAT_DOE_REQ(0); > __le32 response[2]; > int rc; > > - rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL, > + rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL, > CXL_DOE_PROTOCOL_TABLE_ACCESS, > &request, sizeof(request), > &response, sizeof(response)); > @@ -542,56 +542,54 @@ static int cxl_cdat_get_length(struct device *dev, > } > > static int cxl_cdat_read_table(struct device *dev, > - struct pci_doe_mb *cdat_doe, > - void *cdat_table, size_t *cdat_length) > + struct pci_doe_mb *doe_mb, > + struct cdat_doe *doe, size_t *length) > { > - size_t length = *cdat_length + sizeof(__le32); > - __le32 *data = cdat_table; > + size_t received, remaining = *length; > int entry_handle = 0; > __le32 saved_dw = 0; > > do { > __le32 request = CDAT_DOE_REQ(entry_handle); > - struct cdat_entry_header *entry; > - size_t entry_dw; > int rc; > > - rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL, > + rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL, > CXL_DOE_PROTOCOL_TABLE_ACCESS, > &request, sizeof(request), > - data, length); > + doe, sizeof(*doe) + remaining); > if (rc < 0) { > dev_err(dev, "DOE failed: %d", rc); > return rc; > } > > - /* 1 DW Table Access Response Header + CDAT entry */ > - entry = (struct cdat_entry_header *)(data + 1); > + if (rc < sizeof(*doe)) > + return -EIO; > + > + received = rc - sizeof(*doe); > + > if ((entry_handle == 0 && > - rc != sizeof(__le32) + sizeof(struct cdat_header)) || > + received != sizeof(doe->header[0])) || > (entry_handle > 0 && > - (rc < sizeof(__le32) + sizeof(*entry) || > - rc != sizeof(__le32) + le16_to_cpu(entry->length)))) > + (received < sizeof(doe->entry[0]) || > + received != le16_to_cpu(doe->entry->length)))) > return -EIO; > > /* Get the CXL table access header entry handle */ > entry_handle = FIELD_GET(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, > - le32_to_cpu(data[0])); > - entry_dw = rc / sizeof(__le32); > - /* Skip Header */ > - entry_dw -= 1; > + le32_to_cpu(doe->doe_header)); > + > /* > * Table Access Response Header overwrote the last DW of > * previous entry, so restore that DW > */ > - *data = saved_dw; > - length -= entry_dw * sizeof(__le32); > - data += entry_dw; > - saved_dw = *data; > + doe->doe_header = saved_dw; > + remaining -= received; > + doe = (void *)doe + received; > + saved_dw = doe->doe_header; > } while (entry_handle != CXL_DOE_TABLE_ACCESS_LAST_ENTRY); > > /* Length in CDAT header may exceed concatenation of CDAT entries */ > - *cdat_length -= length - sizeof(__le32); > + *length -= remaining; > > return 0; > } > @@ -616,11 +614,11 @@ void read_cdat_data(struct cxl_port *port) > { > struct device *uport = port->uport_dev; > struct device *dev = &port->dev; > - struct pci_doe_mb *cdat_doe; > + struct pci_doe_mb *doe_mb; > struct pci_dev *pdev = NULL; > struct cxl_memdev *cxlmd; > - size_t cdat_length; > - void *cdat_table, *cdat_buf; > + struct cdat_doe *doe; > + size_t length; > int rc; > > if (is_cxl_memdev(uport)) { > @@ -637,40 +635,38 @@ void read_cdat_data(struct cxl_port *port) > if (!pdev) > return; > > - cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, > - CXL_DOE_PROTOCOL_TABLE_ACCESS); > - if (!cdat_doe) { > + doe_mb = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, > + CXL_DOE_PROTOCOL_TABLE_ACCESS); > + if (!doe_mb) { > dev_dbg(dev, "No CDAT mailbox\n"); > return; > } > > port->cdat_available = true; > > - if (cxl_cdat_get_length(dev, cdat_doe, &cdat_length)) { > + if (cxl_cdat_get_length(dev, doe_mb, &length)) { > dev_dbg(dev, "No CDAT length\n"); > return; > } > > - cdat_buf = devm_kzalloc(dev, cdat_length + sizeof(__le32), > - GFP_KERNEL); > - if (!cdat_buf) > - return; > + doe = devm_kzalloc(dev, sizeof(*doe) + length, GFP_KERNEL); > + if (!doe) > + goto err; > > - rc = cxl_cdat_read_table(dev, cdat_doe, cdat_buf, &cdat_length); > + rc = cxl_cdat_read_table(dev, doe_mb, doe, &length); > if (rc) > goto err; > > - cdat_table = cdat_buf + sizeof(__le32); > - if (cdat_checksum(cdat_table, cdat_length)) > + if (cdat_checksum(doe->table, length)) > goto err; > > - port->cdat.table = cdat_table; > - port->cdat.length = cdat_length; > - return; > + port->cdat.table = doe->table; > + port->cdat.length = length; > > + return; > err: > /* Don't leave table data allocated on error */ > - devm_kfree(dev, cdat_buf); > + devm_kfree(dev, doe); > dev_err(dev, "Failed to read/validate CDAT.\n"); > } > EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL); > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h > index 0fa4799ea316..d12ed9d8dec1 100644 > --- a/drivers/cxl/cxlpci.h > +++ b/drivers/cxl/cxlpci.h > @@ -85,6 +85,25 @@ struct cdat_entry_header { > __le16 length; > } __packed; > > +/* > + * Response contains the CDAT only response header of the DOE. The > + * response payload is a CDAT structure (either CDAT header or entry), > + * it may also mark the beginning of the CDAT table. > + * > + * Spec refs: > + * > + * CXL 3.1 Table 8-14: Read Entry Response > + * CDAT Specification 1.03: 2 CDAT Data Structures > + */ > +struct cdat_doe { > + __le32 doe_header; > + union { > + u8 table[0]; > + struct cdat_header header[0]; > + struct cdat_entry_header entry[0]; > + }; > +} __packed; > + > int devm_cxl_port_enumerate_dports(struct cxl_port *port); > struct cxl_dev_state; > int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table 2023-11-17 20:15 ` [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table Robert Richter 2023-11-17 20:25 ` Dave Jiang @ 2023-11-28 20:06 ` Ira Weiny 2023-12-07 21:18 ` Dan Williams 2024-01-05 14:49 ` Robert Richter 2023-12-15 4:34 ` Dan Williams 2 siblings, 2 replies; 13+ messages in thread From: Ira Weiny @ 2023-11-28 20:06 UTC (permalink / raw) To: Robert Richter, Dan Williams Cc: Ira Weiny, Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, linux-cxl, linux-kernel, Fan Ni, Lukas Wunner Robert Richter wrote: > On 17.11.23 21:09:18, Robert Richter wrote: > > I will send an on-top patch for 6.8 that reworks that code area to > > remove the pointer arithmetic. > > Here it is: > > From 13787f72c20b8c54754ae86015d982307eae0397 Mon Sep 17 00:00:00 2001 > From: Robert Richter <rrichter@amd.com> > Subject: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table > > Reading the CDAT table using DOE requires a Table Access Response > Header in addition to the CDAT entry. In current implementation this > has caused offsets with sizeof(__le32) to the actual buffers. This led > to hardly readable code and even bugs (see fix of devm_kfree() in > read_cdat_data()). > > Rework code to avoid calculations with sizeof(__le32). Introduce > struct cdat_doe for this which contains the Table Access Response > Header and a variable payload size for various data structures > afterwards to access the CDAT table and its CDAT Data Structures > without recalculating buffer offsets. Thanks for this. > > Cc: Lukas Wunner <lukas@wunner.de> > Cc: Dave Jiang <dave.jiang@intel.com> > Cc: Fan Ni <nifan.cxl@gmail.com> > Signed-off-by: Robert Richter <rrichter@amd.com> > --- > drivers/cxl/core/pci.c | 80 ++++++++++++++++++++---------------------- > drivers/cxl/cxlpci.h | 19 ++++++++++ > 2 files changed, 57 insertions(+), 42 deletions(-) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index 5aaa0b36c42a..f900740c6dea 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -517,14 +517,14 @@ EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, CXL); > FIELD_PREP(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, (entry_handle))) > > static int cxl_cdat_get_length(struct device *dev, > - struct pci_doe_mb *cdat_doe, > + struct pci_doe_mb *doe_mb, NIT: Why change the variable name here? > size_t *length) > { > __le32 request = CDAT_DOE_REQ(0); > __le32 response[2]; > int rc; > > - rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL, > + rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL, > CXL_DOE_PROTOCOL_TABLE_ACCESS, > &request, sizeof(request), > &response, sizeof(response)); > @@ -542,56 +542,54 @@ static int cxl_cdat_get_length(struct device *dev, > } > > static int cxl_cdat_read_table(struct device *dev, > - struct pci_doe_mb *cdat_doe, > - void *cdat_table, size_t *cdat_length) > + struct pci_doe_mb *doe_mb, > + struct cdat_doe *doe, size_t *length) > { > - size_t length = *cdat_length + sizeof(__le32); > - __le32 *data = cdat_table; > + size_t received, remaining = *length; > int entry_handle = 0; > __le32 saved_dw = 0; > > do { > __le32 request = CDAT_DOE_REQ(entry_handle); > - struct cdat_entry_header *entry; > - size_t entry_dw; > int rc; > > - rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL, > + rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL, > CXL_DOE_PROTOCOL_TABLE_ACCESS, > &request, sizeof(request), > - data, length); > + doe, sizeof(*doe) + remaining); > if (rc < 0) { > dev_err(dev, "DOE failed: %d", rc); > return rc; > } > > - /* 1 DW Table Access Response Header + CDAT entry */ > - entry = (struct cdat_entry_header *)(data + 1); > + if (rc < sizeof(*doe)) > + return -EIO; > + > + received = rc - sizeof(*doe); This is not ideal. See comments on struct cdat_doe below. > + > if ((entry_handle == 0 && > - rc != sizeof(__le32) + sizeof(struct cdat_header)) || > + received != sizeof(doe->header[0])) || > (entry_handle > 0 && > - (rc < sizeof(__le32) + sizeof(*entry) || > - rc != sizeof(__le32) + le16_to_cpu(entry->length)))) > + (received < sizeof(doe->entry[0]) || > + received != le16_to_cpu(doe->entry->length)))) > return -EIO; > > /* Get the CXL table access header entry handle */ > entry_handle = FIELD_GET(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, > - le32_to_cpu(data[0])); > - entry_dw = rc / sizeof(__le32); > - /* Skip Header */ > - entry_dw -= 1; > + le32_to_cpu(doe->doe_header)); > + > /* > * Table Access Response Header overwrote the last DW of > * previous entry, so restore that DW > */ > - *data = saved_dw; > - length -= entry_dw * sizeof(__le32); > - data += entry_dw; > - saved_dw = *data; > + doe->doe_header = saved_dw; > + remaining -= received; > + doe = (void *)doe + received; > + saved_dw = doe->doe_header; > } while (entry_handle != CXL_DOE_TABLE_ACCESS_LAST_ENTRY); > > /* Length in CDAT header may exceed concatenation of CDAT entries */ > - *cdat_length -= length - sizeof(__le32); > + *length -= remaining; > > return 0; > } > @@ -616,11 +614,11 @@ void read_cdat_data(struct cxl_port *port) > { > struct device *uport = port->uport_dev; > struct device *dev = &port->dev; > - struct pci_doe_mb *cdat_doe; > + struct pci_doe_mb *doe_mb; > struct pci_dev *pdev = NULL; > struct cxl_memdev *cxlmd; > - size_t cdat_length; > - void *cdat_table, *cdat_buf; > + struct cdat_doe *doe; Since we are trying to make this cleaner I would prefer a struct and variable name of cdat_doe_rsp. So... struct cdat_doe_rsp *cdat_doe_rsp; > + size_t length; > int rc; > > if (is_cxl_memdev(uport)) { > @@ -637,40 +635,38 @@ void read_cdat_data(struct cxl_port *port) > if (!pdev) > return; > > - cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, > - CXL_DOE_PROTOCOL_TABLE_ACCESS); > - if (!cdat_doe) { > + doe_mb = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, > + CXL_DOE_PROTOCOL_TABLE_ACCESS); > + if (!doe_mb) { > dev_dbg(dev, "No CDAT mailbox\n"); > return; > } > > port->cdat_available = true; > > - if (cxl_cdat_get_length(dev, cdat_doe, &cdat_length)) { > + if (cxl_cdat_get_length(dev, doe_mb, &length)) { > dev_dbg(dev, "No CDAT length\n"); > return; > } > > - cdat_buf = devm_kzalloc(dev, cdat_length + sizeof(__le32), > - GFP_KERNEL); > - if (!cdat_buf) > - return; > + doe = devm_kzalloc(dev, sizeof(*doe) + length, GFP_KERNEL); > + if (!doe) > + goto err; > > - rc = cxl_cdat_read_table(dev, cdat_doe, cdat_buf, &cdat_length); > + rc = cxl_cdat_read_table(dev, doe_mb, doe, &length); > if (rc) > goto err; > > - cdat_table = cdat_buf + sizeof(__le32); > - if (cdat_checksum(cdat_table, cdat_length)) > + if (cdat_checksum(doe->table, length)) > goto err; > > - port->cdat.table = cdat_table; > - port->cdat.length = cdat_length; > - return; > + port->cdat.table = doe->table; As an aside: the type of port->cdat may need to change at some point too. > + port->cdat.length = length; > > + return; > err: > /* Don't leave table data allocated on error */ > - devm_kfree(dev, cdat_buf); > + devm_kfree(dev, doe); > dev_err(dev, "Failed to read/validate CDAT.\n"); > } > EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL); > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h > index 0fa4799ea316..d12ed9d8dec1 100644 > --- a/drivers/cxl/cxlpci.h > +++ b/drivers/cxl/cxlpci.h > @@ -85,6 +85,25 @@ struct cdat_entry_header { > __le16 length; > } __packed; > > +/* > + * Response contains the CDAT only response header of the DOE. The > + * response payload is a CDAT structure (either CDAT header or entry), > + * it may also mark the beginning of the CDAT table. > + * > + * Spec refs: > + * > + * CXL 3.1 Table 8-14: Read Entry Response > + * CDAT Specification 1.03: 2 CDAT Data Structures > + */ > +struct cdat_doe { > + __le32 doe_header; > + union { > + u8 table[0]; At a minimum we need to use flexible arrays here. See: .../Documentation/process/deprecated.rst 'Zero-length and one-element arrays' > + struct cdat_header header[0]; > + struct cdat_entry_header entry[0]; So this would need to be: DECLARE_FLEX_ARRAY(u8, table); DECLARE_FLEX_ARRAY(struct cdat_header, header); DECLARE_FLEX_ARRAY(struct cdat_entry_header, entry); Also I think it would be best to use sizeof_field(). That said I got crossed up when this structure was used to represent not only the individual DOE responses but also the cdat table as a whole. I think it would be best to call this cdat_doe_rsp and somehow make a distinction in read_cdat_data() when assigning to the entire port cdat table. Ira > + }; > +} __packed; > + > int devm_cxl_port_enumerate_dports(struct cxl_port *port); > struct cxl_dev_state; > int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, > -- > 2.39.2 > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table 2023-11-28 20:06 ` Ira Weiny @ 2023-12-07 21:18 ` Dan Williams 2023-12-08 22:52 ` Ira Weiny 2024-01-05 14:49 ` Robert Richter 1 sibling, 1 reply; 13+ messages in thread From: Dan Williams @ 2023-12-07 21:18 UTC (permalink / raw) To: Ira Weiny, Robert Richter, Dan Williams Cc: Ira Weiny, Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, linux-cxl, linux-kernel, Fan Ni, Lukas Wunner Ira Weiny wrote: > Robert Richter wrote: [..] > > - cdat_table = cdat_buf + sizeof(__le32); > > - if (cdat_checksum(cdat_table, cdat_length)) > > + if (cdat_checksum(doe->table, length)) > > goto err; > > > > - port->cdat.table = cdat_table; > > - port->cdat.length = cdat_length; > > - return; > > + port->cdat.table = doe->table; > > As an aside: the type of port->cdat may need to change at some point too. I did not understand this comment relative to what you expect to see in the next posting, but maybe Robert will when he respins this patch. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table 2023-12-07 21:18 ` Dan Williams @ 2023-12-08 22:52 ` Ira Weiny 0 siblings, 0 replies; 13+ messages in thread From: Ira Weiny @ 2023-12-08 22:52 UTC (permalink / raw) To: Dan Williams, Ira Weiny, Robert Richter Cc: Ira Weiny, Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, linux-cxl, linux-kernel, Fan Ni, Lukas Wunner Dan Williams wrote: > Ira Weiny wrote: > > Robert Richter wrote: > [..] > > > - cdat_table = cdat_buf + sizeof(__le32); > > > - if (cdat_checksum(cdat_table, cdat_length)) > > > + if (cdat_checksum(doe->table, length)) > > > goto err; > > > > > > - port->cdat.table = cdat_table; > > > - port->cdat.length = cdat_length; > > > - return; > > > + port->cdat.table = doe->table; > > > > As an aside: the type of port->cdat may need to change at some point too. > > I did not understand this comment relative to what you expect to see in > the next posting, but maybe Robert will when he respins this patch. Robert introduced some new types which better defined what the read algorithm was doing. Right now the table is just a void *. It would be nice to type it more strongly as it starts to be used in the kernel. Ira ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table 2023-11-28 20:06 ` Ira Weiny 2023-12-07 21:18 ` Dan Williams @ 2024-01-05 14:49 ` Robert Richter 1 sibling, 0 replies; 13+ messages in thread From: Robert Richter @ 2024-01-05 14:49 UTC (permalink / raw) To: Ira Weiny Cc: Dan Williams, Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, linux-cxl, linux-kernel, Fan Ni, Lukas Wunner On 28.11.23 12:06:27, Ira Weiny wrote: > Robert Richter wrote: > > On 17.11.23 21:09:18, Robert Richter wrote: > > > I will send an on-top patch for 6.8 that reworks that code area to > > > remove the pointer arithmetic. > > > > Here it is: > > > > From 13787f72c20b8c54754ae86015d982307eae0397 Mon Sep 17 00:00:00 2001 > > From: Robert Richter <rrichter@amd.com> > > Subject: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table > > > > Reading the CDAT table using DOE requires a Table Access Response > > Header in addition to the CDAT entry. In current implementation this > > has caused offsets with sizeof(__le32) to the actual buffers. This led > > to hardly readable code and even bugs (see fix of devm_kfree() in > > read_cdat_data()). > > > > Rework code to avoid calculations with sizeof(__le32). Introduce > > struct cdat_doe for this which contains the Table Access Response > > Header and a variable payload size for various data structures > > afterwards to access the CDAT table and its CDAT Data Structures > > without recalculating buffer offsets. > > Thanks for this. Thanks for your comments. > > > > > Cc: Lukas Wunner <lukas@wunner.de> > > Cc: Dave Jiang <dave.jiang@intel.com> > > Cc: Fan Ni <nifan.cxl@gmail.com> > > Signed-off-by: Robert Richter <rrichter@amd.com> > > --- > > drivers/cxl/core/pci.c | 80 ++++++++++++++++++++---------------------- > > drivers/cxl/cxlpci.h | 19 ++++++++++ > > 2 files changed, 57 insertions(+), 42 deletions(-) > > > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > > index 5aaa0b36c42a..f900740c6dea 100644 > > --- a/drivers/cxl/core/pci.c > > +++ b/drivers/cxl/core/pci.c > > @@ -517,14 +517,14 @@ EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, CXL); > > FIELD_PREP(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, (entry_handle))) > > > > static int cxl_cdat_get_length(struct device *dev, > > - struct pci_doe_mb *cdat_doe, > > + struct pci_doe_mb *doe_mb, > > NIT: Why change the variable name here? There was a conflict with cdat_doe, so I renamed this. Now, that cdat_doe is not introduced any longer there is not really a need. On the other side 'doe_mb' is much more describing the actual meaning, so I decided to change the name anyway in the next respin but do this with a separate patch. This makes esp. this patch much more readable. And of course. Jon would complain about it. :-) > > > size_t *length) > > { > > __le32 request = CDAT_DOE_REQ(0); > > __le32 response[2]; > > int rc; > > > > - rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL, > > + rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL, > > CXL_DOE_PROTOCOL_TABLE_ACCESS, > > &request, sizeof(request), > > &response, sizeof(response)); > > @@ -542,56 +542,54 @@ static int cxl_cdat_get_length(struct device *dev, > > } > > > > static int cxl_cdat_read_table(struct device *dev, > > - struct pci_doe_mb *cdat_doe, > > - void *cdat_table, size_t *cdat_length) > > + struct pci_doe_mb *doe_mb, > > + struct cdat_doe *doe, size_t *length) > > { > > - size_t length = *cdat_length + sizeof(__le32); > > - __le32 *data = cdat_table; > > + size_t received, remaining = *length; > > int entry_handle = 0; > > __le32 saved_dw = 0; > > > > do { > > __le32 request = CDAT_DOE_REQ(entry_handle); > > - struct cdat_entry_header *entry; > > - size_t entry_dw; > > int rc; > > > > - rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL, > > + rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL, > > CXL_DOE_PROTOCOL_TABLE_ACCESS, > > &request, sizeof(request), > > - data, length); > > + doe, sizeof(*doe) + remaining); > > if (rc < 0) { > > dev_err(dev, "DOE failed: %d", rc); > > return rc; > > } > > > > - /* 1 DW Table Access Response Header + CDAT entry */ > > - entry = (struct cdat_entry_header *)(data + 1); > > + if (rc < sizeof(*doe)) > > + return -EIO; > > + > > + received = rc - sizeof(*doe); > > This is not ideal. See comments on struct cdat_doe below. Thanks for your kernel doc pointer, I will rework the flex array handling. Here, there is not other way to get the header size of the struct (without the flexible arrays). Looking esp. at the implementation of struct_size() using sizeof() for actual variables is allowed and safe to use. > > > + > > if ((entry_handle == 0 && > > - rc != sizeof(__le32) + sizeof(struct cdat_header)) || > > + received != sizeof(doe->header[0])) || > > (entry_handle > 0 && > > - (rc < sizeof(__le32) + sizeof(*entry) || > > - rc != sizeof(__le32) + le16_to_cpu(entry->length)))) > > + (received < sizeof(doe->entry[0]) || > > + received != le16_to_cpu(doe->entry->length)))) > > return -EIO; > > > > /* Get the CXL table access header entry handle */ > > entry_handle = FIELD_GET(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, > > - le32_to_cpu(data[0])); > > - entry_dw = rc / sizeof(__le32); > > - /* Skip Header */ > > - entry_dw -= 1; > > + le32_to_cpu(doe->doe_header)); > > + > > /* > > * Table Access Response Header overwrote the last DW of > > * previous entry, so restore that DW > > */ > > - *data = saved_dw; > > - length -= entry_dw * sizeof(__le32); > > - data += entry_dw; > > - saved_dw = *data; > > + doe->doe_header = saved_dw; > > + remaining -= received; > > + doe = (void *)doe + received; > > + saved_dw = doe->doe_header; > > } while (entry_handle != CXL_DOE_TABLE_ACCESS_LAST_ENTRY); > > > > /* Length in CDAT header may exceed concatenation of CDAT entries */ > > - *cdat_length -= length - sizeof(__le32); > > + *length -= remaining; > > > > return 0; > > } > > @@ -616,11 +614,11 @@ void read_cdat_data(struct cxl_port *port) > > { > > struct device *uport = port->uport_dev; > > struct device *dev = &port->dev; > > - struct pci_doe_mb *cdat_doe; > > + struct pci_doe_mb *doe_mb; > > struct pci_dev *pdev = NULL; > > struct cxl_memdev *cxlmd; > > - size_t cdat_length; > > - void *cdat_table, *cdat_buf; > > + struct cdat_doe *doe; > > Since we are trying to make this cleaner I would prefer a struct and > variable name of cdat_doe_rsp. > > So... > struct cdat_doe_rsp *cdat_doe_rsp; Looks reasonable, though I will use the shorter 'rsp' for the variable. > > > + size_t length; > > int rc; > > > > if (is_cxl_memdev(uport)) { > > @@ -637,40 +635,38 @@ void read_cdat_data(struct cxl_port *port) > > if (!pdev) > > return; > > > > - cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, > > - CXL_DOE_PROTOCOL_TABLE_ACCESS); > > - if (!cdat_doe) { > > + doe_mb = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, > > + CXL_DOE_PROTOCOL_TABLE_ACCESS); > > + if (!doe_mb) { > > dev_dbg(dev, "No CDAT mailbox\n"); > > return; > > } > > > > port->cdat_available = true; > > > > - if (cxl_cdat_get_length(dev, cdat_doe, &cdat_length)) { > > + if (cxl_cdat_get_length(dev, doe_mb, &length)) { > > dev_dbg(dev, "No CDAT length\n"); > > return; > > } > > > > - cdat_buf = devm_kzalloc(dev, cdat_length + sizeof(__le32), > > - GFP_KERNEL); > > - if (!cdat_buf) > > - return; > > + doe = devm_kzalloc(dev, sizeof(*doe) + length, GFP_KERNEL); > > + if (!doe) > > + goto err; > > > > - rc = cxl_cdat_read_table(dev, cdat_doe, cdat_buf, &cdat_length); > > + rc = cxl_cdat_read_table(dev, doe_mb, doe, &length); > > if (rc) > > goto err; > > > > - cdat_table = cdat_buf + sizeof(__le32); > > - if (cdat_checksum(cdat_table, cdat_length)) > > + if (cdat_checksum(doe->table, length)) > > goto err; > > > > - port->cdat.table = cdat_table; > > - port->cdat.length = cdat_length; > > - return; > > + port->cdat.table = doe->table; > > As an aside: the type of port->cdat may need to change at some point too. > > > + port->cdat.length = length; > > > > + return; > > err: > > /* Don't leave table data allocated on error */ > > - devm_kfree(dev, cdat_buf); > > + devm_kfree(dev, doe); > > dev_err(dev, "Failed to read/validate CDAT.\n"); > > } > > EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL); > > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h > > index 0fa4799ea316..d12ed9d8dec1 100644 > > --- a/drivers/cxl/cxlpci.h > > +++ b/drivers/cxl/cxlpci.h > > @@ -85,6 +85,25 @@ struct cdat_entry_header { > > __le16 length; > > } __packed; > > > > +/* > > + * Response contains the CDAT only response header of the DOE. The > > + * response payload is a CDAT structure (either CDAT header or entry), > > + * it may also mark the beginning of the CDAT table. > > + * > > + * Spec refs: > > + * > > + * CXL 3.1 Table 8-14: Read Entry Response > > + * CDAT Specification 1.03: 2 CDAT Data Structures > > + */ > > +struct cdat_doe { > > + __le32 doe_header; > > + union { > > + u8 table[0]; > > At a minimum we need to use flexible arrays here. See: > > .../Documentation/process/deprecated.rst > 'Zero-length and one-element arrays' > > > + struct cdat_header header[0]; > > + struct cdat_entry_header entry[0]; > > So this would need to be: > > DECLARE_FLEX_ARRAY(u8, table); > DECLARE_FLEX_ARRAY(struct cdat_header, header); > DECLARE_FLEX_ARRAY(struct cdat_entry_header, entry); Changed that. > > Also I think it would be best to use sizeof_field(). > > That said I got crossed up when this structure was used to represent not > only the individual DOE responses but also the cdat table as a whole. > > I think it would be best to call this cdat_doe_rsp and somehow make a > distinction in read_cdat_data() when assigning to the entire port cdat > table. I will name the table buffer's variable 'buf' and add a comment. Thanks, -Robert > > Ira > > > + }; > > +} __packed; > > + > > int devm_cxl_port_enumerate_dports(struct cxl_port *port); > > struct cxl_dev_state; > > int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, > > -- > > 2.39.2 > > > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table 2023-11-17 20:15 ` [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table Robert Richter 2023-11-17 20:25 ` Dave Jiang 2023-11-28 20:06 ` Ira Weiny @ 2023-12-15 4:34 ` Dan Williams 2024-01-04 8:41 ` Robert Richter 2024-01-04 13:43 ` Robert Richter 2 siblings, 2 replies; 13+ messages in thread From: Dan Williams @ 2023-12-15 4:34 UTC (permalink / raw) To: Robert Richter, Dan Williams Cc: Ira Weiny, Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, linux-cxl, linux-kernel, Fan Ni, Lukas Wunner Robert Richter wrote: > On 17.11.23 21:09:18, Robert Richter wrote: > > I will send an on-top patch for 6.8 that reworks that code area to > > remove the pointer arithmetic. > > Here it is: > > From 13787f72c20b8c54754ae86015d982307eae0397 Mon Sep 17 00:00:00 2001 > From: Robert Richter <rrichter@amd.com> > Subject: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table > > Reading the CDAT table using DOE requires a Table Access Response > Header in addition to the CDAT entry. In current implementation this > has caused offsets with sizeof(__le32) to the actual buffers. This led > to hardly readable code and even bugs (see fix of devm_kfree() in > read_cdat_data()). > > Rework code to avoid calculations with sizeof(__le32). Introduce > struct cdat_doe for this which contains the Table Access Response > Header and a variable payload size for various data structures > afterwards to access the CDAT table and its CDAT Data Structures > without recalculating buffer offsets. I like reworking the code to introduce an explicit type for the response buffer, but as Ira points out, lets call it a "response" not a "cdat_doe". The feedback on the flex array is accurate, but I see no reason to have 3 flex arrays vs: struct cdat_response { __le32 doe_header; union { struct cdat_header header; struct cdat_entry_header entry; u8 table[]; }; } __packed; As far as I can see nothing outside of drivers/cxl/core/pci.c needs 'struct cdat_response', so it can stay local to this C file. While you are working on that I will do another lead-in cleanup to kill the goto in cxl_cdat_read_table() and let you come back and kill off the open-coded "+ sizeof(__le32)" that I will leave behind. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table 2023-12-15 4:34 ` Dan Williams @ 2024-01-04 8:41 ` Robert Richter 2024-01-04 13:43 ` Robert Richter 1 sibling, 0 replies; 13+ messages in thread From: Robert Richter @ 2024-01-04 8:41 UTC (permalink / raw) To: Dan Williams Cc: Ira Weiny, Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, linux-cxl, linux-kernel, Fan Ni, Lukas Wunner This threat slipped away end of last year... On 14.12.23 20:34:09, Dan Williams wrote: > Robert Richter wrote: > > On 17.11.23 21:09:18, Robert Richter wrote: > > > I will send an on-top patch for 6.8 that reworks that code area to > > > remove the pointer arithmetic. > > > > Here it is: > > > > From 13787f72c20b8c54754ae86015d982307eae0397 Mon Sep 17 00:00:00 2001 > > From: Robert Richter <rrichter@amd.com> > > Subject: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table > > > > Reading the CDAT table using DOE requires a Table Access Response > > Header in addition to the CDAT entry. In current implementation this > > has caused offsets with sizeof(__le32) to the actual buffers. This led > > to hardly readable code and even bugs (see fix of devm_kfree() in > > read_cdat_data()). > > > > Rework code to avoid calculations with sizeof(__le32). Introduce > > struct cdat_doe for this which contains the Table Access Response > > Header and a variable payload size for various data structures > > afterwards to access the CDAT table and its CDAT Data Structures > > without recalculating buffer offsets. > > I like reworking the code to introduce an explicit type for the response > buffer, but as Ira points out, lets call it a "response" not a > "cdat_doe". Looks good. > > The feedback on the flex array is accurate, but I see no reason to have > 3 flex arrays vs: > > struct cdat_response { > __le32 doe_header; > union { > struct cdat_header header; > struct cdat_entry_header entry; > u8 table[]; > }; > } __packed; The flex arrays are due to sizeof(*doe) which is just the size of the base payload without any variable data then. Another nice effect of this is pointer creation of @header and @entry: doe->header vs. &doe->header etc. ... which aligns with doe->table too. This all leads to well readable code. > > As far as I can see nothing outside of drivers/cxl/core/pci.c needs > 'struct cdat_response', so it can stay local to this C file. > > While you are working on that I will do another lead-in cleanup to kill > the goto in cxl_cdat_read_table() and let you come back and kill off the > open-coded "+ sizeof(__le32)" that I will leave behind. I briefly looked into your patch, but will send for reference a v2 with a rebase onto cxl/next and small updates only. I could prepare a v3 that bases on your patch afterwards. Thanks, -Robert ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table 2023-12-15 4:34 ` Dan Williams 2024-01-04 8:41 ` Robert Richter @ 2024-01-04 13:43 ` Robert Richter 1 sibling, 0 replies; 13+ messages in thread From: Robert Richter @ 2024-01-04 13:43 UTC (permalink / raw) To: Dan Williams Cc: Ira Weiny, Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma, linux-cxl, linux-kernel, Fan Ni, Lukas Wunner On 14.12.23 20:34:09, Dan Williams wrote: > struct cdat_response { > __le32 doe_header; > union { > struct cdat_header header; > struct cdat_entry_header entry; > u8 table[]; > }; > } __packed; > > As far as I can see nothing outside of drivers/cxl/core/pci.c needs > 'struct cdat_response', so it can stay local to this C file. I moved that close to cdat_header and cdat_entry_header which is also defined in cxlpci.h but only used in core/pci.c. I would like to keep them together but we could move that completely into pci.c in a separate patch? Thanks, -Robert ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-01-05 14:49 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-11-17 0:03 [PATCH] cxl/cdat: Free correct buffer on checksum error Ira Weiny 2023-11-17 15:50 ` Dave Jiang 2023-11-17 17:14 ` fan 2023-11-17 20:09 ` Robert Richter 2023-11-17 20:15 ` [PATCH] cxl/pci: Get rid of pointer arithmetic reading CDAT table Robert Richter 2023-11-17 20:25 ` Dave Jiang 2023-11-28 20:06 ` Ira Weiny 2023-12-07 21:18 ` Dan Williams 2023-12-08 22:52 ` Ira Weiny 2024-01-05 14:49 ` Robert Richter 2023-12-15 4:34 ` Dan Williams 2024-01-04 8:41 ` Robert Richter 2024-01-04 13:43 ` Robert Richter
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox